Natural Language Processing Figure Caption Generation
Abstract:
Figures play an essential role in scientific papers because they convey key messages to readers. However, low-quality captions appear frequently in scientific papers and can cause readers to misunderstand the figures they describe. Natural language processing can help generate concise and clear captions that make scientific figures easier to understand. In this thesis, we aim to create a baseline for testing whether existing natural language processing models can outperform a heuristic method of creating captions. The heuristic method extracts information from X-Y graphs using EasyOCR and inserts it into our 45 caption templates. We use the SCICAP dataset as the main dataset both for extracting X-Y axis information and for creating the caption templates. The templates were created by going through the SCICAP captions and recording the common patterns found in them. The result is a dataset of captions that can serve as a baseline for comparison with other machine-generated text through human evaluation.
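The heuristic described above can be illustrated with a minimal sketch. It is not the thesis implementation: the two templates, the function names, the confidence threshold, and the rule for picking axis labels (lowest text for the X axis, leftmost remaining text for the Y axis) are all assumptions made for illustration. The only detail taken from EasyOCR is the shape of its output, a list of (bounding box, text, confidence) entries.

```python
from typing import List, Sequence, Tuple

# One OCR detection in the shape returned by EasyOCR's readtext():
# (four corner points, recognized text, confidence).
Detection = Tuple[Sequence[Sequence[float]], str, float]

# Hypothetical templates; the 45 templates used in the thesis are not shown here.
TEMPLATES = [
    "{y} versus {x}.",
    "{y} as a function of {x}.",
]


def center(bbox: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Return the (x, y) center of a four-point bounding box."""
    xs = [point[0] for point in bbox]
    ys = [point[1] for point in bbox]
    return sum(xs) / len(xs), sum(ys) / len(ys)


def guess_axis_labels(
    detections: List[Detection], min_conf: float = 0.3
) -> Tuple[str, str]:
    """Guess (x_label, y_label) from OCR detections.

    Assumed heuristic: ignore low-confidence text and purely numeric
    tick values, then take the lowest text as the X-axis label and the
    leftmost remaining text as the Y-axis label.
    """
    candidates = [
        (center(bbox), text)
        for bbox, text, conf in detections
        if conf >= min_conf and any(ch.isalpha() for ch in text)
    ]
    if len(candidates) < 2:
        raise ValueError("need at least two label-like detections")
    # Image coordinates grow downward, so the largest y is the lowest text.
    x_entry = max(candidates, key=lambda item: item[0][1])
    remaining = [item for item in candidates if item is not x_entry]
    y_entry = min(remaining, key=lambda item: item[0][0])
    return x_entry[1], y_entry[1]


def fill_template(x_label: str, y_label: str, template_index: int = 0) -> str:
    """Insert the axis labels into one of the caption templates."""
    return TEMPLATES[template_index].format(x=x_label, y=y_label)


# With EasyOCR installed, the detections would come from something like:
#   import easyocr
#   detections = easyocr.Reader(["en"]).readtext("figure.png")
# The values below are made up so the sketch runs without the library.
example_detections: List[Detection] = [
    ([[200, 380], [300, 380], [300, 400], [200, 400]], "Epoch", 0.95),
    ([[5, 150], [25, 150], [25, 250], [5, 250]], "Accuracy", 0.90),
    ([[40, 360], [50, 360], [50, 370], [40, 370]], "0", 0.99),
    ([[100, 20], [300, 20], [300, 40], [100, 40]], "noise", 0.10),
]

x_label, y_label = guess_axis_labels(example_detections)
caption = fill_template(x_label, y_label)
```

Keeping the label-picking step separate from the template-filling step means either one could be replaced, for example with a better axis detector or the full set of templates, without touching the other.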