Scientific Image Captioning with Visual Attention Models
Open Access
Author:
Chen, Bill
Area of Honors:
Engineering Science
Degree:
Bachelor of Science
Document Type:
Thesis
Thesis Supervisors:
C Lee Giles, Thesis Supervisor
Lucas Jay Passmore, Thesis Honors Advisor
Keywords:
Deep Learning, Machine Learning, Information Retrieval, Computer Vision, Natural Language Processing, Image Captioning
Abstract:
Figures are an essential tool for researchers to communicate their complex scientific narratives.
However, low-quality captions can cause confusion and misunderstanding among readers, which
wastes an opportunity to convey the impact of the research. With scientific figures becoming
increasingly important and abundant, automatic figure captioning can enhance the sharing of
knowledge between researchers and the community. In this work, we introduced a new dataset
based on the Association for Computational Linguistics (ACL) that contains a large number of rich
scientific figures and captions. We analyzed the large-scale dataset and demonstrated the ability of
novel attention-based deep neural networks to caption real-world scientific figures. Our experimental
results revealed both opportunities and challenges in generating captions for a diverse set of
figures.