Effective Document-Element Synopsis Generation
Open Access
- Author:
- Harnett, Brendan Augustine
- Area of Honors:
- Computer Science
- Degree:
- Bachelor of Science
- Document Type:
- Thesis
- Thesis Supervisors:
- Prasenjit Mitra, Thesis Supervisor
- Dr. John Joseph Hannan, Thesis Honors Advisor
- Keywords:
- document-element
- search
- Abstract:
- To keep abreast of scientific developments in their field, modern researchers spend significant amounts of time reading academic papers that report or summarize the conclusions of experiments similar to their own. To quickly determine the results of these experiments, researchers rely heavily on document-elements such as figures, tables, and algorithms, which are not part of the running text of a paper but are instead pictorial representations of the results or conclusions it describes. However, such document-elements are almost always difficult to interpret without an accompanying synopsis: a series of sentences selected from the paper itself to describe the document-element. In our experiments, we manually identify ideal synopses for 160 document-elements. We then test the effectiveness of the algorithmic methods proposed by Bhatia et al. for automatically generating synopses by comparing the generated synopses to the ideal ones [1]. Interestingly, our experiments produced results very similar to those reported by Bhatia et al., leading us to believe that the findings are fairly consistent across papers from different conferences, regardless of subject matter. Although synopsis generation was consistent over the collection of all document-elements, effectiveness varied when each document-element type was considered individually, with synopses for figures being the most complete and those for algorithms the least.