Examining the Efficiency of Rule-Based Machine Translation

Open Access
Author:
Burns, Kyle Joseph
Area of Honors:
Computer Science (Behrend)
Degree:
Bachelor of Science
Document Type:
Thesis
Thesis Supervisors:
  • Richard Zhao, Thesis Supervisor
  • Meng Su, Honors Advisor
Keywords:
  • Machine Translation
  • Hidden Markov Models
  • Rules-Based Machine Translation
Abstract:
Regardless of advancements in technology, humans will always need language to communicate with one another. There are over 7,000 recognized languages around the world, and while not all of them are widely used, this number still makes communicating across language barriers difficult.[1] Machine translation has existed for decades, and the field has split into different schools, mainly statistical machine translation and rules-based machine translation. Statistical machine translation, a method based heavily on probabilities derived from pre-compiled language texts, is the faster approach but is more prone to errors, while rules-based translation, which requires native speakers to encode all the rules of a language in a computer-digestible format, is traditionally more accurate, with the tradeoff of being slower. The purpose of this experiment is to examine whether a rules-based machine translation system can be improved enough to match a statistical translation system in the time needed for translations while still retaining the accuracy of a rules-based system. This experiment, in which various modified versions of the Apertium English-to-Spanish translation system were run on a testing corpus of 200 sentences, shows that the translation time of a rules-based system can be moderately improved, but not to the extent of matching or overtaking a statistical system.