The E2B machine translation: a new approach to HLT
by Goutam Kumar Saha
This article describes a new approach to machine translation that translates English text into Bangla text with disambiguation. The translated Bengali text, rendered in English script, is also useful for learning Bengali (Bangla) as a foreign language. At the same time, rural Bengali speakers who do not know English well can understand English material through the translated output. The proposed approach combines rule-based and transformation-based machine translation schemes with three levels of parsing. This is a significant contribution towards the creation of low-cost Human Language Technology (HLT). About two hundred million people in West Bengal and Tripura (two states in India) and in Bangladesh speak and write Bangla as their first language. The English-to-Bangla translator system, E2B-ANUBAD (or simply E2B), takes a paragraph of English sentences as input and produces equivalent Bangla sentences. The E2B-ANUBAD system comprises a preprocessor, a morphological parser, a semantic parser that uses an English word ontology for context disambiguation, an electronic lexicon annotated with grammatical information, and a discourse processor. It also employs a lexical disambiguation analyzer. The system does not rely on a stochastic approach. Rather, it is based on a hybrid architecture that combines transformer and rule-based NLE architectures with linguistic knowledge components for both English and Bangla.
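The staged design described above can be illustrated with a minimal sketch. It is not the E2B-ANUBAD implementation: the function names, the five-entry lexicon, and the single reordering rule (English subject-verb-object to Bangla subject-object-verb) are all assumptions made for illustration. The semantic parser, ontology-based disambiguation, and discourse processor are omitted, and verb forms are fixed to the first person.

```python
# Toy lexicon: English lemma -> (part of speech, romanized Bangla).
# Entries are illustrative only; they are not taken from the E2B lexicon.
LEXICON = {
    "i": ("PRON", "ami"),
    "eat": ("VERB", "khai"),   # first-person present form
    "read": ("VERB", "pori"),  # first-person present form
    "rice": ("NOUN", "bhat"),
    "book": ("NOUN", "boi"),
}

# Simplification: English articles are dropped rather than translated.
ARTICLES = {"a", "an", "the"}


def preprocess(sentence):
    """Lowercase, strip surrounding punctuation, tokenize, drop articles."""
    tokens = sentence.lower().strip(" .!?").split()
    return [t for t in tokens if t not in ARTICLES]


def lemmatize(token):
    """Crude morphological step: strip a trailing 's' if that yields a known lemma."""
    if token not in LEXICON and token.endswith("s") and token[:-1] in LEXICON:
        return token[:-1]
    return token


def translate(sentence):
    """Look up each token, then apply one rule: move verbs to the end (SVO -> SOV)."""
    tagged = []
    for token in preprocess(sentence):
        lemma = lemmatize(token)
        if lemma not in LEXICON:
            raise KeyError(f"no lexicon entry for {token!r}")
        tagged.append(LEXICON[lemma])
    non_verbs = [word for pos, word in tagged if pos != "VERB"]
    verbs = [word for pos, word in tagged if pos == "VERB"]
    return " ".join(non_verbs + verbs)


print(translate("I eat rice."))     # ami bhat khai
print(translate("I read a book."))  # ami boi pori
```

A real system would need person and tense agreement on the verb, sense selection for ambiguous words, and handling of unknown words; the sketch only shows how lexicon lookup and a hand-written reordering rule fit together without any statistical model.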