Translate the words and phrases in each clause using a dictionary. Words that have several different meanings, or ‘senses’, require more work, because we need to work out the right ‘sense’ from the context (a task known as ‘Word Sense Disambiguation’). This is not easy. For instance, WordNet ([1], [2]) defines multiple ‘senses’ for the word ‘park’, including ‘parkland’, ‘ballpark’, ‘parking lot’, ‘gear position’, and ‘commons or green’. The MT system may occasionally pick the wrong ‘sense’, especially if the other words in the sentence (i.e. the immediate context) do not help to determine the appropriate ‘sense’. This can result in severe translation errors.
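One simple way to pick a ‘sense’ from context is to count how many context words overlap with a set of words associated with each ‘sense’ (a simplified Lesk-style heuristic). The sketch below is only an illustration, not the method used by any particular MT system. The ‘sense’ labels are the ones listed above for ‘park’, but the signature words attached to each ‘sense’ and the function name `disambiguate` are assumptions made up for this example; they are not WordNet data.

```python
# Toy sense inventory for 'park'. The sense labels come from the text;
# the signature words are illustrative assumptions, not WordNet data.
SENSES = {
    "parkland": {"forest", "wildlife", "national", "reserve"},
    "ballpark": {"baseball", "game", "stadium", "team"},
    "parking lot": {"car", "vehicle", "lot", "space"},
    "gear position": {"gear", "transmission", "shift", "lever"},
    "commons or green": {"dog", "walk", "bench", "children", "city"},
}


def disambiguate(context_words, senses=SENSES):
    """Return (best_sense, overlap) using a simple word-overlap count.

    best_sense is None when no signature word occurs in the context,
    i.e. when the immediate context does not help.
    """
    context = {w.lower() for w in context_words}
    best_sense, best_overlap = None, 0
    for sense, signature in senses.items():
        overlap = len(context & signature)
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense, best_overlap


if __name__ == "__main__":
    print(disambiguate("He shifted the gear lever into park".split()))
    print(disambiguate("He went to the park".split()))
```

The second call returns `(None, 0)`: the sentence contains no helpful context words, which is exactly the situation in which a system is likely to pick the wrong ‘sense’.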
Even a simple preposition such as ‘by’ may indicate either an Agent (e.g., ‘He was bitten by the dog’) or a Location (e.g., ‘He was seen by the lake’). Both examples are passive sentences, so sentence structure alone does not tell them apart. Instead, we can use the Animacy feature of the Head Noun inside the Preposition Phrase (i.e. ‘dog’ and ‘lake’ respectively): ‘lake’ is inanimate, so ‘the lake’ cannot be an Agent, and ‘by’ is equivalent to ‘beside’ in that example. Note that ‘by’ can also indicate a Time (e.g., ‘By the time he finished his breakfast, it was noon’). In Hindi, different postpositions are used depending on whether ‘by’ indicates an Agent, a Location, or a Time.
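The Animacy test described above can be sketched as a small rule. This is a minimal illustration under stated assumptions: the word lists `ANIMATE_NOUNS` and `TIME_NOUNS` and the function `role_of_by` are hypothetical, and the rule covers only the three roles named in the text. A real system would need a much larger lexicon and would also have to handle cases this rule gets wrong, such as inanimate Agents.

```python
# Hypothetical mini-lexicons, for illustration only.
ANIMATE_NOUNS = {"dog", "man", "lawyer"}
TIME_NOUNS = {"time", "noon", "morning", "evening"}


def role_of_by(head_noun, is_passive):
    """Guess the role marked by 'by' from the head noun of its phrase.

    Returns one of "Time", "Agent", or "Location".
    """
    noun = head_noun.lower()
    if noun in TIME_NOUNS:
        return "Time"
    # Only an animate head noun in a passive sentence is treated as an Agent.
    if is_passive and noun in ANIMATE_NOUNS:
        return "Agent"
    # Assumption: everything else falls back to Location ('by' = 'beside').
    return "Location"


if __name__ == "__main__":
    print(role_of_by("dog", is_passive=True))    # He was bitten by the dog
    print(role_of_by("lake", is_passive=True))   # He was seen by the lake
    print(role_of_by("time", is_passive=False))  # By the time he finished ...
```

Once the role is known, the translation step can select the Hindi postposition that corresponds to that role instead of using a single translation for every ‘by’.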
Translating the words in the clauses, we get the following intermediate forms (note that applying Case, Number, Tense, and Gender inflections to the verb and its arguments in subsequent steps will change these forms significantly):
Hindi: | ek | AdmI | udyAn | meN | kuttA | dwArA | kAT__ thA |
Gloss: | one | man-M-sg | park-M-sg | in | dog-M-sg | by | bite-incomplete was-incomplete |
Hindi: | vakIl | vah | AdmI | shahar | par | nahIN | mukadmA | lagA__ | salAh | de__ |
Gloss: | lawyer-M-sg | that | man-M-sg | city-M-sg | on | not | suit-M-sg | put-incomplete | advice-F-sg | give-incomplete |
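The word-by-word lookup behind the first Hindi/Gloss row pair can be sketched as follows. This is a rough illustration, not the actual system: the entries in `LEXICON` and `VERB_STEMS` are copied from that row pair, while the function name `translate_tokens`, the input token list (the gloss words in their already reordered positions), and the `<...>` marker for unknown words are assumptions made for this example. Verb stems are left uninflected and marked with `__`, as in the rows above.

```python
# Entries copied from the first Hindi/Gloss row pair in the text.
LEXICON = {
    "one": "ek",
    "man": "AdmI",
    "park": "udyAn",
    "in": "meN",
    "dog": "kuttA",
    "by": "dwArA",
    "was": "thA",
}
# Verb stems stay uninflected at this stage; "__" marks the missing ending.
VERB_STEMS = {"bite": "kAT"}


def translate_tokens(tokens):
    """Look up each token; flag words missing from the dictionary."""
    output = []
    for token in tokens:
        if token in VERB_STEMS:
            output.append(VERB_STEMS[token] + "__")
        elif token in LEXICON:
            output.append(LEXICON[token])
        else:
            output.append("<" + token + ">")  # hypothetical unknown-word marker
    return output


if __name__ == "__main__":
    clause = ["one", "man", "park", "in", "dog", "by", "bite", "was"]
    print(" ".join(translate_tokens(clause)))
```

Keeping verb stems in a separate table makes it straightforward for a later step to attach the inflections that depend on Case, Number, Tense, and Gender.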