
Sanskrit example

A sample translation output from the English to Sanskrit Machine Translation software is provided below. Explanations are provided in the Notes section for the abbreviations shown in the sample output.



The lawyer advised the man bitten by the dog in the park not to sue the city .




पुरं न अभियोक्तुं विधिज्ञः अमुं नरममन्त्रयत य उपवने कुकुरेणादश्यत



Gloss
1:puram-city-(pura:ACC)-3S-n
2:na abhiyoktum-sue-(abhi-yuj:7U)-3S-m-Infinitive-Neg.
4:vidhijnaHa-lawyer-(vidhijna:NOM)-3S-m
5:amum-Rel.pro-(adas:ACC)-3S-m
6:naram-man-(nara:ACC)-3S-m
7:amantrayata-advise-(mantr:10A)-3S-m-past-Imperfect
8:yaHa-Rel.pro-(yad:NOM)-3S-m
10:upavane-park-(upvana:LOC)-3S-n
11:in-LOC
13:kukureNNa-dog-(kukura:INS)-3S-m
14:by-INS
15:adashyata-bite-(daNsh:1P)-3S-m-past-Passive
Panini Ashtadhyayi sutras:
S1:kukureNNAdashyata === kukureNNa _ adashyata {6.1.101}
S2:naramamantrayata === naram _ amantrayata {8.3.23}
S3:abhiyoktuM === abhiyoktum _ vidhijnaHa {8.3.23}
S4:ya === yaHa _ upavane {8.3.17,8.3.19,8.3.22}
S5:amuM === amum _ naram {8.3.23}
S6:puraM === puram _ na {8.3.23}


As shown above, the software also provides a gloss under each translation, which may be useful for readers who are not experts in Sanskrit.

In the example above, the entries S1-S6 under the heading ‘Panini Ashtadhyayi sutras’ list all the terms to which one or more ‘euphonic combination’ (sandhi) rules have been applied. The applicable Paninian sutra is shown in curly brackets (over 50 sutras deal with ‘euphonic combinations’, along with a fair number of exceptions to the rules). For instance, {6.1.101} indicates the sutra ‘अकः सवर्णे दीर्घः’ (Chapter 6 of the Ashtadhyayi), a rule that lengthens a simple vowel when it is followed by a similar simple vowel.
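The S1-S6 entries follow a regular layout (label, combined form, ‘===’, the two source terms separated by ‘_’, and the sutra numbers in curly brackets), so they can be read mechanically. The following is a minimal Python sketch of such a reader, not the software's own code; the names `SandhiEntry` and `parse_sandhi_entry` are hypothetical, and the layout is inferred only from the six entries shown above.

```python
import re
from typing import List, NamedTuple


class SandhiEntry(NamedTuple):
    label: str         # e.g. "S1"
    combined: str      # form after the euphonic combination
    left: str          # first term before combination
    right: str         # second term before combination
    sutras: List[str]  # e.g. ["6.1.101"]


# Layout assumed from the sample: "S1:combined === left _ right {sutra,sutra}"
ENTRY_RE = re.compile(r"^(S\d+):(\S+)\s*===\s*(\S+)\s*_\s*(\S+)\s*\{([^}]*)\}$")


def parse_sandhi_entry(line: str) -> SandhiEntry:
    """Split one 'Panini Ashtadhyayi sutras' line into its parts."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised sandhi entry: {line!r}")
    label, combined, left, right, sutras = match.groups()
    return SandhiEntry(label, combined, left, right,
                       [s.strip() for s in sutras.split(",")])


entry = parse_sandhi_entry("S1:kukureNNAdashyata === kukureNNa _ adashyata {6.1.101}")
print(entry.left, "+", entry.right, "->", entry.combined, entry.sutras)
```

Each sutra number can then be split on the full stops; in ‘6.1.101’ the first number is the chapter of the Ashtadhyayi.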

Let us now consider the combination ‘कुकुरेणादश्यत (kukureNNAdashyata)’ in the Sanskrit translation above. S1 shows that this term results from applying the Paninian rule {6.1.101} to ‘कुकुरेण (kukureNNa)’ and ‘अदश्यत (adashyata)’: the final ‘a’ of the first term and the initial ‘a’ of the second merge into a single long ‘A’.
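The vowel-lengthening step in S1 can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the software's implementation: it works on the transliteration shown in the gloss, assumes that a capital letter marks the long form of a simple vowel (as in ‘kukureNNAdashyata’), handles only the a, i and u families, and ignores the exceptions to the rule. The function name is hypothetical.

```python
# Long form for each simple vowel (assumption: capitals mark long vowels).
LONG_FORM = {"a": "A", "A": "A", "i": "I", "I": "I", "u": "U", "U": "U"}


def join_similar_vowels(left: str, right: str) -> str:
    """Join two terms in the manner of sutra 6.1.101.

    A simple vowel followed by a similar simple vowel is replaced
    by the single long vowel of that family.
    """
    if not left or not right:
        raise ValueError("both terms must be non-empty")
    # Assumption: a trailing 'Ha' is the visarga symbol, not a vowel 'a'.
    if left.endswith("Ha"):
        raise ValueError("term ends in visarga; 6.1.101 does not apply")
    long_left = LONG_FORM.get(left[-1])
    long_right = LONG_FORM.get(right[0])
    if long_left is None or long_left != long_right:
        raise ValueError("no pair of similar simple vowels at the junction")
    return left[:-1] + long_left + right[1:]


print(join_similar_vowels("kukureNNa", "adashyata"))  # kukureNNAdashyata
```

Multi-character vowel symbols in the transliteration are not handled here, so the sketch should not be applied to terms ending in a diphthong.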

Next we can look up the gloss to understand the derivation and roots of each of these sub-terms. The gloss tells us that the term ‘कुकुरेण (kukureNNa)’ corresponds to the word ‘dog’ (the term marked ‘13’), that the root for the noun ‘dog’ is ‘कुकुर (kukura)’, and that it has been declined in the 3rd Person Singular, Masculine, Instrumental Case (further inspection of the gloss reveals that this INS Case was assigned by the preposition ‘by’). Similarly, the gloss tells us that the term ‘अदश्यत (adashyata)’ corresponds to the verb ‘bite’ (the term marked ‘15’), that the root for the verb is ‘दंश् (daNsh)’, and that it has been conjugated in Class 1, Parasmaipada, Passive Voice, Past Tense, 3rd Person Singular. A similar analysis can be done for the ‘नरममन्त्रयत (naramamantrayata)’ combination (see the rule for S2 above).
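The same lookup can be done programmatically. The Python sketch below splits a gloss line into the fields discussed above. The field layout (serial number, form, English meaning, root and case or class in parentheses, person/number, gender, then any further features) is inferred from the sample output alone, and the names `GlossEntry` and `parse_gloss_entry` are hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class GlossEntry:
    index: int                     # serial number, e.g. 13
    meaning: str                   # English term, e.g. "dog"
    tag: str                       # case (nouns) or conjugation class (verbs)
    form: Optional[str] = None     # transliterated Sanskrit form
    root: Optional[str] = None
    person_number: Optional[str] = None
    gender: Optional[str] = None
    features: List[str] = field(default_factory=list)


# Full entries, e.g. "15:adashyata-bite-(daNsh:1P)-3S-m-past-Passive"
FULL_RE = re.compile(
    r"^(\d+):(.+?)-([^-()]+)-\(([^:()]+):([^()]+)\)-(\w+)-(\w+)(?:-(.+))?$"
)
# Case-only entries for prepositions, e.g. "14:by-INS"
CASE_ONLY_RE = re.compile(r"^(\d+):(\w+)-([A-Z]+)$")


def parse_gloss_entry(line: str) -> GlossEntry:
    line = line.strip()
    m = FULL_RE.match(line)
    if m:
        index, form, meaning, root, tag, pn, gender, rest = m.groups()
        features = rest.split("-") if rest else []
        return GlossEntry(int(index), meaning, tag, form, root, pn, gender, features)
    m = CASE_ONLY_RE.match(line)
    if m:
        index, meaning, tag = m.groups()
        return GlossEntry(int(index), meaning, tag)
    raise ValueError(f"unrecognised gloss entry: {line!r}")


print(parse_gloss_entry("13:kukureNNa-dog-(kukura:INS)-3S-m"))
print(parse_gloss_entry("15:adashyata-bite-(daNsh:1P)-3S-m-past-Passive"))
```

The root is matched up to the colon, so hyphenated roots such as ‘abhi-yuj’ in entry 2 stay in one piece.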

In addition to the Sanskrit translation above, the following Hindi translation can also be provided for each sentence. Because both translations are produced with similar clause structures, this makes it simple for a Hindi speaker to understand the clause structure of the Sanskrit sentence. It is particularly useful for understanding the handling of Relative Pronouns (the terms ‘5: अमुं’ and ‘8: य’ in the gloss for the Sanskrit translation, i.e. ‘that’ and ‘who’ in English).



शहर पर मुक़दमा नहीं लगाने की वकील ने उस आदमी को सलाह दी जो उद्यान में कुत्ते के द्वारा काटा गया था





Please note that ‘समास (samasa)’ (i.e. compound words) has not been implemented in the Sanskrit translations at present, because of difficulties in doing this reliably. The composition of some types of ‘समास (samasa)’ compounds requires an understanding of semantic relations between terms (present in, or absent from, the phrase or clause). Being almost an ‘art form’, the composition of ‘समास (samasa)’ is probably better left to human translators. However, ‘canned’ forms may be provided in the lexicon for commonly used terms.

Perhaps Sanskrit translations could be provided at the following three levels of complexity in a future version:

Novice: with neither ‘sandhi’ nor ‘samasa’
Intermediate: with ‘sandhi’ but without ‘samasa’ (e.g. the example above)
Expert: with both ‘sandhi’ and ‘samasa’ (though the ‘samasa’ would be unreliable)


NOTES:

A. The transliteration symbols used by the software (shown above) are slightly different from the Harvard-Kyoto convention for representing Sanskrit. For example, ‘kukureNNa’ would be ‘kukureNa’ in the Harvard-Kyoto convention, while ‘vidhijnaHa’ would be ‘vidhijnaH’. The ‘avagraha’ in Sanskrit is represented as ‘B’ (or an apostrophe) in the present transliteration scheme.
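The two differences named in this note can be expressed as a small conversion. The Python sketch below covers only those two cases (‘NN’ becoming ‘N’, and a word-final ‘Ha’ becoming ‘H’); it assumes that ‘NN’ always stands for the same sound and that ‘Ha’ in this sense occurs only at the end of a word. Any other differences between the two schemes are not handled, and the function name is hypothetical.

```python
import re


def to_harvard_kyoto(word: str) -> str:
    """Apply the two documented differences to a single transliterated word."""
    word = word.replace("NN", "N")    # 'kukureNNa' -> 'kukureNa'
    word = re.sub(r"Ha$", "H", word)  # 'vidhijnaHa' -> 'vidhijnaH'
    return word


for sample in ("kukureNNa", "vidhijnaHa"):
    print(sample, "->", to_harvard_kyoto(sample))
```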

B. In due course, we plan to adopt IPA symbols for the phonological form shown in the gloss.

C. The serial numbers mentioned in the gloss facilitate sequential references from the Sanskrit translation, and roughly correspond to the word order in the Sanskrit translation. There are some gaps in the numbers due to certain hidden elements (such as Negation in the example above) that may have been subsumed into another element (or elided).
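The gaps in the serial numbers can be listed mechanically. A minimal Python sketch, assuming that every gloss line begins with its serial number followed by a colon (the function name is hypothetical):

```python
from typing import Iterable, List


def missing_serials(gloss_lines: Iterable[str]) -> List[int]:
    """Return the serial numbers absent between the first and last gloss entry."""
    serials = sorted(
        int(line.split(":", 1)[0]) for line in gloss_lines if line.strip()
    )
    if not serials:
        return []
    present = set(serials)
    return [n for n in range(serials[0], serials[-1] + 1) if n not in present]


sample = [
    "1:puram-city-(pura:ACC)-3S-n",
    "2:na abhiyoktum-sue-(abhi-yuj:7U)-3S-m-Infinitive-Neg.",
    "4:vidhijnaHa-lawyer-(vidhijna:NOM)-3S-m",
]
print(missing_serials(sample))  # [3]
```

Applied to the full gloss above, this reports the serial numbers 3, 9 and 12 as missing.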

D. Note that the Prepositions in the gloss above (e.g., ‘in-LOC’, ‘by-INS’) display only the Case. The Noun Phrase Complement of the Prepositional Phrase has the relevant Case assigned by the Preposition (for example, the Noun ‘उपवने (park)’ is in the Locative Case, while the Noun ‘कुकुरेण (dog)’ is in the Instrumental Case, as assigned by the Prepositions governing these phrases respectively). However, other prepositions may be attached to verbs (known as ‘उपसर्गः (upasarga)’) in a manner akin to verb+particle constructions in English (but in the reverse order, and several of them can be stacked together, e.g. upa+sam+gam), or may appear as stand-alone terms alongside the Noun Phrase they are associated with. Prepositions are particularly tricky in Sanskrit because the same word (e.g., ‘in’) may assign different Cases to its noun phrase complement, depending on the verb and the context.

E. Other abbreviations in the Gloss include Case (i.e., ‘NOM’: Nominative, ‘ACC’: Accusative, ‘INS’: Instrumental, ‘DAT’: Dative, ‘ABL’: Ablative, ‘LOC’: Locative, ‘VOC’: Vocative), Person and Number (e.g., ‘3S’: 3rd Person Singular, ‘3D’: 3rd Person Dual, ‘3P’: 3rd Person Plural, etc.), and Gender (‘m’: Masculine, ‘f’: Feminine, ‘n’: Neuter). Verbs display the root, the conjugation class (e.g., ‘1P’: 1 Parasmaipada, ‘10A’: 10 Atmanepada, ‘7U’: 7 Ubhayapada, etc.), and the tense/aspect/mood/voice of the verb.
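The abbreviations in this note can be collected into lookup tables. The Python sketch below contains only the abbreviations listed in this note; the 1st and 2nd Person forms are an assumption based on the ‘etc.’ above, and all names are hypothetical. A tag such as ‘1P’ is ambiguous on its own, so the sketch keeps separate functions for the Person and Number field and for the conjugation class that appears in parentheses after the verb root.

```python
CASES = {
    "NOM": "Nominative", "ACC": "Accusative", "INS": "Instrumental",
    "DAT": "Dative", "ABL": "Ablative", "LOC": "Locative", "VOC": "Vocative",
}
NUMBERS = {"S": "Singular", "D": "Dual", "P": "Plural"}
GENDERS = {"m": "Masculine", "f": "Feminine", "n": "Neuter"}
PADAS = {"P": "Parasmaipada", "A": "Atmanepada", "U": "Ubhayapada"}
ORDINALS = {"1": "1st", "2": "2nd", "3": "3rd"}  # 1st/2nd are assumed


def expand_case(tag: str) -> str:
    return CASES[tag]


def expand_gender(tag: str) -> str:
    return GENDERS[tag]


def expand_person_number(tag: str) -> str:
    """'3S' -> '3rd Person Singular'."""
    person, number = tag[:-1], tag[-1]
    return f"{ORDINALS[person]} Person {NUMBERS[number]}"


def expand_verb_class(tag: str) -> str:
    """'10A' -> '10 Atmanepada' (the tag shown after the verb root)."""
    verb_class, pada = tag[:-1], tag[-1]
    if not verb_class.isdigit():
        raise ValueError(f"not a conjugation class tag: {tag!r}")
    return f"{verb_class} {PADAS[pada]}"


print(expand_case("INS"), "|", expand_person_number("3S"), "|",
      expand_gender("m"), "|", expand_verb_class("10A"))
```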

F. As seen from S1-S6 in the above example, Paninian rules relating to ‘euphonic combinations’ are shown in the gloss below each translated sentence, to make the combinations easy to decode. Work is currently underway to handle more of the Morphological rules and exceptions dealt with in Panini’s Ashtadhyayi. The Ashtadhyayi is a work of genius, best described as a ‘meta-language’ to describe Sanskrit. It has its own syntax and semantics, distilled from a study of extant texts c. 500 BCE. However, many of these rules and exceptions apply only to a relatively small number of terms in the lexicon. These relatively rare terms can be replaced in the lexicon with simpler, but semantically near-equivalent, alternatives. For instance, an entry in the lexicon for the verb ‘speak’ can contain the regular verb root ‘वद्’, instead of the verb roots ‘ब्रू’ or ‘वच्’ (which belong to verb class 2, and have irregular conjugations). These rare terms (and the associated Paninian rules) must nevertheless be handled correctly by the Sanskrit to English Machine Translation software, which is currently under development (the critical component of sandhi-splitting is at an advanced stage of development).