Sandhi Statistics

Fri, 11/09/2018 - 14:46 — antardhvani

Presented below are the results of an automated sandhi analysis of Chapter 2 of the Srimad Bhagavad Gita. All the sandhi sutras applicable to Chapter 2, as well as one example of each sutra, are listed on the next page.

As discussed in the previous section, a grammatical analysis of a Sanskrit sentence or stanza must analyze every word in a sentence, as Paninian sandhi sutras transform (or choose not to transform) underlying terms (declensions, conjugations, and indeclinables) into 'single words' as well as 'combination words'. Hence, we use the broader term 'sandhi analysis' instead of 'sandhi splitting'.

#	Description	INPUT	OUTPUT: CORRECT	OUTPUT: FALSE NEGATIVE	OUTPUT: TOTAL	OUTPUT: %	FALSE POSITIVE
A	Combination words (two or more terms)	190	463	2	465	48.2	2
B	Changed single-word terms	171	169	2	171	17.7	4
C	Unchanged Vocatives / Special terms	80	80		80	8.3
D	Unchanged non-special single-word terms	248	246	2	248	25.7
	TOTAL	689	958	6	964	100.0	6
	Errors			0.6%			0.6%

The columns labelled 'False Negative' and 'False Positive' are explained in the Sandhi Statistics of Chapter 1.

As the table above shows, the number of errors in the 'sandhi analysis' is small (False Negatives and False Positives are each about 0.6% of the output terms). Most of the errors are due to inherent ambiguity, and some of them can be rectified by a syntactic parser that is run as the next step in the grammatical analysis. Four of the six errors in each column are attributed to 'single-word' terms.
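The percentages in the table can be re-derived from its counts. The following is a minimal sketch, assuming that the empty False Positive cells of rows C and D stand for zero and that the TOTAL column is the sum of the CORRECT and FALSE NEGATIVE columns; the variable names are illustrative only.

```python
# Rows of the table above: (label, input, correct, false negatives, false positives).
# Empty cells in the table are treated as 0 here (an assumption).
rows = [
    ("A", 190, 463, 2, 2),
    ("B", 171, 169, 2, 4),
    ("C", 80, 80, 0, 0),
    ("D", 248, 246, 2, 0),
]

total_input = sum(r[1] for r in rows)
total_correct = sum(r[2] for r in rows)
total_fn = sum(r[3] for r in rows)
total_fp = sum(r[4] for r in rows)

# The TOTAL column counts correct terms plus false negatives.
total_output = total_correct + total_fn


def pct(part, whole):
    """Percentage rounded to one decimal place, as printed in the table."""
    return round(100.0 * part / whole, 1)


# Share of each row in the output terms (the '%' column).
shares = {r[0]: pct(r[2] + r[3], total_output) for r in rows}

# Error rates shown in the 'Errors' row.
fn_rate = pct(total_fn, total_output)
fp_rate = pct(total_fp, total_output)
```

Both error rates come out at 0.6% of the 964 output terms, matching the 'Errors' row.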

The following is a summary of the False Negatives and False Positives counted in the table above.

Stanza	False Negatives	False Positives
2.19	yas	ye
2.21	yas	ye
2.39	yuktas	ayuktas
2.41	vyavasAyAtmikA	vyavasAyAtmikAs
2.44	vyavasAyAtmikA	vyavasAyAtmikAs
2.53	achalA	achalAs

NOTE: Our syntactic parser does rectify 5 of the above 6 defects from the preceding 'sandhi analysis' stage, provided the ambiguous terms are marked as such (e.g., 'yas/ye'). The exception is the semantic defect in Stanza 2.39 ('yuktas' vs. 'ayuktas'). The syntactic parser is run after the 'sandhi analyzer' identifies the underlying words for each stanza, i.e., the input to the syntactic parser is the output of the 'sandhi analyzer'. As mentioned in Sandhi Analysis of Chapter 1, the output of the 'sandhi analysis' stage was edited to mark all the above ambiguous terms; in future, this needs to be done programmatically by the 'sandhi analysis' software. When the syntactic parser processes a term marked as ambiguous (such as 'yas/ye'), it uses syntactic constraints (e.g., feature agreement between subject and verb) to choose the most appropriate term (see Parsing Statistics for Chapter 2 for the results of the syntactic parser). The task of the 'sandhi analyzer' is limited to determining which Paninian sandhi sutras may have been applied to which underlying terms to produce a specific 'combination term' seen in its input.
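The marking of ambiguous terms described in this note can be pictured with a short sketch. It is not the authors' software: the function names `mark_ambiguous` and `read_term` and the list-of-strings representation are assumptions made for illustration, with only the slash-separated marker taken from the text.

```python
def mark_ambiguous(alternatives):
    """Join candidate underlying terms into one marker such as 'yas/ye'."""
    return "/".join(alternatives)


def read_term(term):
    """Return the alternatives encoded in a term (one item if unambiguous).

    Whitespace around the slash is ignored.
    """
    return [alt.strip() for alt in term.split("/") if alt.strip()]


# Hypothetical sandhi-analyzer output for the opening of Stanza 2.19.
analyzer_output = [mark_ambiguous(["yas", "ye"]), "enam", "vetti", "hantAram"]

# The parser receives, for every position, the list of terms it may choose from.
parser_input = [read_term(term) for term in analyzer_output]

# Positions that the parser still has to resolve.
ambiguous_positions = [i for i, alts in enumerate(parser_input) if len(alts) > 1]
```

Keeping every alternative in the analyzer output defers the decision to the stage that has the syntactic information needed to make it.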

A. Row (A) of the table above shows the result of splitting 190 'combination words', of which 2 were split incorrectly (in part). Both were cases of ambiguity. In one of them, there were alternative valid ways to split the combination, and the 'sandhi analyzer' chose a split that was valid but not the correct one. The two cases are presented below:

The first 'combination word' ('buddhyAyukto') to be split incorrectly was in Stanza 2.39. The 'sandhi analyzer' faced two valid alternatives ('buddhyA+ayuktas' vs. 'buddhyA+yuktas') and chose the incorrect one. Notice that one alternative ('ayuktas: not endowed') is the negation of the other ('yuktas: endowed'). Choosing the correct alternative ('yuktas') requires a semantic understanding of the sentence, and cannot be handled by a syntactic parser.

Stanza 2.39

एषा तेऽभिहिता सांख्ये बुद्धिर्योगे त्विमां श्रृणु।
बुद्ध्यायुक्तो यया पार्थ कर्मबन्धं प्रहास्यसि

Printed form: eSHA te'bhihitA sANkhye buddhiryoge tvimAM shruNNu , buddhyAyukto yayA pArtha karmabandhaM prahAsyasi .
Underlying: eSHA te abhihitA sANkhye buddhis yoge tu imAm shruNNu , buddhyA yuktas yayA pArtha karmabandham prahAsyasi .
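The two valid alternatives for 'buddhyAyukto' can be reproduced with a small sketch. It assumes a toy lexicon of three underlying terms, takes the word with its final '-as' already restored ('buddhyAyuktas'), and models only one reverse rule: a long 'A' at a boundary may be an unchanged 'A' or the merger of 'A' with a following initial 'a'. The function name `candidate_splits` is hypothetical, and a real analyzer would cover many more sutras.

```python
# Toy lexicon of underlying terms (an assumption for this sketch).
LEXICON = {"buddhyA", "yuktas", "ayuktas"}


def candidate_splits(word, lexicon=LEXICON):
    """List the two-term splits of `word` at each long 'A'.

    At a boundary, a printed 'A' is read either as 'A' alone (no vowel
    merger) or as 'A' merged with an initial 'a' of the next term.
    """
    splits = []
    for i, ch in enumerate(word):
        if ch != "A":
            continue
        left = word[: i + 1]
        rest = word[i + 1:]
        for right in (rest, "a" + rest):
            if left in lexicon and right in lexicon:
                splits.append((left, right))
    return splits


alternatives = candidate_splits("buddhyAyuktas")
```

Both splits pass the lexicon check, so nothing at this stage separates 'yuktas' from 'ayuktas'.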

The second 'combination word' ('samAdhAvachalA') was split correctly, but one of the resulting underlying terms was wrongly assumed to have undergone a further transformation by a chain of sandhi sutras. In Stanza 2.53, 'samAdhAvachalA' was at first correctly split into 'samAdhO achalA'. After that, due to syntactic ambiguity, the term 'achalA' was wrongly assumed to have been derived from the underlying term 'achalAs' by the application of the 8.3.17-bho bhago agho apUrvasya yo'shi sutra (with the 8.3.19-lopaHa shAkalyasya elision sutra). In this case, however, the underlying term was actually 'achalA'. This defect can be rectified during syntactic parsing (the next step in the grammatical analysis of the stanza), because the plural number of 'achalAs' is unexpected in the local context, which calls for the singular form 'achalA'. We marked this term as unresolved ('achalAs/achalA'), after which the syntactic parser successfully identified the correct term 'achalA' based on the local context. This problem is very similar to the problems faced with 'single-words' discussed in (B) below.

Stanza 2.53

श्रुतिविप्रतिपन्ना ते यदा स्थास्यति निश्चला
समाधावचला बुद्धिस्तदा योगमवाप्स्यसि

Printed form: shrutivipratipannA te yadA sthAsyati nishchalA , samAdhAvachalA buddhistadA yogamavApsyasi .
Underlying: shrutivipratipannA te yadA sthAsyati nishchalA , samAdhO achalA buddhis tadA yogam avApsyasi .
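The ambiguity of a printed term ending in long 'A' can be sketched as a candidate generator. This is a simplification under stated assumptions: it ignores the following sound, which the elision sutras depend on, and the function name `final_A_candidates` is hypothetical.

```python
def final_A_candidates(printed):
    """Candidate underlying terms for a printed term.

    A term ending in long 'A' may already be in its underlying form, or
    may have lost a final 's' through the elision described above. The
    conditioning context (the next sound) is not checked in this sketch.
    """
    if printed.endswith("A"):
        return [printed, printed + "s"]
    return [printed]


# 'achalA' yields both readings; the slash-separated marker keeps them together.
candidates = final_A_candidates("achalA")
marked = "/".join(candidates)
```

A term that does not end in 'A', such as 'yogam', is returned unchanged as its only candidate.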

The above 2 'combination words' account for 2 False Positives ('ayuktaHa' and 'achalAHa') and 2 False Negatives ('yuktaHa' and 'achalA') shown in the table above.

B. Rows (B), (C), and (D) must be read together to understand the remaining errors in the sandhi analysis. These rows account for the remaining 4 False Positives and 4 False Negatives. As mentioned in the previous section, single-word terms are hard to get right because of inherent syntactic ambiguity. The nature of these errors is discussed below:

In Stanza 2.41, the word 'vyavasAyAtmikA' was to have been treated as an unchanged word (i.e. already in its underlying form). However, the sandhi analysis wrongly assumed, due to syntactic ambiguity, that the term was derived from the underlying term 'vyavasAyAtmikAs' by the application of the 8.3.17-bho bhago agho apUrvasya yo'shi sutra (with the 8.3.22-hali sarveSHAm elision sutra).

Stanza 2.41

व्यवसायात्मिका बुद्धिरेकेह कुरुनन्दन
बहुशाखा ह्यनन्ताश्च बुद्धयोऽव्यवसायिनाम्

Printed form: vyavasAyAtmikA buddhirekeha kurunandana , bahushAkhA hyanantAshcha buddhayo'vyavasAyinAm .
Underlying: vyavasAyAtmikAs buddhis ekA iha kurunandana , bahushAkhAs hi anantAs cha buddhayas avyavasAyinAm .

In Stanza 2.44, the word 'vyavasAyAtmikA' was likewise to have been treated as an unchanged word. The sandhi analysis faced the same situation as in the above stanza and made the same error.

Stanza 2.44

भोगैश्वर्यप्रसक्तानां तयापहृतचेतसाम्
व्यवसायात्मिका बुद्धिः समाधौ न विधीयते

Printed form: bhogEshvaryaprasaktAnAM tayApahRutachetasAm , vyavasAyAtmikA buddhiHa samAdhO na vidhIyate .
Underlying: bhogEshvaryaprasaktAnAm tayA apahRutachetasAm , vyavasAyAtmikAs buddhis samAdhO na vidhIyate .

In Stanza 2.19, 'ya' was wrongly assumed to have been derived from 'ye' instead of from 'yas'. In other words, the sandhi analysis should have assumed the application of the 8.3.17-bho bhago agho apUrvasya yo'shi sutra (with the 8.3.19-lopaHa shAkalyasya elision sutra) instead of the 6.1.78-echo'yavAyAvaHa sutra (with the 8.3.19-lopaHa shAkalyasya elision sutra). A syntactic parser can rectify this error because 'yas' and 'ye' differ in number ('he who' vs. 'those who' or 'those two who'), and the parser cannot obtain agreement with the 3rd Person Singular verb 'vetti' (Class 2 root 'vid' -- 'to know') if it chooses 'ye' (either the dual or the plural form). We mark this term as an unresolved 'yas/ye' and leave it to the parser to choose between the two alternatives based on other syntactic cues. In this case, the parser was able to resolve the ambiguity successfully, choosing 'yas'.

Stanza 2.19

य एनं वेत्ति हन्तारं यश्चैनं मन्यते हतम्
उभौ तौ न विजानीतो नायं हन्ति न हन्यते

Printed form: ya enaM vetti hantAraM yashchEnaM manyate hatam , ubhO tO na vijAnIto nAyaM hanti na hanyate .
Underlying: yas enam vetti hantAram yas cha enam manyate hatam , ubhO tO na vijAnItas na ayam hanti na hanyate .
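The number-agreement check described for Stanza 2.19 can be sketched as follows. The feature tables and the function name `resolve` are assumptions for illustration and are not the parser's actual data structures; only the forms and their numbers are taken from the discussion above.

```python
# Toy feature tables (assumptions): grammatical number of each form.
PRONOUN_NUMBER = {
    "yas": {"singular"},
    "ye": {"dual", "plural"},
}
VERB_NUMBER = {"vetti": "singular"}


def resolve(marked_term, verb):
    """Pick the alternative whose number agrees with the verb.

    Returns the marked term unchanged when agreement does not single out
    exactly one alternative.
    """
    alternatives = [alt.strip() for alt in marked_term.split("/")]
    wanted = VERB_NUMBER[verb]
    agreeing = [alt for alt in alternatives if wanted in PRONOUN_NUMBER[alt]]
    return agreeing[0] if len(agreeing) == 1 else marked_term


chosen = resolve("yas/ye", "vetti")
```

Here only 'yas' agrees with the singular verb 'vetti', so it is the term selected.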

In Stanza 2.21, the word 'ya' was wrongly assumed to have been derived from 'ye' instead of 'yas'. The sandhi analysis faced the same situation as in the above stanza and made the same error. Once again, we mark this term as an unresolved 'yas/ye' and leave it to the parser to choose between the two alternatives based on other syntactic cues. In this case too, the parser was able to resolve the ambiguity successfully, choosing 'yas'.

Stanza 2.21

वेदाविनाशिनं नित्यं य एनमजमव्ययम्
कथं स पुरुषः पार्थ कं घातयति हन्ति कम्

Printed form: vedAvinAshinaM nityaM ya enamajamavyayam , kathaM sa puruSHaHa pArtha kaM ghAtayati hanti kam .
Underlying: veda avinAshinam nityam yas enam ajam avyayam , katham sas puruSHas pArtha kam ghAtayati hanti kam .

It must be noted that most of the ambiguity in sandhi analysis discussed above is also faced by the human reader, who combines sandhi analysis with grammatical analysis to identify the underlying terms. This process requires a sound understanding not only of sandhi sutras but, more importantly, of grammatical analysis. We follow a similar two-stage process, i.e., sandhi analysis followed by grammatical analysis.
