Sandhi Statistics

Presented below are the results of an automated sandhi analysis of Chapter 3 of the Srimad Bhagavad Gita. All the sandhi sutras applicable to Chapter 3, as well as one example of each sutra, are listed on the next page.

As discussed in the Sandhi Statistics of Chapter 1, a grammatical analysis of a Sanskrit sentence or stanza must analyze every word in it, because Paninian sandhi sutras transform (or choose not to transform) underlying terms (declensions, conjugations, and indeclinables) into 'single words' as well as 'combination words'. Hence, we use the broader term 'sandhi analysis' instead of 'sandhi splitting'.

#       Description                              INPUT  OUTPUT
                                                        CORRECT  FALSE NEGATIVE  TOTAL  %      FALSE POSITIVE
A       Combination words (two or more terms)    118    282      4               286    51.8   4
B       Changed single-word terms                107    104      3               107    19.4   5
C       Unchanged Vocatives / Special terms      45     42       3               45     8.2
D       Unchanged non-special single-word terms  114    114                      114    20.7   1
TOTAL                                            384    542      10              552    100.0  10
Errors                                                           1.8%                          1.8%

The columns labelled 'False Negative' and 'False Positive' are explained in the Sandhi Statistics of Chapter 1.

As the table above shows, the error rate of the 'sandhi analysis' is reasonably small: False Negatives and False Positives are each below 2%, although both are higher than in the preceding Chapters. Most of the errors are due to inherent ambiguity, and some of them can be rectified by a syntactic parser that is run as the next step in the grammatical analysis. A majority of the errors involve 'single-word' terms.
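The percentages and error rates quoted above can be re-derived from the table with a few lines of arithmetic. The following is only a sketch: the row values are copied from the table, blank cells are assumed to mean zero, and the names `rows` and `summarize` are illustrative rather than part of the analyzer.

```python
# Row values copied from the table above:
# (input words, correct output terms, false negatives, false positives).
# Blank cells in the table are assumed to be zero.
rows = {
    "A": (118, 282, 4, 4),
    "B": (107, 104, 3, 5),
    "C": (45, 42, 3, 0),
    "D": (114, 114, 0, 1),
}

def summarize(rows):
    """Return per-row output totals, the grand total, percentage shares, and error rates."""
    totals = {key: correct + fn for key, (_, correct, fn, _) in rows.items()}
    grand_total = sum(totals.values())
    shares = {key: round(100 * t / grand_total, 1) for key, t in totals.items()}
    fn_rate = round(100 * sum(r[2] for r in rows.values()) / grand_total, 1)
    fp_rate = round(100 * sum(r[3] for r in rows.values()) / grand_total, 1)
    return totals, grand_total, shares, fn_rate, fp_rate

totals, grand_total, shares, fn_rate, fp_rate = summarize(rows)
print(grand_total, shares, fn_rate, fp_rate)
```

Each output total is the sum of the correct terms and the False Negatives in that row, which is how 282 + 4 gives the 286 terms reported for Row A.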

The following is a summary of the False Negatives and False Positives counted in the table above.

Stanza  False Negatives     False Positives
3.1     matA                matAs
3.3     dvividhA, proktA    dvividhAs, proktAs
3.4     anArambhAt          anArambhAn
3.6     yas, sas            ye, se
3.11    bhAvayata           bhAvayatA
3.19    param               parama
3.22    varte               varta
3.42    parA                parAs

NOTE: We can confirm that our syntactic parser does indeed rectify 9 of the above 10 defects from the preceding 'sandhi analysis' stage, provided the ambiguous terms are marked as such (e.g., 'yas/ye'). The exception is the 'varta/varte' defect in Stanza 3.22, which required semantic understanding. Note that the syntactic parser is run after the 'sandhi analyzer' identifies the underlying words for each stanza (i.e. the input to the syntactic parser is the output of the 'sandhi analyzer'). As mentioned in Sandhi Analysis of Chapter 1, the output of the 'sandhi analysis' stage was edited to mark all the above ambiguous terms (of course, this needs to be done programmatically by the 'sandhi analysis' software in future).

When the syntactic parser processes a term marked as ambiguous (such as 'yas/ye'), it uses syntactic constraints (e.g., feature agreement between subject and verb) to figure out the most appropriate term (see Parsing Statistics for Chapter 3 for the results of the syntactic parser). The task of the 'sandhi analyzer' is limited to figuring out which Paninian sandhi sutras may have been applied to which underlying terms to produce a specific combination term seen in the input.
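As a rough illustration of how a marked term such as 'yas/ye' might be resolved by feature agreement, consider the sketch below. It is not the actual parser: the `FEATURES` lexicon, the `resolve` function, and the choice of 'Aste' as the governing verb are all assumptions made for this example, and only grammatical number is checked.

```python
# Toy feature lexicon (hypothetical): only grammatical number is recorded.
FEATURES = {
    "yas": {"number": "singular"},
    "ye": {"number": "plural"},
    "Aste": {"number": "singular"},  # assumed governing verb for this example
}

def resolve(marked_term, verb):
    """Keep the candidates of a marked term (e.g. 'yas/ye') that agree with the verb in number."""
    candidates = [c.strip() for c in marked_term.split("/")]
    wanted = FEATURES[verb]["number"]
    return [c for c in candidates if FEATURES[c]["number"] == wanted]

print(resolve("yas/ye", "Aste"))
```

A real parser would check more features (person, gender, case) and could still be left with more than one candidate, as in the 'varta/varte' case.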


A. Row (A) above shows that 118 'combination words' were split into 286 underlying terms. Of these 286 underlying terms, 4 were incorrect (matching False Negatives and False Positives).

In Stanza 3.3, 'loke'smindvividhA' was correctly split into 'loke asmin dvividhA', but the term 'dvividhA' was wrongly assumed to have been derived from the underlying term 'dvividhAs' by the application of the 8.3.17-bho bhago agho apUrvasya yo'shi sutra (with the 8.3.22-hali sarveSHAm elision sutra). However, in this case, the underlying term was actually 'dvividhA'. This error has a chance of being rectified during syntactic parsing (the next step in the grammatical analysis of the stanza), because the plural number of 'dvividhAs' is unexpected where the singular form 'dvividhA' is required.

In Stanza 3.4, we see another case of the syntactic ambiguity that was discussed in Chapter 1 (Stanza 1.39). In this stanza, 'karmaNNAmanArambhAnnESHkarmyaM' was split into 'karmaNNAm anArambhAn nESHkarmyam', whereas the middle term should have been 'anArambhAt'. As in Chapter 1, it was not clear to the sandhi analyzer whether the 8.4.45 sutra (8.4.45-yaro'nunAsike'nunAsiko vA) had transformed an underlying 'anArambhAt' into 'anArambhAn', or whether the underlying term was an unchanged 'anArambhAn' in the first place. Here, it assumed an unchanged underlying term 'anArambhAn', which turned out to be the wrong decision. Please note that the term 'anArambhAn' is a valid declined form in itself.
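The choice faced by the analyzer in this stanza can be sketched as a candidate generator. This is a simplified illustration, not the analyzer's code: the function name `candidates_before_nasal` is hypothetical, and only the 't' to 'n' case described above is modelled.

```python
NASALS = set("nm")  # simplified: only the nasals needed for this example

def candidates_before_nasal(printed, next_term):
    """List the underlying candidates for a printed term that ends in 'n' before a nasal."""
    options = [printed]  # the term may simply be unchanged
    if printed.endswith("n") and next_term[:1] in NASALS:
        # 8.4.45 may have turned an underlying final 't' into 'n'
        options.append(printed[:-1] + "t")
    return options

print(candidates_before_nasal("anArambhAn", "nESHkarmyam"))
```

Because both candidates are valid forms, the generator alone cannot pick between them; marking the result as 'anArambhAn/anArambhAt' would leave the decision to the parser.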

In Stanza 3.11, 'devAnbhAvayatAnena' was split into the underlying terms 'devAn bhAvayatA anena', whereas the middle underlying term should have been 'bhAvayata'. This is another case of syntactic ambiguity, where there are multiple valid ways of splitting a 'combination word' and the sandhi analyzer chose the wrong one. It is possible that this ambiguity can be fixed by the parser if it is provided with the unresolved 'bhAvayatA/bhAvayata' as input. However, this is difficult because INS-S 'bhAvayatA' is a Nominal (derived by the kRudanta shatRu process from the verb 'bhU:1:P:to be'), whereas 'bhAvayata' is a verb (Imperative+Causative).

In Stanza 3.19, 'paramApnoti' was split into the underlying terms 'parama Apnoti', whereas the first underlying term should have been 'param'. This is another case of syntactic ambiguity, where there are multiple valid ways of splitting a 'combination word' and the sandhi analyzer chose the wrong one. This ambiguity can be fixed by the parser: it will be expecting an Accusative term for the transitive verb 'Apnoti' (i.e. 'param'), but will instead see the Vocative term 'parama'.
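Both Stanza 3.11 and Stanza 3.19 involve a long 'A' at the junction of two terms, which can be undone in more than one way. The sketch below enumerates the possible splits and keeps those whose halves are known forms. The function names and the `TOY_LEXICON` are assumptions for illustration; a real analyzer would consult a full morphological lexicon.

```python
def split_at_long_A(word, pos):
    """Enumerate (left, right) splits for a long 'A' found at index `pos` of a combination word."""
    if word[pos] != "A":
        raise ValueError("expected a long 'A' at the given position")
    left, rest = word[:pos], word[pos + 1:]
    return [
        (left + "A", "a" + rest),  # A + a merged into A
        (left + "a", "a" + rest),  # a + a merged into A
        (left + "a", "A" + rest),  # a + A merged into A
        (left + "A", "A" + rest),  # A + A merged into A
        (left, "A" + rest),        # consonant-final term followed by A, no merge
    ]

# Hypothetical lexicon of valid forms, limited to the terms discussed above.
TOY_LEXICON = {"bhAvayatA", "bhAvayata", "anena", "parama", "param", "Apnoti"}

def valid_splits(word, pos):
    """Keep only the splits whose two halves are both in the toy lexicon."""
    return [(l, r) for l, r in split_at_long_A(word, pos)
            if l in TOY_LEXICON and r in TOY_LEXICON]

print(valid_splits("bhAvayatAnena", 8))
print(valid_splits("paramApnoti", 5))
```

In both cases two splits survive the lexicon check, which is why the choice has to be deferred to the syntactic parser.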


B. In Row (B) of the table above, we see that there were 5 False Positives shown as changed single-word underlying terms. Of these, in 3 stanzas, the underlying term should have been an unchanged single-word term (i.e. should have been classified under Row C). These include 'matA' (Stanza 3.1), 'proktA' (Stanza 3.3), and 'parA' (Stanza 3.42). In these stanzas, the sandhi analyzer wrongly assumed the printed form was derived from the underlying term by the application of the 8.3.17-bho bhago agho apUrvasya yo'shi sutra (with the 8.3.22-hali sarveSHAm elision sutra, or the 8.3.19-lopaHa shAkalyasya elision sutra as the case may be).

The remaining 2 False Positives in Row B occur in Stanza 3.6, where the printed forms 'ya' and 'sa' were assumed to have been derived from the underlying terms 'ye' and 'se' respectively by the application of the 6.1.78-echo'yavAyAvaHa sutra (with the 8.3.19-lopaHa shAkalyasya elision sutra). However, they should instead have been derived from the underlying terms 'yas' and 'sas' by the application of the 8.3.17-bho bhago agho apUrvasya yo'shi sutra (with the 8.3.19-lopaHa shAkalyasya elision sutra). This form of syntactic ambiguity has been seen earlier in Chapter 1.


C. In Row (C) of the table above, all 3 False Negatives were due to the ambiguity discussed in point B (where the corresponding wrong terms are classified as False Positives). These are the underlying terms 'matA' (Stanza 3.1), 'proktA' (Stanza 3.3), and 'parA' (Stanza 3.42), which were wrongly assumed to have been derived by the application of the 8.3.17-bho bhago agho apUrvasya yo'shi sutra (with the 8.3.22-hali sarveSHAm elision sutra, or the 8.3.19-lopaHa shAkalyasya elision sutra as the case may be).


D. In Row (D) of the table above, there was 1 False Positive in Stanza 3.22, where the printed form 'varta' was assumed to be unchanged, whereas it was actually derived from the underlying term 'varte' by the application of the 6.1.78-echo'yavAyAvaHa sutra (with the 8.3.19-lopaHa shAkalyasya elision sutra).


A closer look at the errors in this Chapter (and in previous Chapters) indicates that most of the errors are due to the assumption of an elision sutra (either the 8.3.22-hali sarveSHAm elision sutra, or the 8.3.19-lopaHa shAkalyasya elision sutra) in the derivation (from 'printed form' to underlying term). The application of these two elision sutras results in ambiguous 'printed forms' that cannot be easily traced back to their underlying terms without a complete grammatical analysis.
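The ambiguity introduced by the elision sutras can be illustrated by listing, for a printed form, every underlying term named in the discussion above. This is a deliberately simplified sketch: `elision_candidates` is a hypothetical helper, it ignores the following term (which in practice determines whether a sutra can apply), and it covers only the derivations described for this Chapter.

```python
def elision_candidates(printed):
    """List (underlying term, derivation) pairs that could surface as `printed`."""
    options = [(printed, "unchanged")]
    if printed.endswith("A"):
        # e.g. 'matAs' printed as 'matA'
        options.append((printed + "s", "8.3.17 with 8.3.22 or 8.3.19"))
    elif printed.endswith("a"):
        # e.g. 'yas' printed as 'ya'
        options.append((printed + "s", "8.3.17 with 8.3.19"))
        # e.g. 'ye' or 'varte' printed as 'ya' or 'varta'
        options.append((printed[:-1] + "e", "6.1.78 with 8.3.19"))
    return options

for form in ("ya", "matA", "varta"):
    print(form, [term for term, _ in elision_candidates(form)])
```

Every printed form ending in 'a' or 'A' yields at least two candidates here, which matches the observation that such forms cannot be traced back to a single underlying term by the sandhi analyzer alone.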