Parsing challenges

Natural language understanding (NLU) is one of the major research areas in artificial intelligence. However, even after decades of work, the first step in NLU, namely parsing text, remains a major stumbling block. The following example shows the difficulty a parser faces with two similar-looking sentences that have very different underlying structures:

  • John is eager to please ... [i.e. John is eager (for John) to please someone] [1][2]
  • John is easy to please ... [i.e. It is easy (for someone) to please John] [1][2]

Most parsers will produce identical structures for the above two sentences, because they cannot distinguish between 'eager' and 'easy'.
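
One way to picture the distinction a parser would need is a lexical feature on the adjective that says which argument of the embedded verb the sentence subject is understood as. The following is a minimal sketch under that assumption; the `Adjective` class, the `links_subject_to` feature and the `interpret` function are hypothetical names introduced for illustration, not part of any existing parser.

```python
from dataclasses import dataclass


@dataclass
class Adjective:
    form: str
    # Hypothetical feature: which argument of the embedded verb the
    # sentence subject is understood as ("subject" or "object").
    links_subject_to: str


# Toy lexicon with only the two adjectives from the example sentences.
ADJECTIVES = {
    "eager": Adjective("eager", links_subject_to="subject"),
    "easy": Adjective("easy", links_subject_to="object"),
}


def interpret(subject: str, adjective: str, verb: str) -> dict:
    """Return the understood subject and object of the embedded verb."""
    entry = ADJECTIVES[adjective]
    if entry.links_subject_to == "subject":
        # "John is eager (for John) to please someone"
        return {"verb": verb, "embedded_subject": subject, "embedded_object": "someone"}
    # "It is easy (for someone) to please John"
    return {"verb": verb, "embedded_subject": "someone", "embedded_object": subject}


print(interpret("John", "eager", "please"))
print(interpret("John", "easy", "please"))
```

The two calls receive the same word sequence apart from the adjective, and differ only because the lexicon entry for each adjective carries a different feature value.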

    Some linguistic theories, such as the Minimalist Program (see, e.g., [3], [4]), hold that parsing can be compared to assembling a jigsaw puzzle whose odd-shaped pieces fit together only in very specific ways. The Lexicon holds all the words that people learn, along with their usage, exceptions and specific features (i.e. the odd-shaped pieces).

    Nouns in the Lexicon have a bundle of features, including animacy, abstractness, gender and number.

    Verbs have a bundle of features such as number and gender, in addition to selectional restrictions and theta roles that constrain their arguments. For instance, selectional restrictions on the verb 'sleep' may specify that it can only have an animate, non-abstract subject. Theta roles specify the role played by each argument of the verb (e.g. an Agent role for the semantic subject of the transitive verb 'kick').
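
The feature bundles described above can be sketched as a small data structure. This is an illustrative toy, not a real lexicon: the class names, the sample nouns, and the functions `subject_is_licensed` and `assign_roles` are hypothetical. The 'sleep' restriction and the Agent role for 'kick' come from the text; the other role labels (Experiencer, Patient) are assumptions made to complete the example.

```python
from dataclasses import dataclass, field


@dataclass
class Noun:
    form: str
    animate: bool
    abstract: bool


@dataclass
class Verb:
    form: str
    theta_roles: tuple  # one role per argument, in argument order
    # Required feature values for the subject noun (empty = unrestricted).
    subject_restrictions: dict = field(default_factory=dict)


# Toy entries chosen for illustration only.
NOUNS = {
    "dog": Noun("dog", animate=True, abstract=False),
    "idea": Noun("idea", animate=False, abstract=True),
    "ball": Noun("ball", animate=False, abstract=False),
}

VERBS = {
    # 'sleep' takes an animate, non-abstract subject (from the text);
    # the Experiencer label is an assumption.
    "sleep": Verb("sleep", ("Experiencer",), {"animate": True, "abstract": False}),
    # Agent for the subject of 'kick' is from the text; Patient is assumed.
    # No selectional restrictions are modelled for 'kick' here.
    "kick": Verb("kick", ("Agent", "Patient")),
}


def subject_is_licensed(verb: str, subject: str) -> bool:
    """Check the subject noun against the verb's selectional restrictions."""
    noun = NOUNS[subject]
    restrictions = VERBS[verb].subject_restrictions
    return all(getattr(noun, feat) == value for feat, value in restrictions.items())


def assign_roles(verb: str, arguments: list) -> dict:
    """Pair each argument with the theta role in the same position."""
    roles = VERBS[verb].theta_roles
    if len(roles) != len(arguments):
        raise ValueError(f"'{verb}' expects {len(roles)} argument(s)")
    return dict(zip(roles, arguments))


print(subject_is_licensed("sleep", "dog"))   # True
print(subject_is_licensed("sleep", "idea"))  # False
print(assign_roles("kick", ["dog", "ball"]))
```

In this sketch a mismatch between a noun's features and a verb's restrictions plays the part of two puzzle pieces that do not fit together.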

    However, the Lexicon of each language contains tens of thousands of words, so it may be impossible to capture comprehensively all the different features of each entry.


    References

    1. [Chomsky1964] Chomsky N. Current Issues in Linguistics. The Hague, The Netherlands: Mouton de Gruyter; 1964.
    2. [FodorGarrett1967] Fodor J., Garrett M. Some syntactic determinants of sentential complexity. Perception & Psychophysics. 1967;2.
    3. [Chomsky1995] Chomsky N. The Minimalist Program. Cambridge, MA: MIT Press; 1995.
    4. [Chomsky2000] Chomsky N. New horizons in the study of language and mind. Cambridge, UK: Cambridge University Press; 2000.