Argument Structure

Subcategorization restrictions [1] describe each verb in terms of the categories of its arguments (both mandatory and optional). For example, the verb 'kick' may specify that it requires a Noun Phrase (NP) as its Direct Object and another Noun Phrase as its Subject. If the parser were informed about the subcategorization requirements of a verb, it could rule out sentence 1 below while accepting sentence 2.

  • *1. That boy kicked on the ball. (contains a Prepositional Phrase 'on the ball' in the Direct Object position, where a Noun Phrase was expected)
  • 2. That boy kicked the ball.

    However, things are rarely so simple in the real world. For instance, sentences 3 and 4 show that there may be cases where the verb 'kick' apparently takes a Prepositional Phrase (PP) as its Direct Object (i.e. a different subcategorization frame with a PP as the Direct Object).

  • 3. The cops kicked down the door.
  • 4. He kicked up a storm.

    The parser must now evaluate two subcategorization frames for the verb 'kick', and it can no longer rule out sentence 1 on these grounds (as the verb 'kick' licenses a PP Direct Object). Both subcategorization frames must be evaluated before the parser can conclude that a sentence must be rejected, and the parser must look for some other method to rule out sentence 1.
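The effect of the second frame can be sketched with a toy lexicon. This is a minimal illustration, not a real parser: the names `ONE_FRAME`, `TWO_FRAMES`, and `licensed` are hypothetical, and a frame is assumed to be nothing more than the tuple of complement categories that follow the verb.

```python
# Toy lexicon: each verb maps to a set of subcategorization frames.
# A frame is the tuple of complement categories that follow the verb.
# All entries are hypothetical and exist only for this illustration.
ONE_FRAME = {"kick": {("NP",)}}
TWO_FRAMES = {"kick": {("NP",), ("PP",)}}


def licensed(lexicon, verb, complements):
    """Return True if any frame listed for `verb` matches `complements`."""
    return tuple(complements) in lexicon.get(verb, set())


sentence_1 = ("kick", ["PP"])  # *That boy kicked on the ball.
sentence_2 = ("kick", ["NP"])  # That boy kicked the ball.

for name, lexicon in (("one frame", ONE_FRAME), ("two frames", TWO_FRAMES)):
    print(name, licensed(lexicon, *sentence_1), licensed(lexicon, *sentence_2))
```

With only the NP frame, sentence 1 is rejected and sentence 2 accepted. Once the PP frame is added, both pass the check, so the frame filter alone no longer excludes sentence 1.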

    Now let us consider another possibility: a verb+particle complex (e.g. 'kicked+down') could take an NP as its argument. Such particles are frequently indistinguishable from prepositions, which adds further complexity for parsers.

    There is yet another possibility to consider: 'kick up a storm' could be an idiom (i.e. normal rules may not apply).

    The presence of a large number of alternative subcategorization frames makes parsing more complex and time-consuming, as the various alternatives must all be evaluated.
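To make the growth in alternatives concrete, the following sketch lists the analyses a parser would have to consider for a verb followed by a preposition-like word. The three tables and the function `candidate_analyses` are hypothetical, and the input is assumed to be already split into the verb and the words that follow it.

```python
# Hypothetical toy tables; none of these entries come from a real resource.
PP_FRAME_VERBS = {"kick"}                         # verbs with a V + PP frame
PARTICLE_VERBS = {("kick", "down"), ("kick", "up")}
IDIOMS = {("kick", "up", "a", "storm")}


def candidate_analyses(verb, rest):
    """List every analysis to be evaluated for `verb` followed by `rest`.

    `rest` is the list of words after the verb; its first word is assumed
    to be ambiguous between a preposition and a particle.
    """
    if not rest:
        return []
    analyses = []
    if verb in PP_FRAME_VERBS:
        analyses.append(("V + PP", verb, " ".join(rest)))
    if (verb, rest[0]) in PARTICLE_VERBS:
        analyses.append(("V+particle + NP", f"{verb}+{rest[0]}", " ".join(rest[1:])))
    if (verb, *rest) in IDIOMS:
        analyses.append(("idiom", " ".join((verb, *rest)), ""))
    return analyses


for words in (["on", "the", "ball"], ["down", "the", "door"], ["up", "a", "storm"]):
    print(words, len(candidate_analyses("kick", words)))
```

Under these assumptions, 'kicked on the ball' has one candidate analysis, 'kicked down the door' has two, and 'kicked up a storm' has three, each of which must be evaluated before the parser can accept or reject the sentence.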

    The task is further complicated by the many verbs that have both intransitive and transitive forms. For instance, the verb 'race' is used as an intransitive verb in sentence 5 and as a transitive verb in sentence 6.

  • 5. The horse raced to the finish line.
  • 6. The horse raced by an experienced jockey won the Ascot.

    Both sentences appear to have the same verb 'raced'. However, in sentence 5 'raced' is used in the active voice in the matrix clause, while in sentence 6 it is used in the passive voice inside a reduced restrictive relative clause ('The horse [that was] raced by an experienced jockey'). The availability of the intransitive form may lead a parser up the garden path, i.e. it explores the intransitive alternative first and must backtrack when it encounters 'won the Ascot'.
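The backtracking step can be sketched as follows. This is a deliberately simplified illustration: the input is assumed to be already chunked and labelled (NP, V, PP), the two readings are hypothetical patterns over those labels, and the whole sequence is matched at once, whereas a real parser would work incrementally.

```python
import re

# Readings in the order this toy parser tries them. Each pattern is matched
# against the space-joined sequence of chunk categories.
READINGS = [
    ("main-verb", re.compile(r"NP V( PP)*")),
    ("reduced-relative", re.compile(r"NP V( PP)* V( NP| PP)*")),
]


def parse(chunks):
    """Return (reading, failed_attempts) for a list of (category, text) chunks.

    `failed_attempts` counts the readings that were tried and abandoned
    before one fitted; the reading is None if no pattern fits.
    """
    categories = " ".join(category for category, _ in chunks)
    for failed_attempts, (name, pattern) in enumerate(READINGS):
        if pattern.fullmatch(categories):
            return name, failed_attempts
    return None, len(READINGS)


sentence_5 = [("NP", "The horse"), ("V", "raced"), ("PP", "to the finish line")]
sentence_6 = [("NP", "The horse"), ("V", "raced"),
              ("PP", "by an experienced jockey"), ("V", "won"), ("NP", "the Ascot")]

print(parse(sentence_5))
print(parse(sentence_6))
```

Sentence 5 fits the main-verb reading on the first attempt. Sentence 6 fails that reading because of the second verb 'won', and succeeds only after the parser falls back to the reduced-relative reading, which is the garden-path behaviour described above.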

    The VerbNet project [2] is a major effort to document subcategorization frames, selectional restrictions, and theta roles for verbs in English. Several thousand verbs have been documented in great detail, but many thousands more remain to be documented. In practice, the selectional restrictions and theta roles documented in VerbNet are very difficult to use as filters in parsers, because a near-impossible amount of detail would need to be added to the lexicon. Consider the many ways in which one could describe 'the moon': as an inanimate object, as an object in motion, as a place, as the source of a gravitational 'pull' on tides, etc. The WordNet project [3][4] is a major project that tries to capture some of the relations between entries in the lexicon.
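The lexicon problem can be illustrated with a toy selectional-restriction filter. Every feature, restriction, and name below is invented for this sketch; none of it is taken from VerbNet or WordNet, and neither resource is queried.

```python
# Hypothetical noun features and verb restrictions, for illustration only.
SPARSE_FEATURES = {"moon": {"inanimate", "concrete"}}
RICHER_FEATURES = {"moon": {"inanimate", "concrete", "in-motion", "location"}}

VERB_RESTRICTIONS = {
    "orbit": {"Theme": {"in-motion"}},
    "land on": {"Destination": {"location"}},
}


def satisfies(noun, verb, role, features):
    """Return True if `noun` carries every feature `verb` requires of `role`."""
    required = VERB_RESTRICTIONS[verb][role]
    return required <= features.get(noun, set())


print(satisfies("moon", "orbit", "Theme", SPARSE_FEATURES))
print(satisfies("moon", "orbit", "Theme", RICHER_FEATURES))
print(satisfies("moon", "land on", "Destination", RICHER_FEATURES))
```

With the sparse entry, the filter wrongly rejects 'the moon' as something that orbits. It passes only after the entry is enriched, and each further use of the noun would call for yet another feature, which is why such filters are hard to maintain in practice.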


    References

    1. [Chomsky1965] Chomsky N. Aspects of the Theory of Syntax. Cambridge, MA: MIT Press; 1965.
    2. [Kipper2008] Kipper K., Korhonen A., Ryant N., Palmer M. A Large-scale Classification of English Verbs. Language Resources and Evaluation Journal. 2008;42:21-40.
    3. [Miller1995] Miller G. WordNet: A Lexical Database for English. Communications of the ACM. 1995;38:39-41.
    4. [Fellbaum1998] Fellbaum C., editor. WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press; 1998.