WMScore

Tue, 09/03/2013 - 15:01 — antardhvani

Unlike widely used Readability indicators (see [1] for a review) such as the Flesch-Kincaid Grade Level indicator ([2], [3], [4]), WMScore analyzes the structure of a sentence to assess the Working Memory 'Span Effort' required to process it. Sentences with high WMScores are undesirable when a document is intended for individuals with average or below-average Working Memory spans.

As discussed previously, most Readability indicators report 'High' Readability for the well-known garden-path sentence #1 below. However, cognitive psychologists have studied garden-path sentences for decades, and such sentences are known to place a very heavy load on Working Memory (see, for example, [5] and [6], which studied Event-related potentials (ERP) elicited by garden-path and other sentences that require 'reanalysis'). In this garden-path sentence, the reader realizes that he or she was 'led up the garden path' only on reaching the verb 'fell', at which point the sentence has to be reprocessed from scratch.

#	Sentence	Flesch-Kincaid Grade Level	WMScore
1	The horse raced past the barn fell.	0.0	17

Similarly, sentences 2 and 3 below have nearly identical, low Flesch-Kincaid Grade Levels (indicating ease of reading), but sentence 3 is clearly much more complex; the higher 'effort' required to parse it is reflected in its higher WMScore. This is largely because sentence 3 uses a grammatical construction called a Reduced-Form Restrictive Relative Clause (in Passive Voice), which produces a garden-path effect and makes the sentence harder to process. This difficulty has been demonstrated in a number of tests conducted on university students since the late 1960s (see, for example, [7] and [8]). Sentence 2, on the other hand, has a simpler syntactic structure.

#	Sentence	Flesch-Kincaid Grade Level	WMScore
2	The florist sent the flowers to the performer who was pleased.[8]	4.8	9
3	The performer sent the flowers by the florist was pleased.[8]	4.8	23
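
The near-identical Grade Levels follow from the way the Flesch-Kincaid formula ([3]) is built: it uses only words per sentence and syllables per word, so it cannot see syntactic structure. The sketch below is a minimal illustration, not the tool used to produce the tables above. The word and syllable counts are hand-counted assumptions, and reporting negative grades as 0.0 is an assumption made to match the 0.0 shown for sentence 1.

```python
def flesch_kincaid_grade(words, syllables, sentences=1):
    """Flesch-Kincaid Grade Level computed from raw counts."""
    grade = 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    # Assumption: negative grades are reported as 0.0.
    return max(0.0, round(grade, 1))


# Hand-counted (words, syllables) for sentences 1, 2 and 3; assumptions of this sketch.
counts = {
    "1": (7, 7),    # The horse raced past the barn fell.
    "2": (11, 15),  # The florist sent the flowers to the performer who was pleased.
    "3": (10, 14),  # The performer sent the flowers by the florist was pleased.
}

for label, (n_words, n_syllables) in counts.items():
    print(label, flesch_kincaid_grade(n_words, n_syllables))
```

Under these counts, sentences 2 and 3 both come out at 4.8 because their length and syllable density are almost the same, even though their structures differ sharply.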

Notice from 1-rev and 3-rev below how a simple transformation (restoring the omitted 'that was' or 'who was') improves the readability, and lowers the WMScore, of sentences 1 and 3.

#	Sentence	Flesch-Kincaid Grade Level	WMScore
1-rev	The horse that was raced past the barn fell.	4.8	12
3-rev	The performer who was sent the flowers by the florist was pleased.	4.8	18

However, this transformation may be easier said than done, because Word Processors do not report the presence of Reduced-Form Restrictive Relative Clauses in Passive Voice. The construction can be found even in modern textbooks, as authors may overlook individual complex sentences once they judge the Readability indicators for the passage as a whole to be reasonably low.

The WMScore scoring mechanism takes into account most of the findings from experimental studies conducted by cognitive psychologists during the past five decades (mentioned on preceding pages). It is a variant of the functional completeness/segmentation models (see [9], [10], [11], [12], [13], [14], [15]) and the Locality Theory model ([16]), and is broadly based on the duration for which phrases must be kept in Working Memory before the clause is 'functionally completed'. It is computed for each sentence from a parse-tree generated by a Deep Parser. However, parser reliability degrades as sentence length increases beyond a certain point, because errors compound: one wrong decision leads to another. WMScore uses an approximation algorithm to handle defective parse-trees; the score it computes in such cases may err on the higher side. Of course, one does not need an advanced algorithm to conclude that extremely long sentences need to be reworked.
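
This page does not give the WMScore formula, so the sketch below is not the WMScore algorithm. It only illustrates the general idea of a locality-style measure: total dependency length, the summed word distance between each head and its dependent, which grows when a phrase must be held for longer before it can be attached. The function name and the dependency annotations are hypothetical and hand-made for this illustration; they do not come from a Deep Parser.

```python
def total_dependency_length(dependencies):
    """Sum of word distances between each head and its dependent."""
    return sum(abs(head - dependent) for head, dependent in dependencies)


# Hand-annotated (head index, dependent index) pairs, 0-based word positions.
# These annotations are assumptions of this sketch, not parser output.

# Sentence 2: The florist sent the flowers to the performer who was pleased.
sentence_2 = [
    (1, 0), (2, 1), (2, 4), (4, 3), (2, 5),
    (5, 7), (7, 6), (7, 10), (10, 8), (10, 9),
]

# Sentence 3: The performer sent the flowers by the florist was pleased.
# 'performer' (1) must be held until the main verb 'pleased' (9).
sentence_3 = [
    (1, 0), (9, 1), (1, 2), (2, 4), (4, 3),
    (2, 5), (5, 7), (7, 6), (9, 8),
]

print(total_dependency_length(sentence_2))
print(total_dependency_length(sentence_3))
```

Even this crude measure ranks sentence 3 above sentence 2, mainly because its subject is separated from the main verb by the whole reduced relative clause. It does not model garden-path reanalysis, and its values are not comparable to the WMScores in the tables.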

The WMScore needs to be calibrated empirically; once this is done, it should be possible to identify sentences that an individual cannot process, given that individual's Working Memory Span. As Nagel ([17]) famously argued, it is impossible for anyone to 'know' what it is like for another person to experience some phenomenon.

References

[1] DuBay W. The Principles of Readability. Costa Mesa, CA: Impact Information; 2004.
[2] Flesch R. A new readability yardstick. Journal of Applied Psychology. 1948;32:221-233.
[3] Kincaid J., Fishburne R., Chissom B. Derivation of new readability formulas (Automated Readability Index, Fog Count and Flesch Reading Ease Formula) for Navy enlisted personnel. CNTECHTRA Research Branch Report. 1975:8-75.
[4] Kincaid J., Aagard J., Cottrell L. Computer readability editing system. IEEE Transactions on Professional Communication. 1981.
[5] Osterhout L., Holcomb P., Swinney D. Brain potentials elicited by garden-path sentences: Evidence of the application of verb information during parsing. Journal of Experimental Psychology: Learning, Memory and Cognition. 1994;20:786-803.
[6] Gouvea A., Phillips C., Kazanina N., Poeppel D. The linguistic processes underlying the P600. Language and Cognitive Processes. 2010;25:149-188.
[7] Fodor J., Garrett M. Some syntactic determinants of sentential complexity. Perception & Psychophysics. 1967;2.
[8] Rayner K., Carlson M., Frazier L. The Interaction of Syntax and Semantics during Sentence Processing: Eye Movements in the Analysis of Semantically Biased Sentences. Journal of Verbal Learning and Verbal Behavior. 1983;22:358-374.
[9] Yngve V. The Depth Hypothesis. In: Jakobson R., editor. Structure of Language and its Mathematical Aspects. American Mathematical Society; 1961.
[10] Chomsky N., Miller G. Introduction to the formal analysis of languages. Handbook of Mathematical Psychology. 1963;2:269-321.
[11] Miller G., Chomsky N. Finitary models of language users. Handbook of Mathematical Psychology. 1963;2:419-491.
[12] Carroll J., Tanenhaus M. Functional Clauses and Sentence Segmentation. Journal of Speech and Hearing Research. 1978;21:793-808.
[13] Fodor J., Bever T., Garrett M. The Psychology of Language. New York: McGraw-Hill; 1974.
[14] Gibson E. A Computational Theory of Human Linguistic Processing: Memory Limitations and Processing Breakdown. Carnegie-Mellon University; 1991.
[15] Stine-Morrow E., Shake M., Miles J., Lee K., Gao X., McConkie M. Pay Now or Pay Later: Aging and the Role of Boundary Salience in Self-Regulation of Conceptual Integration in Sentence Processing. Psychology and Aging. 2010;25:168-176.
[16] Gibson E. Linguistic complexity: locality of syntactic dependencies. Cognition. 1998;68:1-76.
[17] Nagel T. What is it like to be a bat? Philosophical Review. 1974;83:435-450.
