Dimensions of Ambiguity

Peter Norvig

UC Berkeley

Introduction

Traditionally, there have been four main dimensions along which ambiguous sentences, phrases, and words have been classified:

Linguistic/Referential
Ambiguity/Vagueness
Lexical/Structural
Polysemy/Homonymy

The linguistic/referential dimension accounts for deictic references. The sentence ``I'll do it tomorrow'' is referentially vague, but linguistically unambiguous. A particular usage of this sentence will often make it clear who ``I'' is, when ``tomorrow'' is, and what is required to ``do it.''

Language necessarily describes the world incompletely, and thus must leave some things vague. ``I'll do it tomorrow'' is vague as to whether the act is to be done tomorrow morning or tomorrow afternoon, but this, as [Zwicky 1975] point out, shouldn't count as true ambiguity. In contrast, ``I'll do it next Friday,'' when spoken on Sunday, is ambiguous between 5 and 12 days hence (at least in some dialects).

Structural ambiguity occurs when a phrase has several distinct syntactic parses. For example, ``spaghetti with meatballs and wine'' would be parsed differently than ``spaghetti with bread and butter,'' given the standard gastronomic assumptions. This is opposed to lexical ambiguity, where the constituent analysis is constant across interpretations, but the choice of lexical units changes. For example, ``I went to the bank'' is lexically ambiguous because bank has (at least) two readings as a noun.

There is a further distinction to be made between lexical items that are polysemes and those that are homonyms. The money-bank and river-bank senses of bank seem unrelated, and are called homonyms, while the senses of, say, ``cake'' - chocolate cake, fish cake, cake of soap - are closely related and are called polysemes.

The traditional dimensions are useful in the context of a generative theory which sees a language as a set of grammatical strings, each with a set of possible interpretations. Under this view, the rules and lexicon of a generative grammar determine a mapping between strings of words and interpretations, or meanings. A string with exactly one interpretation is unambiguous, one with no interpretation is anomalous, and one with multiple interpretations is ambiguous. To enumerate the possible meanings is the proper job of a linguist; to then choose from the possibilities the one ``correct'' meaning is a problem in pragmatics, or in Artificial Intelligence.
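The generative view described above can be sketched in a few lines of code. The following is a minimal illustration, not a serious grammar: the rules, the lexicon, and the function names are all assumptions introduced here, and the number of distinct parses stands in for the number of interpretations. It counts the parses of a phrase as a noun phrase and labels the phrase anomalous, unambiguous, or ambiguous accordingly.

```python
from functools import lru_cache

# Toy grammar, assumed purely for illustration: binary rewrite rules and a lexicon.
BINARY_RULES = {
    "NP": [("NP", "PP"), ("NP", "CNP")],   # NP -> NP PP | NP CNP
    "PP": [("P", "NP")],                   # PP -> P NP
    "CNP": [("CONJ", "NP")],               # CNP -> CONJ NP
}
LEXICON = {
    "spaghetti": {"NP"}, "meatballs": {"NP"}, "wine": {"NP"},
    "with": {"P"}, "and": {"CONJ"},
}

def count_parses(phrase, root="NP"):
    """Count the distinct parses of `phrase` as a `root` constituent."""
    words = tuple(phrase.split())

    @lru_cache(maxsize=None)
    def count(symbol, i, j):
        # Number of ways `symbol` can derive words[i:j].
        if j - i == 1:
            return int(symbol in LEXICON.get(words[i], set()))
        total = 0
        for left, right in BINARY_RULES.get(symbol, []):
            for k in range(i + 1, j):
                total += count(left, i, k) * count(right, k, j)
        return total

    return count(root, 0, len(words)) if words else 0

def classify(phrase):
    """Label a phrase by how many parses the toy grammar assigns it."""
    n = count_parses(phrase)
    if n == 0:
        return "anomalous"
    return "unambiguous" if n == 1 else "ambiguous"

print(classify("spaghetti with meatballs and wine"))
print(classify("spaghetti with meatballs"))
```

Under this toy grammar, ``spaghetti with meatballs and wine'' receives two parses, one for each bracketing of the conjunction. Nothing in the sketch says which of the two a listener would actually choose, which is exactly the gap discussed next.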

Unfortunately, this characterization of language, and of meaning, fails to account for certain psychological facts of human language usage. It is a psychological fact that we often listen to ambiguous sentences and choose a single interpretation, without being consciously aware of considering the complete set of possibilities. In other cases, we may notice ambiguity, but have a clear preference for one interpretation over another, and in still other cases we may be genuinely confused. To compound the problem, we can change our minds about the meaning of a sentence - ``at first I thought it meant this, but actually it means that.'' Finally, our affective reaction to ambiguity is variable. Ambiguity can be humorous, confusing, or perfectly harmonious. To a psycholinguist or AI researcher, these facts are crying out to be explained. I will attempt an explanation by investigating a variety of ambiguous sentences, and citing the problems they pose. As a grand finale, I will try to draw some conclusions about an extended theory of language use, understanding, and meaning.

Garden Path Sentences

A garden path sentence invites the listener to consider one possible parse, and then at the end forces him to abandon this parse in favor of another. Listeners are conscious of this switch, and often have difficulty with it. A well-known example is 1.1, where raced is initially treated as a past tense verb. This analysis fails when the verb fell is encountered; after some difficulty raced can be re-analyzed as a past participle. For most informants, there is a distinct feeling of having to re-parse the sentence; it does not feel like both parses were being built up simultaneously, and the second one was tested after the first was ruled out.

Most informants find 1.2 to be much less of a garden path than 1.1. An explanation for this is that while the listener is parsing ``fell down and broke its leg'' as a verb phrase, he or she is also trying to re-analyze ``the horse raced past the barn'' as a noun phrase. There is sufficient time to do this re-analysis before the end of the sentence in 1.2, but not in 1.1. Thus, 1.1 is very confusing, while 1.2 is not. We can try to quantify this analysis by asking how long it takes to re-analyze a sentence. If you agree that even 1.3 is much better than 1.1, then the answer may be as little as one word.

1.1 The horse raced past the barn fell.
1.2 The horse raced past the barn fell down and broke its leg.
1.3 The horse raced past the barn fell down.
1.4 The horse raced at the Belmont died.
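One rough way to make the timing argument concrete is to count how many words remain after the disambiguating verb. The sketch below assumes that re-analysis begins at ``fell'' and that the number of following words is a usable proxy for the time available; both are simplifications, and the function name is introduced here for illustration.

```python
# Sentences 1.1-1.3 from the text; the disambiguating word is "fell" in each.
SENTENCES = {
    "1.1": "The horse raced past the barn fell.",
    "1.2": "The horse raced past the barn fell down and broke its leg.",
    "1.3": "The horse raced past the barn fell down.",
}

def words_after(sentence, pivot="fell"):
    """Count the words that follow the first occurrence of `pivot`."""
    words = sentence.rstrip(".").split()
    return len(words) - words.index(pivot) - 1

for label, sentence in SENTENCES.items():
    print(label, words_after(sentence))
```

By this measure 1.1 leaves no words for re-analysis, 1.3 leaves one, and 1.2 leaves five, which matches the ordering of difficulty reported for the three sentences.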

Now consider 1.4, which is easier to understand than 1.1, despite having identical syntactic structure. One explanation for this is that ``raced past the barn'' is just not as good a descriptive attribute as ``raced at the Belmont.'' The first modifier could be true of any healthy horse, while the second describes only a top race horse. Thus, while both make good past tense verb phrases, only 1.4 makes a good past participle modifier.

Another example of this distinction is illustrated in 2.1 and 2.2. These examples are complex because ``got'' is highly polysemous. It can mean `received' (as in got a raise), `became' (got old), `underwent' (got arrested) or `caused/achieved' (got them arrested). In 2, the initial interpretation is `the boy became obese', with ``fat'' interpreted as an adjective. When the final word is processed, this initial interpretation has to be abandoned.

2.1 The boy got fat spattered.
2.2 The boy got fat spattered on his arm.

An informal experiment in [Sch 84] shows that 2.1 is a quite difficult garden path sentence, while 2.2 is not a garden path. One explanation for this is that having fat spattered on one's arm is the kind of experience one might be described as having undergone, while `fat spattered' is not something one undergoes, nor is it the kind of thing one normally strives to achieve. Also, as pointed out above, 2.1 is difficult because it asks the listener to abandon the `became obese' interpretation and come up with a new one in the course of one word.

Selection Restrictions

It seems that many ambiguous interpretations are not consciously considered because they violate selection restrictions. For example, in ``I drink port,'' the noun ``port'' is unambiguously interpreted as a kind of fortified wine, even though it also has senses meaning a harbor, and the left side of a ship. Thus, we might be tempted to formulate a principle stating that senses violating selection restrictions are not considered when there is another sense that satisfies the restrictions. Unfortunately, the sentences in 3 are evidence against this principle as it stands:

3.1 The astronomer married a star.
3.2 The plumber lit his pipe.
3.3 The rabbi was hit on the temple.
3.4 The hay farmer drank through a straw.

In each of these, the final noun has one meaning that satisfies the selection restrictions. However, there is another meaning that is suggested first, and which stubbornly refuses to go away.

Another problem is that selection restrictions are not really restrictions at all, but are more like preferences. Consider 4.1 below, which is ambiguous between the chicken being the agent and the object of eating. 4.2 prefers the agent interpretation, because dogs in our culture eat but aren't eaten. 4.3 prefers the object interpretation, because clams are eaten, and while they may eat, we are reluctant to attribute to them sufficient faculties to be in a state of mental readiness. Still, with the proper context, the preferred meaning of any of these sentences can be reversed.

4.1 The chicken is ready to eat.
4.2 The dog is ready to eat.
4.3 The clams are ready to eat.
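The difference between a restriction and a preference can be sketched as follows. A restriction would delete a reading outright; a preference only ranks the readings, so that context can still reverse the order. The plausibility numbers, the role names, and the additive context bias below are all hypothetical, chosen only to mirror the judgments reported for 4.1-4.3.

```python
# Hypothetical plausibility scores in [0, 1] for the subject of "is ready to eat".
# The values are illustrative assumptions, not measurements.
PLAUSIBILITY = {
    "chicken": {"agent": 0.5, "object": 0.5},
    "dog": {"agent": 0.9, "object": 0.1},
    "clams": {"agent": 0.1, "object": 0.9},
}

def rank_readings(noun, context_bias=None):
    """Return (role, score) pairs, best first. No reading is ever discarded."""
    scores = dict(PLAUSIBILITY[noun])
    for role, bonus in (context_bias or {}).items():
        scores[role] += bonus  # context shifts the preference, it does not filter
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

print(rank_readings("dog"))
print(rank_readings("clams"))
# A context that strongly supports the agent reading (say, a story about hungry clams).
print(rank_readings("clams", context_bias={"agent": 1.0}))
```

With no context the dog is ranked as the eater and the clams as the eaten; with a sufficiently strong bias the clam ranking flips, while the dispreferred reading stays available throughout.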

Syntactic Preferences

[Sch 84] provides a good summary of previous research on syntactic reasons for preferring one interpretation over another. The principle of Right Association says that PPs tend to attach to the most recent VP or NP they could possibly modify. Thus, in 5.1, the preferred reading is that ``for Mary'' modifies ``selected,'' not ``book'' or ``bought.'' In 5.2, however, the preferred reading has the PP modifying ``carried,'' not ``groceries.'' This is explained by the principle of Minimal Attachment, which prefers parses that use the longest rewrite rules, and thus result in a parse with fewer nodes. If we assume a grammar which includes the rules VP -> V NP PP and NP -> NP PP, then attaching the PP to the V rather than the NP minimizes the number of nodes. This analysis presupposes that Minimal Attachment takes precedence over Right Association, and that ``carried'' subcategorizes for the VP -> V NP PP rule, while ``bought'' does not.

5.1 John bought the book which I had selected for Mary.
5.2 John carried the groceries for Mary.
5.3 John met the girl that he married at a dance.
5.4 John saw the bird with the powerful beak.
5.5 John met the girl that he saw at a dance.
5.6 John saw the bird with the powerful binoculars.

If we compare 5.1 with 5.3 and 5.2 with 5.4, we see that the preferences can be reversed with the proper semantic context (and can be reversed back again in 5.5 and 5.6). At best then, these syntactic preferences are only one factor that must be considered in arriving at the best interpretation.
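The node-counting argument behind Minimal Attachment can be checked directly on the verb phrase of 5.2. The sketch below builds the two candidate trees from the rules VP -> V NP PP and NP -> NP PP and counts their nodes. The tuple encoding of trees, the internal structure given to the NPs (Det, N), and the function names are assumptions made for illustration.

```python
# Parse trees as nested tuples: (label, child, child, ...); plain strings are words.
# The internal NP structure (Det, N) is assumed here for illustration.
GROCERIES = ("NP", ("Det", "the"), ("N", "groceries"))
FOR_MARY = ("PP", ("P", "for"), ("NP", ("N", "Mary")))

# VP -> V NP PP: the PP attaches at the verb phrase level.
VERB_ATTACHMENT = ("VP", ("V", "carried"), GROCERIES, FOR_MARY)
# VP -> V NP with NP -> NP PP: the PP modifies "the groceries".
NOUN_ATTACHMENT = ("VP", ("V", "carried"), ("NP", GROCERIES, FOR_MARY))

def count_nodes(tree):
    """Count the labelled (non-word) nodes in a tree."""
    if isinstance(tree, str):
        return 0
    return 1 + sum(count_nodes(child) for child in tree[1:])

def minimal_attachment(candidates):
    """Pick the candidate parse with the fewest nodes."""
    return min(candidates, key=count_nodes)

print(count_nodes(VERB_ATTACHMENT), count_nodes(NOUN_ATTACHMENT))
print(minimal_attachment([VERB_ATTACHMENT, NOUN_ATTACHMENT]) is VERB_ATTACHMENT)
```

The noun attachment needs one extra NP node, so a pure node count selects the verb attachment. The count has no access to meaning, so it would make the same choice for 5.4 and 5.6 alike, consistent with the conclusion that syntactic preferences are only one factor.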

Mutually Compatible Interpretations

Consider the following quote from Richard Parsons, of the American Fur Industry Inc., on their new advertising slogan ``Fur is for Life'': ``it has a good sound, a good connotation. Yes, they last a long time. Yes, they're a good product. Yes, furs support wildlife conservation.'' Parsons (although not a professional linguist) is making a claim about language use: that the proper or intended meaning of a phrase can be a combination of a number of interpretations, rather than a selection of one unique interpretation. In all the ambiguous phrases we have seen so far, interpretations seem to compete with one another. We can switch back and forth between two interpretations, but cannot accept both at once. This is similar to the Necker cube effect in visual perception. But Parsons is saying that the phrase ``fur is for life'' is different. Five interpretations of the slogan are listed below. Presumably, Parsons would like the public to accept 6.1-4 as mutually compatible, and rule out 6.5 as incompatible, or better yet, to never consciously consider 6.5 at all.

6.1 Fur is durable.
6.2 The fur industry is pro-conservation.
6.3 Fur wearers are lively.
6.4 The recipient of a fur may become indebted to the giver for life.
6.5 Fur, while on an animal, protects its life.

While Parsons's claim is a radical departure from the `one string/one interpretation' theory of meaning discussed in the introduction, it is in fact the norm in rhetoric, in poetry, and, it seems, in advertising. To support this claim, I opened a poetry anthology at random to Dylan Thomas, finding the opening line of his poem In the Beginning: ``In the beginning was the three-pointed star.'' As the rest of the poem makes clear, the three-pointed star should be taken as referring simultaneously to a stellar body in primordial space, to the concept of light as in God's performative speech act ``Let there be light,'' to the star of Bethlehem, and to the Holy Trinity.

To take another example, a pop song by the Talking Heads proclaims ``We are creatures of love.'' This can be taken as having the three interpretations listed in 7. Thus, not only does the genitive have three mutually compatible interpretations, but the word ``love'' has two.

7.1 We are born as a consequence of sexual love.
7.2 We have souls that contain or are composed largely of love.
7.3 We are possessed by the force of love.

There are also cases of multiple interpretations which don't involve poetic license. Consider the use of ``book'' in ``This book, although beautifully bound, contains only one new idea in 500,000 words.'' The use of ``beautifully bound'' refers to a physical object, ``one new idea'' refers to the abstract content, and ``500,000 words'' refers to a particular (abstract) instantiation of the content. (Presumably if the book were translated into another language, it would have a different number of words, but still only one new idea.) All three polysemous interpretations of ``book'' are used simultaneously.

Jokes and Puns

[Freud 1916] presents a definition of joking as the ability to find hidden similarities between dissimilar things. This is amended to allow for the discovery of differences, or just ``to bind into a unity, with surprising rapidity, several ideas which are in fact alien to one another.'' He cites as an example the joke ``I met Baron Rothschild, and he treated me quite as his equal - quite famillionairely.'' This is funny because of the unexpected ease of combining `familiarly' with `millionaire' to create a new word meaning `as familiarly as is possible for a millionaire.' Another example is the pun ``She criticized my apartment, so I knocked her flat.'' Here a single phrase conveys both `disparaged her quarters' and `beat her horizontal.'

The problem is to explain why these are funny, while something like ``the chicken is ready to eat'' is not. Why is it that, to my ears at least, ``the rabbi was hit on the temple'' is funny, while ``the plumber lit his pipe'' is merely confusing?

Anomalous Strings

We have seen that it is an error to assume that strings with multiple interpretations must pick exactly one as their meaning. But what of strings with no valid parses? Chomsky has argued that strings like ``colorless green ideas'' are grammatical yet semantically anomalous. It seems there are also ungrammatical strings which can be assigned semantic interpretations. Consider 8.1:

8.1 John and me is running a race.
8.2 John and I are running a race.
8.3 John beat me in running a race.

As it stands, 8.1 has no parses, but we nevertheless understand it as a corruption of 8.2, and not of 8.3, even though either of these could be derived by changing two words in 8.1. We know from the ?/* notation that there are degrees of grammaticality, but it seems that even clearly ungrammatical strings can nevertheless have intended meanings.
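The claim that 8.2 and 8.3 are equally close to 8.1 can be verified by counting word-for-word substitutions. The sketch below assumes a simple position-by-position comparison, which is enough here because all three sentences have the same number of words; the function name is introduced for illustration.

```python
def word_substitutions(source, target):
    """Count the positions at which two equal-length sentences differ."""
    a, b = source.split(), target.split()
    if len(a) != len(b):
        raise ValueError("sentences must have the same number of words")
    return sum(x != y for x, y in zip(a, b))

S81 = "John and me is running a race."
S82 = "John and I are running a race."
S83 = "John beat me in running a race."

print(word_substitutions(S81, S82))  # me -> I, is -> are
print(word_substitutions(S81, S83))  # and -> beat, is -> in
```

Both comparisons yield two substitutions, so surface distance alone cannot explain why 8.1 is understood as a corruption of 8.2 rather than 8.3.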

Conclusions

Below I list the major principles to be extracted from each section of this paper. The points outlined here derive from the study of ambiguity, but they have implications for any theory of language interpretation.

Garden Path Sentences: The listener builds up partial interpretations as the sentence progresses. Some possible interpretations are never considered at all. Only the most promising interpretation(s) are considered. Syntax and semantics both help determine an interpretation's promise. Some interpretations are rejected after they have been considered. It takes time to reject an interpretation and start a new one. The listener expects to have a valid interpretation at the end.

Selection Restrictions: Some interpretations are not considered because of selection restrictions. Others are considered, but are rejected. Semantic relatedness can trigger consideration, regardless of syntax. Restrictions are more like preferences than strict rules.

Syntactic Preferences: One preference is for Right Association (or some variant). Another is for Minimal Attachment (or some variant). These preferences can be over-ruled by other considerations.

Mutually Compatible Interpretations: A phrase can ``mean'' several things at once. It may not be possible to clearly delimit what a phrase means.

Jokes and Puns: Some ambiguous sentences trigger a humor response. Others are merely confusing. Humor has to do with unexpected ease of expression.

Anomalous Strings: Grammaticality is a continuous, not binary, feature. Some strings are ungrammatical, but still have interpretation(s). We interpret them by looking at what was considered. We don't do it by suggesting interpretations that weren't considered before.

Much remains to be done to work these principles into a full-fledged theory of language use and interpretation, but I hope the examples and principles shown here have at least been thought-provoking.

References