Systematic polysemy in lexicology and lexicography1

Geoffrey Nunberg, Xerox Palo Alto Research Center / Stanford University
Annie Zaenen, Xerox Palo Alto Research Center / Stanford University
 

Abstract

The phenomenon of systematic polysemy offers a fruitful domain for examining the theoretical differences between lexicological and lexicographic approaches to description. We consider here the process that provides for systematic conversion of count to mass nouns in English (a chicken → chicken, an oak → oak, etc.). From the point of view of lexicology, we argue, standard syntactic and pragmatic tests suggest the phenomenon should be described by means of a single unindividuated transfer function that does not distinguish between interpretations (rabbit = "meat" vs. "fur"). From the point of view of lexicography, however, these pragmatically determined "sense precisions" are made part of explicit description via the inclusion of semantic "licenses," a mechanism distinct from lexical rules.
 

1. Systematic Polysemy 

It is well known that we can make productive generalizations about the relations among word uses, say in the form of implicational statements like: "If a word has a use of type s, it also has a use of type s'." Thus a word that denotes a place or kind of place can be used to refer to the people who live there (The city /county /state voted for Jones); a word that denotes a periodical publication or kind of periodical publication can be used to refer to its publisher (The newspaper / The Times opposed the project); and so on. In recent years these regularities have been increasingly prominent in lexical research.2 In this paper, we will describe the general phenomenon of transfer as "systematic polysemy," and we will use the term "transfer functions" to describe the mappings from one class of words to another.

    Systematic polysemy raises a number of problems for lexicographers, particularly if dictionaries are to be modified to accommodate new types of users and new applications. Some of the difficulties are essentially structural or organizational. Conventional, item-based formats provide no obvious place for listing regularities like these. And even if devices are introduced to represent them, say via the kinds of codes that are used to represent syntactic classes, it is not a simple matter to accommodate their use to ordinary conceptions of sense-structure, or to coordinate their treatment in the defining process.3
For the purposes of this discussion, however, we will assume that lexicographers have available a format in which such rules can be represented, whether as designed for print or computational presentation. We will also assume some mechanism for achieving consistency across items. We will be interested in some more general questions: when should transfer functions be given explicit representations, and at what level of abstraction? These questions are of obvious interest for their own sake, given the ubiquity and importance of systematic lexical polysemy. But we are also interested in showing how the way we answer these questions may depend on whether the broad approach we take to lexical description is that of lexicography or lexicology, as Atkins and Fillmore have drawn this distinction - that is, of dictionary preparation as a large-scale practical activity, or of the theoretical investigation of the lexicon carried out by linguists.
 

2. Transfer functions and their individuation

By way of example, we will be discussing the uses of the underlined mass nouns in sentences like (1) - (6): All of these mass terms appear to be derived from count nouns by a highly general and productive process, which is available in many other languages (most of these examples permit direct translation into French, Italian, Dutch, Finnish, and so on). At the most general level, then, we might want to say that these usages are all generated by a single transfer function, which takes any count noun C into a mass term M that denotes a substance that stands in a salient correspondence to the denotations of C.
By itself, however, this approach leaves several questions unresolved. First, the function does not apply with equal felicity to all count nouns. For some nouns (eg., speck, gram) it is difficult to find any mass reading at all; for many others, like shopping center in (6), the mass reading seems to emerge only in unusual circumstances. Certainly we would be surprised to find a standard dictionary listing the mass uses of shopping center, whereas a failure to list the mass uses of rabbit in (1) and (2) or oak in (4) might count as a more serious omission.

    The second problem involves the specification of the range of the function. In some cases, such as (1)-(4), the denotation of M is a substance derived from instances of things denoted by the associated count noun. Thus rabbit in (1)-(3) refers to stuff that has at one time or another been part of a rabbit or rabbits. But sun in (5) denotes something like "sunlight," not stuff that was ever part of the sun, and shopping center in (6) is a measure term, analogous to "shopping center footage."

    A natural response to these considerations is to distinguish a number of distinct transfer functions, each defined over a more specific domain and range. As a first pass, we may want to distinguish cases where the stuff denoted by the mass term is derived from instances of the things denoted by the count noun, as in (1)-(4). We can accomplish this by introducing a more specific rule of "universal grinding" (see Pelletier and Schubert (1986), Copestake and Briscoe (1991)), which takes the names of kinds of individual objects into terms that denote substances derived from them. By itself, of course, this will not account for the differences among the uses of a word like rabbit in (1)-(3), where it seems to denote rabbit flesh in one case, rabbit fur in another, and undifferentiated rabbit stuff in the third. So we may want to introduce specific rules to handle each of these types. This "splitting" approach is assumed by Ostler and Atkins (1991), following a line developed by Apresjan (1973). For example, in his analysis of the polysemy of Russian substantives, Apresjan distinguishes such patterns as "plant - food product made of it" (mustard); "tree - its wood" (fir); "animal - its fur" (squirrel); and "animal - its meat" (goose).
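
    The contrast between a single general function and the "splitting" approach can be made concrete with a small sketch. The Python fragment below is an illustration only: the toy lexicon NOUN_CLASS, the table SPLIT_PATTERNS, and both function names are assumptions introduced here, and are not drawn from any of the analyses cited. Only the four patterns themselves are taken from the Apresjan examples quoted above.

```python
# Hypothetical sketch: one general grinding function versus a family of
# specific ("split") transfer functions. All names are illustrative.

# Semantic class of a few count nouns (an assumed toy lexicon).
NOUN_CLASS = {
    "mustard": "plant",
    "fir": "tree",
    "squirrel": "animal",
    "goose": "animal",
}

# The specific patterns quoted from Apresjan (1973): domain class -> ranges.
SPLIT_PATTERNS = {
    "plant": ["food product made of it"],
    "tree": ["its wood"],
    "animal": ["its fur", "its meat"],
}


def general_grinding(noun):
    """Single unindividuated function: any count noun -> one vague mass sense."""
    return [f"stuff derived from {noun}"]


def split_grinding(noun):
    """Splitting approach: one distinct mass sense per specific pattern."""
    patterns = SPLIT_PATTERNS.get(NOUN_CLASS.get(noun), [])
    return [f"{noun}: {p}" for p in patterns]
```

    On the splitting view a noun like goose receives two listed mass senses, while on the general view it receives a single vague one; a noun outside the listed classes receives nothing at all from the split rules but is still covered by the general function.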

    As circumstances warrant, we can define functions of increasing specificity over more restricted domains and ranges. For example, we might want to distinguish several functions to map from the names of trees to the names of substances derived from them, according as the substances are drawn from the wood (oak, pine, etc.), from the bark (camphor, witch hazel, cassia), from the resin (balsam, frankincense), from the leaves (sumac), from the seeds (jojoba), and so forth. In the same way, we might want to distinguish functions according to particular properties of their output. For example, the grinding function in English does not generally apply to the names of plants to derive the names of cooking oils, but it does apply to derive the names of oils and essences used in perfume:

3. Against individuation: the view from lexicology

    Whether and when we want to recognize distinct transfer functions depends on the understanding with which we approach the task of lexical description. Let us start by considering the lexicological point of view - which is to say, the point of view of grammatical theory. Here, the object is to produce as economical as possible a description of the rules and items of the language, which taken together with assumptions about rational agency and particular domain knowledge will be sufficient to predict the acceptability and interpretation of particular uses of words. On this view, it is more important that an analysis should satisfy criteria of formal simplicity and generality than that it should accord with observations about the frequency of use, or that it should accord with the intuitive categories of the general reader. For example, consider the readings of rabbit in (1)-(3). In the most theoretically satisfying analysis, we would want to postulate only one transfer function here, and hence assume that the use of the word as a mass term is vague, rather than ambiguous. In that case we will say that the "literal" meaning of rabbit in this use is the sense that shows up most clearly in (3), where the term denotes simply "rabbit-derived stuff," with contextual processes providing the more specific "meat" and "fur" readings of (1) and (2).

    There are independent grounds for this analysis. For one thing, we can consider the tests that syntacticians have developed to distinguish between vagueness and ambiguity (see, eg., Zwicky and Sadock (1975)). The tests most relevant to the case at hand involve conjunction and pronominalization. For example, consider sentence (9), as uttered in answer to the question "Why did the men emigrate?"

But note that (9) does not permit paraphrases as in (10), where ellipsis or conjunction would have to ignore the distinction between the two senses of land ("plot of earth" and "country"): Now, however, note that (11) is perfectly acceptable: This suggests that the mass-term use of rabbit is vague, rather than ambiguous: the uses of rabbit that appear as the object of eat and of wear are in fact manifestations of a single general sense. Thus sentence (1) - John was eating rabbit - entails semantically only that what John was eating was stuff derived from rabbit; the further inference that John ate the meat of the rabbit, and not the fur, teeth, eyes, and so forth, is supplied on the basis of normative social assumptions about eating habits.

A second test is relevant here, this one drawn from pragmatics. If the "meat" sense is semantically generated, then the inference from John eats rabbit to John eats rabbit meat should not be defeasible - no more than the inference from John eats pork to John eats meat derived from a pig. But now note (12):

From (12) we will not conclude that the religion makes a special dispensation that allows eating of rabbit fur, claws, and whiskers.
It may seem curious that what we are claiming to be the purely linguistic entailments of the mass use of rabbit are only rarely observed in isolation, and that in such cases, as in (3), we may require a relatively uncommon situation. But this is because there are few purposes for which stuff derived from one part of a rabbit is more-or-less interchangeable with stuff derived from another. And on consideration there are some circumstances less grisly than those of (3) in which rabbit seems to have an undifferentiated "rabbit stuff" interpretation: Hearing (14), we are unlikely to ask what part of the rabbit the hutch smelled of.
There is no altogether happy term in English to describe the contextual filling-in of content in cases like this. We will refer to the process as precision, which should be understood here as a nonce nominalization that renders the French précision (< préciser). We will assume that the application of transfer functions at particular lexical arguments is subject to schemas of precision that make reference to characteristic "frames" (see eg. Fillmore (1985)) or "scenarios" (see Miller and Johnson-Laird (1976)) associated with various activity-types. Thus the interpretation of rabbit in We ate rabbit for dinner will be constructed on the basis of assumptions about typical situations of dining, as part of a process whereby lexical content is integrated with knowledge representation in a broader sense.
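
The division of labor just described, in which the lexicon supplies only a vague sense and a frame supplies the rest, can be sketched as follows. This is a minimal illustration under stated assumptions: the FRAMES table, its two entries, and the function names grind, precise, and gloss are hypothetical and stand in for whatever richer knowledge representation is actually involved.

```python
# Hypothetical sketch of "precision": a vague mass sense is narrowed by a
# frame for the activity type. Frame contents are illustrative assumptions.

# Which part of an animal each activity frame typically selects.
FRAMES = {
    "dining": "meat",
    "dressing": "fur",
}


def grind(noun):
    """Lexical step: the only encoded meaning is '<noun>-derived stuff'."""
    return {"source": noun, "part": None}


def precise(sense, frame=None):
    """Pragmatic step: fill in the part from the frame, if the frame has one."""
    part = FRAMES.get(frame)
    return {**sense, "part": part} if part else sense


def gloss(sense):
    """Render a sense as a short paraphrase."""
    if sense["part"] is None:
        return f"{sense['source']} stuff"
    return f"{sense['source']} {sense['part']}"
```

With no frame supplied, gloss(precise(grind("rabbit"))) yields the undifferentiated "rabbit stuff" reading; with the dining frame it yields "rabbit meat". The lexical entry is the same in both cases, which is the sense in which the mass use is vague rather than ambiguous.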

    Under the approach we are suggesting, we will have to look to pragmatics to explain what seem to be lexical exceptions to the application of the transfer functions. The application of the "grinding" function, for instance, is usually restricted to certain taxonomic levels. For example:

These regularities follow, we argue, from the pragmatic conditions generally associated with transfer functions, which require roughly that the function from items in the domain (here, biological taxa) shall yield usefully discernible categories in the range (here, types of substances). (See Nunberg (1978) for a detailed working-out of these principles.) The unacceptability of the mass uses of breed names in (15) reflects the fact that the properties that distinguish the meats derived from different breeds are hard to discern, and harder to validate intersubjectively. Thus it is generally assumed that breeds can be substituted salvo sapore in recipes for chicken dishes. Similar considerations explain the judgments in (16): Here, the problem is that mammal stuff is not regarded as a uniform natural category which is susceptible of interesting generalizations. Such assumptions are culturally constructed. Fish is a taxonym of the same biological level as mammal, but people regard fish stuff as a coherent natural category ("Fish is brain food"). Note also that the grinding function does allow application to breed names when its value is interpreted as a wearable substance, as shown in (17): This suggests that we must further distinguish between what we can think of as gastronomically and sartorially basic-level taxa, further complicating the relevant cultural background, but at no expense to the description of the lexicon itself.
Another set of apparent exceptions to the general account of transfer that we are offering involves the well-known phenomenon of blocking, where the use of a derived form is pre-empted by a non-derived lexical item. In English, for example, the specialized items beef, veal, pork, and so on are usually used in place of kind terms like cow, calf, pig to refer to meat. Ostler and Atkins (1991) suggest that this observation militates for treating these items as lexical exceptions to a specific animal-to-meat rule. But the generalization is not categorical. Consider (18), for example: What makes beef odd here is that the interdiction concerns the status of the animal as a whole, and not simply its meat. That is, Hindus are forbidden to eat beef only because it is cow-stuff. Note moreover that if cow really were lexically blocked from being used as a mass term to refer to cow meat, we would expect a sentence like (19) to be unremarkable: That is, beef and cow here should have the interpretation "cow meat and cow-derived substance other than meat." But (19) clearly strikes us as redundant.

    Rather than postulating a semantic restriction, then, we will look to pragmatics to explain the blocking phenomenon. On the account we have given here, we argue, it has a straightforward explanation via Grice's maxim of quantity, which requires, roughly, that the speaker shall say as much as and no more than the communicative circumstances require. This entails that a specific description should be used in place of a vague one where no ulterior motives intrude. If your doctor counsels you to eat less cow, for example, you may infer that she has some reason for using a term that is vaguer than beef; perhaps she suspects you of unusual dietary habits. (Analogously, if someone tells you that all the great cubist painters were European, you can ordinarily assume that he does not believe that they were all French.)
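
    The pragmatic account of blocking can be sketched as a preference rather than a prohibition. In the fragment below, the table SPECIFIC_MEAT_TERM contains only the pairs mentioned above; the function names, the two meaning labels "meat" and "stuff", and the wording of the inference are assumptions made for the purpose of illustration.

```python
# Hypothetical sketch: blocking as a Gricean preference, not a lexical ban.

# Specialized meat terms mentioned in the text, keyed by the animal term.
SPECIFIC_MEAT_TERM = {"cow": "beef", "calf": "veal", "pig": "pork"}


def choose_term(animal, meaning):
    """Pick the most specific term adequate to the intended meaning.

    meaning is "meat" (only the meat is at issue) or "stuff"
    (the animal-derived substance as a whole is at issue).
    """
    if meaning == "meat" and animal in SPECIFIC_MEAT_TERM:
        return SPECIFIC_MEAT_TERM[animal]
    return animal  # the vague mass use remains available


def hearer_inference(animal, term_used):
    """What a hearer may infer from the speaker's choice of term."""
    specific = SPECIFIC_MEAT_TERM.get(animal)
    if specific and term_used == animal:
        return f"speaker had a reason not to say '{specific}'"
    return "no special inference"
```

    Nothing in this sketch removes cow from the output of the grinding function. The specialized term wins only when meat alone is at issue, and a speaker who says cow anyway invites the kind of inference described for the doctor's advice above.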
 

4. The view from lexicography

Methodically deployed, the account of systematic polysemy that we have been presenting here leaves the lexicon looking a good deal sparer than would be permitted by common-sense lexicographical views of descriptive adequacy. In the case under discussion, for example, the lexicon contains at most a general statement of the grinding function, with no further restrictions on its range or domain. Taken by itself, it does not predict most of the distributional observations we have mentioned. For example:

a. The lexicon does not specify that the grinding function applies only at certain taxonomic levels. It does not tell the user that the function can be applied to fish but not ordinarily to mammal or Rhode Island Red.

b. The lexicon does not tell us how the process of precision is likely to color the output of the function at a particular argument. Thus it gives no hint that the subject of the sentence Mink is expensive these days is more likely to refer to fur than to meat.

c. The lexicon makes no mention of blocking. Thus, while it provides entries for words like beef and pork, it does not indicate that these items are ordinarily used in preference to the mass terms cow and pig.

From the lexicological point of view, these are not lacunae but deliberate economies, the result of systematic effort to draw as rigorous as possible a line between the semantic and pragmatic, the lexical and the encyclopedic, what is "in" and "out of" the language as an idealized type. But we can also see "the language" as a social artifact, which does not simply reflect but also embodies the cultural circumstances that sustain it, and consequently includes a great many regularities that reflect certain kinds of encyclopedic knowledge. This was the view of the great lexicographers and linguists of the 19th century, of course, and of the philological movement in general. And while this picture is not much emphasized in modern mainstream work on grammatical theory, it also underlies several strains of recent work on word-meaning, both in linguistics (cf. Fillmore, Lakoff, Langacker, Talmy) and the philosophy of language (cf. Putnam, Burge). Linguistically, the important consequence of this view is that linguistic categories cannot be described without reference to cultural notions and normative beliefs.

    On the lexicological view, we have assumed a view of transfer developed by a number of writers (see, eg., Nunberg (1979), Sag (1981), Lakoff (1987), Clark and Clark (1979)), which takes it to be a pragmatic process that is properly assimilated to phenomena like metaphor and demonstrative reference. But other writers (eg. Apresjan (1973), Wilensky (1991), Ostler and Atkins (1991)) have preferred to regard it as a lexical process that is best thought of as a special case of derivation. The difference should be thought of as one of focus: in the large, polysemy ranges from a completely pragmatic to a highly lexicalized phenomenon. On the one hand, the principles that permit extended use are clearly based in general schemas of knowledge organization or conceptual organization. For this reason we are not surprised to see certain patterns of use recurring over and over again in the languages of the world, and we will expect some of them to appear universally. On the other hand, certain transfer functions are language-particular to a greater or lesser degree. Of the several hundred patterns of Russian polysemy described by Apresjan (1973), for example, perhaps a quarter have no English equivalents.4

    To explain restrictions of this sort, we must postulate some sort of lexical apparatus. Yet this apparatus need not be thought of on the model of traditional derivational processes. By way of example, consider the function from the name of a demesne to its ruler. This is common enough in the languages of feudal communities, as in Shakespeare's:

Such uses are of course archaic now, having disappeared along with the background assumptions about the social order that licensed them (eg., "No land without a lord…"). At the same time, it would be rash to predict that every language spoken in a feudal society would permit transfers of the type in (20). So the acceptability of (20) in Shakespeare's English must have depended on the existence of a specifically lexical license, which was withdrawn once the pragmatic background that the usage presumed was no longer available.
 

5. "Lexical licenses"

We call this a "license" rather than a "rule" because unlike strictly lexical or derivational processes, the availability and range of application of the process are entirely dependent on background beliefs; when these are not available, the process is not permitted. So we may think of a license as a kind of lexical indexing of a certain regularity in the world (as speakers hold it to be), which permits the exploitation of that regularity for purposes of transfer.

    Licenses may be thought of as a particular instance of what Morgan (1978) has called "conventions of use." Where the linguistic conventions determine what is grammatical in a language, the conventions of use determine what is idiomatic, not in the strong sense in which linguists sometimes apply the term to noncompositional collocations, but in the weaker sense of "appropriate, in concordance with ordinary linguistic practices." An important difference between the two kinds of conventions, accordingly, is that usages that are not explicitly licensed may be considered grammatical, and may in fact occur in ordinary discourse; eg, as in "Norway was too ill to attend the conference."5

    This notion of a license is useful in explaining the cross-linguistic distribution of transfer functions, particularly where differences in patterns of polysemy do not correlate with obvious differences in background beliefs. For example, Jerrold Sadock informs us (p. c.) that West Greenlandic Eskimo is highly restricted in the types of transfer it allows. One may not use the grinding function to derive the names of kinds of meat or hide, though one may use it to derive the names of kinds of wood, and there is apparently no explanation for this restriction on grounds of morphology or of blocking, nor by reference to a difference in the background beliefs that license the transfer.6 Rather, we will say that West Greenlandic does not license transfers that are based on the correspondences between animals and meat-types, but does license transfers based on the analogous correspondences between trees and wood-types. Thus while West Greenlandic may be said to have a grinding function like that of English, its application is limited to certain domains, and associated with certain schemas of precision.

    The grinding function in English appears to be subject to a number of explicit lexical licenses. As we noted earlier, for example, the function does not generally apply to the names of plants to derive the names of cooking oils (?safflower, ?olive), but it does apply to derive the names of oils and essences used in perfume (lavender, ylang-ylang). This does not seem to us to admit of a direct pragmatic explanation, but neither is it a case of an explicit grammatical rule that does not admit pragmatic suspension (one can imagine that professional cooks might well say things like "I usually fry it in safflower.") So we assume that these uses depend on a specific lexical license for application of the grinding function, which is sensitive to the properties of the stuff denoted by its output.7 Analogous licenses will be required to explain the applications of the function we mentioned earlier, as in balsam, sumac, camphor, where the output of the function is a particular, generally-recognized substance other than the wood.8 And even the more common uses of the function to provide the names of meats or hides may be thought of as specifically licensed; this will explain, for example, why the use of chicken as a mass term does not ordinarily permit an interpretation as "chicken blood" or "chicken liver," though of course such interpretations may be possible in an unusual context.
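
    One way to picture a license is as a language-particular index over correspondences that the general grinding function could in principle exploit. The sketch below is a hypothetical encoding: the LICENSES table, its class labels, and the two status labels are assumptions introduced here, and the entries record only the contrasts reported above for English and West Greenlandic.

```python
# Hypothetical sketch: licenses as language-particular indexing of
# correspondences. Unlicensed uses are still grammatical, just not idiomatic.

LICENSES = {
    "English": {
        ("animal", "meat"),
        ("animal", "fur"),
        ("tree", "wood"),
        ("plant", "perfume essence"),
    },
    "West Greenlandic": {
        ("tree", "wood"),
    },
}


def status(language, domain, output):
    """Classify a grinding use by whether the language licenses it."""
    if (domain, output) in LICENSES.get(language, set()):
        return "idiomatic"
    return "grammatical but unlicensed"
```

    Here status("English", "plant", "cooking oil") comes out as grammatical but unlicensed, which matches the observation that a professional cook might still say "I usually fry it in safflower." The function never returns a verdict of ungrammaticality, since a license governs idiomaticity and not the availability of the transfer function itself.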
 

6. Conclusion

On this understanding, then, the phenomenon of systematic polysemy involves two kinds of rules, which correspond to distinct descriptive and theoretical levels. A strictly lexicological description concerns itself only with a repertory of transfer functions provided by pragmatics, or by highly general semantic principles. A lexicographic description includes all of the regularities predicted by the licenses and conventions of use of the speech community. In large part, of course, these licenses rest on encyclopedic assumptions: that people ordinarily eat but do not wear chicken stuff, that the bark of the camphor tree produces a widely used substance, and so forth. But this information is distinguished from other encyclopedic information on the grounds that it is specifically relevant to predicting certain normative or idiomatic patterns of use. That is, the (lexicographer's) lexicon can include such information about the world as is projected onto idiomatic patterns of use. Of course it is not always an easy matter to validate the distinctions among information that is "strictly lexical," "encyclopedic but lexically relevant" and "encyclopedic and lexically irrelevant," and the differences among these might better be thought of as gradient, rather than absolute. It may be, moreover, that these distinctions can sometimes be ignored for purposes of modelling interpretation (see, eg. Briscoe and Copestake (1991), Copestake and Briscoe (1991)). But the fact that the distinction is available in principle suggests that the inclusion of pragmatically generated word-uses in lexicographical description can be justified on theoretical as well as practical grounds.
 

Notes

1 We thank T. Briscoe, L. Karttunen, N. Ostler, B. H. Partee, M. White, and the members of the Lexical Project at the Center for the Study of Language and Information at Stanford University for comments and discussion.
2 Varieties of this phenomenon have been described under such headings as "regular polysemy" (Apresjan (1973)), "deferred reference" (Nunberg (1979)), "semantic transfer rules" (Leech (1981)), "sense transfer" (Sag (1981)), "connectors" (Fauconnier (1985)), "sense extensions" (Pustejovsky (1991), Briscoe and Copestake (1991), Copestake and Briscoe (1991)), "lexical networks" (Norvig and Lakoff (1987)), "subregularities" (Wilensky (1991)), and "lexical implication rules" (Ostler and Atkins (1991)).
3 For example, virtually every standard American dictionary lists a "meat" sense for the words chicken and tuna, whereas we have found only one that gives an equivalent sense for turkey, and none that lists this use of salmon.
4 For example, English does not generally allow transfers of the type that Apresjan describes as "bodily organ - its disease," as in U nee pocki "*She has a kidneys."
5 As Ostler and Atkins observe, the application of certain transfer functions may be subject to semantically extraneous phonological, morphological, or syntactic constraints, in which case they must be regarded as lexical rules.
6 Nominal compounds in West Greenlandic are used to express both the names of meats and woods, but in the latter case the existence of the compound does not block the use of the transfer function.
7 "Idiomatic" should not be confused with either "unmarked" or "statistically frequent." Given that English licenses both "meat" and "fur" interpretations of the transfer from animals to stuff, the sentence Mink is expensive is idiomatic on both readings. Of course the "fur" reading will be more frequent, and a hearer who is unaware of its situation of utterance might reasonably assume that the "fur" reading is intended. But this intuition for "markedness" reflects only an induction over real-life situations, not a property of words.
8 In a similar way, names of fruits can't ordinarily be used to denote the juices derived from them (?He was drinking apple), nor can the names of plants be used to refer to infusions (?She was drinking peppermint). This may reflect a general restriction on the use of the grinding function in English to derive the names of liquids, as opposed to solids or "mush." But there are apparent exceptions (cf. She was drinking Gamay).
 
 

 Bibliography

APRESJAN, Ju. (1973) : "Regular polysemy". In: Linguistics 142.
BRISCOE, T. and A. COPESTAKE. (1991) : "Sense extensions". In: Proceedings of IJCAI Workshop on Computational Approaches to Non-literal Language. Ed. by D. Fass, E. Hinkelman and J. Martin. IJCAI.
CLARK, E. and H. CLARK. (1979) : "When nouns surface as verbs". In: Language 55:4.
COPESTAKE, A. and T. BRISCOE. (1991) : "Lexical operations in a unification-based framework". In: Lexical Semantics and Knowledge Representation. Ed. by J. Pustejovsky and S. Bergler. Association for Computational Linguistics.
FAUCONNIER, G. (1985) : Mental Spaces. MIT Press, Cambridge, Mass.
FILLMORE, C. (1985) : "Frames and the semantics of understanding". In: Quaderni di Semantica.
LAKOFF, G. (1987) : Women, Fire, and Dangerous Things. University of Chicago Press, Chicago.
LEVINSON, S. (1983) : Pragmatics. Cambridge University Press, Cambridge, England.
LEECH, G. (1981) : Semantics. Cambridge University Press, Cambridge, England.
MILLER, G. and P. JOHNSON-LAIRD. (1976) : Language and Perception. Harvard University Press, Cambridge, Mass.
MORGAN, J. (1978) : "Two types of convention in indirect speech acts". In: Syntax and Semantics 9: Pragmatics. Ed. by P. Cole. Academic Press, New York.
NORVIG, P. and G. LAKOFF. (1987) : "Taking: a study in lexical network theory". In: Proceedings of the 13th meeting of the Berkeley Linguistics Society.
NUNBERG, G. (1979) : "The non-uniqueness of semantic solutions: polysemy". In: Linguistics and Philosophy 3:1.
OSTLER, N. and B. ATKINS. (1991) : "Predictable meaning shift: some linguistic properties of lexical implication rules". In: Lexical Semantics and Knowledge Representation. Ed. by J. Pustejovsky and S. Bergler. Association for Computational Linguistics.
PELLETIER, F. J. and L. SCHUBERT. (1986) : "Mass expressions". In: Handbook of Philosophical Logic, vol. 4. Ed. by D. Gabbay and F. Guenthner. Reidel, Dordrecht.
PELLETIER, F. J., ed. (1979) : Mass Terms: Some Philosophical Problems. D. Reidel, Dordrecht.
PUSTEJOVSKY, J. (1991) : "The generative lexicon". In: Computational Linguistics, 17.
SAG, I. (1981) : "Formal semantics and extralinguistic context". In: Radical Pragmatics. Ed. by P. Cole. Academic Press, New York.
WILENSKY, R. (1991) : Extending the lexicon by exploiting subregularities. U. C. Berkeley Technical Report.
ZWICKY, A. and J. SADOCK. (1975) : "Ambiguity tests and how to fail them". In: Syntax and Semantics 4. Ed. by J. Kimball. Academic Press, New York.