Ch. 20: Punctuation
Main authors: Geoffrey Nunberg, Ted Briscoe & Rodney Huddleston
1 Preliminaries
1.1 The domain of punctuation
1.2 Indicators and characters
1.3 The status of punctuation rules
1.4 Units of syntax and units of writing
1.5 Functions and classification of punctuation indicators
2 Primary terminals
3 The secondary boundary marks: comma, semicolon and colon
3.1 Some formal preliminaries
3.2 Uses of the secondary boundary marks
3.2.1 Coordination, syndetic or subclausal
3.2.2 Supplementation, syndetic and subclausal
3.2.3 Asyndetic combinations of main clauses
3.2.4 Further cases of simple boundary marking at the subclausal level
3.2.5 Delimiting commas
4 Parentheses
5 The dash
6 Quotation marks and related indicators
7 Capitalisation
8 Word-level punctuation
8.1 Word boundaries
8.2 Hyphens
8.2.1 Some initial distinctions
8.2.2 Inherent and long hyphens
8.3 The apostrophe
8.4 The abbreviation full stop and minor reduction markers
8.5 The slash
1 Preliminaries
1.1 The domain of punctuation
The central concern of punctuation is with the use of the various punctuation marks, such as the full stop, comma, semicolon, colon, question mark, quotation marks, parentheses, and so on. These serve to give indications of the grammatical structure and/or meaning of stretches of written text. The punctuation marks are all segmental units of writing -- i.e. they fully occupy a position in the linear sequence of written symbols. There are, however, various non-segmental features which can serve the same kind of purpose as the punctuation marks. For example, titles of literary or other works may be italicised as an alternative to being enclosed in quotation marks. And while the end of a sentence is indicated segmentally by a punctuation mark (a full stop, question mark or exclamation mark), the beginning of a sentence is indicated non-segmentally by capitalisation of the first letter. We will therefore regard punctuation as covering the use not only of punctuation marks but also of such non-segmental features as italics, capital letters, bold face and small capitals. Ordinary lower-case roman represents the default form, and these non-segmental features can be regarded as modifications of the default form.
One other important aspect of punctuation is the use of space, notably to separate one word from the next. Space between words is a segmental unit: like the punctuation marks, it occupies the whole of one position in linear sequence. For example, in the first sentence of this paragraph, a word-space occupies the fourth position, the tenth position, and so on. We will use the term punctuation indicator as a general term covering punctuation marks and the other devices that fall within the domain of punctuation. The classification is thus as follows:
[1]
Punctuation indicators
  Segmental: punctuation marks, space
  Non-segmental: modifications
On another dimension, we need to clarify the domain of punctuation with respect to the size of the unit to which the punctuation applies. The punctuation marks mentioned above generally occur within a sentence (including its final boundary) but outside the individual words. There are two punctuation marks, however, that are normally word-internal: the apostrophe and the hyphen. Words may also contain various non-segmental marks, diacritics, but we do not regard these as falling within the domain of punctuation. For example, accents (which do not of course appear in native English words, but are nevertheless found in some words that are otherwise fully anglicised, such as fiancé) are simply a matter of word-spelling. There are also punctuation marks that can apply beyond the sentence: parentheses and quotation marks can enclose stretches of writing longer than a sentence. In addition, the division of a text into paragraphs (marked by a new line and, usually, indentation space) can also be regarded as a matter of punctuation. It is not usual, however, and nor would it be helpful, to extend the domain of punctuation to cover the lay-out of larger units (division into chapters or sections, use and format of headings, and so on). It follows that a non-segmental feature such as italics counts as a punctuation indicator when it serves to mark quotation, a title or emphasis, but not when it is used for a heading of a certain hierarchical level in the organisational structure of a book or comparable document.
1.2 Indicators and characters
In virtually all written material the apostrophe is physically -- or, as we shall say, graphically -- identical with a single quotation mark. We need, therefore, to distinguish between two kinds of concept which we will call indicators and characters. The characters are the graphical shapes, or symbols, that realise the indicators. Apostrophe and single quotation mark are then distinct indicators that may be realised by the same character.
For reasons we will discuss in #6, we take single and double quotation marks as distinct indicators, but each of them can be realised by three different characters. Quotation marks normally occur in pairs, and there is one character that is used to open the quotation (‘ or “), another which is used to close it (’ or ”), and a third that is used in both positions in fonts (such as the standard typewriter keyboard) that do not have the separate opening and closing characters (' or "). We do not have opening and closing quotation marks as distinct indicators because the choice of character is predictable from the position: they are contextual variants of the same indicator. But the apostrophe has to be distinguished from the single quotation mark because it can never be realised by the character used to open a quotation -- even when it is used at the beginning of a word, as in rock ’n’ roll (not *rock ‘n’ roll).
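The relation between indicators and characters described here can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the names OPENING, CLOSING, NEUTRAL, realise_quotation_mark and realise_apostrophe are hypothetical, and the has_directional flag stands in for whether the font provides separate opening and closing characters.

```python
# Hypothetical tables mapping each quotation-mark indicator to its characters.
OPENING = {"single": "\u2018", "double": "\u201c"}   # opening curly quotes
CLOSING = {"single": "\u2019", "double": "\u201d"}   # closing curly quotes
NEUTRAL = {"single": "'", "double": '"'}             # typewriter-style quotes


def realise_quotation_mark(kind, position, has_directional=True):
    """Pick the character for a quotation mark from its position.

    kind is "single" or "double"; position is "open" or "close".
    The choice is predictable from position, which is why opening and
    closing marks are treated as variants of one indicator.
    """
    if not has_directional:
        return NEUTRAL[kind]
    return OPENING[kind] if position == "open" else CLOSING[kind]


def realise_apostrophe(has_directional=True):
    """The apostrophe never uses the opening character, whatever its position."""
    return CLOSING["single"] if has_directional else NEUTRAL["single"]


# Word-initial apostrophe, as in the rock 'n' roll example.
print("rock " + realise_apostrophe() + "n" + realise_apostrophe() + " roll")
```

The apostrophe needs its own function because its realisation does not depend on position, whereas that of the single quotation mark does.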
The distinction between indicator and character is also important with respect to dashes and hyphens. We distinguish three indicators, illustrated in [2]:
[2]
ii (ordinary) hyphen non-negotiable
iii long hyphen the doctor--patient relationship
Consider finally the full stop and ellipsis points. The full stop is used to mark the end of a sentence or an abbreviation (as in Col. Blimp), but as these always have the same realisation we regard them as different uses of a single indicator. There is, however, a distinct indicator that is used to mark omission (as in The President said, `We will send as many troops as it takes ... to restore order in the region'); this indicator, which we call ellipsis points, is realised by a sequence of three dot characters or a single character consisting of a sequence of three dots.
In most other cases we have a simple one-to-one relation between indicator and character. The following table lists the punctuation marks we shall be concerned with, giving their realisation and commonly used alternative terms:
[3] indicator realisation(s) alternative terms
ii question mark ?
iii exclamation mark ! exclamation point (AmE)
iv comma ,
v semicolon ;
vi colon :
vii dash C -- - ---
viii parenthesis ( ) round bracket (BrE)
ix square bracket [ ] bracket (AmE)
x ellipsis points ... ellipsis
xi double quotation mark “ ” " double quote (mark), inverted commas
xii single quotation mark ‘ ’ ' single quote (mark), inverted commas
xiii apostrophe ’ '
xiv slash / stroke, solidus, virgule
xv long hyphen -- en-dash
xvi ordinary hyphen -
xvii asterisk *
1.3 The status of punctuation rules
A great deal of the written material that we read is put out by publishers (of books, newspapers, journals, etc.) with the text edited by people whose profession is precisely to prepare text for publication. To a significant extent this process involves the conscious application of codified rules, set out in manuals specific to a particular publishing house or accepted more widely as authoritative guides. Those outside the publishing trade are generally likely to be unfamiliar with at least some of the more technical rules, and in the context of preparing text for potential publication many writers will defer to the advice of handbooks and the like. It is true, of course, that style guides commonly deal with points of grammatical usage too, but here they have a less influential role: a very high proportion of our use of language involves spontaneous speech, with no need or opportunity to consult such works. For this reason, we ourselves in writing this chapter on punctuation have given greater weight to the prescriptions of major style manuals than we have in the chapters on grammar. But we should also note that many of the rules of punctuation that have been mastered by competent writers are part of tacit linguistic knowledge no less than the rules of spoken language are, and as such are never mentioned in usage manuals or style guides.
D Variation
In spite of the codification mentioned above, punctuation practice is by no means entirely uniform. On some matters, such as whether or not to mark abbreviations with a full stop, we find variation from one publishing house to another. More important, there is some significant regional variation, most notably with respect to the interaction between quotation marks and other punctuation marks.
It is worth noting, however, that we do not find social variation between standard and non-standard such as we have in grammar: there is no punctuational counterpart of grammatically non-standard usage like I ain't done nothing or Who done that? -- that is, a repertory of variants that are used in a consistent way by one social group but not by another. Moreover, the style contrast between formal and informal is of relatively limited relevance to punctuation. One might say that the multiple question marks and exclamation marks in [4] belong to informal style:
[4]
ii Thanks for inviting us -- we had a wonderful time!!
What we do find, however, is a distinction between light and heavy punctuation styles that is independent of regional and publishing house variation:
[5]
ii On Sundays, they like to have a picnic lunch in the park, if it's fine. [heavy]
1.4 Units of syntax and units of writing
D The orthographic sentence
Syntax is traditionally defined as the study of the way words combine to form sentences, but from a syntactic point of view the delimitation of the sentence is quite problematic. A sentence may have the form of a clause or of a sequence of clauses, and while sentences with the form of a clause can generally be delimited straightforwardly, it is not so clear when successive clauses are syntactically combined into a larger unit. The central cases of sentences with the form of a sequence of clauses are those where the clauses are coordinated, with at least one of them being marked by a coordinator. The syntactic construction of coordination, however, does not have to be explicitly marked by means of a coordinator: coordination can be asyndetic (Ch. 15, #@@). This is evident from examples like Her family, her friends, her colleagues had all rallied to her support, where the three NPs combine to form an asyndetic NP-coordination that functions as subject of the clause. Note also that the gapping construction can occur with asyndetic coordination. Compare, then:
[6]
ii Some went by bus, some by train.
iii Some went to the concert, some stayed at home.
Coordination, moreover, is not the only syntactic relation that need not be marked by any formal device. The same applies with supplementation (Ch. 15, #@@). Compare:
[7]
ii There's another reason why we should hesitate -- (namely,) it is likely that interest rates will rise again in a few months.
For these reasons there will often be no syntactically-marked distinction between a sentence with the form of a combination of two successive main clauses and a sequence of two sentences each of which has the form of a main clause. In writing, one function of punctuation is precisely to indicate whether successive clauses belong together or are to be treated as separate. In speech, prosody also serves to convey information about the relation between successive clauses, but it is important to emphasise that punctuation cannot be described as a means of representing the prosodic properties of utterances. When we are talking about the relation between successive main clauses, therefore, we cannot be neutral between the spoken and written medium. The term orthographic sentence is therefore applied to the unit that is defined by punctuation: leaving aside complications that we will take up below, an orthographic sentence is a unit of writing that begins with a capital letter and ends with a full stop, question mark or exclamation mark. The term `orthographic sentence' embodies no commitment as to whether or not the unit concerned is syntactically a sentence, a question which may have no determinate answer. Since this chapter is about punctuation, however, we will henceforth take it for granted that the term `sentence' on its own is to be understood as ``orthographic sentence'' unless we explicitly indicate otherwise.
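The working definition of the orthographic sentence lends itself to a small computational sketch. The Python fragment below is an illustration only, under our own assumptions: the function name orthographic_sentences and the regular expression are hypothetical, and the sketch deliberately ignores the complications taken up later (abbreviation full stops, quotations, embedded sentences).

```python
import re

# A boundary is whitespace that follows a primary terminal (. ? !)
# and precedes a capital letter. This is a simplifying assumption.
_BOUNDARY = re.compile(r"(?<=[.?!])\s+(?=[A-Z])")


def orthographic_sentences(text):
    """Split text into units that end in a terminal and start with a capital."""
    return [unit for unit in _BOUNDARY.split(text.strip()) if unit]


# Two orthographic sentences, although the second begins with a coordinator.
print(orthographic_sentences(
    "The house needs painting. And there's still the roof to be fixed."
))

# One orthographic sentence containing two asyndetically combined clauses.
print(orthographic_sentences("Some went to the concert, some stayed at home."))
```

Applied to a string such as Col. Blimp arrived., the sketch wrongly splits after the abbreviation, which shows why the terminal full stop and the abbreviation full stop have to be distinguished by more than their shape.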
D The orthographic word
Similar issues arise with the word. In the grammar we make a distinction between a morphologically complex word and a syntactic construction containing separate words:
[8]
ii Who lives in that green house opposite? [syntactic construction]
1.5 Functions and classification of punctuation indicators
D Four main functions
The punctuation indicators serve a range of functions which can be grouped (leaving aside a few minor special purpose uses) into four main types.
E (a) Indicating boundaries
[9]
ii By all means take the book with you, but be sure to return it.
E (b) Indicating status
[10]
ii The boys' behaviour was hardly likely to make her change her mind!
E (c) Indicating omission
[11]
ii `F*** off!' he yelled, `or I'll call the police.'
E (d) Indicating linkage
[12]
ii I met her in the dining-car of the London--Glasgow express.
D Prevention of misreading
We have noted that punctuation marks are often optional, with light and heavy styles differing with respect to how many of these optional marks are inserted. Even in what is overall a light style, however, such indicators will tend to be added if their omission might lead to an initial misreading of the sentence. Indeed, indicators may be inserted to prevent confusion of this kind even in places where they would not normally be permitted. Compare:
[13]
ii Liz recognised the man who entered the room, and gasped.
iii Most of those who can, work at home.
D Organisation of this chapter
It will be evident from the brief survey with which we began this section that a number of indicators have diverse functions. Most notably, perhaps, the full stop can mark the end of a sentence or indicate an abbreviation. As a consequence, it is not possible to draw up a satisfactory unidimensional classification of the punctuation indicators. The organisation of the rest of this chapter, therefore, represents a compromise between treating them in successive subsets and dealing with them function by function.
In #2 we describe what we call the primary terminals: the full stop as used to end a sentence, and the question and exclamation marks. With the latter two the function of marking status is more important than that of marking a terminal boundary, and they are not constrained to occur at the end of a sentence; nevertheless, they are mutually exclusive with the terminal full stop, and hence form a natural group with it.
A second group, dealt with in #3, consists of the comma, the semicolon, and the colon, which we refer to as secondary boundary marks. They are secondary in the sense that they mark boundaries within a sentence, not between sentences. Or rather, that is invariably the case with the comma and the semicolon, and predominantly the case with the colon.
We turn next, in #4, to parentheses. These occur in pairs (with distinct opening and closing characters), enclosing units which are usually smaller than a sentence, but do not have to be. In #5 we turn to the dash; in most of its uses this is a secondary boundary mark, but it has considerable affinities with the parenthesis, and hence is best dealt with at this point in the exposition.
The following section, #6, covers the related functions of quotation, citation and naming. Quotation marks, single or double, are the main indicators for these functions, but italicisation is used too, and there are also places where there is no punctuational indication at all. Square brackets and ellipsis points occur primarily within quotations and are thus dealt with in this section. Related in some respects to quotation is capitalisation, the topic of #7.
Finally, #8 deals with those aspects of word-level punctuation not already covered. By word-level punctuation we mean the marking of word boundaries and the use of punctuation marks (mainly hyphens and apostrophes) within a word. Other punctuation we will refer to by contrast as higher-level punctuation. We treat the slash as a word-level punctuation indicator on the grounds that it is not (or at least not normally) flanked by spaces.
2 Primary terminals
For the most part, discursive written text consists of a sequence of sentences, each beginning with a capital letter and ending with a primary terminal -- a full stop, a question mark or an exclamation mark.
The full stop that marks the end of a sentence we refer to as the terminal full stop, as opposed to the abbreviation full stop (and various more specialised uses of this indicator). We suggested above that the primary function of the question and exclamation marks is to indicate status rather than boundaries, and this is reflected in the fact that they differ from the terminal full stop in being able to occur medially, internally within a sentence, and to be followed by other punctuation marks:
[1]
ii Her son -- what a scoundrel he is! -- is threatening to sue her.
D Sentence terminals and clause type
In sentences with the form of a single clause, there is a significant correlation between terminals and clause type. The default relations are illustrated in:
[2]                                        clause type     sentence terminal
ii  Let me know if you need any help.      imperative      full stop
iii Have you seen my glasses?              interrogative   question mark
iv  What nonsense they talk!               exclamative     exclamation mark
D Question mark
As the name implies, this indicates that the constituent it terminates has the status of a question.
E Terminal of unembedded sentence
In the simplest case the question mark occurs at the end of an unembedded sentence, in which case it is in contrast with the full stop and (normally) the exclamation mark. It is the default punctuation mark following an interrogative main clause, whether closed or open. It is also used after other clause types with the punctuation itself signalling the question meaning, as rising intonation does in the corresponding spoken forms:
[3]
ii Why do fools fall in love? [open interrogative]
iii You saw him, then? [declarative]
iv Take it back on Saturday? [imperative]
Examples where the sentence has the form of a sequence of main clauses are:
[4]
ii Where did you get it from and how much did it cost?
iii It certainly looks very good, but isn't it rather expensive?
The question mark tends to be replaced by one of the other sentence terminals in questions that are used as indirect speech acts:
[5]
ii Why don't you try to get this report to me by tomorrow.
iii Aren't they lucky to have got away with it!
iv Who cares what I think about it, anyway!
E Embedded questions
When a question is embedded, the punctuation depends on the grammatical form: it normally takes a question mark if it has main clause form, but not if it has the form of a subordinate content clause. Compare:
[6]
ii a. Again the question arises: why were b. Again the question arises as to why
iii a. Her son (you remember him, don't b. [no subordinate version]
Note that where the question has main clause syntax it may or may not begin with a capital letter, signalling its presentation as an embedded sentence. Example [ia] is a case of direct reported speech (Ch. 11, #@@); here a capital is required if the question is enclosed in quotation marks, but otherwise lower case is permissible, especially with relatively short questions (I'm afraid he always asks himself, what's in it for me?). In [iia] the question is cited or identified, and here capitalisation is optional. In [iiia] the question is parenthesised (and for this type there is no matching subordinate construction); here capitalisation, while not impossible, is relatively unlikely.
E Parentheticals
Sentences containing interrogative parentheticals, or parentheticals in construction with an interrogative main clause, are illustrated in:
[7]
ii Will he tell them?, she asked.
iii Will he tell them, I wonder?
iv Will he tell them, do you think?
E Use of question mark to indicate doubt
[8]
ii He lives with an ophthalmologist (?) in Kensington.
D Exclamation mark
E Terminal of unembedded sentence
[9]
i If only we had listened to her! That it should have come to this! Quick!
ii What a mess they made of it! How kind you are!
iii Look out! Get some water!
iv That's cheating! They had come without any money!
v Isn't it fantastic! What does it matter, anyway!
E Embedding
Like questions, exclamations can be embedded within a matrix sentence, and may also be subclausal:
[10]
ii At first things went smoothly, but soon, alas!, the casualties began and we had to devise a new strategy.
[11]
ii It's amazing what a difference a good night's sleep can make!
D Multiple terminals
It is possible for question and exclamation marks to be iterated for emphatic effect, and for an exclamation mark to follow a question mark:
[12]
ii Guess what -- we've sold the house at last!!
iii Did you see his face when she mentioned the doctor?!
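The iteration and combination of terminals just illustrated can be picked out mechanically. The following Python sketch is our own illustration rather than part of the description: the name terminal_cluster and the pattern are hypothetical, and the pattern assumes that only question and exclamation marks iterate, while a terminal full stop occurs singly.

```python
import re

# Either a run of question/exclamation marks at the end of the string,
# or a single final full stop (an assumption made for this sketch).
_CLUSTER = re.compile(r"[?!]+$|\.$")


def terminal_cluster(sentence):
    """Return the terminal mark or marks that close an orthographic sentence."""
    match = _CLUSTER.search(sentence.rstrip())
    return match.group(0) if match else ""


for example in (
    "Guess what -- we've sold the house at last!!",
    "Did you see his face when she mentioned the doctor?!",
    "The house needs painting.",
):
    print(repr(terminal_cluster(example)))
```

A sentence-medial mark, such as the exclamation mark in Her son -- what a scoundrel he is! -- is threatening to sue her, is not returned, since only the end of the string is examined.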
D Punctuation of phrases and coordinate main clauses as separate sentences
[13]
ii The house needs painting. And there's still the roof to be fixed.
3 The secondary boundary marks: comma, semicolon and colon
While the terminal full stop marks the boundaries between successive sentences, the comma, semicolon and colon normally mark boundaries within a sentence, and hence can be regarded as secondary boundary marks. They indicate a weaker boundary than the full stop, and we will see in #3.1 that there are grounds for regarding the comma as weaker than the colon or semicolon, so that these indicators may be arranged into a hierarchy of relative strength as follows:
[1]
full stop (strongest) > colon, semicolon > comma (weakest)
In the present section we confine our attention to sentences containing neither parentheses nor dashes. The dash is also a secondary boundary mark in its main use, but it does not fit neatly into the hierarchy of strength, and we defer consideration of it until #5.
D Exception: colon marking a non-final sentence
One exception to the distributional distinction between the primary and secondary boundary marks is that the colon is sometimes followed by a capital letter:
[2]
ii A number of questions remain to be answered: Who will take responsibility for converting the records to digital form? How are the old records to be stored? Who will have access to the digital files?
3.1 Some formal preliminaries
D Asymmetry between marking of left and right boundaries
There is an important asymmetry in the marking of boundaries:
[3]
ii Constituents whose left boundary is marked almost always have their right boundary marked -- by a mark at least as strong as the one on the left.
[4]
b. She suggested that the most important factor had been overlooked: the cost.
c. He has written books on Babe Ruth; on Tinker, the shortstop, Evans, the second baseman, and Chance; and on Hank Aaron.
ii a. *Anyone can take part, provided they're over 18 so there'll be no problem
b. *He told the press his reason: he did not want to have to renegotiate his contract, but he did not give any explanation to the team owners.
c. *He has written books on Babe Ruth; on Tinker, the shortstop, Evans, the second baseman, and Chance; and on Hank Aaron, and they've all sold well.
D The strength hierarchy
It is constraint [3ii] that justifies the hierarchy of strength given in [1] above. In particular, it provides evidence that the comma is weaker than the colon and semicolon. A constituent with a colon or semicolon on the left cannot have a comma on the right, as illustrated in [4iib--c]. It is not possible to establish any categorical difference between colon and semicolon in this respect, and it is for this reason that we have placed them at the same position in [1]. Compare, for example:
[5]
ii With a book as complex and anarchic as this, such reductionism is misleading. You could as easily say it was about the failure of Sixties' radicalism; the decline of the dollar; the hegemony of television culture: it is all these, and more.
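Constraint [3ii] and the strength hierarchy can be stated as a simple comparison. The Python sketch below is illustrative only: the numeric ranks and the names STRENGTH and right_boundary_ok are our own assumptions, as is the decision to rank the question and exclamation marks with the full stop. Colon and semicolon share a rank because no categorical difference between them has been established.

```python
# Hypothetical numeric ranks for the strength hierarchy: higher means stronger.
# Ranking ? and ! with the full stop is an assumption made for this sketch.
STRENGTH = {",": 1, ";": 2, ":": 2, ".": 3, "?": 3, "!": 3}


def right_boundary_ok(left_mark, right_mark):
    """Return True if the right mark is at least as strong as the left mark."""
    return STRENGTH[right_mark] >= STRENGTH[left_mark]


# A constituent opened by a colon and closed by a comma, as in [4iib].
print(right_boundary_ok(":", ","))

# A constituent opened by a comma and closed by a semicolon.
print(right_boundary_ok(",", ";"))

# Semicolon on the left, colon on the right, as in [5ii].
print(right_boundary_ok(";", ":"))
```

The check covers only the relative strength of the two marks. It does not model the qualification `almost always' in [3ii], nor the further constraints on the colon and semicolon.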
D The single level constraint on the colon and semicolon
Two colons or semicolons may not occur at different levels within a single construction (leaving aside cases where one is located within a parenthesised element). Compare:
[6]
iii *All students had to take a language; Sue took French; she already spoke it well.
D Further constraints on the colon
The colon is subject to two further constraints. Firstly, unlike the comma and the semicolon, it is not used to separate elements in a coordinative relation, but is restricted to constructions containing just two terms. Compare:
[7]
ii Many welcomed the proposal; some were indifferent; a few strongly opposed it.
[8]
Secondly, a constituent whose left boundary is marked by a colon cannot be followed by further material in the same clause:
[9]
3.2 Uses of the secondary boundary marks
We observed in #1 that the syntactic relations of coordination and supplementation need not be formally marked, so that with a sequence of main clauses there may be indeterminacy as to whether or not they are syntactically related in a coordination or apposition construction. For this reason we will look first, in ##3.2.1--2, at these constructions in cases where they are formally marked (i.e. they are syndetic rather than asyndetic) and/or involve constituents lower in the hierarchy than main clauses (i.e. they are subclausal, with the understanding that this covers subordinate clauses). Then in #3.2.3 we consider asyndetic combinations of main clauses. The last two subsections, ##3.2.4--5, deal with remaining cases of subclausal boundaries, the first covering cases where there is no requirement that the left boundary be marked as well as the right, the second with what we call delimiting commas, where both boundaries must normally be marked.
3.2.1 Coordination, syndetic or subclausal
In coordination, punctuation is commonly used to separate one coordinate from the next. The comma is the default mark; under certain conditions, however, a semicolon (but not a colon) is used instead. We will look in turn at bare and expanded coordinates, i.e. those that respectively lack or contain a coordinator (see Ch. 15, #@@).
D Non-initial bare coordinates: left boundary mark obligatory
[10]
ii The President, Dr Jones, and I myself will chair the first three sessions.
iii Do you call this government of the people, by the people, for the people?
iv They can, should, and indeed must make due restitution.
v It has a powerful, fuel-injected engine.
D Non-initial expanded coordinates
With coordinates introduced by a coordinator, we have no categorical rule comparable to the one given for bare coordinates. This is an area where we find variation between heavy and light punctuation, the former style including more commas in this position than the latter. Punctuational marking is more likely before a long and complex coordinate than before a short and simple one (and thus, other things being equal, before a clause than before a VP, for example); it is somewhat more likely with but than with and and or; it is inadmissible in joint coordination (Ch. 15, #@@). Compare:
[11]
ii He packed up his papers and stormed out of the room.
iii I'll do my best, but I doubt whether I'll get very far.
iv My flat-mate and my brother's philosophy tutor have just got engaged.
D Use of the semicolon in coordination
A semicolon can be used instead of a comma, typically in relatively formal style, under conditions illustrated in:
[12]
ii After the war, the United States produced half of the world's goods; our manufacturers had no peers; and our military, bolstered by the atomic bomb, had enemies but no equals.
iii His band members are Phil Palmer, guitar; Steve Ferrone, drums; Alan Clark and Greg Phillinganes, keyboards; Nathan East, bass; and Ray Cooper, percussion.
iv Professor Brownstein will chair the first session, and the second session will be postponed; or I will chair both sessions.
v He had forgotten the thing he needed most: a map; and he was soon utterly lost.
3.2.2 Supplementation, syndetic and subclausal
D Markers of supplementation
As noted in Ch. 15, #@@, supplements may be marked by such indicators as namely, that is, that is to say, viz, for example, in particular, and so on. Supplements introduced by such items may be preceded by any of the secondary boundary markers. Examples [13i--ii] have subclausal and main clause supplements respectively:
[13]
b. This statement is still valid today, since `resemblances' lead us to think in `as if' terms; that is, in metaphorical terms.
c. Wittgenstein's treatment of the `Other Minds' problem is an extended illustration of a point in philosophical logic: namely, that the meaningfulness of some of the things we say is dependent on contingent facts of nature.
ii a. Mature connective tissues are avascular, that is, they do not have their own blood supply.
b. One way of speaking about this is to say that images in a dream seem to appear simultaneously; that is, no part precedes or causes another part of the dream.
c. Pneumatic bearings also have a considerable application which has not been developed outside gyroscopes: for example, a patent has recently been taken out covering the use of a pneumatic bearing for a glass polishing head.
The left boundary of subclausal supplements may be marked by a comma or a colon, though the constraints outlined in #3.1 mean that the colon is admissible only if the supplement follows the clause containing the anchor:
[14]
ii They went to Bill Clinton, the only man who could help them.
iii It was her face that frightened him most of all, the frosty smile, the brilliant unblinking eyes.
iv Either eat your breakfast or get dressed, one or the other.
v The ship steered between the buoy and the island: the only course that would avoid the rocky shoals.
vi Areas with a high concentration of immigrants tend also to be areas of ethnic conflict: Los Angeles, Miami, Adams-Morgan, Crown Heights.
3.2.3 Asyndetic combinations of main clauses
In combinations of main clauses with no coordinator or supplementation marker, there is no grammatical indication of the nature of the relation between the clauses. In some cases, notably where and or but could readily be inserted, they can be interpreted as coordinate; in others, the second provides an elaboration of the first -- an explanation, an exemplification, a consequence, and so on. In general, the absence of any grammatical link strongly favours a stronger indicator than a comma to separate the clauses. Thus, although examples like the following occur, they would be widely regarded as infelicitous in varying degrees:
[15]
ii *Your Cash Management Call Account does not incur any bank fees, however, government charges apply.
Nevertheless, there are certainly conditions under which a comma is acceptable, and we will accordingly give in turn examples of these asyndetic main clause combinations marked by a comma, semicolon and colon.
D Comma
<REC:8> [16]
ii To keep a child of twelve or thirteen under the impression that nothing nasty ever happens is not merely dishonest, it is unwise.
iii Some players make good salaries, others play for the love of the game.
D Semicolon
<REB:87> [17]
ii The Latin, for example, was not only clear; it was even beautiful.
iii Some colonies started under the rule of private corporations that looked for the profits in fish, fur, and tobacco; some were begun by like-minded religious seekers.
iv All students had to take a language; Sue took French.
v The bill was withdrawn; the sponsors felt there was not sufficient support to pass it this session.
D Colon
<REB:89> [18]
ii He told us his preference: Jan would take Spanish; Betty would take French.
iii The rules were clear: they were not allowed to speak to the committee directly.
iv Brown pointed out the costs to the community on the radio last night, and McReady mentioned the political consequence in this morning's paper: the bill will cost the taxpayers more than $100,000 in the first year, and may be seen as giving the Republicans an unfair electoral advantage.
Like the comma and semicolon, the colon can separate a positive--negative sequence, where the first clause contains not + only/simply/merely/just:
<REB:90> [19]
3.2.4 Further cases of simple boundary marking at the subclausal level
D Between verb and direct reported speech complement: obligatory comma or colon
<REB:83> [20]
ii He added: `Some missiles missed their targets, resulting in collateral damage.'
D Before certain types of complement: optional colon
<REC:9> [21]
ii The question to be considered next is: `How long should artificial respiration be continued in the absence of signs of recovery?'
D Between the main constituents of a gapped clause: optional comma
<REB:84> [22]
ii Some of the immigrants went to small farms in the Midwest; others, to large Eastern cities.
D Between subject and verb: comma under exceptional circumstances
<REB:85> [23]
3.2.5 Delimiting commas
Simple examples of delimiting commas are seen in:
<REC:10> [24]
ii The plumber, it seems, had omitted to replace the washer.
iii Henry, who hasn't even read the report, insists that it was an accident.
iv I suggest, Audrey, that you drop the idea.
<REC:11> [25]
ii Things are quite difficult: unlike you, I don't get an allowance from my parents.
iii We've been making good progress; even so, we've still a long way to go.
iv The plumber had omitted to replace the washer, it seems.
v They want to question Henry, who hasn't even read the report: it's quite unfair.
vi I suggest you drop the idea, Audrey; it would be better to stay where you are.
D Types of delimited element
The above examples illustrate the range of elements that are commonly delimited. In [25i--iii] we have an adjunct, in [iv] a parenthetical, in [v] a supplementary relative clause, in [vi] a vocative. With parentheticals and vocatives delimiting punctuation is required. Supplementary relative clauses, and similarly detached participials, are usually set off punctuationally, but (contrary to the rules given in the manuals) examples without punctuation are certainly attested. Supplementary NPs interpolated within a clause also take delimiting punctuation, as seen earlier in [14i]. In addition, commas are obligatory with the peripheral elements in left and right dislocation structures (Ch. 16, #@@): My neighbour, she's just won the lottery (left dislocation), I don't think a lot of him, the new manager (right).
E Constituents introduced by coordinators
We have seen that commas are often used to separate coordinates but, less commonly, they have a delimiting function:
<REC:12> [26]
ii She laughed, and laughed again.
iii He seemed to be both attracted to, and overawed by, the new lodger.
E Adjuncts and complements
Because the function of delimitation is to set an element off from the central part of the message, it applies in clause structure predominantly with adjuncts rather than complements. Delimitation of a complement in its basic position is normally grossly deviant: *He blamed, the accident, on his children. With adjuncts, there is considerable variation as to when delimiting commas are used: this is the area where the contrast between the heavy and light styles of punctuation is most evident.
The main factors influencing the use of delimiting punctuation are:
<REC:13> [27]
ii whether or not there are punctuation marks nearby
iii the linear position of the constituent
iv the semantic category of an adjunct
v the possibility of misparsing
vi prosody
<REC:14> [28]
ii She was not sorry he sat by her but, in fact, was flattered.
As for position, delimiting commas are most likely with adjuncts located internally within the clause. And they are more likely with elements in front position than at the end of the clause. This latter point applies, indeed, to complements too: in the relatively few cases where complements are delimited they are in front position. Compare, then:
<REC:15> [29]
b. To have any chance of winning, you'll have to train every day.
ii a. He's not humble. [complement]
b. Humble, he's not.
<REC:17> [30]
ii She didn't buy it, because her sister had one.
Consider next the semantic category of the adjunct. We have noted that complements are not normally delimited and this reflects the fact that they are more tightly integrated into the main predication; similarly, within the very wide range of adjunct types, those that are related most directly to the verb and its complements are less likely to be marked off by commas than the semantically more peripheral ones. Within the (necessarily incomplete) list of categories given in [1]@@ of Ch. 8, #1, the later ones thus tend to favour delimitation more than the earlier ones. Among the categories that most strongly favour commas are adjuncts of result, evaluative adjuncts (especially when non-initial), speech-act related adjuncts and connectives:
<REC:18> [31]
ii No one had noticed us leave, fortunately.
iii Frankly, it was an absolute disgrace.
iv It now looks likely, moreover, that there will be another rate increase this year.
<REC:19> [32]
Consider finally the relevance of prosody. We have emphasised that punctuation cannot be regarded as a means of representing the prosodic properties of utterances, but there is no doubt that there is some significant degree of correlation between the use of delimiting commas and the likelihood that the constituent concerned would be set apart prosodically in speech. Compare, for example:
<REJ:36> [33]
ii That is clearly unsatisfactory. Thus the original proposal still looks the best.
4 Parentheses
In their primary use parentheses occur in pairs and enclose what we will call a parenthesised element. Their function is to present that element as extraneous to a minimal interpretation of the text, as inessential material that can be omitted without affecting the well-formedness and without any serious loss of information. They provide an elaboration, illustration, refinement of, or comment on, the content of the accompanying text.
D Range of parenthesised elements
<REC:22> [1]
ii Southern liberals (there are a good many) often exhibit blithe insouciance.
iii But listening to his early recordings (which have just been re-issued by Angel), one has the impression of an artist who has not yet found his voice.
iv If your doctor bulk bills (that is, sends the bill directly to the Government) you will not have to pay anything.
v It seems that (not surprisingly) she rejected his offer.
vi The discussion is lost in a tangle of digressions and (pseudo-) philosophical pronunciamentos.
vii Any file(s) checked out must be approved by the librarian.
viii One answer might be that only different (sequences of) pitch directions count as different tones with respect to the inventory.
In all these examples except [ii], the parenthesised element is integrable in the sense that the parentheses could be omitted or (as in [iii--v]) replaced by commas (at the left or both left and right boundaries). With the non-integrable type the status of the parenthesised element cannot be changed in this way. Where it is medial within the containing clause, the parentheses could only be replaced by dashes, which would make hardly any change to its informational status; where it is final, a colon or semicolon could be used to separate it from what precedes. The non-integrable type characteristically has the form of a main clause; we also find sequences like that in:
<REC:23> [2]
D Linear position
Non-integrable parenthesised elements must follow the constituent they are associated with, their anchor. Compare [1ii], for example, with *The committee included a group of (there are still a few around) Southern liberals. Integrable ones occupy the same position as they would if the parentheses were dropped, but there is a constraint prohibiting parenthesisation of an element at the absolute beginning of a clause. Thus we can parenthesise an element following a clause subordinator, as in [1iv], or following a coordinator -- as in but (not surprisingly) she rejected his offer -- but not right at the beginning: *(Not surprisingly) she rejected his offer.
D Combination with other punctuation marks
Punctuation within the parentheses depends mainly on the requirements of the parenthesised element itself. Thus terminal question and exclamation marks are used when it has the appropriate status. A full stop, and associated initial capital, however, is permitted only when the parenthesised element is not embedded within a sentence: compare [1i--ii]. The hyphen in (pseudo-) philosophical is required to be inside the parentheses because if pseudo were dropped the hyphen would drop too.
Punctuation outside the parentheses depends on the requirements of the containing sentence: it is the same as it would be if the parenthesised element were omitted. Any such punctuation normally follows, rather than precedes, the parentheses, as in [1iii].
D The single layer constraint
It is normally inadmissible to have one pair of parentheses included within another (leaving aside the secondary uses mentioned in footnote 14). Some manuals recommend that where the need for such embedding arises square brackets should be used at the lower level, but this is very much a minority usage. The usual way of solving the problem is to have parentheses at one level, dashes at the other, with no constraint on which of them occurs at the higher level:
<REC:24> [3]
ii Measures by Britain -- land of la vache folle (mad cow disease) -- to contain the problem have been ineffective.
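The single layer constraint lends itself to a mechanical check. The following is a minimal sketch, assuming plain text in which round brackets are used only in their primary, paired function; the function names are hypothetical, and the secondary uses mentioned above (such as file(s)) are not distinguished.

```python
def max_paren_depth(text: str) -> int:
    """Return the deepest level of round-bracket nesting found in text."""
    depth = 0
    deepest = 0
    for ch in text:
        if ch == "(":
            depth += 1
            deepest = max(deepest, depth)
        elif ch == ")":
            # An unmatched closing bracket is ignored rather than counted.
            depth = max(depth - 1, 0)
    return deepest


def violates_single_layer(text: str) -> bool:
    """True when one pair of parentheses is included within another."""
    return max_paren_depth(text) > 1


example = ("Measures by Britain -- land of la vache folle (mad cow disease) "
           "-- to contain the problem have been ineffective.")
print(violates_single_layer(example))
```

Because the example puts dashes at the higher level and parentheses at the lower one, only one layer of parentheses is present and the check does not flag it.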
Parentheses set the enclosed material apart from the main text in such a way that the latter cannot depend on it for its well-formedness or interpretation. This is why such examples as the following are inadmissible:
<REC:25> [4]
ii *She brought in a loaf of bread (and a jug of wine) and set them on the table.
iii *Ed won at Indianapolis (and Sue came in second at Daytona) in the same car.
iv *Languages like these (which linguists call `agglutinating') are of great interest. Agglutinating languages are found in many parts of the world.
5 The dash
Dashes occur either in pairs or singly, marking an ostensible break or pause in the production of the text. They are not used to separate coordinates, and hence, unlike the comma and the semicolon, they do not occur in open-ended series.
D Paired dashes
<REC:58> [1]
ii Exeter clearly enjoyed full employment -- as full, that is, as was attainable in the conditions of the time -- while Coventry languished in the grip of severe unemployment.
iii The book -- and the movie -- were strongly condemned by the Legion of Decency.
iv Immigrants do come predominantly from one sort of area -- 85 per cent of the 11.8 million legal immigrants arriving in the U.S. between 1971 and 1990 were from the Third World; 20 percent of them were from Mexico -- but services have not adapted to that reality.
v Many of Updike's descriptions of Hollywood -- the place -- are nicely observed.
vi In theory -- no, no theory! -- ideally, both description and dialogue should forward narrative.
In this function dashes are in competition with delimiting commas and parentheses; either could replace them in [i--ii], while commas could in [iii]. They mark a clearly stronger break from the surrounding text than commas, and allow a larger range of constituent types to be delimited -- including, for example, a main clause, or combination of main clauses, as in [iv]. The distinction between integrable and non-integrable parenthesised elements drawn in #4 thus applies to dash-interpolations too.
There are also significant differences between paired dashes and parentheses. Dashes cannot enclose part of a word or a separate whole sentence: they could not, for example, replace the parentheses in [1i, vi--vii] of #4. They would also be at best very questionable in [1viii], where sequences of is a non-constituent that is not coordinated or otherwise paired with a comparable one. No less important is the functional difference. We noted that a parenthesised element is presented as inessential to and insulated from the accompanying text. This is often not so with dash-interpolations. In [1v] of this section, for example, the place is understood in a semantically restrictive sense, serving to distinguish Hollywood the place from Hollywood the industry: with parentheses it would give descriptive rather than identifying information, like a supplementary relative clause (as in Hollywood, which is a place). In [1vi] the interpolation serves to justify the correction of in theory to ideally, and the dashes are neither omissible nor replaceable by parentheses. Example [1iii] shows that dashes do not insulate the interpolation: the verb-form were agrees with the coordinate subject, and the pronoun they has it as antecedent. Note, then, that all but one of the deviant examples in [4] can be corrected by replacing the parentheses with dashes. The exception is [4iii], where comparison with same requires that the coordinates be of equal status; compare, similarly, *Kim -- and Pat -- are a happy couple.
D Single dashes
<REC:59> [2]
ii Initiative, self-reliance, maturity -- these are the qualities we're looking for.
iii We've got to get her to change her mind; the question is -- how?
iv You may be right -- but that isn't what I came here to discuss.
v But we would like your permission to do -- that is, to go further if need be.
vi `I think --' `I'm not interested in what you think,' he shouted.
D Relations with other indicators
A dash can follow a question or exclamation mark and a closing quotation mark or parenthesis, but otherwise it is normally mutually exclusive with other indicators -- in particular, the comma:
<REC:60> [3]
ii *As he had no money -- he'd spent it all at the races --, he had to walk home.
<REC:61> [4]
ii As he had no money -- he'd spent it all at the races -- he had to walk home.
<REC:62> [5]
ii *Only four people came to the meeting: Ed, Mr Lake -- Ed's father -- Sue and me.
As far as the scope hierarchy shown in [1] of #3 is concerned, the dash can be placed on a level with the colon and the semicolon. Example [4i] shows that a dash can have scope over a comma, while the impossibility of having a comma in place of the second dash of [4ii] shows that a comma cannot have scope over a dash. Both scope relations hold between dash and colon or semicolon. In [1iv] the second dash has scope over the semicolon, while the semicolon has scope over the dash in The results are somewhat disappointing -- 20% down on last year; nevertheless, we are confident that the full year's results will match last year's. Similar pairs can be found for dash and colon.
Like the colon, the semicolon and parentheses, the dash cannot occur at two different hierarchical levels within a single constituent. The functional similarity between dashes and parentheses, however, means that where the need for such embedding might arise, the formal constraint can be avoided by alternating between the two different indicators, as in [3] of #4.
6 Quotation marks and related indicators
D Functions of quotation marks
Quotation marks serve to assign a special status to the stretch of text they enclose, which may be anything from a word to a sequence of paragraphs. Usually they indicate that the wording of the matter enclosed is taken from another source instead of being freely selected by the writer, as with ordinary text. The main categories of enclosed matter are as in [1], with corresponding examples given in [2]:
<REC:40> [1]
ii quotation from written works
iii certain kinds of proper names, e.g. titles of articles, or radio/TV programs
iv technical terms, or expressions used ironically or in some similar way
v expressions used metalinguistically
ii Fowler suggested that many mistakes made in writing result `from the attempt to avoid what are rightly or wrongly taken to be faults of grammar or style'.
iii `Neighbours' is Channel Nine's longest-running soap.
iv Their `mansion' was in fact a very ordinary three-bedroom house in suburbia.
v He doesn't know how to spell
`supersede'.
The above functions can all be indicated by means of either single or double quotation marks. AmE predominantly uses the double marks, while usage in BrE is divided, though British manuals tend to favour single marks. Strictly speaking, then, all examples containing quotation marks should have the % annotation, but we will simplify by omitting them, allowing this general statement to stand instead. (Our practice in this book is to differentiate between the two types, with single marks used for general purposes and double marks used for the special metalinguistic function of indicating meanings.)
When quotation marks are needed at different levels there is agreement that the two kinds of quotation marks should alternate:
<REC:42> [3]
ii Wilson's claim that ``Shakespeare's `To be or not to be' is surely the most famous line of English literature, or any other'' is disputed by French critics.
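The alternation of the two kinds of quotation mark at successive levels can be sketched as follows. This is an illustration only: the function names are hypothetical, the marks are written in the plain-text form used in the examples here, and the choice of which kind appears at the outermost level is left as a parameter, since usage differs.

```python
SINGLE = ("`", "'")
DOUBLE = ("``", "''")


def marks_for_level(level: int, outer=DOUBLE):
    """Return the (opening, closing) marks for a nesting level, 0 = outermost."""
    inner = SINGLE if outer == DOUBLE else DOUBLE
    # Even levels use the outer kind, odd levels the other kind.
    return outer if level % 2 == 0 else inner


def quote(text: str, level: int = 0, outer=DOUBLE) -> str:
    """Enclose text in the quotation marks appropriate to its level."""
    opening, closing = marks_for_level(level, outer)
    return f"{opening}{text}{closing}"


embedded = quote("To be or not to be", level=1)
print(quote(f"Shakespeare's {embedded} is surely the most famous line", level=0))
```

Passing outer=SINGLE gives the opposite arrangement, with single marks at the outermost level and double marks inside them.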
D The pairing of quotation marks.
Quotation marks normally come in pairs, with one member marking the beginning, the other the end, of the quotation. One departure from this pattern is sometimes found in fictional writing. If a single character's speech extends over more than one paragraph, an opening quotation mark may be used at the beginning of each successive paragraph, with the closing one being reserved for the end of the final paragraph of the entire sequence. This is especially common in older (e.g. Victorian) novels, some of which have whole chapters told by a character in the 1st person, with opening quotation marks at the beginning of every paragraph. However, it is found in contemporary fiction as well.
D Quotation marks in combination with other punctuation marks
When an expression is enclosed within quotation marks inside a larger matrix sentence we need to consider the distribution of punctuation marks within the quotation itself and in the matrix sentence. This is a matter on which there is a good deal of variation, firstly between AmE and BrE, and secondly, within BrE (and other non-American varieties), between different publishing houses.
Let us begin with an untypically simple example:
<REC:32> [4]
D (a) An internal terminal full stop cannot occur medially within the matrix
<REC:33> [5]
ii *She said, `I don't know.' and stormed out of the room.
iii *Nor would he consider trying to join Leslie and his men, rumoured to be close at hand and making for Scotland, `which I thought to be absolutely impossible. I decided instead to make for France', where it was hoped that Louis would back the royalist cause.
<REC:34> [6]
ii Yet Craig remains confident that the pitching `will come round sooner or later. We just have to hope everybody stays healthy.'
D (b) Raising of semicolons and colons
<REC:35> [7]
ii `We ought to get going,' she said; `the train leaves in half an hour.'
<REC:36> [8]
With quotations in final position it is usual to suppress one of the sentence terminals:
<REC:37> [9]
ii So I asked, `Whose fault was it?'
iii Did he really say `I couldn't care less'? [suppression of internal full stop]
D (d) Relative order of comma or full stop and closing quotation mark
AmE has a rule that when a comma or full stop is adjacent to a closing quotation mark the latter must follow, irrespective of the relative semantic scope. BrE tends to position the punctuation marks according to scope, i.e. the meaning, subject to the constraints covered in (a)--(c) above. Meaning, however, does not always provide an unequivocal criterion, so we find a certain amount of variation within BrE practice.
The following cases are straightforward and uncontroversial, with the versions given here representing uniform BrE practice:
<REC:39> [10]
ii Instead of doing his homework he was watching `Neighbours'.
iii I replied, `It was all Angela's fault.'
Less straightforward are cases like the following:
<REC:43> [11]
ii %She said, `It was all Angela's fault', but no one believed her.
iii %`In that case,' she said, `we'll do it ourselves.'
iv %`Some of them', she said, `look very unsafe.'
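Because the AmE rule is purely positional, it can be applied without reference to meaning. The following is a minimal sketch of that rule, assuming the plain-text closing marks ' and '' used in the examples here; the function name is hypothetical. It cannot tell a closing quotation mark from a word-final apostrophe (as in a plural genitive), so it is an illustration rather than a reliable converter.

```python
import re


def american_order(text: str) -> str:
    """Place a comma or full stop before an adjacent closing quotation mark."""
    # Group 1: one or two closing marks; group 2: the comma or full stop
    # that immediately follows them. The two groups are swapped.
    return re.sub(r"('{1,2})([,.])", r"\2\1", text)


print(american_order("Instead of doing his homework he was watching `Neighbours'."))
print(american_order("`Some of them', she said, `look very unsafe.'"))
```

The scope-based BrE placement has no comparable mechanical statement, which is consistent with the variation in practice noted above.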
D Marking alterations to quotations: ellipsis points and square brackets
Two indicators are used to mark alterations made to quoted matter -- ellipsis points indicate omissions and square brackets indicate substitutions or additions made by the quoting writer:
<REC:44> [12]
ii She concluded: `The first [model] fails the test of descriptive adequacy.'
iii According to Jones, `[N]o other language has such an elaborate tense system.'
iv It says that `the first version has been superceded [sic] by a cheaper model.'
D Alternatives to the use of quotation marks
E Block quotes
In expository texts, quotations of a substantial length (more than five lines, according to some style manuals) are often presented as block quotes, indented and set off from the surrounding text (and often in smaller type). In this case, no quotation marks are used:
<RDR:21> [13]
For the less central functions of quotation marks given in [1iii--v], italics are often used instead. With titles, it is common to make a distinction between various categories, with quotation marks used for articles in periodicals or chapters in books, for example, and italics for whole monographs or journals. Bold face and small capitals provide alternative means of indicating technical terms, and works on language will typically employ a variety of indicators for different kinds of metalinguistic use, as we do in this book. Italics are also commonly used for foreign language expressions, or for emphasis:
<REC:47> [14]
ii Ed is a writer -- a writer! -- and Sue composes crossword puzzles for magazines.
Direct reported speech -- in the broad sense of the term -- is not always marked as such by punctuational means, especially when it is a matter of thought or interior monologue:
<REC:45> [15]
ii I bet she's missed the train, he thought.
7 Capitalisation
The use of capital letters has two main functions: to mark a left boundary and to assign special status to a unit. As a boundary marker, capitalisation normally applies to the first letter of the first word of a sentence, though in verse it occurs at the beginning of a new line. The use of capitals to mark sentence boundaries has been dealt with in #@@, and in the present section we will confine our attention to capitalisation as a marker of status.
D Kinds of special status
As status markers, capitals are prototypically used with institutionalised proper names and functionally comparable expressions. In addition, they can mark personal and relative pronouns anaphoric to the name of a deity (God in His infinite mercy), personification (We can conceptualise this as a game played against Nature), emphasis or loudness (I said, Don't Do That!; He must be a Really Important Guy in your life), or key terms in technical and legal texts (the Tenant shall be responsible for all damage). Capitals are also used in many initialisms -- abbreviations (TV, VIP) or acronyms (AIDS, TESOL): see Ch. 19, #2.2@@. And there is the use of I for the nominative form of the 1st person singular pronoun.
D Grammatical categories marked by capitals
<REC:55> [1]
ii noun (or nominal)   next Monday, a Ford Cortina, a Beethoven symphony
iii adjective   French, Edwardian, Pinteresque, un-American
iv clause   What's Up, Doc?, Alice Doesn't Live Here Anymore
The precise way in which capitalised expressions are marked is subject to some variation, but the above examples illustrate a very common practice. Each word in a capitalised NP or clause has a capital letter except for short transitive prepositions (such as of, in, on), coordinators and, under certain conditions, the articles. The latter have a capital when part of the official title of a publication (such as The Times) or the official name of an institution (e.g. The European Union), but not in reference to holders of offices (the Bishop of London, the Queen) or when not part of the official title (the New Scientist). With an increasing number of compound proper nouns invented as product or business names, initial capitals appear in separate bases within the word even when there is no hyphenation: PetsMart, WordPerfect.
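The common practice just described can be sketched as a small routine. Everything here is illustrative: the function name and the article_in_title parameter are hypothetical, and the word lists are assumptions that go beyond the three prepositions named above. Whether an initial article belongs to the official title cannot be read off the string, so the caller has to supply that information.

```python
# Only of, in and on are named in the text; the rest are assumed additions.
SHORT_TRANSITIVE_PREPOSITIONS = {"of", "in", "on", "at", "to", "by", "for"}
COORDINATORS = {"and", "or", "but", "nor"}
ARTICLES = {"the", "a", "an"}


def capitalise_word(word: str) -> str:
    """Capitalise the first letter only, leaving the rest untouched."""
    return word[:1].upper() + word[1:]


def capitalise_expression(expression: str, article_in_title: bool = False) -> str:
    """Capitalise each word except short prepositions, coordinators and articles."""
    result = []
    for index, word in enumerate(expression.split()):
        bare = word.strip(",.;:?!").lower()
        if bare in SHORT_TRANSITIVE_PREPOSITIONS or bare in COORDINATORS:
            result.append(word.lower())
        elif bare in ARTICLES:
            # An article is capitalised only when it opens an official title.
            if index == 0 and article_in_title:
                result.append(capitalise_word(word))
            else:
                result.append(word.lower())
        else:
            result.append(capitalise_word(word))
    return " ".join(result)


print(capitalise_expression("the times", article_in_title=True))
print(capitalise_expression("the bishop of london"))
```

The sketch does not treat a preposition or coordinator in initial position specially, and it makes no attempt at word-internal capitals of the PetsMart kind.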
D Semantic categories
Capitalised expressions are used to refer to or denote a great range of different kinds of entity: indeed there would seem in principle to be no limit to it. Many are personal names, where surnames, given names and initials are capitalised (Jane Austen, T.S. Eliot). A personal name may be preceded by a capitalised appellation, abbreviated or not (Dr Jones, Professor Chomsky, Ms Greer, General Noriega, Rabbi Lionel Blum). Capitalisation is also used with the names of places (London, Steeple Bumstead), a geographical or topographical feature (the Thames, the Black Forest, the Gulf Stream), a monument or public building (the White House, the Cenotaph), an organisation (the Home Office, Amnesty International, Shell, Dolland and Aitchison), a political or economic alliance (the European Union), a country, nation or region (Great Britain, Scotland, Tyneside), languages and peoples (English, Chinese), historical or cultural periods or events (the Renaissance, the South Sea Bubble), social or artistic movements (Chartism, Decorated style), days of the week, various special days, and months (Tuesday, Christmas Day, September), deities (God), honorifics (Her Majesty), trademarks (Coca-Cola), computer software (Word, Emacs), a kind whose name is taken from a proper name (a Chevrolet, an Oscar, a Boeing 747), and more. Capitalisation is commonly accompanied by italicisation or quotation marks in the titles of published and artistic works, as described in #6 above.
Common nouns denoting roles or institutions are often capitalised when used in combination with the definite article in reference to a particular individual or entity:
<REC:57> [2]
ii I hear the University has increased its student intake again.
8 Word-level punctuation
8.1 Word boundaries
Word boundaries are marked by space, immediately adjacent to the word or separated from it by one or more punctuation marks. Opening quotation marks, parentheses and square brackets are located between the space and the left boundary of the word, other punctuation marks between the right boundary of the word and the following space. The dash is exceptional among the higher-level punctuation marks in that it is immediately adjacent to both the word on its left and the one on its right or is separated by space from both (as in the style used in this book). These points are illustrated in sentence [1i], whose ten (orthographic) words are listed separately, in abstraction from the higher-level punctuation, in [1ii]:
<REC:48> [1]
ii the vice-consul Ed's companion hasn't I'm told seen Oklahoma yet
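The abstraction of orthographic words from the higher-level punctuation can be sketched as follows. The sentence used is a hypothetical one constructed for this illustration, not a reproduction of [1i], and the function name and the set of marks to strip are likewise assumptions. Hyphens and apostrophes inside a word are kept as word-level punctuation; an apostrophe at the very edge of a word (as in a plural genitive) would wrongly be stripped by this simple approach.

```python
# Marks treated as higher-level punctuation at the edges of a word (assumed set).
HIGHER_LEVEL = ",.;:?!()[]`'\""


def orthographic_words(sentence: str) -> list:
    """List the space-delimited words with edge punctuation removed."""
    words = []
    # The dash is written here as a double hyphen; treat it as a separator
    # whether or not it is set off by spaces.
    for chunk in sentence.replace("--", " ").split():
        word = chunk.strip(HIGHER_LEVEL)
        if word:
            words.append(word)
    return words


# Hypothetical sentence, constructed for illustration only.
sentence = "The vice-consul (Ed's companion) hasn't, I'm told, seen `Oklahoma' yet."
print(orthographic_words(sentence))
```

Opening marks are removed from the left edge and other marks from the right edge in one step, since str.strip treats both ends alike; a stricter version would check that each mark appears on the side described above.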
8.2 Hyphens
8.2.1 Some initial distinctions
There are two hyphen indicators, an ordinary hyphen and a long hyphen, which is realised by an en-rule and of very limited distribution. As noted in #1.2, when the en-rule character is not available (as in handwriting or material written on a conventional typewriter), the functions of the long hyphen are taken over by the ordinary one.
At the first level we can distinguish three uses of the (ordinary) hyphen:
<REC:50> [2]
ii To mark a break within a word at the end of a line: the soft hyphen
iii To represent in direct speech either stuttering (`When c-c-can I come?') or exaggeratedly slow and careful pronunciation (`Speak c-l-e-a-r-l-y!')
D The soft hyphen.
The purpose of this hyphen is to allow the amount of space between words on different lines to be relatively uniform. It occurs especially, but by no means exclusively, in typeset and right-justified text, and in these cases the division is made by the printer or the word-processing program. Normally, the division is made in a manner designed to facilitate reading, based on a mixture of morphological, phonological and purely visual criteria. The precise rules used will depend on the publishing house style or the word processing system, but there is also significant regional variation, with AmE tending to favour breaks at syllable boundaries (e.g. democ-racy) and BrE those at morphological or etymological boundaries (demo-cracy). Regional differences are likely to diminish with the increasing internationalisation of publishing, and the increasing tendency to rely on automatic systems for word separation provided by word processing systems, which are for the most part developed in the United States and not redesigned to take account of other countries' traditional hyphenation practices.
Divisions are not normally permitted within monosyllabic words, or within components that have (or could have) a hard hyphen at one of their boundaries (thus school-master, but not *schoolmas-ter). They also tend to be disallowed if they would yield a unit spelt the same way as some unrelated word (*of-ten, *the-rapist, or *putt-ing, as a form of the verb put).
8.2.2 Inherent and long hyphens
Among the hard hyphens we can distinguish (though not always sharply) between those that are lexical and those that are syntactic. The lexical hyphens are found in morphologically complex bases formed by processes of lexical word-formation, as described in Ch. 19. Syntactic hyphens join forms together when they occur in a specific syntactic construction, namely as attributive modifier in a nominal.
D Lexical hyphens
The hyphen may join the bases of a compound (bee-sting) or the affix and base of a derivative (ex-wife).
E Compounds
We have noted that the component bases of what from a morphological point of view is a compound may be written in three ways: juxtaposed (blackboard), hyphenated (stage-manager) or separated (Nissan hut). It is an area where we find a great deal of variation, with respect either to particular items (e.g. startingpoint, starting-point, or starting point) or to different compounds of the same morphological type (e.g. dressmaking vs letter-writing). There are two general tendencies to be noted. First, compounds which are long established are more likely to be written in juxtaposed format than more recent ones (compare dishwasher and chip-maker). Second, AmE tends to use hyphens somewhat less than BrE.
To a large extent, the choice between the three formats has to be specified individually in the dictionary. We illustrate in [3], however, a range of morphological types where hyphens are found in most or a high proportion of cases (the categories and concepts invoked here are explained in Ch. 19):
<REC:51> [3]
ii contains transitive prep   free-for-all, sister-in-law, serjeant-at-arms
iii intransitive prep as 2nd base   break-in, build-up, drop-out, phone-in, stand-off
iv coordinative compound   Alsace-Lorraine, freeze-dry, murder-suicide
v nominal compound + @ed   one-eyed, red-faced, three-bedroomed
vi numerals and fractions   twenty-one, ninety-nine, five-eighths
vii dephrasal compounds   cold-shoulder (V), has-been (N), old-maidish
viii verb with noun as 1st base   baby-sit, gift-wrap, hand-wash, tape-record
ix 1st base is letter-name   H-bomb, t-shirt, U-turn, V-sign
x rhyming-base compounds   clap-trap, hoity-toity, teeny-weeny, walkie-talkie
There are also particular bases which always or usually take a hyphen: great, as used in kinship terms, always does (great-uncle), while self and the combining form pseudo@ usually do (self-knowledge, pseudo-science).
E Derivatives
Suffixes are almost invariably juxtaposed, whereas there are a number of prefixes which in BrE are usually or commonly hyphenated: non@, pre@, post@, pro@, anti@, ex@, co@, mid@ (but compare such semantically specialised forms as nonentity, midnight, etc.). It is also the usual practice, in both BrE and AmE, to insert a hyphen where there might otherwise be a danger of confusion caused by successive vowel letters or repeated sequences (re-elect, de-emphasise, de-ice, re-release), or to distinguish a word where the prefix is used in its productive sense from one where it is no longer analysable as a separate component (e.g. re-form, "form again", from reform; or re-cover, "cover again", from recover). Prefixes are generally hyphenated before a base beginning with a capital letter: un-American.
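The conventions just described for prefixes can be sketched as a small decision procedure. This is a minimal illustration under simplifying assumptions, not a statement of actual practice: the function name `attach_prefix` and the flag `contrast_with_fused` are hypothetical, the vowel test applies mechanically even though the text restricts the hyphen to cases where confusion might arise, and the BrE preference for hyphenating non@, pre@, etc. is not modelled.

```python
VOWELS = set("aeiou")


def attach_prefix(prefix: str, base: str, contrast_with_fused: bool = False) -> str:
    """Join a prefix to a base, inserting a hyphen in the cases noted in the text.

    contrast_with_fused is a hypothetical flag: the caller sets it when the
    prefix has its productive sense and an unanalysable word with the same
    spelling exists (re-form vs reform). The code cannot detect this itself.
    """
    starts_with_capital = base[0].isupper()                    # un-American
    vowel_clash = prefix[-1].lower() in VOWELS and base[0].lower() in VOWELS  # re-elect, de-ice
    repeated_sequence = base.lower().startswith(prefix.lower())               # re-release
    if starts_with_capital or vowel_clash or repeated_sequence or contrast_with_fused:
        return f"{prefix}-{base}"
    return prefix + base


print(attach_prefix("un", "American"))
print(attach_prefix("re", "elect"))
print(attach_prefix("re", "form"))
print(attach_prefix("re", "form", contrast_with_fused=True))
```

The productive/unanalysable contrast is left to the caller because it depends on meaning rather than spelling.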
E Conflicts of scope
In general a space marks a division at a higher level of constituent structure than a hyphen. The immediate constituents of oil-rich kingdom, for example, are oil-rich and kingdom, not oil and rich kingdom. There are cases, however, that depart from this pattern:
[4]
ii ex-army officer   non-mass market   pro-United States   mass market-style
D Syntactic hyphens
Hyphens are also used to join into a single orthographic word sequences of two or more grammatical words functioning as attributive modifier in the structure of a nominal:
[5]
ii a four-point plan   a fast-food outlet   the small-business sector
iii out-of-town shopping   the Hobart-to-Sydney classic   a creamier-than-average taste
    a never-to-be-repeated offer   the what-was-it-all-for? factor
The syntactic hyphen is used with expressions in modifier function that either do not occur elsewhere in the same grammatical form (The plan contains four points/*point, The company is based in Bradford / *Bradford-based) or occur elsewhere without hyphens (The reply was well argued, We shop out of town).
D The long hyphen
This is used instead of an ordinary syntactic hyphen with modifiers consisting of nouns or proper names where the semantic relation is "between X and Y" or "from X to Y":
[6]
8.3 The apostrophe
The apostrophe has three distinguishable uses:
[7]
ii reduction: can't   there's   fo'c's'le   ma'am   o'clock
iii separation: A's   PhD's   if's   1960's
The most common uses of the reduction apostrophe mark the negative inflectional forms of auxiliary verbs, as in can't, and the cliticisation of auxiliary verbs, as in There's no time (see Ch. 18, §§5.5, 6.3). Fo'c's'le is an alternative spelling of forecastle, one which matches the pronunciation. Ma'am is related to madam, but there are differences of use/meaning between the two forms. The apostrophe in o'clock reflects the etymology (of the clock), but there is no alternation with the full form in the current language. The apostrophe does not normally appear at the left or right boundary of a word in established spellings: such forms as 'phone or 'flu are now clearly archaic. The form 'n' is a reduced form of and used in a small number of fixed expressions, mainly rock 'n' roll and fish 'n' chips. Omission of initial h (the 'ammer) or the final g of the gerund-participle suffix (huntin') is found in the representation of direct speech to indicate socially distinctive pronunciations.
A minor use of the apostrophe is to separate the plural suffix from the base, as in [7iii]; this occurs when the base consists of a letter (She got three A's in philosophy), certain kinds of abbreviation, a word used metalinguistically, or a numeral (see Ch. 18, §4.1.1).
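The four conditions for the separating apostrophe can be sketched as follows. This is only an illustrative approximation: the function name `plural` and the flag `metalinguistic` are hypothetical, the test for an abbreviation (a capital letter after the first character) is a rough stand-in for "certain kinds of abbreviation", and the code applies the apostrophe in every qualifying case although the usage itself is a minor and variable one.

```python
def plural(base: str, metalinguistic: bool = False) -> str:
    """Form a plural, using the separating apostrophe in the cases of [7iii].

    metalinguistic is a hypothetical flag set by the caller when the word is
    being mentioned rather than used (as in if's); spelling alone cannot show it.
    """
    is_letter = len(base) == 1 and base.isalpha()            # A's
    is_numeral = base.isdigit()                               # 1960's
    is_abbreviation = any(ch.isupper() for ch in base[1:])    # PhD's (rough heuristic)
    if is_letter or is_numeral or is_abbreviation or metalinguistic:
        return f"{base}'s"
    # Naive default: ordinary plurals in -es or with stem changes are not handled.
    return f"{base}s"


print(plural("A"))
print(plural("PhD"))
print(plural("if", metalinguistic=True))
print(plural("1960"))
```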
8.4 The abbreviation full stop and minor reduction markers
D The full stop as a marker of abbreviation
The full stop is commonly used to mark an abbreviation -- in a broad sense of that term, covering certain kinds of contraction and acronyms. This use is subject to a great deal of variation. The omission of the abbreviation full stop is more common in BrE than in AmE, and more common in recent publications than in those of, say, twenty or thirty years ago. While there are certain kinds of reduced form where a full stop is categorically excluded, it is doubtful if there are now any cases where a full stop is required in all varieties and house styles.
The alternation is illustrated in [8] for various categories of abbreviation:
[8]
ii T S Eliot / T.S. Eliot   JFK/J.F.K.
iii eg/e.g.   cf/cf.   RSVP/R.S.V.P.
iv FBI/F.B.I.   pc/p.c.
v NATO/?N.A.T.O.   radar/*r.a.d.a.r.
vi demo/*demo.
D Terminal full stop omitted after abbreviation full stop
The abbreviation full stop is part of an orthographic word and as such can be followed by higher-level punctuation marks. A terminal full stop, however, is suppressed after an abbreviation full stop to avoid a sequence of two full stops. Compare:
[9]
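The suppression rule can be stated as a short procedure. The sketch below is a minimal illustration; the function name `add_terminal` is hypothetical, and the sentences are invented for the purpose, using the abbreviation F.B.I. from [8].

```python
def add_terminal(text: str, mark: str = ".") -> str:
    """Close a sentence with a terminal mark.

    A terminal full stop is suppressed when the text already ends in an
    abbreviation full stop; other terminal marks are added as usual.
    """
    if mark == "." and text.endswith("."):
        return text
    return text + mark


print(add_terminal("She works for the FBI"))
print(add_terminal("She works for the F.B.I."))
print(add_terminal("Does she work for the F.B.I.", "?"))
```

Only the full stop is suppressed: a question mark still follows the abbreviation full stop, in keeping with the statement that the latter can be followed by higher-level punctuation marks.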
The asterisk or dash can be used to reduce taboo words (though such reductions are much less common than they used to be); the dash is also found in other types of reduction, for example of names:
[10]
8.5 The slash
We include the slash among the word-level indicators since it usually occurs without flanking spaces:
[11]
ii the June/July period   staff/student relations