The Seuren Blog

Neg-Raising — 4

Remarkably, verbs of the want-class appear to be very strong Neg-Raisers, so much so that some linguists have declared their non-raised versions ungrammatical. I think that goes way too far, but it is a fact that these predicates, if they are Neg-Raisers, seem to be the strongest ones of all. Any suggestion of a polite understatement or of “euphemistic reticence” (Horn 1989, p. 333) so as to mitigate the negative import possibly carried by the utterance and leave it to the listener to infer the more precise L-not(p) from the less precise not-L(p) is absent here. A drowning child crying out I don’t want to die can hardly be taken to have raised the negation from the embedded infinitival out of regard for his/her possible rescuers, who are politely invited to infer that what the child wants is to stay alive. So, if want and its cousins are Neg-Raisers, they have gone all the way down the path of grammaticalization without any trace being left of the presumed pragmatico-linguistic origin of NR. Put differently, the relative weakness of the negative import adhering to want-not(p) sentences is insufficient to explain the very strong pressure on these predicates to have the negation as the highest operator, standing over the main predicate when a negative wish is expressed. This is a question that needs clarification.

I can think of two possible, and perhaps mutually compatible or complementary, answers, neither involving NR. One, not unattractive, solution is to split up the verb want (and its cousins and equivalents in different languages) into two variants, one expressing ‘wanting’ in a strong sense, the other expressing the weaker sense of mere willingness. Let us call the strong variant want1, with the meaning ‘want positively/preferably’, or, in the best practical translation I can think of, ‘really want’. The weak want, say want2, represents a mere willingness and is synonymous with be willing, be prepared. This ambiguity is taken to be universal, except for those languages that distinguish the two senses lexically.

Ancient Greek, for example, had two words for ‘want’: boúlomai and ethélo. Although they sometimes occur indiscriminately in the extant texts, the proper meaning of boúlomai is ‘really want’ or ‘rather want’ or ‘strongly want’, whereas ethélo has the weaker sense of ‘be willing’, ‘not be averse to’. Liddell & Scott, the 19th-century Ancient Greek dictionary sans pareil—I have used it for sixty years now and have never ever found a misprint in all of its 2111 huge pages of dense print in a variety of fonts—quotes, for example, “Ei boúlei, soì egò ethélo lógon léxai” (‘if you really want it, I am willing to tell you something’) from Plato’s Gorgias 522e.

If we further assume that want1 is a positive polarity item (PPI) and want2—other than be willing, which is neutral—a negative polarity item (NPI), then not-want1(p) will only be possible with not as the metalinguistic, presupposition-cancelling, echo-producing radical negation allowed to stand over PPIs, and not-want2(p) will only be possible with the default, presupposition-preserving, negation, or in other so-called ‘negative contexts’, such as questions or if-clauses. In fact, the simple verb want used in invitational questions like Do you want to join me for a drink? clearly represents want2 and not want1, unless the question is formulated as Do you really want to join me for a drink?, which is rather the opposite of an invitation. As far as I can see, this set-up seems to fit the facts to a tee. But it means that want and its cognates must be taken to be systematically ambiguous between these two senses. The advantage, I think, of this analysis is that it is more precise than any pragmatically flavoured solution and that it explains the fact that the negation over want and its cousins very strongly enforces the reading ‘not be willing’, much more so than in genuine NR cases. In fact, the strength of this enforcement matches the strength of the default, presupposition-preserving minimal negation over the nondefault, metalinguistic, presupposition-cancelling radical negation.

To hark back to the Greek pair boúlomai and ethélo, the suggestion arises that the former, when properly used, should thus be a PPI, whereas ethélo would tend towards being an NPI. Whether this is actually so, I have not been able to make out. But it is striking that of the many quotes given in Liddell & Scott, those with ethélo occur overwhelmingly in a negative context, whereas this is not so for boúlomai.

This solution thus does not involve NR: the negation is found in its ‘original’ place, the one standing over want1 being the radical, the one standing over want2 being the minimal negation. For either verb, the internal negation is possible but marked, just as it is for verbs of perception: I definitely heard the clock not strike is not normal English but stylistically marked and meant to have a special, mildly amusing, effect. (The example is taken, if I remember correctly, from the post-WWII British comedy Arsenic and Old Lace.)

The other possible answer I have in mind does not require NR either, nor the assumption of an ambiguous want, but requires a theory of prelexical syntactico-semantic analysis close to what was proposed by Jim McCawley in the late 1960s. This analysis plays on the fact that verbs for ‘wanting’ carry the notion of preference in their meaning, which can be given semantic substance by assuming the semantic operator ‘rather’ as an element in their syntactico-semantic analysis. The verb want is then decomposed, in the sense of prelexical syntax, into the component parts ‘rather’ and, say, ‘have-it-that’, with ‘rather’ already being a PPI and the predicate ‘have-it-that’ being neutral as regards polarity. The combination ‘rather+have-it-that’ is then taken to correspond to what we have called above want1. Since, in the semantic tree structure, ‘rather’ is an operator over ‘have-it-that’, the negation not can occur both above and below ‘rather’. If it occurs above ‘rather’, then, since ‘rather’ is a PPI, it must be the non-default, metalinguistic, presupposition-cancelling, radical NOT, which occurs only in assertive main clauses and only as the highest operator. We can thus have a sentence corresponding to, for example, ‘I NOT rather have it that I die’, with the radical NOT and the well-known echo-effect, expressing the meaning that it is not the case that the speaker wishes to die, as has been suggested in current discourse. If, by contrast, not occurs between ‘rather’ and ‘have it that’, resulting in a structure ‘rather+not+have-it-that(p)’, rather is the highest operator and not can, therefore, only be the default, presupposition-preserving, minimal not. The difference in meaning with ‘NOT+rather+have-it-that(p)’ is striking. 
Apart from the difference between radical and minimal NOT/not, there is the scope difference, which matches that between ‘NOT-rather’ and ‘rather-not’: the former is, strictly speaking, neutral as regards the speaker’s wishes; the latter, due to the higher rather, expresses a marked preference for the negative, as in ‘rather+not+I have-it-that(I die)’, which is close enough, semantically, to I don’t want to die in the default sense. Thus, in the configuration ‘rather+not+have-it-that(p)’, ‘have-it-that’ is equivalent to our want2, in the meaning ‘be willing’.

The problem is, of course, how to specify the lexicalizations into want1 and want2 in a systematic and generally valid way. For want1 this is relatively simple: the combination ‘rather+have-it-that’ can be combined into one form want, which has inherited its PPI-status from rather. But want2 causes a problem in that the minimal not occurs between ‘rather’ and ‘have-it-that’. McCawley was in the habit of using the expression “brute force” to gloss over problems of this nature, but that was precisely one of the things that made him vulnerable. So I won’t do that. I’d rather state the problem as it is. We can hypothesize, of course, that ‘rather+not+have-it-that’ as a whole lexicalizes into not-want2, but then we must assume an underlying not for all negative contexts, also for those without any overt not, which is quite a step to take. Or we can simply assume that the semantic element ‘rather’ is deleted during final lexicalization and ‘have-it-that’ is replaced with want, so that we have NOT-want1(p) and not-want2(p) alongside each other, with a very strong preference for the latter. But as long as we know so little about these things, that would be a clear instance of “brute force”. All I can do here is leave the question open.

One should note that in neither of these two analyses is it possible to fit the items concerned into a Square structure, the reason being that two different negations are involved, the one minimal and the other radical, which makes it impossible to have the relations of contradiction, contrariety and subcontrariety in one Square structure in any linguistically relevant way. For language, the Square is relevant only if radical falsity is left out of account.

I personally have not made up my mind yet as to which of the two answers I am willing to put my money on, but I think it is useful to consider both. A crucial point is the status of prelexical syntax. If it turns out after all—I mean after Chomsky’s perfidious war against Generative Semantics—to be possible to develop a well-motivated general theory of prelexical syntax providing legitimate room for the second solution, then the chances of my money going to the second solution will be considerably enhanced, since, I think, it will then account for the ambiguity of want-predicates in a motivated way and may even be seen to incorporate the first solution. If, on the other hand, prelexical syntax in any form turns out to be a nonstarter, I think one will have to fall back on the first solution, which is not too bad either, as the phenomenon of universal lexical ambiguities is becoming more and more known and accepted. Unless, of course, my entire approach is misguided, in which case I want to see solid arguments, not the kind of wishy-washy rhetoric one has unfortunately got used to in linguistics these days.

There is one further point that remains to be mentioned. Various authors, including Horn (1989, pp. 339–40), have argued that want and its cousins cannot be Neg-Raisers because every language has sentences like I want chocolate or I like you, with their respective negations I don’t want chocolate or I don’t like you, which show the same ambiguity as in I don’t want to die but which lack an embedded clause to raise the negation from. The conclusion drawn by some authors (discussed in Horn 1989) that, therefore, NR does not exist for any predicate at all goes way too far, but as long as one defends the NR analysis for the want-class, the objection has to be answered. However, in my view the want-class predicates do not induce NR: neither of the two solutions requires NR. So what we have is a situation in which a given predicate can take either a referential or a clausal argument term, which is nothing exceptional. Predicates of perception, such as hear, see and feel, are clear examples: one can see a sinking ship and one can see a ship sinking. Or else (but only for want-predicates, not for the like-class), one can assume that the predicate have is systematically deleted under both want1 and want2: I want an apple is then short for ‘I want to have an apple’ and likewise for I don’t want an apple—a proposal that has been well known since the days of traditional grammar. (Systematic deletions of designated lower predicates of a general nature are well known in the languages of the world. In German, for example, one can say Ich muß nach Berlin, literally ‘I must to Berlin’, which is short for ‘I have to go to Berlin’. The same go-deletion, by the way, occurs frequently in Shakespeare’s texts.)

This is as much as I have to say about NR for the moment. I am taking a few weeks’ holiday now, with only occasional access to the internet. I hope to be back in good shape by mid-January next year. In the meantime, to all of you my best wishes for a good Christmas and a happy New Year.


Neg-Raising — 3

So what is the significance of the classic Square of Opposition in the context of NR? First, it seems significant that all NR predicates occupy the A-position in the Square (not conversely: it is not so that all predicates in A-position are Neg-Raisers). But what does it mean to say that a predicate L occupies the A-position in the Square? This way of looking at lexical predicates is not found in the literature, but it is, in fact, fairly simple and straightforward, and, I hope, enlightening. I distinguish two classes of cases. The first class consists of predicates that do not take an embedded complement-S as an argument but only nominal arguments, such as hit or eat. These predicates form a Square structure when there is a different predicate entailed by, or a different predicate contrary with, them, or both. Thus, as we saw in the previous posting, since murdered entails dead, murdered is in A-position, with dead in I-position. The rest follows automatically: not dead now stands in E-position, and not-murdered in O-position. For not dead we can read alive: Square structures are valid under constancy of presupposition; given that, alive is the contradictory of dead within the universe restricted by the presupposition that the subject is animate. In this case, there are different predicates both in I-position (dead) and in E-position (alive). (Apparently, there never is a special lexicalization for the O-position; the reason for that is discussed in P.A.M. Seuren & D. Jaspers, ‘Logico-cognitive structure in the lexicon’, to appear in Language, some time in 2014.) All those well-known pairs of contraries, such as good–bad, polite–impolite, hot–cold, inside–outside, etc., etc., form Square structures with the simple positive in A-position and the contrary term in E-position. Litotes forms, such as not bad, not impolite, thus occupy the I-position, as they are the contradictories of their opposite number in E-position.
Perception predicates like see or hear, when they take simple nominal object arguments, are in A-position with regard to, for example, be aware of in I-position: the rest of the Square follows in virtue of the negation standing over these predicates.

The second class of A-position predicates takes a complement-S as one of its arguments. The complement-S can itself be negated, in which case we speak of the internal negation, as opposed to the external negation which stands over the main predicate. Now, to say that a complement-taking predicate L(p) (where p stands for the nonnegated embedded argument-S) occupies the A-position in the Square means that L(p) entails not-L(not-p), which will then occupy the I-position. Sometimes there is a special, mostly optional, lexicalization M(p) for not-L(not-p). When there is such an M, we say that L and M are duals. There may also be a special lexicalization for the E-position, as with say(p) in A-position and deny(p) (= say(not-p)) in E-position, which makes say(p) and deny(p) contraries.

Given the entailment from A-position to I-position—the so-called positive subaltern entailment—the rest follows again automatically, in virtue of the contradictories of both positions formed by means of a higher negation. Thus, given that L(p) stands in A-position, not-L(p) is in O-position and not-not-L(not-p), which equals L(not-p), in E-position. When there is a dual predicate M(p) standing for not-L(not-p), the E-position is filled by not-M(p), equivalent with L(not-p). When there is a contrary predicate K(p) standing in E-position, the I-position is filled by not-K(p).

Given the positive subaltern entailment from L(p) to not-L(not-p), it follows, by contraposition, that L(not-p) (the E-corner) entails not-L(p) (the O-corner). This is known as the negative subaltern entailment. The I and O corners, not-L(not-p) and not-L(p) respectively, are subcontraries, since if the former is false, its contradictory L(not-p) is true, and if the latter is false, its contradictory L(p) is true; but L(not-p) and L(p) are contraries, which means that they cannot both be true at the same time. It follows that not-L(not-p) and not-L(p) cannot be false together. They may be true together since no logical clash occurs when they are. This means that not-L(not-p) and not-L(p) are subcontraries. Voilà a Square structure (salva presuppositione), complete with all logical relations of the classic Square, but not necessarily with the Conversions.
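Since the derivation just given is purely formal, it can be verified mechanically on the classic quantificational instance of the Square, with all in A-position, some in I-position, no in E-position and not all in O-position. The following sketch is mine and merely illustrative; the restriction to nonempty domains stands in for constancy of presupposition:

```python
from itertools import product

def check_square(max_domain=3):
    """Enumerate all valuations over nonempty domains and test every
    relation of the Square for L = 'all x: P(x)'."""
    corners = []
    for n in range(1, max_domain + 1):            # nonempty domains only
        for world in product([True, False], repeat=n):
            A = all(world)                        # L(p):         all P
            I = any(world)                        # not-L(not-p): some P
            E = not any(world)                    # L(not-p):     no P
            O = not all(world)                    # not-L(p):     not all P
            corners.append((A, I, E, O))
    assert all(I for A, I, E, O in corners if A)        # positive subaltern entailment
    assert all(O for A, I, E, O in corners if E)        # negative subaltern entailment
    assert not any(A and E for A, I, E, O in corners)   # A and E are contraries
    assert all(I or O for A, I, E, O in corners)        # I and O are subcontraries
    assert all(A != O and I != E for A, I, E, O in corners)  # the two contradictory pairs
    return True

check_square()  # raises AssertionError if any relation of the Square failed
```

That all five relations come out true from the subaltern entailment plus the higher negation alone is, of course, exactly what the text claims.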

Examples of I-predicates are nondeontic allow(p), with force(p) as its dual in A-position, or approve(p) in I-position, with disapprove(p) in E-position, disapprove(not-p) in A-position and approve(not-p) in O-position. Accept(p) is likewise an I predicate, with accept(not-p) in O-position, not-accept(p) in E-position and not-accept(not-p) in A-position. It is a bit of a puzzle, but you will see that the pieces fit nicely together.

In some cases, again, there is a separate lexical form for some composition with the internal and/or external negation. Thus, for example, the predicate believe(not-p), in E-position, is optionally lexicalized in English as disbelieve, so that we get the Square configuration [A: believe(p)], [I: not disbelieve(p)], [E: disbelieve(p)], [O: not believe(p)]. Likewise, as has been said, for say(not-p), which optionally lexicalizes as deny(p), so that we have: [A: say(p)], [E: deny(p)], [I: not deny(p)], [O: not say(p)]. Yet, although both believe and say are A-predicates, believe is an NR predicate whereas say is not. Other A-predicates are, for example, force(p), which has allow(p) for a dual—that is, in I-position—and disallow(p) in E-position, assert(p), plan(p), hope(p), prefer(p) (with disprefer(p) as an optional lexicalization for the E-position).

It seems that all complement-taking NR predicates are A-predicates, though not vice versa. In the present context, this makes sense because an A-predicate taking scope over a negated proposition (or propositional function) has clear negative import, due to the universal character of the E-position. This negative import is indirect because, other than with the A-position, it is mediated by the internal negation over the embedded proposition p. This indirect negative import will, according to my hypothesis, have the effect observed in NR cases: the negation will tend to hop across the higher predicate and come out with highest scope. In sentences with the logical structure all-not, and-not, force-not, cause-not, and others of the same nature, this preference is easily realized, since all, and, force, and the others all have a single-item lexical dual available: some, or, allow, etc., respectively. All that has to be done to get the negation in highest position is to put it there and replace all, and and force, etc. with their respective duals. This is, presumably, why sentences with all-not, and-not, force-not, etc. are decidedly unpopular, though not altogether ungrammatical. Speakers don’t like Everybody did not leave; they prefer Nobody left. And likewise with John didn’t leave and Harry didn’t leave, and He forced me not to leave, which are dispreferred in favour of Neither John nor Harry left and He didn’t allow me to leave, respectively. This seems to be the case in all languages.
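The dual-swap just described is, at bottom, De Morgan’s law: all-not comes out equivalent to not-some, and and-not to not-or, which is why Nobody left and Neither John nor Harry left are available as raised counterparts. A quick mechanical check, under the obvious Boolean rendering (my own, purely for illustration):

```python
from itertools import product

# all-not(P) == not-some(P): the negation ends up highest by swapping in the dual
for n in (1, 2, 3):
    for world in product([True, False], repeat=n):
        assert all(not x for x in world) == (not any(world))

# and-not == not-or: 'John didn't leave and Harry didn't leave' == 'not (John left or Harry left)'
for p, q in product([True, False], repeat=2):
    assert ((not p) and (not q)) == (not (p or q))
```

Force-not and not-allow would pattern the same way once force and allow are stipulated as duals; that stipulation is lexical, not Boolean, so it is not checked here.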

But what if the predicate L over the negated p does not have a single-item lexical dual? Well, if raising not to the position over L changes the truth conditions in such a way that communicative havoc would result, there will be no NR. Thus, a sentence like I told John that it wasn’t raining does not allow for NR, since I didn’t tell John that it was raining conveys a very different message and saying the one while meaning the other will hardly make for useful communicative interaction. So the verb tell is not a Neg-Raiser, even though it is an A-predicate. In general, verbs that may serve as speech act verbs when used in the first person singular and in the present tense, such as say, tell, assert, claim, forbid, order, etc. do not allow for NR. The same goes for factive predicates, which presuppose the truth of the embedded clause. If the negation over the subordinate clause p is moved upward into the main clause, what is presupposed is no longer not-p but p: John realizes that not-p, with the factive predicate realize, presupposes the truth of not-p, but John doesn’t realize that p, under the default minimal negation that preserves presuppositions, presupposes the truth of p. Swopping the one for the other rather turns the entire communicative set-up upside down (and destroys any possible Square structure).

But what predicates are then left, which do allow for the negation in the lower clause to be raised to do service in construction with the main predicate? Here I fall back on the suggestion made by Horn and earlier authors to the effect that Neg-Raisers are those predicates that allow for use as a litotes or understatement, politely leaving it to the listener to infer the intended overall negative import, especially when they are in the first person and in the present tense. In this view, NR would then have its ‘origin’ in a polite, mitigating manner of speaking. The precise sense of the term origin in this context is unclear, but it is perhaps best taken to refer to a natural, preprogrammed tendency in the early formative stages of a language, such as the early Middle Ages for most Romance languages, or the immediate post-Conquest period for English. (It would be interesting to see what can be found out about NR in the early stages of Creole languages, some of which have been documented to some extent from very early on in their development, but I know of no work done in this respect.) NR would then quickly develop, in each language, into a grammaticalized construction, and it would fix on certain predicates allowing for NR but not on others, depending on ‘the will of the people’ (what I call settling in Chapter 1 of my book From Whorf to Montague, just out with OUP). Such processes are far from clear. They take place in the border area between language, cognition and society, an area that is still in need of much clarification and where things look as if they are more fluid than within the more strictly regulated boundaries of well-established language systems proper. And, of course, we would very much like to have some precise criteria for possible use as a litotes—again untrodden ground. Yet it does look as if the solution to the NR problem should be sought in this direction, hazy as most of all this still is.

It is anyway suggestive that there are sentences like I thought you would never come, that is, without NR, which are perfectly idiomatic and not interchangeable with their raised version: I didn’t think you’d ever come conveys a rather different message—an observation made by Wim Klooster many years ago. But note that the non-raised version does not have overall negative import but rather expresses the speaker’s satisfaction at the addressee’s arrival, late though it may have been, which makes the high negation as a warning sign for negative import inappropriate. Likewise for a sentence like You are not going to the supermarket, I suppose?, which, again, has no negative import but is, rather, a veiled request for a small favour, like a lift or an errand the addressee could run for the speaker. … to be continued

Neg-Raising — 2

When discussing a problem that has proved so refractory as NR, we should first of all be systematic. One way of being systematic is to distinguish which questions are on the table. I distinguish four of them: (i) How do we identify true cases of NR? (ii) What evidence will show that NR is or is not a rule of syntax? (iii) If so, how does NR fit into a theory of syntax? (iv) What triggers NR and how do we delimit the class of possible NR predicates? In the vast literature on NR, these questions are usually not clearly distinguished, which is not good. So let’s look at them briefly one by one.

The most common criterion for the identification of NR is semantic: I don’t think he is safe is normally felt to mean the more committal ‘I have the belief that he isn’t safe’ and not the less committal ‘I do not have the belief that he is safe’. But that is not enough. How do we know, for example, that a sentence like They are not anxious to move in, when used to conceal the blunter message ‘they are anxious not to move in’, or a sentence like He is not the sort of man to pick a quarrel with, used to convey the message that he is the sort of man it is wise not to pick a quarrel with, are not cases of NR but rather of litotes or understatement? The answer is complex. To exclude a sentence S from the NR category one will have to show that S does not fit into a pattern of regularities and other phenomena that are associated with clear cases of NR. And that again requires at least an initial theory of NR.

But let’s let this question rest for the moment and pass on to question (ii). To argue that NR as we know it is a rule of syntax, we will have to appeal to “phenomena involving rule interaction, mood, complementizer type, opposite polarity tags, object case marking, sentence pronominalization, anaphoric destressing, sluicing, subject-aux inversion, queclaratives, or the syntactic reflex of De Morgan’s Law” (Horn 1989, p. 313). The best known criteria have been the occurrence of strict (i.e. same-clause) negative polarity items (NPIs) in the lower clause, as in I don’t think he was in the least interested, where the NPI in the least, which normally requires a preceding negation in the same clause for the sentence to be grammatical, has seen its not promoted to the higher clause. Another, perhaps stronger, criterion is derived from the theory of island constraints: not in a lower clause cannot be raised to the main clause when the former is part of an ‘island’ as this notion is known in the theory of syntax. This criterion has recently been elaborated to great effect in Chris Collins & Paul M. Postal, Classical NEG Raising: An Essay on the Syntax of Negation, Linguistic Inquiry Monographs, MIT Press, to appear in May 2014. The authors argue, for example, that a sentence like *I don’t have the expectation that they will find a living soul there is ungrammatical because the strict NPI a living soul has had its required negation moved upward across the island boundary set by the complex NP the expectation that.

In Horn (1978, 1989)—two classics that are indispensable for any proper study of NR phenomena—a large range of arguments is discussed for and against a syntactic status for NR, but Horn comes to incompatible conclusions. In his (1978, p. 216) he concludes: “… the result is that, faute de mieux, NR must be regarded as a rule in the synchronic grammar of English and other languages.” Yet in his (1989, p. 313) we read: “In the end, I see no reason to repeal my conclusion in the 1978 paper: the strongest positive arguments on behalf of a syntactic rule of NR prove to be untenable, indecisive, or dependent on additional (often tacit) assumptions which are at best theoretically and/or empirically dubious.” Not only are these two conclusions, which are based on largely the same material, mutually incompatible, there also seems to have occurred a rather crucial lapse of memory, besides a significant shift over time, away from syntax and towards pragmatics. Horn never mentions the island argument put forward by a number of earlier authors, including myself in my 1974 paper, and elaborated by Collins & Postal in their recent monograph. Nor does he mention in his (1989) my plural-were argument, as in I don’t think either Harry or John were late (see the previous posting), still treated as a strong argument in his (1978). In any case, opinions as regards the possible grammatical status of NR have been heavily influenced, over the past forty years, by ideological parti pris: Chomskyans don’t like it because it smacks too much of Generative Semantics; pragmaticists don’t like it because their aversion to syntax increases as the syntax becomes more abstract; formal semanticists hate it because it complicates semantic compositionality. I will take it that NR is indeed a rule of syntax, even though there is no unanimity on how to integrate it into the machinery of grammar.
Which brings us to question (iii), which I won’t further discuss here, other than by saying that I expect some mileage from a precyclic notion of NR in terms of Semantic Syntax. Wim Klooster wrote an article specifically addressing question (iii), ‘Negative Raising revisited’, in Germania et Alia. A Linguistic Webschrift for Hans den Besten, edited by Jan Koster & Henk van Riemsdijk, 2003, available on the internet. Rejecting the Semantic Syntax way of accounting for NR, he proposes a Minimalist account. I won’t go into the details of that, since on general grounds I can’t see how Minimalism could pass muster as a theory of grammar, and also because I don’t want to cross swords in public with a close friend of five decades’ standing. I would have needed expert help anyway for a proper understanding of the minimalist grammatical technicalities.

What I want to focus on here is question (iv): What triggers NR and how do we delimit the class of possible NR predicates? Horn devotes many pages to this question. He lists as “potential NR triggers” predicates of (1) opinion, (2) perception, (3) probability, (4) intention/volition, (5) judgment / (weak) obligation (1978, p. 187; 1989, p. 323). What binds this class together is, according to him, the fact that its members occupy an intermediate position on a set of “pragmatic scales” (1989, pp. 324–5), characterized by the property that its negation “will be an intermediate value on the corresponding negative scale” (1989, p. 325). This property is based on the observation that in scalar contexts, as a general rule, the negation represents the weaker relation of contrariety, not the stronger one of contradictoriness.

Some will need a prompt here. Two sentences are contraries when they cannot both be true but may both be false at the same time (e.g. Caesar is a man and Caesar is a dog); they are subcontraries when they cannot both be false but may both be true at the same time (e.g. John is dead and John has not been murdered); they are contradictories just in case they can neither both be true nor both be false at the same time (e.g. John lives in Paris and John does not live in Paris). It follows that a proposition p under contradictory not1 ENTAILS the same proposition p under contrary not2: if p and not-p can be neither true nor false together, it follows that they cannot be true together. Logicians consider not to be a guaranteed producer of the contradictory of what forms its scope. If this distinction is applied to the raised not, when taken literally (i.e. as if it had not been raised), the semantic result in common language use is the contrary not2, not the contradictory not1.
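The prompt can also be given in executable form: each relation is a condition on joint truth values across a set of cases, and contradictoriness then provably entails contrariety. The toy ‘worlds’ below are mine, chosen to match the examples just given:

```python
def contraries(p, q, worlds):
    # cannot both be true at the same time (may both be false)
    return not any(p(w) and q(w) for w in worlds)

def subcontraries(p, q, worlds):
    # cannot both be false at the same time (may both be true)
    return not any((not p(w)) and (not q(w)) for w in worlds)

def contradictories(p, q, worlds):
    # can neither both be true nor both be false: always opposite values
    return all(p(w) != q(w) for w in worlds)

# what Caesar might be
kinds = ["man", "dog", "horse"]
is_man = lambda w: w == "man"
is_dog = lambda w: w == "dog"
not_man = lambda w: w != "man"                 # the contradictory not1

assert contraries(is_man, is_dog, kinds)        # Caesar is a man / Caesar is a dog
assert contradictories(is_man, not_man, kinds)  # p / not1-p
assert contraries(is_man, not_man, kinds)       # hence not1-p is also a contrary: not1 entails not2

# John's possible states, with 'murdered entails dead' built in
states = [("dead", True), ("dead", False), ("alive", False)]  # (life status, murdered?)
is_dead = lambda w: w[0] == "dead"
not_murdered = lambda w: not w[1]
assert subcontraries(is_dead, not_murdered, states)  # John is dead / John has not been murdered
```

Note that the subcontrariety check only goes through because the alive-and-murdered case is excluded from the set of states, which is just the entailment from murdered to dead in Boolean dress.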

The question is, of course, WHY. Why is not reduced to the weaker contrary not2 in NR contexts? Horn gives two answers to this question. One is (1989, p. 333) that the use of the weaker contrary not2 is more guarded or polite, in that it leaves it to the listener to infer the more precise contradictory not1, and that this pragmatic politeness convention has spread to other categories, to different degrees in different languages. The other answer, given in (1978, pp. 215–6), is: “The view proposed here is that NR originates as a functional device for signalling negative force as early in a negative sentence as possible.” Yet despite the many pages devoted to this topic, neither answer is shored up with precise criteria. One hardly manages to form a precise idea of what is meant and, beyond rough outlines, it remains obscure what exactly prohibits or allows NR in any given predicate. So what we are left with is a series of more or less useful and certainly intuitively appealing but informal suggestions as to what may or will be, or have been, behind the facts of NR.

I find this altogether unsatisfactory. For one thing, the scales that are invoked (1989, pp. 324–5) are anything but self-evident and no empirical criteria are given for their validity. For example, what to think of a scale (ranging from weak to strong) “be legal/ethical — want/choose/intend/plan — order/demand/require” (1989, p. 325)? It seems to have been put together pour le besoin de la cause. Consequently, it remains opaque why, in particular, predicates of wanting or liking are such extremely strong Neg-Raisers in all languages, even to the point that some linguists have considered a sentence like I want not to go ungrammatical. So let me try to bring a little more light to the entire question of NR, which is both crucial to the theory of language and, apparently, extremely hard to solve.

I take as my cue the intuition expressed in Horn (1978, pp. 215–6): “NR originates as a functional device for signalling negative force as early in a negative sentence as possible.” It seems to me that we may come a little closer to a solution if it is assumed that NR is the grammaticalization of a cognitive trend to give the negation operator the largest scope over assertive propositions that have negative import, which is either indirect or psychologically qualified. I have no formal definition at hand for the notion of negative import, but the informal definition is that what has been said in the end amounts to an overall negative conclusion or evaluation. Perhaps the notion is to be defined in psychological or cognitive terms, but even then it would be desirable to have criteria that are a little more precise. With this proviso, I think we may tentatively proceed and accept that negative import is gradable in that it can be more or less indirect and more or less qualified. Correspondingly, we see that different languages and dialects draw the line for NR at different places in a partially ordered gradability scale, leading to cross-linguistic variation in this regard. Verbs for ‘hope’, for example, are Neg-Raisers in some languages (e.g. German, Dutch) but not in others (e.g. English). The ‘original’ function of NR would then be to alert the listener straight away that s/he should be prepared for a negative message, which might not be altogether welcome.

Let us see first what is meant by indirect negative import, the question of what is meant by psychological qualification being left hanging for the moment. Cases of indirect negative import are those where the truth of the proposition logically implies the reality of a negative fact not-f without the falsity of the proposition referring to f being expressed directly by the negation operator taking paramount scope: ‘cause [not p]’ results in the imagined fact f referred to by p not becoming a real fact; ‘necessary [not p]’ means that f cannot be a real fact; ‘and [[not p], [not q]]’ means that both p and q are false; ‘all x [not PRED(x)]’ amounts to ‘no x [PRED(x)]’. I argue that placing the negation operator in highest position is psychologically functional in that it prepares the listener for the overall negative import of the utterance at hand. Lexical switch to the logical dual (from all to some, from cause to allow, from necessary to possible, etc.) is called for in such cases to avoid communicative breakdown. The point is that negative import should be signalled by the negation being at the top of the semantic tree.

A little note is called for about the notion of duality. Two operators (predicates) K and L are considered duals just in case not-K-not is equivalent with L and not-L-not is equivalent with K. Thus, all and some are duals because not-all-not is equivalent with some and vice versa. Likewise for order and (deontic) allow, and a few other predicate pairs, such as (epistemic) necessary and possible. The internal negation can swop places with the external negation, provided one dual is replaced with its opposite number.
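The duality equivalences for all and some can be verified exhaustively over a small finite domain. A minimal sketch of my own (the three-element domain is an arbitrary assumption): it enumerates every possible extension of a one-place predicate and checks that not-all-not coincides with some, and not-some-not with all.

```python
from itertools import product

# Check, over every one-place predicate on a small domain, that
# not-all-not is equivalent with some, and not-some-not with all.
DOMAIN = range(3)

def all_(pred):  return all(pred(x) for x in DOMAIN)
def some_(pred): return any(pred(x) for x in DOMAIN)

for ext in product([False, True], repeat=len(DOMAIN)):
    pred = lambda x, e=ext: e[x]        # the predicate with extension e
    assert (not all_(lambda x: not pred(x))) == some_(pred)
    assert (not some_(lambda x: not pred(x))) == all_(pred)
```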

A note also about the Square of Opposition, the classic traditional schema of predicate logic (not quite correctly attributed to Aristotle, but that is a different issue). Around 1900, the old Square was replaced with modern (Russellian) predicate logic, as it was widely believed that the old Square was logically faulty. In a number of recent and forthcoming publications I argue that this attitude is misguided in that the Square is not simply logically defective and is of great value in the study of language, more so, actually, than standard modern predicate logic. The Square is traditionally represented as a geometrical square, whose four vertices or ‘corners’ have the traditional type names A (for ‘all R is M’), I (for ‘some R is M’), E (for ‘no R is M’), and O (for ‘not all R is M’), as shown in Figure 1 (where arrows indicate entailment, the cross in the middle stands for contradiction, “C” means ‘contraries’, and “SC” means ‘subcontraries’).
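The relations in the Square can also be checked by brute force over small finite models. The sketch below is my own illustration (the three-element universe is an arbitrary assumption); it keeps the restrictor set R nonnull, which is the traditional Square's existential-import requirement.

```python
from itertools import combinations

# Brute-force check of the Square's relations over a three-element universe,
# with the restrictor set R kept nonnull (traditional existential import).
UNIV = (0, 1, 2)
SUBSETS = [set(c) for n in range(len(UNIV) + 1) for c in combinations(UNIV, n)]

for R in SUBSETS:
    if not R:
        continue                          # the Square requires a nonnull R-set
    for M in SUBSETS:
        A, I = R <= M, bool(R & M)        # 'all R is M', 'some R is M'
        E, O = not (R & M), not (R <= M)  # 'no R is M', 'not all R is M'
        assert (not A) or I               # A entails I (subalternation)
        assert (not E) or O               # E entails O
        assert not (A and E)              # contraries: never both true
        assert I or O                     # subcontraries: never both false
        assert A != O and I != E          # the contradictory diagonals

# contraries may also both be false (and subcontraries both true), e.g.:
R, M = {0, 1}, {0}
assert not (R <= M) and bool(R & M)       # A and E both false; I and O both true
```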


It has not, or at any rate not often, been observed (a) that one may have a Square without duality, but complete with all its logical relations, and (b) that the Square without duality has a wide application outside logic in the lexicon as a whole. Square structures with one or more lexical predicates but without duality are, for example: [A: be murdered], [I: be dead], [E: be alive], [O: not be murdered]; or [A: believe(p)], [I: not disbelieve(p)], [E: disbelieve(p)], [O: not believe(p)]. Why this is both relevant and interesting I hope to show in the next post. … to be continued

Neg-Raising — 1

There is one particular aspect to the question of scope that deserves special attention. This is the question of how to account for the worldwide phenomenon that, with certain verbs, a negation that is interpreted as part of the subordinate clause is found grammatically in the main clause, thereby giving the impression that it takes semantic scope over the whole sentence. I mean cases such as I don’t think she’ll win, with not in the position for negation over the whole sentence with think as main verb, whereas the sentence is normally understood as saying ‘I think she won’t win’, where not has scope only over the embedded complement clause. I will use the term Neg-Raising (NR) to denote this phenomenon, regardless of the fact that this term originated as one of the names (along with Negative Transportation, Negation Absorption and others) of a putative transformation in the framework of early transformational grammar. Traditional prescriptive grammars used to dictate that this way of speaking or writing should be avoided, as it was ‘illogical’. Yet the phenomenon is part of human language and thus needs an explanation.

What I will argue for in this post series is that NR is indeed a rule of syntax. This does not exclude a pragmatic motivation for the widespread use of NR constructions, conceivably based on the wish to speak in a modest manner (I think she won’t win is blunter than I don’t think she’ll win), but the tendency to let negation take the largest possible scope when the message has an overall negative purport is, in my view, a universal and functional trait of language, built into the human language system as it links up with cognition—regardless of the frequent, politeness-induced use of NR in sentences of the type I don’t think that … . In the light of the way linguistic theory has developed over the past half century, I also feel that if the concerted efforts, over the past forty years, by pragmaticists, by Chomskyan grammarians and even by formal semanticists to write NR out of the grammar-cum-semantics script and lodge it in pragmatics had been backed by more careful and more evenhanded arguments, it would have become clear much earlier that they were bound to fail. One cannot help but think that the efforts in question were motivated by the generally prevailing wish to debunk Generative Semantics, the most profound and most abstract of all linguistic theories.

After a few decades of relative rest, the question of how to account for NR phenomena has again come to the fore in a recent lengthy paper by Chris Collins and Paul Postal, ‘Classical NEG Raising’, which has been announced as a Linguistic Inquiry Monograph, to appear in May 2014. There is also a paper by Wim Klooster, ‘Negative Raising revisited’, in Germania et alia. A linguistic webschrift for Hans den Besten, edited by Jan Koster and Henk van Riemsdijk, 2003, available on the internet.

But let us start at the beginning. In early transformational grammar (TG), NR was taken to be a cyclic rule ‘promoting’ the negation standing over an object or subject complement clause to the position reserved for sentence negation, without, however, that becoming the ‘new’ interpretation of the sentence. As such, NR was a prototypical rule of Generative Semantics (Semantic Syntax or SeSyn), although it was already part of the repertory of rules in earlier forms of TG. A load of arguments was adduced to show the reality of NR as a syntactic rule.

An important and still widely quoted argument is found in Robin Lakoff’s 1969 paper ‘A syntactic argument for Negative Transportation’ (Papers of the Fifth Regional Meeting of the Chicago Linguistic Society, 1969, pp. 140–7, reprinted in P. A. M. Seuren ed., Semantic Syntax, OUP, 1974, pp. 175–82). Robin Lakoff observed that a restricted class of negative polarity items (NPIs) can occur in clauses under an NR verb such as think, from which the negation has been raised, but not in clauses under a verb that does not allow for NR, like claim (the NPIs are italicised):

(1)        a.         I don’t think that Jim will arrive until tomorrow.

—         b.         *I don’t claim that Jim will arrive until tomorrow.

(2)        a.         I don’t think Jim is in the least interested.

—         b.         *I don’t claim Jim is in the least interested.

Since in general NPIs like until or in the least require the negation to be in the same clause, the fact that the negation is, exceptionally, in a higher clause in (1a) and (2a) is evidence for NR. However, Robin Lakoff herself cast doubt on this argument by pointing at grammatical sentences like No-one thought that John would leave until tomorrow, which is semantically equivalent to Everyone thought that John would not leave until tomorrow. Her problem was that it was unclear how NR could transform the latter into the former.

I tried to resolve this question in my 1974 paper ‘Negative’s travels’ (in my edited little volume Semantic Syntax, OUP, 1974, pp. 183–208). The solution I proposed there was, for me, a dramatic confirmation of the Semantic Syntax approach to grammar and language. I observed that NR also occurs under a number of ‘universal’ predicates like modal verbs of necessity/obligation, the propositional operator and, the quantifier all, and the predicate cause, often but not always with a lexical switch from the ‘universal’ to its ‘particular’ counterpart, which it entails by subaltern entailment. A clear example without lexical switch is the French sentence (3a), where the negation ne…pas takes grammatical scope over the verb doit (must) but in whose semantic representation the negation takes scope only over the embedded clause, as overtly expressed in the less felicitous (3b):

(3)        a.         Ça ne doit pas être gai. (that mustn’t [be very nice])

—         b.         Ça doit ne pas être gai. (that must [not be very nice])

The position of ne…pas in (3a) shows that it is grammatically in construction with doit, so that in surface structure the negation takes scope over the verb devoir (must), whereas in (3b) it is grammatically in construction with être gai. It is hard to deny that this is a case of NR: the negation has been ‘promoted’ from the subordinate infinitival clause to the main finite clause. (The same is true for English mustn’t in the translation of (3a). Unsophisticated grammarians tend to say that must here takes grammatical scope over not, but this is false, since the suffixed form n’t results from a transposition of a higher not to suffix position, as in can’t or cannot, written as one word, which means ‘not possible’ and not ‘possible not’; see my Semantic Syntax, pp. 111–6.) (3a) is a case of NR without lexical switch. Cases with lexical switch are, for example, (4a–c):

(4)       a.         That can’t be true.    from: [must [not [that be true]]]

—        b.         No-one left.    from: [everybody x [not [x leave]]]

—        c.         He didn’t allow me to go.    from: [he cause [not [I go]]]

The inference that, at least in some cases, the negated particulars can’t, no-one and not allow derive from their nonnegated universal counterparts standing over a lower negation is based on the fact that the nonnegated versions That must not be true, Everybody did not leave and He made me not go are clearly less favoured than their NR-induced versions in (4). Apparently, ‘universal’ operators that have an entailed ‘particular’ counterpart have a tendency to induce NR.

This is also the solution I proposed in my 1974 article to Robin Lakoff’s problem with sentences like No-one thought that John would leave until tomorrow. Its derivation is now clear. From an underlying (5a) cyclic NR first gives (5b); from there, by repeated cyclic NR, (5c); from there, again by cyclic NR but with lexical switch from everyone to someone, we get (5d), and, finally, through the cyclic and postcyclic rules of English grammar, (5e):

(5)        a.         [everyone x [x think [until tomorrow [not [John leave]]]]]

—         b.         [everyone x [x think [not [until tomorrow [John leave]]]]]

—         c.         [everyone x [not [x think [until tomorrow [John leave]]]]]

—         d.         [not [someone x [x think [until tomorrow [John leave]]]]]

—         e.         No-one thought that John would leave until tomorrow.

The negation is thus seen to ‘travel’ upwards from being the lowest to being the highest operator in the SA of sentence (5e). This solution is only viable, of course, if prepositions, quantifiers and other operators are treated as semantic predicates, as proposed by McCawley in the late 1960s. Yet the problem with this solution, neat though it may be, is that for this to take place the normal cyclic procedures of the grammar must be suspended till NR has done its work cyclically through the whole structure. In particular, the negation not must be kept from being lowered into its argument-S as long as this Neg-Raising process is going on. This is one of the reasons why I later (in my Semantic Syntax of 1996, p. 114, note 4) proposed that grammars contain a precycle, reserved for certain ‘privileged’ cyclic rules that receive VIP treatment in that they are allowed to precede the normal syntactic cycle. Their function is to fashion SAs into a shape fit for further syntactic treatment. NR is then best taken to be a ‘VIP’ cyclic rule (possibly along with one of the forms of Conjunction Reduction) in that sense.

The treatment sketched above implies that until likewise induces NR, so that semantic until–not becomes syntactic not–until. And analogously for even: not even means ‘even not’ (Laurence R. Horn, A Natural History of Negation, Chicago University Press, 1989, p. 151). Both not until and not even can be preposed, but only with AUX-Inversion, as in:

(6)        a.         Not until five did he leave.

—          b.         Not even then did he leave.

This shows that AUX-Inversion is sensitive to syntactic negativity of the preposed adverbial, whereby the fact that in (6a,b) the negation semantically does not apply to until or even but to the main verb leave no longer counts. Moreover, one will note that English not even has counterparts in other languages with the two words inverted: même pas in French, sogar nicht in German, zelfs niet in Dutch, which means that in the other languages the counterpart of English even does not induce NR. (No amount of pragmatic reasoning will be able to account for such phenomena.)

A further corollary of this analysis is that the negation that is up for Raising must always be the syntactically highest operator of the embedded S in the precycle. It may also be semantically the highest, as in (7a), which can have the meaning of (7b) but not of (7c):

(7)       a.         I don’t suppose many members agree.

—         b.         ≈ I suppose not many members agree.

—         c.         ≉ I suppose many members do not agree.

This explains why (8a) lacks an NR reading and must mean literally ‘it is not the case that I think that she may make it on time’, since in She may not make it on time the highest operator is may, with not below may. By contrast, (8b) can be read as ‘I think that it is not possible that she makes it on time’, since can’t in She can’t make it on time represents ‘not possible’, with ‘not’ taking scope over ‘possible’:

(8)        a.         I don’t think she may make it on time.

—          b.         I don’t think she can make it on time.

A wealth of further evidence for NR, taken from a variety of languages, is provided in Larry Horn’s book-length article ‘Remarks on Neg-Raising’ in Pragmatics (= Syntax and Semantics, vol. 9), 1978, edited by Peter Cole, pp. 129–220. Ironically, however, Horn shows a preference for dismissing all this carefully gathered evidence in favour of NR and argues visibly towards a pragmatic explanation (about which more in the next post), only to conclude at the end that NR “lies at the heart of the intersection of syntax, semantics, and communicative intent” (p. 214).

I will, in the end, subscribe to this conclusion (though with a few qualifications), but without sharing Horn’s aversion to a syntactic, and his preference for a pragmatic, account. But before I come to that, I want to focus on Horn’s argument, presented on pp. 174–5 of his 1978 article, against my explanation of the grammaticality of (9c) below, with the plural verb form were, presented in my ‘Negative’s travels’ of 1974, pp. 204–5. My explanation rests on the observation that (9c) is grammatical and can only be interpreted as (9a). In my analysis, (9a) is first reduced by NR (lifting a unified ‘not’ over ‘and’, which is replaced with ‘or’, i.e. one half of De Morgan’s Laws) to (9b) and then, after repeated NR, to what ultimately becomes (9c). The crucial point is the plural were in (9c), which is remarkable since (9d) is ungrammatical. This plural is taken to derive from the semantic plurality inherent in (9a), where the predicate be late is assigned to two subject referents, Harry and Fred: we say Harry and Fred were late but Harry or Fred was late. This plural assignment, however, must be optional, since (9e) can also have the NR reading of (9a), besides the non-NR reading of (9f), where the negation stands over believe:

(9)       a.         I believe [and [[not [Harry be late]], [not [Fred be late]]]]

—         b.         ==> I believe [not [or [[either Harry be late], [Fred be late]]]]

—         c.         ==> I don’t believe that either Harry or Fred were late.

—        d.         *Either Harry or Fred were late.

—         e.         I don’t believe that either Harry or Fred was late.

—         f.          not [I believe [or [[Harry be late], [Fred be late]]]]
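The De Morgan step that takes (9a) to (9b) can be checked by exhausting the four truth-value combinations; the few lines below are my own trivial verification.

```python
from itertools import product

# Exhaustive check of the De Morgan equivalence used in the (9a) -> (9b) step:
# 'and(not p, not q)' is equivalent with 'not(or(p, q))'.
for p, q in product([False, True], repeat=2):
    assert ((not p) and (not q)) == (not (p or q))
```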

I’m afraid Horn misrepresented my argument, saying that, according to me, (9c) in the sense of (9a) “must take plural verb agreement” (Horn p. 174), whereas all I say is “Any syntax of English will have to explain why [9c] is grammatical although [9d] is not” (my p. 204). He then rejected my argument on the grounds that “I, for one, can interpret [9e] as corresponding to either [9a] or [9f], although [9c] does favor the former reading” (Horn p. 175). But that is precisely what I said in my article! Therefore, Horn’s conclusion (p. 175) that the data of (9c,d,e) “do not refute Seuren’s claim, based on the more restrictive dialect, […] they merely muddy the water” is not only ungenerous, it is also baseless. What “muddied the water” was Horn’s misrepresentation of my analysis and his not so relevant invocation of ‘dialectal’ differences. But perhaps we should put this down to the prevailing atmosphere of the day, where debunking Generative Semantics was the thing to do. … to be continued

Scope — 4

In the previous post I defined scope as the matrix-line sentential argument(s) of a predicate at SA-level. This is a simple and straightforward definition, but it requires a Semantic Syntax (SeSyn), i.e. erstwhile Generative Semantics, theory of grammar. This definition is not only formal, in the sense that it is cast in terms of SA-structures, but also semantic, in that it defines the semantic function of scope in a general way. The semantics of scope is defined as follows:

If a predicate in general denotes a property C of any kind assignable to the referent(s) of the n-tuple of its argument terms in a proposition, a scope-bearing predicate, when used in a proposition, assigns C to whatever is denoted by (the n-tuple of) its matrix-line sentential argument term(s).

In SeSyn, the classical quantifiers are defined for two terms, the matrix term (i.e. their scope) in subject position and the restrictor term in object position, as shown in Figure 1-b of post Scope—2. Both terms denote sets (‘the set of x such that…’) and the property assigned by the universal quantifier to this pair of terms is, in general terms: “the set denoted by the restrictor term is included in the set denoted by the matrix term”. Likewise for the existential quantifier: “the set denoted by the restrictor term and the set denoted by the matrix term have a nonnull intersection”. One can, of course, modify these semantic definitions, for example, by adding to the universal quantifier the condition that the R-set be nonnull, or by adding to the existential quantifier the condition that the intersection is partial for both sets. Such modifications may deliver, under conditions of consistency, logical systems that differ from standard modern predicate logic and may be closer to natural intuitions. This, however, is not immediately relevant at this point (though it has been the main focus of my research for the past fifteen years).
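These set-theoretic definitions, and the kind of modification just mentioned, can be written out directly. The sketch below is mine (the function names and the sample sets are arbitrary assumptions): inclusion for the universal quantifier, nonnull intersection for the existential, plus one modified universal with the added nonnull-R condition.

```python
# SeSyn-style quantifier meanings as relations between the R-set (restrictor)
# and the M-set (matrix): inclusion for 'all', nonnull intersection for 'some'.
def all_q(R, M):  return R <= M
def some_q(R, M): return bool(R & M)

# one possible modification: a universal quantifier with the added condition
# that the R-set be nonnull
def all_ei(R, M): return bool(R) and R <= M

students, prize_candidates = {"a", "b"}, {"a", "b", "c"}
assert all_q(students, prize_candidates) and some_q(students, prize_candidates)
assert all_q(set(), prize_candidates)       # standard logic: vacuously true
assert not all_ei(set(), prize_candidates)  # with the nonnull-R condition: false
```

The modified `all_ei` is one way of getting the subaltern entailment from A to I back, which standard modern predicate logic gives up for empty restrictor sets.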

If the matrix term (M-term) under a scope-bearing operator O contains a further scope-bearing operator Q, then Q is in the scope of O. But if the restrictor term (R-term) under O contains a further operator Q, as in All students who have published some poetry will be considered for the prize, then Q is not in the scope of O. This is shown by the fact that the Conversions are not valid when the internal negation stands over the R-term. The Conversions say that All students did not read this book is equivalent with No student(s) read this book, and Some student(s) did not read this book is equivalent with Not all students read this book. That is, ‘all not’ is equivalent with ‘not some’ (= ‘no’), and ‘some not’ with ‘not all’. But these equivalences do not hold when the internal negation stands over the R-term: All of those who were not students failed is not equivalent with None of those who were students failed, and Some of those who were not students failed is not equivalent with Not all of those who were students failed. The reason is that for the Conversions to hold the internal negation must be in the scope of the higher quantifier, which, according to my definition, it isn’t when the internal negation stands over the R-term.
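This contrast can itself be verified by brute force. The sketch below is my own illustration (the three-element domain is an arbitrary assumption): over every model, the Conversions hold when the internal negation stands over the M-term, but fail in at least one model when it stands over the R-term.

```python
from itertools import product

# Check, over every pair of predicate extensions on a small domain, that the
# Conversions are valid with the negation over the M-term but not with the
# negation over the R-term.
DOMAIN = (0, 1, 2)

def all_q(r, m):  return all(m(x) for x in DOMAIN if r(x))
def some_q(r, m): return any(m(x) for x in DOMAIN if r(x))

m_term_ok, r_term_ok = True, True
for rext, mext in product(product([False, True], repeat=3), repeat=2):
    r = lambda x, e=rext: e[x]
    m = lambda x, e=mext: e[x]
    # 'all not' = 'no' and 'some not' = 'not all' (negation over the M-term)
    m_term_ok &= all_q(r, lambda x: not m(x)) == (not some_q(r, m))
    m_term_ok &= some_q(r, lambda x: not m(x)) == (not all_q(r, m))
    # negation over the R-term: 'all non-R is M' vs 'no R is M'
    r_term_ok &= all_q(lambda x: not r(x), m) == (not some_q(r, m))

assert m_term_ok        # the Conversions hold in every model
assert not r_term_ok    # but not with the negation over the R-term
```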

With this, I provisionally consider the notion of scope well-defined, ‘provisionally’ because, of course, many questions remain. Let us now turn to question (b) of post Scope—2: how to account for the fact that some scope readings are legitimate for some surface structures and others are not. This question has dominated the entire scope literature since its inception during the late 1960s. Yet, despite the considerable amount of research devoted to this question, no convincing answer has so far been provided. Perhaps this is not so surprising, not only because the notion of scope has never been properly understood, but also because, of the two poles of the relation, the logico-semantic structure has at best been specified only in the approximate terms of what is called ‘the language of modern logic’, which leaves it open which variety of that ‘language’ suits our purpose best. Also, opinions differ vastly regarding the nature of the grammatical machinery into which the relation between logico-semantic scope and its surface manifestations is to be integrated.

Even so, a variety of answers have been tried out. One answer, current among cognitivists to the extent that they are concerned with scope questions at all, is that interpretive scope assignments are a question of ‘information structure’, that is, the packaging of information in a wider context into a sentence with some grammatical form—a process taken to be restricted to surface structure operations (see, for example, Adele E. Goldberg, Constructions: A Construction Grammar Approach to Argument Structure, Chicago University Press, Chicago, 2006, p. 161). This answer is surprising in that it implies that differences in information structure correspond with truth-conditional scope differences and vice versa, which strongly contrasts with their explicit or implicit assumption, which they share with the pragmaticists, that information structure is merely a matter of ‘packaging’ given truth-conditional content for the purpose of context-bound interpretation and thus has no truth-conditional implications (I hope to show in a later blog that this is false). In general, cognitivists seem to be blissfully unaware of the importance of truth conditions in semantics and of ways of representing scope. Moreover, they fail to specify in any precise way how this machinery is supposed to work. All we get is vague allusions. I will, therefore, pay no further attention to cognitivism in this discussion.

Chomskyans want scope differences to be reducible to structural conditions and variations in surface structure. They have in principle two strategies to reach a solution, the Quantifier Raising (QR) and the surface C-command strategy, possibly combined into one umbrella strategy. The former, Quantifier Raising, is a bottom-up machinery that isolates quantified terms and negation from surface structures and gives them a place in a logical scope hierarchy. It goes back to Robert May’s 1977 MIT PhD-thesis The Grammar of Quantification (which provoked some hilarity as it proposed the converse of what had been current theory in Generative Semantics, namely top-down Quantifier Lowering, whereby quantifiers are lowered from SA-structures into surface structures, without properly analysing the differences between the two methods). Globally speaking, the surface C-command strategy implies that scope is subject to C-command in surface structure. For those who are not familiar with the term C-command: in a tree structure, a node A C-commands a node B iff the first node up from A dominates B. Thus, if A and B are sister nodes, they C-command each other. In these terms it is said that an operator Q takes scope over an operator R iff Q C-commands R in surface structure. If Q and R C-command each other, there is scope ambiguity.
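The C-command definition just given can be encoded in a few lines. The toy tree, its labels, and the encoding below are my own assumptions (and I add the standard exclusion of nodes that A itself dominates, which the informal definition leaves tacit):

```python
# Toy encoding of C-command: trees are nested tuples (label, child, ...),
# leaves are strings. A C-commands B iff the first node up from A dominates B
# (excluding anything A itself dominates, as standardly assumed).
tree = ("S", ("NP", "nobody"), ("VP", ("V", "saw"), ("NP", "anything")))

def dominates(t, node):
    if t is node:
        return True
    return isinstance(t, tuple) and any(dominates(c, node) for c in t[1:])

def parent_of(t, node, parent=None):
    if t is node:
        return parent
    if isinstance(t, tuple):
        for child in t[1:]:
            found = parent_of(child, node, t)
            if found is not None:
                return found
    return None

def c_commands(t, a, b):
    p = parent_of(t, a)                  # the first node up from a
    return p is not None and not dominates(a, b) and dominates(p, b)

subj = tree[1]             # the subject NP, sister of VP
obj = tree[2][2]           # the object NP inside VP
assert c_commands(tree, subj, obj)         # subject C-commands into VP
assert not c_commands(tree, obj, subj)     # the object does not C-command out
assert c_commands(tree, tree[1], tree[2])  # sisters C-command each other
assert c_commands(tree, tree[2], tree[1])
```

On this encoding, scope ambiguity under the surface C-command strategy would correspond to `c_commands` holding in both directions, as it does for the two sister nodes.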

I won’t discuss in detail the fairly large literature on bottom-up scope determination, with surface structures as input, as every proposal made so far either overgenerates in that it assigns scope readings that are in fact impossible, or undergenerates in that it fails to allow for scope readings that native speakers assent to, or both. Moreover, certain surface phrases are identified as quantifier phrases whereas they may well not be. I will give some examples in a moment.

Generative Semantics (Semantic Syntax) tends to take the obvious correspondence between operator scope and left-to-right order in surface sentences as the default point of departure, proposing a system of Operator Lowering from SAs to surface structures subject to the Scope Ordering Constraint (SOC), which says that a higher operator, when lowered, has to land in a position to the left of the operator lowered earlier. I myself have proposed and defended a system of that nature in many publications over the years. However, such a system still needs so many hedges and extra provisions that, for the moment at least, I can’t see my way through to a rule system that is both empirically adequate and sufficiently compact and transparent to carry conviction. Yet it must be observed that prima facie a strategy of operator lowering is the only option in any overall theory, such as SeSyn, that sees a grammar as a top-down machinery converting meaning representations into surface structures, reflecting the intuition that sentences are type-level schemata for the expression of cognitive propositional content. In such a theory it would be anomalous to treat the clearly truth-conditional phenomena of operator scope in a bottom-up fashion.

SeSyn is the only theory that advocates a top-down rule system. All other approaches, whether emanating from the Chomskyan camp or from any of the logic-oriented approaches to grammar and semantics, such as Montague Grammar (MG) or Categorial Grammar (CG), attempt to specify scope in a bottom-up fashion, from surface to SA. Logic-oriented approaches, despite the impressive display of formal and mathematical prowess, allow in principle for any scope assignment, which is empirically untenable. Like all other theories, they moreover fail to distinguish between dominant and recessive scope readings. What I mean by the latter is this. A sentence like Nobody here speaks two languages is normally (dominantly) interpreted as saying that nobody here is bilingual: ‘not – there is a person x here such that – there are two languages y such that – x speaks y’. In this interpretation SOC is fully observed. However, it is also possible, with rising intonation on two languages, to assign a recessive reading to the sentence in which it says that there are two languages that nobody here speaks, assigning wide scope to two languages, thereby violating SOC. Often, however, as in some of the examples given below, there simply is no recessive reading, which makes the logic-oriented theories empirically inadequate.
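That the two scope orders are genuinely truth-conditionally distinct can be made concrete in a small model; the people, languages, and speaks-facts below are my own assumptions, chosen so that the two readings come apart.

```python
from itertools import combinations

# A small model in which the dominant and recessive readings of
# 'Nobody here speaks two languages' have different truth values.
people = {"ann", "bob"}
langs = {"en", "fr", "de"}
speaks = {("ann", "en"), ("bob", "fr")}   # everyone monolingual; 'de' unspoken

# dominant (SOC-observing): not - some person x - two languages y - x speaks y
dominant = not any(
    any(all((p, l) in speaks for l in pair) for pair in combinations(langs, 2))
    for p in people)

# recessive: two languages y - not - some person x - x speaks y
recessive = any(
    all(not any((p, l) in speaks for p in people) for l in pair)
    for pair in combinations(langs, 2))

assert dominant          # true here: nobody is bilingual
assert not recessive     # false here: only one language ('de') goes unspoken
```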

I can only give a very partial illustration of the problems encountered in the various approaches. Consider a sentence like All books of some students were stolen. This has a dominant reading with some students as the highest operator: ‘there are some students all of whose books were stolen’. The recessive reading ‘all books belonging to some student or other were stolen’ is hard to get. Yet all linguistically oriented bottom-up theories exclude the dominant reading on the grounds that it violates an island constraint. Since it also violates SOC, SeSyn is equally at a loss. Another, hackneyed, example is taken from Ed Klima’s 1964 study on negation: Many smokers don’t chew gum is unambiguous and differs by scope inversion from the equally unambiguous Not many smokers chew gum. Logic-oriented and C-command theories fail in that they generate both readings at least for the second sentence. Only SOC provides an answer here. Likewise for the German Ich habe kein Wort verstanden (‘I didn’t understand a word’) versus Ich habe ein Wort nicht verstanden (‘there was one word I didn’t understand’), both entirely unambiguous, or the unambiguous English sentence For many reasons I did not go out versus the ambiguous I didn’t go out(,) for many reasons (the reading without the comma stands out when a phrase like but for one reason only is added). Sometimes a reading can indeed be attributed to something like ‘information structure’, especially when it takes the form of topic-comment structure (TCS). But in many such cases it is doubtful that the comment is indeed a quantifier. Take well-known examples like A rose adorned every table or A ballerina escorted every officer. These look as if they assign higher scope to every and lower scope to the indefinite article, taken to represent the existential quantifier. But does the indefinite article represent the existential quantifier in such cases? Couldn’t it represent so-called ‘isa’ predication of the type John is a soldier? 
The underlying structure would then be something like ‘the x such that x adorned every table was a rose’. The predicate a rose is then lowered to the position of the variable in the matrix clause, giving A rose adorned every table, with predicate accent on a rose. This would then not be Operator Lowering but Comment Lowering. Montague treated a soldier as in John is a soldier as an existential quantifier, but he was immediately corrected by Barbara Partee, who considered this to be nonsensical. I think Partee is right, but then, why does the indefinite article systematically occur with the ‘isa’ sort of predication in so many different languages? We don’t have an answer. Given all this, I fear the upshot is that we have so far not been able to resolve the question of what determines scope readings in surface structures in any satisfactory way. I have the feeling that we still need a whole lot of innovative research before we will be in a position to tackle this problem.

Scope — 3

Let’s recapitulate. I began by stating that the notion of scope, though universally acknowledged and used, has so far remained undefined, whether in logic, where it originated, or in semantics. All one finds is the statement, or implication, that scope is a property of logical operators (constants). I then decided to follow custom and rely on some intuitive understanding of the notion based on examples. I showed that scope phenomena extend way beyond the logical elements in sentences and that nonlogical lexical elements, such as high-peripheral adverbials or complement-taking verbs such as cause or allow, interact with each other and with logical operators in exactly the way logical operators do among themselves, and I concluded that these nonlogical elements have scope the same way as the logical operators do. I argued that at least logical scope differences are best represented as hierarchical differences in the tree structures of some logico-semantic language that may be taken to represent meanings, the language of semantic analysis (SA), and I concluded that the nonlogical scope-bearing elements in sentences should be represented the same way, thus blurring the distinction between purely logical representations and semantic representations in a more general sense. This gave rise to the idea, based on McCawley’s work, that it pays to treat all lexical items with semantic content as predicates at SA-level, letting the grammar define their surface category. We have seen that in many languages what are causative predicates at SA-level regularly manifest themselves as morphological causal affixes in surface verbs (in English often as zero morphemes, as in verbs like drop, close, burn, explode), but that nevertheless some scope-bearing adverbials can take the complement of SA causative predicates as their scope, even when the surface causative verb has been fully lexicalized, provided the scope-bearing adverbial is in right-peripheral (clause-final) position.
All this has given rise to a whole lot of questions, some of which I will discuss now.

I want to focus first on the question of how to define the notion of scope, because I think we are a bit closer to an answer now. Given the assumption that all semantically significant surface categories are predicates at SA-level, we make a distinction, at SA-level, between abstract predicates and matrix predicates (what I call “matrix” here was called “nucleus” in my 1969 PhD-thesis Operators and Nucleus). Matrix predicates occur as the main predicate (verb or adjective) in surface sentences or clauses. Abstract predicates are incorporated by the grammar into the main clause or sentence in a bottom-up cyclic way (this is what I usually call “matrix greed”). They end up in surface structure as supplementary elements under some surface categorial label, often of an adverbial kind. Logical operators are SA-level abstract predicates. Most surface sentences have incorporated one or more abstract predicates, which are organized at SA-level in a hierarchical scope structure. Thus, in a sentence like Last year, most tourists preferred Italy, the surface elements last year, most, and the past tense suffix -ed represent abstract SA-predicates, whereas prefer is the matrix predicate of the tenseless matrix structure prefer(x, Italy). At SA-level, every abstract predicate in an SA tree structure commands at least one matrix line ending in the bare matrix-S. Matrix predicates that take a complement clause in subject or object position start a new matrix line for the embedded clause. In Figure 3, the matrix line of the sentence All protestors studied law in Paris is indicated by lines in bold print.


It is interesting to see in broad outline how the grammar of English transforms the SA-structure in Figure 3 into the surface structure All protestors studied law in Paris shown in Figure 4 in a fully automatic, algorithmic way. This is a bit technical, but if you take the trouble to follow me you will see that the machinery works beautifully (technical details and definitions are given in my Semantic Syntax, Blackwell, Oxford, 1996). My former assistant Henk Schotel turned the apparatus as applied to English into a computer program.

The grammar starts at the lowest S or NP/S structure, in this case the matrix structure NP/S4, and subsequently goes through all higher S or NP/S structures in a cyclic way. This has been known since the mid-1960s as the transformational cycle. The predicate of each S or NP/S structure is lexically defined for the rule or rules it induces. It assumes its surface category as the cyclic mechanism passes through its S or NP/S. On the NP/S4-cycle, no rule is activated because the predicate study induces no cyclic rule. Yet since the cyclic mechanism passes through its NP/S, study is relabelled Verb (V). The predicate past of the NP/S3-cycle induces two rules, first Subject Raising (SR), then Lowering (L). SR raises the subject term of the lower S or NP/S, in this case NP[x] in NP/S4, to the subject position of the higher S or NP/S. As a result, what remains of the matrix structure is automatically relabelled Verb Phrase (VP) and moved to the right. VPs are thus S-structures bereft of their subject term. Next, still on the NP/S3-cycle, Pred[past] is incorporated into the matrix-VP, where it ‘lands’ on Verb[study], as Affix[past], forming one cluster with Verb[study]. The node NP/S3 now dominates NP[x] followed by VP[Verb[Affix[past] – Verb[study]] – NP[law]]. Then, Pred[in Paris] of NP/S1 is lowered to the peripheral position at the far right of NP/S3 and relabelled Preposition Phrase (PP), giving the matrix structure

NP/S[NP[x] – VP[V[V[study]+Affix[past]] – NP[law]] – PP[Prep[in] N[Paris]]]

which stands semantically for ‘the set of x such that x studied law in Paris’, or, more simply, ‘the set of those who studied law in Paris’. Node NP/S1 has become vacuous and is deleted. Then, Pred[all_x] of S0 incorporates the object term NP/S[protestor x] according to the common cyclic rule of OBJECT INCORPORATION and thus gives the complex predicate [Pred[all]+NP/S[protestor+x]]. This complex predicate is then lowered onto the variable x in the matrix structure, giving the following string in terms of labelled bracketing, corresponding to the tree structure in Figure 4:

S[NP[Det[all] N[protestor]] – VP[V[V[study]+Affix[past]] – NP[law]] – PP[Prep[in] N[Paris]]]
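For readers who like to see the algorithmic claim cashed out, here is a deliberately crude sketch of the cyclic derivation just traced. It hard-codes the rules for this one example; the function and record names are mine, and it is of course not the formal apparatus of Semantic Syntax, nor Schotel's program:

```python
# Toy rendering of the cyclic derivation of 'All protestors studied law
# in Paris'.  Each cycle reads off the predicate of its S or NP/S domain
# and updates a simple record of the growing matrix structure.

def cycle_np_s4(m):
    # study induces no cyclic rule; it is merely relabelled V
    m["verb"], m["object"] = "study", "law"
    return m

def cycle_np_s3(m):
    # past induces Subject Raising (here it raises the variable x) and is
    # then lowered onto the verb as a tense affix
    m["subject"] = "x"
    m["verb"] += "+PAST"
    return m

def cycle_np_s1(m):
    # the abstract predicate 'in Paris' is lowered to the far-right
    # peripheral position and relabelled PP
    m["pp"] = "in Paris"
    return m

def cycle_s0(m):
    # all incorporates its restrictor NP/S[protestor x] (Object
    # Incorporation); the complex Det is lowered onto the variable x
    m["subject"] = "all protestors"
    return m

def morphology(verb):
    # minimal postcyclic spell-out of the V+Affix cluster
    return "studied" if verb == "study+PAST" else verb

def derive():
    m = {}
    for cycle in (cycle_np_s4, cycle_np_s3, cycle_np_s1, cycle_s0):
        m = cycle(m)
    sentence = " ".join([m["subject"], morphology(m["verb"]),
                         m["object"], m["pp"]])
    return sentence[0].upper() + sentence[1:]

print(derive())   # All protestors studied law in Paris
```

The point of the sketch is only the control flow: the cycles run bottom-up, each predicate triggers its lexically specified rules, and the surface string falls out automatically.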


After this cyclic machinery, each language has a set of postcyclic rules to prepare the structure for the morphology, if necessary, and for possible further cosmetics. Again, for technical details and ample illustration taken from a variety of languages, see my Semantic Syntax (Blackwell, Oxford, 1996). I think that if a metric were to be applied for the ratio between data coverage and generalizations, this system would not score too badly.

But back to the notion of scope. Some abstract predicates, in particular the binary logical connectives and and or, take two parallel matrix lines, which means that such binary connectives have a twin scope. Most languages lower such twin scope predicates to the position between the two matrix constituents. Twin scope also occurs with those matrix predicates that can take an S-structure in both subject and object position. Examples are suggest, mean, prove, entail, etc., as in That the butler had blood on his hands proved that he was the murderer. Here, the lexical matrix verb prove occupies the normal position of the finite verb in English sentences. Curiously, all lexical matrix verbs that take both a sentential subject and a sentential object term are subject to the rule that the sentential subject term is factive in the sense that the whole sentence presupposes that what is said in the subject-S is, in fact, true. Why this should be so is unknown.

Given all this, we can now tentatively define the notion of scope in a simple and general way as follows:

– The scope of an SA-predicate P is the matrix-line sentential argument(s) of P.

Whether this definition will stand up to future scrutiny is a question I cannot answer now, but, as far as I can see, it does stand up to all past scrutiny. An important corollary is that the logical operators are treated as abstract predicates in the semantics of natural language, just as was proposed by McCawley during the late 1960s. The logical operators are thus lexical items like the others and belong in the lexicon of a language. They distinguish themselves from nonlogical predicates in that (a) they are, probably in all languages, abstract predicates and thus do not occur as matrix verbs, and (b) their (core) meanings are definable in the mathematical terms of the human cognitive version of set theory. This is, in fact, the perspective in which I have developed my theory of the natural logic of language and cognition over the past 15 years.

Another corollary is that definite NP terms do not have scope. This has to be said because, in the wake of Russell’s (1905) analysis of definite descriptions as existentially quantified terms, it has become common to treat definite determiners such as English the, that, etc. as quantifiers. I will not expand too much on this here, but I consider that to be a fatal distortion of the reality of language. For one thing, definite determiners do not participate in the game of scope differences: there is no semantic difference between, for example, English I don’t know that man and German Ich kenne diesen Mann nicht, despite the fact that in English the negation precedes but in German it follows the definite object term. By contrast, English I don’t know all men corresponds to German Ich kenne nicht alle Männer, with the negation not in final position but preceding the universally quantified object term, as in English. But German also has Ich kenne alle Männer nicht, meaning ‘for all men x it is not so that I know x’, which is impossible in English: *I know all men not is ungrammatical in modern English, which, but for a few well-defined exceptions, wants the negation to be constructed with the finite verb form. Crucially, those who want to treat definite determiners as quantifiers have not provided a semantic definition for them in terms of a binary higher-order predicate, as we have done for the standard quantifiers. … to be continued

Scope — 2

Sentence (6a), at the end of the previous posting [For two years the sheriff jailed Robin Hood], strikes one as a bit comical, as it evokes an image of the sheriff putting Robin Hood in and out of jail repeatedly for two years. This oddness is of a pragmatic nature, as it is due to prevailing circumstances in life as we live it. It disappears when we change the lexical items, as in (7a) (McCawley discussed these examples in his 1971 paper “Prelexical Syntax”, republished in his book Grammar and Meaning, Taishukan Publishing Cy, Tokyo, 1974, pp. 348-9):

(7)       a.       For two months Harry lent me his bike. (unambiguous)

—       b.       Harry lent me his bike for two months. (ambiguous)

It is perfectly normal to lend someone one’s bike, say, every Tuesday and to do so for two months, as expressed in (8a):

(8)       a.      For two months Harry lent me his bike every Tuesday. (unamb.)

—       b.       Every Tuesday Harry lent me his bike for two months. (absurd)

Remarkably, if every Tuesday is preposed, as in (8b), eyebrows are raised, since a period of two months is far longer than one week, so how can Harry allow me every Tuesday to use his bike for two months?! (8b) can only be saved by giving a sentence-final intonation to lent me his bike and starting a new sentence intonation at for two months. In writing, such an intonational break is best indicated by a comma or perhaps an em-dash.

Scope phenomena as illustrated here give rise to at least two questions: (a) how do we represent the different scope relations, and (b) how do we account for the fact that some scope readings are legitimate for some surface structures and others are not?

As regards the first question, I know of no better way of representing scope relations than in terms of hierarchical tree structures or, equivalently, bracketing structures as used in logic, preferably with labelled brackets. (The bracketing in logic is never labelled, but we have learned from linguistics that bracket labelling—that is, node labelling—is a useful additional device.) In linguistics, McCawley was the one who drew attention to the formal identity of these two methods of representation and subsequently proposed that meaning representations (I speak of semantic analyses or SAs) are best cast in the terms of a language closely akin to that of modern predicate logic. This proposal has gained universal acceptance in all varieties of formal linguistics. (In formal semantics it didn’t have to be accepted, as FS is itself directly derived from modern predicate logic.) The general format of logical structures would then be that of operators functioning as (abstract) predicates taking a sentential (propositional) term as their scope, as shown in Figure 1.

When he wrote on prelexical syntax, McCawley did not know yet of generalized quantifiers, but it has meanwhile become clear that quantifiers are best treated as higher-order binary predicates, that is, as predicates over pairs of sets: the universal quantifier expresses an inclusion relation and the existential quantifier expresses a relation of nonnull intersection of two sets. The sentence Not all flags are green is then semantically analysed as ‘it is not the case that the set of green things includes the set of flags’ and represented as the tree structure in Figure 1a (ignoring tense and other nonessential elements). And the sentence All boys admire some footballer(s) is analysed as ‘the set of those who admire some footballer(s) includes the set of boys’, or more explicitly ‘the set of x such that there is a nonnull intersection between the set of footballers and the set of those admired by x includes the set of boys’. The corresponding tree structure is given in Figure 1b. (The label “NP/S” stands for constituents that have the syntactic form of an S-node but the semantic status of an NP-argument standing for a set of objects. Thus, the propositional function NP/S[flag,x] has the semantic status of an NP standing for ‘the set of flags’ but the structure of an S as it consists of a predicate and an NP-argument.)
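The set-theoretic construal of the two quantifiers can be checked mechanically. The sketch below is my own illustration of the analysis in the preceding paragraph; the toy model (flags, boys, footballers, an admires relation) is invented:

```python
# ALL as set inclusion, SOME as nonnull intersection, applied to the two
# example sentences.  The model is invented for illustration.
def ALL(restrictor, matrix_set):     # matrix set includes restrictor set
    return restrictor <= matrix_set

def SOME(restrictor, matrix_set):    # nonnull intersection of two sets
    return bool(restrictor & matrix_set)

# 'Not all flags are green'
flags = {"f1", "f2", "f3"}
green = {"f1", "f2", "leaf"}
print(not ALL(flags, green))         # True: f3 is a flag that is not green

# 'All boys admire some footballer(s)'
boys        = {"Tom", "Bob"}
footballers = {"Messi", "Ronaldo"}
admires     = {("Tom", "Messi"), ("Bob", "Ronaldo")}

# the set of x such that there is a nonnull intersection between the set
# of footballers and the set of those admired by x
admirers = {x for x in boys | footballers
            if SOME(footballers, {y for (a, y) in admires if a == x})}
print(ALL(boys, admirers))           # True: that set includes the set of boys
```

Note how the scope hierarchy of Figure 1b shows up as nesting: the SOME condition is evaluated inside the set comprehension that feeds the outer ALL.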


It will be clear that (the dominant reading of) the sentence Some footballers are admired by all boys requires an SA where all and some have swapped scope compared with Figure 1b. (Try drawing the corresponding SA tree structure: it’s a useful exercise.)

Of course, the above is only a rough outline and quite a few questions are still to be answered, but the overall idea will be clear. That being so, the question arises of how to represent the scope relations in the pragmatically normal reading of (6b) of the previous post [The sheriff jailed Robin Hood for two years] or in (7b) above in the reading in which Harry gave me a permission valid for two months to use his bike. On the face of it, there is only one scope-bearing operator in these sentences: for two years in (6b) and for two months in (7b). For the corresponding (a)-sentences there is no problem, as the operator has highest scope over the rest of the sentence. But how do we render the meaning of the (b)-sentences that is not shared with the corresponding (a)-sentences in such a way that the scope difference is expressed in the systematic terms of scope-bearing operators? For this we need at least two operators in each sentence. So, where is the other one? This difficult question will force us to tread on ground that has so far remained largely unexplored.

An obvious way to detect a second scope-bearing operator in the sentences in question is by breaking up the predicates jail and lend into component parts. Jail would then be analysed into ‘cause to be in jail’ and lend into ‘allow to use’, as proposed in McCawley’s theory of prelexical syntax. Let us provisionally pursue this line of thought for a moment, leaving open the possibility that there are alternative ways of accounting for the relevant facts. The theory doesn’t say that jail is semantically equivalent with ‘cause to be in jail’ or lend with ‘allow to use’. All it says is that the lexical verb jail entails ‘cause to be in jail’, but not vice versa, and analogously for the lexical verb lend and ‘allow to use’. Let us, for the moment, assume that these analytical paraphrases make up what we may call the core meaning of these predicates. Once a semantico-syntactic complex has been unified into a single lexical predicate, that predicate is likely to assume further semantic refinements. A clear example is the English verb fell, which is historically the causative of fall (‘cause to fall’, with Germanic causative umlaut on the vowel), but which is now typically restricted to the felling of trees and exists along with other specialized verbs whose meaning can be described as implying ‘cause to fall’, such as trip, push over, knock over, etc. (which does not mean that these verbs should also be considered to have ‘cause to fall’ as their core meaning).

In the same way, jail and lend, being lexical predicates, may be seen as having additional idiosyncratic semantic elements associated with the items themselves and not expressed in their core meanings. Thus, jail implies that the referent of the subject term has the power or authority to cause the object referent to be in jail and that it is through this power or authority that the jailing takes place. And lend implies additionally that the lender is the owner or possessor of the object lent, not, say, the guardian, even though a guardian may well allow one to use some object in his custody. Clearly, such distinctions and conditions will have to be elaborated in a general theory of the lexicon, a general lexicology. But if such a theory could be developed, it would provide the structural space for a second operator and thus for a scope swap. This is shown in Figure 2, where the meaning that is common to (6a) and (6b) is represented in Figure 2a, and the meaning that is unique to (6b) is represented as Figure 2b. And analogously, of course, for (7a) and (7b), where lend would be broken up into ‘allow to use’. In Figure 2, the two scope-bearing operators of (6a,b) are for two years and cause. For (7a,b), the two operators are for two months and allow.


The theory of prelexical syntax clearly makes some sense, but it also poses serious difficulties, as is to be expected from a radically new theory that opens new and promising horizons. The proper answer to this is, of course, not an immediate cold dismissal on grounds of trivializing arguments (which is what happened to it in the early 1970s), but an open-minded, positive reception and a serious willingness to look into its potential for the science of language. In any case, one of the positive consequences of this theory is that the nonlogical, merely lexical verbs cause and allow can now be taken to be scope-bearing operators ‘inside’ causative verbs such as jail or lend. Which in turn may provide an opening to our original problem of how to define the notion of scope, as we may envisage the possibility of saying that, in general, the complement of a complement-taking predicate is its scope—a highly significant generalization that would bring logic and language a great deal closer to each other. So let us defer the question of prelexical syntax for the moment and focus again on the question of scope in general, seeing if we can uphold the generalization that all scope-bearing operators are abstract predicates—that is, predicates at SA-level—with their own predicate-argument structure.

I have shown above that at least the quantifiers are naturally treated as predicates. This move will now have to be extended not only to the operators of propositional logic (not, and, or, if) but also to those prepositions that can occur in high peripheral position, such as for indicating temporal extension, or because of, or during, or after, etc., and then also to subordinating conjunctions like because, unless, although, while, etc. In general, it pays to consider all lexical items with semantic content to be predicates in SA-structures, as proposed in my Semantic Syntax of 1996. The lexicon of a language L will then define for each semantic predicate what its surface category will be: verb, noun, adjective, preposition, affix, etc., and the grammar of L will then deal with each such item according to the procedures fixed for verbs, nouns, adjectives, prepositions, affixes, etc. This is how what is a predicate of causation at SA-level may turn up in surface structure as, say, a causative affix of the kind found in a large number of languages.

Some linguists have expressed bewilderment at this approach, but that seems to me to be due merely to their lack of imagination. There is nothing wild or exotic about it. On the contrary, it constitutes a significant generalization over logico-semantic structures and throws an explanatory light on the fact that, for example, negation is an auxiliary verb in Finnish, that what are prepositions in European languages are verbs in Korean, and other such unexpected grammatical category shifts in various languages. (It will also help to maintain the generalization that it is semantic predicates that induce presuppositions, given that because, although, while and quite a few other subordinating conjunctions induce a factive presupposition.) … to be continued

Scope — 1

In the next few postings I intend to deal with the notion of scope in language—which will necessarily take me to the notion of scope in logic. I had promised Mark Brenchley to devote one posting to Chomsky’s defective citation practices, having admitted that I was wrong in saying that Chomsky had failed to give Karl Lashley his proper due in the review article against Skinner in Language of 1959: Lashley was referenced in that article in a perfectly adequate way, mea culpa. But I think it has been made sufficiently clear in the avalanche of comments to subsequent postings that it was indeed Chomsky’s practice not to refer to his sources adequately. So let’s leave that topic for what it is and turn to something that is intellectually more exciting and morally more uplifting.

Since the mid-sixties, the notion of scope has played an important role in a large variety of linguistic theories, starting with Transformational Generative Grammar. Yet if you look for a definition, or at least a reasonably precise description, of what is meant by ‘scope’ in the linguistic literature, you will find nothing. Since the notion stems from modern logic, one would expect a precise definition in the logical literature. Yet again, one draws a blank. I have consulted a fair number of advanced logical handbooks, but all I found was a casual use of the term, usually illustrated with a few examples and usually restricted to the quantifiers all and some. Occasionally, there is talk about the scope of the propositional connectives (not, or, and, if) or of the operators of modal logic, but no definition of what scope does in logic generally. There is an implicit convention to speak of scope as a property of logical operators—the so-called ‘constants’ of any logical system, as opposed to the variables that are used.

How about formal semantics, where scope is all-important? In the admirable Introduction to Montague Semantics by David Dowty, Robert Wall and Stanley Peters (Reidel, 1981, pp. 63–6), scope for quantifiers is described, not defined, in terms of the machinery of variable substitution in Montague formulae, but what scope actually is, is not revealed. Nor is scope described or defined, in that book, for other logical operators than the quantifiers. Sometimes one finds scope defined in terms of the notation used: count the number of brackets or full stops or whatever, and you will be able to see what is in the scope of what. But what it means to be in the scope of something is never revealed. I myself am equally guilty: I have written a great deal about scope over the past 45 years, but without any proper definition of that notion other than the implicit assumption that scope is a property of (logical and other) operators. I may be wrong, of course. There may be some publication where scope is properly explained. If so, please let me know. But so far I haven’t found any.

The way out of this curious situation is far from simple. In fact, I will argue that scope is a necessary property of all lexical and logical predicates that take an embedded proposition or propositional function as one or more of their terms, which means that one has to break loose from a number of established basic logical notions if one wishes to provide an adequate and general definition of scope. Part of what my argument will show is that it is very hard (read: impossible) to escape from the conclusion that it makes a lot of sense to treat all logical and nonlogical operators as (abstract) predicates. For the moment, however, I will do what everybody has done so far: I will give you a feel of what scope is simply by giving a few examples. Consider the difference between (1a) and (1b) (I have italicised the surface representatives of the scope-bearing elements):

(1)     a.         One student did not pass the test.

—      b.         Not one student passed the test.

It is universally agreed that the difference in meaning between these two sentences consists in the different scopes of the operators one and not: in (1a) one takes scope over not and in (1b) it is the other way round. The difference is, of course, straightforwardly truth-conditional. Logical paraphrases make the difference explicit. For (1a): ‘there was one student x such that – it is not the case that – x passed the test’ and for (1b): ‘it is not the case that – there was one student x such that – x passed the test’. It seems obvious that a proper theory of meaning will have to specify what scope is, and so far no theory of meaning has achieved that.
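Since the difference is truth-conditional, it can be exhibited in a model where the two readings diverge. A minimal sketch, with an invented model and with one read, as in the paraphrases, as the existential:

```python
# Model in which (1a) is true and (1b) is false.  Names are invented.
students = {"ann", "ben", "cat"}
passed   = {"ann", "ben"}            # cat did not pass

# (1a) 'One student did not pass the test':
#      there was one student x such that - not - x passed the test
one_not = any(x not in passed for x in students)

# (1b) 'Not one student passed the test':
#      not - there was one student x such that - x passed the test
not_one = not any(x in passed for x in students)

print(one_not, not_one)              # True False
```

Because cat failed while ann and ben passed, (1a) comes out true and (1b) false: no pragmatic story is needed to separate the two sentences.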

I have chosen the examples (1a) and (1b) because they are both unambiguous: there is hardly a way (1a) can be taken to have the meaning of (1b) or vice versa, no matter how one plays around with accent, intonation or context. Often, however, there is a ‘dominant’ and a ‘recessive’ reading. It is a standard observation that in a sentence pair like (2a) and (2b), the one can, though with some difficulty, have the meaning of the other under certain intonation patterns and/or in certain contexts, though the dominant meaning of (2a) is ‘there is nobody x here such that there are two languages l such that x knows l’ and of (2b): ‘there are two languages l such that there is nobody x such that x knows l’:

(2)     a.         Nobody here knows two languages.

—      b.         Two languages are known by nobody here.

That is, if in (2a) the NP two languages is given high pitch accent, it may have the meaning of (2b), and a similar scope-inverting effect can be observed when (2b) is read with low accent on two languages and high accent on nobody here.

One regularity strikes the observer immediately: there is a predominant trend to assign the highest scope to the first occurring operator in the surface sentence, the second-highest scope to the second occurring operator, etc. This is nicely illustrated in a sentence like:

(3)     John may not have been able to buy two books.

Here we identify four scope-bearing operators: the epistemic possibility operator may, the negation operator not, the agentive possibility operator be able, and the existential operator two. And indeed, the natural interpretation of (3) is: ‘it is possible that – it is not the case that – John was in a position to bring about that – there were two books b such that – John bought b’. It is possible, under special intonation, to assign highest scope to two books: ‘there are two books such that John may not have been able to buy them’, but it is not possible to conjure up a reading where two books has some intermediate scope in between two other operators: it has either the highest or the lowest scope.

As I alluded to above, scope turns out not to be exclusively reserved for logical operators: scope differences are also observed with nonlogical expressions, as is clear from the following:

(4)    a.         Because of the rain I did not go out. (unambiguous)

—     b.         I did not go out(,) because of the rain. (ambiguous)

—     c.         Not because of the rain did I go out. (unambiguous)

Phrases like because of the rain are not dealt with in logic, but they occur in language and turn out to have scope there. So if we are looking for a general definition of scope, we will have to make sure that the definition covers nonlogical operators as well. (I’ll keep calling all scope-bearing elements operators, whether logical or nonlogical.) I will also call prepositional phrases (PPs) that have the structural status of because of the rain HIGH-PERIPHERAL PPs (HPPP), to distinguish them from PPs in other structural positions, such as in London in Harry lives in London, or about wildlife in Harry writes about wildlife, or on the other hand, as in On the other hand, Harry never writes about wildlife, or in addition, as in In addition, Harry writes about wildlife, etc. (In my Semantic Syntax, Blackwell, 1996, pp. 116–28, I distinguish at least six levels at which English adverbs can function at sentence level. PPs differ from adverbs in certain ways, but they too can occupy positions of different status in English sentences.) HPPPs are typically scope-bearing operators.

The examples (4a–c) again illustrate the tendency for operator scope to be reflected in the left-to-right order of the operators (or their representatives) in surface structure: in (4a), because of the rain precedes not, but in (4c) the latter precedes the former. I will henceforth call that tendency the Scope Ordering Constraint (SOC). SOC is not an absolute constraint, as has been observed by a great many authors, but I defend the thesis that it is a, probably universal, default constraint on the ordering of operators in surface structures. This default constraint can be overruled under certain conditions, which will have to be carefully charted in the hope that the violations of SOC will be explained by the interference of other systematic factors. One case where SOC is overruled is (4b), which is clearly ambiguous—an ambiguity that is naturally resolved by means of different intonation patterns. In one reading, (4b) means what (4a) means; in the other it has the meaning of (4c). In the meaning of (4a), because of the rain has the status of HPPP and a comma is preferred between out and because. In the other reading, not takes scope over because of the rain, which is, therefore, not a HPPP. The difference between the two readings is, of course, truth-conditional and not a matter of pragmatics in whatever sense.

The difference is systematic. It is, for example, also found in pairs like:

(5)   a.    Every morning I read two poems. (unambiguous)

      b.    I read two poems every morning. (ambiguous)

(6)   a.    For two years the sheriff jailed Robin Hood. (unambiguous)

      b.    The sheriff jailed Robin Hood for two years. (ambiguous)
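The ambiguity of (5b) can be rendered in a rough quantifier notation (with the subscripted existential used informally for ‘there are two’):

```latex
% Reading A (= (5a)): "every morning" outscopes "two poems";
% the poems may differ from morning to morning
\forall m\,\bigl[\textsc{morning}(m) \rightarrow \exists_{2}p\,[\textsc{poem}(p) \wedge \textsc{read}(\textit{I},p,m)]\bigr]

% Reading B: "two poems" outscopes "every morning";
% two fixed poems are read each morning
\exists_{2}p\,\bigl[\textsc{poem}(p) \wedge \forall m\,[\textsc{morning}(m) \rightarrow \textsc{read}(\textit{I},p,m)]\bigr]
```

Reading A corresponds to every morning having HPPP status; Reading B to its taking lower scope, under two poems.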

In (5a) and (6a), the PPs have the status of HPPP (I reckon every morning to be a PP). In (5b) and (6b), they either have HPPP status or they have lower scope: under two poems for (5b), and under … oh dear, under what for (6b)? … to be continued

Chomsky in retrospect — 5

Most people will agree that there are many controversial aspects to the person of Noam Chomsky, especially as regards the period after 1960. In my view, the most controversial aspect, as far as linguistics is concerned, is his destruction of Generative Semantics (GS) during the late 1960s and early 1970s—the period of the “linguistics wars”. Many have argued that it was not “his” destruction, but that GS destroyed itself. I disagree, as do many others. In any case, to form an adequate idea of what happened, it is necessary to be familiar with three books. The first is Frederick Newmeyer, Linguistic Theory in America (Academic Press, New York/London, 1980). (There is a second, revised, edition of 1986, which I have seen but which I don’t have at hand; so I will rely on the 1980 edition.) This book, especially chapters 4 and 5, is, in my view, the best account available of the academic issues that played a role, even though, as Newmeyer himself candidly admits, he gives “an overly one-sided picture of the confrontation” (p. 133), favouring the Chomsky side. The other two books are Randy Harris, The Linguistics Wars (OUP, New York, 1993) and Geoffrey Huck and John Goldsmith, Ideology and Linguistic Theory. Noam Chomsky and the Deep Structure Debates (Routledge, London/New York, 1995). The accounts in these two books are likewise a bit lopsided, as their authors favour the GS camp. They devote more space to the ‘external’ factors involved—that is, the personal, emotional and social ones—though the academic aspects are not neglected. Academic purists may say that such ‘external’ aspects should be kept out of court and that only purely academic issues should be discussed, but if one does that one buries one’s head in the sand and will not understand what was going on. As Christina Behme pointed out in one of her comments, when one tries to understand the past, one has to take all causal factors into consideration. And, I might add, one has the right, in some cases, to express a moral judgement as well.

As far as the purely academic issues are concerned, I have to be brief, since a full discussion is not feasible here. Many issues were raised, most of them not essential to the GS research programme, such as prelexical syntax, the (non)existence of a deep structure level between the semantic representation and the surface structure, or the status of global rules. The essential tenet of GS, most of the time not identified as such and confused with the nonessential issues, was that every sentence S in a language L has an underlying semantic representation (SR) cast in terms of a formal logical language very much like the formal language used in modern predicate logic, and that the grammar of L consists in a top-down mapping system mediating between SRs and surface structures by transforming the former into the latter. During the years 1964 to 1966, Chomsky agreed with this view (still defended in his Cartesian Linguistics of 1966), but as from 1967 he proposed a different, more complex system, called Autonomous Syntax or Interpretive Semantics, where every S has an ‘autonomous’ syntactic deep structure (DS), converted by transformations into a surface structure (SS) and where both DSs and SSs are input to a system of ‘interpretive’ rules providing an SR in terms of the language of modern predicate logic. Academically speaking, the arguments to and fro were inconclusive, often fed, especially on the Chomsky side, by rhetoric, impressions, an inability to envision new horizons, and other subjective elements. It is fair to say that if the discussions had been allowed to be carried on in a normal way, they would most probably have crystallised into greater clarity and substantive new insights might well have been gained.

This, however, is not what happened. Right from the start, in early 1967, when Chomsky returned from his sabbatical leave at Berkeley, personal motives prevailed and academic reasoning became a mere fig leaf barely covering what it was really all about. And that was power, Chomsky’s power, to be sure. Chomsky was the one who rejected the new development of GS, of which he had himself been a part for a few years, but, notably, not the leading part. Let us suppose (I don’t believe it, but let’s suppose) that Chomsky had genuine intellectual qualms about the road to abstractness that the GS boys had taken. An orderly discussion would then have been the normal thing to engage in. Instead, to everybody’s “dreadful surprise” (Ray Jackendoff’s words quoted in Harris, 1993, p. 139), Chomsky immediately started a “counteroffensive” (Newmeyer, 1980, p. 114), in that he “launched a series of lectures that completely reversed the abstract syntax trend of deepening deep structure. […] Everyone immediately perceived them as an attack on generative semantics, a reactionary attempt to cut the abstract legs out from underneath the upstart model.” (Harris, ib.) It is clear from the sources, such as the correspondence between Chomsky and McCawley (Huck & Goldsmith, 1995, pp. 63–66), that, all of a sudden, while on leave for a few months at Berkeley, Chomsky had turned from a sympathetic listener into a deliberately obtuse block of resistance, unwilling to engage in constructive discussion. Harris writes (p. 285) that Lakoff, Ross and Chomsky “had only one meeting upon Chomsky’s return, which was interrupted by a call from the magazine [i.e. Time; PS] which took up most of the scheduled time; subsequent meetings were cancelled.” In short, it soon became clear that Chomsky was dead set on destroying Generative Semantics.

A well-known and very significant incident at the Texas Linguistics Conference held in the autumn of 1969 (Huck & Goldsmith, pp. 115, 125, 134, 161–2) bears witness to this ugly turn of events. During discussion time after Chomsky’s lecture, Ross was about to present some counterexamples to Chomsky’s generalizations but Chomsky wouldn’t let him finish, interjecting that counterexamples were irrelevant, whereupon “Haj just turned around and walked away while Chomsky went on with his interruption” (Huck & Goldsmith, p. 134). Postal’s comment, as given in Huck & Goldsmith (ib.), was: “Ross’s gesture signaled that this was a breakdown of communication, that he felt that Chomsky had broken the rules. Which I believe he had.” And he had indeed, in a major way. In a long footnote (p. 162), Postal adds: “One clear implication was that disagreeing with Chomsky, even then the most renowned and influential person in the field, would have a high price. A second was that the controversies which had arisen were not being treated by Chomsky as (only) technical matters to be resolved in normal scientific ways but as somehow sufficiently threatening to induce strong emotional responses and even clear violations of normal standards.” The matter thus became a question of ethics, more than of academic right or wrong. Chomsky had not only betrayed himself and his generative semanticist followers and students but he had also broken the ethical code of academe. From then on, any rational exchange of ideas or arguments between the generative semanticists and Chomsky and his group became impossible: what remained was hostility. And, given the balance of power as it was, it was inevitable that Chomsky would win. But at what price…

I myself was only marginally involved, as I did not belong to the MIT tribe. But I had published my PhD thesis Operators and Nucleus in 1968, and I had written a few articles, all in the GS spirit. Having met Chomsky personally in 1966 at Frits Staal’s home in Amsterdam, I met him again at MIT in early December 1970, at the start of a month-long visit, naïvely unaware of the change that had taken place. On that occasion, he treated me in the rudest possible manner, after which others quickly explained to me what was going on. I remember having a feeling of great disappointment, which, fuelled as it was by multiple later manifestations of academic impropriety, has deepened over the years into moral abhorrence.

In his old age, Chomsky is not doing well, morally speaking. After 2000, it became his habit to publish books and articles under his own name but without actually doing any writing. A large number of publications have appeared and are still appearing that consist merely of interviews, mostly servile adulation on the part of the interviewer and sloppy, offhand, self-indulgent and mendacious prattle on the part of the interviewee. Even respectable publishers have indulged in this practice—only, one presumes, because any book with the name “Chomsky” on its cover will automatically sell thousands of copies.

Does this man deserve a niche in the academic hall of fame? I doubt that very much. His thinking is certainly sharp, quick and broad in superficial extension, but it lacks depth, flexibility and above all vision. Nor is it really inquisitive, or at least it hasn’t been since the mid-sixties. From that period on, we see a man who digs in his intellectual heels and defends his fort, warding off ideas that might widen his perspective or make him look at things from a different, perhaps more promising, angle. We see a man who, having made a very promising start, rapidly began to abuse the enormous amount of social power he had acquired, eliminating dissidents while putting up an innocent face to the outside world. A man who professes high ideals of freedom and dignity in his political writings but practises the opposite in his academic activities. We see a compulsive prima donna, a clever manipulator of public opinion, a man who has consistently put his own person above the ideals that all unusually gifted persons occupying a leading position in any sphere of life have a special duty to pursue.

Chomsky in retrospect — 4

Meanwhile, apart from the undeniable stimulating effect of his presence in the field of linguistics, which he dominated for four decades, Chomsky has also done untold damage to that field. Linguistics has suffered greatly through his claim to absolute monopoly, his refusal, or inability, to consider other people’s ideas with an open mind and his divisive tactics giving rise to deep sentiments of aversion. When he finally lost his grip, around 2000, linguistics fell into chaos and disarray, and hence into general academic disrepute, calling forth, by way of reaction, a number of, mostly antiformalist, trends, which all copied Chomsky’s recipe for monopoly but now for different, mostly inferior, theoretical positions. The result is that it is impossible, nowadays, to write a coherent general introduction to linguistics or to set up a ‘state of the art’ linguistics teaching programme. All one can do, if one does not want to conform to one particular school, is list the various schools, leaving it to the students to make their choice. It must be said, though, that at least half the guilt for this state of affairs lies with Chomsky’s gullible devotees, who followed his every twist and turn, always presented as an ‘extension’ or ‘further development’ of the theory, even if the new positions were the opposite of earlier positions taken. Had the followers been less docile, the field might have been spared the ruinous state it finds itself in at the moment. But, of course, they all remembered too well the treatment meted out to the dissident generative semanticists between 1968 and 1975—not at all in agreement with the ideas of freedom and self-realization of individuals endlessly propagated in the same man’s political writings.

Besides seeing him as a linguist, many also consider Chomsky to be a philosopher. He has indeed frequently trespassed on philosophical ground (notably in his Reflections on Language, Pantheon, New York, 1975), but on each such occasion one sees that he is out of his depth and fails to reach a well-argued position. During the early sixties, the methodological issue of realism versus instrumentalism was brought to the fore explicitly, in the context of Chomsky’s rejection of behaviourism. Despite his protestations to the contrary and the obfuscation of this fact in the doctored 1975 edition of his The Logical Structure of Linguistic Theory (see my Western Linguistics, Blackwell, 1998, pp. 253–5), Chomsky had implicitly been a positivist instrumentalist like everyone else in the tradition of structuralist linguistics he had been brought up in. The budding generative semanticists were pushing hard for realism: grammars should strive for a maximally realist description of linguistic competence as a machinery used during speech—a point of view Chomsky appeared to be sympathetic to for a while. But when it became clear that this would mean a considerable loss of academic power to Generative Semantics and to psycholinguistics, he backed off, yet was unable to define the position he wanted to hold. Since then he has always dithered on this basic issue, as amply documented by a variety of authors. His 1995 Minimalist Program is neither this nor that: sometimes instrumentalist, sometimes realist.

The philosophical notions of universal grammar and innatism he espoused were not his but were developed in earnest as from the 16th century by scholars such as Sanctius (1523–1600), Arnauld (1612–1694) and Lancelot (1615–1695), the latter two in their 1660 Port Royal Grammar, or Beauzée (1717–1789), and a multitude of others, as is recognized by Chomsky himself in Cartesian Linguistics, pp. 52–59 (except for the crucial figure of Sanctius, who is not mentioned at all). It is fair to say that Chomsky’s contribution in this respect consists mainly in reviving the issue, after its neglect in structuralist linguistics, and in putting forward, or enabling others to put forward, proposals for actual language universals in the grammatical machinery of the languages of the world. The validity of the latter, of course, depends largely on the reliability of the theory in terms of which they are proposed, so that any doubt as regards the theory will be reflected on the proposed universals. Yet some of these proposals, in particular those to do with island constraints, seem to have survived despite any theory-related doubts, which may well signal real progress. Meanwhile a myth seems to have established itself to the effect that Chomsky himself was the originator of innatism. This myth is especially alive among non-specialists, who immediately associate innatism with the name Chomsky. It is unclear to what extent Chomsky himself is to be blamed for this, but what is clear is that since Cartesian Linguistics he has done very little to correct that false impression.

Other than that, the man has also, on occasion, ventured into general philosophy, often in public addresses or interviews subsequently published as articles or books (see, for example, the inane 1995 article ‘Language and Nature’ in Mind 104 (413), pp. 1–61, based on a couple of public lectures). In fact, however, he never elaborated or analysed a single philosophical question. But he did make a mockery of scientific methodology by declaring that an appeal to facts is illegitimate in science, one alleged reason being that all so-called facts are ‘idealised’—which, of course, gave him a free hand to ignore counterexamples. Not much of a philosopher, it would seem.

It has become clear over the years that his real passion lies in his political productions, which vastly outnumber his linguistic writings, numerous though these are by themselves. It started with ‘The responsibility of intellectuals’ in The New York Review of Books of February 1967 (in later years, he has continually belied, by his own irresponsible behaviour, the admonition implicit in the title). Then came a number of books against the Vietnam war. This, one has to say, showed some courage, as he risked possibly severe reprisals from the US authorities. After that, what we find is a never-ending tirade against the US as a superpower, supported by endless selective quotes, often taken out of context or even plainly misquoted, and always without even the remotest attempt at a balanced analysis or representation of the state of affairs. Had he been more objective and more modest, and less opinionated, partisan, prejudiced and publicity-driven, he might have made an honest name for himself as a political journalist.

In all this, he shows anarchist leanings but declines to define his exact political position, even when challenged by interviewers. Thus, in an interview of May 1995 with the prominent Irish journalist Kevin Doyle (see Chomsky, Language and Politics, AK Press, 2004, pp. 775–785), when politely but insistently asked about his definition of anarchism, he kept evading the question, falling back on terminological hairsplitting laced with irrelevant historical quotes. The closest he came to a clear statement is when he said, at the very beginning: “That is what I have always understood to be the essence of anarchism: the conviction that the burden of proof has to be placed on authority and that it should be dismantled if that burden cannot be met.” But if that is what anarchism amounts to, then the ultraconservative, antidemocratic Greek philosopher Plato was an anarchist in his Republic, which is one extended justification for an infernally totalitarian Gulag state, with rigidly fixed social classes and authority, control and punishment reaching into people’s most private moments. Plato’s justification, by the way, does not fall back on the behaviourist grounds that humans can be conditioned into any state of subservience but on the innatist grounds that humans are, by their very nature, ignorant, fickle, unruly and dangerous, and thus need to be restrained.

Then, the obvious questions of who is to decide whether the burden of proof has or has not been met and how “authority” should be “dismantled” if that burden of proof has not been met are not answered. In a talk “Containing the Threat of Democracy”, given in Glasgow in January 1990 (see Chomsky on Anarchism, selected and edited by Barry Pateman, AK Press, 2005, pp. 153–177), Chomsky explains why he refrains from discussing concrete action plans and stays on an abstract, philosophical or theoretical level (p. 173):

So it seems that I have two choices: to keep to the general issues of freedom and common sense […]; or to discuss specific questions of power, justice, and human rights. If I were to take the latter course, I’d have to keep to questions to which I’ve given some thought and study. Thus, in the case of national self-determination, I would feel able to discuss the question of Israel–Palestine, but not that of Northern Ireland. In the former case, what I have to say might be right or wrong, smart or stupid, but at least it would be based on inquiry and thought.

Here we really see the man at work. The whole talk was on the pernicious effects of US state and corporate power, not on questions of “national self-determination.” And, of course, Chomsky has given a great deal of “thought and study” to, and thus has detailed and explicit knowledge of, the intricacies of US state and corporate power, which he invariably denounces as criminal. So, by his own reasoning, he should be in an excellent position to hold forth on the practical issue of how to “dismantle” that system. Yet he refrains from doing so. Why? I think because he is short of ideas and also because he does not want to imperil his own private investments which depend directly on the very same criminal system he repudiates. What remains is a picture of a riotous agitator whose scoffing criticisms, though often justified in themselves, in no way contribute to any kind of constructive solution or repair. All we get is abstract ideological grandstanding without any practical or theoretical value and, in the end, a non-distinctive, bourgeois accommodation to the existing system within the progressive intellectual left as it exists in the US. … to be continued