Frequency Linguistics — 3: John Taylor’s “The Mental Corpus”

by pieterseuren

Two blogs ago I promised that I would say more about John Taylor’s book The Mental Corpus (OUP, 2012). Now is the time for this sad duty. Sad because I find reading bad books or papers, especially when they pretend to be serious science, both infuriating and a form of self-inflicted mental torture, and I find writing about them depressing. I’d much rather go in for a serious, productive discussion. Unfortunately, however, that is not possible with this book. I will try to make the best of it, but it will remain a hatchet job. So here goes.

Given its title, The Mental Corpus: How Language is Represented in the Mind, one would expect the book to provide some precise idea of what is meant by a ‘mental corpus’. But no, nowhere in the book is this notion defined in a comprehensible way. The final chapter (pp. 280–287), called The mental corpus, explicitly tries to make the notion clear. Yet here too the reader is left empty-handed. The best I found is on p. 263, where the first paragraph reads: “The thesis of this book is that knowledge of a language can be conceptualized in terms of the metaphor of the mental corpus. Language is acquired by a strictly bottom-up process, through exposure to usage events, and knowing a language consists, not in knowing a battery of rules, but in accumulated memories of previously encountered utterances and the generalizations which arise from them.” Note that one is told casually, and only in the penultimate chapter, that the term corpus is a metaphor, but what the metaphor stands for is not explained. One has to make do with the vague notion that each speaker has a memory that somehow stores all the utterances in the language s/he has ever heard or read, including phonetic details (voice quality, gender and all), context of utterance, and precise semantic, morphological and syntactic form, under some unknown but no doubt highly complex formula of analysis, whose origin is left obscure. One is then supposed to draw on that, literally fabulous, memory when one speaks or writes.
Nothing on the complex formula of analysis in the assumed memory store, although that precise formula would have to be the central element in Taylor’s so-called ‘theory’, nothing on the internal structure of that memory, nothing on how speakers actually draw on it, nothing on the kind of memory involved (episodic, semantic, declarative, procedural), no crucial evidence, no arguments why this should be so, nothing on the evidence against this notion, gathered in the early 1960s and published in the Journal of Verbal Learning and Verbal Behavior, showing that utterances are remembered in terms of their semantic content, as their precise phonetic and grammatical features vanish from memory within minutes, nothing on the circularity paradox in his theory of language acquisition (the child has to have knowledge of the language in order to judge sameness and difference, as discussed in the last blog, yet it is these judgements that make up language acquisition). In short, nothing at all.

No method is specified for counting. For example, how does one count the number of times a reference is made to an entity in a given text? Try to find out how often Robin Hood is referred to in Walter Scott’s Ivanhoe, and you will find that it is impossible to do so. For example, do anaphoric or reflexive pronouns count as reference occurrences? Does a sentence like I know who has done it, where the person intended is Robin Hood, count as an instance of reference to Robin Hood? Readers or listeners do have a global idea of relative frequencies, but this notion is, though invoked (e.g. on pp. 148, 175-6), curiously mixed up with absolute frequencies and not discussed or analyzed any further. If readers or listeners have the feeling that Robin Hood is more prominent in the story than some other characters, is this based on the occurrences of linguistic forms or is it a complex function of the number of times and the length of episodes in which, and the prominence with which, Robin Hood plays a role in their mental representations brought about by their interpretation of the texts? These are normal empirical questions, but they are not discussed: as if by statute, everything has to be reduced to the frequency of linguistic occurrences.

Despite all the unclarities, one thing is crystal clear: Taylor abhors any formal kind of grammatical machinery, in particular Chomsky-type generative grammar. He even goes so far as to deny emphatically and repeatedly the reality of rules in linguistic competence: everything is frequency-based. Yet, curiously, he implicitly admits the reality of rules. Opposing the generative rule-based model, which seeks to maximize generalizations and to avoid redundancies, to the construction-based account, which he favours and which, according to him, admits of redundancies, Taylor says, on p. 128: “[T]o the extent that knowledge of a language does contain redundancies, the construction-based account might be psychologically more plausible.” Now, first, the redundancies point is not valid, since the information that a particular product of the grammar is a set phrase or common collocation is independent additional information, which takes the grammatical structure of the phrase or collocation for granted. This means that any full description of a language will have to specify not only the rules of the grammar but also what specific combinations have gained currency as idioms or set phrases. But apart from this, the quote taken from Taylor’s p. 128 implies that there is also an extent to which knowledge of a language does not contain redundancies, for which a rule-based account should then be considered psychologically more plausible—which would be an implicit admission of the validity of the notion ‘rule’ in the theory of grammar. Not so for Taylor. While an inordinate emphasis is placed on the massive occurrence of idioms and lexical innovations in languages, the fact that these are all structured according to rule-governed patterns (some of which are reserved for idioms of certain kinds, such as the sooner the better) is left undiscussed. 
Idiosyncrasies take pride of place and anything remotely suggesting that language use is subject to rules and regulations, like taking part in traffic, is covered up and kept out of sight.

Yet the two concepts are not incompatible. It is no doubt true that speakers, in producing their utterances, often do not literally have to apply their implicitly known rules but use the shortcut of falling back on syntactic and morphological chunks drawn from memory, thereby minimizing the effort needed for the production of their speech. And to study these processes is, of course, a legitimate undertaking. Yet it is equally true that speakers actually use rules when they speak. In fact, the less experience they have in a language, the more they depend on rules. And the more competent speakers are, the better their command of idiomatic and fixed expressions stored in memory. It is, therefore, in child language that one observes most directly the actual use of rules during speech. You will remember the example I gave in my last blog of my nine-year-old granddaughter saying to me, after a short hesitation, that “everything was freezed over”. Her hesitation showed that she was wavering between, on the one hand, the irregular frozen, which she had heard said many times during that December and, on the other, what she felt certain was the rule. She chose the rule but should have chosen the memory-stored exception. I fail to see why one should not accept a theory that posits a rule system (grammar) combined with a finite but open-ended store containing a list of minimal morphemes, words, preferred expressions, frequent or fixed collocations and idioms, and all exceptions to the rules. The fact that, on the whole, Chomskyan grammarians show relatively little inclination to elaborate these aspects of grammatical theory may be a point of critique, but it is no argument against the viability of any research programme involving the notion of rules. The non-viability of the specifically Chomskyan rule-based programme is due to quite different and much more basic faults.
Strangely and unbelievably, in rejecting Chomskyan generative grammar, Taylor thinks he has rejected any form of rule-based grammar.

Thus, while an inordinate emphasis is placed on the massive occurrence of idioms and lexical innovations in languages, the fact that these are all structured according to rule-governed patterns (some of which are reserved for idioms of certain kinds, such as the sooner the better) is left undiscussed. Don’t worry is a set expression, but Don’t wait is not. Yet they are formed according to the same grammatical rules. Idiosyncrasies take pride of place, but anything remotely suggesting that language use is subject to rules is kept away—except on p. 128, where it is suddenly implicitly admitted that there may well be rules in the competence system.

In Taylor’s vision, everything in language is supposed to be mushy and wishy-washy, and the reader’s understanding of his so-called ‘analyses’ is likewise supposed to be mushy and wishy-washy—the worst kind of quasi-science. There is no clear notion of what constitutes data: judgements concerning well-formedness, possible meaning(s), dialectal, sociological, interactional or special group register markings, are mixed up with corpus-derived frequency data, whose selection is, moreover, heavily biased against formal (that is, Chomskyan) grammar—which, for Taylor, automatically means in favour of a frequency-based account. Nowhere are criteria given that might identify counterexamples. Obvious and well-known generalizations are overlooked or, perhaps, just ignored.

To mention just one example out of a multitude, Taylor stresses repeatedly (e.g. pp. 2, 158–60) that adjectives like total or unmitigated have certain statistical preferences as regards the nouns with which they occur. Corpus analysis tells you that unmitigated occurs overwhelmingly more frequently with disaster than with other nouns, such as success, failure, joy, misfortune. And total occurs much more frequently with failure than with success. Then, on p. 160: “From a semantic point of view, there is no particular reason why unmitigated should collocate so strongly with disaster.” And the same as regards the preference of total for failure. Yet there is an obvious semantic generalization: only bad things are unmitigated, as the verb mitigate means ‘make less severe, serious or painful’. That the combination of unmitigated with success does occur with some frequency is probably due to the fact that many people have started to confuse unmitigated with undiluted. This may be an instance of semantic change caught in an incipient stage, but the semantic reason for the combination unmitigated disaster is clear. Likewise for total, which prefers negative nouns or adjectives: totally empty, rather than totally full, total absence or total lack, rather than total presence, etc. Note that total silence or a total wreck or total loss have no positive counterparts. This preference of total for negative heads may well again be an instance of incipient semantic change, this time a semantic specialization, not shared by synonyms such as entire(ly) or fully. Why this reluctance to look for and accept generalizations not based on frequency? I can see no other reason than a stubborn determination to reduce everything to frequency: competitors are not welcome.

The book is to a large extent meant to refute Chomskyan grammar, something I do, of course, sympathize with. But in all fairness, almost everything Taylor says against Chomskyan grammar is either based on insufficient understanding or simply lacks argument value. Moreover, Taylor never considers the possibility of different models of algorithmically organized grammar. One obvious possibility is a grammar that converts semantic representations into well-formed surface structures, as was proposed during the 1970s in Generative Semantics. One specific instantiation of such a model is proposed in my book Semantic Syntax (Blackwell, Oxford 1996), which frequency fans would do well to study in detail. This model is totally non-Chomskyan, yet algorithmic and thus rule-based and completely formalized. But Taylor does not even consider the possibility of such a model.

In Chapter 8 the laughable theory is proposed that skewed frequencies as found in any linguistic corpus are a design feature facilitating the acquisition of language. No evidence at all is adduced for this thesis, which is hardly surprising, as it is absurd. Chapters 10, 11 and 12 are just padding, without any significance for the thesis Taylor intends, but fails, to defend. They are, moreover, for the most part either trivial or confused. Chapter 12, for example, mixes up, without argument, (i) the well-known and universally recognized phenomenon of morphological blending (brunch from breakfast plus lunch, Amerindian, from American plus Indian, Oxbridge from Oxford plus Cambridge, webinar from web plus seminar, etc.), (ii) what he calls phrasal blending, normally called contamination, to be seen in ungrammatical cases like Do I have to put on my seatbelt on? or This is getting very difficult to cut this, or the most beautifulest girl in the world, (iii) what he calls the blending of words and constructions, as in The psychic will think your husband into another galaxy (which is, in fact, a mere extension of the well-known construction instantiated in, e.g., laugh out of court, fully productive in German and Dutch), (iv) transitive constructions such as The farmer killed the duckling, which, according to Taylor (p. 279), instantiates “the blending of the verbal and nominal elements with the formal and semantic properties of the transitive construction” (as if the same would not hold for intransitives!) and, to cap it all, (v) the creation of intensional subdomains in a discourse domain, as in I suppose Claire thinks that John is married, where I suppose introduces a first intensional subdomain and Claire thinks a second, so that John is married is interpreted two domains away from the extensional or truth domain meant to represent situations as they really are in the world.
(As would be expected, Taylor is totally unaware of the extensive literature on intensional subdomains in logic and the philosophy of language, as that literature does not deal with frequency.)

No indication at all is given of what unites these five classes of phenomena, yet it is said (p. 279, closing the chapter): “Seen in this light, blending is able to do the work traditionally assigned to generative rules; in other words, it is able to account for creativity, in the generativists’ sense. More than this, it is also able to accommodate innovative extensions which go beyond creativity in the narrow sense of the term understood by generative theorists.” I hope you will understand why this infuriated me: without wishing to defend Chomskyan grammar, which I consider untenable, I am incensed by Taylor’s pretension that a totally undefined and in fact quite fantastic, confused and murky notion such as his ‘blending’ can be said to do a better job than what is, after all, a formally reasonably well defined theory such as Chomskyan generative grammar.

I could go on and on expatiating on the faults of this terrible book, the worst book on linguistics I have seen in many, many years, perhaps ever. A complete refutation would fill an entire volume, which I won’t write because the book is not worth it. Apart from the inherent incoherence of Taylor’s ‘language-as-corpus’ view, the book’s main fault is a conspicuous absence of actual arguments: it’s all rhetoric, easily shown up to be empty when one applies ordinary standards of sound reasoning. In this respect, it represents a retrograde development in linguistics, after the enormous methodological and empirical gains of the past half century.

For the moment I think I have said enough about the frequency craze that has been raging in linguistics for the past few decades. I hope I have convinced you that this is a blind alley, or at least that there are serious doubts regarding this paradigm. Next time I will write about something quite different, and, I think, much more joyful and exciting.