John Taylor’s frequency linguistics – 1

by pieterseuren

As I said in my previous posting, Nigel Love, of the University of Cape Town, drew my attention to the book The Mental Corpus by John R. Taylor (Oxford University Press, 2012), saying that my first posting on Mickey-Mouse linguistics read as if I had had that book in mind when I wrote it. In fact, however, I hadn’t seen that book yet, but I have now, and oh boy, what a treasure trove! Thank you, Nigel. It is the purest form of frequency worship I have so far encountered. So it’s worth having a good look at it. I will dedicate the next few postings to John Taylor’s book, hoping that this will help to reduce the frequency craze to rational proportions.

Just to be clear, I do not maintain that frequency plays no role at all in language use or in language acquisition or in language change. Far from it: frequency is no doubt an important factor in those areas. All I maintain is that frequency plays only a minor role in linguistic competence, that is, the speaker’s ability to express given meanings in well-structured sentences, realized as utterances. It plays a role, and then only a minor one, in so far as the speaker is faced with a choice between semantically equivalent alternative forms of expression. This applies to both the grammar and the lexicon, though it is more obvious in the latter. Both constructions, as the result of rule systems, and lexical items, including idioms, carry approximate frequency estimates in the minds of speakers, just as they are marked for social, dialectal and interactional register and who knows for what other parameters as well. Whether a speaker uses the word car or vehicle will depend on the interactional and social situation, and the selection made may well be influenced by frequency considerations. Likewise in grammar. Whether a speaker says You will stay here, or You won’t go anywhere, or You won’t go nowhere, will depend, apart from the meaning he intends to express, on his choice of interactional and social register (assuming he has sufficient command of the registers concerned), a choice in which frequency considerations may play a limited role. Other than that, frequency plays no role in linguistic competence. In particular, once the speaker has set his sociolinguistic register, the rule systems he employs do not contain a frequency parameter. That is why frequency is not, and should not be, a term in grammar.

Taylor takes a different view. He maintains that frequency is the main parameter everywhere, in the use, acquisition, change and command of any given language. So that is how we differ. I will focus on competence, that is, grammar and lexicon, though I also find that he exaggerates the importance of frequency in other respects, but that is not what I am aiming at here.

Taylor’s main thesis is that a speaker’s competence or knowledge of his language is organized as if it were a ‘corpus’, in parallel with the corpuses of computational linguistics. For Taylor, language acquisition proceeds on the basis of token frequencies and language competence is organized and used, again, on the basis of token frequencies. In the Conclusion to Chapter 7 called “Frequency”, we read (p. 178):

Frequency effects are pervasive throughout the grammar. […] [F]requency effects are not simply a matter of language-as-used, but are an intrinsic aspect of a person’s language knowledge. Speakers know the relative frequencies of the various elements which make up their language. Frequency influences performance on all manner of linguistic tasks, not the least of which is the comprehension of potentially ambiguous sentences. Frequency is also implicated, albeit indirectly, in the productive application of schemas to new instances. It is all as if, in the course of a lifetime’s exposure, speakers have been keeping a mental tally of the number of times they have encountered the words, the sounds, and the constructions of their language.

Yet in the book as a whole, there is no systematic limitation to each speaker’s various forms of personal frequency count. On the contrary, indiscriminate use is made of token frequencies in the language as a whole, in a text genre, in one text or interaction, in one person’s speech, etc.: all these are mixed up without any intelligible indication of how such a linguistic mental ‘corpus’, said to define a speaker’s competence, is defined and internally organized. Computational corpuses tend to be well-defined and well-organized. But how is a mental linguistic ‘corpus’ defined and organized? This question is not answered. All one gets, at the very end of the book (p. 287), as if by way of an afterthought, is the following:

There are many dimensions along which the raw linguistic event can be categorized, classified, or analyzed. It can be analyzed into its fragments, primarily words and word-like units, but also combinations of words and the syllables and sounds which make up the words. These elements may in turn be categorized as instances of more schematic categories and cross-referenced to elements derived from other exemplars. [NB: the notion of ‘exemplar’ is introduced on the last page of the book and is, puzzlingly, defined as ‘a categorization, classification or analysis of a token’; PAMS] The exemplar, we may suppose, is also indexed with respect to features of the communicative situation, such as the identity of the speaker, the speaker’s gender and regional provenance, as well as the formality and subject matter of the speech exchange. Importantly, the exemplar will be associated with the presumed semantic intent of the speaker and its pragmatic relevance to the current discourse. Thus, a word encountered in a linguistic exchange will be remembered along numerous dimensions, including, amongst others, (a) its immediate lexical environment, (b) the containing syntactic construction and its semantic-pragmatic value, and (c) its pronunciation, indexed with respect to characteristics of the speaker and the communicative exchange. Multiple representations of identical features of the input will result in a strengthening of their mental representation. The empirical findings reported throughout this book lend support to each of these claims.

Not very helpful, to say the least. I, for one, am unable to make sense of this hodge-podge of type and token things presumably recorded in a speaker’s linguistic memory and defining the language used. According to this, those of you who are fortunate enough to read this posting will, from now on, have it as part of your knowledge of English that in the previous sentence the expression hodge-podge is preceded by unable to make sense of this and followed by of type and token things, whereas those speakers of English who do not have the privilege of reading this posting will have to get by with a language that lacks this bit of information. Or the voice quality and local accent of the ticket collector on the train I took from London to Oxford in November 2008 is allegedly recorded in my memory (it isn’t, I can assure you) and is taken to be part of my knowledge of English. I hope I can be forgiven if I venture the opinion that this is absurd. For Taylor, there are no rules of syntax, morphology or phonology: it is frequency all the way down and all the way up.

I will deal with other aspects of the book in later postings, but today I want to focus on the notion of frequency, which is central to the entire book, and I will show that this very notion makes Taylor’s view of frequency as the basis of linguistic knowledge incoherent. The argument is fairly simple. It amounts to showing that the attribution of sameness to two or more distinct tokens, which is required for frequency counts, often depends crucially on rule systems: without a rule system, the attribution of ‘sameness’ to any number of distinct tokens is impossible.

In order to be able to speak of frequency at all, one needs the notion of repetition, which in turn requires the notion of sameness. But no two tokens are ever entirely identical. Absolute sameness (identity) between two objects does not exist: there always are differences, no matter how minute. If there is absolute identity, there is one object, one token. So what makes us decide that two tokens may be considered the same, or identical? This is known as the problem of the identity of indiscernibles, as formulated by Leibniz in the late 17th century.

Each individual object or event is a token, but when we speak of sameness, we must create a type at some level of abstraction (the term pair type vs token is due to the American philosopher Charles S. Peirce). Sameness is always relative to an analysis at some level of abstraction, usually associated with a specific functional purpose. When I say John and Harry have the same car, the extent to which the sameness is to be pushed for it to count as true will depend on the situation at hand. Normally, if John and Harry are not co-owners of one and the same vehicle, the two cars will have to be of the same make and of the same model. But how about the same year, the same colour, the same engine power, etc.?

Likewise in language. Phoneticians will tell you that no two sounds are ever absolutely identical (Bloomfield already stressed this around 1930). But when we know a language, we group the different sounds into functional units which we call phonemes. In a phonological transcription, two token occurrences of the same phoneme count as ‘the same’, but in a phonetic transcription ‘sameness’ is much more narrowly defined, taking into account the modifications due to phonetic environment (‘allophones’), but not, for example, differences due to the speaker’s age or gender. Morphemes count as ‘the same’ even in different phonological garb: the morpheme spelled as tiny is sometimes found pronounced as /tayni/ and sometimes as /teeny/ (please forgive the awkward transcription: you know what I mean), but it is the same word, at least at some given level of abstraction. (When I was at school, we had a French teacher who had the habit of saying, sometimes in immediate succession, I never say anything twice, the second time louder and at a higher pitch, which made us laugh, rightly so, I think.)
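The point that a frequency count presupposes a prior type assignment can be made concrete in a small sketch. The token list and the variant-to-morpheme mapping below are invented purely for illustration; the mapping stands in for the rule systems that, I argue, are indispensable:

```python
from collections import Counter

# Invented phonetic tokens of the word 'tiny', in the loose transcription
# used above: the same morpheme in two different phonological garbs.
tokens = ["tayni", "teeny", "tayni", "teeny", "teeny"]

# Without any rule system, every phonetically distinct token is its own type:
raw_counts = Counter(tokens)
# Counter({'teeny': 3, 'tayni': 2}) -- two 'words', no repetition across variants

# A toy rule mapping phonetic variants onto one morpheme type. Only given
# such a rule do the five tokens count as repetitions of the same word:
morpheme_of = {"tayni": "tiny", "teeny": "tiny"}
type_counts = Counter(morpheme_of[t] for t in tokens)
# Counter({'tiny': 5})
```

The frequency of tiny is thus not given in the raw tokens at all; it exists only relative to the mapping, that is, relative to a rule system.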

At an appropriate level of abstraction, sameness also allows for syntactic and lexical differences. At some level of abstraction, directly relevant to linguistic interaction, the sentences That may not be the case, That may not be so, That is possibly not so, Possibly, that is not so, It is possibly not like that, and quite a few others, count as ‘the same’ for speakers of the language, even if there is no computer program identifying them as such. They are just paraphrases of each other. (But note that the element of possibility always precedes the element of negation, both being scope-bearing operators; see below.) For certain purposes (syntax, information structure), the sentence I won’t be in Paris tomorrow counts as different from Tomorrow I won’t be in Paris, but for other purposes (conveyance of factual message) they count as the same. When I utter the first sentence two times and the second three times to the same person within a time span of, say, five minutes, my interlocutor is fully entitled to say You’ve told me that five times now.

Yet the preposing and postposing of adverbials is not always indifferent as regards the message conveyed. Take the two sentences:

(1)   a.         For two years I won’t be in Paris.

b.         I won’t be in Paris for two years.

On one reading, the two convey the same message, namely ‘for two years it will be the case that I will not be in Paris’: Paris will have to do without me for two long years. But on another reading, the one that says ‘it is not so that I will be in Paris for two years (but for less, or more)’, they are not the same, because it is only (1b) that can convey this message. In other words, (1b) is ambiguous in a way that (1a) is not.

This difference is systematic when there is an interaction between what logicians call scope-bearing operators, as in the following examples (see also the examples (8)–(11) in my first blog, on Mickey Mouse linguistics):

(2)   a.         Twice I saw someone leave. (non-ambiguous)

b.         I saw someone leave twice.               (ambiguous)

(3)   a.         Every morning I read two poems.    (non-ambiguous)

b.         I read two poems every morning.    (ambiguous)

(4)   a.         At six they had already left.             (non-ambiguous)

b.         They had already left at six.             (ambiguous)

(The ambiguity of (4b) stands out more clearly when you ask “at what time did the leaving take place?” and “at what time was the situation such that they had already left?”.)

These are scope differences. That is, the semantic difference can be expressed by changing, in the semantic representation, the order of the scope-bearing elements, such as for two years, not, twice, someone, every morning, two poems, at six, or the pluperfect tense had + past participle. There is a probably universal overall default constraint on grammars to the effect that the operator hierarchy is expressed in terms of left-to-right order in surface sentence structures. In my publications, I call this the Scope Ordering Constraint, or SOC. The default can be overridden when an operator is grammatically defined as ‘landing’ in a clause-peripheral position, which is generally the case for operators surfacing as sentence adverbials like twice or every morning or at six. This accounts for the ambiguity of the (b)-sentences.
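The interplay of the SOC default and peripheral landing can be sketched in a toy model. The representation of a reading as an ordering of operators from widest to narrowest scope, and the function itself, are my own illustration, not a fragment of any actual grammar:

```python
# Toy model of the Scope Ordering Constraint (SOC): by default, surface
# left-to-right order mirrors scope order (widest scope leftmost). An
# adverbial operator may also 'land' clause-peripherally, which makes
# the clause-final position ambiguous.

def readings(adverbial, other_operators, position):
    """Return the possible scope readings (widest-to-narrowest operator
    orderings) for a sentence with a peripheral adverbial operator."""
    if position == "front":
        # A fronted adverbial can only be read with widest scope (SOC default).
        return [[adverbial] + other_operators]
    if position == "final":
        return [
            [adverbial] + other_operators,   # peripheral landing: widest scope
            other_operators + [adverbial],   # SOC default: surface order = scope order
        ]
    raise ValueError("position must be 'front' or 'final'")

# Example (2): 'Twice I saw someone leave' vs 'I saw someone leave twice'.
front = readings("twice", ["someone"], "front")   # one reading: unambiguous
final = readings("twice", ["someone"], "final")   # two readings: ambiguous
```

The (a)-sentences, with the adverbial fronted, get a single reading, while the (b)-sentences get two, exactly the pattern of (2)-(4).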

To know which terms are scope-bearing and which are not, one needs to have some knowledge of modern logic, where the negation not and quantifying expressions such as all, every, some, many, or numerals, are treated as operators which, by definition, have ‘scope’, in that they restrict their logico-semantic effect to what, in a logical analysis, is their scope. Definite terms, such as terms under a definite determiner (the, those, your) or definite adverbials (tomorrow, yesterday, last year), are not scope-bearing and thus do not exhibit the typical phenomenon demonstrated in (1) to (4).

Linguists are not used to scope differences. It is a common misunderstanding that definite and quantified terms behave identically in the grammars of languages. That this is not so is beautifully shown in, for example, German and Dutch, where the place of the negation in the sentence is seen to correspond with operator scope. Consider the following three German sentences:

(5)   a.         Ich sah den Mann nicht.     (‘I didn’t see the man’)

b.         Ich sah keinen Mann.          (‘I saw no man’)

c.         Ich sah einen Mann nicht.  (‘there was a man I didn’t see’)

Sentence (5a) reads ‘it is not so that I saw the man’, with the negation over the whole proposition ‘I saw the man’, which has the definite, scope-insensitive, direct object den Mann. Sentence (5b) is again a case of sentence negation: ‘it is not so that I saw a man’, with the negation over the whole proposition ‘I saw a man’. But now the direct object is indefinite, that is, it contains the existential quantifier ‘some’, which makes the direct object scope-sensitive. SOC now causes the negation in (5b), here expressed as the initial phoneme k before einen, to precede the direct object. If it follows the indefinite direct object, as in (5c), the scope relations are changed in that now the negation nicht is in the scope of the indefinite, that is quantified, direct object, which, therefore, has to come before the negation. German (and Dutch, which is the same in this respect) allows for sentences like (5c), because in German (and Dutch) the default position of the negation is at the end of the clause. This default is overridden when, in order to reach the end of the clause, the negation would have to cross an operator with lower scope, which would violate SOC. In such cases, the negation stops before the operator in its scope, as shown in (5b). But in (5c) no violation of SOC occurs, since there einen Mann has larger scope than nicht. English has no direct equivalent of (5c), because in English the negation has to be united with the finite verb form (an auxiliary or do in do-support cases). English, therefore, must resort to a special ploy in order not to violate SOC, such as the use of the quasi-logical periphrastic form There was a man I didn’t see.
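The placement rule at work in (5a)–(5c) can be summed up schematically. The following sketch is my own simplification of the German facts, invented purely for illustration:

```python
def negation_placement(object_is_indefinite, negation_outscopes_object):
    """Toy model of German negation placement as in (5a)-(5c).
    Default: the negation goes to the clause-final position. Override:
    if reaching the end of the clause would make the negation cross a
    lower-scope (indefinite) operator, violating SOC, it stops before
    that operator."""
    if not object_is_indefinite:
        # (5a) Ich sah den Mann nicht: definite object, scope-insensitive,
        # so the default clause-final position is unproblematic.
        return "clause-final"
    if negation_outscopes_object:
        # (5b) Ich sah keinen Mann: the negation must precede the
        # indefinite object it outscopes (SOC), surfacing as k- on einen.
        return "before indefinite object"
    # (5c) Ich sah einen Mann nicht: the indefinite outscopes the
    # negation, so the clause-final default violates nothing.
    return "clause-final"
```

Calling the function with the three configurations of (5a)–(5c) reproduces the three attested word orders.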

The point here is that sameness relations between sentences, at the level of the message conveyed, are seen to be subject to rule systems, not to patterns of syntactic structural similarity that may or may not have been engrained in some memory. In the cases discussed, the rules are of the type that convert semantic representations in the form of logico-semantic structures to surface structures. These are transformational rules (though different from the transformations known from early Chomskyan grammar, which are of a merely syntactic nature). This shows that it is impossible to determine sameness relations, necessary for frequency counts, without having recourse to rule systems. (An analogous argument can probably be set up for phonology, but, not knowing enough about phonology, I won’t do that.) This defeats Taylor’s research programme right from the start and for deep, principled reasons.

A further point has to do with the ideological background to the stance taken by Taylor and his cognitivist friends. In order to account for phenomena like those shown above, one needs fairly ‘abstract’ rule systems and formalisms—something the Taylors of this world abhor. Why they do so is unclear, at least when one restricts oneself to purely academic argument. It becomes clearer when one is prepared to consider reasons of a more ideological nature, such as an ideological wish to see the human mind as a simple device definable in terms of a physical counting apparatus, just as the information technologists of the 1950s and 1960s wanted to reduce everything to mechanically recordable statistical frequency measures. This means that, if this is the reason for the frequency fans to close their eyes to anything not reducible to frequency, the cognitive revolution of the 1960s has passed them by.

Next time I will discuss other aspects of Taylor’s book. But next time may be a few weeks from now, since on April 9th I will have a cataract operation on my right eye, which is not serious but will keep me away from the computer for a couple of weeks. See you soon, then, but not before the end of April.