Signs o’ the times – some/any invariant meanings and COCA

I am glad to be writing this particular (rushed, see end) post as it involves corpus linguistics and I have not done such a post for a while. It is also about my current interest – Columbia School linguistics.

Over the years I have become less enamored of the power of corpus linguistics for language teaching. It is certainly very useful for accessing descriptions of language, but that is not enough: explanations are also needed. Columbia School (CS) linguistics is about analyzing invariant meanings that motivate choices in both grammar and lexis. It is about one-form-to-one-meaning mappings, an ideal aim when looking to help students.

Nadav Sabar (2016) analyses the use of some and any. The following borrows heavily from that paper.

Most pedagogical grammars state (formal) rules such as “any is used in negative sentences and not in affirmative statements”. Yet such rules cannot account for why some is used in contexts that are said to call for any. Sabar gives the following attested example:

1) When Yvonne lived in Italy, where it seems like the whole country is married, people always wanted to know about her personal life. I remember her telling me that every time she’d come back from a great vacation, the first question from married friends was, “Did you meet anybody?” It was as if the whole point of going on vacation was to meet someone. That she had a great time and saw something new and interesting didn’t matter. The entire vacation was cancelled or a flop because she didn’t meet someone. (http://www.yvonneandyvettetiquette.com/2008_09_01_archive.html)

Formal accounts could only say that any is also acceptable, as in she didn’t meet anyone, and are unconcerned with why the writer chose some in this case.

Formal accounts use the sentence as the unit of analysis and see meaning as compositional, i.e. the meanings of individual words in a sentence add up to the whole. CS uses signs (pairings of symbol to meaning) as the unit of analysis and sees meaning as instrumental rather than compositional. That is, the individual meanings of signals need not add up to sentence meaning. There is a distinction between linguistic code, which has an invariant meaning (one that always corresponds to a linguistic signal), and interpretation of the code, which is the subjective outcome of messages. Meanings are very sparse in that they do not encode messages but only offer prompts that suggest message elements.

The meaning hypotheses are that some means RESTRICTED and any means UNRESTRICTED.

That is, some as RESTRICTED suggests limits, internal divisions and boundaries, while any as UNRESTRICTED suggests no boundaries, limits or divisions. Note that this does not mean that the domain in question has, in reality, no divisions or boundaries, just that the reality is irrelevant to the message. Also note that in a pedagogical grammar such as Martin Parrott’s (2000) this meaning division between restricted and unrestricted is only described for stressed SOME and ANY.

Sabar uses the following as examples:

2) If you see something, say something. (New York City public safety slogan)
3) No parking any time (street sign)

In 2) some is used because the message suggested is a restriction on the set of things people see and say. The context drives the inference as to the nature of the restriction – suspicious looking things. Any could also have been used but that would not have been as effective a message – any would have suggested no restriction i.e. people should call no matter what they see.

Similarly in 3) any is used because there is no restriction on the domain of times of the day.

So now for 1) we can see that some is used because the message suggests a restriction on the set of people Yvonne did not meet, and the context shows this restriction to be people who may qualify as marriage potential.

Now the interesting corpus linguistics part.

The methodology of CS first involves a qualitative step where some aspect of the sign in question is looked at. So for some, which suggests restriction, we look for another element in the context that suggests the same:

4) Some Feds [Federal workers] are held up as national heroes while others are considered a national joke. (ABC Nightline: Income Tax)

Here others is used to refer to a different subset of people within the domain of Federal workers. This message element is also suggested by some, RESTRICTED. This does not mean there is only one reason for the choice of these forms; rather, this message feature of internal division is one reason, out of many possible reasons, that has motivated the choice of these two forms.

To test this claim more generally we can look at a corpus to see whether others co-occurs with some more often than it co-occurs with any, and whether the difference is larger than chance would predict.

We can do this in COCA by using these search terms:

COCA searches for others:

Favoured                        Disfavoured
some [up to 9 slots] others     any [up to 9 slots] others

The following screenshot shows how to find some [up to 9 slots] others (do similar for any):

To find some with not others see the next screenshot (i.e. use the minus sign -):

And tabulating the data in a contingency table:

         others present      others absent
         N         %         N            %
some     19078     90        8946046      65
any      2022      10        4841946      35
Total    21100     100       13787992     100

p < .0001

The table percentages and significance test support the claim that there is one message feature that motivates the use of both some and others. Note that the meaning hypothesis itself is not directly tested; it is only indirectly tested via the counts in COCA. Sabar goes on to test, both qualitatively and quantitatively, other signals that contribute to the meaning hypotheses of some (RESTRICTED) and any (UNRESTRICTED).
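For anyone who wants to check the arithmetic, here is a minimal sketch of how the column percentages and a significance test could be computed from the four counts above. I am assuming a plain Pearson chi-square test without continuity correction; the post does not say which test produced the reported p-value, and the function names are my own.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Assumption: the paper may have used a different test; this is only a sketch.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

def column_percentages(a, b, c, d):
    """Share of each row within each column, rounded to whole percentages."""
    return (
        (round(100 * a / (a + c)), round(100 * b / (b + d))),
        (round(100 * c / (a + c)), round(100 * d / (b + d))),
    )

# Rows: some, any. Columns: others present, others absent (counts from the table).
some_row = (19078, 8946046)
any_row = (2022, 4841946)

chi2, p = chi_square_2x2(*some_row, *any_row)
percentages = column_percentages(*some_row, *any_row)
print(percentages)
print(f"chi2 = {chi2:.1f}, p < .0001: {p < 0.0001}")
```

With counts this large almost any difference will come out as significant, so the percentages are probably the more informative part of the table.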

I wondered how the singular other would distribute with any and some:

         other present       other absent
         N         %         N            %
any      39244     53        4811937      35
some     35175     47        8930621      65
Total    74419     100       13742558     100

p < .0001

Can we say here that singular other contributes to a message meaning of unrestricted? I have no idea as I have not had time to explore this further!
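One rough way to put the two tables side by side is an odds ratio, which says how much more likely one form is than the other when the word in question is present. This measure is my own addition, not something taken from Sabar's paper, and it only describes the skew in the counts; it does not settle the question of meaning.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for the 2x2 table [[a, b], [c, d]], i.e. (a / b) / (c / d)."""
    return (a * d) / (b * c)

# others: rows are some, any; columns are present, absent (counts from the first table).
or_others = odds_ratio(19078, 8946046, 2022, 4841946)

# other: rows are any, some; columns are present, absent (counts from the second table).
or_other = odds_ratio(39244, 4811937, 35175, 8930621)

print(f"some vs any with others: {or_others:.2f}")
print(f"any vs some with other:  {or_other:.2f}")
```

Both ratios come out above 1, but they favour different forms: others skews towards some, while singular other skews towards any.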

I hope dear reader you forgive the rushed nature of this post but I wanted to get something up before the risk of forgetting this due to holiday haze!

Thanks for indulging.

Update 1:

Thanks to a heads-up from some tweeters: Michael Lewis, in his 1986 book The English Verb, was also pointing to the primacy of meaning.

Update 2:

Nadav Sabar has pointed out that he looked for others in one direction i.e. following some/any whereas I looked at occurrence of others both following and before some/any.
Plus in a new version of his paper a window size of 2 is used instead of 9.
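To make the difference in direction and window size concrete, here is a toy sketch of window-based co-occurrence counting. It is not how COCA works internally; the function name, the whitespace tokenisation and the reading of "window" as "within N tokens" are all my assumptions.

```python
def count_cooccurrences(tokens, node, collocate, window, both_directions=False):
    """Count occurrences of `node` that have `collocate` within `window` tokens.

    By default only the tokens following `node` are searched; with
    both_directions=True the preceding tokens are searched as well.
    """
    hits = 0
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        right = tokens[i + 1 : i + 1 + window]
        left = tokens[max(0, i - window) : i] if both_directions else []
        if collocate in right or collocate in left:
            hits += 1
    return hits

# Example 4) from above, lowercased and split on spaces.
feds = ("some feds are held up as national heroes while others "
        "are considered a national joke").split()
# A made-up sentence where others comes before some.
reversed_order = "others left but some stayed".split()

print(count_cooccurrences(feds, "some", "others", window=9))
print(count_cooccurrences(feds, "some", "others", window=2))
print(count_cooccurrences(reversed_order, "some", "others", window=9))
print(count_cooccurrences(reversed_order, "some", "others", window=9, both_directions=True))
```

In example 4) others is the ninth token after some, so it is caught by a window of 9 but missed by a window of 2. The made-up sentence is only counted when the search runs in both directions.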

References:

Parrott, M. (2000). Grammar for English language teachers: with exercises and a key. Cambridge University Press.

Sabar, N. (2016). Using big data to test meaning hypotheses for any and some. In Otheguy, R., Stern, N., Reid, W. and Ruggles, J. (Eds.) Columbia School linguistics in the 21st century: advances in sign-based linguistics. Amsterdam/Philadelphia: John Benjamins. Retrieved from [https://www.academia.edu/33968803/Using_big_data_to_test_meaning_hypotheses_of_some_and_any]


Chomsky, he’s not the messiah, he’s a very misquoted linguist

Sean Wallis runs a great corpus linguistics blog, so I was intrigued by a recent clickbait post titled Why Chomsky was wrong about Corpus Linguistics. I thought initially he was going to go over the history that has been rightly critiqued by Jacqueline Léon in Claimed and Unclaimed Sources of Corpus Linguistics (pdf). In fact he uses an interview given by Chomsky in 2001. Further, in developing his first point he takes as given Christina Behme’s assertion that Chomsky “acts now as if no data can challenge his own proposals”.

I think Wallis’ article about some major issues in corpus linguistics stands on its own well and does not need the Chomsky angle.

The part Behme quotes comes from Chomsky’s answer to the question “What kind of empirical discovery would lead to the rejection of the strong minimalist thesis?”, namely “All the phenomena of language appear to refute it”. She even emphasises the “All”!

I looked up the fuller quote she uses to make her claim about Chomsky dismissing any data that goes against his theory:

AB&LR: What kind of empirical discovery would lead to the rejection of the strong minimalist thesis?

NC: All the phenomena of language appear to refute it, just as the phenomena of the world appeared to refute the Copernican thesis. The question is whether it is a real refutation. At every stage of every science most phenomena seem to refute it. People talk about Popper’s concept of falsification as if it were a meaningful proposal to get rid of a theory: the scientist tries to find refuting evidence and if refuting evidence is found then the theory is given up. But nothing works like that. If researchers kept to those conditions, we wouldn’t have any theories at all, because every theory, down to basic physics, is refuted by tons of evidence, apparently. So, in this case, what would refute the strong minimalist thesis is anything you look at. The question is, as in all these cases, is there some other way of looking at the apparently refuting phenomena, so as to preserve or preferably enhance explanatory power, where parts of the phenomena fall into place and others turn out to be irrelevant, like most of the phenomena of the world, because they are just the results of the interactions of too many factors?

Chomsky (2002), On Nature and Language, pg. 124

Looking at it, one can clearly see that Chomsky is expounding on the nature of scientific enquiry, not dismissing data that go against his own theories. This pattern of Chomsky critics misquoting him for their own polemics appears often, but I was still surprised that this one was so blatant. I did leave a comment on the Behme post, so I will update this post in the event of a response.

Thanks for reading and remember, Chomsky, he’s not the…ah you get the point.

Update:

Christina Behme has responded; I think she accepts she was misquoting (if it makes me happy). You can read the responses and decide for yourself. Do comment either here or there should you wish to.

References:

Chomsky, N. (2002). On Nature and Language. Cambridge: Cambridge University Press.

What is the ideal title for a talk/poster at IATEFL and TESOL 2015?

Although a lot of criticism can be made of mainstream teaching conferences they are not going away (yet?). As a way to get ready for the IATEFL 2015 conference I thought it interesting to see what kind of titles were most common.

Professional development in the classroom – this is the ideal title if you want to present at IATEFL and TESOL 2015.

Of the 680 titles (including talks, posters, forums) at IATEFL 2015 and of the 1092 titles at TESOL 2015, the top 2-word bundles are:
professional development (16, 15)
in the (26, 42)
the classroom (11, 15)

Furthermore, in + elt/tesol and in a are also common bundles.

If you want to aim only for IATEFL then use the how to bundle in your title, as this tops IATEFL 2015 with 28 instances.

By contrast, if you want to target TESOL then use the strategies for bundle, which has 16 instances.

In addition, TESOL titles prefer language learners (14) while IATEFL titles prefer language learning (10). Do also make sure you give added value for TESOL, since “teaching and” and “and the” are common bundles.

If you want, you can download the IATEFL 2015 and TESOL 2015 titles yourself to explore (there may be some errors in terms of duplicates and/or missing titles). I used AntConc to count the 2-gram bundles. All the bundles were taken from the top 20.
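If you would rather script the count than use AntConc, the following is a minimal sketch of a 2-gram count over a list of titles. The three titles are invented for illustration, and the tokenisation (lowercasing, keeping letters, digits and apostrophes) is my assumption, so the numbers will not necessarily match AntConc's settings.

```python
import re
from collections import Counter

def two_word_bundles(titles):
    """Count 2-word bundles (bigrams) within each title, never across titles."""
    counts = Counter()
    for title in titles:
        words = re.findall(r"[a-z0-9']+", title.lower())
        # Pair each word with the word that follows it in the same title.
        counts.update(zip(words, words[1:]))
    return counts

# Hypothetical titles, not taken from either conference programme.
titles = [
    "Professional development in the classroom",
    "How to teach grammar in the classroom",
    "Strategies for professional development",
]

counts = two_word_bundles(titles)
for bundle, n in counts.most_common(3):
    print(" ".join(bundle), n)
```

Reading the real titles in from a text file, one title per line, and passing them to two_word_bundles should give a frequency list comparable to the one described above.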

One could explore the top 20 keywords to add another perspective, or count titles over the years. A look at other major ELT conferences would also be interesting. Someone may also be interested in making a count of the gender of the presenters.

Thanks for reading.