Quick cup of COCA – compound words

A new quick cup of COCA post, whayhay. Thanks to Mike Harrison (@harrisonmike) on Twitter, who was asking about finding compound adjectives.

Here we can use the wildcard asterisk together with a part-of-speech tag.

So say we were looking for adjectives starting with well-, we could use [well-*].[j*], which gives the following top ten results –

(click on words to see full search results)

To find all compound adjectives, we would simply replace the first part of the compound with another wildcard asterisk, like so:

[*-*].[j*]
which gives us the following top 10 results:

(click on words to see full search results)

Similarly, if you were looking for noun, adverb or verb compounds, simply use the appropriate POS tag instead, i.e. [n*], [r*] and [v*] respectively.

Note: do double-check the results in the concordance lines, as sometimes the POS tagging is off.
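To make the wildcard-plus-POS idea concrete, here is a minimal Python sketch of how a word pattern and a tag pattern combine. It is not how COCA works internally: the `search` function is a hypothetical helper, the tiny tagged list is invented for illustration, and the tags are hand-assigned approximations of the tag style COCA uses.

```python
from fnmatch import fnmatchcase

# Hypothetical hand-tagged sample, invented for illustration only.
# The tags imitate the corpus tag style (j* adjective, n* noun, r* adverb).
TAGGED = [
    ("well-known", "jj"),
    ("well-being", "nn1"),
    ("long-term", "jj"),
    ("so-called", "jj"),
    ("well", "rr"),
    ("check-in", "nn1"),
]


def search(word_pattern, tag_pattern, tokens=TAGGED):
    """Return words whose form AND tag both match shell-style wildcards."""
    return [
        word
        for word, tag in tokens
        if fnmatchcase(word.lower(), word_pattern)
        and fnmatchcase(tag, tag_pattern)
    ]


print(search("well-*", "j*"))  # ['well-known']
print(search("*-*", "j*"))     # ['well-known', 'long-term', 'so-called']
print(search("*-*", "n*"))     # ['well-being', 'check-in']
```

The point of the sketch is that the word pattern and the tag pattern are checked independently, so swapping the tag pattern is all it takes to switch from adjective compounds to noun compounds.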

As an interesting aside, a search for compound adjectives over time in COHA gives us a very nice ascending curve. I wonder what the significance of that is?

Compound adjectives over time in COHA (click on graph)

Finally, do check out the previous quick cup of COCA posts if you want help with searching in COCA.

Quick cup of COCA – lemma and POS

I was reading the following which is part of a forum discussion by a French poster:

This is clearly more complicated to port, but the benefit can be very important,

OpenPandora Boards comment

It caught my attention as I am interested in the uses French speakers of English make of the word important (e.g. see here). Often they use it instead of an appropriate size adjective, so in this case the forum poster could have written – the benefit can be large.

However, the construction still sounded a little odd to me, so I used COCA to look at the collocates of the lemma benefit as a noun: [benefit].[n*]. A lemma covers all forms of a word and is indicated by square brackets. The part of speech can be selected from the POS (part of speech) List drop-down box. To use a POS tag like this, you append it with a dot (full stop) to the word you are looking at.

From the results of this search, the rank 6 collocate is potential. Of course! Duh! That’s why “the benefit can be” sounds odd, whereas “the potential benefits are” would sound better.

Now you may be saying I did not need COCA to figure that out. Sure, I could have mulled it over for the morning, but COCA allowed me to get on with other trivial things rather than puzzling over this particular one. :)

That’s it for another quick cup of COCA. And if you haven’t already you can read some more quick cup of COCA posts.

Quick cup of COCA – quantifier/determiner + preposition + relative pronoun

As part of teaching relative clauses, getting good examples of a structure such as one of which, many of which, some of whom (i.e. quantifier/determiner + preposition + relative pronoun) had always been a bit tricky. Recently I used COCA to help me find some useful sentences.

The appropriate search term is [mc*]|[d*] of which|whom|whose|who.

[mc*] is the tag for cardinal numbers

[d*] is the tag for determiners

| is the syntax for the OR operator

See here for the full list of the parts of speech tags, but usually the POS (part of speech) list drop down box is sufficient. And see here for info on the search syntax.
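For readers who like to see the logic spelled out, the query can be read as three slots, each with its own alternatives. The Python sketch below mimics that reading on a few hand-tagged tokens. It is only an illustration under stated assumptions: `slot_matches` and `find` are hypothetical helpers, the tags are my own approximations of the tag style, and none of this is COCA's actual implementation.

```python
from fnmatch import fnmatchcase

# Hypothetical hand-tagged tokens (tags are approximations for illustration).
TOKENS = [
    ("several", "da2"), ("directions", "nn2"), ("for", "if"),
    ("further", "jjr"), ("work", "nn1"), (",", ","),
    ("some", "dd"), ("of", "io"), ("which", "ddq"),
    ("we", "ppis2"), ("have", "vh0"),
    ("two", "mc"), ("of", "io"), ("whom", "pnqo"), ("left", "vvd"),
]


def slot_matches(token, slot):
    """A slot is a list of alternatives: '[x*]' tests the tag, else the word."""
    word, tag = token
    for alt in slot:
        if alt.startswith("[") and alt.endswith("]"):
            if fnmatchcase(tag, alt[1:-1]):
                return True
        elif word.lower() == alt:
            return True
    return False


def find(tokens, query):
    """Return word sequences where each consecutive token fits its slot."""
    slots = [part.split("|") for part in query.split()]
    n = len(slots)
    hits = []
    for i in range(len(tokens) - n + 1):
        window = tokens[i:i + n]
        if all(slot_matches(t, s) for t, s in zip(window, slots)):
            hits.append(" ".join(word for word, _ in window))
    return hits


print(find(TOKENS, "[mc*]|[d*] of which|whom|whose|who"))
# ['some of which', 'two of whom']
```

Note that "several directions for" is rejected even though "several" fits the first slot, because every slot has to match in sequence.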


Click on above image to see results.

The results show that all of which, many of whom, and some of which are the top three.

Some of the interesting examples in the academic register are:

1. This study suggests several directions for further work, some of which we have already begun to investigate.

2. Bottlenecks at the Internet’s edge can easily move between the wireless access (when its bandwidth is low) and the provider’s uplink, both of which can have highly variable bandwidths.

3. In its next generation of development, the Internet could make its way onto a wider range of instruments, all of which will offer viewers far sharper images, a much quicker connection, and a more reliable service than at present.

There are plenty of other avenues to explore here but that would not be a quick cup of COCA :).

Quick cup of COCA – bring to * boil


Click on image above to go to result screen.

A short post to show another example of the power of the wildcard asterisk. The image shows the results of comparing /bring to * boil/ in the American COCA and the British BNC.

A tweet by @AnneHendler asked /In British English, is it considered more acceptable to say “Bring to the boil” or “bring to a boil”?/

@Marie_Sanako replied /I would always say ‘bring to the boil’./ Both @cgoodey (/I’d bring something to THE boil too!/) and @GemL1 (/”bring to the boil” is what I would say./) agreed, whilst @michaelegriffin added /if it’s about cooking I can only imagine myself or other am eng users sayin “a boil”. If metaphoric I dont know./.
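If you want to try the same kind of single-slot wildcard on your own texts, a rough Python equivalent is a regular expression with one captured word, tallied with a Counter. This is only a sketch: the sample sentences are invented, `slot_fillers` is a hypothetical helper, and I am assuming the asterisk stands for exactly one word in the slot.

```python
import re
from collections import Counter

# Invented sample sentences, for illustration only (not corpus data).
SAMPLE = """Bring to a boil, then reduce the heat.
Bring to the boil and simmer for ten minutes.
Bring the stock to a boil.
bring to a rolling boil"""

# One captured word stands in for the * slot in "bring to * boil".
PATTERN = re.compile(r"\bbring to (\w+) boil\b", re.IGNORECASE)


def slot_fillers(text):
    """Count which single words fill the wildcard slot."""
    return Counter(m.group(1).lower() for m in PATTERN.finditer(text))


print(slot_fillers(SAMPLE))
```

On this sample, "a" and "the" are each counted once. The last two sentences are skipped because the slot only accepts one word directly between "to" and "boil".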


I was reminded recently by a Twitter exchange with @rosemerebard that this online workshop, BYU: BNC & COCA, is a great primer on using a corpus.

One of the things I hope is clear with these Quick cup of COCA posts is that having a clearly stated problem/question will facilitate the search process.

Quick cup of COCA – synonym brackets equals

Click image to see results of search.

The class was looking at phrasal verbs and one of the example sentences had the word “chimney”. Someone asked what the synonyms of chimney are. I was stumped, but a quick cuppa COCA came to the rescue. In order from most to least frequent, COCA said: pipe, stack, chimney, funnel, conduit, flue, smokestack. From this list I suggested smokestack and flue. A student added helpfully that stack is the word used in industry.

Bonus points to readers – can you guess the phrasal verb used in the example sentence with “chimney”?

Quick cup of COCA – wildcard asterisk

Click image to see results of search.

This may or may not turn into a series of short posts on my experience of trying to use the COCA corpus in class. Inspired by the comments to this post by Kevin Stein/@kevchanwow.

A student in my TOEIC class, when we were looking at adjective endings -ED and -ING, asked what the difference was between “unmotivated” and “demotivated”. I replied that demotivated describes someone after some experience, whereas unmotivated is a general state of being. I wasn’t too sure if that was sufficient, so whilst the class was engaged in the following part of the lesson I used the wildcard asterisk (see image above).

I found out that the instances of “demotivated” were pretty low compared to “unmotivated”. I only passed on the frequency information to the student. If I had had more time and a projector hooked up to the computer, I would have looked at the example sentences in each case.

Do let me know of any searches you have done in class.