Counting countability – EFCAMDAT

This is a short follow-up to my previous post on using the EFCAMDAT learner corpus. I read an interesting paper on countability in World Englishes (HT Pascual Pérez-Paredes @perezparedes) and thought it would be worth looking at the mass nouns the study used in the French learner sub-corpus of EFCAMDAT that I had downloaded.

So, using AntConc, I counted the total uses of each mass noun and the number of (mis)uses of it as a count noun. The list below shows the percentage of uses in which each mass noun was treated as countable, i.e. in plural form or with a/an in front of the noun:

  1. Luggage 30%
  2. Information 28%
  3. Software 27%
  4. Evidence 26%
  5. Baggage 25%
  6. Advice 24%
  7. Homework 23%
  8. Knowledge 15%
  9. Research 12%
  10. Furniture 10%
  11. Violence 10%
  12. Feedback 9%
  13. Equipment 7%
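
For anyone who would rather script this than click through a concordancer, here is a rough Python sketch of the same kind of count. It is not what AntConc does internally and not the method used for the figures above: the function name `countable_use_rate`, the sample text and the simple "add -s or -es" plural rule are all my own assumptions for illustration. It counts a use as countable when the noun appears in plural form or directly after a/an.

```python
import re

def countable_use_rate(text, noun):
    """Return (total uses, countable uses, percentage) for one mass noun.

    Hypothetical helper: a use is 'countable' if the noun is pluralised
    (noun + s / es) or is immediately preceded by 'a' or 'an'.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    plural_forms = {noun + "s", noun + "es"}
    total = 0
    countable = 0
    for i, tok in enumerate(tokens):
        if tok == noun:
            total += 1
            # singular form with an indefinite article directly before it
            if i > 0 and tokens[i - 1] in ("a", "an"):
                countable += 1
        elif tok in plural_forms:
            total += 1
            countable += 1
    pct = 100 * countable / total if total else 0.0
    return total, countable, pct

# Invented sample sentences, not taken from EFCAMDAT
sample = (
    "I lost my luggages. She gave me an advice. "
    "The luggage is heavy. Good advice helps. I need a luggage."
)

for noun in ("luggage", "advice"):
    total, countable, pct = countable_use_rate(sample, noun)
    print(f"{noun}: {countable}/{total} countable uses ({pct:.0f}%)")
```

A count like this over-reports in places: "a luggage rack" or the verb "researches" would be flagged, so the hits still need checking by eye in the concordance lines.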

The frequency of these words relative to the whole French-nationality sub-corpus ranges from 343 hits per million (information) to 68 hits per million (equipment).
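
The per-million figures are just raw hits scaled by corpus size. A minimal sketch of that arithmetic follows; the function name `per_million` and the hit and token counts in the example are made up for illustration, since the actual sub-corpus size is not given here.

```python
def per_million(hits, corpus_tokens):
    """Normalise a raw hit count to hits per million tokens."""
    if corpus_tokens <= 0:
        raise ValueError("corpus_tokens must be positive")
    return hits * 1_000_000 / corpus_tokens

# Hypothetical numbers: 686 hits in a 2-million-token sub-corpus
print(per_million(686, 2_000_000))
```

Normalising like this is what makes counts comparable across sub-corpora of different sizes.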

So in the classroom I could test use of luggage (baggage), information, software, evidence, advice and homework.

NB1: The study on World Englishes is critical of the overemphasis in language teaching on such mass nouns and argues that, in terms of mutual intelligibility, it makes little difference whether they are used correctly or not. In this sample of French learners the percentages are quite high (compared to the study), so it seems worth spending time on them.

NB2: underwear had the highest percentage, 100%, but that is because the sole instance was a (mis)use, i.e. 1 hit of underwears. 🙂

Thanks for reading.


3 thoughts on “Counting countability – EFCAMDAT”

  1. Hi, thank you for your informative and inspiring posts!! I’ve been tinkering with the sub-corpus of Polish learners. It’s somewhat smaller – for example, the frequency of “information” is only 4 hits per 10,000 words…Not enough to draw any conclusions. But, based on my classroom observations these are very persistent mistakes, really difficult to eradicate 🙂
    Instead, I looked at the Key Word List to find out what some of the unusually (in)frequent words might be. It’s not quite clear – I don’t think it’s just the problem of interference from Polish only, but it’s also the question of level – over half as many words are from the preliminary levels than vantage or higher. Predictably perhaps, among those unusually frequent words I noticed some rather vague adjectives (good, nice, amazing, exciting), so perhaps with lower level students I could use these examples (selected concordance lines) and ask the students to look for some alternatives, expanding their vocabulary a little.
    Another idea, not very original really, will be to select concordance lines for various error correction tasks. It’ll be quick and easy with this sub-corpus. I often do something similar with my students’ essays, using them as a sort of mini-corpus – on paper. Here I’ll choose some (mis)used function words from the sub-corpus, select concordance lines and use them – even directly – in class, getting students to notice some useful chunks.
    What I found “very Polish” about this sub-corpus is the top position of the definite article “the” among the unusually infrequent words (negative key words) with the keyness of 1163.269. I’m not sure at all how significant this number really is, but it seems sky-high by comparison to the other words on the list. The thing is Polish has no definite or indefinite article, so using articles correctly is a problem for us and I guess this is reflected here – perhaps?
    All in all, seems an interesting and useful tool. I’ll surely continue to explore it.

  2. hi monika

    that’s some interesting info you have found about the Polish sub-corpus; as you say, even though the normalised frequencies are low and we can’t generalise too much from them, combined with what we know of our students we can draw some reasonable conclusions.

    i have very recently started a google plus group for people interested in using such tools and resources in the classroom; your experiences would be of great interest to the group, please consider joining 🙂


    1. Hi mura
      yes, I know very, very little about corpus linguistics, but – as you say – what I see in the classroom and what I see in the corpus are kind of complementary, and together they make sense.
      I’ve just joined your group (I hope 🙂) – thank you for inviting me 🙂

