Ginseng English (@ginsenglish) posted a poll on Twitter asking:
Can you guess which form makes up over 50% of spoken English?
Guess and retweet! Answer in a BIG new grammar blog post soon! #ginsenglish
— Ginseng English (@ginsenglish) October 19, 2017
This is a good exercise to try on the new Spoken BNC2014 corpus. See instructions to get access to the corpus.
You need to get your head around the part-of-speech (POS) tags. The BNC2014 uses the CLAWS 6 tagset. For the simple past we can use the past tense of lexical verbs and the past tense of DO. Using the past tenses of BE and HAVE would also pull in their uses as auxiliary verbs, which we don't want. Figuring out how to filter those out could be a neat future exercise. Another time! On to this post.
Simple past:
[pos="VVD|VDD"]
pos = part of speech
VVD = past tense of lexical(main) verbs
VDD = past tense of DO
| = acts like an OR operator
So the above looks for parts of speech tagged as either past tense of lexical verbs or past tense of DO.
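To see what the | operator is doing, here is a minimal Python sketch that applies the same pattern to a handful of tokens. The tokens and their tags are hand-assigned for illustration (not real corpus output), and match_pos is a made-up helper name. It assumes the query tool matches the pattern against the whole tag, which is why fullmatch is used.

```python
import re

# Hypothetical hand-tagged tokens (word, CLAWS-style tag); not real corpus output.
tokens = [
    ("I", "PPIS1"), ("went", "VVD"), ("home", "RL"),
    ("and", "CC"), ("did", "VDD"), ("nothing", "PN1"),
    ("she", "PPHS1"), ("works", "VVZ"), ("late", "RR"),
]

def match_pos(tokens, pattern):
    """Return the words whose whole tag matches the pattern."""
    compiled = re.compile(pattern)
    return [word for word, tag in tokens if compiled.fullmatch(tag)]

# VVD|VDD: past tense of a lexical verb OR past tense of DO.
simple_past = match_pos(tokens, "VVD|VDD")
print(simple_past)
```

Only went and did come back; works is tagged VVZ, so the simple past pattern skips it.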
Simple present
The search term for the simple present is also relatively simple:
[pos="VVZ"]
VVZ = -s form of lexical verb (e.g. gives, works)
Note that the above captures only third person singular forms. How can we also catch first and second person forms?
Present perfect
[pos = "VH0|VHZ"] [pos = "R.*|MD|XX" & pos != "RL"]{0,4} [pos = "AT.*|APPGE"]? [pos = "JJ.*|N.*"]? [pos = "PPH1|PP.*S.*|PPY|NP.*|D.*|NN.*"]{0,2} [pos = "R.*|MD|XX"]{0,4} [pos = "V.*N"]
The present perfect search may seem daunting, but don't worry: the structure is fairly simple. The first term [pos = "VH0|VHZ"] looks for the present tense forms of HAVE (have, has), and the last term [pos = "V.*N"] looks for past participles: VVN for lexical verbs, plus the participles of BE, DO and HAVE (been, done, had).
The other terms look for optional adverbs and noun phrases that may come in between, namely:
“adverbs (e.g. quite, recently), negatives (not, n’t) or multiword adverbials (e.g. of course, in general); and noun phrases: pronouns or simple NPs consisting of optional premodifiers (such as determiners, adjectives) and nouns. These typically occur in the inverted word order of interrogative utterances (Has he arrived? Have the children eaten yet?)” – Hundt & Smith (2009).
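If it helps to see the slots laid out one by one, here is a rough Python sketch that rebuilds the same sequence as an ordinary regular expression over a string of tags. This is only my approximation of how the query behaves, not how the corpus tool works internally. The constant names, the count_present_perfect helper and the hand-assigned tags in the examples are all made up for illustration.

```python
import re

# Each token is written as "TAG " so a single regex can mirror the token-level query.
HAVE = r"(?:VH0|VHZ) "                                   # have, has
ADV_NOT_RL = r"(?:(?!RL )(?:R\S*|MD|XX) ){0,4}"          # adverbs/negatives, but not RL
DET = r"(?:(?:AT\S*|APPGE) )?"                           # optional article or possessive
ADJ_OR_NOUN = r"(?:(?:JJ\S*|N\S*) )?"                    # optional adjective or noun
PRON_OR_NP = r"(?:(?:PPH1|PP\S*S\S*|PPY|NP\S*|D\S*|NN\S*) ){0,2}"
ADV = r"(?:(?:R\S*|MD|XX) ){0,4}"                        # more optional adverbs/negatives
PARTICIPLE = r"V\S*N "                                   # VVN, VBN, VDN, VHN

# (?<!\S) makes sure the match starts at the beginning of a tag.
PRESENT_PERFECT = re.compile(
    r"(?<!\S)" + HAVE + ADV_NOT_RL + DET + ADJ_OR_NOUN + PRON_OR_NP + ADV + PARTICIPLE
)

def count_present_perfect(tags):
    """Count non-overlapping matches in a list of POS tags."""
    tag_string = "".join(tag + " " for tag in tags)
    return len(PRESENT_PERFECT.findall(tag_string))

# Hand-assigned tags for illustration only.
examples = {
    "I have n't seen it": ["PPIS1", "VH0", "XX", "VVN", "PPH1"],
    "Has he arrived": ["VHZ", "PPHS1", "VVN"],
    "Have the children eaten yet": ["VH0", "AT", "NN2", "VVN", "RT"],
    "I have a car": ["PPIS1", "VH0", "AT1", "NN1"],
}
for sentence, tags in examples.items():
    print(sentence, "->", count_present_perfect(tags))
```

The last example has HAVE as a main verb with no participle after it, so it should not count as a present perfect.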
Present progressive
[pos = "VBD.*|VBM|VBR|VBZ"] [pos = "R.*|MD|XX" & pos != "RL"]{0,4} [pos = "AT.*|APPGE"]? [pos = "JJ.*|N.*"]? [pos = "PPH1|PP.*S.*|PPY|NP.*|D.*|NN.*"]{0,2} [pos = "R.*|MD|XX"]{0,4} [pos = "VVG"]
This has a similar structure to the present perfect search. The first term [pos = "VBD.*|VBM|VBR|VBZ"] looks for past and present forms of BE, and the last term [pos = "VVG"] for -ing participles of lexical verbs. The terms in between are for optional adverbs, negatives and noun phrases. Because the first term includes VBD.* (was, were), the search also returns past progressives; drop VBD.* if you want present progressives only.
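A quick way to check what the first term lets through is to run it over the BE tags. The sketch below lists the BE tags as I understand them from the CLAWS tagset documentation (do check them against the tagset page), and the variable names are my own.

```python
import re

# BE tags with an example form each; listed from memory of the CLAWS tagset, so verify.
BE_TAGS = {
    "VB0": "be (base form)", "VBDR": "were", "VBDZ": "was", "VBG": "being",
    "VBI": "be (infinitive)", "VBM": "am", "VBN": "been", "VBR": "are", "VBZ": "is",
}

first_term = re.compile("VBD.*|VBM|VBR|VBZ")

# Tags accepted by the first term of the progressive search.
matched = [tag for tag in BE_TAGS if first_term.fullmatch(tag)]
# Of those, the ones that come in through the VBD.* wildcard.
past_forms = [tag for tag in matched if tag.startswith("VBD")]

for tag in matched:
    print(tag, BE_TAGS[tag])
```

The wildcard VBD.* is what brings in was and were, while be, being and been are left out.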
Note that all these searches are approximate – manual checking will be needed for more accuracy.
So can you predict the order of these forms? Let me know in the comments what results you get from these search terms, in frequency per million.
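If your results come back as raw hit counts, frequency per million is just hits divided by corpus size, times one million. A small sketch follows; the hit counts and the corpus size are placeholder numbers I made up to show the arithmetic, not real results from the corpus.

```python
def per_million(hits, corpus_tokens):
    """Normalise a raw hit count to a frequency per million tokens."""
    if corpus_tokens <= 0:
        raise ValueError("corpus_tokens must be positive")
    return hits * 1_000_000 / corpus_tokens

# Placeholder numbers for illustration only; substitute your own search results.
corpus_tokens = 10_000_000
hits_by_form = {"form A": 25_000, "form B": 5_000, "form C": 40_000}

frequencies = {form: per_million(hits, corpus_tokens) for form, hits in hits_by_form.items()}
# Rank the forms from most to least frequent.
ranking = sorted(frequencies, key=frequencies.get, reverse=True)

for form in ranking:
    print(form, frequencies[form])
```

Normalising like this lets you compare your numbers with counts from corpora of other sizes.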
Thanks for reading.
Other search terms in spoken BNC2014 corpus.
Update:
Ginseng English has blogged about the frequencies of forms found in one study. Do note that English has 6 inflectional categories (infinitive, first and second person present, third person singular present, progressive, past tense, and past participle), so the opportunities to use the simple present form are greater because two of those categories are present tense.
References:
Hundt, M., & Smith, N. (2009). The present perfect in British and American English: Has there been any change, recently? ICAME Journal, 33(1), 45-64. (pdf) Available from http://clu.uni.no/icame/ij33/ij33-45-64.pdf