Corpora Helped Me With Some Language Problems That Dictionaries Could Not

Two weeks ago, one of my students wrote in an essay: “…global warming effects which causes dramatic damages to people at scale.” Putting the grammatical problem with “causes” aside, I was particularly struck by the phrase “at scale”. “Where on earth did he learn this awkward phrase?”, I wondered. As usual, I turned to dictionaries to check whether “at scale” is a legit phrase. By legit, I mean whether users of English actually use this phrase. Unfortunately, the dictionaries led me nowhere.

The next day, when I talked to my student, I told him outright that “at scale” was not a correct phrase. To my astonishment, he showed me a five-minute UNICEF reading text that used that very phrase. I had mixed feelings, to be honest: I was surprised and embarrassed at the same time. Now we have some questions to answer: Do other users of English, besides the UNICEF writer and my student, use “at scale”? In what cases and kinds of text is “at scale” used? Since the dictionaries had not been much help, I needed to look elsewhere.

At this point, some of you may raise a concern: isn’t it the dictionaries that set the standard for language learning? I have asked myself the same question many times. Here is the thing: the question is not well framed to begin with. What is “standard”, and who takes part in this “language learning” process? So here is my refined version: what can we expect from a dictionary? Undoubtedly, a dictionary holds a huge collection of lexical items. Each entry comes with several useful pieces of information, such as part of speech, meaning, examples and, perhaps, collocations and related words. Dictionaries are, in essence, a tool that reflects language rather than one that sets a standard for language use. Dictionary entries normally capture the most commonly used senses of words and phrases. In other words, what we see in a dictionary is not necessarily everything there is. This also explains why dictionaries are constantly updated: language changes. For your information, the meanings of “men” and “women” in the Cambridge Dictionary have recently been updated. Anyway, when dictionaries cannot help us solve a language problem, we need a tool that lets us look further into how language is used.

After considering possible solutions for a while, I decided it was time for some corpus exploration. Long story short, a corpus is a collection of texts in written and spoken form. In other words, a corpus is a source of language data. The purpose of a corpus is to let us find out how language is actually used. Corpora (the plural form) may give users different types of texts from various time frames, depending on how they are constructed and what sources are included. The website that has helped me most so far offers some of the “most widely used online corpora” (according to the site’s description).

Have a look at what I found for the string “at scale” in the iWeb corpus. These are the first 10 examples (or, technically, concordance lines) among the 13,479 instances where people use the phrase “at scale”. The iWeb corpus contains 14 billion words from 22 million web pages and can be accessed online. So the first question (do other users of English, besides the UNICEF writer and my student, use “at scale”?) has been answered: YES.

Part of the second question has also been answered: this phrase is found on web pages, in 2017. However, I believe more specific contexts should be examined. The second picture below shows the expanded context for the concordance line in line 1 of the first picture. Using the co-text of the sentence, “faced with increasing demand for our courses”, we can deduce that “at scale” is used here with a meaning similar to the one in the UNICEF material and my student’s writing: extensively, to many people, or on a large scale.
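If you are curious what a concordance search does behind the scenes, here is a minimal sketch in Python. It is only my own illustration, not how iWeb or any real corpus tool is implemented: the function name concordance, the width parameter and the three sample sentences are all made up for this post. The idea is simply to find every occurrence of a phrase and keep a few characters of context on each side, which is roughly what a concordance line gives us.

```python
import re

def concordance(text, phrase, width=30):
    """Return (left context, match, right context) for every match of `phrase`."""
    # Match the phrase as whole words, ignoring case.
    pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        # Keep up to `width` characters on each side of the match.
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        lines.append((left, match.group(0), right))
    return lines

# Toy sample invented for illustration; not taken from any real corpus.
sample = (
    "Faced with growing demand, the team wanted to deliver courses at scale. "
    "Small pilots are easy, but running them At Scale is much harder. "
    "The map was drawn at a scale of one to fifty."
)

hits = concordance(sample, "at scale")
for left, keyword, right in hits:
    # Right-align the left context so the keyword lines up in one column.
    print(f"{left:>30} [{keyword}] {right}")
print(len(hits), "matches")
```

Note that the third sentence (“at a scale of”) is not counted, because the search looks for the exact string. A real corpus holds billions of words and adds metadata such as genre and year, but the basic look-up works on the same principle.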

Let’s switch genre for a moment and look at another example, this time from TV shows. The picture below shows an example from The TV Corpus. I reached the same conclusion about the meaning of “at scale” as when I examined the examples from The iWeb Corpus.

These are just some of the examples of “at scale” that I found in the two corpora mentioned. I have only shown you the interesting parts: the “List” view in the search tool and the “expanded context” of a data record. There are more and, of course, very diverse results in other corpora. The texts collected in each corpus really do affect the results of our search queries. That said, we have got answers to the questions about “at scale” raised at the beginning, and we now know that my student was justified in using that phrase. Problem solved! Bravo!

Within the scope of this post, I have only focused on using corpora to find more examples of a language item. Surely, there are more intriguing things to do with corpora. I’ll come back soon with another post on those. Until next time, enjoy exploring!

P.S.: Much of my interest was sparked by a post from my lecturer in Contrastive Linguistics and Discourse Analysis. If you have stayed until this line, thank you so much, Hoa Ninh!

