Problems with CEFR

Let’s play a game. Which word doesn’t belong?

  • solecism
  • ambergris
  • brobdingnagian
  • bully

How about this list?

  • accountability
  • assassination
  • atrocity
  • bat

If you picked ‘bully’ and ‘bat’, you are just like most people. But believe it or not, all of the words on the first list are CEFR C2 level words, and all of the words on the second are C1 level!

While it is true that some sources list words like bat and bully earlier, it seems that CEFR itself is somewhat misunderstood as a concept. Its stated goal is ‘…to provide a standardized way to describe language proficiency, which helps language professionals create consistent syllabuses, curriculum guidelines, and examinations across different countries.’ In reality, it is a kind of ongoing academic study of which words appear in English texts by frequency. So it isn’t really a useful way to decide which words to teach first. It’s more like a survey of words in popular media. Since words like ‘bat’ are uncommon in general speech and writing, they rank very low on the list, and so land in a high level, despite being extremely easy to understand. Simply put, CEFR does not align with the natural order hypothesis. Ironically, the idea that it could tell you which words were ‘easier’ is the reason why it got popular. Well, it’s time to wake up! CEFR isn’t about that at all.

The solution is a “new CEFR” with a different mandate and a different modus operandi. The idea is simple. Go back to those lists above: it is very clear that in the first list, the word ‘bully’ is easier than ‘solecism’, or even ‘brobdingnagian’ (a word I have never encountered in my entire life). It is just as clear that in the second list, the word ‘bat’ is easier than ‘accountability’ or ‘atrocity’.

Therefore, we have already established a primary means of ranking: what does an experienced English speaker believe is more or less common? To find out, we simply take two words and compare them. A weight is then added to these words, with one word placed on the left and the other on the right, as on a set of scales.
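One possible way to turn such pairwise judgements into weights is an Elo-style update, the kind of score mentioned later in this post. What follows is only a sketch under stated assumptions: the starting rating of 1000, the K-factor of 32, and the function names are borrowed from chess conventions or made up for illustration, and the three judgements fed in are the ones this post already takes as obvious.

```python
def expected(rating_a: float, rating_b: float) -> float:
    """Elo expectation that word A is judged harder than word B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def record_judgement(ratings: dict, harder: str, easier: str, k: float = 32.0) -> None:
    """Update both ratings after a speaker says `harder` is harder than `easier`.

    Unseen words start at 1000 (an assumed default, as in chess Elo).
    """
    rating_hard = ratings.setdefault(harder, 1000.0)
    rating_easy = ratings.setdefault(easier, 1000.0)
    # The more surprising the judgement, the larger the shift.
    shift = k * (1.0 - expected(rating_hard, rating_easy))
    ratings[harder] = rating_hard + shift
    ratings[easier] = rating_easy - shift


ratings = {}
record_judgement(ratings, harder="solecism", easier="bully")
record_judgement(ratings, harder="accountability", easier="bat")
record_judgement(ratings, harder="solecism", easier="bat")

# Lowest rating first, i.e. easiest word first.
ranked = sorted(ratings, key=ratings.get)
print(ranked)
```

A higher rating here means "judged harder", so sorting by rating gives an easiest-first order. With only a handful of comparisons the numbers mean very little; the point is that every comparison nudges both words, so the ordering can be refined one judgement at a time.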

Another idea is to group words by the general time at which someone is expected to know them. For example, there are ‘common animal words’, which are essentially common pets and common zoo animals. These words will be dominated by animals that appear commonly in the home, on TV in cartoons, or commonly on farms. Dog, cat, fish, mouse, and then likely words like elephant, horse, cow, pig, chicken, etc. (especially chicken, since it is also a food word). However, even here we see that ‘dog’ and ‘cat’ somehow seem easier than ‘elephant’. What to do?

What is needed, then, is a new score! We have CEFR, but maybe we also need NOGL. Ahh, yes, noggles! You have ‘sefir’ (CEFR) and now you have noggles (NOGL)! It stands for Natural Order Graded Level. The idea is that words will be ordered by where they are expected to fall under the natural order hypothesis.

Note that this will be very heavily weighted by words that appear in school textbooks, so to a lesser extent NOGL will be influenced by lists like CEFR, since they may be used to construct textbooks for children. Nevertheless, the idea is that children will learn some words more easily or faster than others, so even within a level like A1 there may be a separate ‘easy A1’ and ‘the difficult half of A1’. Given CEFR, I would bet money that several of the easy A2 words would be easier than the difficult A1 words! This problem is solved with NOGL.

Another example where this will have a direct practical application is lists like the JLPT N5 or the MoE’s “800 words for children” or “2,000 words for high school” and the like. There’s going to be a separation in these lists where some words will generally be taught before other words, and this creates an expectation that they will be known. But children will also pick up words on their own naturally. NOGL must reflect both realities in order to be useful, while departing from such academic extravagances just enough to allow users to lean into the natural order hypothesis to supercharge their English teaching ability.

In general, for a case-study game like a spelling bee or a multiple-choice flashcard test, a NOGL-based score will give a better grading of what a student will actually, practically know than CEFR does. And since NOGL will align with what people actually know, it will be the most efficient way to find which words to study next.

Towards a strong definition of NOGL

Let’s be a bit specific here. This might not be the final idea, but it’s a shot in that direction.

  1. Take a list of words. Attempt to organize them into two lists, ‘easy’ and ‘difficult’, thus creating two lists from one. It does not really matter how you do this; you could compare two words at random, but it would probably work better to scan for the easiest words, move those, then scan for the most difficult words and move those, and repeat. You may even want to create three or four or five lists at the same time like this, but follow the KISS principle; keep it simple here.
  2. Repeat the process on the sub-lists until you have about 10 groups of words.
  3. If you did 3 rounds of simple comparison, you would have 8 groups of words. Four rounds gives 16 groups, which is probably accurate enough to grade the entire language. It’s also probably too many groups for practical use. Seven groups is ABCDEFG, which is already more than CEFR (6) or JLPT (5). You can also have an A1/A2 designation within the groups for 14 levels, even though each group is considered a unified thing. The A1/A2 is just which ones come first, i.e. an ‘approach rating’. You could even have A3, A4, A5, etc., which signify a word’s order.
  4. Each NOGL-rated word has a kind of ‘Elo’ score for comparing against other words, and the entire list is then split into 7 groups.
  5. But if we had 10 groups (NOGL-1, NOGL-2, etc.) we could call them N1, N2, N3, etc. So the NOGL is its number, and then we need a way to signify its group, or room system.
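The repeated easy/difficult split in the steps above can be sketched roughly as follows. Everything here is an assumption made for illustration: the function names are made up, the split is a plain halving, and word length stands in for human judgement only because it is deterministic. A real run would plug in a person’s comparisons or a rating instead.

```python
from string import ascii_uppercase


def split_easy_difficult(words, difficulty):
    """Split one list into an 'easy' half and a 'difficult' half.

    `difficulty` is a placeholder for human judgement: any function
    that returns a larger number for a harder word.
    """
    ordered = sorted(words, key=difficulty)
    middle = (len(ordered) + 1) // 2
    return ordered[:middle], ordered[middle:]


def grade(words, difficulty, rounds):
    """Apply the split `rounds` times, giving up to 2**rounds groups."""
    groups = [list(words)]
    for _ in range(rounds):
        next_groups = []
        for group in groups:
            easy, difficult = split_easy_difficult(group, difficulty)
            # Keep only non-empty halves so tiny lists do not create empty groups.
            next_groups.extend(g for g in (easy, difficult) if g)
        groups = next_groups
    return groups


words = ["dog", "cat", "bat", "bully", "atrocity", "ambergris",
         "accountability", "brobdingnagian"]

# Word length is NOT a real difficulty measure; it is only a stand-in.
eight_groups = grade(words, len, rounds=3)
four_groups = grade(words, len, rounds=2)

# Letter = group, number = order within the group (the 'approach rating').
labels = {
    word: f"{ascii_uppercase[g]}{i + 1}"
    for g, group in enumerate(four_groups)
    for i, word in enumerate(group)
}
print(len(eight_groups), labels)
```

Three rounds on these eight words give eight single-word groups, and two rounds give four groups of two, labelled A1, A2, B1, B2 and so on. Getting exactly 7 or 10 groups would need an uneven split somewhere, which this sketch does not attempt.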

So ultimately NOGL is intended to approximate n from input theory: if you need n+1, you first have to ask what n is. Then we can place words into groups under the NOGL banner.

NOGL grading is unscientific, but here’s an idea. Ungraded words all have a value of 1 (but they’re ungraded, so this isn’t shown). When a word enters the icon lex, it is compared against some words from the corpus. The easiest one … All words are assigned a NOGL number based on their CEFR level (to start). Then, words within each CEFR level are compared, and the more difficult word changes its value to be at least one higher than the value of the word (card) it was compared against. Methods like these may allow us to quickly come up with a usable version of “NOGL”.
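As a rough illustration of the seeding and bumping idea, here is a minimal sketch. The seed values (10 per CEFR level) and the function names are assumptions chosen only to leave room between levels; the comparisons used are the ‘bat’ judgements this post already makes.

```python
# Assumed seed values: 10 points per CEFR level, leaving gaps for later bumps.
CEFR_SEED = {"A1": 10, "A2": 20, "B1": 30, "B2": 40, "C1": 50, "C2": 60}


def seed(words_by_level):
    """Give every word a starting NOGL value taken from its CEFR level."""
    return {
        word: CEFR_SEED[level]
        for level, words in words_by_level.items()
        for word in words
    }


def compare(nogl, easier, harder):
    """Raise `harder` to at least one above `easier`; never lower a value."""
    nogl[harder] = max(nogl[harder], nogl[easier] + 1)


nogl = seed({"C1": ["accountability", "assassination", "atrocity", "bat"]})
compare(nogl, easier="bat", harder="accountability")
compare(nogl, easier="bat", harder="atrocity")
# Repeating a comparison changes nothing once the gap already exists.
compare(nogl, easier="bat", harder="accountability")
print(nogl)
```

Under this rule values only ever go up, so ‘bat’ stays at its C1 seed and the words judged harder move above it. Letting an easy word fall below its CEFR seed would need an extra rule that is not sketched here.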

I’ll think this over and come back with some experimental data… soon!
