Why replicate?

The CEFR is an enormously influential document. Its authority derives partly from the fact that the process of constructing its various scales was based on an empirical study. And yet surprisingly little has been done by way of replication of that study.


The original 1994 study was in fact replicated the following year by the same researchers: Brian North and Günther Schneider. In the second study they were able to involve teachers and learners of French and German, as well as English. They reported similar results, which supported the view that the scale values obtained for the descriptors could be generalised to languages other than English.


Since then there have been various initiatives to validate some or all of the CEFR scales using other approaches: self-assessment by learners, expert judgment and corpus analysis. None, however, has applied the same method as the original researchers, in which teachers used the descriptors to assess learners and to rate actual samples of learners’ production.


The main reason for replicating is, then, to provide independent evidence as to how North and Schneider’s methodology works when applied over twenty years later to a different set of teachers and learners. This issue applies both to the method itself and to the results obtained. In other words, it can be formulated as two questions:


  • Does the method yield statistically robust results that can be used to scale the language level descriptors?
  • If so, how similar are the results to those originally reported, for the same descriptors, by North and Schneider?


Regarding the teachers and learners involved in my partial replication, the most obvious difference, relative to the original study, is that these will be from a wide variety of linguistic and cultural backgrounds. For good reasons, North and Schneider’s study was limited to teachers and learners based in Switzerland. Thanks to the internet, I am in a position to involve teachers and learners anywhere in the world.


In addition to the above reasons, I propose to pursue two subsidiary research objectives:


  • To obtain scale values for CEFR descriptors that were not calibrated by North and Schneider. This applies especially to the descriptors for writing, since the original study focused mainly on spoken interaction; for this reason, my study will involve assessment of samples of writing.
  • To obtain scale values for some of the extended set of CEFR descriptors that are currently under review.