Appendix 2: Recoding and grouping of English language proficiency responses

Last updated on January 29, 2026

For analysis, English language proficiency responses can be recoded and grouped according to the 9-way classification method used by Scotland’s Census (English language skills: 9-way classification).

To reduce the number of language ability variations that can result from a 4x4 grid question, responses from the language ability matrix can be recoded into three broad proficiency categories and/or 10 main derived values for analysis, as follows:

English language proficiency: 3 broad proficiency categories per skill

Responses to the language ability question can be grouped into three broader categories for simplified reporting on each of the proficiency skills separately (read, write, speak or understand):

  1. Having English proficiency: individuals who selected “very well” or “well”
  2. Limited English proficiency: individuals who selected “not very well” in at least one proficiency and “not at all” in the other proficiencies
  3. No proficiency: individuals who selected “not at all” for all four proficiencies
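
The per-skill recoding above can be sketched in Python as follows. This is a minimal illustration, not a prescribed implementation: the function name `recode_skill` and the dictionary `PER_SKILL_CATEGORY` are hypothetical, and the sketch assumes a per-skill reading in which “not very well” maps to limited proficiency and “not at all” maps to no proficiency for that skill. Items 2 and 3 as worded refer to responses across skills, so the mapping may need adjusting to match the intended definition.

```python
from typing import Optional

# Hypothetical lookup table. The response wording comes from the question's
# four options; the per-skill mapping itself is an assumption of this sketch.
PER_SKILL_CATEGORY = {
    "very well": "Having English proficiency",
    "well": "Having English proficiency",
    "not very well": "Limited English proficiency",
    "not at all": "No proficiency",
}


def recode_skill(response: Optional[str]) -> Optional[str]:
    """Map one raw response for a single skill to a broad category.

    Returns None when the response is missing or not one of the four
    expected options, so that non-responses are not silently recoded.
    """
    if response is None:
        return None
    return PER_SKILL_CATEGORY.get(response.strip().lower())
```

The function would be applied once per skill (understand, speak, read, write), with the result stored in a new field so that the raw response is left unchanged.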

English language proficiency: 10 derived variables combining all skills

If needed, combinations of responses can also be grouped into one of 10 derived variables to report on multi-skill combinations:

  1. Understands spoken English only (does not speak, read or write English)
  2. Speaks, understands, reads and writes English
  3. Speaks and understands, but does not read or write English
  4. Speaks, understands and reads, but does not write English
  5. Reads, but does not speak, understand or write English
  6. Writes, but does not speak, understand or read English
  7. Reads and writes, but does not speak or understand English
  8. Other combinations of proficiencies in English
  9. No proficiencies in English
  10. No response
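
One way to assign a respondent to one of the 10 groups above is a lookup keyed on which skills the respondent has. The Python sketch below is illustrative only. The names `derive_group`, `GROUPS` and `counts_as_skill` are hypothetical. The sketch also makes two assumptions that this appendix does not state: a skill counts as present when the response is “very well” or “well” (pass a different set to `counts_as_skill` to change this threshold), and a respondent with any missing or unrecognized response is placed in group 10.

```python
from typing import FrozenSet, Mapping, Optional

SKILLS = ("understand", "speak", "read", "write")
VALID_RESPONSES = {"very well", "well", "not very well", "not at all"}

# Keys are (understands, speaks, reads, writes); values are the group
# numbers from the list above. Any combination not listed falls in group 8.
GROUPS = {
    (True, False, False, False): 1,
    (True, True, True, True): 2,
    (True, True, False, False): 3,
    (True, True, True, False): 4,
    (False, False, True, False): 5,
    (False, False, False, True): 6,
    (False, False, True, True): 7,
    (False, False, False, False): 9,
}


def derive_group(
    responses: Mapping[str, Optional[str]],
    counts_as_skill: FrozenSet[str] = frozenset({"very well", "well"}),
) -> int:
    """Return the derived group number (1 to 10) for one respondent."""
    has_skill = []
    for skill in SKILLS:
        raw = responses.get(skill)
        value = raw.strip().lower() if isinstance(raw, str) else None
        if value not in VALID_RESPONSES:
            return 10  # assumption: any missing answer means "No response"
        has_skill.append(value in counts_as_skill)
    return GROUPS.get(tuple(has_skill), 8)
```

For example, a respondent who understands and reads English “well” but answers “not at all” for speaking and writing matches none of the listed combinations and is placed in group 8.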

These 10 groups can also be rolled up into broader groups as needed, as long as a clear rationale for the groupings used is noted in any publication or communication of the research.

For interoperability, the raw data of the respondents’ choices for each language proficiency skill (understand, speak, read, write) should be retained so the data can be used or shared for other research under the Anti-Racism Data Act that may use different response groupings.