From Coverage to Distribution: Exploring Lexical Features of the National Matriculation English Test

Authors

  • Tao Yang
  • Zhenhui Liang

DOI:

https://doi.org/10.56028/aehssr.9.1.178.2024

Keywords:

text coverage; word frequency; word dispersion; language corpora.

Abstract

Lexical features have long been a pivotal element of almost all high-stakes language tests. Since the implementation of the “Experimental Curriculum Criteria” in China, few studies have investigated the lexical features of the ensuing National Matriculation English Test using corpus methodology, and none has examined word dispersion. To address this gap, the present study used Python to perform a corpus-based two-way coverage analysis and a visualized distribution analysis between the National Matriculation English Test and the Experimental Curriculum Criteria lexicon. It was found that: 1) text coverage of the National Matriculation English Test reached the minimal (95%) threshold but not the optimal (98%) threshold for adequate comprehension; 2) word-list coverage of the Experimental Curriculum Criteria was disproportionate and insufficient, in that a large share (42.905%) of the prescribed lexicon was never used during the 13 years of implementation; 3) relatively few (N = 90) high-frequency words, most (74.444%) of which were significantly overused compared with their corresponding BNC frequency, accounted for over half (51.403%) of the text coverage; and 4) the vast majority (93.333%) of high-frequency words were homogeneous in dispersion, confirming the overuse with fresh distribution evidence. The results are discussed in terms of their implications for test development.
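The two-way coverage and dispersion measures named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the regex tokenizer (no lemmatization), and the choice of Gries' DP as the dispersion statistic are all assumptions made for illustration, since the abstract does not specify them.

```python
import re


def tokenize(text):
    """Lowercase the text and extract alphabetic word tokens.

    Assumption: a simple regex tokenizer without lemmatization.
    """
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def text_coverage(tokens, word_list):
    """Share of running tokens in the test text that appear in the word list."""
    if not tokens:
        return 0.0
    known = sum(1 for t in tokens if t in word_list)
    return known / len(tokens)


def word_list_coverage(tokens, word_list):
    """Share of word-list entries that occur at least once in the test text."""
    if not word_list:
        return 0.0
    used = word_list & set(tokens)
    return len(used) / len(word_list)


def dispersion_dp(word, parts):
    """Gries' DP for one word over corpus parts (e.g. one part per test paper).

    DP = 0.5 * sum(|observed share of hits - expected share of tokens|).
    Values near 0 indicate even dispersion; values near 1 indicate clumping.
    Assumption: DP stands in for whatever measure the study actually used.
    """
    total_tokens = sum(len(p) for p in parts)
    total_hits = sum(p.count(word) for p in parts)
    if total_tokens == 0 or total_hits == 0:
        return None
    return 0.5 * sum(
        abs(p.count(word) / total_hits - len(p) / total_tokens) for p in parts
    )


if __name__ == "__main__":
    # Toy data standing in for test papers and the prescribed lexicon.
    papers = ["The cat sat on the mat", "The dog saw a bird"]
    lexicon = {"the", "cat", "dog", "fish"}

    parts = [tokenize(p) for p in papers]
    all_tokens = [t for part in parts for t in part]

    print(f"text coverage:      {text_coverage(all_tokens, lexicon):.3f}")
    print(f"word-list coverage: {word_list_coverage(all_tokens, lexicon):.3f}")
    print(f"DP for 'the':       {dispersion_dp('the', parts):.3f}")
```

Text coverage asks how much of the test a learner who knows the lexicon can read, while word-list coverage asks how much of the lexicon the test actually exercises. The two can diverge sharply, which is the pattern reported in findings 1) and 2).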

Published

2024-01-30