Kolmogorov complexity metrics in assessing L2 proficiency: An information-theoretic approach
Published in Frontiers in Psychology (SSCI), 2022
Abstract: Based on 774 argumentative writings produced by Chinese English as a foreign language (EFL) learners, this study examined the extent to which Kolmogorov complexity metrics can distinguish the proficiency levels of beginner, lower-intermediate, and upper-intermediate second language (L2) English learners. The Kolmogorov complexity metric is a holistic information-theoretic approach that measures three facets of linguistic complexity simultaneously: overall, syntactic, and morphological complexity. To assess its validity in distinguishing L2 proficiency, the Kolmogorov complexity metric was compared with traditional syntactic and morphological complexity metrics as well as fine-grained syntactic complexity metrics. Results showed that Kolmogorov overall and syntactic complexity could significantly distinguish any adjacent pair of L2 levels, serving as the best separators explored in the present study. Neither Kolmogorov morphological complexity nor the other complexity metrics, at either the syntactic or the morphological level, could distinguish between all pairs of adjacent levels. Results of the correlation analysis showed that Kolmogorov syntactic complexity was uncorrelated or only weakly correlated with all the fine-grained syntactic complexity metrics, indicating that they may address distinct linguistic features and can complement each other to better predict different proficiency levels.
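Abstract (English rendering of the Chinese version, which was translated by GPT-4o): Based on 774 argumentative essays written by Chinese learners of English, this study explored the extent to which Kolmogorov complexity metrics can distinguish the levels of beginner, lower-intermediate, and upper-intermediate second language (L2) English learners. The Kolmogorov complexity metric is a holistic information-theoretic approach that simultaneously measures three aspects of linguistic complexity: overall complexity, syntactic complexity, and morphological complexity. To evaluate its validity in distinguishing L2 levels, the study compared Kolmogorov complexity metrics with traditional syntactic and morphological complexity metrics as well as fine-grained syntactic complexity metrics. The results showed that Kolmogorov overall complexity and syntactic complexity could significantly distinguish any adjacent pair of L2 levels, making them the best separators explored in this study. Neither Kolmogorov morphological complexity nor the other syntactic and morphological complexity metrics could distinguish all adjacent pairs of levels. The correlation analysis showed that Kolmogorov syntactic complexity was weakly correlated or uncorrelated with all the fine-grained syntactic complexity metrics, suggesting that they may capture different linguistic features and can complement each other to better predict different proficiency levels.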
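The abstract names three Kolmogorov complexity facets but does not spell out how they are computed. As a rough illustration only, the sketch below follows the compression-based approximation common in this line of work: the gzip-compressed size of a text stands in for its overall complexity, and the change in compressed size after randomly deleting characters or word tokens stands in for morphological and syntactic complexity. The function names, the 10% deletion rate, the number of iterations, and the use of plain ratios are all assumptions made for this example, not details taken from the paper or from Ehret's (2017) resource.

```python
import gzip
import random


def compressed_size(text: str) -> int:
    """Return the size in bytes of the gzip-compressed UTF-8 text."""
    return len(gzip.compress(text.encode("utf-8")))


def delete_random_chars(text: str, rate: float, rng: random.Random) -> str:
    """Morphological distortion (assumed): drop roughly `rate` of all characters."""
    return "".join(ch for ch in text if rng.random() >= rate)


def delete_random_words(text: str, rate: float, rng: random.Random) -> str:
    """Syntactic distortion (assumed): drop roughly `rate` of whitespace-separated tokens."""
    return " ".join(tok for tok in text.split() if rng.random() >= rate)


def complexity_scores(text: str, rate: float = 0.10, n_iter: int = 100, seed: int = 0) -> dict:
    """Approximate three complexity facets of `text` via gzip compression.

    The raw compressed size is used for the overall score purely for
    illustration; it is not comparable across texts of different lengths
    without some form of length normalisation.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs give the same scores
    original = compressed_size(text)

    # Average the distorted sizes over several random deletions to reduce noise.
    morph_sizes = [compressed_size(delete_random_chars(text, rate, rng)) for _ in range(n_iter)]
    synt_sizes = [compressed_size(delete_random_words(text, rate, rng)) for _ in range(n_iter)]

    return {
        "overall": original,
        "morphological_ratio": (sum(morph_sizes) / n_iter) / original,
        "syntactic_ratio": (sum(synt_sizes) / n_iter) / original,
    }


if __name__ == "__main__":
    sample = (
        "Some people think that students should study abroad. "
        "I agree with this opinion because students can learn a new language "
        "and they can understand a different culture."
    )
    print(complexity_scores(sample))
```

A learner essay would be passed in as a single string. Because sign conventions and length normalisation differ between implementations, the values from this sketch should not be compared directly with the scores reported in the paper.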
Contribution: Li Wang, Hui Wang, and Gui Wang conceptualized the study. Hui Wang wrote the Introduction and Literature Review sections. Gui Wang wrote the Methodology section. Hui Wang and Gui Wang jointly wrote the Results and Discussion sections. Gui Wang was responsible for the primary statistical analysis and visualization. Hui Wang played a significant role in refining the study’s design. Hui Wang, Li Wang, and Gui Wang all actively reviewed, revised, and contributed to the finalization of the manuscript, and Li Wang provided constructive feedback on the final version. We thank Ehret (2017) for generously providing the open resource for calculating Kolmogorov complexity.
Recommended citation: Wang, G.†, Wang, H.†, & Wang, L. (2022). Kolmogorov complexity metrics in assessing L2 proficiency: An information-theoretic approach. Frontiers in Psychology, 13. https://doi.org/10.3389/fpsyg.2022.1024147