Predicting CEFR levels in learners of English: The use of microsystem criterial features in a machine learning approach

Gaillat, Thomas; Simpkin, Andrew; Ballier, Nicolas; Stearns, Bernardo; Sousa, Annanda; Bouyé, Manon; Zarrouk, Manel

Published in

Cambridge University Press, ReCALL, 34(2), pp. 130-146, 2021

DOI: 10.1017/s095834402100029x

Abstract

This paper focuses on automatically assessing language proficiency levels according to linguistic complexity in learner English. We implement a supervised learning approach as part of an automatic essay scoring system. The objective is to uncover Common European Framework of Reference for Languages (CEFR) criterial features in writings by learners of English as a foreign language. Our method relies on the concept of microsystems with features related to learner-specific linguistic systems in which several forms operate paradigmatically. Results on internal data show that different microsystems help classify writings from A1 to C2 levels (82% balanced accuracy). Overall results on external data show that a combination of lexical, syntactic, cohesive and accuracy features yields the most efficient classification across several corpora (59.2% balanced accuracy).
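Balanced accuracy, the metric quoted in the abstract, is conventionally defined as the mean of per-class recall, so that frequent CEFR levels do not dominate the score. The following is a minimal sketch of that general definition, not the authors' implementation: the function name `balanced_accuracy` and the toy gold and predicted labels are invented for illustration and do not come from the paper's data.

```python
from collections import defaultdict


def balanced_accuracy(gold, predicted):
    """Mean of per-level recall over the levels that occur in `gold`."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("at least one labelled text is required")

    total = defaultdict(int)    # number of texts per gold level
    correct = defaultdict(int)  # correctly predicted texts per gold level
    for g, p in zip(gold, predicted):
        total[g] += 1
        if g == p:
            correct[g] += 1

    recalls = [correct[level] / total[level] for level in total]
    return sum(recalls) / len(recalls)


# Hypothetical toy labels (not from the paper): A1 is over-represented.
gold = ["A1", "A1", "A1", "A1", "B1", "B1", "C2"]
pred = ["A1", "A1", "A1", "A1", "B1", "A2", "B2"]

score = balanced_accuracy(gold, pred)
plain = sum(g == p for g, p in zip(gold, pred)) / len(gold)
print(f"balanced accuracy: {score:.3f}, plain accuracy: {plain:.3f}")
```

In this toy case the majority level is predicted perfectly, so plain accuracy looks higher than balanced accuracy, which weights each level equally regardless of how many texts it contains.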
