Lexical complexity prediction: an overview

North, Kai, Zampieri, Marcos and Shardlow, Matthew ORCID: https://orcid.org/0000-0003-1129-2750 (2023) Lexical complexity prediction: an overview. ACM Computing Surveys, 55 (9). p. 179. ISSN 0360-0300

Accepted Version

Official URL: https://dl.acm.org/doi/10.1145/3557885

Abstract

The occurrence of unknown words in texts significantly hinders reading comprehension. To improve accessibility for specific target populations, computational modeling has been applied to identify complex words in texts and replace them with simpler alternatives. In this article, we present an overview of computational approaches to lexical complexity prediction, focusing on work carried out on English data. We survey relevant approaches to this problem, which include traditional machine learning classifiers (e.g., SVMs, logistic regression) and deep neural networks, along with a variety of features, such as those inspired by the psycholinguistics literature, word frequency, word length, and many others. Furthermore, we introduce readers to past competitions and available datasets created on this topic. Finally, we include brief sections on applications of lexical complexity prediction, such as readability and text simplification, together with related studies on languages other than English.
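
As a rough illustration of the feature-based classifiers mentioned in the abstract, the following is a minimal sketch of a logistic regression that predicts whether a word is complex from two features: word length and log word frequency. It is not the method of any system covered by the survey. The word list, the frequency values, the labels, and the names `TOY_DATA`, `features`, `train`, and `predict_complexity` are all hypothetical and chosen only for illustration.

```python
import math

# Toy, made-up examples: (word, assumed frequency per million words, label).
# label 1 = complex, 0 = simple. All values are illustrative assumptions,
# not taken from any real corpus or dataset.
TOY_DATA = [
    ("cat", 120.0, 0),
    ("house", 300.0, 0),
    ("water", 250.0, 0),
    ("run", 400.0, 0),
    ("ubiquitous", 1.5, 1),
    ("ephemeral", 0.8, 1),
    ("obfuscate", 0.3, 1),
    ("perspicacious", 0.05, 1),
]


def features(word, freq_per_million):
    """Return two simple features: scaled word length and log10 frequency."""
    return [len(word) / 10.0, math.log10(freq_per_million + 1.0)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(data, epochs=2000, lr=0.1):
    """Fit logistic regression weights with plain stochastic gradient descent."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for word, freq, label in data:
            x = features(word, freq)
            pred = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = pred - label  # gradient of the log loss w.r.t. the logit
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def predict_complexity(word, freq_per_million, weights, bias):
    """Return the estimated probability that the word is complex."""
    x = features(word, freq_per_million)
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


weights, bias = train(TOY_DATA)
```

Calling `predict_complexity("ephemeral", 0.8, weights, bias)` returns a probability rather than a hard label, so the same sketch can serve either a binary decision (by thresholding at 0.5) or a continuous complexity score.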

Item Type:	Article
Peer-reviewed:	Yes
Date Deposited:	31 Mar 2023 09:50
Publisher:	Association for Computing Machinery
Additional Information:	© Association for Computing Machinery, 2023. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in ACM Computing Surveys, http://dx.doi.org/10.1145/3557885
Divisions:	Faculties > Science and Engineering Research Centres > Centre for Advanced Computational Science
Subject terms:	08 Information and Computing Sciences, Information Systems
URI:	https://mmu-uat.leaf.cosector.com/id/eprint/631704
DOI:	https://doi.org/10.1145/3557885
ISSN:	0360-0300
e-ISSN:	1557-7341
