Exploring automatic readability assessment for science documents within a multilingual educational context
International Journal of Artificial Intelligence in Education 34: 1417-1459 (2024)
Abstract
Current student-centred, multilingual, active teaching methodologies require that teachers have continuous access to texts that match both the topic at hand and their students' language competence. However, finding appropriate materials is an arduous and time-consuming task for teachers. To build on automatic readability assessment research that could assist teachers, we explore the performance of natural language processing approaches on educational science documents for secondary education. Readability assessment has so far been explored mainly in English. In this work we extend our research to Basque and Spanish, alongside English, by compiling context-specific corpora and then testing the performance of feature-based machine-learning and deep learning models. Based on the evaluation of our results, we find that our models do not generalize well, although deep learning models obtain better accuracy and F1 in all configurations. Further research in this area is still necessary to determine reliable characteristics of training corpora and model parameters that ensure generalizability.
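To make the notion of a feature-based model concrete: such systems typically extract surface statistics from a text and feed them to a classifier. The sketch below shows a minimal, hypothetical feature extractor of the kind such pipelines use (average sentence length, average word length, type-token ratio); the specific features and their weighting in the paper's models are not specified here, so this is purely illustrative.

```python
import re


def surface_features(text):
    """Extract simple surface features commonly used as inputs to
    feature-based readability classifiers (illustrative sketch)."""
    # Split on sentence-final punctuation; keep non-empty segments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Match alphabetic tokens, including accented characters
    # (relevant for Basque and Spanish text).
    words = re.findall(r"[A-Za-z\u00C0-\u024F]+", text)
    if not sentences or not words:
        return {"avg_sentence_len": 0.0, "avg_word_len": 0.0, "ttr": 0.0}
    return {
        # Average number of words per sentence.
        "avg_sentence_len": len(words) / len(sentences),
        # Average word length in characters.
        "avg_word_len": sum(len(w) for w in words) / len(words),
        # Type-token ratio: a rough proxy for lexical diversity.
        "ttr": len({w.lower() for w in words}) / len(words),
    }
```

Feature vectors like these could then be passed to any off-the-shelf classifier; deep learning approaches instead learn representations directly from the raw text.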