Common Writing Errors in Latvian: Corpus-Driven Error Analysis and Text Correction

Year 2024 Jan–2026 Dec
Funding Latvian Council of Science
Fundamental and Applied Research Projects
Abstract The aim of the project is to create a semi-automatically error-annotated corpus of texts produced by native speakers of Latvian, in which the most common errors in written Latvian will be documented, corrected and explained. The corpus-creation methodology and the resulting data will be used to analyze how language errors affect the grammatical system of Latvian and to develop state-of-the-art corpus-based guidelines for improving the quality of written language. An error-annotated corpus is also required for developing high-level grammar checkers that, unlike low-level spell checkers, can detect complex structural errors.