When Data Meet Tools: Using the Monitor Corpus for the Analysis of Language Development

Authors

Václav Cvrček
Martin Stluka
Klára Pivoňková

DOI:

https://doi.org/10.2478/jazcas-2025-0014

Keywords:

diachronic research, corpus querying, annotation, language change, monitor corpus, frequency, Czech

Abstract

This paper introduces an infrastructure developed within the HiČKoK project to enable full-fledged corpus-based diachronic research on Czech. The individual sections present the components of this infrastructure, which links well-balanced, representative, and annotated data with tailor-made tools for diachronic research. The forthcoming monitor corpus, which covers the entire period of written Czech, is briefly introduced along with its composition and annotation strategies. The following sections then demonstrate the potential of the application and its four modules (simple query, comparison, time-based associations, and diachronic collocations) through mini case studies. Combining large-scale data (as representative as possible) with a tool that enhances standard corpus functionalities, enriches them with a diachronic perspective, and enables result visualization makes diachronic research on language change more accessible and comprehensive.

Published

05-01-2026

Issue

Vol. 76 No. 1 (2025): Jazykovedný časopis

Section

Studies

License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

How to Cite

When Data Meet Tools: Using the Monitor Corpus for the Analysis of Language Development. (2026). Jazykovedný časopis, 76(1), 157-166. https://doi.org/10.2478/jazcas-2025-0014