When Data Meet Tools: Using the Monitor Corpus for the Analysis of Language Development
DOI:
https://doi.org/10.2478/jazcas-2025-0014
Keywords:
diachronic research, corpus querying, annotation, language change, monitor corpus, frequency, Czech
Abstract
This paper introduces an infrastructure developed within the HiČKoK project to enable full-fledged corpus-based diachronic research on Czech. The individual sections present the components of this infrastructure, which links well-balanced, representative, annotated data with tailor-made tools for diachronic research. The forthcoming monitor corpus, which covers the entire period of written Czech, is briefly introduced along with its composition and annotation strategies. The subsequent sections demonstrate the potential of the application and its four modules (simple query, comparison, time-based associations, and diachronic collocations) through mini case studies. Combining large-scale data (as representative as possible) with a tool that enhances standard corpus functionalities, enriches them with a diachronic perspective, and enables result visualization makes diachronic research on language change more accessible and comprehensive.
License
Copyright (c) 2026 Václav Cvrček, Martin Stluka, Klára Pivoňková
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.