When Data Meet Tools: Using the Monitor Corpus for the Analysis of Language Development

Authors

  • Václav Cvrček
  • Martin Stluka
  • Klára Pivoňková

DOI:

https://doi.org/10.2478/jazcas-2025-0014

Keywords:

diachronic research, corpus querying, annotation, language change, monitor corpus, frequency, Czech

Abstract

The aim of this paper is to introduce an infrastructure developed within the HiČKoK project to enable full-fledged corpus-based diachronic research on Czech. The individual sections of the paper present the components of this infrastructure, which links well-balanced, representative, and annotated data with tailor-made tools for diachronic research. The forthcoming monitor corpus, which covers the entire period of written Czech, is briefly introduced together with its composition and annotation strategies. The subsequent sections demonstrate the potential of the application and its four modules (simple query, comparison, time-based associations, and diachronic collocations) through mini case studies. Combining large-scale data (as representative as possible) with a tool that enhances standard corpus functionalities, enriches them with a diachronic perspective, and enables result visualization makes diachronic research on language change more accessible and comprehensive.

Published

2026-01-05

How to Cite

Cvrček, V., Stluka, M., & Pivoňková, K. (2026). When Data Meet Tools: Using the Monitor Corpus for the Analysis of Language Development. Jazykovedný časopis [Journal of Linguistics], 76(1), 157-166. https://doi.org/10.2478/jazcas-2025-0014