Do Frequency Types Matter in Lexicography?
DOI:
https://doi.org/10.2478/jazcas-2025-0027
Keywords:
corpus annotation, semi-automatic dictionary drafting, Dictionary Express, word frequency, frequency type, absolute frequency, document frequency, ALDF, ARF, Czech
Abstract
Word frequency in a corpus can be calculated in several different ways. Among the most common frequency types are absolute frequency, document frequency, ALDF, and ARF. This paper compares these four types in terms of “word correctness.” To determine whether a word is correct, we use the data gathered for the Czech lexicon of the recent Czech Dictionary Express project. In that project, each of the 100,000 most frequent headwords was reviewed by several native speakers of Czech, who decided whether the word should be accepted, rejected, or flagged as having minor issues. The quality of this “word correctness” annotation is further discussed in the paper.
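To make the difference between the frequency types concrete, the following minimal Python sketch computes three of them for a toy corpus. It is an illustration only, not the authors' implementation: the function names and the toy data are hypothetical, the corpus is assumed to be a list of tokenized documents, and ARF is computed with the commonly cited definition (the corpus is split into f segments of length v = N/f, and each cyclic distance between consecutive occurrences is capped at v). ALDF is left out because its exact formula is not given here.

```python
def absolute_frequency(docs, word):
    """Total number of occurrences of `word` across all documents."""
    return sum(doc.count(word) for doc in docs)


def document_frequency(docs, word):
    """Number of documents that contain `word` at least once."""
    return sum(1 for doc in docs if word in doc)


def average_reduced_frequency(docs, word):
    """ARF under the commonly cited definition (an assumption of this sketch).

    The corpus is treated as one token sequence of length N. With f
    occurrences and v = N / f, ARF = (1 / v) * sum(min(d_i, v)), where d_i
    are the distances between consecutive occurrences and the first
    distance wraps around from the last occurrence to the first.
    """
    tokens = [token for doc in docs for token in doc]
    n = len(tokens)
    positions = [i for i, token in enumerate(tokens) if token == word]
    f = len(positions)
    if f == 0:
        return 0.0
    v = n / f
    # Cyclic distance from the last occurrence back to the first one,
    # followed by the plain distances between neighbouring occurrences.
    distances = [positions[0] + n - positions[-1]]
    distances += [b - a for a, b in zip(positions, positions[1:])]
    return sum(min(d, v) for d in distances) / v


# Hypothetical toy corpus: "a" is clustered in the first document only.
toy_docs = [["a", "b", "a", "a"], ["c", "d", "e", "f"]]
print(absolute_frequency(toy_docs, "a"))         # 3
print(document_frequency(toy_docs, "a"))         # 1
print(average_reduced_frequency(toy_docs, "a"))  # about 2.125
```

In this toy corpus the word occurs three times but only in one document, so its ARF falls below its absolute frequency. For a word spread evenly through the corpus, ARF equals the absolute frequency.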
License
Copyright (c) 2025 Marek Blahuš, Vojtěch Kovář, František Kovařík

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.