Do Frequency Types Matter in Lexicography?
DOI:
https://doi.org/10.2478/jazcas-2025-0027
Keywords:
corpus annotation, semi-automatic dictionary drafting, Dictionary Express, word frequency, frequency type, absolute frequency, document frequency, ALDF, ARF, Czech
Abstract
Word frequency in a corpus can be calculated in several different ways. Among the most common frequency types are absolute frequency, document frequency, ALDF and ARF. This paper compares these four types in terms of “word correctness.” To determine whether a word is correct, we use data gathered for the Czech lexicon of the recent Czech Dictionary Express project. In this project, each of the 100,000 most frequent headwords was reviewed by several native speakers of Czech, who decided whether the word should be accepted, should be rejected, or has some minor issues. The quality of the “word correctness” data is further discussed in the paper.
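To make the difference between frequency types concrete, the following is a minimal Python sketch of three of the four measures named above, and not the implementation used in the paper. It assumes a corpus represented as a list of tokenized documents, and it uses the commonly cited definition of ARF (average reduced frequency): with N tokens, f occurrences and v = N / f, ARF = (1 / v) · Σ min(d_i, v), where d_i are the cyclic distances between consecutive occurrences. ALDF is left out of the sketch. The function names and the toy corpora are hypothetical.

```python
from typing import List, Sequence


def absolute_frequency(docs: Sequence[Sequence[str]], word: str) -> int:
    """Total number of occurrences of `word` in the whole corpus."""
    return sum(doc.count(word) for doc in docs)


def document_frequency(docs: Sequence[Sequence[str]], word: str) -> int:
    """Number of documents that contain `word` at least once."""
    return sum(1 for doc in docs if word in doc)


def average_reduced_frequency(docs: Sequence[Sequence[str]], word: str) -> float:
    """ARF under the assumed definition: (1 / v) * sum(min(d_i, v))."""
    # Treat the corpus as one token sequence (documents concatenated in order).
    tokens: List[str] = [tok for doc in docs for tok in doc]
    n = len(tokens)
    positions = [i for i, tok in enumerate(tokens) if tok == word]
    f = len(positions)
    if f == 0:
        return 0.0
    v = n / f  # average distance between occurrences if evenly spread
    # Distances between consecutive occurrences ...
    distances = [positions[i] - positions[i - 1] for i in range(1, f)]
    # ... plus the wrap-around distance from the last occurrence to the first.
    distances.append(positions[0] + n - positions[-1])
    return sum(min(d, v) for d in distances) / v


# Two toy corpora in which "x" has the same absolute frequency
# but a different distribution.
even_corpus = [["x", "y", "x", "y"], ["x", "y", "x", "y"]]
clustered_corpus = [["x", "x", "x", "x"], ["y", "y", "y", "y"]]
```

In the evenly spread corpus the ARF of "x" equals its absolute frequency, while in the clustered corpus it is lower, which is the sense in which ARF discounts words concentrated in a small part of the corpus.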
License
Copyright (c) 2025 Marek Blahuš, Vojtěch Kovář, František Kovařík
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.