TY - JOUR
T1 - Word Use Equivalence and Hierarchical Word Tiers
AU - Burch, Brent
AU - Egbert, Jesse
N1 - Publisher Copyright: © 2022 Informa UK Limited, trading as Taylor & Francis Group.
PY - 2023
Y1 - 2023
N2 - A ranked word list provides information about the position of each word in the list. However, retaining and employing the measure used to generate the ranked list can yield additional information about the words. If (Formula presented.) denotes the prevalence of a word in a corpus, then not only can the values of (Formula presented.) be ordered, their values can be compared to one another, and words having similar values can be grouped together into equivalence classes. Measures of word prevalence include mean text frequency, the dispersion of words across texts in a corpus, or a measure that combines frequency and dispersion. In this paper, we examine the concepts of word equivalence classes and hierarchical word tiers and apply these concepts to the words in the British National Corpus (BNC). Hierarchical word tiers can be constructed without knowledge of all pairwise comparisons of the words under study. By grouping words that have similar values of prevalence, the rank-ordered list reduces to an informative set of hierarchical word tiers where each tier contains words that are similar to one another in terms of their use in the corpus.
UR - http://www.scopus.com/inward/record.url?scp=85140091384&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85140091384&partnerID=8YFLogxK
U2 - 10.1080/09296174.2022.2129377
DO - 10.1080/09296174.2022.2129377
M3 - Article
SN - 0929-6174
VL - 30
SP - 104
EP - 124
JO - Journal of Quantitative Linguistics
JF - Journal of Quantitative Linguistics
IS - 1
ER -