Fractal visualization of corpus data
Kretzschmar, William; Coats, Steven (2022-12-07)
University of Helsinki
Kretzschmar, William & Steven Coats. 2023. “Fractal visualization of corpus data”. Data visualization in corpus linguistics: Critical reflections and future directions (Studies in Variation, Contacts and Change in English 22), ed. by Ole Schützler & Lukas Sönning. Helsinki: VARIENG. https://urn.fi/URN:NBN:fi:varieng:series-22-6
https://rightsstatements.org/vocab/InC/1.0/
© 2023 William Kretzschmar & Steven Coats; series © 2007– VARIENG
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:oulu-202401031045
Abstract
The relationship between word frequency and rank order, when considering the lexical types of a given text and their frequencies, was first noted and described by George Zipf; it was later interpreted by Mandelbrot in terms of fractal dimensionality. In this paper, we discuss some properties of rank-frequency profiles and demonstrate the use of the ZipfExplorer tool, an online app for the visualization of shared lexis in two texts, to compare the lexical types in well-known novels. We demonstrate that the alpha parameter of a power law function as well as several other measures can be used to quantify the shared lexical diversity of two texts. In addition, visual examination of the A-curves of rank-frequency profiles can help to interpret similarities and differences between texts and corpora.
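As an illustration of the rank-frequency profile and the alpha parameter mentioned in the abstract, the following minimal sketch counts the lexical types of a text, sorts their frequencies by rank, and estimates alpha as the negated slope of a least-squares line through the log-log profile. The function names `rank_frequency` and `estimate_alpha`, the simple regular-expression tokenizer, and the least-squares fit are assumptions made for illustration; they are not taken from the paper and do not necessarily reflect how ZipfExplorer computes its measures.

```python
import math
import re
from collections import Counter


def rank_frequency(text):
    """Return type frequencies sorted from most to least frequent.

    Assumption: tokens are lowercase runs of letters and apostrophes;
    a real study would likely use a more careful tokenizer.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    return sorted(Counter(tokens).values(), reverse=True)


def estimate_alpha(freqs):
    """Estimate alpha in frequency ~ C * rank**(-alpha).

    Assumption: ordinary least squares on log(rank) versus
    log(frequency); this is a rough estimator, not a definitive one.
    """
    if len(freqs) < 2:
        raise ValueError("need at least two types to fit a slope")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(freq) for freq in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # The fitted slope is negative for a decaying profile; alpha is its negation.
    return -sxy / sxx


if __name__ == "__main__":
    sample = "the cat and the dog and the bird"
    profile = rank_frequency(sample)
    print(profile)
    print(estimate_alpha(profile))
```

Applied to two texts, the two resulting alpha estimates give one rough numerical point of comparison; plotting the sorted frequencies against rank gives the kind of curve that the abstract proposes examining visually.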
Collections
- Open access [38840]