An Online Tool Developed for Post-Editing the New Skolt Sami Dictionary
Hämäläinen, Mika; Alnajjar, Khalid; Rueter, Jack; Lehtinen, Miika; Partanen, Niko
Lexical Computing CZ s.r.o.
Hämäläinen, M., Alnajjar, K., Rueter, J., Lehtinen, M., & Partanen, N. (2021). An online tool developed for post-editing the new Skolt Sami dictionary. Electronic lexicography in the 21st century (eLex 2021): Post-editing lexicography, Proceedings of the eLex 2021 conference (pp. 653-664). https://elex.link/elex2021/wp-content/uploads/eLex_2021-proceedings.pdf
https://creativecommons.org/licenses/by-sa/4.0/
This work is licensed under the Creative Commons Attribution ShareAlike 4.0 International License. http://creativecommons.org/licenses/by-sa/4.0/
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:oulu-202603042038
Abstract
In this paper, we present our free and open-source online dictionary editing system that has been developed for editing the new edition of the Finnish-Skolt Sami dictionary. We describe how the system can be used in post-editing a dictionary and how NLP methods have been incorporated as a part of the workflow. In practice, this means the use of FSTs (finite-state transducers) to enhance connections between lexemes and to generate inflection paradigms automatically. We also discuss our work in the wider context of lexicography of endangered languages. Our solutions are based on the open-source work conducted in the Giella infrastructure, which means that our system can be easily extended to other endangered languages as well. We have collaborated closely with Skolt Sami community lexicographers in order to build the system for their needs. As a result of this collaboration, the latest Finnish-Skolt Sami dictionary was edited and published using our system.
