Methods for Phonetic Scraping of Youtube Videos
Meli, Adrien; Coats, Steven; Ballier, Nicolas (2023-12-01)
Association for Computational Linguistics
01.12.2023
Adrien Meli, Steven Coats, and Nicolas Ballier. 2023. Methods for Phonetic Scraping of Youtube Videos. In Proceedings of the 6th International Conference on Natural Language and Speech Processing (ICNLSP 2023), pages 244–249, Online. Association for Computational Linguistics.
https://creativecommons.org/licenses/by/4.0/
ACL materials are Copyright © 1963–2024 ACL; other materials are copyrighted by their respective copyright holders. Materials prior to 2016 here are licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 International License. Permission is granted to make copies for the purposes of teaching and research. Materials published in or after 2016 are licensed on a Creative Commons Attribution 4.0 International License.
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:oulu-202401301501
Abstract
This paper discusses two pipelines for the automatic collection of automatic speech recognition (ASR) transcripts and audio content from YouTube videos and subsequent phonetic analysis: PEASYV (Phonetic Extraction and Alignment of Subtitled YouTube Videos) and YTPP (YouTube Phonetics Pipeline). The pipelines differ somewhat in terms of processing steps as well as the tools used for forced alignment, but produce comparable results. The two pipelines may be useful for large-scale collection of acoustic data for phonetic analysis.