LogPM: Character-Based Log Parser Benchmark
Hashemi, Shayan; Nyyssölä, Jesse; Mäntylä, Mika V. (2024-07-16)
IEEE
S. Hashemi, J. Nyyssölä and M. V. Mäntylä, "LogPM: Character-Based Log Parser Benchmark," 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), Rovaniemi, Finland, 2024, pp. 705-710, doi: 10.1109/SANER60148.2024.00077
https://rightsstatements.org/vocab/InC/1.0/
© 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:oulu-202503252198
Abstract
Log parsers transform free-form textual log messages into categorical data and are important tools in automated log analysis pipelines. However, selecting a suitable log parsing algorithm is difficult, which underscores the need for a comprehensive benchmark to support the choice. This paper introduces a novel log parsing benchmark that focuses on the character-level precision of predicted templates rather than the grouping accuracy used in earlier benchmarks. We present a new metric called Parameter Mask Agreement that measures template accuracy at the character level, alongside a dataset tailored for the task. We identified several challenges that parsers encounter for each dataset, which can aid in developing new log parsers. Moreover, a small empirical study was conducted using the proposed benchmark, evaluating the performance of three well-known parsers: Drain, Spell, and Lenma. The findings revealed that Lenma achieved the highest parsing accuracy, whereas Drain achieved the highest parsing speed. Finally, we propose that our benchmark is more appropriate than previous approaches in scenarios where accurate template detection is essential and computational efficiency needs to be assessed.
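The abstract names the Parameter Mask Agreement metric but does not give its formula. The sketch below only illustrates the general idea of comparing templates character by character. It is not the paper's definition. It assumes that templates mark parameters with the `<*>` wildcard, that each template matches the message, and that agreement is the fraction of character positions where two masks coincide. The function names `parameter_mask` and `mask_agreement` are hypothetical.

```python
import re


def parameter_mask(message: str, template: str, wildcard: str = "<*>") -> list[int]:
    """Return a 0/1 mask over the message: 1 = parameter character, 0 = static text.

    Assumption: the template uses `wildcard` for parameters and matches the message.
    """
    parts = template.split(wildcard)
    # Static parts are matched literally; each wildcard becomes a lazy capture group.
    pattern = "(.*?)".join(re.escape(part) for part in parts)
    match = re.fullmatch(pattern, message, flags=re.DOTALL)
    if match is None:
        raise ValueError("template does not match message")
    mask = [0] * len(message)
    for group_index in range(1, len(parts)):
        start, end = match.span(group_index)
        for position in range(start, end):
            mask[position] = 1
    return mask


def mask_agreement(truth: list[int], predicted: list[int]) -> float:
    """Fraction of character positions where both masks agree (illustrative only)."""
    if len(truth) != len(predicted):
        raise ValueError("masks must cover the same message")
    if not truth:
        return 1.0
    return sum(t == p for t, p in zip(truth, predicted)) / len(truth)


message = "Connected to 10.0.0.1 port 22"
truth = parameter_mask(message, "Connected to <*> port <*>")
predicted = parameter_mask(message, "Connected to <*> port 22")
score = mask_agreement(truth, predicted)
```

In this example the predicted template treats the port number as static text, so the two masks differ only on the two characters of `22`. A grouping-based metric could still count such a message as correctly parsed if it ends up in the right group, which is the kind of difference a character-level comparison is meant to expose.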
Collections
- Open access [38865]