Fuzz testing large language models
Bruun, Juho (2025-08-14)
© 2025 Juho Bruun. Unless otherwise stated, reuse is permitted under the Creative Commons Attribution 4.0 International (CC-BY 4.0) license (https://creativecommons.org/licenses/by/4.0/). Reuse is permitted provided that the source is appropriately credited and any changes are indicated. The use or reproduction of parts that are not the property of the author or authors may require permission directly from the respective rights holders.
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:oulu-202508225543
Abstract
The growing prevalence of artificial intelligence and large language models brings with it a need for proper testing and security assessment of systems that use these technologies. Concerns over the robustness and trustworthiness of large language model implementations, which now form part of a wide variety of applications, motivate advances in their robustness testing.
Motivated by the need for automated testing methods, this thesis explores and benchmarks automatic methods of fuzz testing large language model implementations and assesses their overall effectiveness in the task.
An automated testing setup was implemented using a set of contemporary large language model tools. Two fuzz-testing methods were used in the tests: one, proposed in this thesis, is based on word- and string-level mutation; the other is an attack, proposed in an earlier research article, that uses a language model to generate input. The findings suggest directions for future work on improving fuzz-testing setups for large language model systems and highlight the current caveats of the fuzz-testing approach against large language models.
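As a rough illustration of what word- and string-level mutation can look like, the following minimal Python sketch derives fuzzed prompts from a seed prompt. It is not the implementation used in the thesis: the function names (`mutate_chars`, `mutate_words`, `fuzz_cases`) and the specific mutation operators are assumptions chosen for illustration only.

```python
import random


def mutate_chars(text, rng):
    """String-level mutation (illustrative): delete, insert, or replace one character."""
    if not text:
        return text
    i = rng.randrange(len(text))
    op = rng.choice(["delete", "insert", "replace"])
    ch = chr(rng.randrange(32, 127))  # random printable ASCII character
    if op == "delete":
        return text[:i] + text[i + 1:]
    if op == "insert":
        return text[:i] + ch + text[i:]
    return text[:i] + ch + text[i + 1:]


def mutate_words(text, rng):
    """Word-level mutation (illustrative): drop, duplicate, or swap whole words."""
    words = text.split()
    if len(words) < 2:
        return text
    i = rng.randrange(len(words))
    op = rng.choice(["drop", "duplicate", "swap"])
    if op == "drop":
        del words[i]
    elif op == "duplicate":
        words.insert(i, words[i])
    else:
        j = rng.randrange(len(words))
        words[i], words[j] = words[j], words[i]
    return " ".join(words)


def fuzz_cases(seed_prompt, n, seed=0):
    """Generate n mutated prompts; a fixed seed makes the run reproducible."""
    rng = random.Random(seed)
    cases = []
    for _ in range(n):
        mutator = rng.choice([mutate_chars, mutate_words])
        cases.append(mutator(seed_prompt, rng))
    return cases


if __name__ == "__main__":
    for case in fuzz_cases("Summarize the following text in one sentence.", 5):
        print(repr(case))
```

In a test harness, each generated case would be sent to the model under test and the response checked against whatever failure criteria the tester defines. Seeding the generator keeps failing inputs reproducible.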
Collections
- Open access [43406]

