OuluREPO – Oulun yliopiston julkaisuarkisto / University of Oulu repository

From Reinvention to Reuse: An Empirical Example Study On Technical Debt Dataset

Rantala, Leevi; Mäntylä, Mika; Sridharan, Murali (2024-11-27)

 
File: nbnfioulu-202502071536.pdf (366.8 kB)
Note: the content will be made publicly available on 27.11.2025
URL:
https://doi.org/10.1007/978-3-031-78386-9_8

Rantala, Leevi
Mäntylä, Mika
Sridharan, Murali
Springer
27.11.2024

Rantala, L., Mäntylä, M.V., Sridharan, M. (2025). From Reinvention to Reuse: An Empirical Example Study on Technical Debt Dataset. In: Pfahl, D., Gonzalez Huerta, J., Klünder, J., Anwar, H. (eds) Product-Focused Software Process Improvement. PROFES 2024. Lecture Notes in Computer Science, vol 15452. Springer, Cham. https://doi.org/10.1007/978-3-031-78386-9_8

https://rightsstatements.org/vocab/InC/1.0/
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:oulu-202502071536
Abstract

Self-Admitted Technical Debt (SATD) is a subset of Technical Debt (TD) in which the developer leaves a comment in the source code, thus marking the place where debt has been taken on. Previous research on SATD relies either on the creation of new datasets or on the reuse of existing ones. One seminal SATD dataset, containing over 4,000 SATD comments classified into five TD categories, was published by Maldonado et al. [14]. Its drawback is the lack of any other information, e.g., static analysis results, which seriously limits its possible use cases. We remedy this situation by reforming the dataset: we combine the original comments with contextual information and static analysis of the source code and recreate the dataset as an SQLite database. Our reformed dataset contains over 13,000 files, nearly 14,000 classes, almost 100,000 methods, and over 650,000 code violation instances. The reformed dataset allows varied and detailed analyses in the future, which we demonstrate by examining the relationship of SATD comments to code violations. The results show that, on the method level, the most important predictors are the total number of code violations and the number of violations labelled as Priority 3 or belonging to the Documentation Rule Set. On the file level, LOC is an important predictor, alongside the number of violations that belong to the Documentation Rule Set or have a Priority 2 classification. Overall, our example study demonstrates the potential of reforming existing datasets.
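
To illustrate the kind of method-level analysis the abstract describes, the following is a minimal sketch in Python using the standard sqlite3 module. The table and column names (method, violation, satd_comment, priority, rule_set) and the toy rows are assumptions made for illustration only; they are not the schema or contents of the published database. Only the "Documentation" rule set and the Priority 3 label come from the abstract, and the other rule set and category names are placeholders.

```python
import sqlite3


def build_toy_db():
    """Create an in-memory database with a hypothetical, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE method (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE violation (
            id INTEGER PRIMARY KEY,
            method_id INTEGER REFERENCES method(id),
            priority INTEGER,
            rule_set TEXT
        );
        CREATE TABLE satd_comment (
            id INTEGER PRIMARY KEY,
            method_id INTEGER REFERENCES method(id),
            category TEXT
        );
        """
    )
    # Toy rows, invented for this sketch; they are not taken from the dataset.
    conn.executemany(
        "INSERT INTO method VALUES (?, ?)",
        [(1, "parse"), (2, "render"), (3, "close")],
    )
    conn.executemany(
        "INSERT INTO violation (method_id, priority, rule_set) VALUES (?, ?, ?)",
        [
            (1, 3, "Documentation"),
            (1, 3, "Design"),
            (1, 2, "Documentation"),
            (2, 3, "Code Style"),
        ],
    )
    conn.executemany(
        "INSERT INTO satd_comment (method_id, category) VALUES (?, ?)",
        [(1, "DESIGN")],
    )
    conn.commit()
    return conn


def method_features(conn):
    """Return one row per method: violation counts plus an SATD flag.

    The LEFT JOIN keeps methods without violations, and EXISTS avoids
    multiplying violation rows when a method has several SATD comments.
    """
    query = """
        SELECT m.name,
               COUNT(v.id) AS total_violations,
               COALESCE(SUM(v.priority = 3), 0) AS priority3,
               COALESCE(SUM(v.rule_set = 'Documentation'), 0) AS documentation,
               EXISTS (SELECT 1 FROM satd_comment s
                       WHERE s.method_id = m.id) AS has_satd
        FROM method m
        LEFT JOIN violation v ON v.method_id = m.id
        GROUP BY m.id
        ORDER BY m.id
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in method_features(build_toy_db()):
        print(row)
```

Each returned row could then serve as one observation in a predictive model, with has_satd as the target and the violation counts as predictors. The modelling step itself is left out here because the abstract does not specify it.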