Comparative study of classification, detection, and segmentation models for underground utility component identification
Koirala, Sharamsh (2025-06-09)
© 2025, Sharamsh Koirala. This Item is protected by copyright and/or related rights. You may use the Item in any way permitted by the copyright and related rights legislation that applies to your use. For any other use, you need permission from the rights holders.
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:oulu-202506094262
Abstract
The rapid expansion of underground utility networks, combined with growing regulatory and corporate demand for automated documentation, has spurred the development of solutions to meet this market need. This thesis presents a comparative evaluation of deep learning-based computer vision models for classifying, detecting, and segmenting underground utility components in Groundhawk Oy’s trench documentation application. A dataset of 5,200 images containing 8,305 objects across 18 classes was curated from images captured with Groundhawk devices. The images were annotated with multilabel classification tags, oriented bounding boxes, and polyline segmentations to support a comprehensive assessment of how reliably models predict underground utility components. The study evaluates 12 pre-trained models, including ResNet50, MobileNet, EfficientNet, Vision Transformer, YOLOv8 and its variants, Faster R-CNN, RetinaNet, SSD, and Mask R-CNN, across these three computer vision tasks. All models were trained with hyperparameters selected through Optuna-based trials that minimized average validation loss. After training, the models were exported to the standard ONNX format so that they could be benchmarked uniformly using standardized evaluation metrics such as mAP, F1 score, and inference latency. The results highlight a tradeoff between accuracy and computational efficiency: more complex models achieved higher precision at the cost of inference speed, while low-complexity models offered potential for real-time deployment with reduced precision. Despite limited data and deployment challenges associated with exporting YOLO-based models to ONNX format, the findings offer actionable insights into selecting models for embedded, field-deployable utility documentation.
The results point to the potential of ensembling techniques and of a hybrid deployment strategy in which smaller models run on edge devices and larger models operate in the cloud. The implementation of reproducible benchmarking pipelines, tailored to real-world constraints, serves as a stepping stone towards bridging the gap between academic research and production-grade machine vision systems.
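The abstract names F1 score and inference latency among the uniform benchmarking metrics. As a rough illustration only, and not the thesis's actual pipeline, the sketch below shows how such measurements could be computed in plain Python. The function names `benchmark_latency` and `micro_f1` are hypothetical, the `predict` argument is an assumed stand-in for a call into an exported model (for example, an ONNX inference session), and micro-averaging is an assumption, since the abstract does not state which F1 variant was used.

```python
import statistics
import time
from typing import Callable, Sequence


def benchmark_latency(predict: Callable[[object], object], sample: object,
                      warmup: int = 3, runs: int = 20) -> dict:
    """Time repeated calls to `predict` and summarize latency in milliseconds.

    `predict` is a hypothetical stand-in for one inference call on an
    exported model; it is not taken from the thesis.
    """
    # Warm-up calls are excluded from timing to avoid one-off startup costs.
    for _ in range(warmup):
        predict(sample)

    timings_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        predict(sample)
        timings_ms.append((time.perf_counter() - start) * 1000.0)

    return {
        "mean_ms": statistics.mean(timings_ms),
        "median_ms": statistics.median(timings_ms),
        "runs": runs,
    }


def micro_f1(y_true: Sequence[Sequence[int]],
             y_pred: Sequence[Sequence[int]]) -> float:
    """Micro-averaged F1 over binary multilabel indicator rows (assumed variant)."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    # Return 0.0 when there are no positives at all, to avoid dividing by zero.
    return 2 * tp / denom if denom else 0.0
```

Timing every model through the same harness is what makes latency figures comparable across architectures; the warm-up count and number of runs here are arbitrary illustrative defaults.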
Collections
- Open access [38329]