OuluREPO – Oulun yliopiston julkaisuarkisto / University of Oulu repository

Evaluating Text Summarization Techniques and Factual Consistency with Language Models

Islam, Md Moinul; Muhammad, Usman; Oussalah, Mourad (2025-01-16)

 

URL:
https://doi.org/10.1109/BigData62323.2024.10826032

IEEE
16.01.2025

M. M. Islam, U. Muhammad and M. Oussalah, "Evaluating Text Summarization Techniques and Factual Consistency with Language Models," 2024 IEEE International Conference on Big Data (BigData), Washington, DC, USA, 2024, pp. 116-122, doi: 10.1109/BigData62323.2024.10826032.

https://rightsstatements.org/vocab/InC/1.0/
© 2025 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:oulu-202505263918
Abstract

Standard evaluation of automated text summarization (ATS) methods relies on manually crafted golden summaries. With the advances in Large Language Models (LLMs), it is reasonable to ask whether these models can now complement or replace human-crafted summaries. This study examines how effectively several language models (LMs) preserve factual consistency. Through a thorough assessment of conventional and state-of-the-art performance metrics, such as ROUGE, BLEU, BERTScore, FActScore, and LongDocFACTScore, across diverse datasets, our findings highlight the important relationship between linguistic eloquence and factual accuracy. The findings suggest that although LLMs such as GPT and LLaMA demonstrate considerable competence in producing concise and contextually aware summaries, difficulties remain in ensuring factual accuracy, particularly in domain-specific situations. Moreover, this work adds to the existing knowledge on summarization dynamics and highlights the need to develop more reliable and tailored evaluation techniques that minimize the probability of factual errors in text generated by ATS. In particular, the findings advance the current domain by providing a rigorous assessment of the balance between linguistic fluency and factual correctness, highlighting the limitations of current ATS frameworks and metrics in order to enhance the factual reliability of LM-generated summaries.
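
To make the gap between fluency-oriented overlap scores and factual accuracy concrete, the following is a minimal sketch of a simplified ROUGE-1 F1 (unigram overlap on lowercased whitespace tokens, without stemming). It is not the evaluation code used in the paper; the function name `rouge1_f1` and the three example sentences are hypothetical and were invented for illustration.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap on lowercased whitespace tokens."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: a token counts at most as often as it occurs in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical sentences (assumption: not taken from any dataset in the paper).
reference = "the trial enrolled 120 patients and reported no serious side effects"
faithful = "the trial enrolled 120 patients and found no serious side effects"
unfaithful = "the trial enrolled 120 patients and reported many serious side effects"

score_faithful = rouge1_f1(reference, faithful)
score_unfaithful = rouge1_f1(reference, unfaithful)

print(f"faithful:   {score_faithful:.3f}")
print(f"unfaithful: {score_unfaithful:.3f}")
```

Under these assumptions both candidates share 10 of 11 tokens with the reference, so they receive the same score even though the second one contradicts the reference. This is the kind of blind spot that motivates pairing overlap metrics with factuality-oriented ones.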