Aiding Software Root Cause Analysis with Large Language Models: Evaluation of the Effectiveness of Fine-tuned T5, GPT, and RAG in Handling Customer Fault Reports
2025 (English) Independent thesis, Advanced level (degree of Master (Two Years)), 80 credits / 120 HE credits
Student thesis
Abstract [en]
Software systems generate a substantial number of fault reports during pre-deployment customer testing, making manual root cause analysis (RCA) both time-consuming and error-prone. This study explores the use of large language models (LLMs)—specifically T5, GPT-2, and a retrieval-augmented generation (RAG) model—to automate and enhance the RCA process in a domain-specific software engineering setting. Using a curated dataset of real-world fault descriptions and resolutions, the models were fine-tuned and evaluated using BLEU-4, ROUGE, and BERT-based semantic similarity metrics. Results indicate that T5 outperforms GPT-2 in lexical and structural fidelity (e.g., BLEU-4: 0.1810 vs. 0.1210), while RAG achieves the highest semantic similarity (BERT score: 0.7715). These findings suggest that combining T5’s precision in technical phrasing with RAG’s contextual understanding may offer a promising direction for developing intelligent RCA assistance tools that improve both accuracy and relevance in software fault diagnosis. Future work will focus on hybrid model optimization and user-centered system integration for real-world engineering workflows.
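The lexical evaluation described above can be illustrated with a minimal, self-contained BLEU-4 sketch. This is not the thesis's actual evaluation code (which likely used a library implementation such as NLTK's or SacreBLEU's); the add-one smoothing on the n-gram precisions is one of several common variants, chosen here so that short fault descriptions with no 4-gram overlap do not collapse to a score of zero:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu4(candidate, reference):
    """Sentence-level BLEU-4: geometric mean of smoothed modified
    1- to 4-gram precisions, scaled by a brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, 5):
        c_counts, r_counts = ngrams(cand, n), ngrams(ref, n)
        overlap = sum((c_counts & r_counts).values())   # clipped matches
        total = max(sum(c_counts.values()), 1)
        # add-one smoothing: a missing n-gram order dampens, not zeroes, the score
        precisions.append((overlap + 1) / (total + 1))
    # brevity penalty punishes candidates shorter than the reference
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / 4)
```

For example, a candidate resolution identical to the reference scores 1.0, while a partially matching one lands strictly between 0 and 1, which is the regime in which the reported T5 vs. GPT-2 difference (0.1810 vs. 0.1210) is interpreted.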
Place, publisher, year, edition, pages
2025, p. 49
Keywords [en]
software development, fault reports, root cause analysis (RCA), large language model (LLM), hybrid dataset, supervised data, unsupervised data, prototype, decision support, scalability
HSV category
Identifiers
URN: urn:nbn:se:mau:diva-82407
OAI: oai:DiVA.org:mau-82407
DiVA, id: diva2:2034314
Educational programme
TS Computer Science: Applied Data Science
Supervisor
Examiner
Available from: 2026-02-02 Created: 2026-02-01 Last updated: 2026-02-02 Bibliographically approved