Research Publications

Embargo
Evaluating Large Language Models for Software Testing: A Systematic Review of Metrics and Practices
(Springer Science and Business Media Deutschland GmbH, 2026) Perera V.I.T; De Silva D.I.
Recent advancements in Large Language Models (LLMs) present substantial potential for revolutionizing software testing practices, particularly through automated test case generation. This review synthesizes contemporary research on LLM-driven software testing methods, with a specific focus on evaluation metrics. A systematic literature review was conducted using databases such as IEEE Xplore, ResearchGate, and Google Scholar, targeting literature on LLM-based test case generation published between 2020 and 2024. Selection criteria included relevance to automated testing and practical application insights. The review analyzes 15 key studies that span multiple test domains. The key findings reveal significant advancements in using LLMs for diverse testing types, including unit, property-based, security, and user acceptance testing. Despite substantial benefits, issues such as test case validity, reliability, and prompt engineering complexity remain challenging. The review concludes with recommendations for developing a standardized, metric-driven evaluation framework for better assessing LLM-generated tests. This approach aims to measure and optimize the practical utility and reliability of LLM-generated software tests, ultimately guiding future research directions and improving adoption within the software industry. The key contribution of this review is a comprehensive, metric-focused evaluation of LLM-driven software testing techniques, offering a foundation for developing standardized evaluation methodologies and practical testing frameworks.
Open Access
Evaluating the impact of Large Language Models on problem-solving skills in programming debugging of IT undergraduates
(Taylor and Francis Ltd., 2026) Riztha, F; Wickramarachchi, R; Asanka, P P G D; Dissanayake, M. A
This study investigates the impact of Large Language Models (LLMs) on problem-solving skills in source code debugging among IT undergraduates. A pre-, mid-, and post-experimental design was employed, including pre-test, mid-test, post-test (Prior), and post-test (Recent) phases to assess debugging performance with and without LLM assistance. The sample consisted of 87 students from the Department of Industrial Management, University of Kelaniya, Sri Lanka, stratified by gender, academic level, A/L stream, Z-score, and GPA. Results showed significant improvement in debugging accuracy, increasing from 46.53% in the pre-test to 69.51% in the post-test (Prior), indicating skill retention. Task efficiency also improved, with completion time reduced from 18 minutes to 10 minutes. However, transferability to new problems was moderate, with a post-test (Recent) accuracy of 58.40%. Higher academic levels, technical A/L streams, and mid-range GPAs were associated with better retention and adaptability. While LLMs enhanced immediate performance, the findings highlight the need to balance their use with independent practice to support long-term skill development. Limitations include resource constraints and short study duration, suggesting the need for longitudinal research. The study recommends structured integration of LLMs to optimize programming education outcomes.
Open Access
Leveraging LLMs for Dynamic Content Generation and Creating Contextual Quizzes to Enhance Learning Outcomes in Personalized Education
(School of Education, Faculty of Humanities and Sciences, SLIIT, 2025-10-10) Wanigasekara, S; Gunawardhane, K; Kumara, S
This study developed an adaptive learning system based on LLMs and retrieval-augmented generation (RAG) to deliver customized educational content. The system differs from traditional LLM-based educational software by accepting complete lecture materials, which ensures that quiz responses and feedback match the specific content of the current course. The application dynamically retrieves relevant content from lecture slides to provide focused, structured learning that goes beyond standardized, pre-trained responses. In the system architecture, Pinecone serves as the vector database for semantic content retrieval, and OpenAI's GPT provides natural language generation. The educational materials are processed with Sentence Transformers to create semantic embeddings that enable both precise content retrieval and contextual adjustment.
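
The retrieve-then-generate flow described in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: a toy bag-of-words vector stands in for Sentence Transformers embeddings, an in-memory list stands in for the Pinecone index, and the prompt is only built, not sent to GPT. All function names and the sample slide texts are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (assumed stand-in for a Sentence Transformers embedding)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, top_k=1):
    """Return the top_k chunks most similar to the query (stand-in for a vector database query)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


def build_quiz_prompt(topic, chunks, top_k=1):
    """Assemble a prompt that grounds quiz generation in the retrieved lecture content."""
    context = "\n".join(retrieve(topic, chunks, top_k))
    return (
        "Using only the lecture content below, write one quiz question about: "
        f"{topic}\n\nLecture content:\n{context}"
    )


# Hypothetical lecture-slide chunks, for illustration only.
slides = [
    "A binary search tree keeps keys in sorted order for fast lookup",
    "TCP provides reliable ordered delivery of a byte stream",
    "Normalization reduces redundancy in relational database tables",
]

if __name__ == "__main__":
    print(build_quiz_prompt("reliable delivery in TCP", slides))
```

In a system like the one described, the prompt returned by `build_quiz_prompt` would be passed to the language model, so the generated quiz stays tied to the uploaded course material rather than to the model's general knowledge.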
