Dr. Tamim Ahmed Khan
PhD Theme/Topic: Software Fault Prediction Using Large-Language Models
Supervisor: Dr. Tamim Ahmed Khan, Sr. Professor
Contact #: 0346-5340804
Email:
Campus/School/Dept: BSEAS H-11/SE
RAC Approved Supervisor for Research Areas:
Supervisory Record:
PhD Produced: 04
PhD Enrolled: 01
MS/MPhil Produced: 61
MS/MPhil Enrolled: 04
Topic Brief Description:
Software Fault Prediction (SFP) helps improve software reliability by identifying fault-prone components early in development. Traditional machine learning approaches often provide limited interpretability, while recent Large Language Models (LLMs) remain underexplored for specialized software engineering tasks. We plan to introduce a RAG-Fault Predictor framework that integrates Chidamber and Kemerer (CK) metrics with the Qwen2.0-0.5B-Instruct model using Retrieval-Augmented Generation (RAG) and parameter-efficient fine-tuning. We will also investigate a specialized dataset of Java code, CK metrics, fault labels, and natural language explanations to enhance both predictive accuracy and interpretability.
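To make the retrieval step concrete, the following is a minimal sketch of how CK metrics could drive retrieval and prompt construction. It is not the proposed framework itself: the example classes, their metric values, the helper names (`retrieve`, `build_prompt`), and the choice of min-max scaling with Euclidean distance are all assumptions made for illustration, and the call to the language model is left out.

```python
import math

# Hypothetical CK metric order and labelled examples (values invented for illustration).
CK_NAMES = ["WMC", "DIT", "NOC", "CBO", "RFC", "LCOM"]
KNOWN = [
    {"name": "OrderService", "ck": [45, 2, 0, 18, 90, 120], "faulty": True},
    {"name": "Money", "ck": [8, 1, 0, 2, 12, 3], "faulty": False},
    {"name": "ReportBuilder", "ck": [30, 3, 1, 14, 70, 60], "faulty": True},
    {"name": "Point", "ck": [5, 1, 0, 1, 9, 0], "faulty": False},
]


def scale(vec, lo, hi):
    """Min-max scale each metric to [0, 1] so no single metric dominates the distance."""
    return [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(vec, lo, hi)]


def retrieve(query_ck, corpus, k=2):
    """Return the k labelled classes whose scaled CK vectors are closest to the query."""
    rows = [ex["ck"] for ex in corpus] + [query_ck]
    columns = list(zip(*rows))
    lo = [min(c) for c in columns]
    hi = [max(c) for c in columns]
    q = scale(query_ck, lo, hi)
    ranked = sorted(corpus, key=lambda ex: math.dist(q, scale(ex["ck"], lo, hi)))
    return ranked[:k]


def build_prompt(java_source, query_ck, neighbours):
    """Assemble a metric-grounded prompt; the model call itself is out of scope here."""
    def fmt(ck):
        return ", ".join(f"{n}={v}" for n, v in zip(CK_NAMES, ck))

    lines = ["Similar classes with known outcomes:"]
    for ex in neighbours:
        label = "faulty" if ex["faulty"] else "clean"
        lines.append(f"- {ex['name']} ({fmt(ex['ck'])}): {label}")
    lines.append("")
    lines.append(f"Class under review ({fmt(query_ck)}):")
    lines.append(java_source)
    lines.append("Is this class fault-prone? Explain with reference to the metrics.")
    return "\n".join(lines)


query = [40, 2, 0, 16, 85, 100]
neighbours = retrieve(query, KNOWN, k=2)
prompt = build_prompt("class Checkout { /* ... */ }", query, neighbours)
```

In this sketch the retrieved neighbours act as labelled context for the model, which is one plausible way to ground an explanation in the metrics; a real system would more likely retrieve over learned embeddings of code and descriptions.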
Research Objectives/Deliverables:
- To design and construct an enhanced dataset by augmenting traditional CK metrics with code patterns and metric-grounded natural language descriptions.
- To use this dataset to develop and evaluate an LLM-based RAG framework for generating accurate software fault predictions with calibrated confidence scores.
- To develop and validate an LLM and RAG-based model that uses NLP for accurate, explainable software fault prediction, delivering calibrated confidence scores and actionable natural language insights.
Research Questions:
- How can traditional CK metrics-based datasets be enhanced with metric-grounded natural language descriptions and code patterns for use in an LLM-based RAG framework for fault prediction?
- How can we develop an LLM and RAG-based model that uses NLP to combine high fault prediction accuracy with explainability?
- How can we demonstrate the effectiveness of such an LLM for automated test case and test oracle preparation?
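The objectives mention calibrated confidence scores. As a rough illustration of how calibration could be checked, the sketch below computes the Brier score and a binned calibration error on predicted fault probabilities. The function names and the bin count are assumptions for this example, and the document does not state which calibration measure the project will use.

```python
def brier_score(probs, labels):
    """Mean squared gap between predicted fault probability and the 0/1 fault label."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def binned_calibration_error(probs, labels, n_bins=5):
    """Weighted average, over probability bins, of |mean predicted probability - observed fault rate|."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, y))
    total = len(probs)
    error = 0.0
    for members in bins:
        if not members:
            continue
        mean_p = sum(p for p, _ in members) / len(members)
        fault_rate = sum(y for _, y in members) / len(members)
        error += len(members) / total * abs(mean_p - fault_rate)
    return error
```

Lower values are better for both measures. A model that predicts 0.9 for classes that turn out to be clean scores poorly on both, which is the kind of overconfidence calibration is meant to expose.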
Candidate’s Eligibility Profile:
- The applicant must have an MS/MPhil/equivalent degree in software engineering or computer science with a CGPA > 3.0. In addition, applicants must have a strong background in mathematics, optimization theory, and related fields.
- Experience with programming languages such as Python is advantageous. Candidates should have excellent communication skills to actively contribute to team research efforts.
- Proficiency in spoken and written English is essential. We value independence and responsibility while promoting teamwork and collaboration among colleagues.