Detecting inaccurate stack traces in bug reports
Bheree, Meher K.
University of Lethbridge. Faculty of Arts and Science
Lethbridge, Alta. : University of Lethbridge, Dept. of Mathematics and Computer Science
The generally held opinion in the software engineering community is that incorrect information in bug reports is mostly found in non-structural fields such as bug descriptions and steps to reproduce. However, structural information such as software stack traces can also be inaccurate, which increases project costs through time wasted in fixing faults. There is little empirical evidence on how often inaccurate stack traces occur in bug reports. We therefore seek to provide such evidence by conducting an empirical study of bug reports containing stack traces from the Eclipse and Apache projects. We propose an approach that classifies stack traces as either "Accurate" or "Inaccurate" by comparing the file names found in a stack trace in a bug report with the corresponding commit history for its fix. With this approach, we determine how often inaccurate stack traces occur and identify the exception types that appear most frequently in the inaccurate stack traces for each project. Finally, we investigate training three supervised machine learning algorithms (Naive Bayes, Support Vector Machines and Logistic Regression) on features extracted from stack traces to create a recommender that labels stack traces in bug reports as either "Accurate" or "Inaccurate". The Logistic Regression algorithm was found to perform best, with an F1-score of up to 87% for the investigated Eclipse projects and 96% for the investigated Apache projects.
Stack traces, Bug reports, Bug report quality, Crash reports, Software debugging, Inaccurate information, Exceptions, Recommender creation
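The labeling step described in the abstract, comparing file names in a stack trace with the files changed by the fix, can be illustrated with a minimal Python sketch. The abstract does not state the exact matching criterion, so the rule used here (a trace counts as "Accurate" if at least one of its file names appears among the files touched by the fix commit) is an assumption, as are the function names `files_in_trace` and `label_trace` and the restriction to Java-style frames.

```python
import re

# Matches the "(FileName.java:123)" part of a Java stack frame.
# Frames without a source file, such as "(Native Method)", are ignored.
FRAME_FILE = re.compile(r"\(([\w$]+\.java):\d+\)")


def files_in_trace(trace: str) -> set:
    """Return the set of source file names mentioned in a stack trace."""
    return set(FRAME_FILE.findall(trace))


def label_trace(trace: str, fixed_paths: list) -> str:
    """Label a trace by comparing its files with the files changed in the fix.

    Assumed rule (not taken from the thesis): the trace is "Accurate" if it
    shares at least one file name with the fix commit, else "Inaccurate".
    """
    # Reduce each changed path to its base file name before comparing.
    fixed_files = {p.replace("\\", "/").rsplit("/", 1)[-1] for p in fixed_paths}
    return "Accurate" if files_in_trace(trace) & fixed_files else "Inaccurate"
```

For example, a trace containing the frame `at org.example.Foo.bar(Foo.java:42)` would be labeled "Accurate" against a fix that changed `src/org/example/Foo.java`, and "Inaccurate" against a fix that changed only unrelated files. Comparing base file names only is a simplification: two classes with the same file name in different packages would be treated as the same file.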