Conference Paper


The consolidated tree construction algorithm in imbalanced defect prediction datasets

Abstract

In this short paper, we compare well-known rule/tree classifiers in software defect prediction with the CTC (consolidated tree construction) decision tree classifier, which was designed to deal with class imbalance. It is well known that most software defect prediction datasets are highly imbalanced (non-defective instances outnumber defective ones). In this work, we focused only on tree/rule classifiers because they are capable of explaining their decisions, i.e., describing the metrics and thresholds that make a module error-prone. Furthermore, rules and decision trees have the advantage of being easily understood and applied by project managers and quality assurance personnel. The CTC algorithm was designed to cope with class imbalance and noisy datasets rather than relying on preprocessing techniques (oversampling or undersampling), ensembles, or misclassification cost weights. The experimental work was carried out using the NASA datasets, and the results showed that the induced CTC decision trees performed better than, or similarly to, the other rule/tree classifiers.

Authors

Ibarguren-Arrieta, I
Perez, J
Muguerza, J
Rodriguez, D
Harrison, R

Oxford Brookes departments

Faculty of Technology, Design and Environment\Department of Computing and Communication Technologies

Dates

Year of publication: 2017
Date of RADAR deposit: 2017-04-26



© 2017 IEEE


Related resources

This RADAR resource is the Accepted Manuscript of The consolidated tree construction algorithm in imbalanced defect prediction datasets

Details

  • Owner: Rosa Teira Paz
  • Collection: Outputs
  • Version: 1
  • Status: Live