An update of the J48Consolidated WEKA’s class: CTC algorithm enhanced with the notion of coverage
Date
2016-02-12Author
Ibarguren Arrieta, Igor
Muguerza Rivero, Javier Francisco
Metadata
Show full item recordAbstract
This document aims to describe an update of the implementation of the J48Consolidated class within WEKA platform. The J48Consolidated class implements the CTC algorithm [2][3] which builds a unique decision tree based on a set of samples.
The J48Consolidated class extends WEKA’s J48 class which implements the well-known C4.5 algorithm. This implementation was described in the technical report "J48Consolidated: An implementation of CTC algorithm for WEKA".
The main, but not only, change in this update is the integration of the notion of coverage in order to determine the number of samples to be generated to build a consolidated tree. We define coverage as the percentage of examples of the training sample present in –or covered by– the set of generated subsamples. So, depending on the type of samples that we use, we will need more or less samples in order to achieve a specific value of coverage.