Information
Gain

Gain
Ratio

Delta

 

Initial
Accuracy

Pruned Accuracy

Delta

brcancer.dat

m

73.5483

69.2472

4.301

 

66.7741

66.0214

-0.7526

s

4.2166

2.9643

3.4378

 

 4.4895

 4.0614

 1.831

census.dat

m

63.6936

75.9759

-12.2822

 

74.8047

73.6335

-1.1711

s

2.2403

2.193

1.9536

 

 3.3182

 3.3493

 1.6895

credit.dat

m

56.4219

74.6788

-18.2568

 

78.1651

80.1834

2.0183

s

4.0281

4.6166

7.3801

 

 1.9244

 1.5259

 1.3188

diabetes.dat

m

56.5234

63.7109

-7.1874

 

59.0624

56.3671

-2.6953

s

2.4981

3.4521

2.8119

 

 5.9606

 4.5369

 3.4026

glass.dat

m

72.361

67.361

4.9999

 

66.5277

64.3055

-2.2221

s

3.8428

6.5227

6.4549

 

 4.0598

 4.9017

 3.1535

heart.dat

m

57.4746

58.9898

-1.5151

 

61.414

59.3938

-2.0201

s

2.584

5.557

4.885

 

 7.8502

 7.4653

 4.2589

iris.dat

m

95

93.8

1.2

 

94.6

94.2

-0.4

s

2.8674

3.5839

2.3475

 

 3.4058

 4.3665

 2.7968

promoters.dat

m

70.5555

74.9999

-4.4443

 

73.8888

76.9444

3.0555

s

4.9344

6.6769

6.7025

 

 6.3071

 4.3528

 4.0253

rotate.dat

m

85.105

85.105

0

 

86.3362

86.126

-0.2101

s

1.5259

1.5259

0

 

 1.4173

 1.4901

 0.6178

voting.dat

m

95.2413

95.1723

0.0689

 

94.4827

93.6551

-0.8275

s

2.1917

1.7507

1.3185

 

 1.7806

 1.9722

 1.9721

 


Discussion

 

While for some databases the current implementation of decision trees performs relatively well (with accuracies over 90%) it is also true that for some other databases presented decreased performance. Comparing the results with my teammate and with other students in the class, it was obvious that for some databases like (credit.dat) the performance of multi-way branching on continuous attributes was lower than binary branching. It appears that the Gain Ratio splitting criterion alleviates the situation, but still does not compensate completely.

 

Additionally, pruning has less of an impact in the current implementation because of the use of multi-way branching. A result of multi-way branching, trees tend to be wide and shallow, with few nodes. Therefore, pruning a node implies removing many branches, especially in the case of continuous attributes. Therefore pruning a node implies removing many splits and merging their examples. In a binary split architecture, pruning only means merging two of the possible values into a single node. Also, in a binary architecture splits on an attribute with multiple values, might be distributed all over the three. However, in multi-way branching, attributes tend to be split only once in a single place of the tree. This is even more notorious on continuous attributes than on discrete.

 

Although the decision tree effectively learns different concepts, it appears that has relevant drawbacks. An interesting improvement on the current implementation would be to support binary splits, such that it could be evaluated the results of pruning, using binary splitting compared to multi-way splitting.