Decision Tree Algorithm

PUBLISHED: MAY 2, 2026 · 2 MIN READ


Divya Sachan


Steps to Construct a Decision Tree

  1. Place the best feature (attribute) of the dataset at the root of the tree.
  2. Split the training set into subsets, where each subset contains data with the same value for a feature.
  3. Repeat steps 1 and 2 recursively for each subset until:
    • All branches lead to leaf nodes.
    • Leaf nodes contain class labels (decisions).

Popular decision tree algorithms include:

  • ID3 (Iterative Dichotomiser 3) – Developed by Ross Quinlan
  • C4.5 – Successor to ID3, also developed by Ross Quinlan
  • CART (Classification and Regression Trees)
  • OneR (One Rule Algorithm) – Developed by Robert Holte
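
In practice you rarely hand-roll this loop. As a quick illustration, here is a minimal sketch using scikit-learn (a library choice of mine; the article itself names no library): growing a tree with the entropy criterion follows the same greedy, recursive procedure described above.

```python
# Minimal sketch: growing a decision tree with scikit-learn
# (library choice is an assumption; the article names no library).
# criterion="entropy" makes splits by information gain, ID3-style.
from sklearn.tree import DecisionTreeClassifier, export_text

X = [[0, 0], [0, 1], [1, 0], [1, 1]]  # toy data: two binary features
y = [0, 1, 1, 0]                      # XOR-style labels

clf = DecisionTreeClassifier(criterion="entropy").fit(X, y)
print(export_text(clf, feature_names=["f0", "f1"]))  # text view of the tree
```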

📌 The ID3 Algorithm (based on Information Gain)

Used when:

  • There are two class labels, "+" (positive) and "−" (negative)

🔤 Notation

Symbol | Description
S      | Set of examples (training data)
C      | Set of class labels {+, −}
F      | Set of features (attributes)
A      | A feature in F
V(A)   | Set of possible values of feature A
v      | A single value from V(A)
Sᵥ     | Subset of S where A = v
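
With this notation in place, the two standard quantities behind ID3 can be written out; the worked calculation later in this post instantiates exactly these definitions:

Entropy(S) = -\sum_{c \in C} p_c \log_2 p_c

Gain(S, A) = Entropy(S) - \sum_{v \in V(A)} \frac{|S_v|}{|S|} \cdot Entropy(S_v)

where p_c is the fraction of examples in S carrying label c.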

🔁 ID3 Algorithm – Steps

  1. Create a root node for the tree.
  2. If all examples in S are positive, return a leaf node labeled "+".
  3. If all examples in S are negative, return a leaf node labeled "−".
  4. If there are no features left, return a leaf node with the most common class label in S.
  5. Else:
    • Choose the feature A with the highest Information Gain.
    • Assign A to the root node.
    • For each value v in V(A):
      • Create a branch below the root labeled A = v.
      • If Sᵥ is empty, add a leaf node with the most common label in S.
      • Else, recursively apply the algorithm to Sᵥ with features F \ {A}.
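
These steps translate almost line-for-line into code. Below is a minimal Python sketch of the recursion (function and field names are mine, for illustration only):

```python
from collections import Counter
from math import log2

def entropy(examples, target):
    """Entropy of the class-label distribution over a list of example dicts."""
    total = len(examples)
    counts = Counter(ex[target] for ex in examples)
    return -sum((c / total) * log2(c / total) for c in counts.values())

def info_gain(examples, feature, target):
    """Entropy(S) minus the size-weighted entropy of each subset S_v."""
    total = len(examples)
    remainder = 0.0
    for v in {ex[feature] for ex in examples}:
        subset = [ex for ex in examples if ex[feature] == v]
        remainder += len(subset) / total * entropy(subset, target)
    return entropy(examples, target) - remainder

def id3(examples, features, target):
    """Recursively build a tree as nested dicts; leaves are class labels."""
    labels = [ex[target] for ex in examples]
    if len(set(labels)) == 1:          # steps 2-3: pure set -> leaf
        return labels[0]
    if not features:                   # step 4: out of features -> majority leaf
        return Counter(labels).most_common(1)[0][0]
    best = max(features, key=lambda f: info_gain(examples, f, target))
    tree = {best: {}}                  # step 5: split on the best feature
    for v in {ex[best] for ex in examples}:
        subset = [ex for ex in examples if ex[best] == v]
        tree[best][v] = id3(subset, [f for f in features if f != best], target)
    return tree
```

Because the loop only visits values that actually occur in the current examples, the empty-Sᵥ case of step 5 never arises in this sketch; a fuller implementation would also store a majority-label default for values unseen during training.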

🧪 Training Dataset

Day | OUTLOOK  | TEMP | HUMIDITY | WIND   | PLAY TENNIS
D1  | Sunny    | Hot  | High     | Weak   | No
D2  | Sunny    | Hot  | High     | Strong | No
D3  | Overcast | Hot  | High     | Weak   | Yes
D4  | Rain     | Mild | High     | Weak   | Yes
D5  | Rain     | Cool | Normal   | Weak   | Yes
D6  | Rain     | Cool | Normal   | Strong | No
D7  | Overcast | Cool | Normal   | Strong | Yes
D8  | Sunny    | Mild | High     | Weak   | No
D9  | Sunny    | Cool | Normal   | Weak   | Yes
D10 | Rain     | Mild | Normal   | Weak   | Yes
D11 | Sunny    | Mild | Normal   | Strong | Yes
D12 | Overcast | Mild | High     | Strong | Yes
D13 | Overcast | Hot  | Normal   | Weak   | Yes
D14 | Rain     | Mild | High     | Strong | No
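
To follow the calculations below in code, the same table can be written as a small Python structure (the field names are my own):

```python
# The play-tennis table above, one dict per day, keyed by feature name.
dataset = [
    {"Outlook": "Sunny",    "Temp": "Hot",  "Humidity": "High",   "Wind": "Weak",   "Play": "No"},
    {"Outlook": "Sunny",    "Temp": "Hot",  "Humidity": "High",   "Wind": "Strong", "Play": "No"},
    {"Outlook": "Overcast", "Temp": "Hot",  "Humidity": "High",   "Wind": "Weak",   "Play": "Yes"},
    {"Outlook": "Rain",     "Temp": "Mild", "Humidity": "High",   "Wind": "Weak",   "Play": "Yes"},
    {"Outlook": "Rain",     "Temp": "Cool", "Humidity": "Normal", "Wind": "Weak",   "Play": "Yes"},
    {"Outlook": "Rain",     "Temp": "Cool", "Humidity": "Normal", "Wind": "Strong", "Play": "No"},
    {"Outlook": "Overcast", "Temp": "Cool", "Humidity": "Normal", "Wind": "Strong", "Play": "Yes"},
    {"Outlook": "Sunny",    "Temp": "Mild", "Humidity": "High",   "Wind": "Weak",   "Play": "No"},
    {"Outlook": "Sunny",    "Temp": "Cool", "Humidity": "Normal", "Wind": "Weak",   "Play": "Yes"},
    {"Outlook": "Rain",     "Temp": "Mild", "Humidity": "Normal", "Wind": "Weak",   "Play": "Yes"},
    {"Outlook": "Sunny",    "Temp": "Mild", "Humidity": "Normal", "Wind": "Strong", "Play": "Yes"},
    {"Outlook": "Overcast", "Temp": "Mild", "Humidity": "High",   "Wind": "Strong", "Play": "Yes"},
    {"Outlook": "Overcast", "Temp": "Hot",  "Humidity": "Normal", "Wind": "Weak",   "Play": "Yes"},
    {"Outlook": "Rain",     "Temp": "Mild", "Humidity": "High",   "Wind": "Strong", "Play": "No"},
]
```

Combined with the `id3` sketch above, `id3(dataset, ["Outlook", "Temp", "Humidity", "Wind"], "Play")` carries out the construction that the rest of this post works through by hand.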

🔧 Step 1: Calculate Initial Entropy of the Dataset

  • Total = 14
  • Yes = 9, No = 5
Entropy(S) = -\left(\frac{9}{14}\right)\log_2\left(\frac{9}{14}\right) - \left(\frac{5}{14}\right)\log_2\left(\frac{5}{14}\right) = 0.940

Entropy(S) = 0.940
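
A quick check of this number in plain Python, assuming nothing beyond the 9/5 split counted above:

```python
from math import log2

# 9 "Yes" and 5 "No" out of 14 examples
p_yes, p_no = 9 / 14, 5 / 14
entropy_S = -p_yes * log2(p_yes) - p_no * log2(p_no)
print(round(entropy_S, 3))  # 0.94
```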

🔍 Step 2: Calculate Information Gain for Each Attribute

Attribute 1: OUTLOOK (Sunny, Overcast, Rain)

Sunny: S_sunny = [No, No, No, Yes, Yes]

E(S_{sunny}) = -\frac{3}{5}\log_2\frac{3}{5} - \frac{2}{5}\log_2\frac{2}{5}
= -(0.6 \times -0.737) - (0.4 \times -1.322)
= 0.971

Overcast: S_overcast = [Yes, Yes, Yes, Yes]

Pure node, so:

E(S_{overcast}) = 0

Rain: S_rain = [Yes, Yes, Yes, No, No]

E(S_{rain}) = -\frac{3}{5}\log_2\frac{3}{5} - \frac{2}{5}\log_2\frac{2}{5}
= -(0.6 \times -0.737) - (0.4 \times -1.322)
= 0.971

Gain(Outlook) = Entropy(S) - \frac{|S_{sunny}|}{|S|}\cdot Entropy(S_{sunny}) - \frac{|S_{overcast}|}{|S|}\cdot Entropy(S_{overcast}) - \frac{|S_{rain}|}{|S|}\cdot Entropy(S_{rain})
= 0.94 - \left(\frac{5}{14}\cdot 0.971 + \frac{4}{14}\cdot 0 + \frac{5}{14}\cdot 0.971\right)
= 0.2464

Gain(Outlook) = 0.2464
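
The same arithmetic as a self-contained snippet, using the (Yes, No) counts read off the table (a sketch; the variable names are mine):

```python
from math import log2

def entropy(pos, neg):
    """Binary entropy from raw class counts; empty classes contribute 0."""
    total = pos + neg
    return -sum((c / total) * log2(c / total) for c in (pos, neg) if c)

# (Yes, No) counts: whole dataset, then per OUTLOOK value
S = (9, 5)
subsets = {"Sunny": (2, 3), "Overcast": (4, 0), "Rain": (3, 2)}

gain = entropy(*S) - sum(
    (p + n) / sum(S) * entropy(p, n) for p, n in subsets.values()
)
print(round(gain, 4))  # 0.2467; the 0.2464 above comes from rounding Entropy(S) to 0.94
```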