Lecture 3: Structured Data Processing (Part I)
(Last updated: Jan 27, 2026)
This lecture explains the theory behind the Decision Tree and Random Forest models used in the structured data processing module.
Check the GenAI usage policy if you are using the course materials with GenAI for self-study and fact-checking.
Preparation
Read the required course readings.
Lecture
Below are the slides:
Required Course Readings
- Sections 3.1, 3.2, 3.3, and 3.4 (including 3.4.1 and 3.4.2) on decision tree learning in the book Machine Learning (Mitchell, 1997)
- Sections 8.2.1 (Bagging) and 8.2.2 (Random Forests) in the book An Introduction to Statistical Learning (James et al., 2013)
Optional Course Readings
- Section 5.4 (Estimators, Bias and Variance) in the book Deep Learning (Goodfellow et al., 2016)
- Sections 2.2.2 (The Bias-Variance Trade-Off) and 12.2 (Principal Components Analysis, including 12.2.1 and 12.2.2) in the book An Introduction to Statistical Learning (James et al., 2013)
Additional Resources
Below are videos from StatQuest that explain the Decision Tree model and PCA clearly: