Structured Data Processing

(Last updated: Jan 26, 2024) [1]

All content in this repository is licensed under CC BY 4.0. This module covers structured data processing and has the following learning goals:

  • Goal 1: Connect steps in the structured data processing pipeline to a real-world case.

  • Goal 2: Preprocess structured data and prepare features/labels for modeling using pandas.

  • Goal 3: Understand how Principal Component Analysis can help explore data.

  • Goal 4: Understand how cross-validation works for time-series data.

  • Goal 5: Gain a general understanding of Decision Tree and Random Forest models.

  • Goal 6: Understand the concept of permutation feature importance.

  • Goal 7: Experiment with different feature sets and reflect on the choice of features.

[1] Credit: this teaching material was created by Yen-Chia Hsu.