Lecture 4: Structured Data Processing (Part II)
(Last updated: Jan 24, 2025)
This lecture is a tutorial on the Smell Pittsburgh project, delivered as a Jupyter Notebook. The learning goals can be found at this link.
Preparation
Do the preparation for the structured data processing module.
Materials
Below is the link to the online notebook:
Follow the steps on the assignment page to set up the notebook.
Additional Resources
Below are videos that students found useful for understanding the course materials:
- Pandas Resample
- Python Rolling Window Functions explained in 4 minutes
- How to Use Pandas Rolling - A Simple Illustrated Guide
- Continuous vs Discrete Data
- Introduction to Trees (Data Structures & Algorithms #9)
Below are websites that could be useful for understanding the course materials:
- A Practical Guide to Implementing a Random Forest Classifier in Python
- Loss Functions in Machine Learning Explained
Below are websites related to this tutorial:
- The Smell Pittsburgh Dataset
- The scikit-learn API
- The seaborn API
- The plotly API
- The pandas API
- The numpy API
Below are papers related to this tutorial: