Lecture 2: Data Science Fundamentals

(Last updated: Jan 15, 2025)

This lecture recaps the fundamentals of data science, such as table operations, classification, and regression.

Preparation

This project is an example of a data science pipeline.
Reading this work will help you get a basic understanding of data science pipelines.
You do not need to understand every technique in the paper, and some may be beyond your current level. Focus on getting the big picture.

The paper below studies various data science pipelines at different scales, which can give you a good understanding of common data science practices:

Below are websites for data visualization inspiration:

Below are interesting data science case studies:

The textbook below contains more information about how to select models:

Section 11.8, "Comparing Different Models," in the book Introduction to Statistics and Data Analysis

The websites below contain exercises for Python pandas: