Lecture 2: Data Science Fundamentals

(Last updated: Jan 23, 2024)

This lecture recaps the fundamentals of data science, such as table operations, classification, and regression.
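As a rough illustration of the table operations and regression mentioned above, here is a minimal sketch using pandas and NumPy. The data, the column names (`zipcode`, `rating`, `h2s_ppb`), and the variable names are all made up for this example and are not taken from the lecture materials.

```python
import numpy as np
import pandas as pd

# Hypothetical toy tables (values invented for illustration only).
reports = pd.DataFrame({
    "zipcode": ["15213", "15213", "15217", "15217", "15221"],
    "rating": [4, 5, 2, 3, 1],
})
sensors = pd.DataFrame({
    "zipcode": ["15213", "15217", "15221"],
    "h2s_ppb": [6.0, 3.0, 1.0],
})

# Table operation: filter rows by a condition.
bad = reports[reports["rating"] >= 3]

# Table operation: group rows and aggregate each group.
mean_rating = reports.groupby("zipcode", as_index=False)["rating"].mean()

# Table operation: join two tables on a shared key.
merged = mean_rating.merge(sensors, on="zipcode", how="inner")

# Regression: least-squares line rating ~ slope * h2s_ppb + intercept.
slope, intercept = np.polyfit(merged["h2s_ppb"], merged["rating"], deg=1)

print(merged)
print(f"slope={slope:.3f}, intercept={intercept:.3f}")
```

A classification task would follow the same pattern, with a categorical target (for example, whether a rating is at least 3) in place of the numeric one.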

Preparation

Read the Smell Pittsburgh paper.

  • This project is an example of a data science pipeline.
  • Reading this work will help you get a basic understanding of data science pipelines.
  • You do not need to understand every technique in the paper; some may be beyond your current background. Focus on getting the big picture.

Materials

Additional Resources

The paper below studies various data science pipelines at different scales, which can give you a good understanding of common data science practices:

Below are websites for data visualization inspiration:

Below are interesting data science case studies:

The textbook below contains more information about how to select models:

The websites below contain exercises for Python pandas: