Lecture 7: Text Data Processing (Part I)
(Last updated: Feb 29, 2024)
This lecture is a tutorial on text data processing using a Jupyter Notebook. The learning goals can be found at this link.
Preparation
Complete the preparation for the text data processing module.
Materials
Below is the link to the online notebook:
Follow the steps below to set up the notebook:
- Have the JupyterLab environment ready.
- Download the text data processing module from the GitHub repository. Alternatively, download the zip file from this link.
- Open the notebook file (docs/tutorial-text-data.ipynb) and start working on the tasks.
Additional Resources
Below are videos that students found useful for understanding the course materials:
- Python Pandas Lambda Function Tutorial With EXAMPLE
- Part of Speech Tagging : Natural Language Processing
- Word Embedding and Word2Vec, Clearly Explained!!!
 
Below are materials that students found useful as a complement to the course materials:
- What is Topic Modeling? An Introduction With Examples
- Topic Analysis: The Ultimate Guide
- Python Lambda Function
 
Below are websites related to this tutorial:
- NLTK: Natural Language Toolkit
- spaCy: Industrial-Strength Natural Language Processing
- PyTorch: An Open Source Machine Learning Framework
 
The following video explains how to build ChatGPT: