Course Syllabus
NLP 243 – Machine Learning for Natural Language Processing
Fall 2020
Course Information
Lecture times: Mon & Wed, 5:20pm – 6:50pm
Virtual Classroom
Instructor Information
Dr. Dilek Hakkani-Tür
email: dhakkani @ ucsc [dot] edu
Office Hours: I’ll stay on the class meeting channel for 30 minutes after each class for questions. You can also email me to schedule an appointment at other times.
Zoom Link for classes: https://zoom.us/j95852116173
Teaching Assistant
Rishi Rajasekaran
Email: rrajasek @ ucsc [dot] edu
Office hours
- Time: Wednesdays, 2-3:30pm
- Zoom Link: https://ucsc.zoom.us/j/96938066796?pwd=dnVXZjVZM25iNlN4TVFKbVk5RS9PQT09
Sections
- You must attend the section weekly. We will take attendance!
- Time: Mondays, 2-4PM
- Zoom Link: https://ucsc.zoom.us/j/96434245012?pwd=cHR3NUZkZlNRU3dpZmRnSVlJc0pJQT09
- All Section Slides are available under Files > Section Slides
- The section recordings are available under YuJa > All Courses > NLP-243-01
- Link to Python self-test: https://colab.research.google.com/drive/1yiYE9LdUjkriAAG7krmecqA-7BJRZwNj?usp=sharing
- Section 2 - Python Basics: https://colab.research.google.com/drive/1LkBmNPi8ZXtmSr-iW2Uypa3sbhNq4gk6?usp=sharing
- Section 3 - Basics of SciKit Learn: https://colab.research.google.com/drive/1Ldma3WPhLexR6ttqMaPYO4auUwJ6JDh-?usp=sharing
- Section 4 - PyTorch basics: https://colab.research.google.com/drive/1vgTuMLpkK7DTuKxxfyUZhZhW5nBu1oaD?usp=sharing
- Section 5 - PyTorch Multilayer Perceptron and Convolutional Neural Networks: https://colab.research.google.com/drive/1UCpug78_XvieSJhSp0v4fAE7E6sEBr0y?usp=sharing
- Section 6 - PyTorch RNNs: https://colab.research.google.com/drive/1fVXNmNi_g1o77-oU4QaX4XeRof8myIJj?usp=sharing
- Section 7 - Sequence Tagging using RNNs: https://colab.research.google.com/drive/1hy0-T3oK6-nmZN9AUVyHNIdQv8DLRSba?usp=sharing
Course Description
Introduction to machine learning models and algorithms for Natural Language Processing. Introduces learning models from the fields of statistical decision theory, artificial intelligence, and deep learning. Topics include an introduction to standard neural network learning methods such as feed-forward neural networks, recurrent neural networks, and convolutional neural networks, with applications to natural language processing problems such as utterance classification and sequence tagging. Requirements include 3 programming assignments and a final project.
Textbooks:
Dive Into Deep Learning, Aston Zhang, Zachary C. Lipton, Mu Li, Alex Smola. http://d2l.ai
Natural Language Processing with PyTorch. Delip Rao and Brian McMahan. https://proquest-safaribooksonline-com.oca.ucsc.edu/9781491978221
Foundations of Statistical NLP. Chris Manning, Hinrich Schuetze. https://nlp.stanford.edu/fsnlp
Speech and Language Processing. Daniel Jurafsky and James Martin. https://web.stanford.edu/~jurafsky/slp3
I will also provide pointers to other reading when needed.
Canvas Link: https://canvas.ucsc.edu/courses/37453
Piazza Link: https://piazza.com/ucsc/fall2020/nlp24301
(Access code: ucsc-nlp-243)
Grading:
- Attendance: 5%
- Homeworks and Final Project: 55%
  - HW1: 8%
  - HW2: 12%
  - HW3: 15%
  - Final Project: 20%
- Midterm: 20%
- Final Exam: 20%
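As a rough illustration of how the weights above combine, the following sketch computes an overall course grade from per-component scores. It assumes every component is scored on a 0-100 scale and that the 20% item listed with the homeworks is the final project (the other 20% items being the midterm and the final exam). The names `WEIGHTS` and `course_grade` are hypothetical and not part of any course tooling.

```python
# Weights in percent of the course grade, read from the grading list.
WEIGHTS = {
    "attendance": 5,
    "hw1": 8,
    "hw2": 12,
    "hw3": 15,
    "final_project": 20,  # assumption: the 20% item grouped with the homeworks
    "midterm": 20,
    "final_exam": 20,
}

# The components should account for the whole grade.
assert sum(WEIGHTS.values()) == 100


def course_grade(scores):
    """Return the weighted course grade (0-100) from per-component scores (0-100)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    example = {
        "attendance": 100,
        "hw1": 90,
        "hw2": 85,
        "hw3": 80,
        "final_project": 95,
        "midterm": 70,
        "final_exam": 75,
    }
    print(course_grade(example))
```

Under these assumptions the three homeworks and the final project together contribute 8 + 12 + 15 + 20 = 55 points of the 100.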
Homework Delivery:
We will organize one in-class competition with a leaderboard for each homework (e.g., on Kaggle or CodaLab). Every student should create a CodaLab account to participate. In each CodaLab in-class competition, students are given the training data and labels. They train the requested models and submit their predictions on the test data to CodaLab, which ranks the results according to the evaluation metric (e.g., accuracy or F1 score). Students also need to submit one report (PDF only) and a zip file with their code under the corresponding Canvas assignment. Grades consider the leaderboard ranking, the report, and the code: 25% of the grade is based on performance on the leaderboard, 50% on the report accompanying the homework, and 25% on the code.
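The two example metrics and the 25/50/25 split can be sketched in a few lines of Python. This is only an illustration: the helper names are hypothetical, the F1 shown is the single-class (binary) version, and the syllabus does not say how a leaderboard rank is converted into points, so `leaderboard_points` is simply taken as a 0-100 input.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """F1 for one label: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def homework_score(leaderboard_points, report_points, code_points):
    """Combine the three parts (each 0-100) using the 25/50/25 split."""
    return 0.25 * leaderboard_points + 0.5 * report_points + 0.25 * code_points


if __name__ == "__main__":
    # Toy labels, for illustration only.
    y_true = [1, 0, 1, 1, 0]
    y_pred = [1, 0, 0, 1, 1]
    print("accuracy:", accuracy(y_true, y_pred))
    print("F1:", f1_score(y_true, y_pred))
    print("homework score:", homework_score(80, 90, 100))
```

For the homeworks themselves, the leaderboard computes the official metric; a local check like this is mainly useful for evaluating on held-out training data before submitting.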
Schedule
The schedule for readings and homework assignments is shown in the syllabus below.
- THIS SCHEDULE IS SUBJECT TO CHANGE
- Check Canvas for specific due dates and times of all assignments.
SYLLABUS
Week 1:
Oct 5th:
Topics:
- Class Logistics
- What is natural language processing?
- What is machine learning?
- What is deep learning?
Readings:
Oct 7th:
Topics
- Preliminaries:
- Linear Algebra
- PyTorch Basics
- Probability
- Basics
- Conditional Probability and Independence
- Calculus – Derivatives and Differentiation
Readings:
Week 2:
Oct 12th:
- ML and NLP Basics
- Review NLP toolkits (NLTK, spaCy, sklearn for homework)
- Background on commonly used ML approaches for NLP
- Naïve Bayes
Readings:
- Ch4 of Jurafsky and Martin book
- Ch 16 of Manning and Schuetze book
Oct 14th:
- Background on commonly used ML approaches for NLP (cont.)
- Decision Trees
- Support Vector Machines
- Getting ready for homeworks: knowledge graphs and querying knowledge graphs
Readings:
- Ch 16 of Manning and Schuetze book
Week 3:
Oct 19th:
- Review of Possible Topics for Final Projects
- K-nearest neighbors
- Linear Regression
- Homework 1 assigned
Readings:
- KNN chapter
- Ch 16 of Manning and Schuetze book
Oct 21st:
- Linear Regression (cont.)
- Gradient Descent (and versions)
- Practical Tips
Readings:
- Ch2 of NLP with PyTorch
- Ch3 of Dive into Deep Learning
Week 4:
Oct 26th:
- Final Project Teaming up event
Oct 28th:
- Homework 1 due date
- Sign up teams of 3 people for the final project.
- Activation and Loss Functions Using PyTorch
- Multi-layer perceptron
Readings:
Week 5:
Nov 2nd:
- Homework 2 assigned
- Multi-Layer Perceptron (cont.)
- Computation Graphs
- Back-propagation
- Overfitting Revisited
- Weight Decay
- Dropout
- Distributional Similarity
- Words, vectors and co-occurrence matrices
- Word Embeddings
- What unexpected things might we learn with word embeddings?
Readings:
- Continuing Ch4 of Dive into Deep Learning
- A good review paper: Yoav Goldberg. A Primer on Neural Network Models for Natural Language Processing
- Chapter 5 of NLP with PyTorch
Other suggested reading:
- Chapter 6 of the Speech and Language Processing book by D. Jurafsky and J. Martin
Nov 4th: Final Project Proposal Presentations (also due date for proposal write-ups)
Week 6:
Nov 9th: Midterm during class time
Nov 11th: Veterans Day holiday, no class.
Week 7:
Nov 16th:
- GloVe Embeddings
- Playing with word embeddings
- Convolutional Neural Networks
- Text Classification Using Convolutional Neural Networks
- Convolutional Neural Networks (cont.)
- Text Classification with CNNs
- CNNs in PyTorch
Readings:
Nov 18th:
- Homework 2 due date
- Homework 3 assigned
- Language Modeling
- Recurrent Neural Networks
- Sequence Classification Tasks
- Homework 3 introduction
Readings:
- Starting Ch 6. of NLP with PyTorch Book
- Ch 7 of the Dive into Deep Learning book
Week 8:
Nov 23rd:
- Quick review of RNNs from previous lecture
- Case Study: Natural Language Understanding in Conversational Systems
- Homework 3 discussion
- Implementing RNNs
Nov 25th:
- Implementing RNNs (continuing from previous lecture)
- Long Short-Term Memory (LSTM)
- Implementing LSTMs
- Gated Recurrent Units (GRU)
Readings:
- Starting Chapter 9 of Dive Into Deep Learning
- Chapter 7 of NLP with PyTorch
Week 9:
Nov 30th:
- Review of midterm grades and discussion of HW2 questions
- Encoder-Decoder Architecture
- Sequence-to-sequence (S2S) models
- Beam Search
- Attention
Readings:
- Continuing Chapter 9 of Dive Into Deep Learning
- Chapter 8 of NLP with PyTorch
- Bahdanau et al., Neural Machine Translation by Jointly Learning to Align and Translate. ICLR, 2015.
Dec 2nd:
- Homework 3 due date
- Applications for RNNs and Attention: Task Specific Variations of Network Topologies
- SLU in Dialogue Systems
- Seq2seq Models with Attention
- Representations of Conversation Context
- Scaling to new domains
- Scaling to new languages
- S2S models for Response Generation in Social Dialogue Systems
- Hierarchical RNNs for Conversation Context
- Memory Networks for Knowledge Integration
- Pointer-Generator Networks
- Generating Diverse Responses
Readings:
- Links for papers covered are in the slide deck
Week 10:
Dec 7th: Final project presentations.
Project | Members |
Topical Chat Bot | Austin King, Devavrat Joshi, Morgan Eidam |
Emotion Detection | Angela Ramirez, Christopher-Garcia Cordova, Mamon Alsahily |
Visual Question Answering | Raghav Chaudhary, Sam Shamsan, Adam Fidler |
Sentiment Analysis | Tianxiao Zhang, Youyou Zhao, Phill Lee |
Dec 9th: Final project presentations.
Project | Members |
Generating Creative Content for Dialogue | Kevin Bowden, Eduardo Zamora, Jeshwanth Bheemanpally |
Fake News Detection | Alex Lue, Nilay Patel, Kaleen Shreshta |
Question and Answering Machine | Zachary Sweet, John Lara, Kit Lao |
Financial News Sentiment Analysis | Cecilia Li, Liren Wu, David Li |
Dec 13th: Final Project reports due.
Finals week:
Dec 14-18: Final, date TBD.
Course Summary:
Date | Details | Due |
---|---|---|
Mon Oct 5, 2020 | Calendar Event Machine Learning for Natural Language Processing | 5pm to 8pm |
Mon Oct 5, 2020 | Assignment Setting up a Python Machine Learning Environment | due by 11:59pm |
Wed Oct 7, 2020 | Calendar Event Machine Learning for Natural Language Processing | 5pm to 8pm |
Mon Oct 12, 2020 | Calendar Event Machine Learning for Natural Language Processing | 5pm to 8pm |
Wed Oct 14, 2020 | Calendar Event Machine Learning for Natural Language Processing | 5pm to 8pm |
Mon Oct 19, 2020 | Calendar Event Machine Learning for Natural Language Processing | 5pm to 8pm |
Mon Oct 19, 2020 | Assignment Python Basics and SciKit Learn | due by 11:59pm |
Wed Oct 21, 2020 | Calendar Event Machine Learning for Natural Language Processing | 5pm to 8pm |
Mon Oct 26, 2020 | Calendar Event Machine Learning for Natural Language Processing | 5pm to 8pm |
Mon Oct 26, 2020 | Assignment PyTorch Basics | due by 11:59pm |
Wed Oct 28, 2020 | Assignment Homework 1: Relation Extraction from Natural Language | due by 4:59pm |
Wed Oct 28, 2020 | Calendar Event Machine Learning for Natural Language Processing | 5pm to 8pm |
Mon Nov 2, 2020 | Calendar Event Machine Learning for Natural Language Processing | 5pm to 8pm |
Mon Nov 2, 2020 | Assignment Pytorch Basics Contd + MLP and CNNs | due by 11:59pm |
Wed Nov 4, 2020 | Calendar Event Machine Learning for Natural Language Processing | 5pm to 8pm |
Wed Nov 4, 2020 | Assignment Project Proposal Presentation | due by 5pm |
Wed Nov 4, 2020 | Assignment Project Proposal Report | due by 5pm |
Mon Nov 9, 2020 | Calendar Event Machine Learning for Natural Language Processing | 5pm to 8pm |
Mon Nov 9, 2020 | Quiz Midterm | due by 7:20pm |
Mon Nov 9, 2020 | Assignment Midterm Review | due by 11:59pm |
Wed Nov 11, 2020 | Calendar Event Machine Learning for Natural Language Processing | 5pm to 8pm |
Fri Nov 13, 2020 | Quiz Midterm (1 student) | due by 2:30pm |
Mon Nov 16, 2020 | Calendar Event Machine Learning for Natural Language Processing | 5pm to 8pm |
Mon Nov 16, 2020 | Assignment PyTorch MLPs and CNNs (contd.) | due by 11:59pm |
Wed Nov 18, 2020 | Calendar Event Machine Learning for Natural Language Processing | 5pm to 8pm |
Sat Nov 21, 2020 | Assignment Homework 2: Relation Extraction from Natural Language using PyTorch | due by 12pm |
Mon Nov 23, 2020 | Calendar Event Machine Learning for Natural Language Processing | 5pm to 8pm |
Mon Nov 23, 2020 | Assignment Pytorch RNNs | due by 11:59pm |
Wed Nov 25, 2020 | Calendar Event Machine Learning for Natural Language Processing | 5pm to 8pm |
Mon Nov 30, 2020 | Calendar Event Machine Learning for Natural Language Processing | 5pm to 8pm |
Mon Nov 30, 2020 | Assignment Sequence Labeling using RNNs | due by 11:59pm |
Wed Dec 2, 2020 | Calendar Event Machine Learning for Natural Language Processing | 5pm to 8pm |
Sun Dec 6, 2020 | Assignment Homework 3: Slot Tagging for Natural Language Utterances | due by 11:59pm |
Mon Dec 7, 2020 | Calendar Event Machine Learning for Natural Language Processing | 5pm to 8pm |
Mon Dec 7, 2020 | Assignment Project Presentation Slides | due by 5pm |
Wed Dec 9, 2020 | Calendar Event Machine Learning for Natural Language Processing | 5pm to 8pm |
Sun Dec 13, 2020 | Assignment Final Project Report | due by 11:59pm |
Sun Dec 13, 2020 | Assignment Project Code | due by 11:59pm |
Mon Dec 14, 2020 | Calendar Event Machine Learning for Natural Language Processing | 5pm to 8pm |
Wed Dec 16, 2020 | Calendar Event Machine Learning for Natural Language Processing | 5pm to 8pm |
Wed Dec 16, 2020 | Quiz Final | due by 5:30pm |
Wed Dec 16, 2020 | Quiz Final (2 students) | due by 6:45pm |