My intent is for this course to be as self-contained as possible. Toward this goal, the pop quizzes and exam will only concern the content discussed during lectures. Likewise, the homework assignments will require you to apply the lecture content to solve problems, which involves significant programming. We expect students to already have a strong foundation in programming and machine learning. If you do not already know how to program in TensorFlow or PyTorch, you will need to pick it up as you go, as we will not have time in class to teach these frameworks. These incredibly popular and useful frameworks make machine learning work significantly easier, so your experience with them will serve you well beyond this course. The research project, by design, will require you to take initiative to learn about NLP beyond what is covered in class and to make a novel contribution.
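As a rough, purely illustrative self-check, you should be able to read a minimal PyTorch training loop like the one below without difficulty; the toy model, fake data, and hyperparameters are made up for this sketch and are not part of any assignment.

    # A toy PyTorch training loop (illustrative only; not an assignment).
    import torch
    import torch.nn as nn

    model = nn.Linear(10, 1)                                  # toy model: 10 features -> 1 output
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = nn.MSELoss()

    X, y = torch.randn(32, 10), torch.randn(32, 1)            # fake batch of 32 examples
    for step in range(100):
        optimizer.zero_grad()            # clear gradients from the previous step
        loss = loss_fn(model(X), y)      # forward pass + loss
        loss.backward()                  # backpropagation
        optimizer.step()                 # gradient update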
There is a wealth of resources available online to help you fill in any gaps and to supplement your knowledge. It can be incredibly fruitful to read or hear others discuss the same content that I cover in lecture, as it not only reinforces what you already know but can also provide an additional perspective to help you master the material. Thus, I highly encourage everyone to consider the following phenomenal resources:
BOOKS
NLP
- Speech and Language Processing by Jurafsky and Martin. 2020.
- Natural Language Processing by Eisenstein. 2018.
- Introduction to Deep Learning by my PhD adviser Eugene Charniak (hardcopy only; not free)
MACHINE LEARNING
- Dive into Deep Learning by Zhang et al. 2020.
- An Introduction to Statistical Learning (aka ISLR), 1st edition, by James et al.
- Probabilistic Machine Learning, Books 0, 1, and 2, by Kevin Murphy. 2012-2022.
- Mathematics for Machine Learning by Deisenroth, Faisal, and Ong. 2020.
- Pattern Recognition and Machine Learning by Bishop. 2006.
- The Modern Mathematics of Deep Learning by Berner et al. May 2021.
- Understanding Machine Learning: From Theory to Algorithms by Shalev-Shwartz and Ben-David. 2014.
- A blitz through classical statistical learning theory (blog) by Boaz Barak. 2021.
MATH
- Introduction to Probability by Blitzstein and Hwang. 2019.
COURSES (MOST HAVE VIDEOS)
NLP
- UMass Amherst’s CS685: Advanced NLP by Mohit Iyyer
- CMU’s CS 11-747: Neural Networks for NLP (Spring 2021) by Graham Neubig
- MIT’s 6.806: Natural Language Processing by Jacob Andreas and Jim Glass
- ETH Zurich’s NLP (Spring 2021) by Ryan Cotterell
- NYU’s Natural Language Understanding and Computational Semantics (Spring 2020) by Sam Bowman
- Michael Collins’ lecture about deep learning and NLP progress at large (YouTube)
MACHINE LEARNING
- Stanford CS229: Machine Learning by Andrew Ng
- MIT’s 6.S191: Intro to Deep Learning
- NYU’s Deep Learning (Spring 2020) by Alfredo Canziani
- University of Tübingen’s Probabilistic Machine Learning by Philipp Hennig
- Berkeley’s CS182: Deep Learning (Spring 2021) by Sergey Levine
- Princeton’s COS324: Introduction to Machine Learning (Fall 2018) by Ryan Adams
- Google’s Machine Learning Crash Course
- Stanford’s CS229: Machine Learning cliff notes by Shervine and Afshine
ONE-OFF
- MIT’s Deep Learning Basics (1 lecture) by Lex Fridman
- Very gentle explanation of Backpropagation by Andrew Glassner
- 3Blue1Brown’s four videos about neural networks
MATH
- Harvard’s Stat 110 by Blitzstein
TRANSFORMERS
BLOGS/WRITE-UPS
- The AI Summer’s Attention blog post
- HuggingFace’s high-level summary of various transformer models
- High-level, light-hearted blog post
Jay Alammar’s famous blog posts:
- Visualizing A Neural Machine Translation Model
- The Illustrated Transformer
- The Illustrated BERT, ELMo, and co
YOUTUBE
- Chris McCormick’s series about BERT
- The original authors of the Transformer present their paper
- Waterloo lecture on Attention and Transformers
- Fast.ai’s explanation of Key/Value/Query + code in the description (a short illustrative sketch of this idea appears right after this list)
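If the Key/Value/Query terminology feels abstract, here is a minimal sketch of single-head scaled dot-product attention in PyTorch; the class name, dimensions, and toy input are made up for illustration and are not taken from any of the resources above.

    # Minimal sketch of single-head scaled dot-product attention (illustrative only).
    import math
    import torch
    import torch.nn as nn

    class ToyAttention(nn.Module):
        def __init__(self, d_model: int):
            super().__init__()
            # Learned projections that turn the same input into queries, keys, and values.
            self.W_q = nn.Linear(d_model, d_model)
            self.W_k = nn.Linear(d_model, d_model)
            self.W_v = nn.Linear(d_model, d_model)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x has shape (batch, seq_len, d_model)
            Q, K, V = self.W_q(x), self.W_k(x), self.W_v(x)
            # Compare every query against every key, scaled by sqrt(d_model).
            scores = Q @ K.transpose(-2, -1) / math.sqrt(K.size(-1))
            weights = scores.softmax(dim=-1)  # attention distribution over positions
            return weights @ V                # weighted sum of the values

    x = torch.randn(2, 5, 16)                # toy batch: 2 sequences of 5 tokens, d_model=16
    print(ToyAttention(16)(x).shape)         # torch.Size([2, 5, 16])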
CODE
OTHER
- MIT’s The Missing Semester: a great run-through of important computing tools and basics
- Weights & Biases: a phenomenal site that helps you keep track of many aspects of your ML experiments
- Understanding how Python handles functions
PyTorch: