CS109a: Introduction to Data Science


Fall 2020

Pavlos Protopapas, Kevin A. Rader, and Chris Tanner

Additional Instructor: Eleni Kaxiras


Welcome to CS109a/STAT121a/AC209a, also offered by the DCE as CSCI E-109a, Introduction to Data Science. This course is the first half of a one-year introduction to data science. We will focus on analyzing data to make predictions using statistical and machine learning methods. Topics include data scraping, data management, data visualization, regression and classification methods, and deep neural networks. You will get ample practice through weekly homework assignments. The class material integrates the five key facets of an investigation using data:

1. data collection - data wrangling, cleaning, and sampling to get a suitable data set
2. data management - accessing data quickly and reliably
3. exploratory data analysis - generating hypotheses and building intuition
4. prediction or statistical learning
5. communication - summarizing results through visualization, stories, and interpretable summaries

Only one of CS 109a, AC 209a, or Stat 121a can be taken for credit. Students who have previously taken CS 109, AC 209, or Stat 121 cannot take CS 109a, AC 209a, or Stat 121a for credit.

Important Dates:
Tuesday 9/8 - HW1 released
Wednesday 9/9 - HW0 due at 11:59pm EST
Thursday 9/10 - 'Study Break' at 8:30pm EST
Friday 9/11 - 'Study Break' at 10:15am EST
Friday 9/11 - Sections start (1:30pm EST). See the syllabus/calendar for weekly times.

Helpline: cs109a2020@gmail.com

Lectures: Mon, Wed, & Fri 9:00-10:15 am & 3:00-4:15 pm (both sessions on a given day cover identical material)
Sections: Fri 1:30-2:45 pm & Mon 8:30-9:45 pm (identical material) [starts 9/11]
Advanced Sections: Wed 12pm [starts 9/23]
Office Hours: TBD
Course material can be viewed in the public GitHub repository.

Previous Material