STA 199: Introduction to Data Science and Statistical Thinking

This page outlines the topics, content, and assignments for the semester. The schedule, including the timeline of topics and assignments, may be updated as the semester progresses.

WEEK DATE PREPARE TOPIC MATERIALS DUE
1 Wed, Jan 10

Lab 0: Hello, World and STA 199!

💻 lab 0



Thu, Jan 11

Welcome to STA 199

🖥️ slides 00
⌨️ ae 00


2 Mon, Jan 15

No lab - Martin Luther King Jr. Day holiday




Tue, Jan 16

📗 r4ds - intro
📘 ims - chp 1

Meet the toolkit

🖥️ slides 01
⌨️ ae 01



Thu, Jan 18

📗 r4ds - chp 1
🎥 Data and visualization
🎥 Visualising data with ggplot2

Grammar of graphics

🖥️ slides 02
⌨️ ae 02
✅ ae 02


3 Mon, Jan 22

📗 r4ds - chp 2

Lab 1: Data visualization

💻 lab 1
✅ lab 1



Tue, Jan 23

📘 ims - chp 4
📘 ims - chp 5
🎥 Visualizing numerical data
🎥 Visualizing categorical data

Visualizing various types of data

🖥️ slides 03
⌨️ ae 02 (cont.)
✅ ae 02



Thu, Jan 25

📘 ims - chp 6

Data visualization overview

🖥️ slides 04
⌨️ ae 03
✅ ae 03


4 Mon, Jan 29

🎥 Grammar of data wrangling
📗 r4ds - chp 3.1-3.5

Lab 2: Data wrangling

💻 lab 2
✅ lab 2

Lab 1 at 8 am


Tue, Jan 30

🎥 Working with a single data frame
📗 r4ds - chp 3.6-3.7
📗 r4ds - chp 4

Grammar of data wrangling

🖥️ slides 05
⌨️ ae 04
✅ ae 04



Thu, Feb 1

🎥 Tidying data
📗 r4ds - chp 5

Tidying data

🖥️ slides 06
⌨️ ae 05
✅ ae 05


5 Mon, Feb 5

🎥 Working with multiple data frames

Lab 3: Data tidying and joining

💻 lab 3
✅ lab 3

Lab 2 at 8 am


Tue, Feb 6

📗 r4ds - chp 19.1-19.3

Joining data

🖥️ slides 07
⌨️ ae 06
✅ ae 06



Thu, Feb 8

🎥 Data types
🎥 Data classes
📗 r4ds - chp 16

Data types and classes

🖥️ slides 08
⌨️ ae 07
✅ ae 07


6 Mon, Feb 12

Work on Exam 1 Review

📝 exam 1 review
✅ exam 1 review

Lab 3 at 8 am


Tue, Feb 13

Exam 1 Review

🖥️ slides 09



Thu, Feb 15

Exam 1 - In-class + take-home released



7 Mon, Feb 19

Project milestone 1 - Working collaboratively

📓 project milestone 1

Exam 1 take-home at 8 am


Tue, Feb 20

🎥 Importing data
🎥 Recoding data
📗 r4ds - chp 7
📗 r4ds - chp 17.1-17.3

Importing and recoding data

🖥️ slides 10
⌨️ ae 08
✅ ae 08



Thu, Feb 22

🎥 Web scraping
🎥 Scraping top 250 movies on IMDB
🎥 Web scraping considerations
📗 r4ds - chp 24.1-24.6

Web scraping

🖥️ slides 11
⌨️ ae 09
✅ ae 09


8 Mon, Feb 26

Lab 4: Web scraping and ethics

💻 lab 4
✅ lab 4

Project milestone 1 at 8 am


Tue, Feb 27

🎥 Functions
🎥 Iteration
📗 r4ds - chp 25.1-25.2

Working with ChatGPT

🖥️ slides 12
⌨️ ae 09
✅ ae 09



Thu, Feb 29

🎥 Misrepresentation
🎥 Data privacy
🎥 Algorithmic bias
📕 mdsr - chp 8
🎥 Alberto Cairo - How charts lie
🎥 Joy Buolamwini - How I'm fighting bias in algorithms

Data science ethics

🖥️ slides 13


9 Mon, Mar 4

Lab 5: Topic TBA

💻 lab 5
✅ lab 5

Lab 4 at 8 am


Tue, Mar 5

🎥 The language of models
📘 ims - chp 7.1

The language of models

🖥️ slides 14
⌨️ ae 10
✅ ae 10



Thu, Mar 7

🎥 Fitting and interpreting models
🎥 Modeling nonlinear relationships
📘 ims - chp 7.2

Linear regression with a single predictor

🖥️ slides 15
⌨️ ae 11
✅ ae 11


10 Mon, Mar 11

🌴 No lab - Spring Break




Tue, Mar 12

🌴 No lecture - Spring Break




Thu, Mar 14

🌴 No lecture - Spring Break



11 Mon, Mar 18

Project milestone 2 - Project proposals

📓 project milestone 2

Lab 5 at 8 am


Tue, Mar 19

🎥 Models with multiple predictors
🎥 More models with multiple predictors
📘 ims - chp 8.1-8.2

Linear regression with multiple predictors I

🖥️ slides 16
⌨️ ae 12
✅ ae 12



Thu, Mar 21

📘 ims - chp 8.3-8.5

Linear regression with multiple predictors II

🖥️ slides 17


12 Mon, Mar 25

Lab 6: Modeling I

💻 lab 6
✅ lab 6

Project milestone 2 at 8 am


Tue, Mar 26

🎥 Logistic regression
🎥 Prediction and overfitting

Model selection and overfitting

🖥️ slides 18
⌨️ ae 13
✅ ae 13



Thu, Mar 28

📘 ims - chp 9

Logistic regression

🖥️ slides 19
⌨️ ae 14
✅ ae 14


13 Mon, Apr 1

Lab 7: Modeling II

💻 lab 7
✅ lab 7

Lab 6 at 8 am


Tue, Apr 2

🎥 Quantifying uncertainty
🎥 Bootstrapping
📘 ims - chp 12

Quantifying uncertainty with bootstrap intervals

🖥️ slides 20
⌨️ ae 15
✅ ae 15



Thu, Apr 4

📘 ims - chp 11

Making decisions with randomization tests

🖥️ slides 21
⌨️ ae 16
✅ ae 16


14 Mon, Apr 8

Work on Exam 2 Review

📝 exam 2 review
✅ exam 2 review

Lab 7 at 8 am


Tue, Apr 9

Exam 2 Review

🖥️ slides 22



Thu, Apr 11

Exam 2 - In-class + take-home released



15 Mon, Apr 15

Project milestone 3 - Peer review

📓 project milestone 3

Exam 2 take-home at 8 am
Project milestone 3 at the end of lab session


Tue, Apr 16

🎥 Tips for effective data visualization
📘 ims - chp 6
📗 r4ds - chp 10

Communicating data science results effectively

🖥️ slides 23
⌨️ ae 17
✅ ae 17



Thu, Apr 18

🎥 Doing data science

Customizing Quarto reports and presentations

🖥️ slides 24
⌨️ ae 18


16 Mon, Apr 22

Project milestone 4 - Project presentations

📓 project milestone 4

Project presentations at the beginning of lab session


Tue, Apr 23

Looking further: Interactive web applications with Shiny

🖥️ slides 25
⌨️ ae 19



Wed, Apr 24


Project writeup at 8 am