Intermediate Data Science
Fall 2025
Office Location:
Duke Hall #209
Phone: 909-748-8630
E-Mail: joanna_bieri@redlands.edu
(Email or Teams are my preferred contact methods)
Data Science Lab:
TBA
Office Hours:
Click here for my schedule.
You can also email me for an appointment!
-
Link to our Canvas - for submitting work and checking grades:
-
Important Course Documents
Python for Data Analysis, Wes McKinney
Course Syllabus
Schedule of Topics
NOTE: as the semester progresses we may change up the schedule a bit to suit our class pace and interests. The most recent schedule will be posted here.
Link to our GitHub - for getting assignments and version control
-
Daily Assignments - Reading - Handouts
-
Day 1 - Wednesday - 9/3 - Click Here
PRE-CLASS: Most days there will be videos, homework assignments, and reading that you are expected to complete before class. It is okay if you don't get all the homework done, but you should attempt every part and keep trying until you are completely stuck and have questions to bring to class.
CLASS TIME: Notes - Computer Setup - Review
ANNOUNCEMENTS:
Slides - Computer Setup - Review
Video - Set up your computer and be successful in this class (important that this is done this week!)
Remember to pull the class files to your local machine and make a copy in your working directory! This is where you will find HW_day1.ipynb - we will work on this in class.
Reading: Python for Data Analysis
Chapter 2.3 and all of Chapter 3 of our book are a good review of Python basics. Please glance through them before starting your prep for Day 2! We will be working a lot in pandas, but it is important that you know how to work with Python lists, dictionaries, sets, and tuples (p. 47-64). It is helpful to learn list comprehensions (p. 64), but as long as you can write a for loop you will be fine. We will use functions and lambdas to organize our code (p. 65-76). We will practice reading in different data types, mostly with pandas but sometimes with other methods (p. 79-80).
Finish HW_day1.ipynb - come Friday if you need help!
Start Prepping for Day 2 - click on the Day 2 link and complete the PRE-CLASS materials.
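For a quick self-check on the Python basics mentioned in the Day 1 reading, here is a minimal sketch of lists, tuples, sets, dictionaries, a list comprehension, a function, and a lambda. The variable names and sample values are made up for illustration and are not taken from the book or from HW_day1.ipynb.

```python
# Hypothetical sample data for illustration only; not from the course files.
scores = [88, 92, 75, 92]        # list: ordered, mutable, allows duplicates
point = (3, 4)                   # tuple: ordered, immutable
unique_scores = set(scores)      # set: unordered, duplicates removed
grades = {"ana": 88, "ben": 92}  # dict: key -> value lookup

# A for loop and the equivalent list comprehension.
squares_loop = []
for s in scores:
    squares_loop.append(s ** 2)
squares_comp = [s ** 2 for s in scores]


# A named function with a default argument.
def curve(score, bump=5):
    """Add a bump to a score, capped at 100."""
    return min(score + bump, 100)


curved = [curve(s) for s in scores]

# A lambda used as a sort key: sort (name, grade) pairs by grade, highest first.
by_grade = sorted(grades.items(), key=lambda item: item[1], reverse=True)

print(unique_scores)
print(squares_comp == squares_loop)
print(curved)
print(by_grade)
```

If each of these lines makes sense when you run it in a Jupyter Notebook, you are probably in good shape for the pandas material that starts on Day 2.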
-
Day 2 - Monday - 9/8 - Click Here
PRE-CLASS: Reading: Python for Data Analysis: Chapter 5
CLASS TIME:
NOTE: The book is a great reference for general data analysis. Don't feel like you have to read it line by line. The lecture notes follow the book fairly closely. Have a Jupyter Notebook open so you can try some of the commands!
Notes - Pandas
Slides - Pandas - summary of functions
Video - Pandas Review-Advanced
Remember to pull the class files to your local machine and make a copy in your working directory! This is where you will find HW_day2.ipynb - we will work on this in class.
Remember to push your class work to git before class starts - this is to show your attempt and a timestamp on your work.
Pandas!
ANNOUNCEMENTS:
Work on finishing up HW_day1.ipynb.
Go through the warm-up "you try" problems.
Work on Titanic Data.
Start Prepping for Day 3 - click on the Day 3 link and complete the PRE-CLASS materials.
-
Day 3 - Wednesday - 9/10 - Click Here
PRE-CLASS: Reading: Python for Data Analysis: Chapter 6
CLASS TIME:
Reading: Build a Career in Data Science: Chapter 1.1 and 1.2. This is only 11 pages - please read it and take notes and come to class to chat about your thoughts!
Notes - Data Reading Writing and File Types
Slides - Data Reading Writing and File Types
Video - Data Reading Writing and File Types - Part1 Overview
Video - Data Reading Writing and File Types - Part2 Code Walkthrough
Remember to push your class work to git before class starts - this is to show your attempt and a timestamp on your work.
Go through the warm-up "you try" problems.
ANNOUNCEMENTS:
Work on HW_Day3 - reading your own data.
Start Prepping for Day 4
-
Day 4 - Monday - 9/15 - Click Here
PRE-CLASS: Reading: Python for Data Analysis: Chapter 7
CLASS TIME:
Reading: Build a Career in Data Science: Chapter 1.3. Choosing your path!
Notes - Data Cleaning and Preparation
Slides - Data Cleaning and Preparation
Video - Data Cleaning and Preparation
Remember to push your class work to git before class starts - this is to show your attempt and a timestamp on your work.
Go through the warm-up "you try" problems.
ANNOUNCEMENTS:
Work on HW_Day4 - cleaning up messy data.
Start Prepping for Day 5
-
Day 5 - Wednesday - 9/17 - Click Here
PRE-CLASS: Reading: Python for Data Analysis: Chapter 8
CLASS TIME:
Notes - Data Wrangling
Slides - Data Wrangling
Video - Data Wrangling
Remember to push your class work to git before class starts - this is to show your attempt and a timestamp on your work.
Please do the Check in Quiz - I really appreciate your feedback!
Go through the warm-up "you try" problems.
ANNOUNCEMENTS:
Work on HW_Day5 - wrangling - joining - merging - pivoting - melting.
Start Prepping for Day 6
-
Day 6 - Monday - 9/22 - Click Here
PRE-CLASS: Reading: Python for Data Analysis: Chapter 9
CLASS TIME:
Reading: Build a Career in Data Science: Chapter 2.1. Data Science Companies - Massive Tech
Notes - Visualization
Slides - Visualization
Video - Visualization
Remember to push your class work to git before class starts - this is to show your attempt and a timestamp on your work.
Show and Tell -- How do you avoid misrepresentation when creating visualizations? Please bring to class at least one example to share of data misrepresentation through visualization. You can look up examples online or create one of your own.
Go through the warm-up "you try" problems.
ANNOUNCEMENTS:
Work on HW_Day6 - plotting.
Start Prepping for Day 7.
-
Day 7 - Wednesday - 9/24 - Click Here
PRE-CLASS: Reading:
CLASS TIME:
We will take a break from new content to talk about how to do high-quality writing in data science. Here are some articles that I would like you to read before class:
The Most Undervalued Skill for Data Scientists
Practical Advice for Data Science Writing
Show and Tell -- How do you avoid misrepresentation when creating visualizations? Please bring to class at least one example to share of data misrepresentation through visualization. You can look up examples online or create one of your own.
Writing in Data Science
ANNOUNCEMENTS:
Avoiding Misrepresentation
Quarto - Quarto Install
There is no new homework for Day 7. Please use the extra time to catch up on past homework and really go back through to practice good writing and formatting.
Start Prepping for Day 8.
-
Day 8 - Wednesday - 10/1 - Click Here
PRE-CLASS: Reading: Python for Data Analysis: Chapter 10
CLASS TIME:
Reading: Build a Career in Data Science: Chapter 2.2. Data Science Companies - Established Retailers
Notes - Data Aggregation and Grouping
Slides - Data Aggregation and Grouping
Video - Data Aggregation and Grouping
Remember to push your class work to git before class starts - this is to show your attempt and a timestamp on your work.
What are your thoughts - Jobs in Massive Tech vs Retail?
ANNOUNCEMENTS:
Work on Data Aggregation
Start Prepping for Day 9.
-
Day 9 - Monday - 10/6 - Click Here
PRE-CLASS: Start Thinking: What kind of project do you want to do for your final project in this class? What is your area of interest? Can you start looking to see what kind of data is out there? Here are a few places to look:
CLASS TIME:
Kaggle Datasets — community-shared datasets, competitions, notebooks and easy CSV downloads.
UCI Machine Learning Repository — classic machine-learning datasets (small to medium size).
Google Dataset Search — search engine for datasets across many repositories.
AWS Open Data Registry — large public datasets hosted on Amazon Web Services.
Data.gov — U.S. government open data portal (census, health, transport, etc.).
EU Open Data Portal — datasets from European Union institutions and agencies.
Zenodo — general-purpose research data repository (DOI-enabled).
Figshare — research outputs and datasets, often with metadata and DOIs.
DataHub — community-curated datasets and data packages in multiple formats.
U.S. Federal Data Catalog — searchable catalog of federal datasets and APIs.
data.world — social platform for datasets, collaborative projects, and SQL queries.
GitHub — Datasets Collection — many dataset repos and curated lists hosted on GitHub.
NASA Open Data — Earth observation, space science, and mission datasets.
World Bank Open Data — global development indicators and country statistics.
FiveThirtyEight Data — cleaned datasets used in FiveThirtyEight stories (great for teaching/analysis).
Microsoft Research Open Data — research datasets from Microsoft Research.
KDnuggets Datasets Directory — curated links and dataset lists for ML and data science.
NOAA (National Centers for Environmental Information) — weather, climate, and oceanographic datasets.
FRED (Federal Reserve Economic Data) — U.S. and global economic time series (GDP, CPI, interest rates, etc.).
Reading: Python for Data Analysis: Chapter 11
Notes - Time Series Data
Slides - Time Series Data
Video - Time Series Data
Remember to push your class work to git before class starts - this is to show your attempt and a timestamp on your work.
Group brainstorming on Project Ideas
ANNOUNCEMENTS:
Time Series Data
Start Prepping for Day 10.
-
Day 10 - Wednesday - 10/8 - Click Here
PRE-CLASS: Reading: Build a Career in Data Science: Chapters 2.3 and 2.4 - The early-stage startup and The late-stage successful startup
CLASS TIME:
Notes - Web Scraping
Slides - Web Scraping
Video - Web Scraping - Intro
Video - Web Scraping - Dynamic Content
Remember to push your class work to git before class starts - this is to show your attempt and a timestamp on your work.
Please do the Check In
What are your thoughts - Jobs at a Start-Up?
ANNOUNCEMENTS:
Get your Selenium Working
Work on Web Scraping
Choose groups for Ethics Discussion
Start Prepping for Day 11.
Exam 1 will be handed out in class on Wednesday 10/15 - and you will have time to work on it over the weekend.
-
Day 11 - Wednesday - 10/15 - Click Here
PRE-CLASS: Reading: Part of the homework is reading the case study. Make sure to do the group work BEFORE CLASS!
CLASS TIME:
Notes - Data Management and Ethics
Slides - Data Management and Ethics
Video - Data Management and Ethics
Remember to push your class work to git before class starts - this is to show your attempt and a timestamp on your work.
Groups lead in Class Discussion
ANNOUNCEMENTS:
Talk about the Exam
Start Prepping for Day 12.
Exam 1 will be due Sunday 10/19 at 11:59pm.
-
Day 12a - Monday - 10/20 - Pause and Get it Done - Click Here
PRE-CLASS: - you really need to run the code before class. We have been spending WAY too much class time installing packages that should be installed outside of class when you start your homework!
Review: What do you need help with? What do you need to finish up before we move to ML?
CLASS TIME: Put it together, get it done day!
ANNOUNCEMENTS: Start Prepping for Day 12b.
-
Day 12b - Wednesday - 10/22 - Click Here
PRE-CLASS: - you really need to run the code before class. We have been spending WAY too much class time installing packages that should be installed outside of class when you start your homework!
Reading: Build a Career in Data Science: Chapters 2.5 and 2.7 - Government Contractors and Interview
CLASS TIME:
Reading: Intro to Machine Learning with Python Chapter 1.7
Career Building - Do a job search using LinkedIn, Indeed, DataJobs, Kaggle, and others. Start a list of jobs that interest you along with qualifications, job descriptions, pay, and location.
Notes - Intro to ML
Slides - Intro to ML
Video - Intro to ML
Remember to push your class work to git before class starts - this is to show your attempt and a timestamp on your work.
Introduction to ML
ANNOUNCEMENTS:
Work on HW.
Start Prepping for Day 13.
-
Day 13 - Monday - 10/27 - Click Here
PRE-CLASS: - you really need to run the code before class. We have been spending WAY too much class time installing packages that should be installed outside of class when you start your homework!
Reading: Make sure you did the reading from last class! We will talk about it today in class!
Reading: Intro to Machine Learning with Python Chapters 2.1 and 2.2
CLASS TIME:
Career Building - Do a job search using LinkedIn, Indeed, DataJobs, Kaggle, and others. Start a list of jobs that interest you along with qualifications, job descriptions, pay, and location.
Notes - Linear and Logistic Regression
Slides - Linear and Logistic Regression
Video - Linear and Logistic Regression
Remember to push your class work to git before class starts - this is to show your attempt and a timestamp on your work.
Talk about Job Search and Government Contractors
ANNOUNCEMENTS:
Work on HW.
Start Prepping for Day 14.
-
Day 13b - Monday - 10/27 - Click Here
Pause to Catch Up and Improve Understanding!
CLASS TIME: Work on HW.
ANNOUNCEMENTS:
Start Prepping for Day 14.
-
Day 14 - Monday - 11/3 - Click Here
Career Building - Continue your job search using LinkedIn, Indeed, DataJobs, Kaggle, and others. Start a list of jobs that interest you along with qualifications, job descriptions, pay, and location. This will help you focus on our next career steps: Job Portfolios, Resumes, and Practice Interviews.
CLASS TIME:
Notes - KNN and Linear Models
Slides - KNN and Linear Models
Video - KNN and Linear Models
Remember to push your class work to git before class starts - this is to show your attempt and a timestamp on your work.
Talk about KNN vs Linear Models.
ANNOUNCEMENTS:
Work on Homework.
Start Prepping for Day 15.
-
Day 14b - Wednesday - 11/5 - Click Here
Pause for Understanding.
CLASS TIME: Live Coding - working on the life expectancy data.
ANNOUNCEMENTS:
Start Prepping for Day 15.
You should be thinking about what data set you want to use for your final project. Download it and start doing some EDA!!
-
Day 15 - Monday - 11/10 - Click Here
Career Building - Building a Portfolio!
CLASS TIME:
Reading: Build a Career in Data Science: Chapter 4 Building a portfolio.
What might you add to your Data Science portfolio? Where might you publish your portfolio for employers to see? What do you need to do to add to your portfolio?
Notes - Regularization and Standard Scaler
Slides - Regularization and Standard Scaler
Video - Regularization and Standard Scaler
Remember to push your class work to git before class starts - this is to show your attempt and a timestamp on your work.
Portfolio Ideas
ANNOUNCEMENTS:
Talk about Regularization.
Work on Homework.
Start Prepping for Day 16.
You should be thinking about what data set you want to use for your final project. Download it and start doing some EDA!!
Project Proposals Due: 11/19
-
Day 16 - Wednesday - 11/12 - Click Here
Career Building - Building a Portfolio!
CLASS TIME:
Notes - Introduction to Naive Bayes Classifiers
Slides - Introduction to Naive Bayes Classifiers
Video - Introduction to Naive Bayes Classifiers
Video - SPAM Detection - Example
Remember to push your class work to git before class starts - this is to show your attempt and a timestamp on your work.
Portfolio Ideas
ANNOUNCEMENTS:
Talk about Naive Bayes
Start Prepping for Day 17.
You should be thinking about what data set you want to use for your final project. Download it and start doing some EDA!!
Project Proposals Due: 11/19
-
Day 17 - Monday - 11/17 - Click Here
Career Building - Resumes and Cover Letters
CLASS TIME:
Reading: Build a Career in Data Science: Chapter 6 Resumes and Cover Letters.
By now you should have a small list of jobs that you might be interested in applying for. This will help you in focusing your resume.
Notes - Introduction to Decision Trees
Slides - Introduction to Decision Trees
Video - Introduction to Decision Trees
Remember to push your class work to git before class starts - this is to show your attempt and a timestamp on your work.
Decision Trees
ANNOUNCEMENTS:
Building a Resume
Overleaf LaTeX Resume
Start Prepping for Day 18.
You should be thinking about what data set you want to use for your final project. Download it and start doing some EDA!!
Project Proposals Due: 11/19
-
Day 18 - Wednesday - 11/19 - Click Here
Career Building - Resumes and Cover Letters
CLASS TIME:
Notes - Dimensionality Reduction PCA and t-SNE
Slides - Dimensionality Reduction PCA and t-SNE
Video - Dimensionality Reduction PCA and t-SNE
Remember to push your class work to git before class starts - this is to show your attempt and a timestamp on your work.
Dimensionality Reduction
ANNOUNCEMENTS:
Building a Resume
Overleaf LaTeX Resume
Start Prepping for Day 19.
Project Proposals Due: 11/19
-
Day 19 - Monday - 11/24 - Click Here
Notes - Cross-Validation
CLASS TIME:
Slides - Cross-Validation
Video - Cross-Validation
Remember to push your class work to git before class starts - this is to show your attempt and a timestamp on your work.
Cross-Validation
ANNOUNCEMENTS:
Final Projects
What do you need to get caught up on?
-
Monday 12/1 and Wednesday 12/3 - Click Here
CLASS TIME: Work on Final Projects - you are welcome to use the classroom as a work space where you can get help from your peers.
ANNOUNCEMENTS: What do you need to get caught up on?
-
Monday - 12/8 - Click Here
CLASS TIME: We will have class where I can answer questions you have about the models you are building and go over cross-validation.
ANNOUNCEMENTS: What do you need to get caught up on?
Each day I will post the lecture videos, homework, reading, and other information. Make sure to check here for each day of class.
-
Homework Solutions - Exam Review
All Practice Problems and Programming Assignment solutions are available on Canvas
Class Canvas
