Intermediate Data Science
Fall 2025
Office Location:
Duke Hall #209
Phone: 909-748-8630
E-Mail: joanna_bieri@redlands.edu
(Email or Teams are my preferred contact methods)
Data Science Lab:
TBA
Office Hours:
Click here for my schedule.
You can also email me for an appointment!
-
Link to our Canvas - for submitting work and checking grades:
-
Important Course Documents
Python for Data Analysis, Wes McKinney
Course Syllabus
Schedule of Topics
NOTE: As the semester progresses, we may adjust the schedule to suit our class pace and interests. The most recent schedule will be posted here.
Link to our GitHub - for getting assignments and version control
-
Daily Assignments - Reading - Handouts
-
Day 1 - Wednesday - 9/3 - Click Here
PRE-CLASS: Most days there will be videos, homework assignments, and reading that you are expected to complete before class. It is okay if you don't get all the homework done, but you should attempt every part and keep trying until you are completely stuck and have questions to bring to class.
CLASS TIME: Notes - Computer Setup - Review
ANNOUNCEMENTS:
Slides - Computer Setup - Review
Video - Set up your computer and be successful in this class (important that this is done this week!)
Remember to pull the class files to your local machine and make a copy in your working directory! This is where you will find HW_day1.ipynb - we will work on this in class.
Reading: Python for Data Analysis
Chapter 2.3 and all of Chapter 3 of our book are a good review of Python basics. Please glance through these chapters before starting your prep for Day 2! We will be working a lot in Pandas, but it is important that you know how to work with Python lists, dictionaries, sets, and tuples (p.47-64). It is helpful to learn list comprehensions (p.64), but as long as you can write a for loop you will be fine. We will use functions and lambdas to organize our code (p.65-76). We will practice reading in different data types, mostly with pandas but sometimes with other methods (p.79-80).
Finish HW_day1.ipynb - come Friday if you need help!
Start Prepping for Day 2 - click on the Day 2 link and complete the PRE-CLASS materials.
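If you want a quick self-check on the Python basics from the Day 1 reading, here is a minimal sketch of the ideas mentioned there: lists, tuples, sets, dictionaries, a for loop next to the matching list comprehension, and a function plus a lambda. All names, values, and grade cutoffs are made up for illustration and do not come from the book or from HW_day1.ipynb.

```python
# Core containers reviewed in Chapter 3 (all values here are made up).
scores = [88, 92, 75, 92]            # list: ordered, mutable
point = (3, 4)                       # tuple: ordered, immutable
unique_scores = set(scores)          # set: keeps unique values only
ages = {"Ana": 21, "Ben": 19}        # dict: key -> value lookup

# A for loop and the equivalent list comprehension.
squares_loop = []
for s in scores:
    squares_loop.append(s ** 2)
squares_comp = [s ** 2 for s in scores]


def letter_grade(score):
    """Return a letter grade for a numeric score (hypothetical cutoffs)."""
    if score >= 90:
        return "A"
    if score >= 80:
        return "B"
    return "C"


# Apply the function to every score, then use a lambda as a sort key.
grades = [letter_grade(s) for s in scores]
names_by_age = sorted(ages, key=lambda name: ages[name])

print(grades)
print(names_by_age)
```

Try typing this into a Jupyter Notebook cell and changing the values to see what happens.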
-
Day 2 - Monday - 9/8 - Click Here
PRE-CLASS: Reading: Python for Data Analysis: Chapter 5
CLASS TIME:
NOTE: The book is a great reference for general data analysis. Don't feel like you have to read it line by line. The lecture notes follow the book fairly closely. Have a Jupyter Notebook open so you can try some of the commands!
Notes - Pandas
Slides - Pandas - summary of functions
Video - Pandas Review-Advanced
Remember to pull the class files to your local machine and make a copy in your working directory! This is where you will find HW_day2.ipynb - we will work on this in class.
Remember to push your class work to git before class starts - this is to show your attempt and a timestamp on your work.
Pandas!
ANNOUNCEMENTS:
Work on finishing up HW_day1.ipynb.
Go through the warm-up "You Try" problems.
Work on Titanic Data.
Start Prepping for Day 3 - click on the Day 3 link and complete the PRE-CLASS materials.
-
Day 3 - Wednesday - 9/10 - Click Here
PRE-CLASS: Reading: Python for Data Analysis: Chapter 6
CLASS TIME:
Reading: Build a Career in Data Science: Chapters 1.1 and 1.2. This is only 11 pages. Please read it, take notes, and come to class ready to chat about your thoughts!
Notes - Data Reading Writing and File Types
Slides - Data Reading Writing and File Types
Video - Data Reading Writing and File Types - Part1 Overview
Video - Data Reading Writing and File Types - Part2 Code Walkthrough
Remember to push your class work to git before class starts - this is to show your attempt and a timestamp on your work.
Go through the warm-up "You Try" problems.
ANNOUNCEMENTS:
Work on HW_Day3 - reading your own data.
Start Prepping for Day 4
-
Day 4 - Monday - 9/15 - Click Here
PRE-CLASS: Reading: Python for Data Analysis: Chapter 7
CLASS TIME:
Reading: Build a Career in Data Science: Chapter 1.3. Choosing your path!
Notes - Data Cleaning and Preparation
Slides - Data Cleaning and Preparation
Video - Data Cleaning and Preparation
Remember to push your class work to git before class starts - this is to show your attempt and a timestamp on your work.
Go through the warm-up "You Try" problems.
ANNOUNCEMENTS:
Work on HW_Day4 - cleaning up messy data.
Start Prepping for Day 5
-
Day 5 - Wednesday - 9/17 - Click Here
PRE-CLASS: Reading: Python for Data Analysis: Chapter 8
CLASS TIME:
Notes - Data Wrangling
Slides - Data Wrangling
Video - Data Wrangling
Remember to push your class work to git before class starts - this is to show your attempt and a timestamp on your work.
Please do the Check in Quiz - I really appreciate your feedback!
Go through the warm-up "You Try" problems.
ANNOUNCEMENTS:
Work on HW_Day5 - wrangling - joining - merging - pivoting - melting.
Start Prepping for Day 6
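As a warm-up for the Day 5 wrangling homework, here is a minimal sketch of three of the operations named above: merging (joining), pivoting, and melting. The two small tables and their column names are invented for illustration and are not the homework data.

```python
import pandas as pd

# Two small made-up tables that share an "id" column.
students = pd.DataFrame({"id": [1, 2, 3], "name": ["Ana", "Ben", "Cy"]})
scores = pd.DataFrame({
    "id": [1, 1, 2, 2],
    "exam": ["midterm", "final", "midterm", "final"],
    "score": [85, 90, 70, 80],
})

# Merge (join): a left join keeps every student, even one with no scores.
merged = students.merge(scores, on="id", how="left")

# Pivot: long -> wide, one row per student and one column per exam.
wide = scores.pivot(index="id", columns="exam", values="score")

# Melt: wide -> long again, one row per (student, exam) pair.
long = wide.reset_index().melt(id_vars="id", var_name="exam", value_name="score")

print(merged)
print(wide)
print(long)
```

Notice that the student with no scores still appears in the merged table, with missing values in the exam and score columns. Try changing how="left" to how="inner" and see which rows disappear.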
-
Day 6 - Monday - 9/22 - Click Here
PRE-CLASS: Reading: Python for Data Analysis: Chapter 9
CLASS TIME:
Reading: Build a Career in Data Science: Chapter 2.1. Data Science Companies - Massive Tech
Notes - Visualization
Slides - Visualization
Video - Visualization
Remember to push your class work to git before class starts - this is to show your attempt and a timestamp on your work.
Show and Tell -- How do you avoid misrepresentation when creating visualizations? Please bring to class at least one example to share of data misrepresentation through visualization. You can look up examples online or create one of your own.
Go through the warm-up "You Try" problems.
ANNOUNCEMENTS:
Work on HW_Day6 - plotting.
Start Prepping for Day 7.
-
Day 7 - Wednesday - 9/24 - Click Here
PRE-CLASS: Reading:
CLASS TIME:
We will take a break from new content to talk about high-quality writing in data science. Here are some articles that I would like you to read before class:
The Most Undervalued Skill for Data Scientists
Practical Advice for Data Science Writing
Show and Tell -- How do you avoid misrepresentation when creating visualizations? Please bring to class at least one example to share of data misrepresentation through visualization. You can look up examples online or create one of your own.
Writing in Data Science
ANNOUNCEMENTS:
Avoiding Misrepresentation
Quarto Quarto Install
There is no new homework for Day 7. Please use the extra time to catch up on past homework and really go back through to practice good writing and formatting.
Start Prepping for Day 8.
-
Day 8 - Wednesday - 10/1 - Click Here
PRE-CLASS: Reading: Python for Data Analysis: Chapter 10
CLASS TIME:
Reading: Build a Career in Data Science: Chapter 2.2. Data Science Companies - Established Retailers
Notes - Data Aggregation and Grouping
Slides - Data Aggregation and Grouping
Video - Data Aggregation and Grouping
Remember to push your class work to git before class starts - this is to show your attempt and a timestamp on your work.
What are your thoughts - Jobs in Massive Tech vs Retail?
ANNOUNCEMENTS:
Work on Data Aggregation
Start Prepping for Day 9.
-
Day 9 - Monday - 10/6 - Click Here
PRE-CLASS: Start Thinking: What kind of project do you want to do for your final project in this class? What is your area of interest? Can you start looking to see what kind of data is out there? Here are a few places to look:
CLASS TIME:
Kaggle Datasets — community-shared datasets, competitions, notebooks and easy CSV downloads.
UCI Machine Learning Repository — classic machine-learning datasets (small to medium size).
Google Dataset Search — search engine for datasets across many repositories.
AWS Open Data Registry — large public datasets hosted on Amazon Web Services.
Data.gov — U.S. government open data portal (census, health, transport, etc.).
EU Open Data Portal — datasets from European Union institutions and agencies.
Zenodo — general-purpose research data repository (DOI-enabled).
Figshare — research outputs and datasets, often with metadata and DOIs.
DataHub — community-curated datasets and data packages in multiple formats.
U.S. Federal Data Catalog — searchable catalog of federal datasets and APIs.
data.world — social platform for datasets, collaborative projects, and SQL queries.
GitHub — Datasets Collection — many dataset repos and curated lists hosted on GitHub.
NASA Open Data — Earth observation, space science, and mission datasets.
World Bank Open Data — global development indicators and country statistics.
FiveThirtyEight Data — cleaned datasets used in FiveThirtyEight stories (great for teaching/analysis).
Microsoft Research Open Data — research datasets from Microsoft Research.
KDnuggets Datasets Directory — curated links and dataset lists for ML and data science.
NOAA (National Centers for Environmental Information) — weather, climate, and oceanographic datasets.
FRED (Federal Reserve Economic Data) — U.S. and global economic time series (GDP, CPI, interest rates, etc.).
Reading: Python for Data Analysis: Chapter 11
Notes - Time Series Data
Slides - Time Series Data
Video - Time Series Data
Remember to push your class work to git before class starts - this is to show your attempt and a timestamp on your work.
Group brainstorming on Project Ideas
ANNOUNCEMENTS:
Time Series Data
Start Prepping for Day 10.
-
Day 10 - Wednesday - 10/8 - Click Here
PRE-CLASS: Reading: Build a Career in Data Science: Chapters 2.3 and 2.4: The early-stage startup and The late-stage successful startup
CLASS TIME:
Notes - Web Scraping
Slides - Web Scraping
Video - Web Scraping - Intro
Video - Web Scraping - Dynamic Content
Remember to push your class work to git before class starts - this is to show your attempt and a timestamp on your work.
Please do the Check In
What are your thoughts - Jobs at a Start-Up
ANNOUNCEMENTS:
Get your Selenium Working
Work on Web Scraping
Choose groups for Ethics Discussion
Start Prepping for Day 11.
Exam 1 will be handed out in class on Wednesday 10/15 - and you will have time to work on it over the weekend.
-
Day 11 - Wednesday - 10/15 - Click Here
PRE-CLASS: Reading: Part of the homework is reading the case study. Make sure to do the group work BEFORE CLASS!
CLASS TIME:
Notes - Data Management and Ethics
Slides - Data Management and Ethics
Video - Data Management and Ethics
Remember to push your class work to git before class starts - this is to show your attempt and a timestamp on your work.
Groups lead in Class Discussion
ANNOUNCEMENTS:
Talk about the Exam
Start Prepping for Day 12.
Exam 1 will be due Sunday 10/19 at 11:59pm.
-
Day 12 - Monday - 10/20 - Click Here
PRE-CLASS: - you really need to run the code before class. We have been spending WAY too much class time installing packages that should be installed outside of class when you start your homework!
Review: What do you need help with? What do you need to finish up before we move to ML?
CLASS TIME: Put it together, get it done, day!
ANNOUNCEMENTS: Start Prepping for Day 13.
-
Day 13 - Wednesday - 10/22 - Click Here
PRE-CLASS: - you really need to run the code before class. We have been spending WAY too much class time installing packages that should be installed outside of class when you start your homework!
Reading: Build a Career in Data Science: Chapters 2.5 and 2.7: Government Contractors and Interview
CLASS TIME:
Reading: Intro to Machine Learning with Python Chapter 1.7
Career Building - Do a job search using LinkedIn, Indeed, DataJobs, Kaggle, and others. Start a list of jobs that interest you along with qualifications, job descriptions, pay, and location.
Notes - Intro to ML
Slides - Intro to ML
Video - Intro to ML
Remember to push your class work to git before class starts - this is to show your attempt and a timestamp on your work.
Introduction to ML
ANNOUNCEMENTS:
Work on HW.
Start Prepping for Day 14.
Each day I will post the lecture videos, homework, reading, and other information. Make sure to check here for each day of class.
-
Homework Solutions - Exam Review
All Practice Problems and Programming Assignment solutions are available on Canvas
Class Canvas