Applied Data Science with Python
Course Information
- Instructor: Christopher Seaman
- EAs: Samantha Chan (Samantha.Chan2@ucsf.edu) & Eric Yang (Eric.Yang2@ucsf.edu)
- Dates: January 15th - March 19th, 2026 (11 class meetings)
- Lecture: Wednesday, 1:00 PM - 3:00 PM, Mission Hall 1407
- Lab: Wednesday, 3:00 PM - 4:30 PM, Mission Hall 1407
- GitHub: https://github.com/christopherseaman/datasci_223
- Course Website: https://christopherseaman.github.io/datasci_223/
- CLE: DATASCI 223: Applied Data Science with Python (Winter 2026)
This repository contains the course materials for UCSF DataSci 223: Applied Data Science with Python.
Course Topics (Winter 2026 - 11 Lectures)
Foundational (L01-L04)
- Setup + Debugging - Notebook hygiene, defensive programming, VS Code debugger
- Larger-than-Memory Data - Polars lazy evaluation, out-of-core processing, parquet
- SQL for Data Analysis - SELECT, JOIN, GROUP BY, window functions, pandas integration
- NLP Foundations - Text preprocessing, embeddings, sentiment, clinical text applications
ML/AI Progression (L05-L08)
- Classification - Train/test splits, evaluation metrics, Random Forest, XGBoost
- Neural Networks - MLP, CNN, RNN/LSTM, PyTorch training loop
- Transformers & Deep Learning - Attention mechanism, Hugging Face, tokenization
- LLMs: DIY -> API, Agents & Workflows - nanoGPT walkthrough, embeddings, fine-tuning concepts, API integration, agents, prompt engineering
Applied / Student Choice (L09-L11)
09-11. Student Vote (TBD) - Some options:
- Computer Vision (transfer learning, medical imaging)
- Visualization & Dashboards (Altair, Streamlit, MkDocs reports)
- Time Series & Forecasting (ARIMA, ML regressors)
- A/B Testing (causal inference, power analysis)
- Distributed Computing (threads/processes, HPC intro)
- End-to-End Project (CRISP-DM, capstone guidance)
- Jobs, technical interviews, & impostor syndrome
- Generative AI with Images
- Feature Engineering and Selection
- Algorithms and complexity notation
- Local setup with the "Modern Data Stack"
- Deploying a basic model/app to the web