Statistics 154/254: Statistical Machine Learning

UC Berkeley, Spring 2025

Course Content and Calendar

This course is an introduction to the statistical concepts that underpin our understanding of modern machine learning.

The core concepts are:

  • A taxonomy of ML tasks
    • Inference versus prediction
    • Regression models
    • Classification models
    • Unsupervised learning
  • Loss minimization
    • Task-appropriate loss functions
    • Generalization from empirical risk to population risk
    • Estimating risk in practice
  • Complexity
    • How to form classes of expressive models
    • Implicit and explicit regularization
    • The costs and benefits of complexity (e.g., the bias / variance tradeoff)
  • Computation and estimation
    • Black-box optimization
    • Stochastic optimization
    • Automatic differentiation

The following schedule is aspirational and subject to change as we go.

Calendar (tentative)
| Week | Date | Day | Note | Unit | Topic | Reading | Assignment |
| 1 | Jan 22 | Wednesday |  | Unit 0: Introduction and review | Course policies and introduction |  |  |
| 1 | Jan 24 | Friday |  |  | What statistical prediction is and isn’t |  |  |
| 2 | Jan 27 | Monday |  | Unit 1: Regression | Population loss minimization |  |  |
| 2 | Jan 29 | Wednesday |  |  | Linear regression as empirical loss minimization |  |  |
| 2 | Jan 31 | Friday |  |  | Making new features out of old |  | HW0 due |
| 3 | Feb 3 | Monday |  |  | Bias / variance tradeoff with feature selection |  |  |
| 3 | Feb 5 | Wednesday |  |  | L2 penalization |  |  |
| 3 | Feb 7 | Friday | Guest or recorded lecture |  | L1 penalization |  |  |
| 4 | Feb 10 | Monday |  | Unit 2: Risk and complexity | Uniform laws and generalization error |  |  |
| 4 | Feb 12 | Wednesday |  |  | A uniform law for smooth functions |  |  |
| 4 | Feb 14 | Friday |  |  | VC dimension, zero–one loss, and generalization |  | HW1 due |
| 5 | Feb 17 | Monday | Administrative holiday |  |  |  |  |
| 5 | Feb 19 | Wednesday | (missed due to sickness) |  |  |  |  |
| 5 | Feb 21 | Friday |  |  | Review and quiz |  | Quiz 1 |
| 6 | Feb 24 | Monday |  |  | Cross validation and held-out sets |  |  |
| 6 | Feb 26 | Wednesday |  |  | Cross validation for model selection |  |  |
| 6 | Feb 28 | Friday |  |  | Homework Q&A |  | HW2 due |
| 7 | Mar 3 | Monday |  | Unit 3: Classification | Classification loss and proxy loss functions |  |  |
| 7 | Mar 5 | Wednesday |  |  | Discriminative and generative losses |  |  |
| 7 | Mar 7 | Friday |  |  | Review and quiz |  | Quiz 2 |
| 8 | Mar 10 | Monday | Guest or recorded lecture |  | ROC curves for classification |  |  |
| 8 | Mar 12 | Wednesday | Guest or recorded lecture |  | The perceptron algorithm |  |  |
| 8 | Mar 14 | Friday | Guest or recorded lecture |  | Support vector (max-margin) classifiers |  | (no quiz or HW) |
| 9 | Mar 17 | Monday |  | Floating unit: Optimization | Gradient descent |  |  |
| 9 | Mar 19 | Wednesday |  |  | Stochastic gradient descent |  |  |
| 9 | Mar 21 | Friday |  |  | Homework Q&A |  | HW3 due |
| 10 | Mar 24 | Monday | Spring Break |  |  |  |  |
| 10 | Mar 25 | Tuesday | Spring Break |  |  |  |  |
| 10 | Mar 26 | Wednesday | Spring Break |  |  |  |  |
| 10 | Mar 27 | Thursday | Spring Break |  |  |  |  |
| 10 | Mar 28 | Friday | Spring Break |  |  |  |  |
| 11 | Mar 31 | Monday | Canceled class | Unit 4: Trees and weak learners | No lecture |  |  |
| 11 | Apr 2 | Wednesday |  |  | Regression and classification trees |  |  |
| 11 | Apr 4 | Friday |  |  | Review and quiz |  | Quiz 3 |
| 12 | Apr 7 | Monday |  |  | Bagging |  |  |
| 12 | Apr 9 | Wednesday |  |  | Boosting |  |  |
| 12 | Apr 11 | Friday |  |  | Review and quiz |  | Quiz X (Review) |
| 13 | Apr 14 | Monday |  | Unit 5: Kernels and interpolators | Inner product spaces and the polynomial kernel |  |  |
| 13 | Apr 16 | Wednesday |  |  | The kernel trick in SVMs and ridge regression |  |  |
| 13 | Apr 18 | Friday |  |  | Positive definite kernels |  | HW4 due |
| 14 | Apr 21 | Monday |  |  | Reproducing kernel Hilbert spaces |  |  |
| 14 | Apr 23 | Wednesday |  |  | The representer theorem and interpreting kernels |  |  |
| 14 | Apr 25 | Friday |  |  | Review and quiz |  | Quiz 4 |
| 15 | Apr 28 | Monday |  | Class review | Class review |  |  |
| 15 | Apr 30 | Wednesday |  |  | Class review |  |  |
| 15 | May 2 | Friday |  |  | Class review |  | HW5 due |