Syllabus and Course Structure

STAT 154/254: Statistical Machine Learning

Instructors

Ryan Giordano

Term

Course Objectives

To be able to, in theory and in practice, perform the following tasks:

  • Probabilistic ML
    • Differentiate between inference and prediction problems
    • Apply and criticize the framework of empirical risk minimization under model misspecification
    • Apply and criticize the assumption of random training data
  • Evaluation of ML methods
    • Understand complexity bounds, uniform laws of large numbers, and their limitations
    • Use test sets and cross validation for hyperparameter selection and model evaluation
    • Generate and recognize expressive function classes and regularization schemes
  • Familiarity with common ML methods
    • Understand and use linear models under misspecification
    • Understand and use classification models with proxy loss functions
    • Recognize and employ common optimization and approximate optimization methods
    • Draw connections between linear models and more sophisticated ML models (trees, kernels)

Assignments, Exams, and Grading

Grading.

The weighting for the grades will be:

  • Homework completion: 25%
  • Quizzes (each weighted equally): 30%
  • Final exam: 15%
  • Lab attendance: 15%
  • Lab assignments: 15%

Letter grades will be assigned according to the weighted points earned. A score within [90-92%) will earn an A-, [92-98%) will earn an A, and [98-100%) will earn an A+. Scores in the 80’s will receive B’s, in the 70’s will receive C’s, in the 60’s will receive D’s, with the same thresholds for plusses and minuses. Scores below 60% will be considered failing. Grades will be non-negotiable.
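
As an illustration only, the weighting and letter-grade thresholds above can be sketched in Python. The names `WEIGHTS`, `weighted_score`, and `letter_grade` are hypothetical, each component is assumed to be expressed as a percentage from 0 to 100 (after any drops), and a score of exactly 100% is assumed to earn an A+. This is not the official grade calculation.

```python
# Component weights in percent, as listed in the syllabus.
WEIGHTS = {
    "homework": 25,
    "quizzes": 30,
    "final_exam": 15,
    "lab_attendance": 15,
    "lab_assignments": 15,
}


def weighted_score(components):
    """Combine component percentages (0-100) into one weighted percentage."""
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS) / 100


def letter_grade(score):
    """Map a weighted percentage to a letter grade using the stated thresholds."""
    if score < 60:
        return "F"
    if score >= 100:
        return "A+"  # assumption: exactly 100% counts as an A+
    base = "DCBA"[int(score // 10) - 6]  # 60s -> D, 70s -> C, 80s -> B, 90s -> A
    offset = score % 10  # position within the ten-point band
    if offset < 2:
        return base + "-"
    if offset < 8:
        return base
    return base + "+"


example = {
    "homework": 100,
    "quizzes": 85,
    "final_exam": 80,
    "lab_attendance": 100,
    "lab_assignments": 100,
}
score = weighted_score(example)
print(score, letter_grade(score))
```

In this made-up example, full marks on homework and labs combined with 85% on quizzes and 80% on the final give a weighted score of 92.5%, which falls in the A band.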

Grades will not be curved except where otherwise noted.

254 students will be graded the same as 154 students, but will have extra homework and quiz problems. See below for details.

Attendance

Lecture attendance will not be tracked or graded, although it is highly encouraged. Laptops will not be permitted in lectures. iPads and phones will be permitted during lecture for note-taking as long as their use doesn’t inhibit participation.

Attendance in labs is required (though see “drops” below for emergency situations).

Homework.

In this class, homework will serve both as preparation for quizzes and as a way to teach supplemental material not covered in the lectures. The purpose of homework is for you to attempt to work through problems on your own.

Homework will be graded on completion only. Copying from other students or from generative AI provides poor preparation for the quizzes, final exam, and labs, and will not earn any more credit than attempting the homework problems on your own. We strongly encourage students to submit their own best efforts, even if imperfect, rather than copy a correct answer! Solutions will be provided before the quiz, and students are encouraged to check their own work for correctness.

Homework assignments will be due on Gradescope roughly every two weeks, on Sundays no later than 9pm. (See below for more about Gradescope.) All homework will be due as a pdf via Gradescope unless otherwise noted. Students can use whatever tool they like to produce the pdf (LaTeX, Rmd, Jupyter, scanned handwritten notes for mathematical problems, etc.).

In each homework, some problems will be designated for 254 students only. Of course, 154 students are welcome to attempt the problems and to discuss them with the instructors.

Labs.

Attendance at labs is required, and will be recorded for the attendance half of your lab grade. Every week, with some exceptions, students will be required to submit a lab assignment; these assignments will comprise the other half of your lab grade. The lab assignments, like the homework, will be graded for completion only.

Quizzes.

After each homework due date we will have a thirty-minute in-class quiz covering the content of that homework, typically on the following Thursday. These quizzes will take the place of a sit-down midterm exam (i.e., there will be no midterm). No external materials, including cheatsheets, will be allowed during quizzes. The quizzes will be based closely on the homework, and serve to evaluate whether the student is able to answer the homework questions on their own with no outside aids.

Each quiz will have a section for both 154 and 254, and a section that is for 254 only. Students in 254 will not receive extra time, and so must be able to complete the 154 section more quickly than the 154 students, in addition to answering an additional question.

Final exam.

An in-person pencil-and-paper final exam will be scheduled during the usual final exam week. No exam notes (i.e., no “cheatsheet”) will be allowed.

As with the quizzes, 254 students will have extra questions on the final exam, but the same amount of time.

Gradescope

You will be turning in your homework, and receiving your quiz and final grades, on a platform called Gradescope. You are welcome to file a regrade request if you notice that we made an error in applying the rubric to your work, but be sure to do so within a week of the grades being posted. We will not accept regrade requests past that point.

Drops

In order to provide flexibility around emergencies that might arise for you throughout the semester (for example, missing a quiz due to COVID), we will apply the following for everyone:

  • one emergency drop for quizzes
  • one emergency drop for homework
  • one emergency drop for a lab submission
  • four emergency drops for lab attendance

Additional excused absences and/or drops will be given only with an exception granted by the Berkeley DSP office, and we encourage students who are experiencing difficulties to reach out to the DSP office. Unless students are excused by official university policies, additional drops will not be given.
We strongly recommend that students reserve their emergency drops for real emergencies.

Late Work

Late work will not be accepted. If work is not submitted on time, it will receive a zero. It is entirely the students’ responsibility to turn work in on time. If there is any uncertainty concerning this policy, please discuss your concerns with the professor, not with the GSI or reader.

Prerequisites

This course will assume familiarity with:

  • Probability
    • Expectations and conditional expectations
    • Variances and covariances
    • Multivariate distributions
    • The law of large numbers and the central limit theorem
  • Statistics
    • Estimators
    • The bias and variance of an estimator
    • Consistency of estimators
    • Hypothesis testing
  • Multivariate calculus
    • Higher-order derivatives
    • Multivariate derivatives
    • Minimization of functions
    • Lagrange multipliers and constrained optimization
  • Linear algebra
    • Basic linear algebra and systems of equations
    • Matrix rank, invertibility, row spaces and column spaces
    • Vector subspaces and spans
    • Singular value and eigen decompositions

This semester of STAT 154/254 will include labs and projects in the Python language, and familiarity with Python is a prerequisite.

Course website

All of the assignments, lecture notes, and reading will be posted to this course website.

Policies

Course Culture

Students taking STAT 154/254 come from a wide range of backgrounds. We hope to foster an inclusive and supportive learning environment based on curiosity rather than competition. All members of the course community—the instructor, students, tutors, and readers—are expected to treat each other with courtesy and respect.

You will be interacting with course staff and fellow students in several different environments: in class, over the discussion forum, and in office hours. Some of these will be in person, some of them will be online, but the same expectations hold: be kind, be respectful, be professional.

If you are concerned about classroom environment issues created by other students or course staff, please come talk to the instructors about it.

Collaboration policy

You are encouraged to collaborate with your fellow students on problem sets and labs, but the work you turn in should reflect your own understanding and all of your collaborators must be cited. The individual component of quizzes, reading questions, and exams must reflect only your work.

Researchers don’t use one another’s research without permission; scholars and students always use proper citations in papers; professors may not circulate or publish student papers without the writer’s permission; and students may not circulate or post non-public materials (quizzes, exams, rubrics, or any other private class materials) from their class without the written permission of the instructor.

The general rule: you must not submit assignments that reflect the work of others unless those others are cited collaborators.

The following examples of collaboration are allowed and in fact encouraged!

  • Discussing how to solve a problem with a classmate.
  • Showing your code to a classmate along with an error message or confusing output.
  • Posting snippets of your code to the discussion forum when seeking help.
  • Helping other students solve questions on the discussion forum with conceptual pointers or snippets of code that don’t give away the whole answer.
  • Googling the text of an error message.
  • Copying small snippets of code from answers on Stack Overflow.

The following examples are not allowed:

  • Leaving a representation of your assignment (the text, a screenshot) where students (current and future) can access it. Examples include posting it to websites like Course Hero, sharing it on a group text chain or over Discord/Slack, or putting it in a file passed on to future students.
  • Accessing and submitting solutions to assignments from other students distributed as above. This includes copying written answers from other students and slightly modifying the language to differentiate it.
  • Searching for complete problem solutions, or using generative AI to produce them.
  • Working on the final exam or individual quizzes in collaboration with other people or resources. These assignments must reflect individual work.
  • Submitting work on an exam that reflects consultation with outside resources or other people. Exams must reflect individual work.

If you have questions about the boundaries of the policy, please ask. We’re always happy to clarify.

Violations of the collaboration policy

The integrity of our course depends on our ability to ensure that students do not violate the collaboration policy. We take this responsibility seriously and forward cases of academic misconduct to the Center for Student Conduct.

Students determined to have violated the academic misconduct policy by the Center for Student Conduct will receive a grade penalty in the course and a sanction from the university which is generally: (i) First violation: Non-Reportable Warning and educational intervention, (ii) Second violation: Suspension/Disciplinary Probation and educational interventions, (iii) Third violation: Dismissal.

Again, if you have questions about the boundaries of the collaboration policy, please ask!

Laptop policy

Laptops will not be permitted in lecture, but will be required for labs.

If you do not have access to a laptop, you can borrow one from the University library. See the UC Berkeley hardware lending program for more details. The Student Technology Equity Program is another good resource. Feel free to contact the instructor if you have concerns about your access to needed technology.

COVID policy

Maintaining your health and that of the Berkeley community is of primary importance to course staff, so if you are feeling ill or have been exposed to illness, please do not come to class. All of the materials used in class will be posted to the course website. You’re encouraged to reach out to fellow students to discuss the class materials or stop by group tutoring or office hours to chat with a tutor or the instructor.

Accommodations for students with disabilities

STAT 154/254 is a course that is designed to allow all students to succeed. If you have letters of accommodations from the Disabled Students’ Program, please share them with your instructor as soon as possible, and we will work out the necessary arrangements.

Note

These course policies are based on a template and text generously shared by Andrew Bray. Thanks, Andrew!