## COMPSCI 689: Machine Learning – Fall 2021

Offered: 2021

**Course Description:** Machine learning is the computational study of artificial systems that can adapt to novel situations, discover patterns from data, and improve performance with practice. This course will cover the mathematical foundations of supervised and unsupervised learning. The course will provide a state-of-the-art overview of the field, with an emphasis on deriving and implementing learning algorithms for a variety of models from first principles. 3 credits.

**Detailed course topics:** Overview of supervised and unsupervised learning; mathematical foundations of numerical optimization and statistical estimation; maximum likelihood and maximum a posteriori (MAP) estimation; missing data and expectation maximization (EM); graphical models, including mixture models and hidden Markov models; logistic regression and generalized linear models; maximum entropy and undirected graphical models; nonparametric models, including nearest neighbor methods and kernel-based methods; and dimensionality reduction methods (PCA and LDA). The course will focus on deriving learning algorithms from first principles and implementing them from scratch.

**Location:** TBA.

**Website:** The course website will be hosted on Moodle.

**Textbook:** The course will use *Machine Learning: A Probabilistic Perspective* by Kevin Murphy as the course text. This text is available to UMass students for free through the UMass library.

**Computing:** Access to a relatively modern computer will be required to complete the assignments for the course. The course will use Python as its programming language.

**Required Background:** This course requires a **strong mathematical background** in probability and statistics, multivariate calculus and linear algebra. See below for recommended preparation over the summer.

**What is the difference between COMPSCI 689 and COMPSCI 589?:** 589 was designed to focus on understanding and applying core machine learning models and algorithms. 689 covers the mathematical foundations of machine learning, with a focus on deriving and implementing machine learning algorithms for novel models from scratch. The course is primarily intended for students interested in pursuing research on machine learning models and algorithms. It focuses on the math-to-code-to-experiments-to-results pipeline needed to take machine learning research ideas from conception to publication. While both 589 and 689 require a background in multivariate calculus, linear algebra, and probability, 689 will use more of this background material than 589.

**Who Should Take COMPSCI 689?:** 689 is primarily intended as an AI area core course for doctoral stream students. Undergraduate students should, without exception, take COMPSCI 589 before applying for an override for COMPSCI 689. Professional MS students and other graduate students from outside computer science should also take COMPSCI 589 before attempting COMPSCI 689 unless they have a prior undergraduate background in machine learning or an extremely strong background in mathematics, statistics, and programming (for example, an undergraduate degree in mathematical computing).

**What Should I do to Prepare to Take 689?**

- *Make sure 689 is the right course for you and this is the right time to take it.* See the suggestions above about 589 vs. 689.
- *Set up your schedule to accommodate the course.* All students are strongly advised against taking 689 in combination with any other PhD-level core course unless they have extremely strong backgrounds in all areas. You can make up gaps in background at the same time you learn primary course material, but you will need to be prepared to devote extra time to the course to do so.
- *Start addressing gaps or weaknesses in your background now.* 689 starts with the assumption that you have sufficient background knowledge of linear algebra, vector calculus, multivariate probability, and Python, and will integrate aspects of these topics together from the outset (e.g., using differential calculus to derive a method for optimizing the parameters of a multivariate probability density over a vector space and then implementing the method in Python). The course does not cover background topics, but to help you prepare we have assembled a reading list that covers what you need to know to get started in the course. Reviewing all of the material below with a focus on weaker areas is a good strategy for all students. The specific sources below may cover material at a deeper level than is included in some undergrad CS programs (for example, computational complexity of linear algebra operations), so all students may want to at least skim this material.
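
As a rough illustration of the kind of integration described above, the sketch below fits a multivariate Gaussian by maximum likelihood in NumPy. It is not taken from the course materials: the choice of model, the function names `gaussian_mle` and `gaussian_log_likelihood`, and the synthetic data are all assumptions made for this example. Setting the gradient of the Gaussian log-likelihood to zero yields the closed-form estimates used in the code (the sample mean and the covariance normalized by N rather than N - 1).

```python
import numpy as np


def gaussian_mle(X):
    """Closed-form maximum likelihood estimates for a multivariate Gaussian.

    Setting the gradient of the log-likelihood to zero gives
    mu = (1/N) * sum_i x_i and Sigma = (1/N) * sum_i (x_i - mu)(x_i - mu)^T.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    mu = X.mean(axis=0)
    centered = X - mu
    sigma = centered.T @ centered / n  # normalized by N, not N - 1
    return mu, sigma


def gaussian_log_likelihood(X, mu, sigma):
    """Total log-likelihood of the rows of X under N(mu, sigma)."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    centered = X - mu
    # slogdet is more numerically stable than log(det(sigma)).
    _, logdet = np.linalg.slogdet(sigma)
    # Solve sigma @ Z = centered.T instead of forming an explicit inverse.
    solved = np.linalg.solve(sigma, centered.T)
    quad = np.sum(centered.T * solved)
    return -0.5 * (n * d * np.log(2.0 * np.pi) + n * logdet + quad)


# Synthetic data (hypothetical parameters chosen only for this example).
rng = np.random.default_rng(0)
true_mu = np.array([1.0, -2.0])
true_sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
X = rng.multivariate_normal(true_mu, true_sigma, size=5000)

mu_hat, sigma_hat = gaussian_mle(X)
ll_mle = gaussian_log_likelihood(X, mu_hat, sigma_hat)
ll_true = gaussian_log_likelihood(X, true_mu, true_sigma)
```

Because the estimates maximize the likelihood on the observed sample, `ll_mle` should be at least as large as `ll_true`, which is a useful sanity check when implementing a derived estimator.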

**Suggested Reading List:**

Covering the math in the order listed below is likely to be most helpful. For calculus, Corral or Marsden and Tromba can be used. Marsden and Tromba is more detailed, but Corral will do. All texts are open access or freely available through the UMass Library (links provided), except for Marsden and Tromba. The course’s Piazza site will open at the beginning of the summer to facilitate discussion of background material among students.

- Stephen Boyd. Introduction to Applied Linear Algebra
    - Chapter 1: Vectors
    - Chapter 2.1: Linear Functions
    - Chapter 3: Norm and Distance
    - Chapter 5: Linear Independence
    - Chapter 6: Matrices
    - Chapter 8: Linear Equations (Can skip 8.2)
    - Chapter 10: Matrix Multiplication
    - Chapter 11: Matrix Inverses

- Stephen Boyd and Lieven Vandenberghe. Convex Optimization. (Covers additional linear algebra background missing from the Applied text)
    - Appendix A.1, A.3, A.4, A.5
    - Appendix C.1, C.2, C.3, C.4

- Michael Corral. Vector Calculus
    - Chapter 1: Vectors in Euclidean Space (1.1 to 1.6, 1.8)
    - Chapter 2: Functions of Several Variables (2.1 to 2.5)
    - Chapter 3: Double Integrals (3.1, 3.3, 3.4, 3.7)

- Marsden and Tromba. Vector Calculus
    - Chapter 1: Geometry of Euclidean Space (1.1, 1.2, 1.3, 1.5)
    - Chapter 2: Differentiation (2.1, 2.2, 2.3, 2.5, 2.6)
    - Chapter 3: Higher Order Derivatives (3.1, 3.3)
    - Chapter 4: Vector Valued Functions (4.1)
    - Chapter 5: Double and Triple Integrals (5.1, 5.2, 5.5)

- Bishop. Pattern Recognition and Machine Learning (probability from an ML perspective)
    - Chapter 1: Introduction (1.2)
    - Chapter 2: Probability Distributions (2.1, 2.2, 2.3, 2.4)

- Murphy. Machine Learning: A Probabilistic Perspective (more probability from an ML perspective)
    - Chapter 2: Probability

- Python background (NumPy, SciPy, PyTorch)
    - Scipy Lecture Notes: Getting started with Python for science (can skip 1.5.8-1.5.10)
    - Scipy Lecture Notes: Optimizing Code
    - Scipy Lecture Notes: Scikit-learn: machine learning in Python (will use for some baseline methods, helpful to know APIs)
    - PyTorch Tutorials
    - PyTorch Documentation