Matrix Calculus for Machine Learning and Beyond

This is the course page for an 18.S096 Special Subject in Mathematics at MIT taught in January 2024 (IAP) by Professors Alan Edelman and Steven G. Johnson.

Lectures: MWF 11am–1pm, Jan 16–Feb 2, in room 2-131. 3 units; 2 problem sets, due Jan 24 and Jan 31, submitted electronically via Canvas; no exams. TA/grader: TBA.

Piazza forum: TBA

Description:

We all know that calculus courses such as 18.01 and 18.02 are univariate and vector calculus, respectively. Modern applications such as machine learning and large-scale optimization require the next big step, "matrix calculus" and calculus on arbitrary vector spaces.

This class covers a coherent approach to matrix calculus showing techniques that allow you to think of a matrix holistically (not just as an array of scalars), generalize and compute derivatives of important matrix factorizations and many other complicated-looking operations, and understand how differentiation formulas must be re-imagined in large-scale computing. We will discuss reverse/adjoint/backpropagation differentiation, custom vector-Jacobian products, and how modern automatic differentiation is more computer science than calculus (it is neither symbolic formulas nor finite differences).
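To make the last point concrete, here is a minimal sketch of forward-mode automatic differentiation using dual numbers. It is written in Python purely for illustration (the course itself uses Julia), and the names `Dual`, `sin`, `f`, and `derivative` are hypothetical helpers, not part of the course materials. The derivative is carried along with each arithmetic operation, so no symbolic formula is ever built and no step size is chosen.

```python
import math


class Dual:
    """A number val + der*eps with eps**2 == 0; `der` carries the derivative."""

    def __init__(self, val, der=0.0):
        self.val = val
        self.der = der

    @staticmethod
    def lift(x):
        # Treat plain numbers as constants (derivative 0).
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule applied to numbers, not to symbolic expressions.
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__


def sin(x):
    # Chain rule for sin: (sin u)' = cos(u) * u'.
    return Dual(math.sin(x.val), math.cos(x.val) * x.der)


def f(x):
    # Ordinary program code; it never sees a formula for its own derivative.
    return x * sin(x) + 3 * x


def derivative(func, x0):
    # Seed the input with derivative 1 and read off the output's derivative.
    return func(Dual(x0, 1.0)).der


x0 = 1.2
ad = derivative(f, x0)
exact = math.sin(x0) + x0 * math.cos(x0) + 3        # hand-derived f'(x)
h = 1e-5
fd = (f(Dual(x0 + h)).val - f(Dual(x0)).val) / h    # forward difference

print("AD error:", abs(ad - exact))
print("finite-difference error:", abs(fd - exact))
```

In this sketch the AD result agrees with the hand-derived derivative to roundoff, while the finite-difference estimate carries a truncation error that depends on the assumed step `h`.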

Prerequisites: Linear algebra such as 18.06 and multivariate calculus such as 18.02.

The course will involve simple numerical computations using the Julia language. Ideally, install Julia on your own computer following these instructions, but as a fallback you can run it in the cloud via Binder.

Topics:

Here are some of the planned topics:

  • Derivatives as linear operators and linear approximation on arbitrary vector spaces: beyond gradients and Jacobians.
  • Derivatives of functions with matrix inputs and/or outputs (e.g. matrix inverses and determinants). Kronecker products and matrix "vectorization".
  • Derivatives of matrix factorizations (e.g. eigenvalues/SVD) and derivatives with constraints (e.g. orthogonal matrices).
  • Multidimensional chain rules, and the significance of right-to-left ("forward") vs. left-to-right ("reverse") composition. Chain rules on computational graphs (e.g. neural networks).
  • Forward- and reverse-mode manual and automatic multivariate differentiation.
  • Adjoint methods (vJp/pullback rules) for derivatives of solutions of linear, nonlinear, and differential equations.
  • Application to nonlinear root-finding and optimization. Multidimensional Newton and steepest-descent methods.
  • Applications in engineering/scientific optimization and machine learning.
  • Second derivatives, Hessian matrices, quadratic approximations, and quasi-Newton methods.
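
As a rough illustration of why the order of composition in the chain rule matters, the sketch below multiplies three Jacobians for a hypothetical chain f = f3(f2(f1(x))) with n inputs and one scalar output, and counts scalar multiplications for each ordering. It is in Python for illustration only (the course uses Julia); the matrices `J1`, `J2`, `J3` are made-up data and `matmul` is a hypothetical helper, not course code.

```python
def matmul(A, B, counter):
    """Multiply list-of-lists matrices, adding the number of scalar
    multiplications performed to counter[0]."""
    n, k, m = len(A), len(B), len(B[0])
    C = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for p in range(k):
                C[i][j] += A[i][p] * B[p][j]
                counter[0] += 1
    return C


n = 50
# Made-up Jacobians with small integer entries, so both orderings agree exactly.
J1 = [[(i + 2 * j) % 5 - 2.0 for j in range(n)] for i in range(n)]  # n x n
J2 = [[(3 * i + j) % 7 - 3.0 for j in range(n)] for i in range(n)]  # n x n
J3 = [[(i % 3) - 1.0 for i in range(n)]]                             # 1 x n

# Left-to-right ("reverse"): (J3 J2) J1, only row-vector-times-matrix products.
reverse_cost = [0]
grad_reverse = matmul(matmul(J3, J2, reverse_cost), J1, reverse_cost)

# Right-to-left ("forward"): J3 (J2 J1), which forms an n x n product first.
forward_cost = [0]
grad_forward = matmul(J3, matmul(J2, J1, forward_cost), forward_cost)

print("same gradient:", grad_reverse == grad_forward)
print("reverse multiplications:", reverse_cost[0])
print("forward multiplications:", forward_cost[0])
```

For a scalar output, the left-to-right ordering here costs 2n² multiplications and the right-to-left ordering costs n³ + n², which is the basic reason reverse mode is favored for gradients of functions with many inputs and one output.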

Lecture 1 (Jan 18)

  • part 1: overview
  • part 2: derivatives as linear operators
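
As a small taste of the "derivatives as linear operators" viewpoint, the sketch below checks numerically that for f(A) = A², a small change dA produces the change df = A dA + dA A, a linear function of dA that is not 2A dA because matrices do not commute in general. It is in Python for illustration only (the course uses Julia), and the helper names and the sample matrices are assumptions, not taken from the lecture notes.

```python
def matmul(A, B):
    return [[sum(A[i][p] * B[p][j] for p in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def scale(c, A):
    return [[c * a for a in row] for row in A]


def max_abs(A):
    return max(abs(a) for row in A for a in row)


def f(A):
    # The function being differentiated: f(A) = A^2.
    return matmul(A, A)


def df(A, dA):
    # The derivative as a linear operator acting on dA: A dA + dA A.
    return add(matmul(A, dA), matmul(dA, A))


A = [[1.0, 2.0], [3.0, 4.0]]
dA = [[1e-6, -2e-6], [3e-6, 5e-7]]   # a small, arbitrary perturbation

actual = sub(f(add(A, dA)), f(A))    # true change f(A + dA) - f(A)
error = max_abs(sub(actual, df(A, dA)))
naive_error = max_abs(sub(actual, scale(2.0, matmul(A, dA))))

print("error of A dA + dA A:", error)
print("error of naive 2 A dA:", naive_error)
```

The first error is second order in dA (it comes from the dA² term, plus roundoff), while the naive scalar-calculus guess is off at first order.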
