Einsum Visualized
A Swiss Army knife of array operations

`einsum` (= Einstein sum), a modest but powerful function of NumPy (also of PyTorch, TensorFlow, Dask, etc.), is a universal tool for manipulating multi-dimensional arrays: it alone can do all kinds of sums, multiplications, axis rearrangements, and so on.
Its syntax might seem awkward at first glance, but once you get used to it, it proves far more convenient than traditional tools like `tensordot` or `transpose`, particularly in non-trivial cases.
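As a first taste, here is a minimal sketch (assuming NumPy; the arrays are arbitrary examples, not from the article) of how one function covers operations that otherwise need separate calls:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
b = np.arange(12).reshape(3, 4)
v = np.arange(3)

np.einsum('ij->ji', a)        # transpose: a.T
np.einsum('ij->', a)          # sum of all elements: a.sum()
np.einsum('ij->j', a)         # column sums: a.sum(axis=0)
np.einsum('i,i->', v, v)      # dot product: v @ v
np.einsum('ij,jk->ik', a, b)  # matrix product: a @ b
```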
Contents
- Motivation
- How it works:
  - Why Einstein?
  - Axes Manipulation
  - Alphabetic ordering of indices
  - Summing over a non-repeating index
  - Five Rules of Einsum
- Specific use cases:
  - Multiplying 2D Arrays
  - Multiplying 3D Arrays
  - Multiplying a bunch of arrays at once
  - Ellipsis and ‘keinsum’
- Examples:
  - Example from deep learning: Transformers
  - Example from physics: N-body problem
- Relation to other libraries
- Derived work
- Conclusion
1. Motivation
Although the primary scope of `einsum` is 3D and above, it also proves to be a lifesaver, both in terms of speed and clarity, when working with matrices and vectors.
Two examples of higher speed, both sketched in code below, are:
- rewriting an element-wise matrix product `a*b*c` using `einsum` provides a 2x performance boost, since it optimizes two loops into one
- rewriting a linear algebra matrix product `a@b@c` with `einsum` can lead to a 100x speedup even for moderately sized (non-square) matrices, as `einsum` can automatically select the optimal execution path (see details below)
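Both rewrites look like this in practice (a minimal sketch; the shapes are illustrative assumptions, chosen so that the contraction order matters):

```python
import numpy as np

rng = np.random.default_rng(0)

# Element-wise product of three matrices in one fused loop
# (a * b * c makes two passes: first a*b, then that result times c)
a, b, c = (rng.random((1000, 1000)) for _ in range(3))
elementwise = np.einsum('ij,ij,ij->ij', a, b, c)   # same values as a * b * c

# Chained matrix product of non-square matrices; x @ y @ z always
# evaluates left to right, while optimize=True lets einsum pick the
# cheaper contraction order (here y @ z first, a ~100x smaller job)
x = rng.random((1000, 10))
y = rng.random((10, 1000))
z = rng.random((1000, 10))
chained = np.einsum('ij,jk,kl->il', x, y, z, optimize=True)  # same as x @ y @ z
```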
As an example of improved clarity, let’s look at some common linear algebra operations in different dimensions with and without `einsum`. Here’s a table showing the operations:

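To give a flavor of that comparison in code (a sketch with illustrative arrays, not a reproduction of the table), here is the same operation family written both ways:

```python
import numpy as np

u, v = np.ones(3), np.ones(3)                   # vectors
A, B = np.ones((2, 3)), np.ones((3, 4))         # matrices
S, T = np.ones((5, 2, 3)), np.ones((5, 3, 4))   # stacks of matrices

np.einsum('i,i->', u, v)         # 1D: dot product, same as u @ v
np.einsum('ij,jk->ik', A, B)     # 2D: matrix product, same as A @ B
np.einsum('bij,bjk->bik', S, T)  # 3D: batched matrix product, same as S @ T
# np.tensordot(S, T, axes=([2], [1])) also contracts the inner axis,
# but returns shape (5, 2, 5, 4): it crosses the two batch axes
# instead of pairing them
```

The index strings read the same way at every dimensionality, which is where the clarity gain comes from.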
For 1D and 2D, the `@` operator wins hands down, but starting from 3D it becomes apparent that `dot`, `matmul`, and `tensordot` are all ill-suited for high dimensions. There must be a…