John’s ellipsoid theorem is a clean high-dimensional convex geometry fact that shows up in a lot of different places. Informally, it says:
Every $n$-dimensional symmetric convex body is an ellipsoid, up to a $\sqrt n$ scaling.
This is quite nice and convenient because ellipsoids are basically “$\ell_2$” objects, which makes them a lot easier to do math with than arbitrary convex bodies. There is a long list of applications of John’s theorem that I won’t cover here, but it recently came up for me in three different contexts.
I was reading Thomas Rothvoss’s exposition of Shachar Lovett’s proof of the $\sqrt{\operatorname{rank}}$-type bound for the log-rank conjecture, and John’s theorem is used to prove a certain matrix factorization theorem. John’s theorem (and crucially the contact isotropy lemma used in its proof) also came up in Milman’s proof of Dvoretzky’s theorem too. Independently, I saw this Bizeul–Klartag result in a talk, which roughly says that every convex body is approximately an ellipsoid up to a $\operatorname{polylog}(n)$ factor. In particular: there is an ellipsoid $\mathcal{E}$ such that $99\%$ of $\mathcal{E}$ is contained inside the convex body, and a $\operatorname{polylog}(n)$-dilate of $\mathcal{E}$ contains $99\%$ of the convex body.
Theorem. Let $K \subseteq \mathbb R^n$ be a bounded centrally symmetric convex body. There exists an ellipsoid $\mathcal E$ such that $ \mathcal E \subseteq K \subseteq \sqrt n \mathcal E. $
There is also a non-symmetric version, where the factor becomes $n$ instead of $\sqrt n$, but today let us focus on the symmetric case.
Proof of John’s theorem
The first step of the proof is:
Pick the maximum volume ellipsoid $\mathcal E$ contained in $K$.
Without loss of generality, we may assume that the maximum volume ellipsoid is the Euclidean unit ball by passing to an appropriate linear transformation on $K$ (i.e. the inverse of the transformation mapping the ball to $\mathcal{E}$). The goal thus reduces to proving that for $K$ such that the maximum volume ellipsoid it contains is the unit ball, we have $\|v\|_2 \le \sqrt{n}$ for every $v\in K$.
Let $S_{\mathrm{contact}} = \partial K \cap S^{n-1}$ be the set of contact points between $K$ and the unit ball, i.e., the set of all points of unit norm that are on the boundary of $K$. A key lemma for us is that there is an isotropic distribution over contact points.
Contact isotropy lemma. There is a probability distribution $\mu$ on $S_{\mathrm{contact}}$ such that $ \mathbb{E}_{x \sim \mu} xx^\top = \frac{1}{n}I. $
The contact isotropy lemma also features in applications like Dvoretzky’s theorem. Before proving the lemma, let’s see why it immediately implies John’s theorem.
Take a contact point $x \in S_{\mathrm{contact}}$. Since $B_2^n$ is contained in $K$, and $x$ is a boundary point where the unit ball touches $K$, the halfspace $\{v:\langle v, x\rangle \le 1\}$ contains $K$. Since $K$ is symmetric, $-v \in K$ too, and thus $|\langle v,x\rangle| \leq 1$. Now take any $v \in K$. Using the isotropic distribution $\mu$ from the contact isotropy lemma we may write, $\|v\|_2^2 = \langle v,\mathbb{I}v\rangle = n \mathbb E_{x \sim \mu} \langle v,x\rangle^2 \leq n$.
And so we are done, and the whole game is proving that this isotropic probability distribution exists.
Proof of contact isotropy point lemma
We want to prove that $\frac{1}{n}\mathbb{I}$ is a convex combination of the rank-one matrices $\{xx^\top:x\in S_{\mathrm{contact}}\}$. Checking whether such a convex combination exists can be expressed as an LP:
$$ \text{Does there exist } p\text{ such that }\qquad \frac{1}{n}\mathbb{I} := \sum_i p_i x_i x_i^\top, \qquad p_i \geq 0, \qquad \sum_i p_i = 1, \qquad x_i \in S_{\mathrm{contact}}? $$If yes, we are done.
So suppose the answer is no. We will show that this lets us find a larger volume ellipsoid inside $K$, contradicting maximality of the unit ball.
Let $ \mathcal C := \operatorname{conv}\{xx^\top : x \in S_{\mathrm{contact}}\} $ inside the vector space of symmetric $n \times n$ matrices. Assume for contradiction that $ \frac{1}{n}\mathbb{I} \notin \mathcal C. $ Then there is a hyperplane separating $\frac{1}{n}\mathbb{I}$ from $\mathcal{C}$. This hyperplane can concretely be described by a symmetric matrix $H$ and a real number $a$ such that:
- $\left\langle H,\frac{1}{n}\mathbb{I} \right\rangle = \mathrm{tr}(H) < -\delta$, but
- $\langle H,xx^\top\rangle > \delta$ for every $x \in S_{\mathrm{contact}}$.
Now we use this $H$ to slightly distort the unit ball.
The perturbation
For tiny $t>0$, define $ T_t = \mathbb{I}-tH. $ Consider the ellipsoid $ \mathcal E_t = T_t B_2^n. $ Once we prove:
- $\mathcal E_t$ has larger volume than $B_2^n$.
- $\mathcal E_t \subseteq K$,
we obtain a contradiction to the assumption that $B_2^n$ is the maximum volume ellipsoid contained in $K$, which then completes the proof.
Why does the volume increase?
The volume of $T_tB_2^n$ is
$$ \operatorname{vol}(T_tB_2^n) := |\det(T_t)|\operatorname{vol}(B_2^n). $$For small $t$, $ \det(I-tH) := 1-t\operatorname{Tr}(H)+O(t^2), $ which is $>1$ for all sufficiently tiny $t > 0$. So the ellipsoid $\mathcal E_t$ has larger volume than $B_2^n$.
Why does the perturbed ellipsoid stay inside $K$?
Recall the norm $\|\cdot\|_K$ associated to a symmetric convex body $K$: $ \|y\|_K := \inf\{\lambda>0 : y \in \lambda K\}. $ Since $ B_2^n \subseteq K, $ we have $ \|y\|_K \leq \|y\|_2. $ It suffices to show that for every $v \in \mathbb{S}^{n-1}$, $\|T_t v\|_K \leq 1.$ We split the sphere into two regions: points close to contact points, and points far from contact points.
Close to contact. At every contact point $x$, we know $ x^\top H x \geq \delta. $ By continuity, there is some $\varepsilon>0$ such that whenever $v \in \mathbb{S}^{n-1}$ is within Euclidean distance $\varepsilon$ of $S_{\mathrm{contact}}$, we have $ v^\top H v \geq \frac{\delta}{2}. $ For such $v$, $ \|T_t v\|_K^2 \leq \|T_t v\|_2^2 := \|(\mathbb{I}-tH)v\|_2^2 \ge 1 - t\delta + t^2 \|H\|_{\mathrm{op}}^2. $ For sufficiently tiny $t>0$, this is strictly less than $1$. So near the contact points, the perturbation actually moves the sphere inward.
Far from contact. Now consider the points on the sphere that are at least $\varepsilon$ away from every contact point. This set is compact, and it contains no contact points. Therefore, for every such $v$, $\|v\|_K < 1.$ By compactness, there is some $\eta>0$ such that for all these $v$, $\|v\|_K \leq 1-\eta.$ By the triangle inequality: $\|T_t v\|_K = \|v-tHv\|_K \leq \|v\|_K + t\|Hv\|_K.$
The quantity $\|Hv\|_K$ is uniformly bounded on the sphere, say by $M$. Thus $ \|T_t v\|_K \leq 1-\eta+tM, $ which is less than $1$ for sufficiently tiny $t$. Putting the two regions together, for all $v \in \mathbb{S}^{n-1}$, $T_tB_2^n \subseteq K.$
Remarks
A few random remarks.
It’s natural to wonder whether John’s theorem is tight. It is in fact tight for the $K$ being the $\ell_1$ ball, and the $\ell_{\infty}$ ball.
The above proof actually tells us that we simply want to pick the maximum volume ellipsoid, which makes it amenable to finding with an efficient algorithm. Due to the fact that an ellipsoid $\mathcal{E}$ can be expressed via a positive semidefinite matrix $M$, the volume of $M$ scales as $\det(M)$, and $\log\det(M)$ is actually a concave function, under access the right oracles for $K$, it is possible to find the matrix $M$ corresponding to the optimal ellipsoid.