Sums of squares and the Jacobi theta function

Which numbers can be written as a sum of two squares (of integers)? To narrow the problem down a little bit, which prime numbers can be written as a sum of two squares? Notice that $2$ can be written as the sum of two squares, $2=1^{2}+1^{2}$. Meanwhile $3$ cannot be written as the sum of two squares: since squares are nonnegative, we only need to look at the squares less than $3$, namely $0$ and $1$, and no two of them add up to $3$. Going to $5$, we can see that it can once again be written as the sum of two squares, $5=1^{2}+2^{2}$.

This problem was solved by Fermat, and the answer is that aside from $2$, which we have already resolved, it is precisely the prime numbers which are $1$ mod $4$ which can be written as the sum of two squares. More generally, a positive integer which is not necessarily prime can be written as the sum of two squares if and only if each prime which is $3$ mod $4$ appears in its prime factorization with an even exponent. Fermat used the method of infinite descent to solve this problem. However, there are many other proofs, and this problem and its many variants have motivated many developments in mathematics. In this post, we will discuss a fascinating method due to Jacobi, which involves the theory of modular forms (see also Modular Forms).

Before we start discussing the approach of Jacobi, let us state another variant of the problem. Which numbers can be written as the sum of four squares? This question was settled by Lagrange, and it turns out that every positive integer can be written as the sum of four squares! The approach of Jacobi that we will discuss solves this problem as well.

Furthermore, the method of Jacobi not only tells us whether a number is a sum of two squares or four squares, but it actually tells us how many ways such a number can be written in that form. For example, we have mentioned earlier that $5$ can be written as $1^{2}+2^{2}$. This is one way to write it as a sum of two squares. If we count different signs and different orderings of the two integers separately, there are actually eight such ways:

$\displaystyle 1^{2}+2^{2}$

$\displaystyle (-1)^{2}+2^{2}$

$\displaystyle 1^{2}+(-2)^{2}$

$\displaystyle (-1)^{2}+(-2)^{2}$

$\displaystyle 2^{2}+1^{2}$

$\displaystyle (-2)^{2}+1^{2}$

$\displaystyle 2^{2}+(-1)^{2}$

$\displaystyle (-2)^{2}+(-1)^{2}$

In fact, this is what Jacobi’s approach actually does: it gives us the number of ways $r_{k}(n)$ to write a number $n$ as the sum of $k$ squares, counting signs and orderings separately (for the classical problems we mentioned, $k=2$ or $k=4$). If $r_{k}(n)$ is nonzero, then we know that $n$ can be written as a sum of $k$ squares.
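To make the definition of $r_{k}(n)$ concrete, here is a minimal brute-force sketch in Python. The function name `count_representations` is our own choice for illustration, and the approach (trying every tuple of integers in range) is only practical for small $n$ and $k$; it is not Jacobi's method.

```python
from itertools import product
from math import isqrt


def count_representations(n: int, k: int) -> int:
    """Count ordered k-tuples of integers whose squares sum to n (brute force)."""
    bound = isqrt(n)  # every entry has absolute value at most floor(sqrt(n))
    candidates = range(-bound, bound + 1)
    return sum(
        1
        for xs in product(candidates, repeat=k)
        if sum(x * x for x in xs) == n
    )


# r_2(5) counts the eight representations of 5 listed above.
print(count_representations(5, 2))
# r_2(3) is zero, since 3 is not a sum of two squares.
print(count_representations(3, 2))
```

Signs and orderings are counted separately here, which matches the convention used for $r_{k}(n)$ in this post.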

Let us now discuss this method of Jacobi. We will streamline the discussion a bit using modern language that was probably not available to Jacobi. It hinges on a very special function $\theta(z)$ on the upper half-plane called the theta function, defined as follows:

$\displaystyle \theta(z)=\sum_{n=-\infty}^{\infty}e^{2\pi i n^{2}z}=\sum_{n=-\infty}^{\infty}q^{n^{2}}$

Here in the second equality we have just chosen to adopt the traditional notation $q=e^{2\pi i z}$. Since $n$ and $-n$ have the same square, we can group the terms in pairs and also write the theta function as

$\displaystyle \theta(z)=1+\sum_{n=1}^{\infty}2q^{n^{2}}=1+2q+2q^{4}+2q^{9}+\ldots$

The square of the theta function is a modular form of weight $1$, level $\Gamma_{0}(4)$, and character $\chi_{-4}$ (see also Modular Forms). This means that $(\theta(z))^{2}$ is a holomorphic function on the upper half-plane, bounded as the imaginary part of $z$ goes to infinity, and satisfying the transformation law

$\displaystyle \left(\theta\left(\frac{az+b}{cz+d}\right)\right)^{2}=\chi_{-4}(a)(cz+d)(\theta(z))^{2}$

where $\begin{pmatrix}a&b\\c&d\end{pmatrix}$ is an element of $\Gamma_{0}(4)$, the group of $2\times 2$ integer matrices with determinant $1$ and which become upper triangular when the entries are reduced mod $4$ (i.e. $c$ is divisible by $4$), and $\chi_{-4}$ is a function which takes any integer $n$ and outputs $1$ if $n$ is $1$ mod $4$, outputs $-1$ if $n$ is $3$ mod $4$, and outputs $0$ if $n$ is even ($\chi_{-4}$ is an example of a Dirichlet character).
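The transformation law can be sanity-checked numerically. The sketch below is an illustration under our own assumptions: the helper names `theta`, `chi_minus_4`, and `transformation_gap` are hypothetical, the series is truncated at a fixed number of terms, and the sample matrices and sample point $z$ are arbitrary choices. A small gap is evidence for the identity, not a proof.

```python
import cmath


def theta(z: complex, terms: int = 60) -> complex:
    """Truncated theta series 1 + 2 * sum_{n=1}^{terms} exp(2*pi*i*n^2*z)."""
    return 1 + 2 * sum(
        cmath.exp(2j * cmath.pi * n * n * z) for n in range(1, terms + 1)
    )


def chi_minus_4(n: int) -> int:
    """The Dirichlet character chi_{-4}: 1, -1, or 0 according to n mod 4."""
    return {1: 1, 3: -1}.get(n % 4, 0)


def transformation_gap(a: int, b: int, c: int, d: int, z: complex) -> float:
    """Absolute difference between the two sides of the transformation law."""
    # The matrix must lie in Gamma_0(4): determinant 1 and c divisible by 4.
    assert a * d - b * c == 1 and c % 4 == 0
    lhs = theta((a * z + b) / (c * z + d)) ** 2
    rhs = chi_minus_4(a) * (c * z + d) * theta(z) ** 2
    return abs(lhs - rhs)


z = 0.3 + 0.8j  # an arbitrary point in the upper half-plane
print(transformation_gap(1, 0, 4, 1, z))  # a is 1 mod 4, character value 1
print(transformation_gap(3, 2, 4, 3, z))  # a is 3 mod 4, character value -1
```

Both printed gaps should be at the level of floating-point rounding error. The second matrix is the more interesting test, since there the character contributes a sign.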

(In the literature the theta function $\theta(z)$ itself is referred to as a “modular form of weight $1/2$”, but we will avoid this terminology in this post to keep things less confusing.)

Now here is what relates the square of the Jacobi theta function to sums of two squares. We can write

$\displaystyle (\theta(z))^{2}=\left(\sum_{a=-\infty}^{\infty}q^{a^{2}}\right)\left(\sum_{b=-\infty}^{\infty}q^{b^{2}}\right)$

Expanding the square of the theta function as a Fourier series with coefficients $a_{n}$ (again writing $q=e^{2 \pi i z}$), the above equation becomes

$\displaystyle (\theta(z))^{2}=\sum_{n=0}^{\infty}a_{n}q^{n}=\left(\sum_{a=-\infty}^{\infty}q^{a^{2}}\right)\left(\sum_{b=-\infty}^{\infty}q^{b^{2}}\right)$

Now the $n$-th term of this Fourier expansion will receive a contribution from each product of $q^{a^{2}}$ and $q^{b^{2}}$ such that $n=a^{2}+b^{2}$. In other words, the coefficient $a_{n}$ counts how many pairs $(a,b)$ there are such that $n=a^{2}+b^{2}$ – it counts the number of ways $n$ can be written as a sum of two squares! Therefore, the $n$-th Fourier coefficient of $(\theta(z))^{2}$ is just the function $r_{2}(n)$ we mentioned earlier that tells us how many ways there are to write $n$ as a sum of two squares.

More generally, the same argument can be applied to other powers of the theta function. In particular, we can also look at $(\theta(z))^{4}$ and this will tell us about sums of four squares. More precisely, the $n$-th Fourier coefficient of $(\theta(z))^{4}$ is the function $r_{4}(n)$ that tells us how many ways there are to write $n$ as a sum of four squares.
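The coefficient argument can be carried out mechanically by multiplying truncated power series. The following Python sketch (the function name `theta_power_coefficients` is hypothetical) computes the coefficients of $q^{0},\ldots,q^{N}$ in $(\theta(z))^{k}$ using only the expansion $\theta(z)=1+2q+2q^{4}+2q^{9}+\ldots$, so the entries of the output are the values $r_{k}(0),\ldots,r_{k}(N)$.

```python
def theta_power_coefficients(k: int, limit: int) -> list:
    """Coefficients of q^0..q^limit in theta(z)^k, via series multiplication."""
    # Truncated theta series: 1 + 2q + 2q^4 + 2q^9 + ...
    theta = [0] * (limit + 1)
    theta[0] = 1
    n = 1
    while n * n <= limit:
        theta[n * n] = 2
        n += 1

    result = [1] + [0] * limit  # start from the constant series 1
    for _ in range(k):
        product = [0] * (limit + 1)
        for i, x in enumerate(result):
            if x == 0:
                continue
            for j, y in enumerate(theta):
                if y != 0 and i + j <= limit:
                    product[i + j] += x * y  # q^i * q^j contributes to q^(i+j)
        result = product
    return result


print(theta_power_coefficients(2, 10))  # r_2(0), ..., r_2(10)
print(theta_power_coefficients(4, 10))  # r_4(0), ..., r_4(10)
```

In the first list, the entry at index $5$ is $8$ and the entry at index $3$ is $0$, in agreement with the examples at the start of the post.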

Now we will use results from the theory of modular forms to give us proofs of the theorems of Fermat and Lagrange that we have mentioned earlier.

Modular forms of a given weight, level, and character form a complex vector space, and the dimension of this vector space can be computed via dimension formulas. In particular, the vector space of modular forms of weight $1$, level $\Gamma_{0}(4)$, and character $\chi_{-4}$ has dimension $1$, which means its elements are all just complex multiples of each other.

There is another modular form of weight $1$ and level $\Gamma_{0}(4)$ which is well-studied, called the Eisenstein series of weight $1$, level $\Gamma_{0}(4)$, and character $\chi_{-4}$. It is defined as follows:

$\displaystyle G_{1,\chi_{-4}}(z)=\frac{1}{4}+\sum_{n=1}^{\infty}\left(\sum_{d\vert n}\chi_{-4}(d)\right)q^{n}$

From the fact that modular forms of weight $1$, level $\Gamma_{0}(4)$, and character $\chi_{-4}$ form a vector space of dimension $1$, we know that the square of the theta function and this Eisenstein series are just multiples of each other. In fact, comparing the constant terms ($1$ for $(\theta(z))^{2}$ and $\frac{1}{4}$ for the Eisenstein series), we can see that

$(\theta(z))^{2}=4G_{1,\chi_{-4}}(z)$

Therefore, comparing the Fourier expansions, we see that $r_{2}(n)=4(\sum_{d\vert n}\chi_{-4}(d))$ for every $n\geq 1$. Specializing to the case where $n$ is an odd prime, the only divisors of $n$ are $1$ and $n$, and we have $r_{2}(n)=4(1+\chi_{-4}(n))$, which is $8$ when $n$ is $1$ mod $4$, and $0$ when $n$ is $3$ mod $4$, as follows from the definition of $\chi_{-4}$. This tells us that an odd prime $n$ is a sum of two squares precisely when $n$ is $1$ mod $4$. (For $n=2$ the formula gives $r_{2}(2)=4(1+0)=4$, matching $2=(\pm 1)^{2}+(\pm 1)^{2}$.) With a little more effort, one can see that the formula $r_{2}(n)=4(\sum_{d\vert n}\chi_{-4}(d))$ also tells us that more generally $n$ (even when it is not prime) is a sum of two squares precisely when the prime divisors of $n$ which are $3$ mod $4$ have an even power in its prime factorization.
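The formula for $r_{2}(n)$ is easy to test against a direct count. The sketch below is an illustration with hypothetical function names (`chi_minus_4`, `r2_formula`, `r2_brute`); it compares the divisor sum with a brute-force enumeration of pairs for small $n$.

```python
from math import isqrt


def chi_minus_4(n: int) -> int:
    """The Dirichlet character chi_{-4}: 1, -1, or 0 according to n mod 4."""
    return {1: 1, 3: -1}.get(n % 4, 0)


def r2_formula(n: int) -> int:
    """r_2(n) from the divisor sum: 4 * sum of chi_{-4}(d) over d dividing n."""
    return 4 * sum(chi_minus_4(d) for d in range(1, n + 1) if n % d == 0)


def r2_brute(n: int) -> int:
    """r_2(n) by enumerating all integer pairs (a, b) with a^2 + b^2 = n."""
    bound = isqrt(n)
    return sum(
        1
        for a in range(-bound, bound + 1)
        for b in range(-bound, bound + 1)
        if a * a + b * b == n
    )


for n in (2, 3, 5, 7, 9, 13, 21, 25):
    print(n, r2_formula(n), r2_brute(n))
```

The cases $n=9$ and $n=21$ illustrate the general criterion: in $9=3^{2}$ the prime $3$ appears with an even exponent and $r_{2}(9)=4$, while in $21=3\cdot 7$ the primes $3$ and $7$ appear with odd exponents and $r_{2}(21)=0$.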

Let us now look at $(\theta(z))^{4}$ and the problem of writing a number as the sum of four squares. Now $(\theta(z))^{4}$ is actually a modular form of weight $2$ and level $\Gamma_{0}(4)$. This time the vector space of modular forms of weight $2$ and level $\Gamma_{0}(4)$ is a vector space of dimension $2$. So it is not quite as easy as the case of $(\theta(z))^{2}$ and sums of two squares, but we can still find two linearly independent modular forms of weight $2$ and level $\Gamma_{0}(4)$ which will form a convenient basis for us to express $(\theta(z))^{4}$ in terms of.

These modular forms are given by

$\displaystyle G_{2}(z)-2G_{2}(2z)$

and

$\displaystyle G_{2}(2z)-2G_{2}(4z)$

where

$\displaystyle G_{2}(z)=-\frac{1}{24}+\sum_{n=1}^{\infty}\sigma_{1}(n)q^{n}$

is the Eisenstein series of weight $2$ and level $\Gamma=\mathrm{SL}_{2}(\mathbb{Z})$. Here the symbol $\sigma_{1}(n)$ denotes the sum of the positive divisors of $n$; note also that we are using a different normalization than in Modular Forms for convenience. (Strictly speaking, $G_{2}(z)$ by itself is only a quasi-modular form, but the two combinations above are genuine modular forms.) It turns out that

$\displaystyle (\theta(z))^{4}=8(G_{2}(z)-2G_{2}(2z))+16(G_{2}(2z)-2G_{2}(4z))$
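As a quick worked simplification of this identity (a side computation added for illustration), the two terms involving $G_{2}(2z)$ cancel, leaving

$\displaystyle 8(G_{2}(z)-2G_{2}(2z))+16(G_{2}(2z)-2G_{2}(4z))=8G_{2}(z)-32G_{2}(4z)$

Since $G_{2}(4z)=-\frac{1}{24}+\sum_{m=1}^{\infty}\sigma_{1}(m)q^{4m}$, the constant term of the right-hand side is $-\frac{8}{24}+\frac{32}{24}=1$, and for $n\geq 1$ the coefficient of $q^{n}$ is

$\displaystyle 8\sigma_{1}(n)-32\sigma_{1}(n/4)=8\left(\sigma_{1}(n)-4\sigma_{1}(n/4)\right)$

with the convention that $\sigma_{1}(n/4)=0$ when $n$ is not divisible by $4$. The divisors of $n$ that are divisible by $4$ are exactly the numbers $4d$ with $d$ a divisor of $n/4$, so $4\sigma_{1}(n/4)$ is the sum of those divisors.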

Similar to the earlier case for the sum of two squares, one can now expand both sides in a Fourier expansion and compare Fourier coefficients. It will turn out that $r_{4}(n)$ is equal to $8$ times the sum of the positive divisors of $n$ which are not divisible by $4$. Since $1$ is always such a divisor, we have $r_{4}(n)\geq 8$, and this tells us that any positive integer can always be written as the sum of four squares.
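The four-square count can be checked against a direct enumeration in the same spirit. In this Python sketch the function names `r4_formula` and `r4_brute` are hypothetical, and the brute-force count is only feasible for small $n$.

```python
from itertools import product
from math import isqrt


def r4_formula(n: int) -> int:
    """8 times the sum of the positive divisors of n not divisible by 4."""
    return 8 * sum(d for d in range(1, n + 1) if n % d == 0 and d % 4 != 0)


def r4_brute(n: int) -> int:
    """r_4(n) by enumerating all integer 4-tuples whose squares sum to n."""
    bound = isqrt(n)
    values = range(-bound, bound + 1)
    return sum(
        1
        for xs in product(values, repeat=4)
        if sum(x * x for x in xs) == n
    )


for n in range(1, 13):
    print(n, r4_formula(n), r4_brute(n))
```

For example, the divisors of $4$ not divisible by $4$ are $1$ and $2$, so the formula gives $r_{4}(4)=8\cdot 3=24$.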

We have seen, therefore, that the theory of modular forms can help us understand very classical problems in number theory. The theta function is in fact worthy of an entire theory of its own: it is connected to many things in mathematics, from representation theory to abelian varieties. We will discuss more of these aspects in future posts.

References:

Theta function on Wikipedia

Jacobi’s four-square theorem on Wikipedia

Sum of squares function on Wikipedia

Elliptic modular forms and their applications by Don Zagier