More on Galois Deformation Rings

In Galois Deformation Rings we introduced the concept of a Galois deformation ring, and how it is used to prove “R=T” theorems. In this post we will look at a very simple example to help make things more concrete. Then we will explore more about the structure of Galois deformation rings. In particular, we want to relate the tangent space of such a Galois deformation ring to the Selmer group in Galois cohomology (which also shows up in a lot of contexts all over arithmetic geometry and number theory).

Let F be a finite extension of \mathbb{Q}, and let k be some finite field, with ring of Witt vectors W(k) (for example if k=\mathbb{F}_{p} then W(k)=\mathbb{Z}_{p}). Let our residual representation \overline{\rho}:\mathrm{Gal}(\overline{F}/F)\to GL_{1}(k) be the trivial representation, i.e. the group acts as the identity. A lift will be a Galois representation \rho:\mathrm{Gal}(\overline{F}/F)\to GL_{1}(A), where A is a complete Noetherian local algebra over W(k) with residue field k, whose reduction modulo the maximal ideal of A is \overline{\rho}. Then our Galois deformation ring is given by the completed group ring

\displaystyle R _{\overline{\rho}}=W(k)[[\mathrm{Gal}(\overline{F}/F)^{\mathrm{ab,p}}]]

where \mathrm{Gal}(\overline{F}/F)^{\mathrm{ab,p}} means the pro-p completion of the abelianization of the Galois group \mathrm{Gal}(\overline{F}/F). Using local class field theory, we can express this even more explicitly as

\displaystyle R_{\overline{\rho}}=W(k)[\mu_{p^{\infty}}(F)][[X_{1},\ldots,X_{[F:\mathbb{Q}]}]]

Let us now consider a useful fact about the tangent space (see also Tangent Spaces in Algebraic Geometry) of such a deformation ring. Let us first consider the framed deformation ring R_{\overline{\rho}}^{\Box}. It is local, and has a unique maximal ideal \mathfrak{m}. Its tangent space (over k) is defined to be the dual of \mathfrak{m}/(\mathfrak{m}^{2},p), but this can also be expressed as the set of its dual number-valued points, i.e. \mathrm{Hom}(R_{\overline{\rho}}^{\Box},k[\varepsilon]) where \varepsilon^{2}=0, which by the definition of the framed deformation functor is also D_{\overline{\rho}}^{\Box}(k[\varepsilon]). Any such deformation must be of the form

\displaystyle \rho(\sigma)=(1+\varepsilon c(\sigma))\overline{\rho}(\sigma)

where c(\sigma) is some n\times n matrix with coefficients in k (here we allow a residual representation of any dimension n). If \sigma and \tau are elements of \mathrm{Gal}(\overline{F}/F), then substituting the above form of \rho into the equation \rho(\sigma\tau)=\rho(\sigma)\rho(\tau) gives

\displaystyle  (1+\varepsilon c(\sigma\tau))\overline{\rho}(\sigma\tau) = (1+\varepsilon c(\sigma))\overline{\rho}(\sigma) (1+\varepsilon c(\tau))\overline{\rho}(\tau)

from which we can see that

\displaystyle c(\sigma\tau)\overline{\rho}(\sigma\tau) = c(\sigma)\overline{\rho}(\sigma)\overline{\rho}(\tau)+\overline{\rho}(\sigma)c(\tau)\overline{\rho}(\tau)

and, multiplying by \overline{\rho}(\sigma\tau)^{-1}= \overline{\rho}(\tau)^{-1}\overline{\rho}(\sigma)^{-1} on the right,

\displaystyle c(\sigma\tau)=c(\sigma)+\overline{\rho}(\sigma)c(\tau)\overline{\rho}(\sigma)^{-1}

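In the language of Galois cohomology, we say that c is a 1-cocycle, if we take the n\times n matrices to be a Galois module coming from the “Lie algebra” of GL_{n}(k), on which \sigma acts by conjugation, sending a matrix X to \overline{\rho}(\sigma)X\overline{\rho}(\sigma)^{-1}. We call this Galois module \mathrm{Ad}\overline{\rho}.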

Now consider two different lifts (framed deformations) \rho_{1} and \rho_{2} which give rise to the same deformation of \overline{\rho}. Then there exists some n\times n matrix X such that

\displaystyle \rho_{1}(\sigma)=(1+\varepsilon X)\rho_{2}(\sigma)(1-\varepsilon X)

Plugging in \rho_{1}=(1+\varepsilon c_{1})\overline{\rho} and \rho_{2}=(1+\varepsilon c_{2})\overline{\rho} we obtain

\displaystyle  (1+\varepsilon c_{1})\overline{\rho}=(1+\varepsilon X) (1+\varepsilon c_{2})\overline{\rho}(1-\varepsilon X)

which will imply that

\displaystyle  c_{1}(\sigma)=c_{2}(\sigma)+X-\overline{\rho}(\sigma)X\overline{\rho}(\sigma)^{-1}

In the language of Galois cohomology (see also Etale Cohomology of Fields and Galois Cohomology) we say that c_{1} and c_{2} differ by a coboundary. This means that the tangent space of the Galois deformation ring is given by the first Galois cohomology with coefficients in \mathrm{Ad}\overline{\rho}:

\displaystyle D_{\overline{\rho}}(k[\varepsilon])\simeq H^{1}(\mathrm{Gal}(\overline{F}/F),\mathrm{Ad}\overline{\rho})
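The cocycle and coboundary computations above can be checked numerically in a toy setting. The following is only a sketch under stated assumptions: the Galois group is replaced by the finite group S_3, the residual representation by its permutation representation over k=\mathbb{F}_{5}, and the matrix X is arbitrary. None of these choices come from the discussion above; they are hypothetical stand-ins chosen so that everything is finite.

```python
from itertools import permutations

P = 5  # assumed residue characteristic, so k = F_5 (illustrative choice)

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) % P for j in range(n)]
            for i in range(n)]

def mat_add(a, b):
    return [[(x + y) % P for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_sub(a, b):
    return [[(x - y) % P for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def compose(s, t):
    """The permutation s after t (apply t first)."""
    return tuple(s[t[j]] for j in range(len(t)))

def inverse(s):
    inv = [0] * len(s)
    for j, sj in enumerate(s):
        inv[sj] = j
    return tuple(inv)

def rho_bar(s):
    """Permutation matrix sending e_j to e_{s[j]}; multiplicative in s."""
    n = len(s)
    return [[1 if s[j] == i else 0 for j in range(n)] for i in range(n)]

def conj(s, m):
    """Action of s on Ad(rho_bar): m -> rho_bar(s) m rho_bar(s)^{-1}."""
    return mat_mul(mat_mul(rho_bar(s), m), rho_bar(inverse(s)))

GROUP = list(permutations(range(3)))   # stand-in for the Galois group
X = [[1, 2, 0], [0, 3, 4], [2, 0, 1]]  # arbitrary matrix over F_5

def coboundary(s):
    """c(s) = X - rho_bar(s) X rho_bar(s)^{-1}."""
    return mat_sub(X, conj(s, X))

def is_cocycle(c):
    """Check c(st) = c(s) + rho_bar(s) c(t) rho_bar(s)^{-1} for all s, t."""
    return all(c(compose(s, t)) == mat_add(c(s), conj(s, c(t)))
               for s in GROUP for t in GROUP)

def lift(c, s):
    """rho(s) = (1 + eps c(s)) rho_bar(s), stored as (constant part, eps part)."""
    return rho_bar(s), mat_mul(c(s), rho_bar(s))

def dual_mul(x, y):
    """(a + eps b)(a' + eps b') with eps^2 = 0."""
    return mat_mul(x[0], y[0]), mat_add(mat_mul(x[0], y[1]), mat_mul(x[1], y[0]))

def is_homomorphism(c):
    return all(lift(c, compose(s, t)) == dual_mul(lift(c, s), lift(c, t))
               for s in GROUP for t in GROUP)
```

In this toy model, `is_cocycle(coboundary)` and `is_homomorphism(coboundary)` both hold, reflecting that a coboundary is a cocycle and that a cocycle gives a genuine lift to the dual numbers, while a constant nonzero function such as `lambda s: X` fails the cocycle condition.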

More generally, when our Galois deformation ring is subject to conditions, its tangent space will be given by a subgroup of the first Galois cohomology known as the Selmer group (note that the Selmer group shows up in many places in arithmetic geometry and number theory, for instance, in the proof of the Mordell-Weil theorem where the Galois module used comes from the torsion points of an elliptic curve – in this post we are considering the case where the Galois module is \mathrm{Ad}\overline{\rho}, as stated earlier). The advantage of expressing the tangent space of the Galois deformation ring in the language of Galois cohomology is that in Galois cohomology there are certain tools, such as Tate duality and the Euler characteristic formula, that we can use to perform computations.

Finally to end this post we remark that under certain conditions (namely that for every open subgroup H of \mathrm{Gal}(\overline{F}/F) the space of continuous homomorphisms from H to \mathbb{F}_{p} has finite dimension) this tangent space is going to be a finite-dimensional vector space over k. Then the Galois deformation ring has the following form

\displaystyle R_{\overline{\rho}}=W(k)[[x_{1},\ldots,x_{g}]]/(f_{1},\ldots,f_{r})

i.e. it is a quotient of a power series ring over W(k) in g variables, where the number g is given by the dimension of H^{1}(\mathrm{Gal}(\overline{F}/F),\mathrm{Ad}\overline{\rho}) as a k-vector space, while the number of relations r is given by the dimension of H^{2}(\mathrm{Gal}(\overline{F}/F),\mathrm{Ad}\overline{\rho}) as a k-vector space.

Knowing the structure of Galois deformation rings is going to be important in proving R=T theorems, since such proofs often reduce to commutative algebra involving these rings. More details will be discussed in future posts on this blog.

References:

Group cohomology on Wikipedia

Galois cohomology on Wikipedia

Selmer group on Wikipedia

Tate duality on Wikipedia

Modularity Lifting Theorems by Toby Gee

Modularity Lifting (Course Notes) by Patrick Allen

Motives and L-functions by Frank Calegari

Beyond the Taylor-Wiles Method by Jack Thorne

Galois Deformation Rings

In Galois Representations we talked about obtaining continuous Galois representations for example from the \ell-adic etale cohomology of algebraic varieties, and hinted at being able to obtain such Galois representations from modular forms as well. While we postpone the discussion of how to obtain such a Galois representation to some future blog post (hopefully), we now mention the very important topic of modularity – which investigates, given some Galois representation, whether it comes from a modular form, and furthermore whether it provides some other information about the modular form that it comes from.

The topic of modularity is composed of two parts. The first is residual modularity – where we are given a Galois representation over a finite field (we call such a Galois representation a residual representation, in reference to the finite field being the residue field of some other ring) and figure out whether it comes from a modular form (in which case we also say that it is modular). The second part is modularity lifting, where, given a residual representation we know to be modular, we figure out whether a Galois representation over \mathbb{Q}_{\ell} that “lifts” it is also modular.

In this post, we focus only on one small ingredient of the approach to proving modularity lifting. Proofs of modularity lifting rely on “R=T” theorems, where R refers to a Galois deformation ring and T comes from (a localization of) a Hecke algebra (see also Hecke Operators). The small ingredient we will focus on in this post is the R, the Galois deformation ring.

A “deformation” in our context is an equivalence class of “lifts” and before we give the precise definitions we give a little bit of intuition about why we are interested in lifts. Roughly, in our context, a lift of some field \overline{R} is a local ring R such that \overline{R} is the residue field of R, i.e. \overline{R}=R/\mathfrak{m} where \mathfrak{m} is the unique maximal ideal of R (since R is a local ring by definition it has a unique maximal ideal).

So now for the intuition. Consider the real numbers \mathbb{R}. The “dual numbers” are defined to be \mathbb{R}[x]/(x^{2}). Its elements are of the form a+bx where a and b are real numbers. We can consider x here to be an “infinitesimal element”. So we may think of an element of the dual numbers as a number, given by a, but with a “tangent vector” given by the number b. Another way to think about it is that it is at “position a”, but it also has a “velocity b”. It’s like numbers, but with a little “wiggle”. Now that we know about the dual numbers \mathbb{R}[x]/(x^{2}), what about elements of \mathbb{R}[x]/(x^{3})? We may think of such an element, which is of the form a+bx+cx^{2}, as a position “a”, with “velocity b”, and “acceleration c”, a kind of “higher wiggle”.

If we continue including higher and higher derivatives, then we have something whose elements are formal power series a+bx+cx^2+dx^3+\ldots. This is the ring \mathbb{R}[[x]], which is the inverse limit of the rings \mathbb{R}[x]/(x^{n}). Now the ring \mathbb{R}[[x]] is a local ring with maximal ideal (x), and modding out by this maximal ideal gives \mathbb{R}. So this power series ring is a lift of \mathbb{R}, a kind of numbers with “higher wiggles”. This is what the term “deformation” is supposed to bring to mind.
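To make the “wiggle” picture concrete, here is a minimal sketch of arithmetic in the truncated rings \mathbb{R}[x]/(x^{n}). The class name `Jet` and the test function `cube` are hypothetical names introduced only for this illustration. Evaluating a polynomial at a+x shows its value in the constant term, its derivative in the coefficient of x, and half its second derivative in the coefficient of x^{2}.

```python
class Jet:
    """An element of R[x]/(x^n), stored as its list of n coefficients."""

    def __init__(self, coeffs):
        self.c = list(coeffs)

    def __add__(self, other):
        return Jet(a + b for a, b in zip(self.c, other.c))

    def __mul__(self, other):
        n = len(self.c)
        out = [0] * n
        for i, a in enumerate(self.c):
            for j, b in enumerate(other.c):
                if i + j < n:          # x^n and higher powers are zero
                    out[i + j] += a * b
        return Jet(out)

    def residue(self):
        """Image in R after modding out by the maximal ideal (x)."""
        return self.c[0]


def cube(t):
    return t * t * t


dual = cube(Jet([2, 1]))        # (2 + x)^3 in R[x]/(x^2)
second = cube(Jet([2, 1, 0]))   # (2 + x)^3 in R[x]/(x^3)
```

Here `dual.c` is `[8, 12]` (the value 2^3 and the derivative 3\cdot 2^{2}), and `second.c` is `[8, 12, 6]`, where 6 is half of the second derivative at 2. Modding out by x with `residue()` recovers the ordinary number 8.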

We now give more precise definitions. Let F be a finite extension of \mathbb{Q}, and let k be a finite field. A Galois representation \overline{\rho}:\text{Gal}(\overline{F}/F)\to \text{GL}_{n}(k) is also called a residual representation. Now let W(k) be the ring of Witt vectors of k; for example, if k=\mathbb{F}_{p}, then W(k)=\mathbb{Z}_{p}. A lift, or framed deformation, of the residual representation \overline{\rho} is a Galois representation \rho:\text{Gal}(\overline{F}/F)\to \text{GL}_{n}(A), where A is a complete Noetherian local W(k)-algebra with residue field k, such that modding out by the unique maximal ideal of A gives the residual representation \overline{\rho}. A deformation of \overline{\rho} is an equivalence class of lifts of \overline{\rho}, where two lifts are considered equivalent if they are conjugate by an element of the kernel of the reduction map \text{GL}_{n}(A)\to\text{GL}_{n}(k).

Consider the functor \text{Def}_{\overline{\rho}}^{\Box} from the category of complete Noetherian local W(k)-algebras to the category of sets, which assigns to a complete Noetherian local W(k)-algebra A the set of all lifts of \overline{\rho} to A. This functor happens to be representable, i.e. there is a Galois representation \rho^{\Box}:\text{Gal}(\overline{F}/F)\to \text{GL}_{n}(R_{\overline{\rho}}^{\Box}), the universal framed deformation, over some ring R_{\overline{\rho}}^{\Box} called the universal framed deformation ring, such that the lifts of \overline{\rho} to A correspond to the W(k)-algebra homomorphisms R_{\overline{\rho}}^{\Box}\to A: each lift is obtained by composing \rho^{\Box} with the map on \text{GL}_{n} induced by such a homomorphism.

We can also do the same for deformations instead of framed deformations, as long as our residual representation satisfies a condition called “Schur’s condition”.

We can also impose conditions on our deformations – for instance, we may want to consider only lifts with a certain fixed determinant. These conditions are also called deformation problems and they are important because it is conjectured that Galois representations coming from modular forms have certain properties, and we want to match up these Galois representations with modular forms.

Roughly, the way these are matched up goes in the following manner. We have said above that deformations of a certain fixed Galois representation \overline{\rho} to A, possibly with some conditions, correspond to maps R_{\overline{\rho},\mathrm{conditions}}\to A. We state that, given an isomorphism between the complex numbers and the p-adic complex numbers we can always construct a map R_{\overline{\rho}, \mathrm{conditions} }\to \mathbb{C} from the preceding map.

Now a Hecke algebra \mathbb{T} acts on Hecke eigenforms (which we want to match up with the Galois representations, to show that these Galois representations come from them), which therefore have associated systems of eigenvalues. It is known that any such system of eigenvalues comes from some Hecke eigenform.

We choose only a localization of the Hecke algebra, which we call \mathbb{T}_{\mathfrak{m}}, corresponding to only the modular forms that are expected to give rise to the Galois representations we are considering (the Eichler-Shimura theorem gives relations between the Fourier coefficients of the Hecke eigenform and the form of the characteristic polynomial of the Frobenius under the Galois representation, restricting it). On the other hand, these systems of eigenvalues correspond to maps \mathbb{T}_{\mathfrak{m}}\to \mathbb{C}.

So if we can show that R_{\overline{\rho}, \mathrm{conditions} }=\mathbb{T}_{\mathfrak{m}}, then these two sets of maps to \mathbb{C} match up, and we can show that these Galois representations come from modular forms. Showing that R_{\overline{\rho}, \mathrm{conditions} }=\mathbb{T}_{\mathfrak{m}} is itself an elaborate process that involves a fascinating strategy pioneered by Richard Taylor and Andrew Wiles known as patching. We will hopefully discuss R=T theorems, and the method of patching, on this blog in more detail in the future.

References:

Deformation on Wikipedia

Modularity Lifting Theorems by Toby Gee

Modularity Lifting (Course Notes) by Patrick Allen

Motives and L-functions by Frank Calegari

Beyond the Taylor-Wiles Method by Jack Thorne

Perturbations, Deformations, and Variations (and “Near-Misses”) in Geometry, Physics, and Number Theory by Barry Mazur

Galois Representations

The absolute Galois group \text{Gal}(\overline{\mathbb{Q}}/\mathbb{Q}) is one of the most important objects of study in mathematics. However the direct study of this group is very difficult; for instance it is an infinite group, and we know very little about it. To make it easier for us, we will often instead study representations of this group – i.e. group homomorphisms to the group \text{GL}(V) of linear transformations of some vector space V over some field F. When V has finite dimension n, \text{GL}(V) is just \text{GL}_{n}(F), the group of n\times n matrices with entries in F and nonzero determinant. Often we will also want the field F to carry a topology – this will also endow \text{GL}_{n}(F) with a topology. For instance, if F is the p-adic numbers \mathbb{Q}_{p} it has a p-adic topology (see also Valuations and Completions). Since \text{Gal}(\overline{\mathbb{Q}}/\mathbb{Q}) has its own topology, we can then talk about representations which are continuous. In this post we shall consider three examples of these continuous Galois representations.

Our first example of a Galois representation is known as the p-adic cyclotomic character. This is a one-dimensional representation over the p-adic numbers \mathbb{Q}_{p}, i.e. a group homomorphism from \text{Gal}(\overline{\mathbb{Q}}/\mathbb{Q}) to \text{GL}_{1}(\mathbb{Q}_{p}), which also happens to just be the multiplicative group \mathbb{Q}_{p}^{\times}. Let us explain how to obtain this Galois representation.

Consider a primitive p^{n}-th root of unity \zeta_{p^{n}}. Any element \sigma of \text{Gal}(\overline{\mathbb{Q}}/\mathbb{Q}) acts on \zeta_{p^{n}} and sends it to another primitive p^{n}-th root of unity, which amounts to raising it to some integer power between 1 and p^{n}-1 that is not divisible by p, i.e. an element of (\mathbb{Z}/p^{n}\mathbb{Z})^{\times}. These powers are compatible as n varies, so we can define the p-adic cyclotomic character \chi to be the map from \text{Gal}(\overline{\mathbb{Q}}/\mathbb{Q}) to \mathbb{Z}_{p}^{\times} which sends the element \sigma to the element of \mathbb{Z}_{p}^{\times} which, after modding out by p^{n}, is precisely the integer power to which \sigma raises \zeta_{p^{n}}, for every n.
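The compatibility between different levels n can be illustrated with a small computation. This is only a sketch: it models the automorphism that raises \zeta_{p^{n}} to the power a by arithmetic on exponents, and all function names here are hypothetical. Since \zeta_{p^{n-1}} can be taken to be \zeta_{p^{n}}^{p}, the same automorphism must raise \zeta_{p^{n-1}} to the power a mod p^{n-1}.

```python
from math import gcd

def units(p, n):
    """Representatives of (Z/p^n Z)^x, i.e. exponents not divisible by p."""
    return [a for a in range(1, p**n) if gcd(a, p) == 1]

def act(a, exponent, p, n):
    """sigma_a sends zeta^exponent to zeta^(a*exponent), zeta a primitive p^n-th root."""
    return (a * exponent) % p**n

def levels_are_compatible(p, n):
    """zeta_{p^(n-1)} = zeta_{p^n}^p, so sigma_a must act on it via a mod p^(n-1)."""
    return all(act(a, p, p, n) == p * (a % p**(n - 1)) for a in units(p, n))

def chi_truncations(a, p, depth):
    """The images of an exponent a in (Z/p^n Z)^x for n = 1..depth."""
    return [a % p**n for n in range(1, depth + 1)]
```

For example, `chi_truncations(17, 3, 3)` gives the compatible sequence `[2, 8, 17]`, the first three “digits of precision” of an element of \mathbb{Z}_{3}^{\times}; an actual value of the cyclotomic character is such a sequence continued for all n.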

Our second example of a Galois representation is known as the Tate module of an elliptic curve. We recall that we also discussed an example of a Galois representation coming from the p-torsion points of an elliptic curve in Elliptic Curves. The Tate module is a way to package the action of the Galois group not only on the p-torsion points but also on the p^{n}-torsion points for any n, by taking an inverse limit over n. Now the p^{n}-torsion points are isomorphic to (\mathbb{Z}/p^{n}\mathbb{Z})^{2}, so the inverse limit is going to be isomorphic to \mathbb{Z}_{p}^{2}. This is not a vector space, since \mathbb{Z}_{p} is not a field, so we take the tensor product with \mathbb{Q}_{p} to get \mathbb{Q}_{p}^{2}, which is a vector space. Therefore we get a Galois representation, i.e. a homomorphism from \text{Gal}(\overline{\mathbb{Q}}/\mathbb{Q}) to \text{GL}_{2}(\mathbb{Q}_{p}). This construction also works for abelian varieties – higher dimensional analogues of elliptic curves – except that the Tate module is now 2g-dimensional, where g is the dimension of the abelian variety.

Our last example of a Galois representation is given by the \ell-adic cohomology (explanation of this terminology to come later) of a smooth proper algebraic variety X over \mathbb{Q}. This is the inverse limit over n of the etale cohomology (see also Cohomology in Algebraic Geometry) of X with coefficients in the constant sheaf \mathbb{Z}/p^{n}\mathbb{Z}. This inverse limit is somewhat confusingly denoted H^{i}(X,\mathbb{Z}_{p}) – note that it is not the etale cohomology of X with coefficients in the constant sheaf \mathbb{Z}_{p}! Just as in the case of the Tate module, we take the tensor product with \mathbb{Q}_{p} to produce our Galois representation.

These Galois representations coming from the \ell-adic cohomology somewhat subsume the Tate modules discussed earlier – that is because, if X is an elliptic curve or more generally an abelian variety, the space of \mathbb{Q}_{p}-linear maps from the Tate module (tensored with \mathbb{Q}_{p}) to \mathbb{Q}_{p} is isomorphic to the first \ell-adic cohomology H^{1}(X,\mathbb{Z}_{p})\otimes\mathbb{Q}_{p}. We say that the first \ell-adic cohomology is the dual of the Tate module.

Although we discussed representations over \text{Gal}(\overline{\mathbb{Q}}/\mathbb{Q}) in this post, it is also often useful to make our study “local” and focus on a single prime \ell, and study \text{Gal}(\overline{\mathbb{Q}}_{\ell}/\mathbb{Q}_{\ell}) instead. In this case we might as well just have replaced \text{Gal}(\overline{\mathbb{Q}}/\mathbb{Q}) with \text{Gal}(\overline{\mathbb{Q}}_{\ell}/\mathbb{Q}_{\ell}) in the above discussion, and nothing really changes, as long as the primes \ell and p are different primes. In the case that they are the same prime, things become much more complicated (and the theory is far richer)!

Note: Usually, when discussing “local” Galois representations, the notation for the primes p and \ell is switched! In other words, our local Galois representations are group homomorphisms from \text{Gal}(\overline{\mathbb{Q}}_{p}/\mathbb{Q}_{p}) to \text{GL}_{n}(\mathbb{Q}_{\ell}). This is the reason for the terminology “\ell-adic cohomology”. Since we started out just discussing “global” Galois representations, I switched the notation to use p instead for the only instances where we needed a prime. Hopefully this is not overly confusing. We can also study Galois representations more generally for number fields (“global”) and finite extensions of \mathbb{Q}_{p} (“local”).

Finally, although we stated above that we will only discuss three examples here, let us mention a fourth example: Galois representations can also come from modular forms (see also Modular Forms). To discuss these Galois representations would require us to develop some more machinery first, so we leave this to future posts for now.

References:

Cyclotomic character on Wikipedia

Tate module on Wikipedia

Etale cohomology on Wikipedia

Reciprocity laws and Galois representations: recent breakthroughs by Jared Weinstein

Hecke Operators

A Hecke operator is a certain kind of linear transformation on the space of modular forms or cusp forms (see also Modular Forms) of a certain fixed weight k. They were originally used by (and are now named after) Erich Hecke, who used them to study L-functions (see also Zeta Functions and L-Functions) and in particular to determine the conditions for whether an L-series \sum_{n=1}^{\infty}a_{n}n^{-s} has an Euler product. Together with the meromorphic continuation and the functional equation, the Euler product is one of the important properties of the Riemann zeta function, which L-functions are supposed to be generalizations of. Hecke’s study was inspired by the work of Bernhard Riemann on the zeta function.

An example of a Hecke operator is the one commonly denoted T_{p}, for p a prime number. To understand it conceptually, we must take the view of modular forms as functions on lattices. This is equivalent to the definition of modular forms as functions on the upper half-plane, if we recall that a lattice \Lambda can also be expressed as \mathbb{Z}+\tau\mathbb{Z} where \tau is a complex number in the upper half-plane (see also The Moduli Space of Elliptic Curves).

In this view a modular form is a function f on the space of lattices in \mathbb{C} such that

  • f(\mathbb{Z}+\tau\mathbb{Z}) is holomorphic as a function on the upper half-plane
  • f(\mathbb{Z}+\tau\mathbb{Z}) is bounded as \tau goes to i\infty
  • f(\mu\Lambda)=\mu^{-k}f(\Lambda) for every nonzero complex number \mu, where k is the weight of the modular form

Now we define the Hecke operator T_{p} by what it does to a modular form f(\Lambda) of weight k as follows:

\displaystyle T_{p}f(\Lambda)=p^{k-1}\sum_{\Lambda'\subset \Lambda}f(\Lambda')

where \Lambda' runs over the sublattices of \Lambda of index p. In other words, applying T_{p} to a modular form gives back a modular form whose value on a lattice \Lambda is the sum of the values of the original modular form on the sublattices of \Lambda  of index p, times some factor that depends on the Hecke operator and the weight of the modular form.
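As a sanity check on what the sum runs over, the index p sublattices of \Lambda=\mathbb{Z}+\tau\mathbb{Z} can be listed explicitly: there are p of the form p\mathbb{Z}+(j+\tau)\mathbb{Z} and one more, \mathbb{Z}+p\tau\mathbb{Z}, for p+1 in total. The sketch below writes lattice vectors in coordinates with respect to the basis (1,\tau), and tells sublattices apart by their image modulo p\Lambda (every index p sublattice contains p\Lambda). The function names are hypothetical.

```python
def index_p_sublattices(p):
    """Bases of the index-p sublattices of Z + tau Z, in coordinates w.r.t. (1, tau)."""
    bases = [((p, 0), (j, 1)) for j in range(p)]   # p Z + (j + tau) Z
    bases.append(((1, 0), (0, p)))                 # Z + (p tau) Z
    return bases

def index(basis):
    """Index of the sublattice in Z^2: absolute value of the determinant."""
    (a, b), (c, d) = basis
    return abs(a * d - b * c)

def points_mod_p(basis, p):
    """Image of the sublattice in (Z/p)^2; this determines an index-p sublattice."""
    (a, b), (c, d) = basis
    return frozenset(((i * a + j * c) % p, (i * b + j * d) % p)
                     for i in range(p) for j in range(p))

def count_distinct(p):
    return len({points_mod_p(basis, p) for basis in index_p_sublattices(p)})
```

For each prime p tried, `count_distinct(p)` equals p+1 and every listed basis has `index` equal to p, matching the count of lines through the origin in (\mathbb{Z}/p\mathbb{Z})^{2}.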

Hecke operators are also often defined via their effect on the Fourier expansion of a modular form. Let f(\tau) be a modular form of weight k whose Fourier expansion is given by \sum_{n=0}^{\infty}a_{n}q^{n}, where we have adopted the convention q=e^{2\pi i \tau} which is common in the theory of modular forms (hence this Fourier expansion is also known as a q-expansion). Then the effect of a Hecke operator T_{p} is as follows:

\displaystyle T_{p}f(\tau)=\sum_{n=0}^{\infty}(a_{pn}+p^{k-1}a_{n/p})q^{n}

where a_{n/p}=0 when p does not divide n. To see why this follows from our first definition of the Hecke operator, we note that if our lattice is given by \mathbb{Z}+\tau\mathbb{Z}, there are p+1 sublattices of index p: There are p of these sublattices given by p\mathbb{Z}+(j+\tau)\mathbb{Z} for j ranging from 0 to p-1, and another one given by \mathbb{Z}+(p\tau)\mathbb{Z}. Let us split up the Hecke operators as follows:

\displaystyle T_{p}f(\mathbb{Z}+\tau\mathbb{Z})=p^{k-1}\sum_{j=0}^{p-1}f(p\mathbb{Z}+(j+\tau)\mathbb{Z})+p^{k-1}f(\mathbb{Z}+p\tau\mathbb{Z})=\Sigma_{1}+\Sigma_{2}

where \Sigma_{1}=p^{k-1}\sum_{j=0}^{p-1}f(p\mathbb{Z}+(j+\tau)\mathbb{Z}) and \Sigma_{2}=p^{k-1}f(\mathbb{Z}+p\tau\mathbb{Z}). Let us focus on the former first. We have

\displaystyle \Sigma_{1}=p^{k-1}\sum_{j=0}^{p-1}f(p\mathbb{Z}+(j+\tau)\mathbb{Z})

But applying the third property of modular forms above, namely that f(\mu\Lambda)=\mu^{-k}f(\Lambda) with \mu=p, we have

\displaystyle \Sigma_{1}=p^{-1}\sum_{j=0}^{p-1}f(\mathbb{Z}+((j+\tau)/p)\mathbb{Z})

Now the arguments of the modular forms being summed are written in the usual way, except that instead of \tau we have (j+\tau)/p, so we expand each of them as a Fourier series

\displaystyle \Sigma_{1}=p^{-1}\sum_{j=0}^{p-1}\sum_{n=0}^{\infty}a_{n}e^{2\pi i n((j+\tau)/p)}

We can switch the summations since one of them is finite

\displaystyle \Sigma_{1}=p^{-1}\sum_{n=0}^{\infty}\sum_{j=0}^{p-1}a_{n}e^{2\pi i n((j+\tau)/p)}

The inner sum over j is zero unless p divides n, in which case the sum is equal to p. This gives us

\displaystyle \Sigma_{1}=\sum_{n=0}^{\infty}a_{pn}q^{n}

where again q=e^{2\pi i \tau}. Now consider \Sigma_{2}. We have

\displaystyle \Sigma_{2}=p^{k-1}f(\mathbb{Z}+p\tau\mathbb{Z})

Expanding the right hand side into a Fourier series, we have

\displaystyle \Sigma_{2}=p^{k-1}\sum_{n}a_{n}e^{2\pi i n p\tau}

Reindexing, we have

\displaystyle \Sigma_{2}=p^{k-1}\sum_{n}a_{n/p}q^{n}

and adding together \Sigma_{1} and \Sigma_{2} gives us our result.
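The q-expansion formula is easy to experiment with. The following sketch applies it to the Eisenstein series G_{4}, whose q-expansion is 1/240+\sum_{n\geq 1}\sigma_{3}(n)q^{n} (the constant term is \zeta(-3)/2=1/240), and checks on the computed coefficients that T_{p}G_{4}=\sigma_{3}(p)G_{4}. Only finitely many coefficients are compared, so this is an illustration rather than a proof, and the function names are hypothetical.

```python
from fractions import Fraction

def sigma(power, n):
    """Sum of d^power over the positive divisors d of n."""
    return sum(d**power for d in range(1, n + 1) if n % d == 0)

def eisenstein(k, constant, num_terms):
    """Coefficients a_0, ..., a_{num_terms-1} of G_k, given its constant term."""
    return [Fraction(constant)] + [Fraction(sigma(k - 1, n)) for n in range(1, num_terms)]

def hecke(p, k, a):
    """Coefficients b_n = a_{pn} + p^(k-1) a_{n/p} of T_p f, as far as `a` allows."""
    out = []
    for n in range((len(a) - 1) // p + 1):   # need p*n <= len(a) - 1
        b = a[p * n]
        if n % p == 0:                        # a_{n/p} is zero unless p divides n
            b += p**(k - 1) * a[n // p]
        out.append(b)
    return out

G4 = eisenstein(4, Fraction(1, 240), 60)
```

With these coefficients, `hecke(p, 4, G4)[n]` agrees with `sigma(3, p) * G4[n]` for every n in range and for p = 2, 3, 5, including the constant term, where the formula gives (1+p^{3})a_{0}.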

The Hecke operators can be defined not only for prime numbers, but for all natural numbers, and any two Hecke operators T_{m} and T_{n} commute with each other. They preserve the weight of a modular form, and take cusp forms to cusp forms (this can be seen via their effect on the Fourier series). We can also define Hecke operators for modular forms with level structure, but it is more complicated and has some subtleties when for the Hecke operator T_{n} we have n sharing a common factor with the level.

If a cusp form f is an eigenvector for a Hecke operator T_{n}, and it is normalized, i.e. its Fourier coefficient a_{1} is equal to 1, then the corresponding eigenvalue of the Hecke operator T_{n} on f is precisely the Fourier coefficient a_{n}.

Now the Hecke operators satisfy the following multiplicativity properties:

  • T_{m}T_{n}=T_{mn} for m and n mutually prime
  • T_{p^{n}}T_{p}=T_{p^{n+1}}+p^{k-1}T_{p^{n-1}} for p prime

Suppose we have an L-series \sum_{n}a_{n}n^{-s}. This L-series will have an Euler product if and only if the coefficients a_{n} satisfy the following:

  • a_{m}a_{n}=a_{mn} for m and n mutually prime
  • a_{p^{n}}a_{p}=a_{p^{n+1}}+p^{k-1}a_{p^{n-1}} for p prime

Given that the Fourier coefficients of a normalized Hecke eigenform (a normalized cusp form that is a simultaneous eigenvector for all the Hecke operators) are the eigenvalues of the Hecke operators, we see that the L-series of a normalized Hecke eigenform has an Euler product.
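A classical example to test these relations on is the discriminant cusp form \Delta=q\prod_{m\geq 1}(1-q^{m})^{24} of weight 12, a normalized Hecke eigenform whose coefficients are the values \tau(n) of the Ramanujan tau function. The sketch below expands the product and checks both coefficient relations with k=12 on the computed range; the function names are hypothetical and the check covers only finitely many coefficients.

```python
from math import gcd

def delta_coefficients(num_terms):
    """tau[n] for 0 <= n < num_terms, where Delta = q * prod_{m>=1} (1 - q^m)^24."""
    series = [0] * num_terms
    series[0] = 1
    for m in range(1, num_terms):
        for _ in range(24):
            # multiply in place by (1 - q^m), from high degree down to low
            for i in range(num_terms - 1, m - 1, -1):
                series[i] -= series[i - m]
    return [0] + series[:num_terms - 1]      # multiply by q

def coprime_multiplicative(tau):
    """Check tau(m) tau(n) = tau(mn) whenever gcd(m, n) = 1 and mn is in range."""
    n_max = len(tau) - 1
    return all(tau[m] * tau[n] == tau[m * n]
               for m in range(1, n_max + 1)
               for n in range(1, n_max // m + 1)
               if gcd(m, n) == 1)

def prime_power_recursion(tau, p, k=12):
    """Check tau(p^n) tau(p) = tau(p^(n+1)) + p^(k-1) tau(p^(n-1)) while in range."""
    n = 1
    while p**(n + 1) < len(tau):
        if tau[p**n] * tau[p] != tau[p**(n + 1)] + p**(k - 1) * tau[p**(n - 1)]:
            return False
        n += 1
    return True

tau = delta_coefficients(130)
```

For instance \tau(2)=-24 and \tau(4)=-1472, and indeed (-24)^{2}=-1472+2^{11}, which is the second relation with p=2 and n=1.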

In addition to the Hecke operators T_{n}, there are also other closely related operators such as the diamond operator \langle n\rangle and another operator denoted U_{p}. These and more on Hecke operators, such as other ways to define them with double coset operators or Hecke correspondences will hopefully be discussed in future posts.

References:

Hecke Operator on Wikipedia

Modular Forms by Andrew Snowden

Congruences between Modular Forms by Frank Calegari

A First Course in Modular Forms by Fred Diamond and Jerry Shurman

Advanced Topics in the Arithmetic of Elliptic Curves by Joseph H. Silverman

Iwasawa theory, p-adic L-functions, and p-adic modular forms

In Bernoulli Numbers, Fermat’s Last Theorem, and the Riemann Zeta Function, we introduced the Kubota-Leopoldt p-adic L-function, which encodes the congruences discovered by Kummer between special values of the Riemann zeta function. In this post, we will connect them to Iwasawa theory and p-adic modular forms.

Let us start with a little introduction to Iwasawa theory. Consider the Galois group \text{Gal}(\mathbb{Q}(\mu_{p^{\infty}})/\mathbb{Q}), where \mathbb{Q}(\mu_{p^{\infty}}) is the extension of the rational numbers \mathbb{Q} obtained by adjoining all the p-th-power roots of unity to \mathbb{Q}. This Galois group is isomorphic to \mathbb{Z}_{p}^{\times}, the group of units of the p-adic integers \mathbb{Z}_p.

The group \mathbb{Z}_{p}^{\times} decomposes (for p odd) into the product of a group isomorphic to 1+p\mathbb{Z}_{p} and the group of (p-1)-th roots of unity. Let \Gamma be the subgroup of this Galois group isomorphic to 1+p\mathbb{Z}_{p}. The Iwasawa algebra is defined to be the completed group ring \mathbb{Z}_{p}[[\Gamma]], which also happens to be isomorphic to the power series ring \mathbb{Z}_{p}[[T]].
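The decomposition of \mathbb{Z}_{p}^{\times} can be seen at finite precision, working modulo p^{n} for an odd prime p. In the sketch below, the root of unity attached to a is computed as a^{p^{n-1}} mod p^{n} (its Teichmüller representative), and the remaining factor lies in 1+p\mathbb{Z}_{p}. The function names are hypothetical, and the three-argument `pow` with exponent -1 assumes Python 3.8 or later.

```python
def teichmuller(a, p, n):
    """The (p-1)-th root of unity mod p^n that is congruent to a mod p."""
    return pow(a, p**(n - 1), p**n)

def decompose(a, p, n):
    """Write a = w * u mod p^n with w^(p-1) = 1 and u = 1 mod p."""
    modulus = p**n
    w = teichmuller(a, p, n)
    u = (a * pow(w, -1, modulus)) % modulus   # modular inverse, Python 3.8+
    return w, u
```

For example, with p=5 and n=4, every unit a modulo 625 factors as `w * u` with `pow(w, 4, 625) == 1` and `u % 5 == 1`.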

The interest in the Iwasawa algebra comes from the fact that many important objects of interest in number theory are modules over the Iwasawa algebra, and such modules have a structure that makes them easy to study. For instance, the inverse limit of the p-part of the ideal class groups of cyclotomic fields is such a module. The “main conjecture of Iwasawa theory“, a high-powered version of Kummer’s theorem that relates ideal class groups and Bernoulli numbers, describes this module. Namely, the main conjecture of Iwasawa theory states that as a module over the Iwasawa algebra, the inverse limit of the p-part of the ideal class groups of cyclotomic fields has a characteristic ideal generated by none other than the Kubota-Leopoldt p-adic L-function!

Let us describe more the relation between the Iwasawa algebra and the Kubota-Leopoldt zeta function by relating them to measures. Our measure here takes functions on the group \mathbb{Z}_p^{\times} and gives an element of \mathbb{Z}_{p}. This should remind us of measures and integrals in real analysis, except instead of our functions being on \mathbb{R}, they are on the group \mathbb{Z}_{p}^{\times}, and instead of taking values in \mathbb{R}, they take values in \mathbb{Z}_{p}. This is just an example of a more general kind of measure.

Now these measures are actually in one-to-one correspondence with the elements of the Iwasawa algebra!

The Iwasawa algebra is \mathbb{Z}_{p}[[\Gamma]], and note that \Gamma is a subgroup of \mathbb{Z}_{p}^{\times}. Suppose we have an element of the Iwasawa algebra. We define the corresponding measure by saying what it does to a function f on \mathbb{Z}_{p}^{\times}. Note that if we extend this function linearly, we can evaluate it on the element of the Iwasawa algebra and get an element of \mathbb{Z}_{p}. Thus we define our measure by evaluation. The other direction is a bit more involved, but given the measure, we build an element of the Iwasawa algebra by exploiting the profinite nature of \mathbb{Z}_{p}^{\times}, which means the measure was built from functions on the finite pieces of it.
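At a finite level this “evaluation” is just a weighted sum. The sketch below is a toy model under stated assumptions: it replaces \mathbb{Z}_{p}^{\times} by the finite group (\mathbb{Z}/p^{n}\mathbb{Z})^{\times} and uses ordinary integer coefficients, so a group ring element is a dictionary from group elements to coefficients. The names `integrate` and `multiply` are hypothetical.

```python
def integrate(f, mu):
    """Evaluate the linear extension of f on the group ring element mu."""
    return sum(coeff * f(g) for g, coeff in mu.items())

def multiply(mu, nu, modulus):
    """Product in the group ring of (Z/modulus)^x (convolution of measures)."""
    out = {}
    for g, c in mu.items():
        for h, d in nu.items():
            gh = (g * h) % modulus
            out[gh] = out.get(gh, 0) + c * d
    return out

MODULUS = 25                       # level p^n with p = 5, n = 2
mu = {2: 3, 7: -1}                 # 3[2] - [7]
nu = {3: 1, 24: 4}                 # [3] + 4[24]
chi = lambda a: pow(a, 3, MODULUS) # a character: chi(ab) = chi(a) chi(b) mod 25
```

A single group element `{g: 1}` integrates f to f(g), like a Dirac measure. For a character such as `chi`, integrating against a product of group ring elements gives the product of the integrals modulo 25, which is one reason characters are natural functions to feed into such measures.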

Now that we know how the Iwasawa algebra and measures are related, what about the Kubota-Leopoldt zeta function? For that we must now take a detour through p-adic modular forms, in particular p-adic Eisenstein series.

The reason modular forms are brought into this is that the value of the zeta function at 1-k shows up in the constant term in the Fourier expansion of the Eisenstein series G_{k}:

\displaystyle G_{k}(\tau):=\frac{\zeta(1-k)}{2}+\sum_{n=1}^{\infty}\left(\sum_{d\vert n}d^{k-1}\right)q^{n}

where q=e^{2\pi i \tau}, as is common convention in the theory (hence the Fourier expansion is also known as the q-expansion). This Eisenstein series G_{k} is a modular form of weight k. A similar relationship holds between the Kubota-Leopoldt p-adic L-function and p-adic Eisenstein series, the latter of which is an example of a p-adic modular form. We will define this now. Let f be a modular form defined over \mathbb{Q}. This means that, when we consider its Fourier expansion

\displaystyle f(\tau)=\sum_{n=0}^{\infty}a_{n}q^{n},

the coefficients a_{n} are rational numbers. We define a p-adic valuation on the space of modular forms by taking the smallest p-adic valuation among the coefficients a_{n}, i.e.

\displaystyle v_{p}(f)=\inf_{n} v_{p}(a_{n})

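We recall that the bigger the power of p dividing a rational number, the bigger its p-adic valuation, and the smaller its p-adic absolute value. So two modular forms are p-adically close when the valuation v_{p} of their difference is large, and this lets us consider the limit of a sequence of modular forms. A p-adic modular form is such a limit of a sequence of classical modular forms (that is, a formal q-expansion that is the limit of their q-expansions).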
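The valuation on q-expansions can be computed directly on a truncated list of coefficients. In the sketch below, the function names are hypothetical, and the example uses the first few coefficients 1/240, 1, 9, 28, 73 of G_{4}. Because only finitely many coefficients are inspected, `vp_form` in general gives only an upper bound for the true infimum.

```python
from fractions import Fraction

def vp(x, p):
    """p-adic valuation of a rational number x (infinity for x = 0)."""
    x = Fraction(x)
    if x == 0:
        return float("inf")
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v

def vp_form(coeffs, p):
    """Smallest valuation among the given coefficients of a q-expansion."""
    return min(vp(a, p) for a in coeffs)

G4_head = [Fraction(1, 240), 1, 9, 28, 73]
```

Since 240=2^{4}\cdot 3\cdot 5, `vp_form(G4_head, 2)` is -4 and `vp_form(G4_head, 3)` is -1, coming entirely from the constant term.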

The weight of a p-adic modular form is the limit of the weights of the classical ones of which it is the limit. Serre showed that for classical modular forms f and g, if the p-adic valuation

\displaystyle v_{p}(f-g)\geq v_{p}(f)+m

for some integer m\geq 1 (with f nonzero and p odd), then the weights of f and g will be congruent mod (p-1)p^{m-1}.

This implies that the weight of a p-adic modular form takes values in the inverse limit of \mathbb{Z}/(p-1)p^{m}\mathbb{Z}, which is isomorphic to the product of \mathbb{Z}_{p} and \mathbb{Z}/(p-1)\mathbb{Z}. Here is where measures come in – this space of weights can be identified with characters of \mathbb{Z}_{p}^{\times}, i.e. a weight k is a function on \mathbb{Z}_{p}^{\times} and being such a function, it is an input for a measure!

Now, we will create a measure, with a bit of a twist. Given a weight k, we can build a p-adic Eisenstein series of weight k (recall that this is a limit of classical Eisenstein series):

\displaystyle G_{k}^{*}:=\lim_{i\to\infty}G_{k_{i}}

We think of this as a “measure” that takes a weight k (again recall that the weight k is a character, i.e. a function on \mathbb{Z}_{p}^{\times}) and gives a weight k Eisenstein series, i.e. an “Eisenstein measure”. But the value of the Kubota-Leopoldt zeta function at 1-k is the constant term in the Fourier expansion! Therefore, if we take the constant term of this p-adic Eisenstein series, we have a good old measure, a recipe for taking a function on \mathbb{Z}_{p}^{\times} (the weight k) and giving us an element of \mathbb{Z}_{p}. But by our earlier discussion, this is an element of the Iwasawa algebra!

There are some subtleties I swept under the rug, but to summarize – important objects in number theory are modules over the Iwasawa algebra. p-adic L-functions which interpolate L-functions at special values are elements of the Iwasawa algebra.

This is a modern, high-powered version of Kummer’s discovery that relates certain ideal class groups and Bernoulli numbers (which are special values of the Riemann zeta function). The Eisenstein measure, which gives a p-adic modular form when evaluated at a certain weight, leads to the notion of a “Hida family“, a “p-adic family” of p-adic modular forms. But that discussion is for another time!

References:

Iwasawa theory on Wikipedia

Iwasawa algebra on Wikipedia

p-adic L-function on Wikipedia

Main conjecture of Iwasawa theory on Wikipedia

An introduction to Eisenstein measures by E. E. Eischen

Modular curves and cyclotomic fields by Romyar Sharifi

Desde Fermat, Lamé y Kummer hasta Iwasawa: Una introducción a la teoría de Iwasawa (in Spanish) by Álvaro Lozano-Robledo

Bernoulli Numbers, Fermat’s Last Theorem, and the Riemann Zeta Function

The Bernoulli numbers B_{n} are defined by the Taylor series expansion

\displaystyle \frac{x}{e^{x}-1}=\sum_{n=0}^{\infty}B_{n}\frac{x^{n}}{n!}.

The n-th Bernoulli number B_{n} is zero for odd n, except for n=1, where it is equal to -1/2. For the first few even numbers, we have

\displaystyle B_0=1,\; B_{2}=\frac{1}{6}, \; B_{4}=-\frac{1}{30}, \; B_6=\frac{1}{42}, \; B_{8}=-\frac{1}{30}, \; B_{10}=\frac{5}{66}.
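These values can be reproduced with exact rational arithmetic. The sketch below uses the standard recurrence \sum_{k=0}^{n}\binom{n+1}{k}B_{k}=0 for n\geq 1, which is not stated above but follows from multiplying the generating function by e^{x}-1. The function name `bernoulli_numbers` is illustrative.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(count):
    """Return [B_0, ..., B_{count-1}] with the convention B_1 = -1/2 (count >= 1)."""
    b = [Fraction(1)]
    for n in range(1, count):
        # Solve sum_{k=0}^{n} C(n+1, k) B_k = 0 for B_n.
        total = sum(comb(n + 1, k) * b[k] for k in range(n))
        b.append(-total / (n + 1))
    return b

b = bernoulli_numbers(11)
for n in (0, 1, 2, 4, 6, 8, 10):
    print(n, b[n])
```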

Bernoulli numbers have many interesting properties, and many mathematicians have studied them for a very long time. They are named after Jacob Bernoulli, but were also studied by Seki Takakazu in Japan at around the same time (end of 17th/beginning of 18th century). In this post I want to focus on the work of Ernst Eduard Kummer, more than a century after Bernoulli and Seki.

We’re going to come back to Bernoulli numbers later, but for now let’s talk about something completely different – Fermat’s Last Theorem, which Kummer was working on. In the time of Kummer, there was a proposal to study Fermat’s Last Theorem by factoring the left-hand side of the famous equation x^{p}+y^{p}=z^{p} into linear terms. Just as x^2+y^2 factors into

\displaystyle x^2+y^2=(x+iy)(x-iy),

we would have that x^{p}+y^{p} also factors into

\displaystyle x^{p}+y^{p}=(x+y)(x+\zeta_{p}y)(x+\zeta_{p}^{2} y)\cdots(x+\zeta_{p}^{p-1} y)

where p is an odd prime and \zeta_{p} is a primitive p-th root of unity.

However, there is a problem. In these kinds of numbers where p-th roots of unity are adjoined, factorization may not be unique! Hence Kummer developed the theory of “ideals” to study this (see also The Fundamental Theorem of Arithmetic and Unique Factorization).

Unique factorization does not work with the numbers themselves, but it works with ideals (this is true for number fields, since their rings of integers are what is called a “Dedekind domain”). Hence the original name of ideals was “ideal numbers”. To number fields we associate an “ideal class group”. If this group has only one element, unique factorization holds. If not, then things can get complicated. The ideal class group (together with the Galois group) is probably the most important group in number theory.

Kummer found that if p is a “regular prime“, i.e. if p does not divide the number of elements of the ideal class group (also known as the class number) of the “p-th cyclotomic field” (the rational numbers with p-th roots of unity adjoined), then Fermat’s Last Theorem is true for p.

Let’s go back to Bernoulli numbers now – Kummer also found that a prime p is regular if and only if it does not divide the numerator of the n-th Bernoulli number B_{n}, for all even n with 2\leq n\leq p-3. In other words, Kummer proved Fermat’s Last Theorem for prime exponents not dividing the numerators of Bernoulli numbers! Fermat’s Last Theorem has now been proved in all cases, but the work of Kummer remains influential.
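Kummer's criterion is easy to try out. The following sketch lists the odd primes below a bound that fail it, testing the numerators of B_{n} for even n between 2 and p-3 (the odd-index Bernoulli numbers vanish, so they carry no information). The helper names are illustrative, and the Bernoulli numbers are computed with the recurrence \sum_{k=0}^{n}\binom{n+1}{k}B_{k}=0, an assumption about method rather than anything Kummer used.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(count):
    """Return [B_0, ..., B_{count-1}] with the convention B_1 = -1/2."""
    b = [Fraction(1)]
    for n in range(1, count):
        total = sum(comb(n + 1, k) * b[k] for k in range(n))
        b.append(-total / (n + 1))
    return b

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def irregular_primes(bound):
    """Odd primes p < bound dividing the numerator of some B_n, n even, 2 <= n <= p-3."""
    b = bernoulli_numbers(bound)
    return [
        p for p in range(3, bound)
        if is_prime(p)
        and any(b[n].numerator % p == 0 for n in range(2, p - 2, 2))
    ]

print(irregular_primes(100))
```

Every odd prime below 100 that is not in the printed list is regular, so Kummer's theorem applies to it.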

So we’ve related Bernoulli numbers to ideal class groups and the very famous Fermat’s Last Theorem. Now let us relate Bernoulli numbers to another very famous thing in math – the Riemann zeta function (see also Zeta Functions and L-Functions).

It is known that the Bernoulli numbers are related to values of the Riemann zeta function at the negative integers (so we need the analytic continuation to do this) by the following equation: B_n=-n \zeta(1-n) for n greater than or equal to 2.

Now, Kummer also discovered that Bernoulli numbers satisfy certain congruences modulo powers of a prime p, in particular

\displaystyle \frac{B_{m}}{m}\equiv \frac{B_{n}}{n} \mod p

where m and n are positive even integers neither of which is divisible by (p-1), and m\equiv n \mod (p-1). Here congruence for two rational numbers \frac{a}{b} and \frac{c}{d} (with b and d not divisible by p) means that ad is congruent to bc mod p.
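As a quick check of this congruence, the sketch below reduces B_{m}/m modulo p for several m in the same residue class mod (p-1). The names `residue_mod_p` and `kummer_residues` are illustrative. The code assumes the denominator of B_{m}/m is prime to p, which is the case when (p-1) does not divide m, and it uses the three-argument `pow` with exponent -1 (Python 3.8 or later) for the modular inverse.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(count):
    """Return [B_0, ..., B_{count-1}] with the convention B_1 = -1/2."""
    b = [Fraction(1)]
    for n in range(1, count):
        total = sum(comb(n + 1, k) * b[k] for k in range(n))
        b.append(-total / (n + 1))
    return b

def residue_mod_p(x, p):
    """Reduce a rational x, whose denominator is prime to p, to an integer mod p."""
    return x.numerator * pow(x.denominator, -1, p) % p

def kummer_residues(p, start, how_many):
    """Residues of B_m/m mod p for m = start, start+(p-1), start+2(p-1), ..."""
    ms = [start + j * (p - 1) for j in range(how_many)]
    b = bernoulli_numbers(ms[-1] + 1)
    return [residue_mod_p(b[m] / m, p) for m in ms]

print(kummer_residues(7, 2, 3))   # m = 2, 8, 14
print(kummer_residues(5, 2, 3))   # m = 2, 6, 10
```

Each printed list should be constant, which is exactly what the congruence predicts.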

We also have a more general congruence for bigger powers of p:

\displaystyle (1-p^{m-1})\frac{B_{m}}{m}\equiv (1-p^{n-1})\frac{B_{n}}{n} \mod p^{a+1}

where m and n are positive even integers neither of which is divisible by (p-1), and m\equiv n \mod \varphi(p^{a+1}), \varphi(p^{a+1}) being the number of positive integers less than p^{a+1} that are also coprime to it.

By our earlier discussion, this means the special values of the Riemann zeta function also satisfy congruences modulo powers of p.

Congruences modulo powers of p are encoded in modern language by the “p-adic numbers” (see also Valuations and Completions) introduced by Kurt Hensel near the end of the 19th century. The congruences between the special values of the Riemann zeta function are now similarly encoded in a p-adic analytic function known as the Kubota-Leopoldt p-adic L-function.

So again, to summarize the story so far – Bernoulli numbers are related to the ideal class group and also to the special values of the Riemann zeta function, and bridge the two subjects.

If this reminds you of the analytic class number formula, well in fact that is one of the ingredients in the proof of Kummer’s result relating regular primes and the Bernoulli numbers. Moreover, the information that they encode is related to divisibility or congruence modulo primes or their powers. This is where the p-adic L-functions come in.

The Bernoulli numbers also appear in the constant term of the Fourier expansion of Eisenstein series. The Eisenstein series is an example of a modular form (see also Modular Forms), which gives us Galois representations. The Galois group, on the other hand is related to the ideal class group by class field theory (see also Some Basics of Class Field Theory). So this is one way to create the bridge between the two concepts. In fact, this was used to prove the Herbrand-Ribet theorem, a stronger version of Kummer’s result.

So we also have modular forms in the picture. In modern research all of these are deeply intertwined – ideal class groups, zeta functions, congruences, and modular forms.

References:

Bernoulli number on Wikipedia

Riemann zeta function on Wikipedia

Kummer’s congruence on Wikipedia

p-adic L-function on Wikipedia

Herbrand-Ribet theorem on Wikipedia

Bernoulli numbers, Hurwitz numbers, p-adic L-functions and Kummer’s criterion by Alvaro Lozano-Robledo

An introduction to Eisenstein measures by E. E. Eischen

How can we construct abelian Galois extensions of basic number fields? by Barry Mazur

Modular Forms

We have previously mentioned modular forms in The Moduli Space of Elliptic Curves and discussed them very briefly in the context of modular curves in Shimura Varieties. In this post, we will discuss this very important and central concept in modern number theory in more detail.

First we recall some facts about the group \text{SL}_{2}(\mathbb{Z}), which is so important that it is given the special name of the modular group. It is defined as the group of 2\times 2 matrices with integer coefficients and determinant equal to 1, and it acts on the upper half-plane (the set of complex numbers with positive imaginary part) in the following manner. Suppose an element \gamma of \text{SL}_{2}(\mathbb{Z}) is written in the form \left(\begin{array}{cc}a&b\\ c&d\end{array}\right). Then for \tau an element of the upper half-plane we write

\displaystyle \gamma(\tau)=\frac{a\tau+b}{c\tau+d}

A modular form (with respect to \text{SL}_{2}(\mathbb{Z})) is a holomorphic function f on the upper half-plane such that

\displaystyle f(\gamma(\tau))=(c\tau+d)^{k}f(\tau)

for every \gamma in \text{SL}_{2}(\mathbb{Z}) and some fixed integer k, and such that f(\tau) is bounded as the imaginary part of \tau goes to infinity. The number k is called the weight of the modular form. If the function is not required to be bounded as the imaginary part of \tau goes to infinity it is a weakly modular form, and if furthermore it is merely required to be meromorphic, it is a meromorphic modular form. A meromorphic modular form of weight 0 is just a meromorphic function on the upper half-plane which is invariant under the action of \text{SL}_{2}(\mathbb{Z}) (and meromorphic, rather than bounded, as the imaginary part of its argument goes to infinity) – we also call it a modular function.

We denote the set of modular forms of weight k with respect to \text{SL}_{2}(\mathbb{Z}) by \mathcal{M}_{k}(\text{SL}_{2}(\mathbb{Z})). Adding together two modular forms of the same weight gives another modular form of the same weight, and modular forms can be scaled by a complex number, so \mathcal{M}_{k}(\text{SL}_{2}(\mathbb{Z})) actually forms a vector space. We can also multiply a modular form of weight k with a modular form of weight l to get a modular form of weight k+l, so modular forms of a certain weight form a graded piece of a graded ring \mathcal{M}(\text{SL}_{2}(\mathbb{Z})):

\displaystyle \mathcal{M}(\text{SL}_{2}(\mathbb{Z}))=\bigoplus_{k}\mathcal{M}_{k}(\text{SL}_{2}(\mathbb{Z}))

Modular functions are actually functions on the moduli space of elliptic curves – but what about modular forms of higher weight? It turns out that the modular forms of weight 2 correspond to coefficients of differential forms on this space. To see this, consider d\tau and how the group \text{SL}_{2}(\mathbb{Z}) acts on it:

\displaystyle d\gamma(\tau)=\gamma'(\tau)d\tau=(c\tau+d)^{-2}d\tau

where \gamma'(\tau) is just the usual derivative of the action of \gamma as described earlier. For a general differential form given by f(\tau)d\tau to be invariant under the action of \text{SL}_{2}(\mathbb{Z}) we must therefore have

\displaystyle f(\gamma(\tau))=(c\tau+d)^{2}f(\tau).
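For completeness, here is the short computation behind these two formulas, written out as a worked derivation. It uses only the quotient rule and the condition ad-bc=1 on elements of the modular group.

```latex
\gamma'(\tau)
  = \frac{d}{d\tau}\,\frac{a\tau+b}{c\tau+d}
  = \frac{a(c\tau+d)-c(a\tau+b)}{(c\tau+d)^{2}}
  = \frac{ad-bc}{(c\tau+d)^{2}}
  = (c\tau+d)^{-2},
\qquad
f(\gamma(\tau))\,d\gamma(\tau)
  = (c\tau+d)^{-2}f(\gamma(\tau))\,d\tau .
```

The right-hand side equals f(\tau)d\tau exactly when f(\gamma(\tau))=(c\tau+d)^{2}f(\tau), which is the weight 2 transformation law.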

The modular forms of weight greater than 2 arise when we consider products of these differential forms. More technically, modular forms are sections of line bundles on modular curves, which come about when we compactify moduli spaces of elliptic curves (possibly with extra structure).

Let us now look at some examples of modular forms. Since modular forms “live on” moduli spaces of elliptic curves, we will keep in mind elliptic curves as we look at these examples. Our first family of examples is the Eisenstein series of weight k (for even k\geq 4), denoted by G_{k}(\tau), which are of the form

\displaystyle G_{k}(\tau)=\sum_{(m,n)\in\mathbb{Z}^{2}\setminus \{(0,0)\}}\frac{1}{(m+n\tau)^{k}}

Any modular form with respect to \text{SL}_{2}(\mathbb{Z}) can in fact be written as a polynomial in the Eisenstein series G_{4}(\tau) and G_{6}(\tau).

Now, let us relate this to elliptic curves. An elliptic curve over the complex numbers may be written as a Weierstrass equation

\displaystyle y^{2}=4x^{3}-g_{2}x-g_{3}

The coefficients on the right-hand side g_{2} and g_{3} are in fact modular forms, of weight 4 and weight 6 respectively, given in terms of the Eisenstein series by g_{2}(\tau)=60G_{4}(\tau) and g_{3}(\tau)=140G_{6}(\tau).

Another example of a modular form is the modular discriminant of an elliptic curve, as a modular form denoted \Delta(\tau). It is a modular form of weight 12, and can be expressed via the elliptic curve coefficients that we defined earlier:

\displaystyle \Delta(\tau)=(g_{2}(\tau))^{3}-27(g_{3}(\tau))^{2}.
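We can compute the first few Fourier coefficients of the discriminant with a short sketch. It is written in terms of the normalized Eisenstein series E_{4}=1+240\sum\sigma_{3}(n)q^{n} and E_{6}=1-504\sum\sigma_{5}(n)q^{n}, which are not introduced in this post and are an assumption here. With g_{2}=60G_{4} and g_{3}=140G_{6}, one has g_{2}^{3}-27g_{3}^{2}=(2\pi)^{12}(E_{4}^{3}-E_{6}^{2})/1728, so the code computes the discriminant up to the constant (2\pi)^{12}. All function names are illustrative.

```python
def sigma(power, n):
    """Sum of d**power over the positive divisors d of n."""
    return sum(d ** power for d in range(1, n + 1) if n % d == 0)

def eisenstein(constant, power, terms):
    """Coefficients of 1 + constant * sum_{n>=1} sigma_power(n) q^n, truncated."""
    return [1] + [constant * sigma(power, n) for n in range(1, terms)]

def multiply(a, b):
    """Product of two truncated q-expansions of the same length."""
    terms = len(a)
    out = [0] * terms
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            if i + j < terms:
                out[i + j] += x * y
    return out

def discriminant_coefficients(terms):
    """Coefficients of (E_4^3 - E_6^2)/1728 up to q^(terms-1)."""
    e4 = eisenstein(240, 3, terms)
    e6 = eisenstein(-504, 5, terms)
    e4_cubed = multiply(multiply(e4, e4), e4)
    e6_squared = multiply(e6, e6)
    # The difference is divisible by 1728 coefficient by coefficient.
    return [(x - y) // 1728 for x, y in zip(e4_cubed, e6_squared)]

print(discriminant_coefficients(6))
```

The constant term is zero, so the expansion starts at q.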

Our final example in this post is not of a modular form, but a meromorphic modular form of weight 0, i.e. a modular function. It is holomorphic on the upper half-plane, but goes to infinity as the imaginary part of \tau goes to infinity. It is the j-invariant associated to an elliptic curve. Once again we may express it in terms of the elliptic curve coefficients g_{2} and g_{3}:

\displaystyle j(\tau)=1728\frac{(g_{2}(\tau))^{3}}{(g_{2}(\tau))^{3}-27(g_{3}(\tau))^{2}}

Note that the denominator is also the modular discriminant.  The points of the moduli space of elliptic curves correspond to isomorphism classes of elliptic curves, and since the j-invariant is an honest-to-goodness holomorphic function on the moduli space of elliptic curves over \mathbb{C}, we can see that isomorphic elliptic curves will have the same j-invariant. This is not the case for the other modular forms we described above, which are not modular functions, i.e. they have nonzero weight! Why is this so? Let us recall that an elliptic curve over \mathbb{C} corresponds to a lattice. Acting on a basis of this lattice by an element of \text{SL}_{2}(\mathbb{Z}) changes the basis, but preserves the lattice. This will be reflected as “admissible changes of coordinates” in the Weierstrass equations, and also changes these modular forms associated to the elliptic curves even though the elliptic curves are still isomorphic. But they change in a predictable way, according to the definition of modular forms.

A modular form f(\tau) is also called a cusp form if the limit of f(\tau) is zero as the imaginary part of \tau approaches infinity. We denote the set of cusp forms of weight k by \mathcal{S}_{k}(\text{SL}_{2}(\mathbb{Z})). They are a vector subspace of \mathcal{M}_{k}(\text{SL}_{2}(\mathbb{Z})) and the graded ring formed by their direct sum for all k, denoted \mathcal{S}(\text{SL}_{2}(\mathbb{Z})), is an ideal of the graded ring \mathcal{M}(\text{SL}_{2}(\mathbb{Z})). Cusp forms form a very important part of modern research, but we will not discuss them much in this introductory post and leave them for the future.

Let us now discuss congruence subgroups of \text{SL}_{2}(\mathbb{Z}) (we have also discussed this briefly in Shimura Varieties), so that we can define more general modular forms with respect to such a congruence subgroup instead of just \text{SL}_{2}(\mathbb{Z}). Given an integer N, the principal congruence subgroup \Gamma(N) of \text{SL}_{2}(\mathbb{Z}) is the subgroup consisting of the elements which reduce to the identity when we reduce the entries modulo N. A congruence subgroup is any subgroup \Gamma that contains a principal congruence subgroup \Gamma(N) for some N. We refer to N as the level of the congruence subgroup.

There are two important kinds of congruence subgroups of \text{SL}_{2}(\mathbb{Z}), denoted by \Gamma_{0}(N) and \Gamma_{1}(N). The subgroup \Gamma_{0}(N) consists of the elements that become upper triangular after reduction modulo N, while the subgroup \Gamma_{1}(N) consists of the elements that become upper triangular with ones on the diagonal after reduction modulo N. As we discussed in Shimura Varieties, these are related to moduli spaces of “elliptic curves with level structure”.
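The three families of subgroups can be told apart by simple membership tests on the entries of a matrix. Here is a minimal sketch, where a matrix is represented as a pair of rows ((a, b), (c, d)). The function names are illustrative.

```python
def in_sl2z(m):
    """Integer 2x2 matrix ((a, b), (c, d)) with determinant 1."""
    (a, b), (c, d) = m
    return a * d - b * c == 1

def in_gamma0(m, n):
    """Upper triangular after reduction mod n."""
    (a, b), (c, d) = m
    return in_sl2z(m) and c % n == 0

def in_gamma1(m, n):
    """Upper triangular with ones on the diagonal after reduction mod n."""
    (a, b), (c, d) = m
    return in_gamma0(m, n) and a % n == 1 % n and d % n == 1 % n

def in_gamma(m, n):
    """Identity after reduction mod n (the principal congruence subgroup)."""
    (a, b), (c, d) = m
    return in_gamma1(m, n) and b % n == 0

examples = [((1, 1), (0, 1)), ((1, 0), (5, 1)), ((2, 1), (5, 3))]
for m in examples:
    print(m, in_gamma0(m, 5), in_gamma1(m, 5), in_gamma(m, 5))
```

The tests are nested in the same way as the groups themselves: \Gamma(N) sits inside \Gamma_{1}(N), which sits inside \Gamma_{0}(N).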

Now we can define the modular forms of weight k with respect to such a congruence subgroup \Gamma. We shall once again require them to be holomorphic functions on the upper half-plane, and we require that for \gamma\in \Gamma written as \left(\begin{array}{cc}a&b\\ c&d\end{array}\right) we must have

\displaystyle f(\gamma(\tau))=(c\tau+d)^{k}f(\tau).

However, the condition that the function be bounded as the imaginary part of \tau goes to infinity must be modified. The reason is that the “point at infinity” is a cusp, a point of the modular curve that does not correspond to an elliptic curve over \mathbb{C} but rather to a “degeneration” of it (this point is therefore not a part of the usual moduli space of elliptic curves –  we can think of it as a “puncture” in this space).

We recall that the construction of the moduli space of elliptic curves over \mathbb{C} starts with the upper half-plane, then we quotient out by the action of \text{SL}_{2}(\mathbb{Z}). The cusps come from taking the union of the rational numbers with the upper half-plane, as well as the point at infinity. When we take the quotient by \text{SL}_{2}(\mathbb{Z}) this all gets sent to the same point, therefore the usual moduli space has only one cusp. But if we take the quotient by a congruence subgroup, we may have several cusps. Therefore, what we really require is for the modular form to be “holomorphic at the cusps“. We can still express this condition in familiar terms by requiring that not f(\tau), but rather (c\tau+d)^{-k}f(\gamma(\tau)) for \gamma\in \text{SL}_{2}(\mathbb{Z}) be bounded as the imaginary part of \tau goes to infinity. We can then define cusp forms with respect to \Gamma by requiring vanishing at the cusps instead. The set of modular forms (resp. cusp forms) of weight k with respect to \Gamma are denoted \mathcal{M}_{k}(\Gamma) (resp. \mathcal{S}_{k}(\Gamma)), and they also have the same structures of being vector spaces and being graded pieces of graded rings as the ones for \text{SL}_{2}(\mathbb{Z}).

Having only discussed the very basics of modular forms we end the post here, with the hope that in the near future we will be able to discuss things such as Hecke operators, modular curves and their Jacobians, and their associated Galois representations. We redirect the interested reader to the references for now.

References:

Modular Form on Wikipedia

Eisenstein Series on Wikipedia

j-invariant on Wikipedia

Congruence Subgroups on Wikipedia

A First Course in Modular Forms by Fred Diamond and Jerry Shurman

Advanced Topics in the Arithmetic of Elliptic Curves by Joseph H. Silverman

Shimura Varieties

In The Moduli Space of Elliptic Curves we discussed how to construct a space whose points correspond to isomorphism classes of elliptic curves over \mathbb{C}. This space is given by the quotient of the upper half-plane by the special linear group \text{SL}_{2}(\mathbb{Z}). Shimura varieties kind of generalize this idea. In some cases their points may correspond to isomorphism classes of abelian varieties over \mathbb{C}, which are higher-dimensional generalizations of elliptic curves in that they are projective varieties whose points form a group, possibly with some additional information.

Using the orbit-stabilizer theorem of group theory, the upper half-plane can also be expressed as the quotient \text{SL}_{2}(\mathbb{R})/\text{SO}(2). Therefore, the moduli space of elliptic curves over \mathbb{C} can be expressed as

\displaystyle \text{SL}_{2}(\mathbb{Z})\backslash\text{SL}_{2}(\mathbb{R})/\text{SO}(2).

If we wanted to parametrize “level structures” as well, we could replace \text{SL}_{2}(\mathbb{Z}) with the principal congruence subgroup \Gamma(N), the subgroup consisting of the matrices in \text{SL}_{2}(\mathbb{Z}) which reduce to the identity matrix when we reduce modulo some natural number N greater than 1. Now we obtain a moduli space of elliptic curves over \mathbb{C} together with a basis of their N-torsion:

Y(N)=\Gamma(N)\backslash\text{SL}_{2}(\mathbb{R})/\text{SO}(2)

We could similarly consider the subgroup \Gamma_{0}(N), the subgroup of \text{SL}_{2}(\mathbb{Z}) containing elements that reduce to an upper-triangular matrix mod N, to parametrize elliptic curves over \mathbb{C} together with a cyclic N-subgroup, or \Gamma_{1}(N), the subgroup of \text{SL}_{2}(\mathbb{Z}) which contains elements that reduce to an upper-triangular matrix with 1 on every diagonal entry mod N, to parametrize elliptic curves over \mathbb{C} together with a point of order N. These give us

Y_{0}(N)=\Gamma_{0}(N)\backslash\text{SL}_{2}(\mathbb{R})/\text{SO}(2)

and

Y_{1}(N)=\Gamma_{1}(N)\backslash\text{SL}_{2}(\mathbb{R})/\text{SO}(2)

Let us discuss some important properties of these moduli spaces, which will help us generalize them. The space \text{SL}_{2}(\mathbb{R})/\text{SO}(2), i.e. the upper-half plane, is an example of a Riemannian symmetric space. This means it is a Riemannian manifold whose group of automorphisms acts transitively – in layperson’s terms, every point looks like every other point – and every point has an associated involution fixing only that point in its neighborhood.

These moduli spaces almost form smooth projective curves, but they have missing points called “cusps” that do not correspond to an isomorphism class of elliptic curves but rather to a “degeneration” of such. We can fill in these cusps to “compactify” these moduli spaces, and we get modular curves X(N), X_{0}(N), and X_{1}(N). On these modular curves live cusp forms, which are modular forms satisfying certain conditions at the cusps. Traditionally these modular forms are defined as functions on the upper-half plane satisfying certain conditions under the action of \text{SL}_{2}(\mathbb{Z}), but when they are cusp forms we may also think of them as sections of line bundles on these modular curves. In particular the cusp forms of “weight 2” are the differential forms on a modular curve.

These modular curves are equipped with Hecke operators, T_{p} and \langle p\rangle for every prime p not dividing N. These are operators on modular forms, but may also be thought of in terms of Hecke correspondences. We recall that elliptic curves over \mathbb{C} are lattices in \mathbb{C}. Take such a lattice \Lambda. The p-th Hecke correspondence is a sum over all the index p sublattices of \Lambda. It is a multivalued function from the modular curve to itself, but the better way to think of such a multivalued function is as a correspondence, a curve inside the product of the modular curve with itself.

With these properties as our guide, let us now proceed to generalize these concepts. One generalization is through the concept of an arithmetic manifold. This is a double coset space

\Gamma\backslash G(\mathbb{R})/K

where G is a semisimple algebraic group over \mathbb{Q}, K is a maximal compact subgroup of G(\mathbb{R}), and \Gamma is an arithmetic subgroup, which means that its intersection with G(\mathbb{Z}) has finite index in both \Gamma and G(\mathbb{Z}). A theorem of Margulis says that, with a handful of exceptions, G(\mathbb{R})/K is a Riemannian symmetric space. Arithmetic manifolds are equipped with Hecke correspondences as well.

Arithmetic manifolds can be difficult to study. However, in certain cases, they form algebraic varieties, in which case we can use the methods of algebraic geometry to study them. For this to happen, the Riemannian symmetric space G(\mathbb{R})/K must have a complex structure compatible with its Riemannian structure, which makes it into a Hermitian symmetric space. The Baily-Borel theorem guarantees that the quotient of a Hermitian symmetric space by an arithmetic subgroup of G(\mathbb{Q}) is an algebraic variety. This is what Shimura varieties accomplish.

To motivate this better, we discuss the idea of Hodge structures. Let V be an n-dimensional real vector space. A (real) Hodge structure on V is a decomposition of its complexification V\otimes\mathbb{C} as follows:

\displaystyle V\otimes\mathbb{C}=\bigoplus_{p,q} V^{p,q}

such that V^{q,p} is the complex conjugate of V^{p,q}. The set of pairs (p,q) for which V^{p,q} is nonzero is called the type of the Hodge structure. Letting V_{n}=\bigoplus_{p+q=n} V^{p,q}, the decomposition V=\bigoplus_{n} V_{n} is called the weight decomposition. An integral Hodge structure is a \mathbb{Z}-module V together with a Hodge structure on V_{\mathbb{R}} such that the weight decomposition is defined over \mathbb{Q}. A rational Hodge structure is defined similarly but with V a finite-dimensional vector space over \mathbb{Q}.

An example of a Hodge structure is given by the singular cohomology of a smooth projective variety over \mathbb{C}:

\displaystyle H^{n}(X(\mathbb{C}),\mathbb{Z})\otimes_{\mathbb{Z}}\mathbb{C}=\bigoplus_{i+j=n}H^{j}(X,\Omega_{X/\mathbb{C}}^{i})

In particular for an abelian variety A, the integral Hodge structure of type (1,0),(0,1) given by the first singular cohomology H^{1}(A(\mathbb{C}),\mathbb{Z}) gives an integral Hodge structure of type (-1,0),(0,-1) on its dual, the first singular homology H_{1}(A(\mathbb{C}),\mathbb{Z}). Specifying such an integral Hodge structure of type (-1,0),(0,-1) on H_{1}(A(\mathbb{C}),\mathbb{Z}) is also the same as specifying a complex structure on H_{1}(A(\mathbb{C}),\mathbb{Z})\otimes_{\mathbb{Z}} \mathbb{R}. In fact, the category of integral Hodge structures of type (-1,0),(0,-1) is equivalent to the category of complex tori.

Let \mathbb{S} be the group \text{Res}_{\mathbb{C}/\mathbb{R}}\mathbb{G}_{\text{m}}. It is the Tannakian group for Hodge structures on finite-dimensional real vector spaces, which basically means that the category of Hodge structures on finite-dimensional real vector spaces is equivalent to the category of representations of \mathbb{S} on finite-dimensional real vector spaces. This lets us redefine a Hodge structure as a pair (V,h) where V is a finite-dimensional real vector space and h is a map from \mathbb{S} to \text{GL}(V).

We have earlier stated that the category of integral Hodge structures of type (-1,0),(0,-1) is equivalent to the category of complex tori. However, not all complex tori are abelian varieties. To obtain an equivalence between some category of Hodge structures and abelian varieties, we therefore need a notion of polarizable Hodge structures. We let \mathbb{R}(n) denote the Hodge structure on \mathbb{R} of type (-n,-n) and define \mathbb{Q}(n) and \mathbb{Z}(n) analogously. A polarization on a real Hodge structure V of weight n is a morphism \Psi of Hodge structures from V\otimes V to \mathbb{R}(-n) such that the bilinear form defined by (u,v)\mapsto \Psi(u,h(i)v) is symmetric and positive definite.

A polarizable Hodge structure is a Hodge structure that can be equipped with a polarization, and it turns out that the functor that assigns to an abelian variety A its first singular homology H_{1}(A(\mathbb{C}),\mathbb{Z}) defines an equivalence of categories between the category of abelian varieties over \mathbb{C} and the category of polarizable integral Hodge structures of type (-1,0),(0,-1).

A Shimura datum is a pair (G,X) where G is a connected reductive group over \mathbb{Q}, and X is a G(\mathbb{R}) conjugacy class of homomorphisms from \mathbb{S} to G, satisfying the following conditions:

  • The composition of any h\in X with the adjoint action of G(\mathbb{R}) on its Lie algebra \mathfrak{g} induces a Hodge structure of type (-1,1),(0,0),(1,-1) on \mathfrak{g}.
  • For any h\in X, conjugation by h(i) is a Cartan involution on G(\mathbb{R})^{\text{ad}}.
  • G^{\text{ad}} has no factor defined over \mathbb{Q} whose real points form a compact group.

Let (G,X) be a Shimura datum. For K a compact open subgroup of G(\mathbb{A}_{f}) where \mathbb{A}_{f} is the finite adeles (the restricted product of completions of \mathbb{Q} over all finite places, see also Adeles and Ideles), the Shimura variety \text{Sh}_{K}(G,X) is the double quotient

\displaystyle G(\mathbb{Q})\backslash (X\times G(\mathbb{A}_{f})/K)

The introduction of adeles serves the purpose of keeping track of the level structures all at once. The space \text{Sh}_{K}(G,X) is a disjoint union of locally symmetric spaces of the form \Gamma\backslash X^{+}, where X^{+} is a connected component of X and \Gamma is an arithmetic subgroup of G(\mathbb{Q})^{+}. By the Baily-Borel theorem, it is an algebraic variety. Taking the inverse limit over compact open subgroups K gives us the Shimura variety at infinite level \text{Sh}(G,X).

Let us now look at some examples. Let G=\text{GL}_{2}, and let X be the conjugacy class of the map

\displaystyle h:a+bi\mapsto\left(\begin{array}{cc}a&b\\ -b&a\end{array}\right)

There is a G(\mathbb{R})-equivariant bijective map from X to \mathbb{C}\setminus \mathbb{R} that sends h to i. Then the Shimura varieties \text{Sh}_{K}(G,X) are disjoint unions of modular curves and the Shimura variety at infinite level \text{Sh}(G,X) classifies isogeny classes of elliptic curves with full level structure.

Let’s look at another example. Let V be a 2n-dimensional symplectic space over \mathbb{Q} with symplectic form \psi. Let G be the group of symplectic similitudes \text{GSp}_{2n}, i.e. for k a \mathbb{Q}-algebra

\displaystyle G(k)=\lbrace g\in \text{GL}(V\otimes k)\vert \psi(gu,gv)=\nu(g)\psi(u,v)\rbrace

where \nu:G(k)\to k^{\times} is called the similitude character. Let J be a complex structure on V_{\mathbb{R}} compatible with the symplectic form \psi and let X be the conjugacy class of the map h that sends a+bi to the linear transformation v\mapsto av+bJv. Then the conjugacy class X is the set of complex structures polarized by \pm\psi. The Shimura varieties \text{Sh}_{K}(G,X) are called Siegel modular varieties and they parametrize isogeny classes of n-dimensional principally polarized abelian varieties with level structure.

There are many other kinds of Shimura varieties, which parametrize abelian varieties with other kinds of extra structure. Just like modular curves, Shimura varieties also have many interesting aspects, from Galois representations (related to their having Hecke correspondences), to certain special points related to the theory of complex multiplication, to special cycles with height pairings generalizing results such as the Gross-Zagier formula in the study of special values of L-functions and their derivatives. There is also an analogous local theory; in this case, ideas from p-adic Hodge theory come into play, where we can further relate the p-adic analogue of Hodge structures and Galois representations. The study of Shimura varieties is a very fascinating aspect of modern arithmetic geometry.

References:

Shimura variety on Wikipedia

Reciprocity Laws and Galois Representations: Recent Breakthroughs by Jared Weinstein

Perfectoid Shimura Varieties by Ana Caraiani

Introduction to Shimura Varieties by J.S. Milne

Lecture Notes for Advanced Number Theory by Jared Weinstein

The Lubin-Tate Formal Group Law

A (one-dimensional, commutative) formal group law f(X,Y) over some ring A is a formal power series in two variables with coefficients in A satisfying the following axioms, which among other things make it behave like an abelian group law:

  • f(X,Y)=X+Y+\text{higher order terms}
  • f(X,Y)=f(Y,X)
  • f(f(X,Y),Z)=f(X,f(Y,Z))

A homomorphism of formal group laws g:f_{1}(X,Y)\to f_{2}(X,Y) is a formal power series g(T) in one variable with no constant term such that g(f_{1}(X,Y))=f_{2}(g(X),g(Y)). An endomorphism of a formal group law is a homomorphism of a formal group law to itself.

As basic examples of formal group laws, we have the additive formal group law \mathbb{G}_{a}(X,Y)=X+Y, and the multiplicative formal group law \mathbb{G}_{m}(X,Y)=X+Y+XY. In this post we will focus on another formal group law called the Lubin-Tate formal group law.
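Since both of these examples are polynomials, we can spot-check the axioms by evaluating them at integers. The sketch below does this. It is only a sanity check on sample points, not a proof of an identity of power series, and the function names are illustrative. The condition f(X,0)=X is a consequence of the axioms rather than one of them, and it is included as an extra check.

```python
from itertools import product

def additive(x, y):
    """G_a(X, Y) = X + Y."""
    return x + y

def multiplicative(x, y):
    """G_m(X, Y) = X + Y + XY, i.e. (1 + X)(1 + Y) - 1."""
    return x + y + x * y

def passes_spot_check(f, samples=range(-3, 4)):
    """Check commutativity, associativity and f(x, 0) = x on integer samples."""
    for x, y, z in product(samples, repeat=3):
        if f(x, y) != f(y, x):
            return False
        if f(f(x, y), z) != f(x, f(y, z)):
            return False
        if f(x, 0) != x:
            return False
    return True

print(passes_spot_check(additive))
print(passes_spot_check(multiplicative))
# A non-example: X + Y + X^2 Y is not even commutative.
print(passes_spot_check(lambda x, y: x + y + x * x * y))
```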

Let F be a nonarchimedean local field and let \mathcal{O}_{F} be its ring of integers. Let A be an \mathcal{O}_{F}-algebra with i:\mathcal{O}_{F}\to A its structure map. A formal \mathcal{O}_{F}-module law over A is a formal group law f(X,Y) such that for every element a of \mathcal{O}_{F} we have an associated endomorphism [a] of f(X,Y), and such that the linear term of this endomorphism as a power series is i(a)X.

Let \pi be a uniformizer (generator of the unique maximal ideal) of \mathcal{O}_{F}. Let q=p^{f} be the cardinality of the residue field of \mathcal{O}_{F}. There is a unique (up to isomorphism) formal \mathcal{O}_{F}-module law over \mathcal{O}_{F} such that the endomorphism [\pi], as a power series, has linear term \pi X and is congruent to X^{q} mod \pi. It is called the Lubin-Tate formal group law and we denote it by \mathcal{G}(X,Y).

The Lubin-Tate formal group law was originally studied by Jonathan Lubin and John Tate for the purpose of studying local class field theory (see Some Basics of Class Field Theory). The results of local class field theory state that the Galois group of the maximal abelian extension of F is isomorphic to the profinite completion \widehat{F}^{\times}. This profinite completion in turn decomposes into the product \mathcal{O}_{F}^{\times}\times \pi^{\widehat{\mathbb{Z}}}.

The factor isomorphic to \mathcal{O}_{F}^{\times} fixes the maximal unramified extension F^{\text{nr}} of F, the factor isomorphic to \pi^{\widehat{\mathbb{Z}}} fixes an infinite, totally ramified extension F_{\pi} of F, and we have that F^{\text{ab}}=F^{\text{nr}}F_{\pi}, where F^{\text{ab}} is the maximal abelian extension of F. The theory of the Lubin-Tate formal group law was developed to study F_{\pi}, taking inspiration from the case where F=\mathbb{Q}_{p}. In this case \pi=p and the infinite totally ramified extension F_{p} is obtained by adjoining to \mathbb{Q}_{p} all p-th power roots of unity, which is also the p-th power torsion of the multiplicative group \mathbb{G}_{m}. We want to generalize \mathbb{G}_{m}, and this is what the Lubin-Tate formal group law accomplishes.
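For F=\mathbb{Q}_{p} we have \pi=p and q=p, and the endomorphism [p] of the multiplicative formal group law is the series (1+X)^{p}-1 (a standard fact that we assume here rather than derive). The following Python sketch checks, for a given integer p, that this series has linear term pX and is congruent to X^{p} mod p, which are the two conditions characterizing the Lubin-Tate formal group law. The function names are illustrative.

```python
from math import comb

def p_series_coefficients(p):
    """Coefficients c_1, ..., c_p of (1 + X)^p - 1 = sum_k C(p, k) X^k."""
    return [comb(p, k) for k in range(1, p + 1)]

def satisfies_lubin_tate_conditions(p):
    """Check: linear term is p*X, and the series is X^p modulo p."""
    coeffs = p_series_coefficients(p)
    linear_ok = coeffs[0] == p
    # All coefficients below degree p must vanish mod p ...
    lower_ok = all(c % p == 0 for c in coeffs[:-1])
    # ... and the coefficient of X^p must be 1.
    top_ok = coeffs[-1] == 1
    return linear_ok and lower_ok and top_ok

for p in (2, 3, 5, 7):
    print(p, satisfies_lubin_tate_conditions(p))
```

The congruence fails when p is not prime (for example p=4, where the coefficient of X^{2} is 6), which is a small reminder that the residue field really does have to have q elements.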

Let \mathcal{G}[\pi^{n}] be the set of all elements in the maximal ideal of the ring of integers of some separable closure of F whose image under the endomorphism [\pi^{n}] is zero. This takes the place of the p-th power roots of unity, and adjoining to F the sets \mathcal{G}[\pi^{n}] for all n gives us the field F_{\pi}.

Furthermore, Lubin and Tate used the theory they developed to make local class field theory explicit in this case. We define the \pi-adic Tate module T_{\pi}(\mathcal{G}) as the inverse limit of \mathcal{G}[\pi^{n}] over all n. This is a free \mathcal{O}_{F}-module of rank 1 and its group of automorphisms is in fact isomorphic to \mathcal{O}_{F}^{\times}. Lubin and Tate proved that this is isomorphic to the Galois group of F_{\pi} over F and explicitly described the reciprocity map of local class field theory in this case as the map from F^{\times} to \text{Gal}(F_{\pi}/F) sending \pi to the identity and an element of \mathcal{O}_{F}^{\times} to the image of its inverse under the above isomorphism.

To study nonabelian extensions, one must consider deformations of the Lubin-Tate formal group. This will lead us to the study of the space of these deformations, called the Lubin-Tate space. This is intended to be the subject of a future blog post.

References:

Lubin-Tate Formal Group Law on Wikipedia

Formal Group Law on Wikipedia

The Geometry of Lubin-Tate Spaces by Jared Weinstein

A Rough Introduction to Lubin-Tate Spaces by Zhiyu Zhang

Formal Groups and Applications by Michiel Hazewinkel

The Arithmetic Site and the Scaling Site

Introduction

In The Riemann Hypothesis for Curves over Finite Fields, we gave a rough outline of Andre Weil’s strategy to prove the analogue of the famous Riemann hypothesis for curves over finite fields. A rather natural question to ask would be, does this strategy give us any suggestions on how to take on the original Riemann hypothesis? We mentioned briefly in The Field with One Element that some mathematicians hope to find in the theory of the so-called “field with one element” something that will allow them to apply the ideas of Weil’s proof to the original Riemann hypothesis, by viewing the scheme \text{Spec}(\mathbb{Z}) as some kind of “curve” over the “field with one element”.

In this post we will consider something along similar lines, examining a kind of “space” to which we can apply an analogue of Weil’s strategy. This approach is due to the mathematicians Alain Connes and Caterina Consani, and makes use of the concepts of sites and toposes (see More Category Theory: The Grothendieck Topos and Even More Category Theory: The Elementary Topos). This is perhaps appropriate, since sites or toposes are often referred to as “generalized spaces”.

We recall from The Riemann Hypothesis for Curves over Finite Fields some aspects of Weil’s strategy. The object in consideration is a curve C over a finite field \mathbb{F}_{q}. In order to write down the zeta function for C, we need to count the number of points over \mathbb{F}_{q^{n}}, for every positive integer n. We can do this by counting the fixed points of powers of the Frobenius morphism. Explicitly this means taking intersection numbers of the diagonal and the divisor formed by integral linear combinations of powers of the Frobenius morphism on \bar{C}\times_{\bar{\mathbb{F}}_{q}}\bar{C}, where \bar{\mathbb{F}}_{q} is an algebraic closure of \mathbb{F}_{q} (it is the direct limit of the directed system formed by all the \mathbb{F}_{q^{n}}) and \bar{C}=C\otimes_{\mathbb{F}_{q}}\bar{\mathbb{F}}_{q}. The number of points of \bar{C} fixed by the n-th power of the Frobenius morphism will be the same as the number of points of C over \mathbb{F}_{q^{n}}. Throughout this post we should keep these steps of Weil’s strategy in mind.

In order to transfer this strategy of Weil to the original Riemann hypothesis, Connes and Consani construct the arithmetic site, meant to be the analogue of C, and the scaling site, meant to be the analogue of \bar{C}. The intuition behind these constructions is that the points of the scaling site, which are the same as the points of the arithmetic site “over \mathbb{R}_{+}^{\text{max}}”, are the same as the points of the “adele class space” \mathbb{Q}^{\times}\backslash\mathbb{A}_{\mathbb{Q}}/\hat{\mathbb{Z}}^{\times}, which originally came up in earlier work of Connes where he constructed a quantum-mechanical system which gives Riemann’s prime-counting function (whose study provided the historical origin of the Riemann hypothesis), in the form of Weil’s “explicit formula”, as a quantum-mechanical trace formula! In essence this work restates the Riemann hypothesis in terms of mathematical language more commonly associated to physics, and is part of Connes’ pioneering work in noncommutative geometry, a new area of mathematics also closely related to physics, in particular quantum mechanics and quantum field theory. In the definition of the adele class space, \mathbb{A}_{\mathbb{Q}} refers to the ring of adeles of \mathbb{Q} (see Adeles and Ideles), while \hat{\mathbb{Z}} refers to \prod_{p}\mathbb{Z}_{p}, where \mathbb{Z}_{p} are the p-adic integers, which can be defined as the inverse limit of the inverse system formed by \mathbb{Z}/p^{n}\mathbb{Z}.

The Arithmetic Site

We now proceed to discuss the arithmetic site. It is described as the pair (\widehat{\mathbb{N}^{\times}},\mathbb{Z}_{\text{max}}), where \widehat{\mathbb{N}^{\times}} is a Grothendieck topos, which, as we may recall from More Category Theory: The Grothendieck Topos, is defined as a category equivalent to the category \text{Sh}(\mathbf{C},J) of sheaves on a site (\mathbf{C},J). In the case of \widehat{\mathbb{N}^{\times}}, \mathbf{C} is the category with only one object, and whose morphisms correspond to the multiplicative monoid of nonzero natural numbers \mathbb{N}^{\times} (we also use \mathbb{N}^{\times} to denote this category, and \mathbb{N}_{0}^{\times} to denote the category with one object and whose morphisms correspond to \mathbb{N}^{\times}\cup\{0\}), while J is the indiscrete, or chaotic, Grothendieck topology, where all presheaves are also sheaves.

As part of the definition of the arithmetic site, we must also specify a structure sheaf. In this case it is provided by the semiring \mathbb{Z}_{\text{max}}. A semiring is like a ring, but is only a monoid, and not a group, under the “addition” operation. It is also sometimes called a “rig”, because it is a ring without the “n” (the negative elements), and the most common example is the natural numbers \mathbb{N} with the usual addition and multiplication. The elements of \mathbb{Z}_{\text{max}} are just the integers, together with -\infty, but the “addition” is provided by the “maximum” operation, and the “multiplication” is provided by the ordinary addition! With the arithmetic site thus defined, we denote it by \mathcal{A}.
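As a small illustration of the semiring \mathbb{Z}_{\text{max}}, the following Python sketch models -\infty by the floating point value float("-inf") and checks a few semiring identities on a handful of sample elements. The names t_add and t_mul are illustrative, and checking finitely many elements is of course only a sanity check.

```python
from itertools import product

NEG_INF = float("-inf")  # the "zero" of Z_max (neutral for the maximum)
ONE = 0                  # the "one" of Z_max (neutral for ordinary addition)

def t_add(a, b):
    """Semiring addition in Z_max: the maximum."""
    return max(a, b)

def t_mul(a, b):
    """Semiring multiplication in Z_max: ordinary addition."""
    return a + b

elements = [NEG_INF, -3, 0, 2, 5]

# Multiplication distributes over addition.
distributive = all(
    t_mul(a, t_add(b, c)) == t_add(t_mul(a, b), t_mul(a, c))
    for a, b, c in product(elements, repeat=3)
)
# Neutral elements, and the "zero" is absorbing for multiplication.
neutral = all(
    t_add(a, NEG_INF) == a and t_mul(a, ONE) == a and t_mul(a, NEG_INF) == NEG_INF
    for a in elements
)
# Addition is idempotent: a + a = a.
idempotent = all(t_add(a, a) == a for a in elements)
print(distributive, neutral, idempotent)
```

The last check, that a "plus" a is again a, is the property usually summarized by saying that such semirings have characteristic 1.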

We digress for a while to discuss the semiring \mathbb{Z}_{\text{max}}, as well as the closely related semirings \mathbb{R}_{\text{max}} (defined similarly to \mathbb{Z}_{\text{max}}, but with the real numbers instead of the integers), \mathbb{R}_{+}^{\text{max}} (whose elements are the nonnegative real numbers, with the addition given by the maximum operation, and the multiplication given by the ordinary multiplication), and the so-called Boolean semifield \mathbb{B} (whose elements are 0 and 1, with the addition again given by the maximum operation, and the multiplication again given by the ordinary multiplication). These semirings have origins in the area of mathematics known as tropical geometry, so named because one of its pioneers, Imre Simon, comes from Brazil, which is a tropical country. However, another source of inspiration is the work of the mathematical physicist Viktor Pavlovich Maslov in “semiclassical” quantum mechanics, where certain approximations could be made as the quantum mechanical systems being studied approached the classical limit. Maslov considered a “conjugated” addition

\displaystyle \lim_{\epsilon\to 0}(x^{\frac{1}{\epsilon}}+y^{\frac{1}{\epsilon}})^{\epsilon}

and this just happened to be the same as \text{max}(x,y).
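We can watch this limit numerically. The Python sketch below evaluates the conjugated addition for positive x and y and a few decreasing values of \epsilon. To avoid overflow for small \epsilon, the maximum of x and y is factored out first, which is an implementation choice of this sketch and does not change the value. The function name maslov_sum is illustrative.

```python
def maslov_sum(x, y, eps):
    """(x^(1/eps) + y^(1/eps))^eps for positive x, y and eps > 0."""
    m = max(x, y)
    # Factor out the maximum so that the large powers stay bounded by 1.
    return m * ((x / m) ** (1 / eps) + (y / m) ** (1 / eps)) ** eps

x, y = 2.0, 3.0
for eps in (1.0, 0.5, 0.1, 0.01):
    print(eps, maslov_sum(x, y, eps))
```

For \epsilon=1 this is the ordinary sum, and as \epsilon shrinks the value approaches the larger of the two numbers.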

Going back to the arithmetic site, we now discuss its points. Recall from Even More Category Theory: The Elementary Topos that a point of a topos (we discussed elementary toposes in that post, but this also applies to Grothendieck toposes) is defined by a geometric morphism from the topos \mathfrak{P} of sheaves of sets on the singleton set (the set with a single element) to the topos. This refers to a pair of adjoint functors such that the left-adjoint is left-exact (preserves finite limits). Therefore, for the arithmetic site, a point p is given by such a pair p^{*} and p_{*} such that p^{*}:\widehat{\mathbb{N}^{\times}}\rightarrow\textbf{Sets} is left-exact. The point p is also uniquely determined by the covariant functor \mathscr{P}=p^{*}\circ\epsilon:\mathbb{N}^{\times}\rightarrow\textbf{Sets} where \epsilon:\mathbb{N}^{\times}\rightarrow\widehat{\mathbb{N}^{\times}} is the Yoneda embedding.

There is an equivalence of categories between the category of points of the arithmetic site and the category of totally ordered groups which are isomorphic to the nontrivial subgroups of (\mathbb{Q},\mathbb{Q}_{+}) and injective morphisms of ordered groups. For such an ordered group \textbf{H} we therefore have a point \mathscr{P}_{\textbf{H}}. This gives us a correspondence with \mathbb{Q}_{+}^{\times}\backslash\mathbb{A}_{\mathbb{Q}}^{f}/\hat{\mathbb{Z}}^{\times} (where \mathbb{A}_{\mathbb{Q}}^{f} refers to the ring of finite adeles of \mathbb{Q}, which is defined similarly to the ring of adeles of \mathbb{Q} except that the infinite prime is not considered) because any such ordered group \textbf{H} is of the form \textbf{H}_{a}, the ordered group of all rational numbers q such that aq\in\hat{\mathbb{Z}}, for some unique a\in \mathbb{A}_{\mathbb{Q}}^{f}/\hat{\mathbb{Z}}^{\times}. We can also now describe the stalk of the structure sheaf \mathbb{Z}_{\text{max}} at the point \mathscr{P}_{\textbf{H}}; it is isomorphic to the semiring H_{\text{max}}, with elements given by the set (\textbf{H}\cup\{-\infty\}), addition given by the maximum operation, and multiplication given by the ordinary addition.

The arithmetic site is analogous to the curve C over the finite field \mathbb{F}_{q}. As for the finite field \mathbb{F}_{q}, its analogue is given by the Boolean semifield \mathbb{B} mentioned earlier, which has “characteristic 1”, reminiscent of the field with one element (see The Field with One Element). Next we want to find the analogues of the algebraic closure \bar{\mathbb{F}}_{q}, as well as the Frobenius morphism. The former is given by the semiring \mathbb{R}_{+}^{\text{max}}, which contains \mathbb{B}, while the latter is given by the multiplicative group of the positive real numbers \mathbb{R}_{+}^{\times}, as it is isomorphic to the group of automorphisms of \mathbb{R}_{+}^{\text{max}} that keep \mathbb{B} fixed.

But while we do know that the points of the arithmetic topos are given by geometric morphisms p:\mathfrak{P}\rightarrow \widehat{\mathbb{N}^{\times}} and determined by covariant functors \mathscr{P}_{\textbf{H}}:\mathbb{N}^{\times}\rightarrow\textbf{Sets}, what do we mean by its “points over \mathbb{R}_{+}^{\text{max}}”? A point of the arithmetic site “over \mathbb{R}_{+}^{\text{max}}” refers to the pair (\mathscr{P}_{\textbf{H}},f_{\mathscr{P}_{\textbf{H}}}^{\#}), where \mathscr{P}_{\textbf{H}}:\mathbb{N}^{\times}\rightarrow\textbf{Sets} is as earlier, and f_{\mathscr{P}_{\textbf{H}}}^{\#}:H_{\text{max}}\rightarrow\mathbb{R}_{+}^{\text{max}} is a morphism of semirings (we recall that H_{\text{max}} is the stalk of the structure sheaf \mathbb{Z}_{\text{max}} at \mathscr{P}_{\textbf{H}}). The points of the arithmetic site over \mathbb{R}_{+}^{\text{max}} include its points “over \mathbb{B}”, which are what we discussed earlier, and mentioned to be in correspondence with \mathbb{Q}_{+}^{\times}\backslash\mathbb{A}_{\mathbb{Q}}^{f}/\hat{\mathbb{Z}}^{\times}. But in addition, there are also other points of the arithmetic site over \mathbb{R}_{+}^{\text{max}} which are in correspondence with \mathbb{Q}_{+}^{\times}\backslash((\mathbb{A}_{\mathbb{Q}}^{f}/\hat{\mathbb{Z}}^{\times})\times\mathbb{R}_{+}^{\times}), just as \mathbb{R}_{+}^{\text{max}} contains all of \mathbb{B} but also other elements. Altogether, the points of the arithmetic site over \mathbb{R}_{+}^{\text{max}} correspond to the disjoint union of \mathbb{Q}_{+}^{\times}\backslash\mathbb{A}_{\mathbb{Q}}^{f}/\hat{\mathbb{Z}}^{\times} and \mathbb{Q}_{+}^{\times}\backslash((\mathbb{A}_{\mathbb{Q}}^{f}/\hat{\mathbb{Z}}^{\times})\times\mathbb{R}_{+}^{\times}), which is \mathbb{Q}^{\times}\backslash\mathbb{A}_{\mathbb{Q}}/\hat{\mathbb{Z}}^{\times}, the adele class space as mentioned earlier.

There is a geometric morphism \Theta:\text{Spec}(\mathbb{Z})\rightarrow \widehat{\mathbb{N}_{0}^{\times}} (here \widehat{\mathbb{N}_{0}^{\times}} is defined similarly to \widehat{\mathbb{N}^{\times}}, but with \mathbb{N}_{0}^{\times} in place of \mathbb{N}^{\times}) uniquely determined by

\displaystyle \Theta^{*}:\mathbb{N}_{0}^{\times}\rightarrow \text{Sh}(\text{Spec}(\mathbb{Z}))

which sends the single object of \mathbb{N}_{0}^{\times} to the sheaf \mathcal{S} on \text{Spec}(\mathbb{Z}), which we now describe. Let H_{p} denote the set of all rational numbers q such that a_{p}q is an element of \hat{\mathbb{Z}}, where a_{p} is the adele with a 0 for the p-th component and 1 for all other components. Then the sheaf \mathcal{S} can be described in terms of its stalks: the stalk \mathcal{S}_{p} at a prime p is given by H_{p}^{+}, the positive part of H_{p}, and the stalk \mathcal{S}_{0} at the generic point is given by \{0\}. The sections \Gamma(U,\mathcal{S}) are given by the maps \xi:U\rightarrow \coprod_{p}H_{p}^{+} such that \xi_{p}\neq 0 for only finitely many p\in U.
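Unwinding the definition of H_{p}: since a_{p} has a 0 in the p-th component and 1 elsewhere, the condition that a_{p}q lies in \hat{\mathbb{Z}} says that q is integral at every prime other than p, with no condition at p. In other words, H_{p} consists of the rational numbers whose denominator is a power of p. The following Python sketch tests membership under this reading; the function name in_H_p is illustrative.

```python
from fractions import Fraction

def in_H_p(q, p):
    """True if the rational number q has a denominator that is a power of p."""
    d = Fraction(q).denominator  # Fraction stores q in lowest terms
    while d % p == 0:
        d //= p
    return d == 1

print(in_H_p(Fraction(5, 9), 3))   # denominator 9 = 3^2
print(in_H_p(Fraction(1, 6), 3))   # denominator 6 involves the prime 2
print(in_H_p(7, 3))                # integers always qualify
```

The positive part H_{p}^{+} then consists of the elements of H_{p} that are greater than or equal to zero or strictly positive, depending on convention, which the sketch does not try to settle.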

The Scaling Site

Now that we have defined the arithmetic site, which is the analogue of C, and the points of the arithmetic site over \mathbb{R}_{+}^{\text{max}}, which is the analogue of the points of C over the algebraic closure \bar{\mathbb{F}}_{q}, we now proceed to define the scaling site, which is the analogue of \bar{C}=C\otimes_{\mathbb{F}_{q}}\bar{\mathbb{F}}_{q}. The points of the scaling site are the same as the points of the arithmetic site over \mathbb{R}_{+}^{\text{max}}, which is analogous to the points of \bar{C} being the same as the points of C over \bar{\mathbb{F}}_{q}. But the importance of the scaling site lies in the fact that we can construct the analogue of a sheaf of rational functions on it, and a Riemann-Roch theorem, which, as we may recall from The Riemann Hypothesis for Curves over Finite Fields, is also an important part of Weil’s proof.

The scaling site is once again given by a pair ([0,\infty)\rtimes\mathbb{N}^{\times},\mathcal{O}), where [0,\infty)\rtimes\mathbb{N}^{\times} is a Grothendieck topos and \mathcal{O} is a structure sheaf, but both are quite sophisticated constructions compared to the arithmetic site. To describe the Grothendieck topos [0,\infty)\rtimes\mathbb{N}^{\times} we recall that it must be a category equivalent to the category \text{Sh}(\mathbf{C},J) of sheaves on some site (\mathbf{C},J). Here \mathbf{C} is the category whose objects are given by bounded open intervals \Omega\subset [0,\infty), including the empty interval \emptyset, and whose morphisms are given by

\displaystyle \text{Hom}(\Omega,\Omega')=\{n\in\mathbb{N}^{\times}|n\Omega\subset\Omega'\}

and in the special case that \Omega is the empty interval \emptyset, we have

\displaystyle \text{Hom}(\Omega,\Omega')=\{*\}.
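To get a feel for these morphism sets, here is a Python sketch that lists \text{Hom}(\Omega,\Omega') explicitly. As a simplifying assumption, a nonempty interval is represented by a pair (a, b) standing for the open interval from a to b with 0\leq a<b (so intervals containing the endpoint 0 are ignored), and the empty interval is represented by None. The function name hom and the marker "*" for the unique morphism out of the empty interval are illustrative.

```python
def hom(omega, omega_prime):
    """List the morphisms from omega to omega_prime in the scaling site category.

    A nonempty interval is a pair (a, b) with 0 <= a < b; None is the empty interval.
    """
    if omega is None:
        return ["*"]          # a single morphism out of the empty interval
    if omega_prime is None:
        return []             # a nonempty interval never maps into the empty one
    a, b = omega
    c, d = omega_prime
    result = []
    n = 1
    # n * (a, b) = (n*a, n*b) lies in (c, d) exactly when c <= n*a and n*b <= d.
    # Since b > 0 and d is finite, only finitely many n need to be tried.
    while n * b <= d:
        if n * a >= c:
            result.append(n)
        n += 1
    return result

print(hom((1, 2), (2, 7)))
print(hom(None, (1, 2)))
print(hom((1, 3), (1, 2)))
```

Composition of morphisms is multiplication of the corresponding natural numbers, so for instance 2 in \text{Hom}((1,2),(2,7)) followed by 2 in \text{Hom}((2,7),(4,30)) gives 4 in \text{Hom}((1,2),(4,30)).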

The Grothendieck topology J here is defined by the collection K(\Omega) of all ordinary covers of \Omega for any object \Omega of the category \mathbf{C}:

\displaystyle \{\Omega_{i}\}_{i\in I}=\{\Omega_{i}\subset\Omega|\cup_{i}\Omega_{i}=\Omega\}

Now we have to describe the structure sheaf \mathcal{O}. We start by considering \mathbb{Z}_{\text{max}}, the structure sheaf of the arithmetic site. By “extension of scalars” from \mathbb{B} to \mathbb{R}_{+}^{\text{max}} we obtain the reduced semiring \mathbb{Z}_{\text{max}}\hat{\otimes}_{\mathbb{B}}\mathbb{R}_{+}^{\text{max}}. This is not yet the structure sheaf \mathcal{O}, because the underlying category and Grothendieck topology for the scaling site are more complicated than those of the arithmetic site, and unlike the case for the arithmetic site, for the scaling site not every presheaf is a sheaf. So we must first “localize” \mathbb{Z}_{\text{max}}\hat{\otimes}_{\mathbb{B}}\mathbb{R}_{+}^{\text{max}}, and this gives us the structure sheaf \mathcal{O}.

Let us describe \mathbb{Z}_{\text{max}}\hat{\otimes}_{\mathbb{B}}\mathbb{R}_{+}^{\text{max}} in more detail. Let H be a rank 1 subgroup of \mathbb{R}. Then an element of H_{\text{max}}\hat{\otimes}_{\mathbb{B}}\mathbb{R}_{+}^{\text{max}} is given by a Newton polygon N\subset\mathbb{R}^{2}, which is the convex hull of the union of finitely many quadrants (x_{j},y_{j})-Q, where Q=H\times\mathbb{R}_{+} and (x_{j},y_{j})\in H\times\mathbb{R} (a set is a convex set if it contains the line segment connecting any two of its points; the convex hull of a set is the smallest convex set that contains it). The Newton polygon N is uniquely determined by the function

\displaystyle \ell_{N}(\lambda)=\text{max}(\lambda x_{j}+y_{j})

for \lambda\in\mathbb{R}_{+}. This correspondence gives us an isomorphism between H_{\text{max}}\hat{\otimes}_{\mathbb{B}}\mathbb{R}_{+}^{\text{max}} and \mathcal{R}(H), the semiring of convex, piecewise affine, continuous functions on [0,\infty) with slopes in H\subset\mathbb{R} and finitely many singularities, with the pointwise operations (a function is a convex function if the points on and above its graph form a convex set).
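The function \ell_{N} is easy to experiment with. In the Python sketch below a Newton polygon is represented only by its finite list of generating points (x_{j},y_{j}), taking H=\mathbb{Z} for the example, and \ell_{N} is evaluated as the maximum of the affine functions \lambda x_{j}+y_{j}. The sketch checks midpoint convexity on a grid of rational sample points, and also checks that taking the union of the generating points corresponds to the pointwise maximum while adding generating points pairwise corresponds to the pointwise sum. The function names and the sample data are illustrative assumptions of this sketch.

```python
from fractions import Fraction

def ell(points, lam):
    """Evaluate ell_N(lam) = max_j (lam * x_j + y_j) for generating points (x_j, y_j)."""
    return max(lam * x + y for x, y in points)

def tropical_sum(points_a, points_b):
    """Generating points for the pointwise maximum of the two functions."""
    return points_a + points_b

def tropical_product(points_a, points_b):
    """Generating points for the pointwise (ordinary) sum of the two functions."""
    return [(xa + xb, ya + yb) for xa, ya in points_a for xb, yb in points_b]

# Hypothetical examples with H = Z; the slopes of ell_N are 0, 1 and 3.
N = [(0, 1), (1, 0), (3, -4)]
M = [(2, -1), (-1, 3)]
samples = [Fraction(k, 4) for k in range(0, 17)]  # lam in [0, 4]

convex = all(
    ell(N, (s + t) / 2) <= (ell(N, s) + ell(N, t)) / 2
    for s in samples for t in samples
)
sum_ok = all(ell(tropical_sum(N, M), s) == max(ell(N, s), ell(M, s)) for s in samples)
product_ok = all(ell(tropical_product(N, M), s) == ell(N, s) + ell(M, s) for s in samples)
print(convex, sum_ok, product_ok)
```

Convexity is automatic here, since a maximum of finitely many affine functions is always convex and piecewise affine; the check is only meant to make that visible.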

Therefore, we can describe the sections \Gamma(\Omega,\mathcal{O}) of the structure sheaf \mathcal{O}, for any bounded open interval \Omega, as the set of all convex, piecewise affine, continuous functions from \Omega to \mathbb{R}_{\text{max}} with slopes in \mathbb{Z}. We can also likewise describe the stalks of the structure sheaf \mathcal{O}. For a point \mathfrak{p}_{H}:[0,\infty)\rtimes\mathbb{N}^{\times}\rightarrow\textbf{Sets} associated to a rank 1 subgroup H\subset\mathbb{R}, the stalk \mathcal{O}_{\mathfrak{p}_{H}} is given by the semiring \mathcal{R}_{H} of germs of \mathbb{R}_{+}^{\text{max}}-valued, convex, piecewise affine, continuous functions with slopes in H. We also have points \mathfrak{p}_{H}^{0}:[0,\infty)\rtimes\mathbb{N}^{\times}\rightarrow\textbf{Sets} with “support \{0\}”, corresponding to the points of the arithmetic site over \mathbb{B}. For such a point, the stalk \mathcal{O}_{\mathfrak{p}_{H}^{0}} is given by the semiring (H\times\mathbb{R})_{\text{max}} associated to the totally ordered group H\times\mathbb{R}.

Now that we have described the Grothendieck topos [0,\infty)\rtimes\mathbb{N}^{\times} and the structure sheaf \mathcal{O}, we describe the scaling site as being given by the pair ([0,\infty)\rtimes\mathbb{N}^{\times},\mathcal{O}), and we denote it by \hat{\mathcal{A}}.

Our next task, now that we have described the arithmetic site and the scaling site, is to find the analogue of the Riemann-Roch theorem. We start by noting that we have a sheaf of semifields \mathcal{K}, defined by letting \mathcal{K}(\Omega) be the semifield of fractions of \mathcal{O}(\Omega). For an element f in the stalk \mathcal{K}_{\mathfrak{p}_{H}} of \mathcal{K}, we define its order as

\displaystyle \text{Order}_{H}(f):=h_{+}-h_{-}

where

\displaystyle h_{\pm}:=\lim_{\epsilon\to 0_{\pm}}(f((1+\epsilon)H)-f(H))/\epsilon

for \epsilon\in\mathbb{R}_{+}.

We let C_{p} be the set of all points \mathfrak{p}_{H}:[0,\infty)\rtimes\mathbb{N}^{\times}\rightarrow\textbf{Sets} of the scaling site \hat{\mathcal{A}} such that H is isomorphic to H_{p}. The C_{p} are the analogues of the orbits of Frobenius. There is a topological isomorphism \eta_{p}:\mathbb{R}_{+}^{\times}/p^{\mathbb{Z}}\rightarrow C_{p}. It is worth noting that the expression \mathbb{R}_{+}^{\times}/p^{\mathbb{Z}} is reminiscent of the Tate uniformization of an elliptic curve (which generalizes the idea that an elliptic curve over the complex numbers is the quotient of the complex plane by a lattice to other complete fields besides the complex numbers – see The Moduli Space of Elliptic Curves).

We have a pullback sheaf \eta_{p}^{*}(\mathcal{O}|_{C_{p}}), which we denote suggestively by \mathcal{O}_{p}. It is the sheaf on \mathbb{R}_{+}^{\times}/p^{\mathbb{Z}} whose sections are convex, piecewise affine, continuous functions with slopes in H_{p}. We can consider the sheaf of quotients \mathcal{K}_{p} of \mathcal{O}_{p} and its global sections f:\mathbb{R}_{+}^{\times}\rightarrow\mathbb{R}, which are piecewise affine, continuous functions with slopes in H_{p} such that f(p\lambda)=f(\lambda) for all \lambda\in\mathbb{R}_{+}^{\times}. Defining

\displaystyle \text{Order}_{\lambda}(f):=\text{Order}_{\lambda H_{p}}(f\circ\eta_{p}^{-1})

we have the following property for any f\in H^{0}(\mathbb{R}_{+}^{\times}/p^{\mathbb{Z}},\mathcal{K}_{p}) (recall that the zeroth cohomology group H^{0}(\mathbb{R}_{+}^{\times}/p^{\mathbb{Z}},\mathcal{K}_{p}) is defined as the space of global sections of \mathcal{K}_{p}):

\displaystyle \sum_{\lambda\in\mathbb{R}_{+}^{\times}/p^{\mathbb{Z}}}\text{Order}_{\lambda}(f)=0
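This property can be checked by hand on simple examples. The Python sketch below assumes the following reading of the definitions: for a function f of \lambda, the one-sided limits h_{\pm} at \lambda equal \lambda times the right and left derivatives of f, so that \text{Order}_{\lambda}(f) is \lambda times the jump in slope at \lambda. A function on \mathbb{R}_{+}^{\times}/p^{\mathbb{Z}} is described by its break points and slopes on the fundamental domain [1,p], and the relation f(p\lambda)=f(\lambda) is used to find the left slope at \lambda=1. The function names and the data layout are illustrative.

```python
from fractions import Fraction

def is_periodic(breakpoints, slopes):
    """Check f(p) = f(1), i.e. the slopes times interval lengths sum to zero."""
    total = sum(
        s * (breakpoints[i + 1] - breakpoints[i]) for i, s in enumerate(slopes)
    )
    return total == 0

def orders(p, breakpoints, slopes):
    """Orders of f at its break points in [1, p).

    breakpoints: 1 = l_0 < l_1 < ... < l_n = p
    slopes:      slopes[i] is the slope of f on (l_i, l_{i+1})
    """
    result = {}
    # Interior break points: lam * (right slope - left slope).
    for i in range(1, len(slopes)):
        result[breakpoints[i]] = breakpoints[i] * (slopes[i] - slopes[i - 1])
    # At lam = 1 the left slope comes from the previous period:
    # f(lam) = f(p * lam) gives f'(1-) = p * f'(p-).
    result[breakpoints[0]] = breakpoints[0] * (slopes[0] - p * slopes[-1])
    return result

# Two hypothetical examples for p = 3, with slopes in H_3.
example_1 = ([Fraction(1), Fraction(2), Fraction(3)], [Fraction(1), Fraction(-1)])
example_2 = ([Fraction(1), Fraction(3, 2), Fraction(3)], [Fraction(2), Fraction(-2, 3)])

for breakpoints, slopes in (example_1, example_2):
    print(is_periodic(breakpoints, slopes), sum(orders(3, breakpoints, slopes).values()))
```

In the first example the orders are 4 at \lambda=1 and -4 at \lambda=2, so the function has a "zero" and a "pole" that cancel in the sum.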

We now want to define the analogue of divisors on C_{p} (see Divisors and the Picard Group). A divisor D on C_{p} is a section C_{p}\rightarrow H, mapping \mathfrak{p}_{H}\in C_{p} to D(H)\in H, of the bundle of pairs (H,h), where H\subset\mathbb{R} is isomorphic to H_{p}, and h\in H. We define the degree of a divisor D as follows:

\displaystyle \text{deg}(D)=\sum_{\mathfrak{p}_{H}\in C_{p}}D(H)

Given a point \mathfrak{p}_{H}\in C_{p} such that H=\lambda H_{p} for some \lambda\in\mathbb{R}_{+}^{\times}, we have a map \lambda^{-1}:H\rightarrow H_{p}. This gives us a canonical mapping

\displaystyle \chi: H\rightarrow H_{p}/(p-1)H_{p}\simeq\mathbb{Z}/(p-1)\mathbb{Z}

Given a divisor D on C_{p}, we define

\displaystyle \chi(D):=\sum_{\mathfrak{p}_{H}\in C_{p}}\chi(D(H))\in\mathbb{Z}/(p-1)\mathbb{Z}

We have \text{deg}(D)=0 and \chi(D)=0 if and only if D=(f), for f\in H^{0}(\mathbb{R}_{+}^{\times}/p^{\mathbb{Z}},\mathcal{K}_{p}), i.e. D is a principal divisor.

We define the group J(C_{p}) as the quotient \text{Div}^{0}(C_{p})/\mathcal{P} of the group \text{Div}^{0}(C_{p}) of divisors of degree 0 on C_{p} by the group \mathcal{P} of principal divisors on C_{p}. The group J(C_{p}) is isomorphic to \mathbb{Z}/(p-1)\mathbb{Z}, while the group \text{Div}(C_{p})/\mathcal{P} of divisors on C_{p} modulo the principal divisors is isomorphic to \mathbb{R}\times(\mathbb{Z}/(p-1)\mathbb{Z}).

In order to state the analogue of Riemann-Roch theorem we need to define the following module over \mathbb{R}_{+}^{\text{max}}:

\displaystyle H^{0}(D):=\{f\in\mathcal{K}_{p}|D+(f)\geq 0\}

Given f\in H^{0}(C_{p},\mathcal{K}_{p}), we define

\displaystyle \|f\|_{p}:=\text{max}\{|h(\lambda)|_{p}/\lambda,\lambda\in C_{p}\}

where h(\lambda) is the slope of f at \lambda and |\cdot|_{p} is the p-adic absolute value. Then we have the following increasing filtration on H^{0}(D):

\displaystyle H^{0}(D)^{\rho}:=\{f\in H^{0}(D)|\|f\|_{p}\leq\rho\}

This allows us to define the following notion of dimension for H^{0}(D) (here \text{dim}_{\text{top}} refers to what is known as the topological dimension or Lebesgue covering dimension, a notion of dimension defined in terms of refinements of open covers):

\displaystyle \text{Dim}_{\mathbb{R}}(H^{0}(D))=\lim_{n\to\infty}p^{-n}\text{dim}_{\text{top}}(H^{0}(D)^{p^{n}})

The analogue of the Riemann-Roch theorem is now given by the following:

\displaystyle \text{Dim}_{\mathbb{R}}(H^{0}(D))-\text{Dim}_{\mathbb{R}}(H^{0}(-D))=\text{deg}(D)

S-Algebras

This concludes our discussion of the arithmetic site and the scaling site, but I would like to discuss one more related topic also being explored by Connes and Consani – the use of \mathbb{S}-algebras, which is closely related to the \Gamma-sets we have already introduced in The Field with One Element. Both of these concepts have their origins in homotopy theory.

We recall from the short discussion at the end of The Riemann Hypothesis for Curves over Finite Fields that the Weil conjectures, which are Weil’s generalization of the Riemann hypothesis for curves over finite fields to varieties of higher dimension, were proven by making use of cohomology (in particular etale cohomology) to find the fixed points of the powers of the Frobenius morphism (the formula that gives us the fixed points of a map using cohomology is called the Lefschetz fixed point formula). Now, concepts such as monoids, semirings, and many others (including the mathematician Nikolai Durov’s approach to the field with one element, which he also uses to develop a new version of Arakelov geometry) are all subsumed under the concept of \mathbb{S}-algebras, and doing so allows us to make use of a cohomology theory called topological cyclic cohomology. Connes and Consani hope that topological cyclic cohomology will help prove the original Riemann hypothesis the way that etale cohomology helped prove the Weil conjectures. Let us discuss briefly the work of Connes and Consani on this topic.

We recall from The Field with One Element the definition of a \Gamma-set (there also referred to as a \Gamma-space). A \Gamma-set is defined to be a covariant functor from the category \Gamma^{\text{op}}, whose objects are pointed finite sets and whose morphisms are basepoint-preserving maps of finite sets, to the category \textbf{Sets}_{*} of pointed sets. An \mathbb{S}-algebra is defined to be a \Gamma-set \mathscr{A}:\Gamma^{\text{op}}\rightarrow \textbf{Sets}_{*} together with an associative multiplication \mu:\mathscr{A}\wedge \mathscr{A}\rightarrow\mathscr{A} and a unit 1:\mathbb{S}\rightarrow\mathscr{A}, where \mathbb{S}:\Gamma^{\text{op}}\rightarrow\textbf{Sets}_{*} is the inclusion functor (also known as the sphere spectrum). An \mathbb{S}-algebra is a monoid in the symmetric monoidal category of \Gamma-sets with the smash product and the sphere spectrum.

Any monoid M defines an \mathbb{S}-algebra \mathbb{S}M via the following definition:

\displaystyle \mathbb{S}M(X):=M\wedge X

for any pointed finite set X. Here M\wedge X is the smash product of M and X as pointed sets, with the basepoint for M given by its zero element. The maps are given by \text{Id}_{M}\times f, for f:X\rightarrow Y.

Similarly, any semiring R defines an \mathbb{S}-algebra HR via the following definition:

\displaystyle HR(X):=R^{X/*}

for any pointed finite set X. Here R^{X/*} refers to the set of basepoint preserving maps from X to R, with the basepoint of R given by its zero element. The maps HR(f) are given by HR(f)(\phi)(y):=\sum_{x\in f^{-1}(y)}\phi(x) for f:X\rightarrow Y, x\in X, and y\in Y. The multiplication HR(X)\wedge HR(Y)\rightarrow HR(X\wedge Y) is given by \phi\psi(x,y)=\phi(x)\psi(y) for any x\in X\setminus * and y\in Y\setminus *. The unit 1_{X}:X\rightarrow HR(X) is given by 1_{X}(x)=\delta_{x} for all x in X, where \delta_{x}(y)=1 if x=y, and 0 otherwise.
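The formula for HR(f) just says that values are added up along the fibers of f. Here is a Python sketch of this, under the following illustrative conventions: a pointed finite set is represented by the list of its non-basepoint elements, the basepoint is the string "*", and an element \phi of HR(X) is stored as a dictionary assigning a value \phi(x) in R to each non-basepoint x (the basepoint is always sent to zero, so it is left out). The semiring R enters only through its addition and its zero element, which are passed in as arguments.

```python
BASEPOINT = "*"

def push_forward(f, phi, targets, add, zero):
    """Compute HR(f)(phi): y -> sum of phi(x) over all x with f(x) = y.

    f       : dict sending each non-basepoint x of X to an element of Y
              (possibly the basepoint)
    phi     : dict sending each non-basepoint x of X to a value in R
    targets : the non-basepoint elements of Y
    add     : the addition of the semiring R
    zero    : the neutral element for add
    """
    result = {y: zero for y in targets}
    for x, value in phi.items():
        y = f[x]
        if y != BASEPOINT:  # anything sent to the basepoint is discarded
            result[y] = add(result[y], value)
    return result

f = {"a": "u", "b": "u", "c": BASEPOINT}

# R = natural numbers with ordinary addition.
phi_nat = {"a": 2, "b": 3, "c": 7}
print(push_forward(f, phi_nat, ["u", "v"], lambda s, t: s + t, 0))

# R = Boolean semifield B, where addition is the maximum.
phi_bool = {"a": 1, "b": 0, "c": 1}
print(push_forward(f, phi_bool, ["u", "v"], max, 0))
```

Only the additive structure of R is needed to define the functor; the multiplication of R is what provides the multiplication of the \mathbb{S}-algebra.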

Therefore we can see that the notion of \mathbb{S}-algebra subsumes the notions of monoids and semirings, and other notions such as that of “hyperrings“, which we leave to the references for the moment. Instead, we will discuss how \mathbb{S}-algebras are related to the approach of Durov to the field with one element and Arakelov geometry. As we mentioned in Arakelov Geometry, the main idea of the theory is to consider the “infinite prime” along with the other points of \text{Spec}(\mathbb{Z}). We therefore define \overline{\text{Spec}(\mathbb{Z})} as \text{Spec}(\mathbb{Z})\cup \{\infty\}. Let \mathcal{O}_{\text{Spec}(\mathbb{Z})} be the structure sheaf of \text{Spec}(\mathbb{Z}). We want to extend this to a structure sheaf on \overline{\text{Spec}(\mathbb{Z})}, and to accomplish this we will use the functor H from semirings to \mathbb{S}-algebras defined earlier. For any open set U containing \infty, we define

\displaystyle \mathcal{O}_{\overline{\text{Spec}(\mathbb{Z})}}(U):=\|H\mathcal{O}_{\text{Spec}(\mathbb{Z})}(U\cap\text{Spec}(\mathbb{Z}))\|_{1}.

The notation \|\|_{1} is defined for the \mathbb{S}-algebra HR associated to the semiring R as follows:

\displaystyle \|HR(X)\|_{1}:=\{\phi\in HR(X)|\sum_{x\in X\setminus *}\|\phi(x)\|\leq 1\}

where \|\| in this particular case comes from the usual absolute value on \mathbb{Q}. This becomes available to us because the sheaf \mathcal{O}_{\overline{\text{Spec}(\mathbb{Z})}} is a subsheaf of the constant sheaf \mathbb{Q}.
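As a toy illustration of the condition defining \|HR(X)\|_{1}, take R=\mathbb{Z} with the usual absolute value (an illustrative choice, not the sheaf above). Any value of absolute value 2 or more already violates the bound, so it suffices to search among maps with values in \{-1,0,1\}. The Python sketch below lists all such maps on a small pointed set, again represented by its non-basepoint elements; the function name norm_one_ball is illustrative.

```python
from itertools import product

def norm_one_ball(points, values=(-1, 0, 1)):
    """List all integer-valued phi on the given points with sum |phi(x)| <= 1."""
    ball = []
    for choice in product(values, repeat=len(points)):
        if sum(abs(v) for v in choice) <= 1:
            ball.append(dict(zip(points, choice)))
    return ball

ball = norm_one_ball(["a", "b", "c"])
print(len(ball))
for phi in ball:
    print(phi)
```

The result consists of the zero map together with plus or minus the delta function at each point, so for a set with k non-basepoint elements there are 2k+1 such maps.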

Given an Arakelov divisor on \overline{\text{Spec}(\mathbb{Z})} (in this context an Arakelov divisor is given by a pair (D_{\text{finite}},D_{\infty}), where D_{\text{finite}} is an ordinary divisor on \text{Spec}(\mathbb{Z}) and D_{\infty} is a real number) we can define the following sheaf of \mathcal{O}_{\overline{\text{Spec}(\mathbb{Z})}}-modules over \overline{\text{Spec}(\mathbb{Z})}:

\displaystyle \mathcal{O}_{\overline{\text{Spec}(\mathbb{Z})}}(D)(U):=\|H\mathcal{O}(D_{\text{finite}})(U\cap\text{Spec}(\mathbb{Z}))\|_{e^{a}}

where a is the real number “coefficient” of D_{\infty}, and \|\|_{\lambda} means, for an R-module E (here the \mathbb{S}-algebra HE is constructed the same as HR, except there is no multiplication or unit) with seminorm \|\|^{E} such that \|a\xi\|^{E}\leq\|a\|\|\xi\|^{E} for a\in R and \xi\in E,

\displaystyle \|HE(X)\|_{\lambda}:=\{\phi\in HE(X)|\sum_{x\in X\setminus *}\|\phi(x)\|^{E}\leq \lambda\}

With such sheaves of \mathbb{S}-algebras on \overline{\text{Spec}(\mathbb{Z})} now constructed, the tools of topological cyclic cohomology can be applied to them. The theory of topological cyclic cohomology is left to the references for now, but will hopefully be discussed in future posts on this blog.

Conclusion

The approach of Connes and Consani, whether making use of the arithmetic site and the scaling site to apply Weil’s strategy to the original Riemann hypothesis, or making use of \mathbb{S}-algebras and topological cyclic cohomology in analogy with the proof of the Weil conjectures, is still currently facing several technical obstacles. In the former case, an intersection theory and a Riemann-Roch theorem on the square of the scaling site are yet to be constructed. In the latter, there is the problem of appropriate coefficients for the cohomology theory. There are already several proposed strategies for dealing with these obstacles. Such efforts, aside from aiming to prove the Riemann hypothesis, widen the scope of the mathematics that we have today, and, perhaps more importantly, uncover more and more the mysterious geometry underlying the familiar everyday concept of numbers.

References:

On the Geometry of the Adele Class Space of Q by Caterina Consani

An Essay on the Riemann Hypothesis by Alain Connes

The Arithmetic Site by Alain Connes and Caterina Consani

Geometry of the Arithmetic Site by Alain Connes and Caterina Consani

The Scaling Site by Alain Connes and Caterina Consani

Geometry of the Scaling Site by Alain Connes and Caterina Consani

Absolute Algebra and Segal’s Gamma Sets by Alain Connes and Caterina Consani

New Approach to Arakelov Geometry by Nikolai Durov