The Arithmetic Site and the Scaling Site


In The Riemann Hypothesis for Curves over Finite Fields, we gave a rough outline of Andre Weil’s strategy to prove the analogue of the famous Riemann hypothesis for curves over finite fields. A rather natural question to ask would be, does this strategy give us any suggestions on how to take on the original Riemann hypothesis? We mentioned briefly in The Field with One Element that some mathematicians hope to find in the theory of the so-called “field with one element” something that will allow them to apply the ideas of Weil’s proof to the original Riemann hypothesis, by viewing the scheme \text{Spec}(\mathbb{Z})  as some kind of “curve” over the “field with one element”.

In this post we will consider something along similar lines, examining a kind of “space” to which we can apply an analogue of Weil’s strategy. This approach is due to the mathematicians Alain Connes and Caterina Consani, and makes use of the concepts of sites and toposes (see More Category Theory: The Grothendieck Topos and Even More Category Theory: The Elementary Topos). This is perhaps appropriate, since sites or toposes are often referred to as “generalized spaces”.

We recall from The Riemann Hypothesis for Curves over Finite Fields some aspects of Weil’s strategy. The object in consideration is a curve C over a finite field \mathbb{F}_{q}. In order to write down the zeta function for C, we need to count the number of points over \mathbb{F}_{q^{n}}, for every n from 1 to infinity. We can do this by counting the fixed points of powers of the Frobenius morphism. Explicitly this means taking intersection numbers of the diagonal and the divisor formed by integral linear combinations of powers of the Frobenius morphism on \bar{C}\times_{\bar{\mathbb{F}}_{q}}\bar{C}, where \bar{\mathbb{F}}_{q} is an algebraic closure of \mathbb{F}_{q} (it is the direct limit of the directed system formed by all the \mathbb{F}_{q^{n}}) and \bar{C}=C\otimes_{\mathbb{F}_{q}}\bar{\mathbb{F}}_{q}. The number of points of \bar{C} fixed by the n-th power of the Frobenius morphism will be the same as the number of points of C over \mathbb{F}_{q^{n}}. Throughout this post we should keep these steps of Weil’s strategy in mind.

In order to transfer this strategy of Weil to the original Riemann hypothesis, Connes and Consani construct the arithmetic site, meant to be the analogue of C, and the scaling site, meant to be the analogue of \bar{C}. The intuition behind these constructions is that the points of the scaling site, which are the same as the points of the arithmetic site “over \mathbb{R}_{+}^{\text{max}}”, are the same as the points of the “adele class space” \mathbb{Q}^{\times}\backslash\mathbb{A}_{\mathbb{Q}}/\hat{\mathbb{Z}}^{\times}. The adele class space originally came up in earlier work of Connes, where he constructed a quantum-mechanical system which gives Riemann’s prime-counting function (whose study provided the historical origin of the Riemann hypothesis), in the form of Weil’s “explicit formula”, as a quantum-mechanical trace formula! In essence this work restates the Riemann hypothesis in terms of mathematical language more commonly associated to physics, and is part of Connes’ pioneering work in noncommutative geometry, a new area of mathematics also closely related to physics, in particular quantum mechanics and quantum field theory. In the definition of the adele class space, \mathbb{A}_{\mathbb{Q}} refers to the ring of adeles of \mathbb{Q} (see Adeles and Ideles), while \hat{\mathbb{Z}} refers to \prod_{p}\mathbb{Z}_{p}, where \mathbb{Z}_{p} are the p-adic integers, which can be defined as the inverse limit of the inverse system formed by \mathbb{Z}/p^{n}\mathbb{Z}.

The Arithmetic Site

We now proceed to discuss the arithmetic site. It is described as the pair (\widehat{\mathbb{N}^{\times}},\mathbb{Z}_{\text{max}}), where \widehat{\mathbb{N}^{\times}} is a Grothendieck topos, which, as we may recall from More Category Theory: The Grothendieck Topos, is defined as a category equivalent to the category \text{Sh}(\mathbf{C},J) of sheaves on a site (\mathbf{C},J). In the case of \widehat{\mathbb{N}^{\times}}, \mathbf{C} is the category with only one object, and whose morphisms correspond to the multiplicative monoid of nonzero natural numbers \mathbb{N}^{\times} (we also use \mathbb{N}^{\times} to denote this category, and \mathbb{N}_{0}^{\times} to denote the category with one object and whose morphisms correspond to \mathbb{N}^{\times}\cup\{0\}), while J is the indiscrete, or chaotic, Grothendieck topology, where all presheaves are also sheaves.

As part of the definition of the arithmetic site, we must also specify a structure sheaf. In this case it is provided by \mathbb{Z}_{\text{max}}, the semiring whose elements are just the integers, together with -\infty, but where the “addition” is provided by the “maximum” operation, and the “multiplication” is provided by the ordinary addition! (A semiring is like a ring, but is only a monoid, and not a group, under the “addition” operation. A semiring is also sometimes called a “rig”, because it is a ring without the “n”, the negative elements; the most common example is the natural numbers \mathbb{N} with the usual addition and multiplication.) With the arithmetic site thus defined, we denote it by \mathcal{A}.
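The semiring structure on \mathbb{Z}_{\text{max}} can be made concrete with a few lines of Python. This is only a minimal sketch: the names zmax_add, zmax_mul, ZERO and ONE are made up for this illustration, and -\infty is modelled by the floating-point value float("-inf").

```python
# Illustrative model of the semiring Z_max (names are hypothetical).
NEG_INF = float("-inf")

ZERO = NEG_INF  # identity for "addition": max(a, -inf) == a
ONE = 0         # identity for "multiplication": a + 0 == a


def zmax_add(a, b):
    """Semiring "addition" in Z_max: the maximum."""
    return max(a, b)


def zmax_mul(a, b):
    """Semiring "multiplication" in Z_max: ordinary addition."""
    return a + b


# Distributivity: a * (b + c) == a*b + a*c, i.e. a + max(b, c) == max(a + b, a + c).
a, b, c = 4, -2, 7
print(zmax_mul(a, zmax_add(b, c)) == zmax_add(zmax_mul(a, b), zmax_mul(a, c)))
```

Note that the addition is idempotent (zmax_add(a, a) equals a), which is one way to see that there are no additive inverses.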

We digress for a while to discuss the semiring \mathbb{Z}_{\text{max}}, as well as the closely related semirings \mathbb{R}_{\text{max}} (defined similarly to \mathbb{Z}_{\text{max}}, but with the real numbers instead of the integers), \mathbb{R}_{+}^{\text{max}} (whose elements are the nonnegative real numbers, with the addition given by the maximum operation, and the multiplication given by the ordinary multiplication), and the so-called Boolean semifield \mathbb{B} (whose elements are 0 and 1, with the addition again given by the maximum operation, and the multiplication again given by the ordinary multiplication). These semirings have origins in the area of mathematics known as tropical geometry, so named because one of its pioneers, Imre Simon, comes from Brazil, which is a tropical country. However, another source of inspiration is the work of the mathematical physicist Viktor Pavlovich Maslov in “semiclassical” quantum mechanics, where certain approximations could be made as the quantum mechanical systems being studied approached the classical limit. Maslov considered a “conjugated” addition

\displaystyle \lim_{\epsilon\to 0}(x^{\frac{1}{\epsilon}}+y^{\frac{1}{\epsilon}})^{\epsilon}

and this is precisely \text{max}(x,y) for nonnegative x and y.
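The limit can be watched numerically. The following is a rough sketch for positive x and y; the function name conjugated_sum is made up for this illustration, and the larger of the two numbers is factored out first so that the large powers do not overflow.

```python
def conjugated_sum(x, y, eps):
    """Compute (x**(1/eps) + y**(1/eps))**eps for positive x, y and eps."""
    m = max(x, y)
    # Factor out the larger term: both ratios are at most 1, so no overflow.
    return m * ((x / m) ** (1 / eps) + (y / m) ** (1 / eps)) ** eps


for eps in (1.0, 0.5, 0.1, 0.01):
    print(eps, conjugated_sum(2.0, 3.0, eps))
```

At eps = 1 this is the ordinary sum, and as eps shrinks the printed values decrease towards max(2, 3) = 3.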

Going back to the arithmetic site, we now discuss its points. Recall from Even More Category Theory: The Elementary Topos that a point of a topos (we discussed elementary toposes in that post, but this also applies to Grothendieck toposes) is defined by a geometric morphism from the topos \mathfrak{P} of sheaves of sets on the singleton set (the set with a single element) to the topos. This refers to a pair of adjoint functors such that the left-adjoint is left-exact (preserves finite limits). Therefore, for the arithmetic site, a point p is given by such a pair p^{*} and p_{*} such that p^{*}:\widehat{\mathbb{N}^{\times}}\rightarrow\textbf{Sets} is left-exact. The point p is also uniquely determined by the covariant functor \mathscr{P}=p^{*}\circ\epsilon:\mathbb{N}^{\times}\rightarrow\textbf{Sets} where \epsilon:\mathbb{N}^{\times}\rightarrow\widehat{\mathbb{N}^{\times}} is the Yoneda embedding.

There is an equivalence of categories between the category of points of the arithmetic site and the category of totally ordered groups which are isomorphic to the nontrivial subgroups of (\mathbb{Q},\mathbb{Q}_{+}) and injective morphisms of ordered groups. For such an ordered group \textbf{H} we therefore have a point \mathscr{P}_{\textbf{H}}. This gives us a correspondence with \mathbb{Q}_{+}^{\times}\backslash\mathbb{A}_{\mathbb{Q}}^{f}/\hat{\mathbb{Z}}^{\times} (where \mathbb{A}_{\mathbb{Q}}^{f} refers to the ring of finite adeles of \mathbb{Q}, which is defined similarly to the ring of adeles of \mathbb{Q} except that the infinite prime is not considered) because any such ordered group \textbf{H} is of the form \textbf{H}_{a}, the ordered group of all rational numbers q such that aq\in\hat{\mathbb{Z}}, for some unique a\in \mathbb{A}_{\mathbb{Q}}^{f}/\hat{\mathbb{Z}}^{\times}. We can also now describe the stalk of the structure sheaf \mathbb{Z}_{\text{max}} at the point \mathscr{P}_{\textbf{H}}; it is isomorphic to the semiring H_{\text{max}}, with elements given by the set (\textbf{H}\cup\{-\infty\}), addition given by the maximum operation, and multiplication given by the ordinary addition.

The arithmetic site is analogous to the curve C over the finite field \mathbb{F}_{q}. As for the finite field \mathbb{F}_{q}, its analogue is given by the Boolean semifield \mathbb{B} mentioned earlier, which has “characteristic 1”, reminiscent of the field with one element (see The Field with One Element). Next we want to find the analogues of the algebraic closure \bar{\mathbb{F}}_{q}, as well as the Frobenius morphism. The former is given by the semiring \mathbb{R}_{+}^{\text{max}}, which contains \mathbb{B}, while the latter is given by the multiplicative group of the positive real numbers \mathbb{R}_{+}^{\times}, as it is isomorphic to the group of automorphisms of \mathbb{R}_{+}^{\text{max}} that keep \mathbb{B} fixed.

But while we do know that the points of the arithmetic topos are given by geometric morphisms p:\mathfrak{P}\rightarrow \widehat{\mathbb{N}^{\times}} and determined by covariant functors \mathscr{P}_{\textbf{H}}:\mathbb{N}^{\times}\rightarrow\textbf{Sets}, what do we mean by its “points over \mathbb{R}_{+}^{\text{max}}”? A point of the arithmetic site “over \mathbb{R}_{+}^{\text{max}}” refers to the pair (\mathscr{P}_{\textbf{H}},f_{\mathscr{P}_{\textbf{H}}}^{\#}), where \mathscr{P}_{\textbf{H}}:\mathbb{N}^{\times}\rightarrow\textbf{Sets} as earlier, and f_{\mathscr{P}_{\textbf{H}}}^{\#}:H_{\text{max}}\rightarrow\mathbb{R}_{+}^{\text{max}} (we recall that H_{\text{max}} are the stalks of the structure sheaf \mathbb{Z}_{\text{max}}). The points of the arithmetic site over \mathbb{R}_{+}^{\text{max}} include its points “over \mathbb{B}”, which are what we discussed earlier, and mentioned to be in correspondence with \mathbb{Q}_{+}^{\times}\backslash\mathbb{A}_{\mathbb{Q}}^{f}/\hat{\mathbb{Z}}^{\times}. But in addition, there are also other points of the arithmetic site over \mathbb{R}_{+}^{\text{max}} which are in correspondence with \mathbb{Q}_{+}^{\times}\backslash((\mathbb{A}_{\mathbb{Q}}^{f}/\hat{\mathbb{Z}}^{\times})\times\mathbb{R}_{+}^{\times}), just as \mathbb{R}_{+}^{\text{max}} contains all of \mathbb{B} but also other elements. Altogether, the points of the arithmetic site over \mathbb{R}_{+}^{\text{max}} correspond to the disjoint union of \mathbb{Q}_{+}^{\times}\backslash\mathbb{A}_{\mathbb{Q}}^{f}/\hat{\mathbb{Z}}^{\times} and \mathbb{Q}_{+}^{\times}\backslash((\mathbb{A}_{\mathbb{Q}}^{f}/\hat{\mathbb{Z}}^{\times})\times\mathbb{R}_{+}^{\times}), which is \mathbb{Q}^{\times}\backslash\mathbb{A}_{\mathbb{Q}}/\hat{\mathbb{Z}}^{\times}, the adele class space as mentioned earlier.

There is a geometric morphism \Theta:\text{Spec}(\mathbb{Z})\rightarrow \widehat{\mathbb{N}_{0}^{\times}} (here \widehat{\mathbb{N}_{0}^{\times}} is defined similarly to \widehat{\mathbb{N}^{\times}}, but with \mathbb{N}_{0}^{\times} in place of \mathbb{N}^{\times}) uniquely determined by

\displaystyle \Theta^{*}:\mathbb{N}_{0}^{\times}\rightarrow \text{Sh}(\text{Spec}(\mathbb{Z}))

which sends the single object of \mathbb{N}_{0}^{\times} to the sheaf \mathcal{S} on \text{Spec}(\mathbb{Z}), which we now describe. Let H_{p} denote the set of all rational numbers q such that a_{p}q is an element of \hat{\mathbb{Z}}, where a_{p} is the adele with a 0 for the p-th component and 1 for all other components. Then the sheaf \mathcal{S} can be described in terms of its stalks \mathcal{S}_{p}, which are given by H_{p}^{+}, the positive part of H_{p}, and \mathcal{S}_{0}, given by \{0\}. The sections \Gamma(U,\mathcal{S}) are given by the maps \xi:U\rightarrow \coprod_{p}H_{p}^{+} such that \xi_{p}\neq 0 for only finitely many p\in U.

The Scaling Site

Now that we have defined the arithmetic site, which is the analogue of C, and the points of the arithmetic site over \mathbb{R}_{+}^{\text{max}}, which is the analogue of the points of C over the algebraic closure \bar{\mathbb{F}}_{q}, we now proceed to define the scaling site, which is the analogue of \bar{C}=C\otimes_{\mathbb{F}_{q}}\bar{\mathbb{F}}_{q}. The points of the scaling site are the same as the points of the arithmetic site over \mathbb{R}_{+}^{\text{max}}, which is analogous to the points of \bar{C} being the same as the points of C over \bar{\mathbb{F}}_{q}. But the importance of the scaling site lies in the fact that we can construct the analogue of a sheaf of rational functions on it, and a Riemann-Roch theorem, which, as we may recall from The Riemann Hypothesis for Curves over Finite Fields, is also an important part of Weil’s proof.

The scaling site is once again given by a pair ([0,\infty)\rtimes\mathbb{N}^{\times},\mathcal{O}), where [0,\infty)\rtimes\mathbb{N}^{\times} is a Grothendieck topos and \mathcal{O} is a structure sheaf, but both are quite sophisticated constructions compared to the arithmetic site. To describe the Grothendieck topos [0,\infty)\rtimes\mathbb{N}^{\times} we recall that it must be a category equivalent to the category \text{Sh}(\mathbf{C},J) of sheaves on some site (\mathbf{C},J). Here \mathbf{C} is the category whose objects are given by bounded open intervals \Omega\subset [0,\infty), including the empty interval \emptyset, and whose morphisms are given by

\displaystyle \text{Hom}(\Omega,\Omega')=\{n\in\mathbb{N}^{\times}|n\Omega\subset\Omega'\}

and in the special case that \Omega is the empty interval \emptyset, we have

\displaystyle \text{Hom}(\Omega,\Omega')=\{*\}.
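To get a feel for these morphism sets, here is a small sketch that lists \text{Hom}(\Omega,\Omega') for concrete intervals. It makes simplifying assumptions that are not part of the definition: only intervals of the form (a,b) with 0\leq a<b are handled (not those of the form [0,b)), an interval is stored as a pair of numbers, the empty interval is represented by None, and the function name hom is made up.

```python
from fractions import Fraction


def hom(omega, omega_prime):
    """List the morphisms from omega to omega_prime.

    A pair (a, b) with 0 <= a < b stands for the open interval (a, b);
    None stands for the empty interval.
    """
    if omega is None:
        return ["*"]  # exactly one morphism out of the empty interval
    if omega_prime is None:
        return []     # a nonempty interval never scales into the empty one
    a, b = omega
    c, d = omega_prime
    # n*(a, b) = (n*a, n*b) lies inside (c, d) iff c <= n*a and n*b <= d.
    return [n for n in range(1, int(d // b) + 1) if c <= n * a]


print(hom((1, 2), (1, 7)))
print(hom((Fraction(1, 2), Fraction(3, 4)), (1, 3)))
```

Because the intervals are bounded, each of these sets is finite, and composition of morphisms is just multiplication of the integers n.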

The Grothendieck topology J here is defined by the collection K(\Omega) of all ordinary covers of \Omega for any object \Omega of the category \mathbf{C}:

\displaystyle \{\Omega_{i}\}_{i\in I}=\{\Omega_{i}\subset\Omega|\cup_{i}\Omega_{i}=\Omega\}

Now we have to describe the structure sheaf \mathcal{O}. We start by considering \mathbb{Z}_{\text{max}}, the structure sheaf of the arithmetic site. By “extension of scalars” from \mathbb{B} to \mathbb{R}_{+}^{\text{max}} we obtain the reduced semiring \mathbb{Z}_{\text{max}}\hat{\otimes}_{\mathbb{B}}\mathbb{R}_{+}^{\text{max}}. This is not yet the structure sheaf \mathcal{O}, because the underlying category and Grothendieck topology for the scaling site are more complicated than those of the arithmetic site, and unlike the case for the arithmetic site, for the scaling site not every presheaf is a sheaf. So we must first “localize” \mathbb{Z}_{\text{max}}\hat{\otimes}_{\mathbb{B}}\mathbb{R}_{+}^{\text{max}}, and this gives us the structure sheaf \mathcal{O}.

Let us describe \mathbb{Z}_{\text{max}}\hat{\otimes}_{\mathbb{B}}\mathbb{R}_{+}^{\text{max}} in more detail. Let H be a rank 1 subgroup of \mathbb{R}. Then an element of H_{\text{max}}\hat{\otimes}_{\mathbb{B}}\mathbb{R}_{+}^{\text{max}} is given by a Newton polygon N\subset\mathbb{R}^{2}, which is the convex hull of the union of finitely many quadrants (x_{j},y_{j})-Q, where Q=H_{+}\times\mathbb{R}_{+} and (x_{j},y_{j})\in H\times\mathbb{R} (a set is a convex set if it contains the line segment connecting any two of its points; the convex hull of a set is the smallest convex set that contains it). The Newton polygon N is uniquely determined by the function

\displaystyle \ell_{N}(\lambda)=\text{max}_{j}(\lambda x_{j}+y_{j})

for \lambda\in\mathbb{R}_{+}. This correspondence gives us an isomorphism between H_{\text{max}}\hat{\otimes}_{\mathbb{B}}\mathbb{R}_{+}^{\text{max}} and \mathcal{R}(H), the semiring of convex, piecewise affine, continuous functions on [0,\infty) with slopes in H\subset\mathbb{R} and finitely many singularities, with the pointwise operations (a function is a convex function if the points on and above its graph form a convex set).
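The passage from the points (x_{j},y_{j}) to the function \ell_{N} is easy to experiment with. In the sketch below a Newton polygon is represented simply by the list of its generating points, and the pointwise operations are read as those of \mathbb{R}_{\text{max}} (maximum as addition, ordinary sum as multiplication); that reading, like the names ell, add and mul, is an assumption of this illustration.

```python
from fractions import Fraction


def ell(points, lam):
    """Evaluate l_N(lam) = max_j (lam * x_j + y_j) for generating points (x_j, y_j)."""
    return max(lam * x + y for x, y in points)


def add(points_a, points_b):
    """Pointwise max of the two functions: take the union of the generating points."""
    return points_a + points_b


def mul(points_a, points_b):
    """Pointwise sum of the two functions: add the generating points pairwise."""
    return [(xa + xb, ya + yb) for xa, ya in points_a for xb, yb in points_b]


N1 = [(0, 0), (1, -1)]
N2 = [(2, -3), (0, Fraction(1, 2))]
for lam in (0, Fraction(1, 2), 1, 2, 5):
    assert ell(add(N1, N2), lam) == max(ell(N1, lam), ell(N2, lam))
    assert ell(mul(N1, N2), lam) == ell(N1, lam) + ell(N2, lam)
print(ell(N1, 2), ell(N2, 2))
```

Both operations again produce a maximum of finitely many affine functions, which is why the result stays convex and piecewise affine with slopes in H.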

Therefore, we can describe the sections \Gamma(\Omega,\mathcal{O}) of the structure sheaf \mathcal{O}, for any bounded open interval \Omega, as the set of all convex, piecewise affine, continuous functions from \Omega to \mathbb{R}_{\text{max}} with slopes in \mathbb{Z}. We can also likewise describe the stalks of the structure sheaf \mathcal{O} – for a point \mathfrak{p}_{H}:[0,\infty)\rtimes\mathbb{N}^{\times}\rightarrow\textbf{Sets} associated to a rank 1 subgroup H\subset\mathbb{R}, the stalk \mathcal{O}_{\mathfrak{p}_{H}} is given by the semiring \mathcal{R}_{H} of germs of \mathbb{R}_{+}^{\text{max}}-valued, convex, piecewise affine, continuous functions with slopes in H. We also have points \mathfrak{p}_{H}^{0}:[0,\infty)\rtimes\mathbb{N}^{\times}\rightarrow\textbf{Sets} with “support \{0\}”, corresponding to the points of the arithmetic site over \mathbb{B}. For such a point, the stalk \mathcal{O}_{\mathfrak{p}_{H}^{0}} is given by the semiring (H\times\mathbb{R})_{\text{max}} associated to the totally ordered group H\times\mathbb{R}.

Now that we have described the Grothendieck topos [0,\infty)\rtimes\mathbb{N}^{\times} and the structure sheaf \mathcal{O}, we describe the scaling site as being given by the pair ([0,\infty)\rtimes\mathbb{N}^{\times},\mathcal{O}), and we denote it by \hat{\mathcal{A}}.

Our next task, now that we have described the arithmetic site and the scaling site, is to find the analogue of the Riemann-Roch theorem. We start by noting that we have a sheaf of semifields \mathcal{K}, defined by letting \mathcal{K}(\Omega) be the semifield of fractions of \mathcal{O}(\Omega). For an element f in the stalk \mathcal{K}_{\mathfrak{p}_{H}} of \mathcal{K}, we define its order as

\displaystyle \text{Order}_{H}(f):=h_{+}-h_{-}

where

\displaystyle h_{\pm}:=\lim_{\epsilon\to 0_{\pm}}(f((1+\epsilon)H)-f(H))/\epsilon

for \epsilon\in\mathbb{R}_{+}.

We let C_{p} be the set of all points \mathfrak{p}_{H}:[0,\infty)\rtimes\mathbb{N}^{\times}\rightarrow\textbf{Sets} of the scaling site \hat{\mathcal{A}} such that H is isomorphic to H_{p}. The C_{p} are the analogues of the orbits of Frobenius. There is a topological isomorphism \eta_{p}:\mathbb{R}_{+}^{\times}/p^{\mathbb{Z}}\rightarrow C_{p}. It is worth noting that the expression \mathbb{R}_{+}^{\times}/p^{\mathbb{Z}} is reminiscent of the Tate uniformization of an elliptic curve (which generalizes the idea that an elliptic curve over the complex numbers is the quotient of the complex plane by a lattice to other complete fields besides the complex numbers – see The Moduli Space of Elliptic Curves).

We have a pullback sheaf \eta_{p}^{*}(\mathcal{O}|_{C_{p}}), which we denote suggestively by \mathcal{O}_{p}. It is the sheaf on \mathbb{R}_{+}^{\times}/p^{\mathbb{Z}} whose sections are convex, piecewise affine, continuous functions with slopes in H_{p}. We can consider the sheaf of quotients \mathcal{K}_{p} of \mathcal{O}_{p} and its global sections f:\mathbb{R}_{+}^{\times}\rightarrow\mathbb{R}, which are piecewise affine, continuous functions with slopes in H_{p} such that f(p\lambda)=f(\lambda) for all \lambda\in\mathbb{R}_{+}^{\times}. Defining

\displaystyle \text{Order}_{\lambda}(f):=\text{Order}_{\lambda H_{p}}(f\circ\eta_{p}^{-1})

we have the following property for any f\in H^{0}(\mathbb{R}_{+}^{\times}/p^{\mathbb{Z}},\mathcal{K}_{p}) (recall that the zeroth cohomology group H^{0}(\mathbb{R}_{+}^{\times}/p^{\mathbb{Z}},\mathcal{K}_{p}) is defined as the space of global sections of \mathcal{K}_{p}):

\displaystyle \sum_{\lambda\in\mathbb{R}_{+}^{\times}/p^{\mathbb{Z}}}\text{Order}_{\lambda}(f)=0
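This property can be checked by hand on a small example. The sketch below rests on a reading of the definition that should be treated as an assumption: if H=\lambda H_{p}, then h_{\pm} is \lambda times the one-sided slope of f at \lambda, so the order at \lambda is \lambda times the jump in slope. The function f is described on the fundamental domain [1,p) by its breakpoints and by the slope on each piece, and is extended by f(p\lambda)=f(\lambda); the function name orders and this encoding are made up for the illustration.

```python
from fractions import Fraction as F


def orders(p, breakpoints, slopes):
    """Orders of f at its breakpoints in [1, p), assuming breakpoints[0] == 1.

    slopes[i] is the slope of f on the piece that starts at breakpoints[i];
    the last piece ends at p.
    """
    ends = breakpoints[1:] + [F(p)]
    # f(p * lam) == f(lam) forces the net change of f over [1, p] to vanish.
    assert sum(s * (e - b) for s, b, e in zip(slopes, breakpoints, ends)) == 0
    result = {}
    for i, b in enumerate(breakpoints):
        # Just left of lam = 1 the function is lam -> f(p * lam),
        # whose slope is p times the slope of the last piece.
        left = slopes[i - 1] if i > 0 else p * slopes[-1]
        result[b] = b * (slopes[i] - left)
    return result


example = orders(2, [F(1), F(3, 2)], [F(1), F(-1)])
print(example, sum(example.values()))
```

In this example (p=2, with f rising with slope 1 on [1,3/2) and falling with slope -1 on [3/2,2)) the two orders are 3 and -3, so they cancel.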

We now want to define the analogue of divisors on C_{p} (see Divisors and the Picard Group). A divisor D on C_{p} is a section C_{p}\rightarrow H, mapping \mathfrak{p}_{H}\in C_{p} to D(H)\in H, of the bundle of pairs (H,h), where H\subset\mathbb{R} is isomorphic to H_{p}, and h\in H. We define the degree of a divisor D as follows:

\displaystyle \text{deg}(D)=\sum_{\mathfrak{p}_{H}\in C_{p}}D(H)

Given a point \mathfrak{p}_{H}\in C_{p} such that H=\lambda H_{p} for some \lambda\in\mathbb{R}_{+}^{\times}, we have a map \lambda^{-1}:H\rightarrow H_{p}. This gives us a canonical mapping

\displaystyle \chi: H\rightarrow H_{p}/(p-1)H_{p}\simeq\mathbb{Z}/(p-1)\mathbb{Z}
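The isomorphism H_{p}/(p-1)H_{p}\simeq\mathbb{Z}/(p-1)\mathbb{Z} can be made explicit. Unwinding the definition of H_{p}, it is the group of rational numbers whose denominator is a power of p, and since p is congruent to 1 modulo p-1, dividing by a power of p does not change the class modulo p-1. The sketch below implements this reduction on H_{p} itself (that is, after applying \lambda^{-1}); the function name chi and the use of Fraction are choices made for this illustration, and this is one natural way to realize the identification rather than necessarily the normalization used in the references.

```python
from fractions import Fraction


def chi(h, p):
    """Class of h in H_p / (p - 1) H_p, returned as a residue modulo p - 1."""
    h = Fraction(h)
    d = h.denominator
    while d % p == 0:
        d //= p
    if d != 1:
        raise ValueError("h does not have a power of p as denominator")
    # p is congruent to 1 mod (p - 1), so the denominator p**k acts trivially.
    return h.numerator % (p - 1)


print(chi(Fraction(7, 25), 5), chi(Fraction(3, 5), 5))
```

For p=2 the target group is trivial, so chi is identically zero there.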

Given a divisor D on C_{p}, we define

\displaystyle \chi(D):=\sum_{\mathfrak{p}_{H}\in C_{p}}\chi(D(H))\in\mathbb{Z}/(p-1)\mathbb{Z}

We have \text{deg}(D)=0 and \chi(D)=0 if and only if D=(f) for some f\in H^{0}(\mathbb{R}_{+}^{\times}/p^{\mathbb{Z}},\mathcal{K}_{p}), i.e. D is a principal divisor.

We define the group J(C_{p}) as the quotient \text{Div}^{0}(C_{p})/\mathcal{P} of the group \text{Div}^{0}(C_{p}) of divisors of degree 0 on C_{p} by the group \mathcal{P} of principal divisors on C_{p}. The group J(C_{p}) is isomorphic to \mathbb{Z}/(p-1)\mathbb{Z}, while the group \text{Div}(C_{p})/\mathcal{P} of divisors on C_{p} modulo the principal divisors is isomorphic to \mathbb{R}\times(\mathbb{Z}/(p-1)\mathbb{Z}).

In order to state the analogue of the Riemann-Roch theorem we need to define the following module over \mathbb{R}_{+}^{\text{max}}:

\displaystyle H^{0}(D):=\{f\in\mathcal{K}_{p}|D+(f)\geq 0\}

Given f\in H^{0}(C_{p},\mathcal{K}_{p}), we define

\displaystyle \|f\|_{p}:=\text{max}\{|h(\lambda)|_{p}/\lambda,\lambda\in C_{p}\}

where h(\lambda) is the slope of f at \lambda and |\cdot|_{p} is the p-adic absolute value. Then we have the following increasing filtration on H^{0}(D):

\displaystyle H^{0}(D)^{\rho}:=\{f\in H^{0}(D)|\|f\|_{p}\leq\rho\}

This allows us to define the following notion of dimension for H^{0}(D) (here \text{dim}_{\text{top}} refers to what is known as the topological dimension or Lebesgue covering dimension, a notion of dimension defined in terms of refinements of open covers):

\displaystyle \text{Dim}_{\mathbb{R}}(H^{0}(D))=\lim_{n\to\infty}p^{-n}\text{dim}_{\text{top}}(H^{0}(D)^{p^{n}})

The analogue of the Riemann-Roch theorem is now given by the following:

\displaystyle \text{Dim}_{\mathbb{R}}(H^{0}(D))-\text{Dim}_{\mathbb{R}}(H^{0}(-D))=\text{deg}(D)


This concludes our discussion of the arithmetic site and the scaling site, but I would like to discuss one more related topic also being explored by Connes and Consani – the use of \mathbb{S}-algebras, which are closely related to the \Gamma-sets we have already introduced in The Field with One Element. Both of these concepts have their origins in homotopy theory.

We recall from the short discussion at the end of The Riemann Hypothesis for Curves over Finite Fields that the Weil conjectures, which are Weil’s generalization of the Riemann hypothesis for curves over finite fields to varieties of higher dimension, were proven by making use of cohomology (in particular etale cohomology) to find the fixed points of the powers of the Frobenius morphism (the formula that gives us the fixed points of a map using cohomology is called the Lefschetz fixed point formula). Now, concepts such as monoids, semirings, and many others (including the mathematician Nikolai Durov’s approach to the field with one element, which he also uses to develop a new version of Arakelov geometry) are all subsumed under the concept of \mathbb{S}-algebras, and doing so allows us to make use of a cohomology theory called topological cyclic cohomology. Connes and Consani hope that topological cyclic cohomology will help prove the original Riemann hypothesis the way that etale cohomology helped prove the Weil conjectures. Let us discuss briefly the work of Connes and Consani on this topic.

We recall from The Field with One Element the definition of a \Gamma-set (there also referred to as a \Gamma-space). A \Gamma-set is defined to be a covariant functor from the category \Gamma^{\text{op}}, whose objects are pointed finite sets and whose morphisms are basepoint-preserving maps of finite sets, to the category \textbf{Sets}_{*} of pointed sets. An \mathbb{S}-algebra is defined to be a \Gamma-set \mathscr{A}:\Gamma^{\text{op}}\rightarrow \textbf{Sets}_{*} together with an associative multiplication \mu:\mathscr{A}\wedge \mathscr{A}\rightarrow\mathscr{A} and a unit 1:\mathbb{S}\rightarrow\mathscr{A}, where \mathbb{S}:\Gamma^{\text{op}}\rightarrow\textbf{Sets}_{*} is the inclusion functor (also known as the sphere spectrum). An \mathbb{S}-algebra is a monoid in the symmetric monoidal category of \Gamma-sets with the smash product and the sphere spectrum.

Any monoid M defines an \mathbb{S}-algebra \mathbb{S}M via the following definition:

\displaystyle \mathbb{S}M(X):=M\wedge X

for any pointed finite set X. Here M\wedge X is the smash product of M and X as pointed sets, with the basepoint for M given by its zero element. The maps are given by \text{Id}_{M}\wedge f, for f:X\rightarrow Y.

Similarly, any semiring R defines an \mathbb{S}-algebra HR via the following definition:

\displaystyle HR(X):=R^{X/*}

for any pointed finite set X. Here R^{X/*} refers to the set of basepoint preserving maps from X to R, where the basepoint of R is 0. The maps HR(f) are given by HR(f)(\phi)(y):=\sum_{x\in f^{-1}(y)}\phi(x) for f:X\rightarrow Y, x\in X, and y\in Y. The multiplication HR(X)\wedge HR(Y)\rightarrow HR(X\wedge Y) is given by \phi\psi(x,y)=\phi(x)\psi(y) for any x\in X\setminus * and y\in Y\setminus *. The unit 1_{X}:X\rightarrow HR(X) is given by 1_{X}(x)=\delta_{x} for all x in X, where \delta_{x}(y)=1 if x=y, and 0 otherwise.
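The formula for HR(f) simply adds up the values of \phi along the fibers of f, using the addition of R. Here is a rough sketch, reading an element \phi of HR(X) as a function on the non-basepoint elements of X with values in R, stored as a dictionary. The function name pushforward, the use of the string "*" for the basepoint, and passing the addition and zero of R as arguments are all choices made for this illustration.

```python
def pushforward(phi, f, add, zero, base="*"):
    """Compute HR(f)(phi): sum phi over each fiber of f, using the addition of R.

    phi  : dict from the non-basepoint elements of X to R
    f    : basepoint-preserving map X -> Y, given on the non-basepoint elements
    add  : the addition of the semiring R
    zero : the additive identity of R
    """
    out = {}
    for x, value in phi.items():
        y = f(x)
        if y == base:
            continue  # the value at the basepoint is always zero
        out[y] = add(out.get(y, zero), value)
    return out


phi = {1: 2, 2: 3, 3: 5}


def f(x):
    return "a" if x in (1, 2) else "*"


print(pushforward(phi, f, add=lambda u, v: u + v, zero=0))  # R = natural numbers
print(pushforward(phi, f, add=max, zero=0))                 # "addition" given by max
```

With ordinary addition the two values over "a" are summed, while with the maximum as addition (as in \mathbb{R}_{+}^{\text{max}}) the larger one is kept, so the same functor handles rings and semirings uniformly.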

Therefore we can see that the notion of \mathbb{S}-algebra subsumes the notions of monoids and semirings, and other notions such as that of “hyperrings”, which we leave to the references for the moment. Instead, we will discuss how \mathbb{S}-algebras are related to the approach of Durov to the field with one element and Arakelov geometry. As we mentioned in Arakelov Geometry, the main idea of the theory is to consider the “infinite prime” along with the other points of \text{Spec}(\mathbb{Z}). We therefore define \overline{\text{Spec}(\mathbb{Z})} as \text{Spec}(\mathbb{Z})\cup \{\infty\}. Let \mathcal{O}_{\text{Spec}(\mathbb{Z})} be the structure sheaf of \text{Spec}(\mathbb{Z}). We want to extend this to a structure sheaf on \overline{\text{Spec}(\mathbb{Z})}, and to accomplish this we will use the functor H from semirings to \mathbb{S}-algebras defined earlier. For any open set U containing \infty, we define

\displaystyle \mathcal{O}_{\overline{\text{Spec}(\mathbb{Z})}}(U):=\|H\mathcal{O}_{\text{Spec}(\mathbb{Z})}(U\cap\text{Spec}(\mathbb{Z}))\|_{1}.

The notation \|\|_{1} is defined for the \mathbb{S}-algebra HR associated to the semiring R as follows:

\displaystyle \|HR(X)\|_{1}:=\{\phi\in HR(X)|\sum_{x\in X\setminus *}\|\phi(x)\|\leq 1\}

where \|\| in this particular case comes from the usual absolute value on \mathbb{Q}. This becomes available to us because the sheaf \mathcal{O}_{\overline{\text{Spec}(\mathbb{Z})}} is a subsheaf of the constant sheaf \mathbb{Q}.

Given an Arakelov divisor on \overline{\text{Spec}(\mathbb{Z})} (in this context an Arakelov divisor is given by a pair (D_{\text{finite}},D_{\infty}), where D_{\text{finite}} is an ordinary divisor on \text{Spec}(\mathbb{Z}) and D_{\infty} is a real number) we can define the following sheaf of \mathcal{O}_{\overline{\text{Spec}(\mathbb{Z})}}-modules over \overline{\text{Spec}(\mathbb{Z})}:

\displaystyle \mathcal{O}_{\overline{\text{Spec}(\mathbb{Z})}}(D)(U):=\|H\mathcal{O}(D_{\text{finite}})(U\cap\text{Spec}(\mathbb{Z}))\|_{e^{a}}

where a is the real number “coefficient” of D_{\infty}, and \|\|_{\lambda} means, for an R-module E (here the \mathbb{S}-algebra HE is constructed the same as HR, except there is no multiplication or unit) with seminorm \|\|^{E} such that \|a\xi\|^{E}\leq\|a\|\|\xi\|^{E} for a\in R and \xi\in E,

\displaystyle \|HE(X)\|_{\lambda}:=\{\phi\in HE(X)|\sum_{x\in X\setminus *}\|\phi(x)\|^{E}\leq \lambda\}

With such sheaves of \mathbb{S}-algebras on \overline{\text{Spec}(\mathbb{Z})} now constructed, the tools of topological cyclic cohomology can be applied to them. The theory of topological cyclic cohomology is left to the references for now, but will hopefully be discussed in future posts on this blog.


The approach of Connes and Consani, whether making use of the arithmetic site and the scaling site to apply Weil’s strategy to the original Riemann hypothesis, or making use of \mathbb{S}-algebras and topological cyclic cohomology in analogy with the proof of the Weil conjectures, is still currently facing several technical obstacles. In the former case, an intersection theory and a Riemann-Roch theorem on the square of the scaling site are yet to be constructed. In the latter, there is the problem of appropriate coefficients for the cohomology theory. There are already several proposed strategies for dealing with these obstacles. Such efforts, aside from aiming to prove the Riemann hypothesis, widen the scope of the mathematics that we have today, and, perhaps more importantly, uncover more and more of the mysterious geometry underlying the familiar everyday concept of numbers.

References:


On the Geometry of the Adele Class Space of Q by Caterina Consani

An Essay on the Riemann Hypothesis by Alain Connes

The Arithmetic Site by Alain Connes and Caterina Consani

Geometry of the Arithmetic Site by Alain Connes and Caterina Consani

The Scaling Site by Alain Connes and Caterina Consani

Geometry of the Scaling Site by Alain Connes and Caterina Consani

Absolute Algebra and Segal’s Gamma Sets by Alain Connes and Caterina Consani

New Approach to Arakelov Geometry by Nikolai Durov


Even More Category Theory: The Elementary Topos

In More Category Theory: The Grothendieck Topos, we defined the Grothendieck topos as something like a generalization of the concept of sheaves on a topological space. In this post we generalize it even further into a concept so far-reaching it can even be used as a foundation for mathematics.

I. Definition of the Elementary Topos

We start by discussing the generalization of the universal constructions we defined in More Category Theory: The Grothendieck Topos, called limits and colimits.

Given categories \mathbf{J} and \mathbf{C}, we refer to a functor F: \mathbf{J}\rightarrow \mathbf{C} as a diagram in \mathbf{C} of type \mathbf{J}, and we refer to \mathbf{J} as an indexing category. We write the functor category of all diagrams in \mathbf{C} of type \mathbf{J} as \mathbf{C^{J}}.

Given a diagram F: \mathbf{J}\rightarrow \mathbf{C}, a cone to F is an object N of \mathbf{C} together with morphisms \psi_{X}: N\rightarrow F(X) indexed by the objects X of \mathbf{J} such that for every morphism f: X\rightarrow Y in  \mathbf{J}, we have F(f)\circ \psi_{X}=\psi_{Y}.

A limit of a diagram F: \mathbf{J}\rightarrow \mathbf{C} is a cone (L, \varphi) to F such that for any other cone (N, \psi)  to F there exists a unique morphism u: N\rightarrow L such that \varphi_{X}\circ u=\psi_{X} for all X in \mathbf{J}.

For example, when \mathbf{J} is a category with only two objects A and B and whose only morphisms are the identity morphisms on each of these objects, the limit of the diagram F: \mathbf{J}\rightarrow \mathbf{C} is just the product. Similarly, the pullback is the limit of the diagram F: \mathbf{J}\rightarrow \mathbf{C} when \mathbf{J} is the category with three objects A, B, and C, and the only morphisms aside from the identity morphisms are one morphism A\xrightarrow{f}C and another morphism B\xrightarrow{g}C. The terminal object is the limit of the diagram F: \mathbf{J}\rightarrow \mathbf{C} when \mathbf{J} is the empty category, and the equalizer is the limit of the diagram F: \mathbf{J}\rightarrow \mathbf{C} when \mathbf{J} is the category with two objects A and B and whose only morphisms aside from the identity morphisms are two morphisms A\xrightarrow{f}B and A\xrightarrow{g}B.
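In the category of finite sets these limits can be computed directly, which may help make the definitions concrete. The following is a small sketch with made-up function names, where sets are Python sets and morphisms are Python functions.

```python
from itertools import product as cartesian


def product(A, B):
    """Limit of the two-object discrete diagram: all pairs (a, b)."""
    return set(cartesian(A, B))


def pullback(A, B, f, g):
    """Limit of A -f-> C <-g- B: the pairs (a, b) with f(a) == g(b)."""
    return {(a, b) for a in A for b in B if f(a) == g(b)}


def equalizer(A, f, g):
    """Limit of two parallel maps f, g from A to B: the elements where they agree."""
    return {a for a in A if f(a) == g(a)}


A = {0, 1, 2, 3}
B = {0, 1, 2}
print(len(product(A, B)))
print(sorted(pullback(A, B, lambda a: a % 2, lambda b: b % 2)))
print(equalizer(A, lambda a: a * a, lambda a: a))
# The limit of the empty diagram is a one-element set (the terminal object).
print(set(cartesian()))
```

In each case the projection maps to A and B play the role of the morphisms \varphi_{X} of the limiting cone.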

A colimit is the dual concept to a limit, obtained by reversing the directions of all the morphisms in the definition. In the same way that the limit generalizes the concepts of product, pullback, terminal object, and equalizer, the colimit generalizes the concepts of coproduct, pushout, initial object, and coequalizer.

Next we discuss the concept of adjoint functors. Consider two categories \mathbf{C} and \mathbf{D}, and two functors F: \mathbf{C}\rightarrow \mathbf{D} and G: \mathbf{D}\rightarrow \mathbf{C}. We say that F is left adjoint to G, and that G is right adjoint to F, if for all objects C in \mathbf{C} and D in \mathbf{D} there exist bijections

\theta: \text{Hom}_{\mathbf{C}}(C, G(D))\xrightarrow{\sim}\text{Hom}_{\mathbf{D}}(F(C), D)

which are natural in the sense that given morphisms \alpha: C'\rightarrow C in \mathbf{C} and \xi: D\rightarrow D' in \mathbf{D}, we have, for every f: C\rightarrow G(D),

\theta(G(\xi)\circ f\circ \alpha)=\xi\circ \theta(f)\circ F(\alpha).

Suppose that products exist in \mathbf{C}. For a fixed object A of \mathbf{C}, consider the functor

A\times - : \mathbf{C}\rightarrow \mathbf{C}

which sends an object C of \mathbf{C} to the product A\times C in \mathbf{C}. If this functor has a right adjoint, we denote it by

(-)^{A}: \mathbf{C}\rightarrow \mathbf{C}.

We refer to the object A as an exponentiable object. We refer to the object B^{A} for some B in \mathbf{C} as an exponential object in \mathbf{C}. A category is called Cartesian closed if it has a terminal object and binary products, and if every object is an exponentiable object.

In the category \mathbf{Sets}, the exponential object B^{A} corresponds to the set of all functions from A to B. This also explains our notation for functor categories such as \mathbf{Sets^{C^{op}}} and \mathbf{C^{J}}.
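The adjunction between A\times - and (-)^{A} in \mathbf{Sets} is the familiar operation of "currying". The following is a small sketch, assuming finite sets and encoding a function as a tuple of (input, output) pairs; the helper names all_functions, curry, and uncurry are hypothetical and chosen only for this illustration.

```python
from itertools import product

def all_functions(dom, cod):
    """All functions dom -> cod, each encoded as a tuple of (input, output) pairs."""
    dom = list(dom)
    return [tuple(zip(dom, outs)) for outs in product(cod, repeat=len(dom))]

def curry(h, A, C):
    """Send h: A x C -> B to the corresponding function C -> B^A."""
    table = dict(h)
    return tuple((c, tuple((a, table[(a, c)]) for a in A)) for c in C)

def uncurry(k):
    """Inverse direction: send k: C -> B^A back to a function A x C -> B."""
    return tuple(sorted(((a, c), b) for c, g in k for a, b in g))

A, B, C = [0, 1], ["x", "y", "z"], [0, 1]
AxC = sorted(product(A, C))
B_to_A = all_functions(A, B)       # the exponential object B^A
lhs = all_functions(AxC, B)        # Hom(A x C, B)
rhs = all_functions(C, B_to_A)     # Hom(C, B^A)
```

Both hom-sets have 81 elements here, and curry and uncurry are mutually inverse, which is the bijection required by the definition of an adjunction in this special case.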

Finally, we discuss the concept of subobject classifiers. We start by defining two important kinds of morphisms, monomorphisms and epimorphisms. A monomorphism (also called a mono, or monic) is a morphism f: X\rightarrow Y such that for all morphisms g_{1}: Z\rightarrow X and g_{2}: Z\rightarrow X, whenever the compositions f\circ g_{1} and f\circ g_{2} are equal, then it is guaranteed that g_{1} and g_{2} are also equal. An epimorphism (also called an epi, or epic) is the dual of this concept, obtained by reversing the directions of all the morphisms in the definition of a monomorphism.

Two monomorphisms f: A\rightarrow D and g: B\rightarrow D are called equivalent if there is an isomorphism h: A\rightarrow B such that g\circ h=f. A subobject of D is then defined as an equivalence class of monomorphisms with codomain D.

A subobject classifier is an object \Omega and a monomorphism \text{true}: 1\rightarrow \Omega such that to every monic j: U\rightarrow X there is a unique arrow \chi_{j}: X\rightarrow \Omega such that if u: U\rightarrow 1 is the unique morphism from U to the terminal object 1, then the square formed by j, u, \chi_{j}, and \text{true} is a pullback; in particular we have

\chi_{j}\circ j=\text{true}\circ u.

The significance of the subobject classifier can perhaps best be understood by considering the category \mathbf{Sets}. The characteristic function \chi_{j} of the subset U of X is defined as the function on X that gives the value 1 if x\in U and gives the value 0 if x\notin U. Then we can set the terminal object 1 to be the set \{0\} and the object \Omega as the set \{0,1\}. The morphism \text{true} then sends 0\in \{0\} to 1\in \{0,1\}. The idea is that subobjects, i.e. subsets of sets in \mathbf{Sets}, can be obtained as pullbacks of \text{true} along the characteristic function \chi_{j}.
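As a small illustration in \mathbf{Sets}, the following sketch recovers a subset from its characteristic function by computing the pullback of \text{true} along \chi_{j} directly. The particular sets X and U are made up for the example, and \text{true} is taken to pick out the element 1 of \{0,1\}.

```python
X = ["a", "b", "c", "d"]
U = ["b", "d"]                       # a subset, i.e. a monic j: U -> X
Omega = [0, 1]
true = lambda point: 1               # true: {0} -> {0, 1} picks out 1

def chi(x):
    """Characteristic function of U inside X."""
    return 1 if x in U else 0

# Pullback of true along chi: pairs (x, point) with chi(x) == true(point).
pullback = [(x, point) for x in X for point in [0] if chi(x) == true(point)]
recovered = [x for x, _ in pullback]
```

The list recovered contains exactly the elements of U, so the subset and its characteristic function determine each other.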

For the category \text{Sh }(X) of sheaves on a topological space X, the subobject classifier is the sheaf \Omega on X where for each open subset U of X the set \Omega(U) is given by the set of open subsets of U. The morphism \text{true} then “selects” the “maximal” open subset U of U.

Now we define our generalization of the Grothendieck topos. An elementary topos is a category \mathcal{E} satisfying the following conditions.

(i) \mathcal{E} has all finite limits and colimits.

(ii) \mathcal{E} is Cartesian closed.

(iii) \mathcal{E} has a subobject classifier.

A Grothendieck topos satisfies all these conditions and is an example of an elementary topos. However, the elementary topos is a much more general idea, and whereas the Grothendieck topos can be considered as a “generalized space”, the elementary topos can be considered as a “generalized universe of sets”. The term “universe”, as used in mathematics, refers to the entirety of where our discourse takes place, such that any concept or construction that we will ever need to consider or discuss can be found in this universe.

Perhaps the most basic example of an elementary topos is the category \mathbf{Sets}. It is actually also a Grothendieck topos, with its underlying category the category with one object and one morphism, which is the identity morphism on its one object. An example of an elementary topos that is not a Grothendieck topos is the category \mathbf{FinSets} of finite sets. It is worth noting, however, that despite the elementary topos being more general, the Grothendieck topos still continues to occupy somewhat of a special place in topos theory, including its applications to logic and other branches of mathematics beyond its origins in algebraic geometry.

II. Logic and the Elementary Topos

Mathematics is formalized, as a language, using what is known as first-order logic (also known as predicate logic or predicate calculus). This involves constants and variables of different “sorts” or “types”, such as x or y, strung together by relations, usually written Q(x, y), expressing a statement such as x=y. We also have functions, usually written g(x, y), expressing something such as x+y. The variables and functions are terms, and when these terms are strung together by relations, they form formulas. These formulas in turn are strung together by logical connectives such as “and”, “or”, “not”, “implies” and quantifiers such as “for all” and “there exists” to form more complicated formulas.

We can associate with an elementary topos a “language”. The “types” of this language are given by the objects of the topos. “Functions” are given by morphisms of objects. “Relations” are given by the subobjects of an object. In addition to these we need a notion of quantifiers, “for all” (written \forall) and “there exists” (written \exists). These quantifiers are given, for a morphism f: X\rightarrow Y and the associated pullback functor f^{-1}: \text{Sub }(Y)\rightarrow \text{Sub }(X), by the left and right adjoints \exists_{f}, \forall_{f}: \text{Sub }(X)\rightarrow \text{Sub }(Y) of f^{-1}, respectively. For the logical connectives such as “and”, “or”, “not”, and “implies”, we rely on the Heyting algebra structure on the subobjects of an object of an elementary topos.
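In \mathbf{Sets} the quantifiers along a function f: X\rightarrow Y can be written down explicitly, and the adjunctions can be checked by brute force. The sketch below assumes small finite sets; the names pullback, exists_f, and forall_f are our own labels for f^{-1}, \exists_{f}, and \forall_{f}.

```python
from itertools import combinations

def subsets(xs):
    """All subsets of a finite set, as frozensets."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

X, Y = [0, 1, 2, 3], ["a", "b", "c"]
f = {0: "a", 1: "a", 2: "b", 3: "b"}     # f: X -> Y; nothing maps to "c"

def pullback(T):                          # f^{-1}: Sub(Y) -> Sub(X)
    return frozenset(x for x in X if f[x] in T)

def exists_f(S):                          # left adjoint: the image of S
    return frozenset(f[x] for x in S)

def forall_f(S):                          # right adjoint: y whose whole fibre lies in S
    return frozenset(y for y in Y if all(x in S for x in X if f[x] == y))

# exists_f(S) <= T  iff  S <= f^{-1}(T)
left_ok = all((exists_f(S) <= T) == (S <= pullback(T))
              for S in subsets(X) for T in subsets(Y))
# f^{-1}(T) <= S  iff  T <= forall_f(S)
right_ok = all((pullback(T) <= S) == (T <= forall_f(S))
               for S in subsets(X) for T in subsets(Y))
```

Note that forall_f includes "c" for every S, since a statement about all elements of an empty fibre holds vacuously.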

The existence of a Heyting algebra structure means that there exist operations, called join (written \vee) and meet (written \wedge), generalizing unions and intersections of sets, supremum and infimum of elements, or the connectives “or” and “and”, a least element (written 0), a greatest element (written 1), and an implication operation such that

z\leq(x\Rightarrow y) if and only if z\wedge x\leq y.

We also have the negation of an element x

\neg x=(x\Rightarrow 0).
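A standard example of a Heyting algebra is the collection of open sets of a topological space. The following sketch, assuming a small made-up topology on three points, implements the operations and checks the defining property of the implication. It also exhibits an element for which \neg\neg x differs from x.

```python
# Open sets of a small topological space on the points {1, 2, 3}.
points = frozenset({1, 2, 3})
opens = [frozenset(s) for s in ([], [1], [3], [1, 3], [1, 2, 3])]

def interior(s):
    """Largest open set contained in s."""
    return max((o for o in opens if o <= s), key=len)

def meet(x, y):
    return x & y

def join(x, y):
    return x | y

def implies(x, y):
    """Largest open z with z & x <= y."""
    return interior((points - x) | y)

def neg(x):
    return implies(x, frozenset())

adjunction_ok = all((z <= implies(x, y)) == (meet(z, x) <= y)
                    for x in opens for y in opens for z in opens)
x = frozenset({1, 3})
double_negation = neg(neg(x))      # the whole space, which is not x
```

Here the negation of \{1, 3\} is the empty set, so its double negation is the whole space; this Heyting algebra is therefore not Boolean.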

This Heyting algebra structure for subobjects \text{Sub }(A) of an object A of an elementary topos is provided by taking pullbacks (for the meet) and coproducts (for the join), with 0\rightarrow A as the least element, A\rightarrow A as the greatest element, and the implication given by the exponential.

We have shown one way in which topos theory is related to logic. Now we show how topos theory is related to the most commonly accepted foundations of mathematics, set theory. More technically, these foundations come from a handful of axioms called the ZFC axioms. The letters Z and F come from the names of the mathematicians who developed them, Ernst Zermelo and Abraham Fraenkel, while the letter C comes from another axiom called the axiom of choice.

The elementary topos, with some additional conditions, can be used to construct a version of the ZFC axioms. The first condition is that whenever there are two morphisms f: A\rightarrow B and g: A\rightarrow B such that f\circ x=g\circ x for every morphism x: 1\rightarrow A from the terminal object 1 to A, we have f=g. In this case we say that the topos is well-pointed. The second condition is that we have a natural numbers object, which is an object \mathbf{N} and morphisms 0:1\rightarrow \mathbf{N} and s:\mathbf{N}\rightarrow \mathbf{N}, such that for any other object X and morphisms x:1\rightarrow X and f:X\rightarrow X, we have a unique morphism h: \mathbf{N}\rightarrow X such that h\circ 0=x and h\circ s=f\circ h. The third condition is the axiom of choice; this is equivalent to the statement that for every epimorphism p:X\rightarrow I there exists s:I\rightarrow X such that p\circ s=1_{I}.
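In \mathbf{Sets}, the universal property of the natural numbers object is just definition by recursion: the morphism h is determined by h(0)=x and h(s(n))=f(h(n)). A minimal sketch, with the hypothetical helper name recursor, might look as follows.

```python
def recursor(x, f):
    """Return h: N -> X with h(0) = x and h(s(n)) = f(h(n))."""
    def h(n):
        value = x
        for _ in range(n):
            value = f(value)
        return value
    return h

s = lambda n: n + 1                      # successor on the natural numbers
step = lambda v: 2 * v                   # a sample endomorphism f: X -> X
power_of_two = recursor(1, step)         # h(n) = 2**n
```

The data (x, f) on any set X thus yields exactly one function out of the natural numbers compatible with zero and successor.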

One of the issues that hounded set theory in the early days after the ZFC axioms were formulated was whether the axiom of choice could be derived from the other axioms (these axioms were simply called the ZF axioms) or whether it needed to be put in separately. Another issue concerned what was known as the continuum hypothesis, a statement concerning the cardinality of the natural numbers and the real numbers, and whether this statement could be proved or disproved from the ZFC axioms alone. The mathematician Paul Cohen showed that both the axiom of choice and the continuum hypothesis are independent of ZF and ZFC respectively. A topos-theoretic version of Cohen’s proof of the independence of the continuum hypothesis was then later developed by the mathematicians William Lawvere and Myles Tierney (both of whom also developed much of the original theory of elementary toposes).

We now discuss certain aspects of topos theory related to Cohen’s proof. First we introduce a construction in an elementary topos that generalizes the Grothendieck topology discussed in More Category Theory: The Grothendieck Topos. A Lawvere-Tierney topology on \mathcal{E} is a map j: \Omega\rightarrow \Omega such that

(a) j\circ \text{true}=\text{true}

(b) j\circ j=j

(c) j\circ \wedge=\wedge \circ (j\times j)

The Lawvere-Tierney topology allows us to construct sheaves on the topos, and together with the Heyting algebra structure on the subobject classifier \Omega, allows us to construct double negation sheaves, which themselves form toposes that have the special property that they are Boolean, i.e. the Heyting algebra structure of their subobject classifier satisfies the additional property \neg \neg x=x. This is important because a well-pointed topos, which is necessary to formulate a topos-theoretic version of ZFC, is necessarily Boolean. Another condition for the topos to be well-pointed is for it to be two-valued, which means that there are only two morphisms from the terminal object 1 to \Omega. We can obtain such a two-valued topos from any other topos using the concept of a filter, which essentially allows us to take “quotients” of the Heyting algebra structure on \Omega.
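We can get a feeling for the double negation topology in a simplified setting. The sketch below does not work with a morphism \Omega\rightarrow \Omega in an actual topos; it only assumes the Heyting algebra of open sets of a small made-up space, and checks that the operation j=\neg\neg on it satisfies the analogues of conditions (a), (b), and (c).

```python
points = frozenset({1, 2, 3})
opens = [frozenset(s) for s in ([], [1], [3], [1, 3], [1, 2, 3])]

def neg(x):
    """Heyting negation: the largest open set disjoint from x."""
    return max((o for o in opens if not (o & x)), key=len)

def j(x):
    """Double negation."""
    return neg(neg(x))

top = points
axiom_a = j(top) == top                                        # j(true) = true
axiom_b = all(j(j(x)) == j(x) for x in opens)                  # j is idempotent
axiom_c = all(j(x & y) == j(x) & j(y) for x in opens for y in opens)  # j preserves meets

# Elements fixed by j; on these, double negation is the identity.
closed = [x for x in opens if j(x) == x]
```

In this example the fixed elements are the empty set, \{1\}, \{3\}, and the whole space, which form a four-element Boolean algebra, in line with the statement that double negation produces something Boolean.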

There is yet another condition for an elementary topos to be well-pointed, namely that its “supports split” in the topos. This condition is automatically satisfied whenever the topos satisfies the axiom of choice.

It turns out that the topos of double negation sheaves over a partially ordered set is Boolean (as discussed earlier) and satisfies the axiom of choice. For proving the independence of the continuum hypothesis, a partially ordered set was constructed by Cohen, representing  “finite states of knowledge”, and we can use this to form a topos of double negation sheaves known as the Cohen topos. Using the concept of a filter we then obtain a two-valued topos and therefore satisfy all the requirements for a topos-theoretic version of ZFC. However, the continuum hypothesis does not hold in the Cohen topos, thus proving its independence of ZFC.

A similar strategy involving double negation sheaves was used by the mathematician Peter Freyd to develop a topos-theoretic version of Cohen’s proof of the independence of the axiom of choice from the other axioms ZF, using a different underlying category (since a partially ordered set would automatically satisfy the axiom of choice). In both cases the theory of elementary toposes would provide a more “natural” language for Cohen’s original proofs.

III. Geometric Morphisms

We now discuss morphisms between toposes. The elementary topos was inspired by the Grothendieck topos, which was in turn inspired by sheaves on a topological space, so we turn to the classical theory once more and look at morphisms between sheaves. Given a continuous function f: X\rightarrow Y, and a sheaf \mathcal{F} on X, we can define a sheaf, called the direct image sheaf, f_{*}\mathcal{F} on Y by setting f_{*}\mathcal{F}(V)=\mathcal{F}(f^{-1}(V)) for every open subset V\subseteq Y. Similarly, given a sheaf \mathcal{G} on Y we also have the inverse image sheaf; however, it cannot similarly be defined as f^{*}\mathcal{G}(U)=\mathcal{G}(f(U)) for an open subset U\subseteq X, since the image of U in Y may not be an open subset of Y.
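The asymmetry between direct and inverse images can be seen already for finite spaces. In the sketch below we assume a discrete three-point space X, a two-point space Y with a made-up topology, and as our sheaf \mathcal{F} the sheaf of all \{0,1\}-valued functions; all names are illustrative.

```python
from itertools import combinations, product

X = [0, 1, 2]
# X carries the discrete topology: every subset is open.
X_opens = [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]
Y_opens = [frozenset(s) for s in ([], ["b"], ["a", "b"])]
f = {0: "a", 1: "b", 2: "b"}             # a map X -> Y

def preimage(V):
    return frozenset(x for x in X if f[x] in V)

# Continuity: preimages of open sets are open.
continuous = all(preimage(V) in X_opens for V in Y_opens)

def F(U):
    """Sections over U: all {0,1}-valued functions on U, as tuples of pairs."""
    pts = sorted(U)
    return {tuple(zip(pts, vals)) for vals in product([0, 1], repeat=len(pts))}

def direct_image(V):
    """(f_* F)(V) = F(f^{-1}(V))."""
    return F(preimage(V))

# Open sets of X whose image in Y fails to be open.
not_open = [U for U in X_opens if frozenset(f[x] for x in U) not in Y_opens]
```

The direct image is defined on every open set of Y, while the open set \{0\} of X has image \{a\}, which is not open in Y, so the naive formula for the inverse image breaks down there.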

This can be remedied by the process of “sheafification”; we think instead in terms of the “stalks” of the sheaf \mathcal{G}, i.e. sets that are in some way “parametrized” by the points y of Y. Then we can obtain sets “parametrized” by the points f(x); these sets then form the inverse image sheaf f^{*}\mathcal{G} on X. The points of a space are of course not open sets in the usual topologies that we use, so the definition of a stalk involves the “direct limit” of open sets containing the point. It is worth noting that the inverse image “preserves” finite limits.

The process of taking the direct image sheaf can be expressed as a functor from the category \text{Sh }(X) of sheaves on X to the category \text{Sh }(Y) of sheaves on Y. The inverse image functor is then the left adjoint to the direct image functor, and it has the property that it preserves finite limits.

A geometric morphism is a pair of adjoint functors between toposes such that the left adjoint preserves finite limits. This allows us to form the category \mathfrak{Top} whose objects are elementary toposes and whose morphisms are geometric morphisms. The natural transformations between geometric morphisms, called geometric transformations, give the category \mathfrak{Top} the extra structure of a 2-category. There are also logical morphisms between toposes, which preserve all structure, and with them and their natural transformations we can form the 2-category \mathfrak{Log}.

We can also define \mathfrak{Top}/\mathcal{S} as the category whose objects are geometric morphisms p: \mathcal{E}\rightarrow \mathcal{S} and whose morphisms (p: \mathcal{F}\rightarrow \mathcal{S})\rightarrow (q: \mathcal{E}\rightarrow \mathcal{S}) are pairs (f, \alpha) where f: \mathcal{F}\rightarrow \mathcal{E} is a geometric morphism and \alpha: p\cong q\circ f is a geometric transformation. Together with “2-cells” (f, \alpha)\rightarrow (g, \beta) given by geometric transformations f\rightarrow g that are “compatible” in some sense with \alpha and \beta, \mathfrak{Top}/\mathcal{S} also forms a 2-category.

Geometric morphisms can now be used to define the points of a topos. In the category of sets, we can use the morphisms of the set consisting of only one element to all the other sets to indicate the elements of these other sets. The same goes for topological spaces and their points. We have mentioned earlier the category \mathbf{Sets} as the topos of sheaves on a point. Therefore, we define the points of a topos \mathcal{E} as the geometric morphisms from \mathbf{Sets} to \mathcal{E}.

There exist, however, toposes (including Grothendieck toposes) without points. Since sheaves are defined using only open sets, we can deal with such toposes satisfactorily by making use of the concept of locales, which abstract the properties of open sets and the study of topological spaces, while “forgetting” the underlying sets of points. A topos which is equivalent to the category of sheaves on some locale is called a localic topos.

An important result in the theory of localic toposes is Barr’s theorem, which states that for every Grothendieck topos \mathcal{E} there exists a topos \text{Sh }(\mathbf{B}) of sheaves on a locale \mathbf{B} with a “complete” Boolean algebra structure and a surjective geometric morphism \text{Sh }(\mathbf{B})\rightarrow \mathcal{E}. Another important result is Deligne’s theorem, which states that a coherent topos, i.e. a topos \mathcal{E} for which there is a site (\mathbf{C}, J) where \mathbf{C} has finite limits and the Grothendieck topology has a “basis” which consists of finite covering families, has “enough points”, i.e. for any two distinct arrows \alpha: E\rightarrow D and \beta: E\rightarrow D in \mathcal{E} there exists a point p: \mathbf{Sets}\rightarrow \mathcal{E} such that the stalk p^{*}(\alpha) is not equal to the stalk p^{*}(\beta).

We can also use geometric morphisms to define the idea of a classifying topos. A classifying topos is an elementary topos such that objects in any other topos can be “classified” by the geometric morphisms of the topos to the classifying topos. For example, ring objects in any topos \mathcal{E} are classified by the topos \mathbf{Sets}^{\mathbf{fp}\textbf{-}\mathbf{rings}} of set-valued functors on the category of “finitely presented” rings, i.e. presheaves on the opposite category \mathbf{fp}\textbf{-}\mathbf{rings^{op}}. The object of this topos represented by the polynomial ring \mathbf{Z}[X] is then a universal object, such that any ring object in \mathcal{E} can be obtained, up to isomorphism, by pulling back this universal object along a geometric morphism \mathcal{E}\rightarrow \mathbf{Sets}^{\mathbf{fp}\textbf{-}\mathbf{rings}}.

We now combine the idea of classifying toposes (which was inspired by the idea of classifying spaces in algebraic topology) with the applications of topos theory to first-order logic discussed earlier. A theory \mathbb{T} is a set of formulas, called the axioms of the theory, and a model of \mathbb{T} in a topos \mathcal{E} is an interpretation, i.e. an assignment of an object of \mathcal{E} to every type of the first-order language, a subobject of an object of \mathcal{E} to every relation, and a morphism of \mathcal{E} to every function, with quantifiers and logical connectives provided by the corresponding adjoint functors and Heyting algebra structures respectively, such that the axioms of \mathbb{T} hold under this interpretation.

A theory is called a coherent theory if its axioms are of the form \forall x (\phi(x)\Rightarrow \psi(x)), where \phi(x) and \psi(x) are coherent formulas, i.e. formulas which are built up using only the operations of finitary conjunction \wedge, finitary disjunction \vee, and existential quantification \exists. If we also allow the operation of infinitary disjunction \bigvee, then we obtain geometric formulas, and a theory whose axioms are of the form \forall x (\phi(x)\Rightarrow \psi(x)), where \phi(x) and \psi(x) are geometric formulas, is called a geometric theory.

Most theories in mathematics are coherent theories. For those which are not, however, there is a certain process called Morleyization which associates to those theories a coherent theory.

For any coherent theory \mathbb{T}, there exists a classifying topos \mathcal{E}_\mathbb{T} and a universal object U (in this context also called a universal model) such that any model of \mathbb{T} in an elementary topos \mathcal{E} can be obtained as a pullback of U\rightarrow \mathcal{E}_\mathbb{T} along a geometric morphism \mathcal{E}\rightarrow \mathcal{E}_\mathbb{T}.

We mention yet another aspect of topos theory where logic and geometry combine. We have earlier mentioned the theorems of Deligne and Barr in the context of studying toposes as sheaves on locales. Combined with the logical aspects of the toposes, and the theory of classifying toposes, Deligne’s theorem implies that a statement of the form \forall x (\phi(x)\Rightarrow \psi(x)) where \phi(x) and \psi(x) are coherent formulas holds in all models of the coherent theory \mathbb{T} in any topos if and only if it holds in all models of \mathbb{T} in \mathbf{Sets}.

Meanwhile, Barr’s theorem implies that a statement of the form \forall x (\phi(x)\Rightarrow \psi(x)) where \phi(x) and \psi(x) are geometric formulas holds in all models of the geometric theory \mathbb{T} in any topos if  and only if it holds in all models of \mathbb{T} in Boolean toposes.

In this context, Deligne’s theorem and Barr’s theorem respectively correspond to finitary and infinitary versions of a famous theorem in classical logic called Gödel’s completeness theorem.


Topos on Wikipedia

Topos on the nLab

What is … a Topos? by Zhen Lin Low

An Informal Introduction to Topos Theory by Tom Leinster

Topos Theory by Peter T. Johnstone

Sketches of an Elephant: A Compendium of Topos Theory by Peter T. Johnstone

Handbook of Categorical Algebra 3: Categories of Sheaves by Francis Borceux

Sheaves in Geometry and Logic: A First Introduction to Topos Theory by Saunders Mac Lane and Ieke Moerdijk

More Category Theory: The Grothendieck Topos

In Category Theory, we generalized the notion of a presheaf (see Presheaves) to denote a contravariant functor from a category \mathbf{C} to sets. In this post, we do the same to sheaves (see Sheaves).

We note that the notion of an open covering was necessary in order to define the concept of a sheaf, since this was what allowed us to “patch together” the sections of the presheaf over the open subsets of a topological space. So before we can generalize sheaves we must first generalize open coverings and other concepts associated to it, such as intersections.

We begin with the notion of a product, which is a diagram of objects P, X, Y, and morphisms p_{1}: P\rightarrow X and p_{2}: P\rightarrow Y, such that if there is another object Q and morphisms q_{1}: Q\rightarrow X and q_{2}: Q\rightarrow Y, then there is a unique morphism u from Q to P such that p_{1}\circ u=q_{1} and p_{2}\circ u=q_{2}. The object P is often also referred to as the product and written X\times Y.

A related notion is that of a pullback (also called a fiber product), which is a diagram of objects P, X, Y, and Z, and morphisms p_{1}: P\rightarrow X, p_{2}: P\rightarrow Y, f: X\rightarrow Z, and g: Y\rightarrow Z, such that f\circ p_{1}=g\circ p_{2}, and if there is another object Q and morphisms q_{1}: Q\rightarrow X and q_{2}: Q\rightarrow Y with f\circ q_{1}=g\circ q_{2}, then there is a unique morphism u from Q to P such that p_{1}\circ u=q_{1} and p_{2}\circ u=q_{2}. The object P is often also referred to as the fibered product and written X\times_{Z}Y.

Another related concept is that of a terminal object. A terminal object T in a category \mathbf{C} is just an object such that for every other object C in \mathbf{C} there is a unique morphism C\rightarrow T.

Finally, we give the definition of an equalizer. We will need this notion when we construct sheaves on our generalization of the open covering of a topological space. An equalizer is a diagram of objects E, X, Y and morphisms eq: E\rightarrow X, f: X\rightarrow Y, and g: X\rightarrow Y, such that f\circ eq=g\circ eq and if there is another object O and morphism m: O\rightarrow X such that f\circ m=g\circ m, there is a unique morphism u: O\rightarrow E such that eq\circ u=m.

By simply reversing the directions of the morphisms in these definitions, we obtain the “dual” notions of coproduct, pushout (also called fiber coproduct), initial object, and coequalizer.

The objects that we have defined above are called universal constructions and are subsumed under the more general concepts of limits and colimits. These universal constructions are unique up to unique isomorphism (an isomorphism in a category \mathbf{C} is a morphism f: C\rightarrow D in \mathbf{C} for which there exists a necessarily unique morphism g: D\rightarrow C in \mathbf{C}, called the inverse of f, such that f\circ g=1_{D} and g\circ f=1_{C}).

These universal constructions are generalizations of familiar concepts. For example, the product in the category of sets corresponds to the cartesian product, while its dual, the coproduct, corresponds to the disjoint union. The terminal object in the category of sets is any set composed of a single element, since every other set has only one function to it, while its dual, the initial object, is the empty set, which has only one function to every other set.

We now proceed with our generalization of an open covering. Let \mathbf{C} be a category and C an object of \mathbf{C}. A sieve S on C is given by a family of morphisms in \mathbf{C}, all with codomain C, such that whenever a morphism f is in S, it is guaranteed that the composition f\circ g is also in S for all morphisms g for which the composition f\circ g makes sense.

If S is a sieve on C and h: D\rightarrow C is any morphism with codomain C, then we denote by h^{*}(S) the family of morphisms g with codomain D such that the composition h\circ g is in S. h^{*}(S) is a sieve on D.
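To illustrate, consider a category whose objects are the open sets of a small space and whose morphisms are inclusions. Since there is at most one arrow D\rightarrow C, we can identify it with the open set D itself, so a sieve on C becomes a family of open subsets of C closed under passing to smaller open sets. The following sketch uses this identification; the topology and the helper names are made up for the example.

```python
opens = [frozenset(s) for s in ([], [1], [2], [1, 2], [1, 2, 3])]

def is_sieve(S, C):
    """S is a family of opens D <= C (arrows D -> C), closed under precomposition."""
    return all(D <= C for D in S) and all(
        E in S for D in S for E in opens if E <= D)

def generated_sieve(cover):
    """Smallest sieve containing the given arrows U_i -> C."""
    return {E for E in opens if any(E <= U for U in cover)}

def pull_back(S, D):
    """h^*(S) for the inclusion h: D -> C: arrows E -> D whose composite lies in S."""
    return {E for E in opens if E <= D and E in S}

C = frozenset({1, 2, 3})
S = generated_sieve([frozenset({1}), frozenset({2})])
D = frozenset({1, 2})
```

Pulling S back along the inclusion of \{1\} gives the family consisting of the empty set and \{1\}, which is the maximal sieve on \{1\}.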

 We now quote from the book Sheaves in Geometry and Logic: A First Introduction to Topos Theory by Saunders Mac Lane and Ieke Moerdijk the axioms for a Grothendieck topology.

A (Grothendieck) topology on a category \mathbf{C} is a function J which assigns to each object C of \mathbf{C} a collection J(C) of sieves on C, in such a way that

(i) the maximal sieve t_{C}=\{f|\text{cod}(f) =C\} is in J(C);

(ii) (stability axiom) if S\in J(C) , then h^{*}(S)\in J(D) for any arrow h: D\rightarrow C;

(iii) (transitivity axiom) if S\in J(C) and R is any sieve on C such that h^{*} (R)\in J(D) for all h: D\rightarrow C in S, then R\in  J(C).

The intuitions behind these axioms might perhaps best be seen by considering a category whose objects are open sets and whose morphisms are inclusions of these open sets. Axiom (i) essentially says that the open set C is covered by the collection of all its open subsets. Axiom (ii) says that the open subset D of C is covered by the intersections D\cap C_{i} of D with the open subsets C_{i} covering C. Axiom (iii) says that if a collection D_{i,j} of open subsets covers every open subset C_{j} covering C, then the collection D_{i,j} covers C.
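For a finite space the three axioms can be verified exhaustively. In the sketch below we again identify an inclusion D\rightarrow C with the open set D, and we declare a sieve to be covering when the union of its members is all of C. This is an assumption matching the usual notion of open cover, and the topology is a made-up example.

```python
from itertools import combinations

opens = [frozenset(s) for s in ([], [1], [2], [1, 2], [1, 2, 3])]

def below(C):
    return [D for D in opens if D <= C]

def sieves(C):
    """All sieves on C: downward-closed families of opens contained in C."""
    cands = below(C)
    out = []
    for r in range(len(cands) + 1):
        for fam in combinations(cands, r):
            S = frozenset(fam)
            if all(E in S for D in S for E in below(D)):
                out.append(S)
    return out

def covers(S, C):
    """S covers C when the union of its members is C."""
    return frozenset().union(*S) == C

def restrict(S, D):
    """h^*(S) for the inclusion h: D -> C."""
    return frozenset(E for E in S if E <= D)

J = {C: [S for S in sieves(C) if covers(S, C)] for C in opens}

axiom_i = all(frozenset(below(C)) in J[C] for C in opens)
axiom_ii = all(restrict(S, D) in J[D]
               for C in opens for S in J[C] for D in below(C))
axiom_iii = all(R in J[C]
                for C in opens for S in J[C] for R in sieves(C)
                if all(restrict(R, D) in J[D] for D in S))
```

For instance, the open set \{1, 2\} has exactly two covering sieves here: the maximal one, and the one generated by \{1\} and \{2\}.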

We then quote from the same book the definition of a site:

A site will mean a pair (\mathbf{C}, J) consisting of a small category \mathbf{C} and a Grothendieck topology J on \mathbf{C}. If S\in J(C),  one says that S is a covering sieve, or that S covers C (or if necessary, that S J-covers C).

(The book uses the terminology of a small category to specify that the objects and morphisms of the category form a set, instead of a proper class. The terminology of sets and classes was developed to prevent what is known as “Russell’s paradox” and its variants. In many of the posts on this blog we will not need to explicitly specify whether a category is a small category or not.)

We already know how to construct a presheaf on \mathbf{C}; a presheaf is just a contravariant functor from \mathbf{C} to \mathbf{Sets}. Now we just need to generalize the conditions for a presheaf to become a sheaf.

We go back to the conditions that make a (classical) presheaf a sheaf. They can be summarized in the language of category theory by saying that

\displaystyle e:\mathcal{F}(U)\longrightarrow \prod_{i}\mathcal{F}(U_{i})

is the equalizer of

\displaystyle p: \prod_{i}\mathcal{F}(U_{i})\longrightarrow\prod_{i,j}\mathcal{F}(U_{i}\cap U_{j})

and

\displaystyle q: \prod_{i}\mathcal{F}(U_{i})\longrightarrow\prod_{i,j}\mathcal{F}(U_{i}\cap U_{j})

where for s\in \mathcal{F}(U), e(s)=\{s|_{U_{i}}|i\in I\} and for a family s_{i}\in \mathcal{F}(U_{i}),

p\{s_{i}\}=\{s_{i}|_{U_{i}\cap U_{j}}\}, \quad q\{s_{i}\}=\{s_{j}|_{U_{i}\cap U_{j}}\}.
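The equalizer condition says that a family of sections on the U_{i} comes from a unique section on U exactly when the two ways of restricting to the overlaps agree. Here is a minimal sketch for the sheaf of arbitrary functions on a finite set, with a section over U encoded as a dictionary from points to values; the names e, p, q, and glue follow the maps in the text but the implementation is our own illustration.

```python
def restrict(section, V):
    """Restrict a section (a dict point -> value) to the subset V."""
    return {x: v for x, v in section.items() if x in V}

def e(s, cover):
    """Send a section on U to its restrictions to each U_i."""
    return [restrict(s, Ui) for Ui in cover]

def p(family, cover):
    """Entry (i, j) is s_i restricted to the overlap of U_i and U_j."""
    return [[restrict(si, Ui & Uj) for Uj in cover]
            for si, Ui in zip(family, cover)]

def q(family, cover):
    """Entry (i, j) is s_j restricted to the overlap of U_i and U_j."""
    return [[restrict(sj, Ui & Uj) for sj, Uj in zip(family, cover)]
            for Ui in cover]

def glue(family, cover):
    """Return the section s with e(s) == family when p and q agree, else None."""
    if p(family, cover) != q(family, cover):
        return None
    s = {}
    for si in family:
        s.update(si)
    return s

cover = [frozenset({1, 2}), frozenset({2, 3})]
good = [{1: "a", 2: "b"}, {2: "b", 3: "c"}]     # agrees on the overlap {2}
bad = [{1: "a", 2: "b"}, {2: "z", 3: "c"}]      # disagrees on the overlap {2}
```

The family good glues to a single section on \{1, 2, 3\}, while bad does not, since its two members take different values at the point 2.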

The analogous condition for a (generalized) presheaf P on a category \mathbf{C} equipped with a Grothendieck topology J is for

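\displaystyle e: P(C)\longrightarrow\prod_{f\in S}P(\text{dom }f)

to be an equalizer for

\displaystyle p: \prod_{f\in S}P(\text{dom }f)\longrightarrow\prod_{f\in S,\ \text{dom }f=\text{cod }g}P(\text{dom }g)

and

\displaystyle q: \prod_{f\in S}P(\text{dom }f)\longrightarrow\prod_{f\in S,\ \text{dom }f=\text{cod }g}P(\text{dom }g)

for every covering sieve S\in J(C), where the second product runs over all pairs (f, g) with f\in S and \text{dom }f=\text{cod }g. Here e(x)=\{P(f)(x)\}_{f\in S}, while p\{x_{f}\}=\{x_{f\circ g}\} and q\{x_{f}\}=\{P(g)(x_{f})\}.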

We now introduce the notion of equivalent categories. We first establish some more notation. The set of morphisms from an object C to C' in a category \mathbf{C} will be denoted by \text{Hom}_{\mathbf{C}}(C, C'). A functor F: \mathbf{C}\rightarrow \mathbf{D} is called full (respectively faithful) if for any two objects C and C' of \mathbf{C}, the operation

\text{Hom}_{\mathbf{C}}(C, C')\rightarrow \text{Hom}_{\mathbf{D}}(F(C), F(C')),\quad f\mapsto F(f)

is surjective (respectively injective). A functor F: \mathbf{C}\rightarrow \mathbf{D} is called an equivalence of categories if it is full and faithful and if any object D in \mathbf{D} is isomorphic to an object F(C) in the image of F in \mathbf{D}.

We again refer to the book of Mac Lane and Moerdijk for the definition of a Grothendieck topos:

A Grothendieck topos is a category which is equivalent to the category \text{Sh}(\mathbf{C}, J) of sheaves on some site (\mathbf{C}, J).

A Grothendieck topos is often referred to in the literature as some sort of a “generalized space”. In everyday life we think of “space” as something that objects occupy. Or perhaps we may think of a “place” as something that we live in (the word “topos” itself is the Greek word for “place”). The concept of sheaves expresses the idea that when we look at the objects on portions of a space, they can be “patched together” (it seems rather surreal, even unthinkable, for objects in everyday life not to patch together properly).

We have expressed the notion of a topology as being some sort of “arrangement” on a set. A Grothendieck topology is also an arrangement, but instead of making use of the “parts” (subsets) of a set, it instead makes use of the “relations” or “interactions” between objects in a category.

So we can think of the idea of a topos, perhaps, as making a “place” for our objects of interest (such as sets, groups, rings, modules, etc.) to “live in”. This place has an “arrangement” that our objects of interest “respect”, analogous to how open coverings are used to express how objects are “patched together” to form a sheaf on a topological space. This point of view has already become fruitful in algebraic geometry, where the geometry is described in terms of the algebra; so for instance, the “points” of a “shape” correspond to the prime ideals of a ring (see Rings, Fields, and Ideals and More on Ideals), so they may not correspond with the idea of a space we are usually used to, where the points are described by coordinates which are real numbers.

The idea of making a “place” for mathematical objects to “live in” is abstract enough, however, to not be confined to any one branch of mathematics. Thus, the idea of a topos, sufficiently generalized, has found many applications in everything from logic to differential geometry.


Topos on Wikipedia

Sheaf on Wikipedia

Grothendieck Topology on Wikipedia

Sheaves in Geometry and Logic: A First Introduction to Topos Theory by Saunders Mac Lane and Ieke Moerdijk