The Theory of Motives

The theory of motives originated from the observation, sometime in the 1960s, that in algebraic geometry there were several different cohomology theories (see Homology and Cohomology and Cohomology in Algebraic Geometry), such as Betti cohomology, de Rham cohomology, l-adic cohomology, and crystalline cohomology. The search for a “universal cohomology theory”, from which all these other cohomology theories could be obtained, is what led to the theory of motives.

The four cohomology theories enumerated above are examples of what is called a Weil cohomology theory. A Weil cohomology theory, denoted H^{*}, is a functor (see Category Theory) from the category \mathcal{V}(k) of smooth projective varieties over some field k to the category \textbf{GrAlg}(K) of graded K-algebras, for some other field K which must be of characteristic zero, satisfying the following axioms:

(1) (Finite-dimensionality) The homogeneous components H^{i}(X) of H^{*}(X) are finite dimensional for all i, and H^{i}(X)=0 whenever i<0 or i>2n, where n is the dimension of the smooth projective variety X.

(2) (Poincare duality) There is an orientation isomorphism H^{2n}(X)\cong K, and a nondegenerate bilinear pairing H^{i}(X)\times H^{2n-i}(X)\rightarrow H^{2n}(X)\cong K.

(3) (Kunneth formula) There is an isomorphism

\displaystyle H^{*}(X\times Y)\cong H^{*}(X)\otimes H^{*}(Y).

(4) (Cycle map) There is a mapping \gamma_{X}^{i} from C^{i}(X), the abelian group of algebraic cycles of codimension i on X (see Algebraic Cycles and Intersection Theory), to H^{2i}(X), which is functorial with respect to pullbacks and pushforwards, has the multiplicative property \gamma_{X\times Y}^{i+j}(Z\times W)=\gamma_{X}^{i}(Z)\otimes \gamma_{Y}^{j}(W), and such that \gamma_{\text{pt}}^{0} is the inclusion \mathbb{Z}\hookrightarrow K.

(5) (Weak Lefschetz axiom) If W is a smooth hyperplane section of X, and j:W\rightarrow X is the inclusion, the induced map j^{*}:H^{i}(X)\rightarrow H^{i}(W) is an isomorphism for i\leq n-2, and a monomorphism for i\leq n-1.

(6) (Hard Lefschetz axiom) The Lefschetz operator

\displaystyle \mathcal{L}:H^{i}(X)\rightarrow H^{i+2}(X)

given by

\displaystyle \mathcal{L}(x)=x\cdot\gamma_{X}^{1}(W)

for some smooth hyperplane section W of X, with the product \cdot provided by the graded K-algebra structure of H^{*}(X), induces an isomorphism

\displaystyle \mathcal{L}^{i}:H^{n-i}(X)\rightarrow H^{n+i}(X).
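To get a feel for what these axioms say numerically, here is a small sketch in Python that only keeps track of the dimensions of the H^{i}(X) (the Betti numbers). It uses the standard fact that for projective space \mathbb{P}^{n} these dimensions are 1 in even degrees and 0 in odd degrees; the function names are our own and purely illustrative.

```python
def betti_projective_space(n):
    """Dimensions of H^0, ..., H^{2n} of P^n: 1 in even degrees, 0 in odd ones."""
    return [1 if i % 2 == 0 else 0 for i in range(2 * n + 1)]


def kunneth(betti_x, betti_y):
    """Dimensions of H^*(X x Y), read off from the Kunneth formula.

    The degree-m part of a graded tensor product is the sum of the
    products of the degree-i and degree-j parts over all i + j = m.
    """
    out = [0] * (len(betti_x) + len(betti_y) - 1)
    for i, a in enumerate(betti_x):
        for j, b in enumerate(betti_y):
            out[i + j] += a * b
    return out


def satisfies_poincare_duality(betti):
    """Check dim H^i = dim H^{2n-i}, which the nondegenerate pairing forces."""
    return betti == betti[::-1]


p1 = betti_projective_space(1)
p2 = betti_projective_space(2)
product = kunneth(p1, p2)
print(product)
print(satisfies_poincare_duality(product))
```

This is only a check on dimensions, of course; the axioms themselves concern the vector spaces and the maps between them, which the sketch does not model.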

The idea behind the theory of motives is that all Weil cohomology theories should factor through a “category of motives”, i.e. any Weil cohomology theory

\displaystyle H^{*}: \mathcal{V}(k)\rightarrow \textbf{GrAlg}(K)

can be expressed as the following composition of functors:

\displaystyle H^{*}: \mathcal{V}(k)\xrightarrow{h} \mathcal{M}(k)\rightarrow\textbf{GrAlg}(K)

where \mathcal{M}(k) is the category of motives. We can get different Weil cohomology theories, such as Betti cohomology, de Rham cohomology, l-adic cohomology, and crystalline cohomology, via different functors (called realization functors) from the category of motives to a category of graded algebras over some field K. This explains the term “motive”, which actually comes from the French word “motif”, which itself is already used in music and visual arts, among other things, as some kind of common underlying “theme” with different possible manifestations.

Let us now try to construct this category of motives. This category is often referred to in the literature as a “linearization” of the category of smooth projective varieties. This means that we obtain it, in some sense, by starting with the category of smooth projective varieties, but we also want to modify it so that we can do linear algebra, or more properly homological algebra, in it. In other words, we want it to behave like the category of modules over some ring. With this in mind, we want the category to be an abelian category, so that we can make sense of notions such as kernels, cokernels, and exact sequences.

An abelian category is a category that satisfies the following properties:

(1) For any two objects, the morphisms between them form an abelian group, and composition of morphisms is bilinear.

(2) There is a zero object.

(3) There are finite products and coproducts.

(4) Every morphism f:X\rightarrow Y has a kernel and cokernel, and satisfies a decomposition

\displaystyle K\xrightarrow{k} X\xrightarrow{i} I\xrightarrow{j} Y\xrightarrow{c} K'

where K is the kernel of f, K' is the cokernel of f, j\circ i=f, and I is both the kernel of c and the cokernel of k (the objects K and K' here are not to be confused with our notation for fields).

In order to proceed with our construction of the category of motives, which we now know we want to be an abelian category, we discuss the notion of correspondences.

The group of correspondences of degree r from a smooth projective variety X to another smooth projective variety Y, written \text{Corr}^{r}(X,Y), is defined to be the group of algebraic cycles of X\times Y of codimension n+r, where n is the dimension of X, i.e.

\text{Corr}^{r}(X,Y)=C^{n+r}(X\times Y)

A morphism (of varieties, in the usual sense) f:Y\rightarrow X determines a correspondence from X to Y of degree 0 given by the transpose of the graph of f in X\times Y. Therefore we may think of correspondences as generalizations of the usual concept of morphisms of varieties.

As we have learned in Algebraic Cycles and Intersection Theory, whenever we are dealing with algebraic cycles, it is often useful to consider them only up to some equivalence relation. In the aforementioned post we introduced the notion of rational equivalence. This time we consider also homological equivalence and numerical equivalence between algebraic cycles.

We say that two algebraic cycles Z_{1} and Z_{2} are homologically equivalent if they have the same image under the cycle map, and we say that they are numerically equivalent if the intersection numbers Z_{1}\cdot Z and Z_{2}\cdot Z are equal for all Z of complementary dimension. There are other such equivalence relations on algebraic cycles, but in this post we will mostly be using rational equivalence, homological equivalence, and numerical equivalence.

Since correspondences are algebraic cycles, we often consider them only up to these equivalence relations, and denote the quotient group we obtain by \text{Corr}_{\sim}^{r}(X,Y), where \sim is the equivalence relation imposed. For example, for numerical equivalence we write \text{Corr}_{\text{num}}^{r}(X,Y).

Taking the tensor product of the abelian group \text{Corr}_{\sim}^{r}(X,Y) with the rational numbers \mathbb{Q}, we obtain the vector space

\displaystyle \text{Corr}_{\sim}^{r}(X,Y)_{\mathbb{Q}}=\text{Corr}_{\sim}^{r}(X,Y)\otimes_{\mathbb{Z}}\mathbb{Q}

To obtain something closer to an abelian category (more precisely, we will obtain what is known as a pseudo-abelian category, but in the case where the equivalence relation is numerical equivalence, we will actually obtain an abelian category), we need to consider “projectors”, correspondences p of degree 0 from a variety X to itself such that p^{2}=p. So now we form a category, whose objects are h(X,p) for a variety X and projector p, and whose morphisms are given by

\displaystyle \text{Hom}(h(X,p),h(Y,q))=q\circ\text{Corr}_{\sim}^{0}(X,Y)_{\mathbb{Q}}\circ p.

We call this category the category of pure effective motives, and denote it by \mathcal{M}_{\sim}^{\text{eff}}(k). The process described above is also known as passing to the pseudo-abelian (or Karoubian) envelope.
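As a loose analogy only, we can imitate this construction in linear algebra, with a vector space over \mathbb{Q} standing in for a variety and matrices standing in for correspondences. The following Python sketch (all names are ours, and nothing here involves actual algebraic cycles) checks that two matrices are projectors and forms a morphism q\circ f\circ p between the objects they cut out.

```python
from fractions import Fraction


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def is_projector(p):
    """A projector is an idempotent: p composed with itself equals p."""
    return matmul(p, p) == p


half = Fraction(1, 2)
# p projects Q^2 onto the line spanned by (1, 1), killing (1, -1).
p = [[half, half], [half, half]]
# q = 1 - p is the complementary projector, onto the line spanned by (1, -1).
q = [[half, -half], [-half, half]]

# An arbitrary linear map f, standing in for an arbitrary correspondence.
f = [[1, 2], [3, 4]]
# The morphism from (V, p) to (V, q) determined by f is q . f . p
g = matmul(q, matmul(f, p))

print(is_projector(p), is_projector(q))
print(matmul(p, q))
print(g)
```

In this toy setting the pair (V, p) plays the role of the image of p, which is a new object that did not have to exist beforehand. Note also that g is unchanged by composing with q on the left or p on the right, which is why p acts as the identity morphism of (V, p).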

We write h^{i}(X,p) for the objects of \mathcal{M}_{\sim}^{\text{eff}}(k) that map to H^{i}(X). In the case that X is the projective line \mathbb{P}^{1}, and p is the diagonal \Delta_{\mathbb{P}^{1}}, we find that

h(\mathbb{P}^{1},\Delta_{\mathbb{P}^{1}})=h^{0}\mathbb{P}^{1}\oplus h^{2}\mathbb{P}^{1}

which can be rewritten also as

\displaystyle h(\mathbb{P}^{1},\Delta_{\mathbb{P}^{1}})=\mathbb{I}\oplus\mathbb{L}

where \mathbb{I} is the image of a point in the category of pure effective motives, and \mathbb{L} is known as the Lefschetz motive. It is also denoted by \mathbb{Q}(-1). The above decomposition corresponds to the projective line \mathbb{P}^{1} being a union of the affine line \mathbb{A}^{1} and a “point at infinity”, which we may denote by \mathbb{A}^{0}:

\displaystyle \mathbb{P}^{1}=\mathbb{A}^{0}\cup\mathbb{A}^{1}

More generally, we have

\displaystyle h(\mathbb{P}^{n},\Delta_{\mathbb{P}^{n}})=\mathbb{I}\oplus\mathbb{L}\oplus...\oplus\mathbb{L}^{n}

corresponding to

\displaystyle \mathbb{P}^{n}=\mathbb{A}^{0}\cup\mathbb{A}^{1}\cup...\cup\mathbb{A}^{n}.
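One concrete shadow of this decomposition can be seen by counting points. Assuming that the base field is the finite field with p elements for a prime p (an assumption made only for this illustration), the affine space \mathbb{A}^{i} has p^{i} points, so the decomposition above predicts 1+p+...+p^{n} points on \mathbb{P}^{n}. The sketch below compares this with a brute-force count; the function names are hypothetical.

```python
from itertools import product


def count_projective_points(n, p):
    """Count points of P^n over the field with p elements (p prime) by brute force.

    Each point has exactly one representative in F_p^{n+1} whose first
    nonzero coordinate is 1, so we count those representatives.
    """
    count = 0
    for v in product(range(p), repeat=n + 1):
        first_nonzero = next((c for c in v if c != 0), None)
        if first_nonzero == 1:
            count += 1
    return count


def count_from_cells(n, p):
    """Count points using P^n = A^0 u A^1 u ... u A^n, where A^i has p^i points."""
    return sum(p ** i for i in range(n + 1))


for n, p in [(1, 2), (1, 5), (2, 3), (3, 2)]:
    print(n, p, count_projective_points(n, p), count_from_cells(n, p))
```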

The category of effective pure motives is an example of a tensor category. This means it has a bifunctor \otimes: \mathcal{M}_{\sim}^{\text{eff}}\times\mathcal{M}_{\sim}^{\text{eff}}\rightarrow\mathcal{M}_{\sim}^{\text{eff}} which generalizes the usual notion of a tensor product, and in this particular case it is given by taking the product of two varieties. We can ask for more, however, and construct a category of motives which is not just a tensor category but a rigid tensor category, which provides us with a notion of duals.

By formally inverting the Lefschetz motive (the formal inverse of the Lefschetz motive is then known as the Tate motive, and is denoted by \mathbb{Q}(1)), we can obtain this rigid tensor category, whose objects are triples h(X,p,m), where X is a variety, p is a projector, and m is an integer. The morphisms of this category are given by

\displaystyle \text{Hom}(h(X,p,n),h(Y,q,m))=q\circ\text{Corr}_{\sim}^{n-m}(X,Y)_{\mathbb{Q}}\circ p.

This category is called the category of pure motives, and is denoted by \mathcal{M}_{\sim}(k). The category \mathcal{M}_{\text{rat}}(k) is called the category of Chow motives, while the category \mathcal{M}_{\text{num}}(k) is called the category of Grothendieck (or numerical) motives.

The category of Chow motives has the advantage that it is known to be “universal”, in the sense that every Weil cohomology theory factors through it, as discussed earlier; however, in general it is not even abelian, which is a desirable property we would like our category of motives to have. Meanwhile, the category of Grothendieck motives is known to be abelian, but it is not yet known if it is universal. If the so-called “standard conjectures on algebraic cycles”, which we will enumerate below, are proved, then the category of Grothendieck motives will be known to be universal.

We have seen that the category of pure motives forms a rigid tensor category. Closely related to this concept, and of interest to us, is the notion of a Tannakian category. More precisely, a Tannakian category is a k-linear rigid abelian tensor category with an exact faithful tensor functor (called a fiber functor) to the category of finite-dimensional vector spaces over some field extension K of k.

One of the things that makes Tannakian categories interesting is that there is an equivalence of categories between a Tannakian category \mathcal{C} and the category \text{Rep}_{G} of finite-dimensional linear representations of the group G of automorphisms of its fiber functor, which is also known as the Tannakian Galois group, or, if the Tannakian category is a “category of motives” of some sort, the motivic Galois group. This aspect of Tannakian categories may be thought of as a higher-dimensional analogue of the classical theory of Galois groups, which can be stated as a (contravariant) equivalence of categories between the category of finite étale k-algebras (finite products of finite separable field extensions of k) and the category of finite sets equipped with a continuous action of the Galois group \text{Gal}(\bar{k}/k), where \bar{k} is a separable closure of k (which coincides with the algebraic closure when k is perfect).

So we see that being a Tannakian category is yet another desirable property that we would like our category of motives to have. For this not only do we have to tweak the tensor product structure of our category, we also need certain conjectural properties to hold. These are the same conjectures we have hinted at earlier, called the standard conjectures on algebraic cycles, formulated by Alexander Grothendieck at around the same time he initially developed the theory of motives.

These conjectures have some very important consequences in algebraic geometry, and while they remain unproved to this day, the search for their proof (or disproof) is an important part of modern mathematical research on the theory of motives. They are the following:

(A) (Standard conjecture of Lefschetz type) For i\leq n, the operator \Lambda defined by

\displaystyle \Lambda=(\mathcal{L}^{n-i+2})^{-1}\circ\mathcal{L}\circ (\mathcal{L}^{n-i}):H^{i}\rightarrow H^{i-2}

\displaystyle \Lambda=(\mathcal{L}^{n-i})\circ\mathcal{L}\circ (\mathcal{L}^{n-i+2})^{-1}:H^{2n-i+2}\rightarrow H^{2n-i}

is induced by algebraic cycles.

(B) (Standard conjecture of Hodge type) For all i\leq n/2, the pairing

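\displaystyle x,y\mapsto (-1)^{i}(\mathcal{L}^{n-2i}x\cdot y),

restricted to the primitive classes of algebraic cycles of codimension i (those classes x for which \mathcal{L}^{n-2i+1}x=0), is positive definite.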

(C) (Standard conjecture of Kunneth type) The projectors H^{*}(X)\rightarrow H^{i}(X) are induced by algebraic cycles in X\times X with rational coefficients. This implies the following decomposition of the diagonal:

\displaystyle \Delta_{X}=\pi_{0}+...+\pi_{2n}

which in turn implies the decomposition

\displaystyle h(X,\Delta_{X},0)=h(X,\pi_{0},0)\oplus...\oplus h(X,\pi_{2n},0)

which, writing h(X,\Delta_{X},0) as hX and h(X,\pi_{i},0) as h^{i}(X), we can also compactly and suggestively write as

\displaystyle hX=h^{0}X\oplus...\oplus h^{2n}X.

In other words, every object hX=h(X,\Delta_{X},0) of our “category of motives” decomposes into graded “pieces” h^{i}(X)=h(X,\pi_{i},0) of pure “weight” i. We have already seen earlier that this is indeed the case when X=\mathbb{P}^{n}. We will need this conjecture to hold if we want our category to be a Tannakian category.

(D) (Standard conjecture on numerical equivalence and homological equivalence) If an algebraic cycle is numerically equivalent to zero, then its cohomology class is zero. If the category of Grothendieck motives is to be “universal”, so that every Weil cohomology theory factors through it, this conjecture must be satisfied.

In Algebraic Cycles and Intersection Theory and Some Useful Links on the Hodge Conjecture, Kahler Manifolds, and Complex Algebraic Geometry, we have made mention of the two famous conjectures in algebraic geometry known as the Hodge conjecture and the Tate conjecture. In fact, these two closely related conjectures can be phrased in the language of motives as the conjectures stating that the realization functors from the category of motives to the category of pure Hodge structures and continuous l-adic representations of \text{Gal}(\bar{k}/k), respectively, be fully faithful. These conjectures are closely related to the standard conjectures on algebraic cycles as well.

We have now constructed the category of pure motives, for smooth projective varieties. For more general varieties and schemes, there is an analogous idea of “mixed motives”, which at the moment remain conjectural, although there exist several related constructions which are the closest thing we currently have to such a theory of mixed motives.

If we want to construct a theory of mixed motives, instead of Weil cohomology theories we must consider what are known as “mixed Weil cohomology theories”, which are expected to have the following properties:

(1) (Homotopy invariance) The projection \pi:X\times\mathbb{A}^{1}\rightarrow X induces an isomorphism

\displaystyle \pi^{*}:H^{*}(X)\xrightarrow{\cong}H^{*}(X\times\mathbb{A}^{1})

(2) (Mayer-Vietoris sequence) If U and V are open subsets of X which together cover X, then there is a long exact sequence

\displaystyle ...\rightarrow H^{i-1}(U\cap V)\rightarrow H^{i}(X)\rightarrow H^{i}(U)\oplus H^{i}(V)\rightarrow H^{i}(U\cap V)\rightarrow...

(3) (Duality) There is a duality between cohomology H^{*} and cohomology with compact support H_{c}^{*}.

(4) (Kunneth formula) This is the same axiom as the one in the case of pure motives.

We would like a category of mixed motives, which serves as an analogue to the category of pure motives in that all mixed Weil cohomology theories factor through it, but as mentioned earlier, no such category exists at the moment. However, the mathematicians Annette Huber-Klawitter, Masaki Hanamura, Marc Levine, and Vladimir Voevodsky have constructed different versions of a triangulated category of mixed motives, denoted \mathcal{DM}(k).

A triangulated category \mathcal{T} is an additive category with an automorphism T: \mathcal{T}\rightarrow\mathcal{T} called the “shift functor” (we will also denote T(X) by X[1], and T^{n}(X) by X[n], for n\in\mathbb{Z}) and a family of “distinguished triangles”

\displaystyle X\rightarrow Y\rightarrow Z\rightarrow X[1]

which satisfies the following axioms:

(1) For any object X of \mathcal{T}, the triangle X\xrightarrow{\text{id}}X\rightarrow 0\rightarrow X[1] is a distinguished triangle.

(2) For any morphism u:X\rightarrow Y of \mathcal{T}, there is an object Z of \mathcal{T} such that X\xrightarrow{u}Y\rightarrow Z\rightarrow X[1] is a distinguished triangle.

(3) Any triangle isomorphic to a distinguished triangle is a distinguished triangle.

(4) If X\rightarrow Y\rightarrow Z\rightarrow X[1] is a distinguished triangle, then the two “rotations” Y\rightarrow Z\rightarrow X[1]\rightarrow Y[1] and Z[-1]\rightarrow X\rightarrow Y\rightarrow Z are also distinguished triangles.

(5) Given two distinguished triangles X\xrightarrow{u}Y\xrightarrow{v}Z\xrightarrow{w}X[1] and X'\xrightarrow{u'}Y'\xrightarrow{v'}Z'\xrightarrow{w'}X'[1] and morphisms f:X\rightarrow X' and g:Y\rightarrow Y' such that the square “commutes”, i.e. u'\circ f=g\circ u, there exists a morphism h:Z\rightarrow Z' such that all other squares commute.

(6) Given three distinguished triangles X\xrightarrow{u}Y\xrightarrow{j}Z'\xrightarrow{k}X[1], Y\xrightarrow{v}Z\xrightarrow{l}X'\xrightarrow{i}Y[1], and X\xrightarrow{v\circ u}Z\xrightarrow{m}Y'\xrightarrow{n}X[1], there exists a distinguished triangle Z'\xrightarrow{f}Y'\xrightarrow{g}X'\xrightarrow{h}Z'[1] such that “everything commutes”.

A t-structure on a triangulated category \mathcal{T} is made up of two full subcategories \mathcal{T}^{\geq 0} and \mathcal{T}^{\leq 0} satisfying the following properties (writing \mathcal{T}^{\leq n} and \mathcal{T}^{\geq n} to denote \mathcal{T}^{\leq 0}[-n] and \mathcal{T}^{\geq 0}[-n] respectively):

(1) \mathcal{T}^{\leq -1}\subset \mathcal{T}^{\leq 0} and \mathcal{T}^{\geq 1}\subset \mathcal{T}^{\geq 0}

(2) \displaystyle \text{Hom}(X,Y)=0 for any object X of \mathcal{T}^{\leq 0} and any object Y of \mathcal{T}^{\geq 1}

(3) for any object Y of \mathcal{T} we have a distinguished triangle

\displaystyle X\rightarrow Y\rightarrow Z\rightarrow X[1]

where X is an object of \mathcal{T}^{\leq 0} and Z is an object of \mathcal{T}^{\geq 1}.

The full subcategory \mathcal{T}^{0}=\mathcal{T}^{\leq 0}\cap\mathcal{T}^{\geq 0} is called the heart of the t-structure, and it is an abelian category.

It is conjectured that the category of mixed motives \mathcal{MM}(k) is the heart of the t-structure of the triangulated category of mixed motives \mathcal{DM}(k).

Voevodsky’s construction proceeds in a manner somewhat analogous to the construction of the category of pure motives as above, starting with schemes (say, over a field k, although a more general scheme may be used) as objects and correspondences as morphisms, but then makes use of concepts from abstract homotopy theory, such as taking the homotopy category of bounded complexes, and localization with respect to a certain subcategory, before passing to the pseudo-abelian envelope and then formally inverting the Tate object \mathbb{Z}(1). The triangulated category obtained is called the category of geometric motives, and is denoted by \mathcal{DM}_{\text{gm}}(k). The schemes and correspondences involved in the construction of \mathcal{DM}_{\text{gm}}(k) are required to satisfy certain properties which eliminate the need to consider the equivalence relations which form a large part of the study of the category of pure motives.

Closely related to the triangulated category of mixed motives is motivic cohomology, which is defined in terms of the former as

\displaystyle H^{i}(X,\mathbb{Z}(m))=\text{Hom}_{\mathcal{DM}(k)}(X,\mathbb{Z}(m)[i])

where \mathbb{Z}(m) is the tensor product of m copies of the Tate object \mathbb{Z}(1), and the notation \mathbb{Z}(m)[i] tells us that the shift functor of the triangulated category is applied to the object \mathbb{Z}(m) i times.

Motivic cohomology is related to the Chow group, which we have introduced in Algebraic Cycles and Intersection Theory, and also to algebraic K-theory, which is another way by which the ideas of homotopy theory are applied to more general areas of abstract algebra and linear algebra. These ideas were used by Voevodsky to prove several related theorems, from the Milnor conjecture to its generalization, the Bloch-Kato conjecture (also known as the norm residue isomorphism theorem).

Historically, one of the motivations for Grothendieck’s attempt to obtain a universal cohomology theory was to prove the Weil conjectures, which are a higher-dimensional analogue of the Riemann hypothesis for curves over finite fields first proved by Andre Weil himself (see The Riemann Hypothesis for Curves over Finite Fields). In fact, if the standard conjectures on algebraic cycles are proved, then a proof of the Weil conjectures would follow via an approach that closely mirrors Weil’s original proof (since cohomology provides a Lefschetz fixed-point formula – we have mentioned in The Riemann Hypothesis for Curves over Finite Fields that the study of fixed points is an important part of Weil’s proof). The last of the Weil conjectures was eventually proved by Grothendieck’s student Pierre Deligne, but via a different approach that bypassed the standard conjectures. A proof of the standard conjectures, which would lead to a perhaps more elegant proof of the Weil conjectures, is still being pursued to this day.

The theory of motives is not only related to analogues of the Riemann hypothesis, which concerns the location of zeroes of L-functions, but to L-functions in general. For instance, it is also related to the Langlands program, which concerns another aspect of L-functions, namely their analytic continuation and functional equation, and to the Birch and Swinnerton-Dyer conjecture, which concerns their values at special points.

We recall from The Riemann Hypothesis for Curves over Finite Fields that the Frobenius morphism played an important part in counting the points of a curve over a finite field, which in turn we needed to define the zeta function (of which the L-function can be thought of as a generalization) of the curve. The Frobenius morphism is an element of the Galois group, and we recall that a category of motives which is a Tannakian category is equivalent to the category of representations of its motivic Galois group. Therefore we can see how we can define “motivic L-functions” using the theory of motives.

As the L-functions occupy a central place in many areas of modern mathematics, the theory of motives promises much to be gained from its study, if only we could make progress in deciphering the many mysteries that surround it, of which we have only scratched the surface in this post. The applications of motives are not limited to L-functions either – the study of periods, which relate Betti cohomology and de Rham cohomology, and lead to transcendental numbers which can be defined using only algebraic concepts, is also strongly connected to the theory of motives. Recent work by the mathematicians Alain Connes and Matilde Marcolli has also suggested applications to physics, particularly in relation to Feynman diagrams in quantum field theory. There is also another generalization of the theory of motives, developed by Maxim Kontsevich, in the context of noncommutative geometry.


Weil Cohomology Theory on Wikipedia

Motive on Wikipedia

Standard Conjectures on Algebraic Cycles on Wikipedia

Motive on nLab

Pure Motive on nLab

Mixed Motive on nLab

The Tate Conjecture over Finite Fields on Hard Arithmetic

What is…a Motive? by Barry Mazur

Motives – Grothendieck’s Dream by James S. Milne

Noncommutative Geometry, Quantum Fields, and Motives by Alain Connes and Matilde Marcolli

Algebraic Cycles and the Weil Conjectures by Steven L. Kleiman

The Standard Conjectures by Steven L. Kleiman

Feynman Motives by Matilde Marcolli

Une Introduction aux Motifs (Motifs Purs, Motifs Mixtes, Periodes) by Yves Andre


Basics of Math and Physics

I’ve added another new page to the blog, called Basics of Math and Physics. As most of the posts lately have tackled subjects of increasing technical sophistication, I thought it would be a good idea to collect some of the more introductory posts in one page. It is not an exhaustive list, however, as this blog has not tackled all the basic subjects in math and physics; for example, there are no posts yet discussing real analysis or complex analysis. Perhaps if these subjects are tackled in future posts they will be added to that page too. It is also best supplemented with the references listed in Book List.

Some Useful Links on the History of Algebraic Geometry

It’s been a while since I’ve posted on this blog, but there are some posts I’m currently working on about some subjects I’m also currently studying (that’s why it’s taking so long, as I’m trying to digest the ideas as much as I can before I can post about it). But anyway, for the moment, in this short post I’ll be putting up some links to articles on the history of algebraic geometry. Aside from telling an interesting story on its own, there is also much to be learned about a subject from studying its historical development.

We know that the origins of algebraic geometry can be traced back to Rene Descartes and Pierre de Fermat in the 17th century. This is the high school subject also known as “analytic geometry” (which, as we have mentioned in Basics of Algebraic Geometry, can be some rather confusing terminology, because in modern times the word “analytic” is usually used to refer to concepts in complex calculus).

The so-called “analytic geometry” seems to be a rather straightforward subject compared to modern-day algebraic geometry, which, as may be seen on many of the previous posts on this blog, is very abstract (but it is also this abstraction that gives it its power). How did this transformation come to be?

The mathematician Jean Dieudonne, while perhaps more known for his work in the branch of mathematics we call analysis (the more high-powered version of calculus), also served as adviser to Alexander Grothendieck, one of the most important names in the development of modern algebraic geometry. Together they wrote the influential work known as Elements de Geometrie Algebrique, often simply referred to as EGA. Dieudonne was also among the founding members of the “Bourbaki group”, a group of mathematicians who greatly influenced the development of modern mathematics. Himself a part of its development, Dieudonne wrote many works on the history of mathematics, among them the following article on the history of algebraic geometry which can be read for free on the website of the Mathematical Association of America:

The Historical Development of Algebraic Geometry by Jean Dieudonne

But before the sweeping developments instituted by Alexander Grothendieck, the modern revolution in algebraic geometry was first started by the mathematicians Oscar Zariski and Andre Weil (we discussed some of Weil’s work in The Riemann Hypothesis for Curves over Finite Fields). Zariski himself learned from the so-called “Italian school of algebraic geometry”, particularly the mathematicians Guido Castelnuovo, Federigo Enriques, and Francesco Severi.

At the International Congress of Mathematicians in 1950, both Zariski and Weil presented, separately, a survey of the developments in algebraic geometry at the time, explaining how the new “abstract algebraic geometry” was different from the old “classical algebraic geometry”, and the new advantages it presented. The proceedings of this conference are available for free online:

Proceedings of the 1950 International Congress of Mathematicians, Volume I

Proceedings of the 1950 International Congress of Mathematicians, Volume II

The articles by Weil and Zariski can be found in the second volume, but I included also the first volume for “completeness”.

All proceedings of the International Congress of Mathematicians, which is held every four years, are actually available for free online:

Proceedings of the International Congress of Mathematicians, 1983-2010

The proceedings of the 2014 International Congress of Mathematicians in Seoul, Korea, can be found here:

Proceedings of the 2014 International Congress of Mathematicians

Going back to algebraic geometry, a relatively easy to understand (for those with some basic mathematical background, anyway) summary of Alexander Grothendieck’s work in algebraic geometry can be found in the following article by Colin McLarty, published in the April 2016 issue of the Notices of the American Mathematical Society:

How Grothendieck Simplified Algebraic Geometry by Colin McLarty

Tangent Spaces in Algebraic Geometry

We have discussed the notion of a tangent space in Differentiable Manifolds Revisited in the context of differential geometry. In this post we take on the same topic, but this time in the context of algebraic geometry, where it is also known as the Zariski tangent space (when no confusion arises, however, it is often simply referred to as the tangent space).

This will present us with challenges, since the concept of the tangent space is perhaps best tackled using the methods of calculus, but in algebraic geometry, we want to have a notion of tangent spaces in cases where we would not usually think of calculus as being applicable, for instance in the case of varieties over finite fields. In other words, we want our treatment to be algebraic. Nevertheless, we will use the methods of calculus as an inspiration.

We don’t want to be too dependent on the parts of calculus that make use of properties of the real and complex numbers that will not carry over to the more general cases. Fortunately, if we are dealing with polynomials, we can just “borrow” the “power rule” of calculus, since that “rule” only makes use of algebraic procedures, and we need not make use of sequences, limits, and so on. Namely, if we have a polynomial given by

\displaystyle f=\sum_{j=0}^{n}c_{j}x^{j}

We set

\displaystyle \frac{\partial f}{\partial x}=\sum_{j=1}^{n}jc_{j}x^{j-1}

We recall the rules for partial derivatives – in the case that we are differentiating with respect to some variable x, we simply treat all the other variables as constants, and follow the usual rules of differential calculus. With these rules, we can now define the tangent space, at the point P with coordinates (a_{1},a_{2},...,a_{n}), of the algebraic set defined by a polynomial equation f=0 in the variables x_{1},x_{2},...,x_{n}, as the algebraic set which satisfies the equation

\displaystyle \sum_{j}\frac{\partial f}{\partial x_{j}}(P)(x_{j}-a_{j})=0

For example, consider the parabola given by the equation y-x^{2}=0. Let us take the tangent space at the point P with coordinates x=1, y=1. The procedure above gives us

\displaystyle \frac{\partial f}{\partial x}(P)(x-1)+\frac{\partial f}{\partial y}(P)(y-1)=0

where f=y-x^{2}, so that

\displaystyle \frac{\partial f}{\partial x}=-2x

\displaystyle \frac{\partial f}{\partial y}=1

We then have

\displaystyle -2x|_{x=1,y=1}(x-1)+1|_{x=1,y=1}(y-1)=0

\displaystyle -2(1)(x-1)+1(y-1)=0

\displaystyle -2x+2+y-1=0

\displaystyle y-2x+1=0
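Since the “power rule” is purely algebraic, the whole computation can be carried out by a short program that never takes a limit. The following is a minimal sketch in Python, under our own (hypothetical) convention that a polynomial is stored as a dictionary sending exponent tuples to coefficients; the optional modulus argument, also our own addition, reduces everything modulo a prime so that the same procedure applies over a finite field.

```python
def partial(poly, var):
    """Formal partial derivative with respect to variable number `var` (power rule)."""
    result = {}
    for exps, coeff in poly.items():
        e = exps[var]
        if e == 0:
            continue  # the term does not involve this variable
        new_exps = exps[:var] + (e - 1,) + exps[var + 1:]
        result[new_exps] = result.get(new_exps, 0) + e * coeff
    return result


def evaluate(poly, point, modulus=None):
    """Evaluate the polynomial at a point, optionally reducing modulo `modulus`."""
    total = 0
    for exps, coeff in poly.items():
        term = coeff
        for x, e in zip(point, exps):
            term *= x ** e
        total += term
    return total % modulus if modulus else total


def tangent_space(poly, point, modulus=None):
    """Return (g, c) such that the tangent space at `point` is sum(g[j] * x_j) + c = 0."""
    grad = [evaluate(partial(poly, j), point, modulus) for j in range(len(point))]
    const = -sum(g * a for g, a in zip(grad, point))
    if modulus:
        const %= modulus
    return grad, const


# f = y - x^2, with exponent tuples ordered as (power of x, power of y)
parabola = {(0, 1): 1, (2, 0): -1}
print(tangent_space(parabola, (1, 1)))             # coefficients of -2x + y + 1 = 0
print(tangent_space(parabola, (1, 1), modulus=5))  # the same line with coefficients mod 5
```

The first call recovers the line y-2x+1=0 found above. The second gives 3x+y+1=0 over the field with five elements, which is the same equation since 3 and -2 agree modulo 5.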

The parabola is graphed (its real part, at least, using the Desmos graphing calculator) in the diagram below in red, with its tangent space, a line, in blue:


In case the reader is not convinced by our “borrowing” of concepts from calculus and claiming that they are “algebraic” in the specific case we are dealing with, another way to look at things without making reference to calculus is the following procedure, which comes from basic high school-level “analytic geometry”. First we translate the coordinate system so that the origin is at the point P where we want to take the tangent space. Then we simply take the “linear part” of the polynomial equation, then translate again so that the origin is where it used to be originally. This gives the same results as the earlier procedure (the technical justification is given by the theory of Taylor series). More explicitly we have:

\displaystyle y-x^{2}=0

Translating the origin of coordinates to the point x=1, y=1, we have

\displaystyle (y+1)-(x+1)^{2}=0

\displaystyle y+1-(x^{2}+2x+1)=0

\displaystyle y+1-x^{2}-2x-1=0

\displaystyle y-x^{2}-2x=0

We take only the linear part, which is

\displaystyle y-2x=0

And then we translate the origin of coordinates back to the original one:

\displaystyle (y-1)-2(x-1)=0

\displaystyle y-1-2x+2=0

\displaystyle y-2x+1=0

which is the same result we had earlier.

But it may happen that the polynomial has no “linear part”. In this case the tangent space is the entirety of the ambient space. However, there is another related concept which may be useful in these cases, called the tangent cone. The tangent cone is the algebraic set which satisfies the equations we get by extracting the lowest degree part of the polynomial, which may or may not be the linear part. In the case that the lowest degree part is the linear part, the tangent space and the tangent cone coincide, and if this holds for all points of a variety, we say that the variety is nonsingular.

To give an explicit example, consider the curve y^{2}=x^{3}+x^{2}, as seen in the diagram below in red (its real part graphed once again using the Desmos graphing calculator):

[Figure: the real part of the curve y^{2}=x^{3}+x^{2} in red, with the lines y=x and y=-x in blue and orange]

The equation that defines this curve has no linear part. Therefore the tangent space at the origin consists of all x and y which satisfy the trivial equation 0=0; but then, all values of x and y satisfy this equation, and therefore the tangent space is the “affine plane” \mathbb{A}^{2}. However, the lowest order part is y^{2}=x^{2}, which is satisfied by all points which also satisfy either of the two equations y=x or y=-x. These points form the blue and orange diagonal lines in the diagram. Since the tangent space and the tangent cone do not agree, the curve is singular at the origin.
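
The “translate, then keep the lowest degree part” procedure is also easy to sketch in code. The following is an illustrative sketch only, again assuming a polynomial in x and y is stored as a dictionary from exponent pairs to coefficients; translate and lowest_degree_part are hypothetical helper names. The result is expressed in the translated coordinates, where the chosen point is the origin.

```python
from math import comb

def translate(poly, point):
    """Substitute x -> x + a, y -> y + b so that point = (a, b) becomes the origin."""
    a, b = point
    result = {}
    for (i, j), c in poly.items():
        # expand (x + a)^i * (y + b)^j with the binomial theorem
        for k in range(i + 1):
            for l in range(j + 1):
                coeff = c * comb(i, k) * a**(i - k) * comb(j, l) * b**(j - l)
                result[(k, l)] = result.get((k, l), 0) + coeff
    return {e: c for e, c in result.items() if c != 0}

def lowest_degree_part(poly):
    """Keep only the terms of minimal total degree."""
    d = min(i + j for (i, j) in poly)
    return {e: c for e, c in poly.items() if sum(e) == d}

# The curve y^2 - x^3 - x^2 = 0 at the origin: the lowest degree part is y^2 - x^2.
cubic = {(0, 2): 1, (3, 0): -1, (2, 0): -1}
cone = lowest_degree_part(translate(cubic, (0, 0)))
print(cone)

# The parabola y - x^2 = 0 at (1, 1): the lowest degree part is the linear part y - 2x.
parabola = {(0, 1): 1, (2, 0): -1}
linear = lowest_degree_part(translate(parabola, (1, 1)))
print(linear)
```

The sketch assumes the chosen point lies on the curve, so that the constant term of the translated polynomial vanishes; otherwise the lowest degree part would just be that constant.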

We can also define the tangent space in a more abstract manner, using the concepts we have discussed in Localization. Let \mathfrak{m} be the unique maximal ideal of the local ring \mathcal{O}_{X,P}, and let \mathfrak{m}^{2} be the product ideal whose elements are the sums of products of elements of \mathfrak{m}. The quotient \mathfrak{m}/\mathfrak{m}^{2} is then a vector space over the residue field k. The tangent space of X at P is then defined as the dual of this vector space (the vector space of linear transformations from \mathfrak{m}/\mathfrak{m}^{2} to k). The vector space \mathfrak{m}/\mathfrak{m}^{2} itself is called the cotangent space of X at P. We can think of its elements as linear polynomial functions on the tangent space. There is an analogous abstract definition of the tangent cone, namely as the spectrum of the graded ring \oplus_{i\geq 0}\mathfrak{m}^{i}/\mathfrak{m}^{i+1}.


Zariski Tangent Space on Wikipedia

Tangent Cone on Wikipedia

Desmos Graphing Calculator

Algebraic Geometry by J.S. Milne

Algebraic Geometry by Andreas Gathmann

Algebraic Geometry by Robin Hartshorne

Metric, Norm, and Inner Product

In Vector Spaces, Modules, and Linear Algebra, we defined vector spaces as sets closed under addition and scalar multiplication (in this case the scalars are the elements of a field; if they are elements of a ring which is not a field, we have not a vector space but a module). We have seen since then that the study of vector spaces, linear algebra, is very useful, interesting, and ubiquitous in mathematics.

In this post we discuss vector spaces with some more additional structure – which will give them a topology (Basics of Topology and Continuous Functions), giving rise to topological vector spaces. This also leads to the branch of mathematics called functional analysis, which has applications to subjects such as quantum mechanics, aside from being an interesting subject in itself. Two of the important objects of study in functional analysis that we will introduce by the end of this post are Banach spaces and Hilbert spaces.

I. Metric

We start with the concept of a metric. We have to get two things out of the way. First, this is not the same as the metric tensor in differential geometry, although it also gives us a notion of a “distance”. Second, the concept of metric is not limited to vector spaces only, unlike the other two concepts we will discuss in this post. It is actually something that we can put on a set to define a topology, called the metric topology.

As we discussed in Basics of Topology and Continuous Functions, we may think of a topology as an “arrangement”. The notion of “distance” provided by the metric gives us an intuitive such arrangement. We will make this concrete shortly, but first we give the technical definition of the metric. We quote from the book Topology by James R. Munkres:

A metric on a set X is a function

\displaystyle d: X\times X\rightarrow \mathbb{R}

having the following properties:

1) d(x, y)\geq 0 for all x,y \in X; equality holds if and only if x=y.

2) d(x,y)=d(y,x) for all x,y \in X.

3) (Triangle inequality) d(x,y)+d(y,z)\geq d(x,z), for all x,y,z \in X.

We quote from the same book another important definition:

Given a metric d on X, the number d(x, y) is often called the distance between x and y in the metric d. Given \epsilon >0, consider the set

\displaystyle B_{d}(x,\epsilon)=\{y|d(x,y)<\epsilon\}

of all points y whose distance from x is less than \epsilon. It is called the \epsilon-ball centered at x. Sometimes we omit the metric d from the notation and write this ball simply as B(x,\epsilon) when no confusion will arise.

Finally, once more from the same book, we have the definition of the metric topology:

If d is a metric on the set X, then the collection of all \epsilon-balls B_{d}(x,\epsilon), for x\in X and \epsilon>0, is a basis for a topology on X, called the metric topology induced by d.

We recall that the basis of a topology is a collection of open sets such that every other open set can be described as a union of the elements of this collection. A set with a specific metric that makes it into a topological space with the metric topology is called a metric space.

An example of a metric on the set \mathbb{R}^{n} is given by the ordinary “distance formula”:

\displaystyle d(x,y)=\sqrt{\sum_{i=1}^{n}(x_{i}-y_{i})^{2}}

Note: We have followed the notation of the book of Munkres, which may be different from the usual notation. Here x and y are two different points on \mathbb{R}^{n}, and x_{i} and y_{i} are their respective coordinates.

The above metric is not the only one possible however. There are many others. For instance, we may simply put

\displaystyle d(x,y)=0 if \displaystyle x=y

\displaystyle d(x,y)=1 if \displaystyle x\neq y.

This is called the discrete metric, and one may check that it satisfies the definition of a metric. One may think of it as something that simply specifies the distance from a point to itself as “near”, and the distance to any other point that is not itself as “far”. There is also the taxicab metric, given by the following formula:

\displaystyle d(x,y)=\sum_{i=1}^{n}|x_{i}-y_{i}|

One way to think of the taxicab metric, which reflects the origins of the name, is that it is the “distance” important to taxi drivers (needed to calculate the fare) in a certain city with perpendicular roads. The ordinary distance formula is not very helpful since one needs to stay on the roads – therefore, for example, if one needs to go from point x to point y which are on opposite corners of a square, the distance traversed is not equal to the length of the diagonal, but is instead equal to the length of two sides. Again, one may check that the taxicab metric satisfies the definition of a metric.
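
To make the three examples concrete, here is a small sketch that implements them for points given as tuples of numbers and spot-checks the metric axioms on a handful of sample points. The helper looks_like_metric is our own hypothetical name, and passing this check on a finite sample is of course not a proof that the axioms hold in general.

```python
import itertools
import math

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def taxicab(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def discrete(x, y):
    return 0 if x == y else 1

def looks_like_metric(d, points, tol=1e-12):
    """Spot-check the three metric axioms on a finite sample of points."""
    for x, y in itertools.product(points, repeat=2):
        # nonnegativity, with d(x, y) = 0 exactly when x = y
        if d(x, y) < 0 or (d(x, y) == 0) != (x == y):
            return False
        # symmetry
        if abs(d(x, y) - d(y, x)) > tol:
            return False
    for x, y, z in itertools.product(points, repeat=3):
        # triangle inequality
        if d(x, y) + d(y, z) < d(x, z) - tol:
            return False
    return True

sample = [(0, 0), (1, 0), (0, 1), (1, 1), (3, -2)]
for d in (euclidean, taxicab, discrete):
    print(d.__name__, looks_like_metric(d, sample))

# Opposite corners of the unit square: the diagonal versus two sides.
print(euclidean((0, 0), (1, 1)), taxicab((0, 0), (1, 1)))
```

The last line illustrates the taxi driver's point of view: the ordinary distance between opposite corners of the unit square is the length of the diagonal, while the taxicab distance is 2, the length of two sides.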

II. Norm

Now we move on to vector spaces (we will consider in this post only vector spaces over the real or complex numbers), and some mathematical concepts that we can associate with them, as suggested in the beginning of this post. Being a set closed under addition and scalar multiplication is already a useful concept, as we have seen, but we can still add on some ideas that would make them even more interesting. The notion of metric that we have discussed earlier will show up repeatedly over this discussion.

We first discuss the notion of a norm, which gives us a notion of a “magnitude” of a vector. We quote from the book Introductory Functional Analysis with Applications by Erwin Kreyszig for the definition:

A norm on a (real or complex) vector space X is a real valued function on X whose value at an x\in X is denoted by

\displaystyle \|x\|    (read “norm of x“)

and which has the properties

(N1) \|x\|\geq 0

(N2) \|x\|=0\iff x=0

(N3) \|\alpha x\|=|\alpha|\|x\|

(N4) \|x+y\|\leq\|x\|+\|y\|    (triangle inequality)

here x and y are arbitrary vectors in X and \alpha is any scalar.

A vector space with a specified norm is called a normed space.

A norm automatically provides a vector space with a metric; in other words, a normed space is always a metric space. The metric is given in terms of the norm by the following equation:

\displaystyle d(x,y)=\|x-y\|

However, not all metrics come from a norm. An example is the discrete metric, which satisfies the properties of the metric but not the norm.
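
One way to see why the discrete metric cannot come from a norm is to note that, by property (N3), any metric of the form d(x,y)=\|x-y\| must satisfy d(2x,0)=2d(x,0). The following sketch, which uses the Euclidean norm on the plane as an assumed example of a norm, checks this numerically and shows that the discrete metric fails it.

```python
import math

def norm(x):
    """Euclidean norm on R^n."""
    return math.sqrt(sum(a * a for a in x))

def induced_metric(x, y):
    """Metric induced by the norm: d(x, y) = ||x - y||."""
    return norm(tuple(a - b for a, b in zip(x, y)))

def discrete(x, y):
    return 0 if x == y else 1

x, zero = (3.0, 4.0), (0.0, 0.0)
two_x = tuple(2 * a for a in x)

# A metric coming from a norm scales with the vectors, by (N3):
# d(2x, 0) = ||2x|| = 2||x|| = 2 d(x, 0).
print(induced_metric(two_x, zero), 2 * induced_metric(x, zero))  # 10.0 10.0

# The discrete metric does not: d(2x, 0) = 1, but 2 d(x, 0) = 2.
print(discrete(two_x, zero), 2 * discrete(x, zero))  # 1 2
```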

III. Inner Product

Next we discuss the inner product. The inner product gives us a notion of “orthogonality”, a concept which we already saw in action in Some Basics of Fourier Analysis. Intuitively, when two vectors are “orthogonal”, they are “perpendicular” in some sense. However, our geometric intuition may not be as useful when we are discussing, say, the infinite-dimensional vector space whose elements are functions. For this we need a more abstract notion of orthogonality, which is embodied by the inner product. Again, for the technical definition we quote from the book of Kreyszig:

With every pair of vectors x and y there is associated a scalar which is written

\displaystyle \langle x,y\rangle

and is called the inner product of x and y, such that for all vectors x, y, z and scalars \alpha we have

(IP1) \langle x+y,z\rangle=\langle x,z\rangle+\langle y,z\rangle

(IP2) \langle \alpha x,y\rangle=\alpha\langle x,y\rangle

(IP3) \langle x,y\rangle=\overline{\langle y,x\rangle}

(IP4) \langle x,x\rangle\geq 0,    \langle x,x\rangle=0 \iff x=0

A vector space with a specified inner product is called an inner product space.

One of the most basic examples, in the case of a finite-dimensional vector space, is given by the following procedure. Let x and y be elements (vectors) of some n-dimensional real vector space X, with respective components x_{1}, x_{2},...,x_{n} and y_{1},y_{2},...,y_{n} in some basis. Then we can set

\displaystyle \langle x,y\rangle=x_{1}y_{1}+x_{2}y_{2}+...+x_{n}y_{n}

This is the familiar “dot product” taught in introductory university-level mathematics courses.

Let us now see how the inner product gives us a notion of “orthogonality”. To make things even easier to visualize, let us set n=2, so that we are dealing with vectors (which we can now think of as quantities with magnitude and direction) in the plane. A unit vector x pointing “east” has components x_{1}=1, x_{2}=0, while a unit vector y pointing “north” has components y_{1}=0, y_{2}=1. These two vectors are perpendicular, or orthogonal. Computing the inner product we discussed earlier, we have

\displaystyle \langle x,y\rangle=(1)(0)+(0)(1)=0.

We say, therefore, that two vectors are orthogonal when their inner product is zero. As we have mentioned earlier, we can extend this to cases where our geometric intuition may no longer be as useful to us. For example, consider the infinite dimensional vector space of (real-valued) functions which are “square integrable” over some interval (if we square them and integrate over this interval, we have a finite answer), say [0,1]. We set our inner product to be

\displaystyle \langle f,g\rangle=\int_{0}^{1}f(x)g(x)dx.

As an example, let f(x)=\text{cos}(2\pi x) and g(x)=\text{sin}(2\pi x). We say that these functions are “orthogonal”, but it is hard to imagine in what way. But if we take the inner product, we will see that

\displaystyle \int_{0}^{1}\text{cos}(2\pi x)\text{sin}(2\pi x)dx=0.

Hence we see that \text{cos}(2\pi x) and \text{sin}(2\pi x) are orthogonal. Similarly, we have

\displaystyle \int_{0}^{1}\text{cos}(2\pi x)\text{cos}(4\pi x)dx=0

and \text{cos}(2\pi x) and \text{cos}(4\pi x) are also orthogonal. We have discussed this in more detail in Some Basics of Fourier Analysis. We have also seen in that post that orthogonality plays a big role in the subject of Fourier analysis.
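
These integrals can also be checked numerically. The sketch below approximates the inner product on [0,1] with a simple midpoint rule; the function name inner_product and the number of subintervals are our own choices, and the printed values are only approximations of the exact integrals.

```python
import math

def inner_product(f, g, n=10_000):
    """Approximate the integral of f(x) g(x) over [0, 1] by the midpoint rule."""
    h = 1.0 / n
    return h * sum(f((k + 0.5) * h) * g((k + 0.5) * h) for k in range(n))

cos1 = lambda x: math.cos(2 * math.pi * x)
sin1 = lambda x: math.sin(2 * math.pi * x)
cos2 = lambda x: math.cos(4 * math.pi * x)

print(inner_product(cos1, sin1))  # close to 0
print(inner_product(cos1, cos2))  # close to 0
print(inner_product(cos1, cos1))  # close to 1/2, so cos1 is not orthogonal to itself
```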

Just as a norm always induces a metric, an inner product also induces a norm, and by extension also a metric. In other words, an inner product space is also a normed space, and also a metric space. The norm is given in terms of the inner product by the following expression:

\displaystyle \|x\|=\sqrt{\langle x,x\rangle}

Just as with the norm and the metric, although an inner product always induces a norm, not every norm is induced by an inner product.
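
A standard way to test whether a norm can come from an inner product is the parallelogram law \|x+y\|^{2}+\|x-y\|^{2}=2\|x\|^{2}+2\|y\|^{2}, which holds for every norm induced by an inner product. As a rough illustration, the sketch below evaluates the difference between the two sides for the Euclidean norm and for the norm associated with the taxicab metric; parallelogram_defect is a hypothetical helper name.

```python
import math

def euclidean_norm(x):
    return math.sqrt(sum(a * a for a in x))

def taxicab_norm(x):
    return sum(abs(a) for a in x)

def parallelogram_defect(norm, x, y):
    """||x+y||^2 + ||x-y||^2 - 2||x||^2 - 2||y||^2, zero for inner-product norms."""
    s = tuple(a + b for a, b in zip(x, y))
    d = tuple(a - b for a, b in zip(x, y))
    return norm(s) ** 2 + norm(d) ** 2 - 2 * norm(x) ** 2 - 2 * norm(y) ** 2

x, y = (1.0, 0.0), (0.0, 1.0)
print(parallelogram_defect(euclidean_norm, x, y))  # approximately 0
print(parallelogram_defect(taxicab_norm, x, y))    # 4.0
```

Since the defect is nonzero for the taxicab norm, that norm is an example of a norm which is not induced by any inner product.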

IV. Banach Spaces and Hilbert Spaces

There is one more concept I want to discuss in this post. In Valuations and Completions, we discussed Cauchy sequences and completions. Those concepts still carry on here, because they are actually part of the study of metric spaces (in fact, the valuations discussed in that post actually serve as a metric on the fields that were discussed, showing how in number theory the concept of metric and metric spaces still make an appearance). If every Cauchy sequence in a metric space X converges to an element in X, then we say that X is a complete metric space.

Since normed spaces and inner product spaces are also metric spaces, the notion of a complete metric space still makes sense, and we have special names for them. A normed space which is also a complete metric space is called a Banach space, while an inner product space which is also a complete metric space is called a Hilbert space. Finite-dimensional vector spaces (over the real or complex numbers) are always complete, and therefore we only really need the distinction when we are dealing with infinite dimensional vector spaces.

Banach spaces and Hilbert spaces are important in quantum mechanics. We recall in Some Basics of Quantum Mechanics that the possible states of a system in quantum mechanics form a vector space. However, more is true – they actually form a Hilbert space, and the states that we can observe “classically” are orthogonal to each other. The Dirac “bra-ket” notation that we have discussed makes use of the inner product to express probabilities.

Meanwhile, Banach spaces often arise when studying operators, which correspond to observables such as position and momentum. Of course the states form Banach spaces too, since all Hilbert spaces are Banach spaces, but there is much motivation to study the Banach spaces formed by the operators as well instead of just that formed by the states. This is an important aspect of the more mathematically involved treatments of quantum mechanics.


Topological Vector Space on Wikipedia

Functional Analysis on Wikipedia

Metric on Wikipedia

Norm on Wikipedia

Inner Product Space on Wikipedia

Complete Metric Space on Wikipedia

Banach Space on Wikipedia

Hilbert Space on Wikipedia

A Functional Analysis Primer on Bahcemizi Yetistermeliyiz

Topology by James R. Munkres

Introductory Functional Analysis with Applications by Erwin Kreyszig

Real Analysis by Halsey Royden

Differentiable Manifolds Revisited

In many posts on this blog, such as Geometry on Curved Spaces and Connection and Curvature in Riemannian Geometry, we have discussed the subject of differential geometry, usually in the context of physics. We have discussed what is probably its most famous application to date, as the mathematical framework of general relativity, which in turn is the foundation of modern day astrophysics. We have also seen its other applications to gauge theory in particle physics, and in describing the phase space, whose points correspond to the “states” (described by the position and momentum of particles) of a physical system in the Hamiltonian formulation of classical mechanics.

In this post, similar to what we have done in Varieties and Schemes Revisited for the subject of algebraic geometry, we take on the objects of study of differential geometry in more technical terms. These objects correspond to our everyday intuition, but we must develop some technical language in order to treat them “rigorously”, and also to be able to generalize them into other interesting objects. As we give the technical definitions, we will also discuss the intuitive inspiration for these definitions.

Just as varieties and schemes are the main objects of study in algebraic geometry (that is until the ideas discussed in Grothendieck’s Relative Point of View were formulated), in differential geometry the main objects of study are the differentiable manifolds. Before we give the technical definition, we first discuss the intuitive idea of a manifold.

A manifold is some kind of space that “locally” looks like Euclidean space \mathbb{R}^{n}. 1-dimensional Euclidean space is just the line \mathbb{R}, 2-dimensional Euclidean space is the plane \mathbb{R}^{2}, and so on. Obviously, Euclidean space itself is a manifold, but we want to look at more interesting examples, i.e. spaces that “locally” look like Euclidean space but “globally” are very different from it.

As an example, consider the surface of the Earth. “Locally”, that is, on small regions, the surface of the Earth appears flat. However, “globally”, we know that it is actually round.

Another way to think about things is that any small region on the surface of the Earth can be put on a flat map (possibly with some distortion of distances). However, there is no flat map that can include every point on the surface of the Earth while continuing to make sense. The best we can do is use several maps with some overlaps between them, transitioning between different maps when we change the regions we are looking at. We want these overlaps and transitions to make sense in some way.

In differential geometry, what we want is to be able to do calculus on these more general manifolds the way we can do calculus on the line, on the plane, and so on. In order to do this, we require that the “transitions” alluded to in the previous paragraph are given by differentiable functions.

Summarizing the above discussion in technical terms, an n-dimensional differentiable manifold is a topological space X with homeomorphisms \varphi_{\alpha} from open subsets U_{\alpha} covering X onto open subsets of \mathbb{R}^{n}, such that each composition \varphi_{\alpha}\circ\varphi_{\beta}^{-1} is a differentiable function on \varphi_{\beta}(U_{\alpha}\cap U_{\beta})\subset\mathbb{R}^{n}.

Following the analogy with maps we discussed earlier, the pair \{U_{\alpha}, \varphi_{\alpha}\} is called a chart, and the collection of all these charts that cover the manifold is called an atlas. The map \varphi_{\alpha}\circ\varphi_{\beta}^{-1}|_{\varphi_{\beta}(U_{\alpha}\cap U_{\beta})} is called a transition map.
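
As a concrete illustration, consider the unit circle x^{2}+y^{2}=1 with two charts given by stereographic projection, one omitting the “north pole” (0,1) and one omitting the “south pole” (0,-1). This example is our own choice and is only a sketch; the function names are hypothetical. The transition map works out to t\mapsto 1/t, which is differentiable wherever it is defined, namely for t\neq 0.

```python
import math

# Two charts on the unit circle x^2 + y^2 = 1 (stereographic projections).
def phi_north(p):
    """Chart on the circle minus the north pole (0, 1)."""
    x, y = p
    return x / (1 - y)

def phi_south(p):
    """Chart on the circle minus the south pole (0, -1)."""
    x, y = p
    return x / (1 + y)

def phi_south_inverse(t):
    """Inverse of phi_south: sends a real number t back to a point on the circle."""
    return (2 * t / (1 + t * t), (1 - t * t) / (1 + t * t))

def transition(t):
    """phi_north composed with the inverse of phi_south, defined for t != 0."""
    return phi_north(phi_south_inverse(t))

for t in (0.5, 2.0, -3.0):
    print(t, transition(t), 1 / t)  # the last two columns agree
```

Neither chart alone covers the whole circle, just as no single flat map covers the whole surface of the Earth, but together the two charts form an atlas.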

Now that we have defined what a manifold technically is, we discuss some related concepts, in particular the objects that “live” on our manifold. Perhaps the most basic of these objects are the functions on the manifold; however, we won’t discuss the functions themselves too much since there are not that many new concepts regarding them.

Instead, we will use one of the most useful concepts when it comes to discussing objects that “live” on manifolds – fiber bundles (see Vector Fields, Vector Bundles, and Fiber Bundles). A fiber bundle is given by a topological space E with a projection \pi from E to a base space B, with the requirement that every point of B has an open neighborhood U such that the space \pi^{-1}(U) is homeomorphic to the product space U\times F, where F is the fiber, defined as \pi^{-1}(x) for any point x of B. When the fiber F is also a vector space, we refer to E as a vector bundle. In differential geometry, we require that the relevant maps be also diffeomorphisms, i.e. differentiable bijections whose inverses are also differentiable.

One of the most important kinds of vector bundles in differential geometry are the tangent bundles, which can be thought of as the collection of all the tangent spaces of a manifold at every point, for all the points of the manifold. We have already made use of these concepts in Geometry on Curved Spaces, and Connection and Curvature in Riemannian Geometry. We needed it, for example, to discuss the notion of parallel transport and the covariant derivative in Riemannian geometry. We will now discuss these concepts more technically.

Let \mathcal{O}_{p} be the ring of real-valued differentiable functions defined in a neighborhood of a point p in a differentiable manifold X. We define the real tangent space at p, written T_{\mathbb{R},p}(X), to be the vector space of p-centered \mathbb{R}-linear derivations, which are \mathbb{R}-linear maps D: \mathcal{O}_{p}\rightarrow\mathbb{R} satisfying Leibniz’s rule D(fg)=f(p)Dg+g(p)Df. Any such derivation D can be written in the following form:

\displaystyle D=\sum_{i}a_{i}\frac{\partial}{\partial x_{i}}\bigg\rvert_{p}

This means that the operators \frac{\partial}{\partial x_{i}}\big\rvert_{p} form a basis for the real tangent space at p. It might be a little jarring to see “differential operators” serving as a basis for a vector space, but it might perhaps be helpful to think of tangent vectors as giving “how fast” functions on the manifold are changing at a certain point. See the following picture:


The manifold is M, and its tangent space at the point x is T_{x}M. One of the tangent vectors, v, is shown. The parametrized curve \gamma(t) is often used to define the tangent vector, although that is not the approach we have given here (it may be found in the references, and is closely related to the definition we have given).
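
To get a feel for derivations, we can approximate D=\sum_{i}a_{i}\frac{\partial}{\partial x_{i}}\big\rvert_{p} numerically and check Leibniz’s rule D(fg)=f(p)Dg+g(p)Df on sample functions. The sketch below works on \mathbb{R}^{2} and uses central finite differences, so it is only a numerical illustration; the name derivation, the step size h, and the sample functions are all our own assumptions.

```python
import math

def derivation(a, p, h=1e-5):
    """Return D = sum_i a[i] * d/dx_i at p, approximated by central differences."""
    def D(f):
        total = 0.0
        for i, a_i in enumerate(a):
            plus = list(p); plus[i] += h
            minus = list(p); minus[i] -= h
            total += a_i * (f(plus) - f(minus)) / (2 * h)
        return total
    return D

f = lambda x: x[0] ** 2 + x[1]
g = lambda x: math.sin(x[0]) * x[1]
fg = lambda x: f(x) * g(x)

p = [1.0, 2.0]
D = derivation([3.0, -1.0], p)  # the tangent vector with components (3, -1)

lhs = D(fg)
rhs = f(p) * D(g) + g(p) * D(f)
print(lhs, rhs)  # the two values agree up to discretization error
```

Here the tangent vector is encoded entirely by its components a_{i}, and its only job is to eat a function and return a number, the rate of change of that function at p in the given direction.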

Another concept that we will need is the concept of 1-forms. A 1-form on a particular point on the manifold takes a single tangent vector (an element of the tangent space at that particular point) as an input and gives a number as an output. Just as we have the notion of tangent vectors, tangent spaces, and tangent bundles, we also have the “dual” notion of 1-forms, cotangent spaces, and cotangent bundles, and just as the basis of the tangent vectors are given by \frac{\partial}{\partial x_{i}}, we also have a basis of 1-forms given by dx_{i}.

Aside from 1-forms, we also have mathematical objects that take two elements of the tangent space at a point (i.e. two tangent vectors at that point) as an input and give a number as an output.

An example that we have already discussed in this blog is the metric tensor, which we refer to sometimes as simply the metric (calling it the metric tensor, however, helps prevent confusion as there are many different concepts in mathematics also referred to as a metric). We have been thinking of the metric tensor as expressing the “infinitesimal distance formula” at a certain point on the manifold.

The metric tensor is defined as a symmetric, nondegenerate, bilinear form. “Symmetric” means that we can interchange the two inputs (the tangent vectors) and get the same output. “Nondegenerate” means that, holding one of the inputs fixed and letting the other vary, having an output of zero for all the varying inputs means that the fixed input must be zero. “Bilinear form” means that it is linear in either input – it respects addition of vectors and multiplication by scalars. If we hold one input fixed, it is then a linear transformation of the other input.

In the case of our previous discussions on Riemannian geometry, the output of the metric tensor, when both inputs are the same nonzero tangent vector, is a positive real number, expressing the (squared) infinitesimal distance. Hence, a metric tensor on a differentiable manifold which always gives a positive real number when evaluated on a nonzero tangent vector paired with itself is called a Riemannian metric. A manifold with a Riemannian metric is of course called a Riemannian manifold.

In general relativity, the spacetime interval, unlike the distance, may not necessarily be positive. More technically, spacetime in general relativity is an example of a pseudo-Riemannian (or semi-Riemannian) manifold, which does not require the metric to be positive (more specifically it is a Lorentzian manifold – we will leave the details of these definitions to the references for now). As we have seen though, many concepts from the study of Riemannian manifolds carry over to the pseudo-Riemannian case.
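
The difference between the two cases can be seen already at a single point, where a metric tensor is just a symmetric bilinear form on the tangent space. The sketch below compares the Euclidean form on the plane with a two-dimensional Minkowski form; the sign convention (a minus sign on the first, “time” coordinate) is an assumption on our part, as conventions differ between references, and bilinear_form is a hypothetical helper.

```python
def bilinear_form(matrix):
    """Return g(u, v) = sum_ij matrix[i][j] * u[i] * v[j]."""
    def g(u, v):
        return sum(matrix[i][j] * u[i] * v[j]
                   for i in range(len(u)) for j in range(len(v)))
    return g

euclidean = bilinear_form([[1, 0], [0, 1]])
# Assumed sign convention: the first coordinate is time, with a minus sign.
minkowski = bilinear_form([[-1, 0], [0, 1]])

u, v = (2, 1), (1, 3)
print(euclidean(u, v) == euclidean(v, u), minkowski(u, v) == minkowski(v, u))  # symmetry
print(euclidean(u, u), minkowski(u, u))  # 5 and -3
print(minkowski((1, 1), (1, 1)))         # 0 for a nonzero vector
```

Both forms are symmetric and nondegenerate, but only the Euclidean one is positive on every nonzero vector; the Minkowski form can be negative, or even zero on a nonzero vector.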

Another example of these kinds of objects are the differential forms (see Differential Forms). One important example of these objects is the symplectic form in symplectic geometry (see An Intuitive Introduction to String Theory and (Homological) Mirror Symmetry), which is used as the mathematical framework of the Hamiltonian formulation of classical mechanics. Just as the metric tensor is related to the “infinitesimal distance”, the symplectic form is related to the “infinitesimal area”.

As an example of the symplectic form, the “phase space” in the Hamiltonian formulation of classical mechanics is made up of points which correspond to a “state” of a system as given by the position and momentum of its particles. For the simple case of one particle constrained to move in a line, the symplectic form (written \omega) is given by

\displaystyle \omega=\displaystyle dq\wedge dp

where q is the position and p is the momentum, serving as the coordinates of the phase space (by the way, the phase space is itself already the cotangent bundle of the configuration space, the space whose points are the different “configurations” of the system, which we can think of as a generalization of the concept of position).

Technically, the symplectic form is defined as a closed, nondegenerate 2-form. By “2-form”, we mean that it is a differential form, obeying the properties we gave in Differential Forms, such as antisymmetry. The notion of a differential form being “closed”, also already discussed in the same blog post, means that its exterior derivative is zero. “Nondegenerate” of course was already defined in the preceding paragraphs. The symplectic form is also a bilinear form, although this is a property of all 2-forms, considered as functions of two tangent vectors at some point on the manifold. More generally, all differential forms are examples of multilinear forms. A manifold with a symplectic form is called a symplectic manifold.
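
For the one-particle example, the symplectic form \omega=dq\wedge dp can be evaluated on two tangent vectors of the phase space very explicitly. The sketch below assumes the common convention (dq\wedge dp)(u,v)=dq(u)dp(v)-dq(v)dp(u), with no factor of 1/2, and writes tangent vectors as pairs of components along q and p.

```python
def omega(u, v):
    """dq ^ dp evaluated on tangent vectors u = (u_q, u_p) and v = (v_q, v_p)."""
    return u[0] * v[1] - u[1] * v[0]

u, v = (1, 0), (0, 1)         # unit vectors along q and p
print(omega(u, v))            # 1: the area of the unit square they span
print(omega(v, u))            # -1: antisymmetry
print(omega(u, u))            # 0: a vector spans no area with itself
print(omega((2, 0), (0, 3)))  # 6: area of a 2-by-3 rectangle
```

This is the sense in which the symplectic form measures “infinitesimal area”: its value on two tangent vectors is the signed area of the parallelogram they span.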

There is still so much more to differential geometry, but for now, we have at least accomplished the task of defining some of its most basic concepts in a more technical manner. The language we have discussed here is important to deeper discussions of differential geometry.


Differential Geometry on Wikipedia

Differentiable Manifold on Wikipedia

Tangent Space on Wikipedia

Tangent Bundle on Wikipedia

Cotangent Space on Wikipedia

Cotangent Bundle on Wikipedia

Riemannian Manifold on Wikipedia

Pseudo-Riemannian Manifold on Wikipedia

Symplectic Manifold on Wikipedia

Differential Geometry of Curves and Surfaces by Manfredo P. do Carmo

Differential Geometry: Bundles, Connections, Metrics and Curvature by Clifford Henry Taubes

Foundations of Differential Geometry by Shoshichi Kobayashi and Katsumi Nomizu

Geometry, Topology, and Physics by Mikio Nakahara

Grothendieck’s Relative Point of View

In Varieties and Schemes Revisited we defined the notion of schemes, which is a far-reaching generalization inspired by the concept of varieties, which is essentially a kind of “shape” defined by polynomials in some way. However, the definition of schemes was but one of many innovations in algebraic geometry developed by the mathematician Alexander Grothendieck. In this post, we discuss another of these innovations, the so-called “relative point of view”, in which the focus is not just on schemes in isolation, but schemes relative to (with a morphism to) some “base scheme”.

Let S be a scheme. A scheme over S, or an S-scheme, is a scheme X with a morphism  f:X\rightarrow S called the structural morphism. If Y is another S-scheme with structural morphism g:Y\rightarrow S, a morphism of S-schemes is a morphism u:X\rightarrow Y such that f=g\circ u.

If the scheme S is the spectrum of some ring R, we may also refer to X above as a scheme over R. Every ring has a morphism from the ring of ordinary integers \mathbb{Z}, and every scheme therefore has a morphism to the scheme \text{Spec}(\mathbb{Z}), so we may think of all schemes as schemes over \mathbb{Z}.

Given two schemes X and Y over a third scheme S, with structural morphisms f:X\rightarrow S and g:Y\rightarrow S, we define the fiber product X\times_{S}Y to be a scheme together with projection morphisms \pi_{X}:X\times_{S}Y\rightarrow X and \pi_{Y}:X\times_{S}Y\rightarrow Y such that f\circ\pi_{X}=g\circ\pi_{Y}, and such that for any other scheme Z and morphisms p:Z\rightarrow X and q:Z\rightarrow Y satisfying f\circ p=g\circ q, there is a unique morphism u:Z\rightarrow X\times_{S}Y with \pi_{X}\circ u=p and \pi_{Y}\circ u=q. This property determines the fiber product up to unique isomorphism (the concept of fiber product is part of category theory – see also More Category Theory: The Grothendieck Topos).

We can use the fiber product to introduce the concept of base change. Given a scheme X over a scheme S, and a morphism S'\rightarrow S, the fiber product X\times_{S}S' is a scheme over S'. We may think of it as being “induced” by the morphism S'\rightarrow S. One of the things that can be done with this idea of base change is to look at the properties of X\times_{S}S' and see if we can use these to learn about the properties of X, which may be useful if the properties of X are difficult to determine directly compared to the properties of X\times_{S}S' (in essence we want to be able to attack a difficult problem indirectly by first attacking an easier problem related to it, which is a common strategy in mathematics).

A special case of base change is when S' is given by the spectrum of the residue field (see Localization) k corresponding to a point P of S. There is a morphism of schemes \text{Spec}(k)\rightarrow S which we may think of as the inclusion of the point P into the scheme S. Then the fiber product X\times_{S}\text{Spec}(k) is called the fiber of X at the point P. The terminology is perhaps reminiscent of fiber bundles (see Vector Fields, Vector Bundles, and Fiber Bundles), and is also rather similar to the concept of covering spaces (see Covering Spaces) in that we have some kind of space “over” every point of our “base” scheme. However, unlike those two earlier concepts, the spaces which make up our fibers may now vary as the points vary.
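
Since the fiber product is a notion from category theory, it may help to look at it first in the much simpler category of sets, where the fiber product of X and Y over S is the set of pairs (x,y) with f(x)=g(y). The sketch below is only this set-theoretic analogy, not a computation with schemes; the function names and the sample maps are our own.

```python
def fiber_product(X, Y, f, g):
    """Fiber product of finite sets: pairs (x, y) with f(x) == g(y)."""
    return {(x, y) for x in X for y in Y if f(x) == g(y)}

def fiber(X, f, s):
    """Fiber of f over the point s, as a base change along the inclusion {s} -> S."""
    point = {s}
    inclusion = lambda t: t
    return {x for (x, _) in fiber_product(X, point, f, inclusion)}

X = range(10)
f = lambda x: x % 3          # structural map X -> S = {0, 1, 2}
Y = ["a", "bb", "ccc", "dddd"]
g = lambda y: len(y) % 3     # structural map Y -> S

print(sorted(fiber_product(X, Y, f, g)))
for s in range(3):
    print(s, sorted(fiber(X, f, s)))  # the fibers differ from point to point
```

In this toy example the fiber over 0 has four elements while the fibers over 1 and 2 have three each, a small instance of fibers varying as the points of the base vary.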

Actually, the concept that this special case of fiber product and base change should bring to mind is that of a moduli space (see The Moduli Space of Elliptic Curves), where every point represents a space, and the spaces vary as the points vary. Or, as we worded it in The Moduli Space of Elliptic Curves, every point of the moduli space (given by the base scheme) corresponds to a space (given by the fiber), and the moduli space tells us how these spaces vary, so that spaces which are similar to each other in some way correspond to points in the moduli space that are close together.

The lecture notes of Andreas Gathmann listed among the references below contain some nice diagrams to help visualize the idea of the fiber product and base change (these can be found in chapter 5 of the 2002 version). To see these ideas in action, one can look at the article Arithmetic on Curves by Barry Mazur (also among the references) which discusses, among other things, the approach taken by Gerd Faltings in proving the famous conjecture of Louis J. Mordell which says that a curve of genus greater than 1 defined over the rational numbers has only finitely many rational points.


Grothendieck’s Relative Point of View on Wikipedia

Arithmetic on Curves by Barry Mazur

Algebraic Geometry by Andreas Gathmann

The Rising Sea: Foundations of Algebraic Geometry by Ravi Vakil

Algebraic Geometry by Robin Hartshorne

Varieties and Schemes Revisited

In Basics of Algebraic Geometry we introduced the idea of varieties and schemes as being kinds of “shapes” defined by polynomials (or rings, more generally) in some way. In this post we discuss the definitions of these concepts in more technical detail, and introduce other important concepts related to algebraic geometry as well.

I. Preliminaries: Affine Space, Algebraic Sets and Ringed Spaces

We start with some preliminary definitions.

Affine n-space, written \mathbb{A}^{n}, is the set of all n-tuples of elements of a field k, i.e.

\displaystyle \mathbb{A}^{n}=\{(a_{1},...,a_{n})|a_{i}\in k \text{ for }1\leq i\leq n\}.

An algebraic set is a subset of \mathbb{A}^{n} that is the zero set Z(T) of some set T of polynomials, i.e. Y=Z(T), where

\displaystyle Z(T)=\{P\in \mathbb{A}^{n}|f(P)=0 \text{ for all } f\in T\}.
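The definition of Z(T) can be tested directly on individual points. The following is only a minimal sketch: we assume k is the field of rational numbers, we represent polynomials as ordinary Python functions, and the helper name in_zero_set is our own invention.

```python
from fractions import Fraction


def in_zero_set(point, polynomials):
    """Return True if every polynomial f in T satisfies f(P) = 0."""
    return all(f(*point) == 0 for f in polynomials)


# T = {x^2 + y^2 - 1}: the unit circle in A^2 (illustrative choice)
circle = [lambda x, y: x**2 + y**2 - 1]

# T = {x*y}: the union of the two coordinate axes (illustrative choice)
axes = [lambda x, y: x * y]

print(in_zero_set((Fraction(3, 5), Fraction(4, 5)), circle))  # True
print(in_zero_set((Fraction(1), Fraction(1)), circle))        # False
print(in_zero_set((Fraction(0), Fraction(7)), axes))          # True
```

Exact rational arithmetic is used so that the test f(P)=0 is not affected by rounding. Of course, checking points one at a time only illustrates the definition; it does not compute the algebraic set itself.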

Intuitively, we want to define a “variety” as some kind of space which “locally” looks like an irreducible algebraic set. “Irreducible” means it cannot be expressed as the union of other algebraic sets. However, we want to think of a variety as more than just a space; we want to think of it as a space with things (namely functions) “living on it”. This leads us to the notion of a ringed space.

A ringed space is simply a pair (X,\mathcal{O}_{X}), where X is a topological space and \mathcal{O}_{X} is a sheaf (see Sheaves) of rings on X. A morphism of ringed spaces from (X,\mathcal{O}_{X}) to (Y,\mathcal{O}_{Y}) is given by a continuous map f: X\rightarrow Y and a morphism of sheaves of rings f^{\#}: \mathcal{O}_{Y}\rightarrow f_{*}\mathcal{O}_{X}.

Recall that a morphism of sheaves of rings \varphi:\mathcal{F}\rightarrow \mathcal{G} for sheaves of rings \mathcal{F} and \mathcal{G} on X is given by a morphism of rings \varphi(U): \mathcal{F}(U)\rightarrow \mathcal{G}(U) for every open set U of X such that for V\subseteq{U} we have \rho'_{U,V}\circ\varphi(U)=\varphi(V)\circ\rho_{U,V}, where \rho_{U,V} and \rho'_{U,V} are the restriction maps of \mathcal{F} and \mathcal{G} respectively.

We might as well mention locally ringed spaces here, since they will be used to define the concept of schemes later on:

A locally ringed space is a ringed space (X,\mathcal{O}_{X}) such that for each point P of X, the stalk \mathcal{O}_{X,P} is a local ring (see Localization). A morphism of locally ringed spaces from (X,\mathcal{O}_{X}) to (Y,\mathcal{O}_{Y}) is given by a continuous map f: X\rightarrow Y and a morphism of sheaves of rings f^{\#}: \mathcal{O}_{Y}\rightarrow f_{*}\mathcal{O}_{X} such that (f_{P}^{\#})^{-1}(\mathfrak{m}_{X,P})=\mathfrak{m}_{Y,f(P)} for all P, where f_{P}^{\#}: \mathcal{O}_{Y,f(P)}\rightarrow \mathcal{O}_{X,P} is the map induced on the stalk at P.

II. Varieties in Three Steps: Affine Varieties, Prevarieties, and Varieties

We now set out to accomplish our goal of defining “varieties” as spaces that locally look like irreducible algebraic sets. We first start with a ringed space that just “looks like” an irreducible algebraic set:

An affine variety is a ringed space (X,\mathcal{O}_{X}) such that X is irreducible, \mathcal{O}_{X} is a sheaf of k-valued functions, and X is isomorphic to an irreducible algebraic set in \mathbb{A}^{n}.

Next, we define a more general kind of ringed space, that is required to look like an irreducible algebraic set only “locally”:

A prevariety is a ringed space (X,\mathcal{O}_{X}) such that X is irreducible, \mathcal{O}_{X} is a sheaf of k-valued functions, and there is a finite open cover U_{i} such that (U_{i},\mathcal{O}_{X}|_{U_{i}}) is an affine variety for all i.

We are almost done. However, there is one more nice property that we would like our varieties to have. A topological space X is said to have the Hausdorff property if two distinct points always have two disjoint neighborhoods. With the Zariski topology this is almost always impossible; however there is an analogous notion which is satisfied if the image of the “diagonal morphism” which sends the point P in X to the point (P,P) in X\times X is closed in X\times X. There is an analogous notion of “product” in algebraic geometry; therefore, we can define the concept of variety as follows:

A variety is a prevariety X such that the image of the diagonal morphism is closed in X\times X. In the rest of this post, we will refer to this property as the “algebro-geometric” analogue of the Hausdorff property.

III. Schemes

We now define the concept of schemes, which, as we shall show in the next section, generalize the concept of varieties, i.e. varieties are just a special case of schemes. Inspired by the correspondence between the maximal ideals of the “ring of polynomial functions” (with coefficients in an “algebraically closed field” like the complex numbers) of an algebraic set and the points of the algebraic set mentioned in Basics of Algebraic Geometry, we go further and consider a ringed space whose underlying topological space has points corresponding to the prime ideals of a ring (which is not necessarily a ring of polynomials – we might even consider, for example, the ring of ordinary integers \mathbb{Z}, or the ring of integers of an algebraic number field –  see Algebraic Numbers).

The spectrum (note that the word “spectrum” has many different meanings in mathematics, and this particular usage is different, say, from that in Eilenberg-MacLane Spaces, Spectra, and Generalized Cohomology Theories) of a ring A is a locally ringed space (\text{Spec}(A),\mathcal{O}), where \text{Spec}(A) is the set of prime ideals of A equipped with the Zariski topology, and \mathcal{O} is a sheaf on \text{Spec}(A) given by defining \mathcal{O}(U) to be the set of functions s:U\rightarrow \coprod_{\mathfrak{p}\in U}A_{\mathfrak{p}}, such that s(\mathfrak{p})\in A_\mathfrak{p} for each \mathfrak{p}\in U, and such that for each \mathfrak{p}\in U, there is an open set V\subseteq U containing \mathfrak{p} and elements a,f\in A such that for each \mathfrak{q}\in V, f\notin \mathfrak{q}, and s(\mathfrak{q})=a/f in A_{\mathfrak{q}}.
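Since this definition is rather abstract, it may help to unpack it for the ring of ordinary integers A=\mathbb{Z}. The summary below is a standard worked example rather than something derived in this post; the notation V(n) for closed sets and D(f) for the basic open sets is our own addition here.

```latex
% Points: one for each prime number, plus the zero ideal
\text{Spec}(\mathbb{Z}) = \{(0)\}\cup\{(p) \mid p \text{ prime}\}

% Closed sets of the Zariski topology (other than the whole space V(0))
V(n) = \{(p) \mid p \text{ divides } n\}, \qquad n\neq 0

% Stalks of the structure sheaf: local rings, as required
\mathcal{O}_{(p)} = \mathbb{Z}_{(p)} = \{a/f \mid p\nmid f\}, \qquad \mathcal{O}_{(0)} = \mathbb{Q}

% Sections over the open set D(f) where f does not vanish, f \neq 0
\mathcal{O}(D(f)) = \mathbb{Z}[1/f], \qquad D(f) = \{\mathfrak{q} \mid f\notin\mathfrak{q}\}
```

In particular, the proper closed sets are finite sets of primes, and a “function” such as a/f is defined exactly at those primes which do not divide the denominator f.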

We now proceed to define schemes, closely mirroring how we defined varieties earlier:

An affine scheme is a locally ringed space (X,\mathcal{O}_{X}) that is isomorphic as a locally ringed space to the spectrum of some ring.

A scheme is a locally ringed space (X,\mathcal{O}_{X}) where every point is contained in some open set U such that U considered as a topological space, together with the restricted sheaf \mathcal{O}_{X}|_{U}, is an affine scheme. A morphism of schemes is a morphism as locally ringed spaces.

Finally, to complete the analogy with varieties, we refer to schemes which have the (analogue of the) Hausdorff property as separated schemes.

Note: In some of the (mostly older) literature, what we refer to as schemes in this post are instead referred to as preschemes, in analogy with prevarieties. What they call a scheme is what we refer to as a separated scheme, i.e. a scheme possessing the Hausdorff property. I have no idea at the moment as to why this rather nice terminology was changed, but in this post we stick with the modern convention.

IV. Prevarieties and Varieties as Special Kinds of Schemes

We now discuss varieties as special cases of schemes. First we need to define what properties we would like our schemes to have, in order to fit with how we described varieties earlier (as ringed spaces which locally look like irreducible spaces defined by polynomials). Therefore, we have to mimic certain properties of polynomial rings.

We first note that polynomial rings over a field k are finitely generated algebras over k. A scheme is said to be of finite type over the field k if it can be covered by finitely many affine open sets, each isomorphic to the spectrum of some ring which is a finitely generated algebra over k. More generally, given a morphism of schemes X\rightarrow Y, there is a concept of X being a scheme of finite type over Y, but we will leave this to the references for now.

Next we note that polynomial rings over a field are integral domains. This means that whenever there are two polynomials f and g with the property that fg=0, then either f=0 or g=0. A scheme is integral if the affine open sets are each isomorphic to the spectrum of some ring which is an integral domain. An equivalent condition is for the scheme to be irreducible and reduced (reduced means that the rings specified above have no nonzero nilpotent elements, i.e. nonzero elements some power of which is equal to zero).
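The two conditions “irreducible” and “reduced” can fail independently of each other. The following three rings are standard illustrations which we add here as a sketch; they are not discussed elsewhere in this post.

```latex
% Reduced but not irreducible: x and y are nonzero, yet xy = 0
A_{1} = k[x,y]/(xy), \qquad \text{Spec}(A_{1}) = V(x)\cup V(y)

% Irreducible (a single point) but not reduced: x is nonzero, yet x^{2} = 0
A_{2} = k[x]/(x^{2})

% Integral: an integral domain, isomorphic to k[x]
A_{3} = k[x,y]/(y-x^{2}) \cong k[x]
```

Only the spectrum of the third ring, the parabola y=x^{2}, is an integral scheme.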

We therefore redefine a prevariety as an integral scheme of finite type over the field k. As with the earlier definition, a variety is a prevariety with the (analogue of the) Hausdorff property (i.e. an integral separated scheme of finite type over k).

V. Conclusion

In conclusion, we have started with essentially the same ideas as the “analytic geometry” of Pierre de Fermat and Rene Descartes, familiar to high school students everywhere, used to describe shapes such as lines, circles, conics (parabolas, hyperbolas, circles, and ellipses), and so on. From there we generalized to get more shapes, which resemble only these old shapes “locally” (we may also think of these new shapes as being “glued” from the old ones). To maintain certain familiar properties expected of shapes, we impose the analogue of the Hausdorff property. We then obtain the concept of a variety.

But we can generalize much, much farther to more than just polynomial rings. We can define “spaces” which come from rings which need not be polynomial rings, such as the ring of ordinary integers \mathbb{Z} (or more generally algebraic integers – we have actually hinted at these applications of algebraic geometry in Divisors and the Picard Group). We can then have a kind of “geometry” of these rings, which gives us methods analogous to the powerful methods of geometry, which can be applied to branches of mathematics we would not usually think of as being “geometric”, such as number theory, as we have mentioned above. We end this post with quotes from two of the pioneers of modern mathematics (these quotes are also found in the book Algebra by Michael Artin):

“To me algebraic geometry is algebra with a kick.”

-Solomon Lefschetz

“In helping geometry, modern algebra is helping itself above all.”

-Oscar Zariski


Algebraic Variety on Wikipedia

Scheme on Wikipedia

Ringed Space on Wikipedia

Abstract Varieties on Rigorous Trivialities

Schemes on Rigorous Trivialities

Algebraic Geometry by Andreas Gathmann

The Rising Sea: Foundations of Algebraic Geometry by Ravi Vakil

Algebraic Geometry by Robin Hartshorne

Algebra by Michael Artin

Covering Spaces

In Homotopy Theory we defined the fundamental group of a topological space as the group of equivalence classes of “loops” on the space. In this post, we discuss the fundamental group from another point of view, this time making use of the concept of covering spaces. In doing so, we will uncover some interesting analogies with the theory of Galois groups (see Galois Groups). Galois groups are usually associated with number theory, and not usually thought of as being related to algebraic topology, therefore one might find these analogies to be quite surprising and unexpected.

We will start with an example, which we are already somewhat familiar with, the circle. For simplicity, we set the circle to have a circumference equal to 1. We also consider the real line, which we will think of as being “wrapped” over the circle, like a spring. We may think of this “spring” as casting a “shadow”, which is the circle. See also the following image by user Yonatan of Wikipedia:


Looking at the diagram, we see that we can map the line to the circle by a kind of “projection”. As we move around the line, we “project” to different points on the circle. However, if we move by any distance equal to an integer multiple of the circumference of the circle (which as we said above we have set equal to 1), we come back to the same point if we project to the circle. At this point we recall that the fundamental group of the circle is the group of integers under addition. We can think of an element of this group (an ordinary integer) as giving the “winding number” of a loop on the circle.

In this example, we refer to the line as a covering space of the circle. Since the line is simply connected (see Homotopy Theory), it is also the universal covering space of the circle. The mapping of one point to another point on the line, such that they both “project” to the same point on the circle, is called a deck transformation. The deck transformations of a covering space form a group, and as hinted at in the discussion in the preceding paragraph, the group of deck transformations of the universal covering space of some topological space X is exactly the fundamental group of X.

More generally, a covering space for a topological space X is another topological space \tilde{X} with a continuous surjective map p: \tilde{X}\rightarrow X such that the “inverse image” of a small neighborhood in X is a disjoint union of small neighborhoods of \tilde{X}, each of which is mapped homeomorphically onto the neighborhood in X. In the diagram above, the inverse image of the small neighborhood U of X is the disjoint union of the small neighborhoods S_{1}, S_{2}, S_{3}... of \tilde{X}.
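For the line covering the circle, the projection and the deck transformations can be written down explicitly. The sketch below assumes that we model the circle as the complex numbers of magnitude 1 and parametrize it so that one full turn corresponds to moving a distance of 1 along the line; the function names p and deck are our own.

```python
import cmath
import math


def p(t):
    """Project the point t of the real line onto the circle."""
    return cmath.exp(2j * math.pi * t)


def deck(n):
    """Deck transformation: translate the line by the integer n."""
    return lambda t: t + n


t = 0.25

# Points differing by an integer project to the same point of the circle.
for n in (-2, 1, 5):
    assert abs(p(deck(n)(t)) - p(t)) < 1e-9

# Composing deck transformations corresponds to adding integers,
# matching the fundamental group of the circle.
assert math.isclose(deck(2)(deck(3)(t)), deck(5)(t))
```

The integer n attached to a deck transformation plays the same role as the winding number mentioned earlier.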

There are many possible covering spaces for a topological space. Here is another example for the circle (courtesy of user Pappus of Wikipedia):


We can think of this as a circle “covering” another circle. However, the first example above, the line covering the circle, is special. It is a universal covering space, which means that it is a covering space which is simply connected. The word “universal” however, means that this particular covering space also “covers” all the others.

Another example is the torus. Its universal covering space is the plane, and as we recall from The Moduli Space of Elliptic Curves, we can think of the torus as being obtained from the plane by dividing it into parallelograms using a lattice (which is also a group), and then identifying opposite edges of the parallelogram. Hence we can think of the torus as a quotient space (see Modular Arithmetic and Quotient Sets) obtained from the plane. The case of the circle and the line, which we have discussed earlier, is also very similar. Yet another example, which we have discussed in Rotations in Three Dimensions, is that of the 3-dimensional real projective space \mathbb{RP}^{3} (which is also known in the theory of Lie groups as \text{SO}(3)), whose universal covering space is the 3-sphere S^{3} (which is also known as \text{SU}(2)). Similar to the above examples, we can think of \mathbb{RP}^{3} as a quotient space obtained from S^{3} by identifying antipodal points (which are “opposite” points on the sphere which can be connected by a straight line passing through the center) on the sphere. From all these examples, we see that we can think of the universal covering space as being some sort of “unfolding” of the quotient space.

A perhaps more abstract way to think of the universal covering space is as the space whose points correspond to homotopy classes (see Homotopy Theory) of paths which start at a certain fixed basepoint (but are free to end at any other point). The endpoints of these paths correspond to the points of the topological space which is to be covered. However, we can get to the same endpoint through different paths which are not homotopic, i.e. they cannot be deformed into each other. If we construct a topological space whose points correspond to the homotopy classes of these paths, we will obtain a simply connected space, which is the universal covering space of our topological space.

We now go back to the description of the fundamental group as the group of deck transformations of the universal covering space. Just as covering spaces can be covered by other covering spaces (and they are all covered by the universal covering space), the corresponding groups of deck transformations sit inside one another: if a covering space \tilde{Y} of X also covers another covering space Y of X, then the deck transformations of \tilde{Y} over Y form a subgroup of the deck transformations of \tilde{Y} over X. In particular, every (connected) covering space Y of X determines a subgroup of the fundamental group, namely the group of deck transformations of the universal covering space over Y. In other words, the way that the covering spaces cover each other is reflected in the group structure of the fundamental group. This is reminiscent of the theory of Galois groups, where the group structure of the Galois group can shed light on the way certain fields are contained in other fields. This is the analogy mentioned earlier, and it has inspired many fruitful ideas in modern mathematics; for instance, it was one of the inspirations for the idea of the Grothendieck topos (see More Category Theory: The Grothendieck Topos).


Fundamental group on Wikipedia

Covering Space on Wikipedia

Image by User Yonatan of Wikipedia

Image by User Pappus of Wikipedia

Coverings of the Circle on Youtube

Algebraic Topology by Allen Hatcher

A Concise Course in Algebraic Topology by J. P. May

Universal Covers on The Princeton Companion to Mathematics by Timothy Gowers, June Barrow-Green, and Imre Leader

Sheaves in Geometry and Logic: A First Introduction to Topos Theory by Saunders Mac Lane and Ieke Moerdijk

Rotations in Three Dimensions

In Rotating and Reflecting Vectors Using Matrices we learned how to express rotations in 2-dimensional space using certain special 2\times 2 matrices which form a group (see Groups) we call the special orthogonal group in dimension 2, or \text{SO}(2) (together with other matrices which express reflections, they form a bigger group that we call the orthogonal group in 2 dimensions, or \text{O}(2)).

In this post, we will discuss rotations in 3-dimensional space. As we will soon see, rotations in 3-dimensional space have certain interesting features not present in the 2-dimensional case, and despite being seemingly simple and mundane, play very important roles in some of the deepest aspects of fundamental physics.

We will first discuss rotations in 3-dimensional space as represented by the special orthogonal group in dimension 3, written as \text{SO}(3).

We recall some relevant terminology from Rotating and Reflecting Vectors Using Matrices. A matrix A is called orthogonal if it preserves the magnitude of (real) vectors, i.e. if the magnitude of the vector Av is equal to the magnitude of the vector v for every vector v. Alternatively, we may require, for the matrix A to be orthogonal, that it satisfy the condition

\displaystyle AA^{T}=A^{T}A=I

where A^{T} is the transpose of A and I is the identity matrix. The word “special” denotes that our matrices must have determinant equal to 1. Therefore, the group \text{SO}(3) consists of the 3\times3 orthogonal matrices whose determinant is equal to 1.
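The two conditions defining \text{SO}(3) are easy to check numerically for any given matrix. What follows is a minimal sketch in plain Python, with matrices represented as lists of rows; the helper names and the tolerance are our own choices.

```python
import math


def matmul(a, b):
    """Product of two 3x3 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]


def det3(a):
    """Determinant of a 3x3 matrix by expansion along the first row."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def is_special_orthogonal(a, tol=1e-9):
    """Check A A^T = I and det(A) = 1, up to a numerical tolerance."""
    aat = matmul(a, transpose(a))
    orthogonal = all(abs(aat[i][j] - (1.0 if i == j else 0.0)) < tol
                     for i in range(3) for j in range(3))
    return orthogonal and abs(det3(a) - 1.0) < tol


c, s = math.cos(0.7), math.sin(0.7)
rotation = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
reflection = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]]

print(is_special_orthogonal(rotation))    # True
print(is_special_orthogonal(reflection))  # False
```

The reflection is orthogonal but has determinant -1, so it belongs to \text{O}(3) but not to \text{SO}(3). For square matrices the condition AA^{T}=I already implies A^{T}A=I, which is why only one product is checked.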

The idea of using the group \text{SO}(3) to express rotations in 3-dimensional space may be made more concrete using several different formalisms.

One popular formalism is given by the so-called Euler angles. In this formalism, we break down any arbitrary rotation in 3-dimensional space into three separate rotations. The first, which we write here by \varphi, is expressed as a counterclockwise rotation about the z-axis. The second, \theta, is a counterclockwise rotation about an x-axis that rotates along with the object. Finally, the third, \psi, is expressed as a counterclockwise rotation about a z-axis that, once again, has rotated along with the object. For readers who may be confused, animations of these steps can be found among the references listed at the end of this post.

The matrix which expresses the rotation which is the product of these three rotations can then be written as

\displaystyle g(\varphi,\theta,\psi) = \left(\begin{array}{ccc} \text{cos}(\varphi)\text{cos}(\psi)-\text{cos}(\theta)\text{sin}(\varphi)\text{sin}(\psi) & -\text{cos}(\varphi)\text{sin}(\psi)-\text{cos}(\theta)\text{sin}(\varphi)\text{cos}(\psi) & \text{sin}(\varphi)\text{sin}(\theta) \\ \text{sin}(\varphi)\text{cos}(\psi)+\text{cos}(\theta)\text{cos}(\varphi)\text{sin}(\psi) & -\text{sin}(\varphi)\text{sin}(\psi)+\text{cos}(\theta)\text{cos}(\varphi)\text{cos}(\psi) & -\text{cos}(\varphi)\text{sin}(\theta) \\ \text{sin}(\psi)\text{sin}(\theta) & \text{cos}(\psi)\text{sin}(\theta) & \text{cos}(\theta) \end{array}\right).

The reader may check that, in the case that the rotation is strictly in the xy plane, i.e. \theta and \psi are zero, we will obtain

\displaystyle g(\varphi,0,0) = \left(\begin{array}{ccc} \text{cos}(\varphi) & -\text{sin}(\varphi) & 0 \\ \text{sin}(\varphi) & \text{cos}(\varphi) & 0 \\ 0 & 0 & 1 \end{array}\right).

Note how the upper left part is an element of \text{SO}(2), expressing a counterclockwise rotation by an angle \varphi, as we might expect.
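One way to gain confidence in the large matrix above is to multiply out the three elementary rotations and compare. In the sketch below we assume that the product is taken in the order R_{z}(\varphi)R_{x}(\theta)R_{z}(\psi), which is the order that reproduces the entries of g(\varphi,\theta,\psi) as written; the function names are our own.

```python
import math


def matmul(a, b):
    """Product of two 3x3 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rot_z(angle):
    """Counterclockwise rotation about the z-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_x(angle):
    """Counterclockwise rotation about the x-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def g_product(phi, theta, psi):
    """Assumed factorization: R_z(phi) R_x(theta) R_z(psi)."""
    return matmul(rot_z(phi), matmul(rot_x(theta), rot_z(psi)))


def g_closed_form(phi, theta, psi):
    """The matrix g(phi, theta, psi), entry by entry as displayed above."""
    cf, sf = math.cos(phi), math.sin(phi)
    ct, st = math.cos(theta), math.sin(theta)
    cp, sp = math.cos(psi), math.sin(psi)
    return [
        [cf * cp - ct * sf * sp, -cf * sp - ct * sf * cp, sf * st],
        [sf * cp + ct * cf * sp, -sf * sp + ct * cf * cp, -cf * st],
        [sp * st, cp * st, ct],
    ]


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(3) for j in range(3))


assert close(g_product(0.3, 1.1, -0.7), g_closed_form(0.3, 1.1, -0.7))
assert close(g_closed_form(0.3, 0.0, 0.0), rot_z(0.3))
```

The second assertion is the special case discussed above, where \theta and \psi are zero and only the rotation in the xy plane remains.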

Contrary to the case of \text{SO}(2), which is an abelian group, the group \text{SO}(3) is not an abelian group. This means that for two elements a and b of \text{SO}(3), the product ab may not always be equal to the product ba. One can check this explicitly, or simply consider rotating an object along different axes; for example, rotating an object first counterclockwise by 90 degrees along the z-axis, and then counterclockwise again by 90 degrees along the x-axis, will not end with the same result as performing the same operations in the opposite order.
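The example with the two 90 degree rotations can be checked explicitly. In this sketch the matrices RZ and RX (names of our own choosing) are the counterclockwise rotations by 90 degrees about the z-axis and the x-axis, written with exact integer entries; recall that the rotation applied first stands on the right of the product.

```python
def matmul(a, b):
    """Product of two 3x3 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


# Counterclockwise rotations by 90 degrees
RZ = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]  # about the z-axis
RX = [[1, 0, 0], [0, 0, -1], [0, 1, 0]]  # about the x-axis

z_then_x = matmul(RX, RZ)  # rotate about z first, then about x
x_then_z = matmul(RZ, RX)  # rotate about x first, then about z

print(z_then_x)              # [[0, -1, 0], [0, 0, -1], [1, 0, 0]]
print(x_then_z)              # [[0, 0, 1], [1, 0, 0], [0, 1, 0]]
print(z_then_x == x_then_z)  # False
```

The two products differ, so these two elements of \text{SO}(3) do not commute.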

We now know how to express rotations in 3-dimensional space using 3\times 3 orthogonal matrices. Now we discuss another way of expressing the same concept, but using “unitary”, instead of orthogonal, matrices. However, first we must revisit rotations in 2 dimensions.

The group \text{SO}(2) is not the only way we have of expressing rotations in 2-dimensions. For example, we can also make use of the unitary (we will explain the meaning of this word shortly) group in 1-dimension, also written \text{U}(1). It is the group formed by the complex numbers with magnitude equal to 1. The elements of this group can always be written in the form e^{i\theta}, where \theta is the angle of our rotation. As we have seen in Connection and Curvature in Riemannian Geometry, this group is related to quantum electrodynamics, as it expresses the gauge symmetry of the theory.

The groups \text{SO}(2) and \text{U}(1) are actually isomorphic. There is a one-to-one correspondence between the elements of \text{SO}(2) and the elements of \text{U}(1) which respects the group operation. In other words, there is a bijective function f:\text{SO}(2)\rightarrow\text{U}(1), which satisfies f(ab)=f(a)f(b) for a, b elements of \text{SO}(2). When two groups are isomorphic, we may consider them as being essentially the same group. For this reason, both \text{SO}(2) and \text{U}(1) are often referred to as the circle group.
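One explicit choice of such a function f sends the rotation matrix by an angle \theta to the complex number e^{i\theta}. The sketch below checks the property f(ab)=f(a)f(b) numerically; the helper names are our own, and the check is only an illustration, not a proof.

```python
import math


def rot2(theta):
    """Element of SO(2): counterclockwise rotation by theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def matmul2(a, b):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def f(a):
    """Send [[cos, -sin], [sin, cos]] to cos + i sin, an element of U(1)."""
    return complex(a[0][0], a[1][0])


a, b = rot2(0.4), rot2(1.1)

# f respects the group operation: f(ab) = f(a) f(b)
assert abs(f(matmul2(a, b)) - f(a) * f(b)) < 1e-12

# f(a) has magnitude 1, so it really lies in U(1)
assert abs(abs(f(a)) - 1.0) < 1e-12
```

Multiplying the matrices adds the angles, and so does multiplying the complex numbers, which is the content of the isomorphism.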

We can now go back to rotations in 3 dimensions and discuss the group \text{SU}(2), the special unitary group in dimension 2. The word “unitary” is in some way analogous to “orthogonal”, but applies to vectors with complex number entries.

Consider an arbitrary vector

\displaystyle v=\left(\begin{array}{c}v_{1}\\v_{2}\\v_{3}\end{array}\right).

An orthogonal matrix, as we have discussed above, preserves the quantity (which is the square of what we have referred to earlier as the “magnitude” for vectors with real number entries)

\displaystyle v_{1}^{2}+v_{2}^{2}+v_{3}^{2}

while a unitary matrix preserves

\displaystyle v_{1}^{*}v_{1}+v_{2}^{*}v_{2}+v_{3}^{*}v_{3}

where v_{i}^{*} denotes the complex conjugate of the complex number v_{i}. This is the square of the analogous notion of “magnitude” for vectors with complex number entries.

Just as orthogonal matrices must satisfy the condition

\displaystyle AA^{T}=A^{T}A=I,

unitary matrices are required to satisfy the condition

\displaystyle AA^{\dagger}=A^{\dagger}A=I

where A^{\dagger} is the Hermitian conjugate of A, a matrix whose entries are the complex conjugates of the entries of the transpose A^{T} of A.

An element of the group \text{SU}(2) is therefore a 2\times 2 unitary matrix whose determinant is equal to 1. Like the group \text{SO}(3), the group \text{SU}(2) is also a group which is not abelian.

Unlike the analogous case in 2 dimensions, the groups \text{SO}(3) and \text{SU}(2) are not isomorphic. There is no one-to-one correspondence between them. However, there is a homomorphism from \text{SU}(2) to \text{SO}(3) that is “two-to-one”, i.e. there are always two elements of \text{SU}(2) that get mapped to the same element of \text{SO}(3) under this homomorphism. Hence, \text{SU}(2) is often referred to as a “double cover” of \text{SO}(3).
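The two-to-one homomorphism can be made explicit. One standard way to write it, which we assume here since the post does not give a formula, uses the Pauli matrices \sigma_{1},\sigma_{2},\sigma_{3} and sends a unitary matrix U to the 3\times 3 matrix with entries R_{ij}=\frac{1}{2}\text{tr}(\sigma_{i}U\sigma_{j}U^{\dagger}). The sketch below checks that U and -U are sent to the same rotation; all function names are our own.

```python
import cmath
import math

# Pauli matrices (an assumption of this sketch; they do not appear in the text)
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def mul(a, b):
    """Product of two 2x2 complex matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Hermitian conjugate: complex conjugate of the transpose."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def su2(alpha, beta):
    """Element of SU(2) built from two complex numbers, normalized."""
    norm = math.sqrt(abs(alpha) ** 2 + abs(beta) ** 2)
    a, b = complex(alpha) / norm, complex(beta) / norm
    return [[a, -b.conjugate()], [b, a.conjugate()]]


def to_so3(u):
    """R_ij = (1/2) tr(sigma_i U sigma_j U^dagger)."""
    ud = dagger(u)
    rows = []
    for i in range(3):
        row = []
        for j in range(3):
            m = mul(SIGMA[i], mul(u, mul(SIGMA[j], ud)))
            row.append(((m[0][0] + m[1][1]) / 2).real)
        rows.append(row)
    return rows


u = su2(1 + 2j, 0.5 - 1j)
minus_u = [[-entry for entry in row] for row in u]

r_plus, r_minus = to_so3(u), to_so3(minus_u)
assert all(abs(r_plus[i][j] - r_minus[i][j]) < 1e-12
           for i in range(3) for j in range(3))
```

With this convention the diagonal matrix with entries e^{-i\varphi/2} and e^{i\varphi/2} is sent to the rotation by \varphi about the z-axis, so a rotation by 360 degrees corresponds to U=-I rather than to the identity of \text{SU}(2).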

In physics, this concept underlies the weird behavior of quantum-mechanical objects called spinors (such as electrons), which require a rotation of 720, not 360, degrees to return to their original state!

The groups we have so far discussed are not “merely” groups. They also possess another kind of mathematical structure. They describe certain shapes which happen to have no sharp corners or edges. Technically, such a shape is called a manifold, and it is the object of study of the branch of mathematics called differential geometry, which we have discussed certain basic aspects of in Geometry on Curved Spaces and Connection and Curvature in Riemannian Geometry.

For the circle group, the manifold that it describes is itself a circle. The elements of the circle group correspond to the points of the circle. The group \text{SU}(2) is the surface of the 4-dimensional sphere, or what we call a 3-sphere (for those who might be confused by the terminology, recall that we are only considering the surface of the sphere, not the entire volume, and this surface is a 3-dimensional, not a 4-dimensional, object). The group \text{SO}(3) is 3-dimensional real projective space, written \mathbb{RP}^{3}. It is a manifold which can be described using the concepts of projective geometry (see Projective Geometry).

A group that is also a manifold is called a Lie group (pronounced like “lee”) in honor of the mathematician Marius Sophus Lie who pioneered much of their study. Lie groups are very interesting objects of study in mathematics because they bring together the techniques of group theory and differential geometry, which teaches us about Lie groups on one hand, and on the other hand also teaches us more about both group theory and differential geometry themselves.


Orthogonal Group on Wikipedia

Rotation Group SO(3) on Wikipedia

Euler Angles on Wikipedia

Unitary Group on Wikipedia

Spinor on Wikipedia

Lie Group on Wikipedia

Real Projective Space on Wikipedia

Algebra by Michael Artin