Differentiable Manifolds Revisited

In many posts on this blog, such as Geometry on Curved Spaces and Connection and Curvature in Riemannian Geometry, we have discussed the subject of differential geometry, usually in the context of physics. We have discussed what is probably its most famous application to date, as the mathematical framework of general relativity, which in turn is the foundation of modern-day astrophysics. We have also seen its other applications to gauge theory in particle physics, and in describing the phase space, whose points correspond to the “states” (described by the position and momentum of particles) of a physical system in the Hamiltonian formulation of classical mechanics.

In this post, similar to what we have done in Varieties and Schemes Revisited for the subject of algebraic geometry, we take on the objects of study of differential geometry in more technical terms. These objects correspond to our everyday intuition, but we must develop some technical language in order to treat them “rigorously”, and also to be able to generalize them into other interesting objects. As we give the technical definitions, we will also discuss the intuitive inspiration for these definitions.

Just as varieties and schemes are the main objects of study in algebraic geometry (that is until the ideas discussed in Grothendieck’s Relative Point of View were formulated), in differential geometry the main objects of study are the differentiable manifolds. Before we give the technical definition, we first discuss the intuitive idea of a manifold.

A manifold is some kind of space that “locally” looks like Euclidean space \mathbb{R}^{n}. 1-dimensional Euclidean space is just the line \mathbb{R}, 2-dimensional Euclidean space is the plane \mathbb{R}^{2}, and so on. Obviously, Euclidean space itself is a manifold, but we want to look at more interesting examples, i.e. spaces that “locally” look like Euclidean space but “globally” are very different from it.

As an example, consider the surface of the Earth. “Locally”, that is, on small regions, the surface of the Earth appears flat. However, “globally”, we know that it is actually round.

Another way to think about things is that any small region on the surface of the Earth can be put on a flat map (possibly with some distortion of distances). However, there is no flat map that can include every point on the surface of the Earth while continuing to make sense. The best we can do is use several maps with some overlaps between them, transitioning between different maps when we change the regions we are looking at. We want these overlaps and transitions to make sense in some way.

In differential geometry, what we want is to be able to do calculus on these more general manifolds the way we can do calculus on the line, on the plane, and so on. In order to do this, we require that the “transitions” alluded to in the previous paragraph are given by differentiable functions.

Summarizing the above discussion in technical terms, an n-dimensional differentiable manifold is a topological space X with homeomorphisms \varphi_{\alpha} from open subsets U_{\alpha} covering X to open subsets of \mathbb{R}^{n}, such that each composition \varphi_{\alpha}\circ\varphi_{\beta}^{-1} is a differentiable function on \varphi_{\beta}(U_{\alpha}\cap U_{\beta})\subset\mathbb{R}^{n}.

Following the analogy with maps we discussed earlier, the pair \{U_{\alpha}, \varphi_{\alpha}\} is called a chart, and the collection of all these charts that cover the manifold is called an atlas. The map \varphi_{\alpha}\circ\varphi_{\beta}^{-1}|_{\varphi_{\beta}(U_{\alpha}\cap U_{\beta})} is called a transition map.
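To make the definition concrete, here is a small numerical sketch (not from the original discussion above, just an illustration): the unit circle S^{1} is a 1-dimensional manifold, and the two standard stereographic projection charts (projecting from the north and south poles) cover it. On the overlap, the transition map works out to t\mapsto 1/t, which is differentiable away from t=0.

```python
# Two stereographic projection charts on the unit circle S^1 (a 1-manifold).
# phi_N projects from the north pole (0, 1); phi_S from the south pole (0, -1).

def phi_N(x, y):
    return x / (1 - y)

def phi_S(x, y):
    return x / (1 + y)

def phi_N_inverse(t):
    # The point of S^1 whose coordinate under phi_N is t.
    return (2 * t / (t**2 + 1), (t**2 - 1) / (t**2 + 1))

# On the overlap (the circle minus both poles), the transition map
# phi_S ∘ phi_N^{-1} is t ↦ 1/t, a differentiable function for t ≠ 0.
for t in [0.5, 1.0, -2.0, 3.7]:
    x, y = phi_N_inverse(t)
    assert abs(x**2 + y**2 - 1) < 1e-12      # the point really lies on S^1
    assert abs(phi_S(x, y) - 1 / t) < 1e-12  # the transition map equals 1/t
```

Note that neither chart alone covers the whole circle (each misses one pole), which is exactly why an atlas of several charts is needed.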

Now that we have defined what a manifold technically is, we discuss some related concepts, in particular the objects that “live” on our manifold. Perhaps the most basic of these objects are the functions on the manifold; however, we won’t discuss the functions themselves too much since there are not that many new concepts regarding them.

Instead, we will use one of the most useful concepts when it comes to discussing objects that “live” on manifolds – fiber bundles (see Vector Fields, Vector Bundles, and Fiber Bundles). A fiber bundle is given by a topological space E with a projection \pi from E to a base space B, with the requirement that, for sufficiently small open sets U of B, the space \pi^{-1}(U) is homeomorphic to the product space U\times F, where F is the fiber, defined as \pi^{-1}(x) for any point x of B. When the fiber F is also a vector space, we refer to E as a vector bundle. In differential geometry, we require that the relevant maps also be diffeomorphisms, i.e. differentiable bijections with differentiable inverses.

One of the most important kinds of vector bundles in differential geometry are the tangent bundles, which can be thought of as the collection of all the tangent spaces of a manifold at every point, for all the points of the manifold. We have already made use of these concepts in Geometry on Curved Spaces, and Connection and Curvature in Riemannian Geometry. We needed it, for example, to discuss the notion of parallel transport and the covariant derivative in Riemannian geometry. We will now discuss these concepts more technically.

Let \mathcal{O}_{p} be the ring of real-valued differentiable functions defined in a neighborhood of a point p in a differentiable manifold X. We define the real tangent space at p, written T_{\mathbb{R},p}(X), to be the vector space of p-centered \mathbb{R}-linear derivations, which are \mathbb{R}-linear maps D: \mathcal{O}_{p}\rightarrow\mathbb{R} satisfying Leibniz’s rule D(fg)=f(p)Dg+g(p)Df. Any such derivation D can be written in the following form:

\displaystyle D=\sum_{i}a_{i}\frac{\partial}{\partial x_{i}}\bigg\rvert_{p}

This means that the operators \frac{\partial}{\partial x_{i}}\big\rvert_{p} form a basis for the real tangent space at p. It might be a little jarring to see “differential operators” serving as a basis for a vector space, but it might perhaps be helpful to think of tangent vectors as giving “how fast” functions on the manifold are changing at a certain point. See the following picture:


The manifold is M, and its tangent space at the point x is T_{x}M. One of the tangent vectors, v, is shown. The parametrized curve \gamma(t) is often used to define the tangent vector, although that is not the approach we have given here (it may be found in the references, and is closely related to the definition we have given).
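The derivation definition can also be checked numerically. In the sketch below (the point p=(1,1) and the coefficients (2,3) are made-up illustrative values, not from the text above), we approximate D=2\,\partial/\partial x+3\,\partial/\partial y at p by finite differences and verify Leibniz’s rule D(fg)=f(p)Dg+g(p)Df.

```python
# A tangent vector at p, viewed as a derivation D = 2 ∂/∂x + 3 ∂/∂y at p = (1, 1).
# Partial derivatives are approximated by central finite differences.
p = (1.0, 1.0)
a = (2.0, 3.0)
h = 1e-5

def D(f):
    (px, py), (a1, a2) = p, a
    dfdx = (f(px + h, py) - f(px - h, py)) / (2 * h)
    dfdy = (f(px, py + h) - f(px, py - h)) / (2 * h)
    return a1 * dfdx + a2 * dfdy

f = lambda x, y: x**2 * y
g = lambda x, y: x + y**3
fg = lambda x, y: f(x, y) * g(x, y)

# Leibniz's rule: D(fg) = f(p) Dg + g(p) Df
assert abs(D(fg) - (f(*p) * D(g) + g(*p) * D(f))) < 1e-4
```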

Another concept that we will need is the concept of 1-forms. A 1-form at a particular point on the manifold takes a single tangent vector (an element of the tangent space at that particular point) as an input and gives a number as an output. Just as we have the notion of tangent vectors, tangent spaces, and tangent bundles, we also have the “dual” notion of 1-forms, cotangent spaces, and cotangent bundles, and just as the basis of the tangent space is given by the \frac{\partial}{\partial x_{i}}, we also have a basis of 1-forms given by the dx_{i}.

Aside from 1-forms, we also have mathematical objects that take two elements of the tangent space at a point (i.e. two tangent vectors at that point) as inputs and give a number as an output.

An example that we have already discussed in this blog is the metric tensor, which we refer to sometimes as simply the metric (calling it the metric tensor, however, helps prevent confusion as there are many different concepts in mathematics also referred to as a metric). We have been thinking of the metric tensor as expressing the “infinitesimal distance formula” at a certain point on the manifold.

The metric tensor is defined as a symmetric, nondegenerate, bilinear form. “Symmetric” means that we can interchange the two inputs (the tangent vectors) and get the same output. “Nondegenerate” means that, holding one of the inputs fixed and letting the other vary, having an output of zero for all the varying inputs means that the fixed input must be zero. “Bilinear form” means that it is linear in either input – it respects addition of vectors and multiplication by scalars. If we hold one input fixed, it is then a linear transformation of the other input.
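At a single point, a metric tensor can be represented by a symmetric matrix G, with g(v,w)=v^{T}Gw. The following sketch (the particular matrices are standard illustrative examples, not taken from the text above) checks the three defining properties for the Euclidean metric and for the Minkowski metric of special relativity.

```python
import numpy as np

# A metric tensor at a point, represented by a symmetric matrix G: g(v, w) = v^T G w.
G_euclid = np.diag([1.0, 1.0, 1.0])           # positive-definite: Riemannian
G_minkowski = np.diag([-1.0, 1.0, 1.0, 1.0])  # indefinite: Lorentzian

def g(G, v, w):
    return v @ G @ w

for G in (G_euclid, G_minkowski):
    n = G.shape[0]
    v, w = np.arange(1.0, n + 1), np.arange(float(n), 0.0, -1.0)
    assert np.isclose(g(G, v, w), g(G, w, v))            # symmetric
    assert abs(np.linalg.det(G)) > 0                     # nondegenerate
    assert np.isclose(g(G, 2 * v + w, w),
                      2 * g(G, v, w) + g(G, w, w))       # bilinear

# In the Lorentzian case, g(v, v) can be negative for some tangent vectors:
t = np.array([1.0, 0.0, 0.0, 0.0])
assert g(G_minkowski, t, t) < 0
```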

In the case of our previous discussions on Riemannian geometry, the metric tensor gives a positive real number whenever its two inputs are the same nonzero tangent vector, expressing the (square of the) infinitesimal distance. Hence, a metric tensor on a differentiable manifold which is positive-definite in this sense is called a Riemannian metric. A manifold with a Riemannian metric is of course called a Riemannian manifold.

In general relativity, the spacetime interval, unlike the distance, may not necessarily be positive. More technically, spacetime in general relativity is an example of a pseudo-Riemannian (or semi-Riemannian) manifold, which does not require the metric to be positive-definite (more specifically it is a Lorentzian manifold – we will leave the details of these definitions to the references for now). As we have seen though, many concepts from the study of Riemannian manifolds carry over to the pseudo-Riemannian case.

Another example of these kinds of objects are the differential forms (see Differential Forms). One important example of these objects is the symplectic form in symplectic geometry (see An Intuitive Introduction to String Theory and (Homological) Mirror Symmetry), which is used as the mathematical framework of the Hamiltonian formulation of classical mechanics. Just as the metric tensor is related to the “infinitesimal distance”, the symplectic form is related to the “infinitesimal area”.

As an example of the symplectic form, the “phase space” in the Hamiltonian formulation of classical mechanics is made up of points which correspond to a “state” of a system as given by the position and momentum of its particles. For the simple case of one particle constrained to move in a line, the symplectic form (written \omega) is given by

\displaystyle \omega=\displaystyle dq\wedge dp

where q is the position and p is the momentum, serving as the coordinates of the phase space (by the way, the phase space is itself already the cotangent bundle of the configuration space, the space whose points are the different “configurations” of the system, which we can think of as a generalization of the concept of position).
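On two tangent vectors v=(v_{q},v_{p}) and w=(w_{q},w_{p}) of this phase space, the form \omega=dq\wedge dp evaluates to v_{q}w_{p}-v_{p}w_{q}, the signed area of the parallelogram they span. A minimal sketch:

```python
# The symplectic form ω = dq ∧ dp on the 2-dimensional phase space of one
# particle on a line, evaluated on tangent vectors v = (v_q, v_p), w = (w_q, w_p).
def omega(v, w):
    return v[0] * w[1] - v[1] * w[0]  # dq(v) dp(w) - dq(w) dp(v)

v, w = (1.0, 0.0), (0.0, 1.0)
assert omega(v, w) == 1.0   # the unit "area"
assert omega(w, v) == -1.0  # antisymmetric
assert omega(v, v) == 0.0   # any vector spans zero area with itself
```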

Technically, the symplectic form is defined as a closed, nondegenerate 2-form. By “2-form”, we mean that it is a differential form, obeying the properties we gave in Differential Forms, such as antisymmetry. The notion of a differential form being “closed”, also already discussed in the same blog post, means that its exterior derivative is zero. “Nondegenerate” of course was already defined in the preceding paragraphs. The symplectic form is also a bilinear form, although this is a property of all 2-forms, considered as functions of two tangent vectors at some point on the manifold. More generally, all differential forms are examples of multilinear forms. A manifold with a symplectic form is called a symplectic manifold.

There is still so much more to differential geometry, but for now, we have at least accomplished the task of defining some of its most basic concepts in a more technical manner. The language we have discussed here is important to deeper discussions of differential geometry.


Differential Geometry on Wikipedia

Differentiable Manifold on Wikipedia

Tangent Space on Wikipedia

Tangent Bundle on Wikipedia

Cotangent Space on Wikipedia

Cotangent Bundle on Wikipedia

Riemannian Manifold on Wikipedia

Pseudo-Riemannian Manifold on Wikipedia

Symplectic Manifold on Wikipedia

Differential Geometry of Curves and Surfaces by Manfredo P. do Carmo

Differential Geometry: Bundles, Connections, Metrics and Curvature by Clifford Henry Taubes

Foundations of Differential Geometry by Shoshichi Kobayashi and Katsumi Nomizu

Geometry, Topology, and Physics by Mikio Nakahara

Rotations in Three Dimensions

In Rotating and Reflecting Vectors Using Matrices we learned how to express rotations in 2-dimensional space using certain special 2\times 2 matrices which form a group (see Groups) we call the special orthogonal group in dimension 2, or \text{SO}(2) (together with other matrices which express reflections, they form a bigger group that we call the orthogonal group in 2 dimensions, or \text{O}(2)).

In this post, we will discuss rotations in 3-dimensional space. As we will soon see, rotations in 3-dimensional space have certain interesting features not present in the 2-dimensional case, and despite being seemingly simple and mundane, play very important roles in some of the deepest aspects of fundamental physics.

We will first discuss rotations in 3-dimensional space as represented by the special orthogonal group in dimension 3, written as \text{SO}(3).

We recall some relevant terminology from Rotating and Reflecting Vectors Using Matrices. A matrix is called orthogonal if it preserves the magnitude of (real) vectors; that is, for a matrix A to be orthogonal, the magnitude of the vector Av must be equal to the magnitude of the vector v, for every vector v. Alternatively, we may require, for the matrix A to be orthogonal, that it satisfy the condition

\displaystyle AA^{T}=A^{T}A=I

where A^{T} is the transpose of A and I is the identity matrix. The word “special” denotes that our matrices must have determinant equal to 1. Therefore, the group \text{SO}(3) consists of the 3\times3 orthogonal matrices whose determinant is equal to 1.

The idea of using the group \text{SO}(3) to express rotations in 3-dimensional space may be made more concrete using several different formalisms.

One popular formalism is given by the so-called Euler angles. In this formalism, we break down any arbitrary rotation in 3-dimensional space into three separate rotations. The first, which we write here by \varphi, is expressed as a counterclockwise rotation about the z-axis. The second, \theta, is a counterclockwise rotation about an x-axis that rotates along with the object. Finally, the third, \psi, is expressed as a counterclockwise rotation about a z-axis that, once again, has rotated along with the object. For readers who may be confused, animations of these steps can be found among the references listed at the end of this post.

The matrix which expresses the rotation which is the product of these three rotations can then be written as

\displaystyle g(\varphi,\theta,\psi) = \left(\begin{array}{ccc} \text{cos}(\varphi)\text{cos}(\psi)-\text{cos}(\theta)\text{sin}(\varphi)\text{sin}(\psi) & -\text{cos}(\varphi)\text{sin}(\psi)-\text{cos}(\theta)\text{sin}(\varphi)\text{cos}(\psi) & \text{sin}(\varphi)\text{sin}(\theta) \\ \text{sin}(\varphi)\text{cos}(\psi)+\text{cos}(\theta)\text{cos}(\varphi)\text{sin}(\psi) & -\text{sin}(\varphi)\text{sin}(\psi)+\text{cos}(\theta)\text{cos}(\varphi)\text{cos}(\psi) & -\text{cos}(\varphi)\text{sin}(\theta) \\ \text{sin}(\psi)\text{sin}(\theta) & \text{cos}(\psi)\text{sin}(\theta) & \text{cos}(\theta) \end{array}\right).

The reader may check that, in the case that the rotation is strictly in the xy plane, i.e. \theta and \psi are zero, we will obtain

\displaystyle g(\varphi,\theta,\psi) = \left(\begin{array}{ccc} \text{cos}(\varphi) & -\text{sin}(\varphi) & 0 \\ \text{sin}(\varphi) & \text{cos}(\varphi) & 0 \\ 0 & 0 & 1 \end{array}\right).

Note how the upper left part is an element of \text{SO}(2), expressing a counterclockwise rotation by an angle \varphi, as we might expect.
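The Euler-angle matrix above is the product of the three basic rotations; a short numerical sketch (the angle values below are arbitrary illustrative choices) confirms that the product R_{z}(\varphi)R_{x}(\theta)R_{z}(\psi) lands in \text{SO}(3) and reduces to a rotation in the xy plane when \theta=\psi=0.

```python
import numpy as np

def Rz(a):
    return np.array([[np.cos(a), -np.sin(a), 0.0],
                     [np.sin(a),  np.cos(a), 0.0],
                     [0.0, 0.0, 1.0]])

def Rx(a):
    return np.array([[1.0, 0.0, 0.0],
                     [0.0, np.cos(a), -np.sin(a)],
                     [0.0, np.sin(a),  np.cos(a)]])

def g_euler(phi, theta, psi):
    # z-x-z Euler angles; this product reproduces the matrix given above.
    return Rz(phi) @ Rx(theta) @ Rz(psi)

R = g_euler(0.3, 1.1, -0.7)
assert np.allclose(R @ R.T, np.eye(3))    # orthogonal
assert np.isclose(np.linalg.det(R), 1.0)  # determinant 1, so R is in SO(3)

# theta = psi = 0 reduces to a counterclockwise rotation in the xy plane:
assert np.allclose(g_euler(0.3, 0.0, 0.0), Rz(0.3))
```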

Contrary to the case of \text{SO}(2), which is an abelian group, the group \text{SO}(3) is not an abelian group. This means that for two elements a and b of \text{SO}(3), the product ab may not always be equal to the product ba. One can check this explicitly, or simply consider rotating an object about different axes; for example, rotating an object first counterclockwise by 90 degrees about the z-axis, and then counterclockwise by 90 degrees about the x-axis, will not give the same result as performing the same operations in the opposite order.
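This non-commutativity is easy to verify directly with the two 90-degree rotation matrices just described:

```python
import numpy as np

# 90-degree counterclockwise rotations about the z-axis and the x-axis:
Rz90 = np.array([[0.0, -1.0, 0.0],
                 [1.0,  0.0, 0.0],
                 [0.0,  0.0, 1.0]])
Rx90 = np.array([[1.0, 0.0,  0.0],
                 [0.0, 0.0, -1.0],
                 [0.0, 1.0,  0.0]])

# Performing the rotations in opposite orders gives different results,
# so SO(3) is not abelian:
assert not np.allclose(Rx90 @ Rz90, Rz90 @ Rx90)
```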

We now know how to express rotations in 3-dimensional space using 3\times 3 orthogonal matrices. Now we discuss another way of expressing the same concept, but using “unitary”, instead of orthogonal, matrices. However, first we must revisit rotations in 2 dimensions.

The group \text{SO}(2) is not the only way we have of expressing rotations in 2 dimensions. For example, we can also make use of the unitary group in dimension 1 (we will explain the meaning of “unitary” shortly), written \text{U}(1). It is the group formed by the complex numbers with magnitude equal to 1. The elements of this group can always be written in the form e^{i\theta}, where \theta is the angle of our rotation. As we have seen in Connection and Curvature in Riemannian Geometry, this group is related to quantum electrodynamics, as it expresses the gauge symmetry of the theory.

The groups \text{SO}(2) and \text{U}(1) are actually isomorphic. There is a one-to-one correspondence between the elements of \text{SO}(2) and the elements of \text{U}(1) which respects the group operation. In other words, there is a bijective function f:\text{SO}(2)\rightarrow\text{U}(1) which satisfies f(ab)=f(a)f(b) for all elements a, b of \text{SO}(2). When two groups are isomorphic, we may consider them as being essentially the same group. For this reason, both \text{SO}(2) and \text{U}(1) are often referred to as the circle group.
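A concrete choice of this isomorphism reads e^{i\theta} off the first column of the rotation matrix; the sketch below (with arbitrary illustrative angles) checks that it lands in \text{U}(1) and respects the group operation.

```python
import numpy as np

def R(theta):
    # An element of SO(2): counterclockwise rotation by theta.
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

def f(A):
    # The isomorphism SO(2) → U(1): read off e^{i theta} = cos θ + i sin θ
    # from the first column of the rotation matrix.
    return complex(A[0, 0], A[1, 0])

a, b = 0.4, 1.3
assert np.isclose(abs(f(R(a))), 1.0)                  # f(R) has magnitude 1
assert np.isclose(f(R(a) @ R(b)), f(R(a)) * f(R(b)))  # f(AB) = f(A) f(B)
```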

We can now go back to rotations in 3 dimensions and discuss the group \text{SU}(2), the special unitary group in dimension 2. The word “unitary” is in some way analogous to “orthogonal”, but applies to vectors with complex number entries.

Consider an arbitrary vector

\displaystyle v=\left(\begin{array}{c}v_{1}\\v_{2}\\v_{3}\end{array}\right).

An orthogonal matrix, as we have discussed above, preserves the quantity (which is the square of what we have referred to earlier as the “magnitude” for vectors with real number entries)

\displaystyle v_{1}^{2}+v_{2}^{2}+v_{3}^{2}

while a unitary matrix preserves

\displaystyle v_{1}^{*}v_{1}+v_{2}^{*}v_{2}+v_{3}^{*}v_{3}

where v_{i}^{*} denotes the complex conjugate of the complex number v_{i}. This is the square of the analogous notion of “magnitude” for vectors with complex number entries.

Just as orthogonal matrices must satisfy the condition

\displaystyle AA^{T}=A^{T}A=I,

unitary matrices are required to satisfy the condition

\displaystyle AA^{\dagger}=A^{\dagger}A=I

where A^{\dagger} is the Hermitian conjugate of A, a matrix whose entries are the complex conjugates of the entries of the transpose A^{T} of A.

An element of the group \text{SU}(2) is therefore a 2\times 2 unitary matrix whose determinant is equal to 1. Like the group \text{SO}(3), the group \text{SU}(2) is not abelian.

Unlike the analogous case in 2 dimensions, the groups \text{SO}(3) and \text{SU}(2) are not isomorphic; there is no one-to-one correspondence between them that respects the group operation. However, there is a homomorphism from \text{SU}(2) to \text{SO}(3) that is “two-to-one”, i.e. there are always exactly two elements of \text{SU}(2) that get mapped to the same element of \text{SO}(3) under this homomorphism. Hence, \text{SU}(2) is often referred to as a “double cover” of \text{SO}(3).

In physics, this concept underlies the weird behavior of quantum-mechanical objects called spinors (which describe particles such as electrons), which require a rotation of 720, not 360, degrees to return to their original state!
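Both phenomena can be sketched numerically using the standard two-to-one homomorphism R_{ij}=\tfrac{1}{2}\text{tr}(\sigma_{i}U\sigma_{j}U^{\dagger}) built from the Pauli matrices (the particular angle and axis below are illustrative choices):

```python
import numpy as np

# Pauli matrices
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
paulis = [sx, sy, sz]

def su2(theta, n):
    # U = cos(θ/2) I - i sin(θ/2) (n·σ): the SU(2) element for a rotation
    # by angle θ about the unit axis n.
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    ndots = n[0] * sx + n[1] * sy + n[2] * sz
    return np.cos(theta / 2) * np.eye(2) - 1j * np.sin(theta / 2) * ndots

def to_so3(U):
    # The two-to-one homomorphism SU(2) → SO(3): R_ij = ½ tr(σ_i U σ_j U†).
    return np.array([[0.5 * np.trace(si @ U @ sj @ U.conj().T).real
                      for sj in paulis] for si in paulis])

U = su2(1.2, [1, 2, 2])
assert np.allclose(U @ U.conj().T, np.eye(2))  # U is unitary
assert np.isclose(np.linalg.det(U), 1.0)       # det 1, so U is in SU(2)

R = to_so3(U)
assert np.allclose(R @ R.T, np.eye(3))         # R is orthogonal
assert np.isclose(np.linalg.det(R), 1.0)       # R is in SO(3)
assert np.allclose(to_so3(-U), R)              # U and -U give the SAME rotation

# The spinor behavior: a 360-degree rotation flips the sign of U,
# and only a 720-degree rotation returns it to the identity.
assert np.allclose(su2(2 * np.pi, [0, 0, 1]), -np.eye(2))
assert np.allclose(su2(4 * np.pi, [0, 0, 1]), np.eye(2))
```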

The groups we have so far discussed are not “merely” groups. They also possess another kind of mathematical structure. They describe certain shapes which happen to have no sharp corners or edges. Technically, such a shape is called a manifold, and it is the object of study of the branch of mathematics called differential geometry, certain basic aspects of which we have discussed in Geometry on Curved Spaces and Connection and Curvature in Riemannian Geometry.

For the circle group, the manifold that it describes is itself a circle. The elements of the circle group correspond to the points of the circle. The group \text{SU}(2) is the 3-sphere, the surface of the unit sphere in 4-dimensional space (for those who might be confused by the terminology, recall that we are only considering the surface of the sphere, not the entire volume, and this surface is a 3-dimensional, not a 4-dimensional, object). The group \text{SO}(3) is 3-dimensional real projective space, written \mathbb{RP}^{3}. It is a manifold which can be described using the concepts of projective geometry (see Projective Geometry).

A group that is also a manifold is called a Lie group (pronounced like “lee”) in honor of the mathematician Marius Sophus Lie who pioneered much of their study. Lie groups are very interesting objects of study in mathematics because they bring together the techniques of group theory and differential geometry, which teaches us about Lie groups on one hand, and on the other hand also teaches us more about both group theory and differential geometry themselves.


Orthogonal Group on Wikipedia

Rotation Group SO(3) on Wikipedia

Euler Angles on Wikipedia

Unitary Group on Wikipedia

Spinor on Wikipedia

Lie Group on Wikipedia

Real Projective Space on Wikipedia

Algebra by Michael Artin

An Intuitive Introduction to String Theory and (Homological) Mirror Symmetry

String theory is by far the most popular of the current proposals to unify the as of now still incompatible theories of quantum mechanics and general relativity. In this post we will give a short overview of the concepts involved in string theory, but not with the goal of discussing the theory itself in depth (hopefully there will be more posts in the future working towards this task). Instead, we will focus on introducing a very interesting and very beautiful branch of mathematics that arose out of string theory called mirror symmetry. In particular, we will focus on a version of it originally formulated by the mathematician Maxim Kontsevich in 1994 called homological mirror symmetry.

We will start with string theory. String theory started out as a theory of the nuclear forces that hold together the protons and neutrons in the nucleus of an atom. It was abandoned later on, due to a more successful theory called quantum chromodynamics taking its place. However, it was soon found that string theory could model the elusive graviton, a particle “carrier” of gravity in the same way that a photon is a particle “carrier” of electromagnetism (the photon is more popularly referred to as a particle of light, but because light itself is an electromagnetic wave, it is also a manifestation of an electromagnetic field), and since then physicists have continued developing string theory, no longer in the sole context of nuclear forces, but as a possible candidate for a working theory of quantum gravity.

The incompatibility of quantum mechanics and general relativity (which is currently our accepted theory of gravity) arises from the nonrenormalizability of gravity. In calculations in quantum field theory (see Some Basics of Relativistic Quantum Field Theory and Some Basics of (Quantum) Electrodynamics), there appear certain “nonsensical” quantities which are made sense of via a “corrective” procedure called renormalization (not to be confused with some other procedures called “normalization”). While the way that renormalization works is not really completely understood at the moment, it is known that this procedure at least “works” – this means that it produces the correct values of quantities, as can be checked via experiment.

Renormalization, while it works for the other forces, fails for gravity. Roughly, this is sometimes described as gravity “wildly fluctuating” at the smallest scales. What we know is that this signals, for us, a lack of knowledge of what physics is like at these extremely small scales (much smaller than the current scale of quantum mechanics).

String theory attempts to solve this conundrum by proposing that particles, at the very smallest scales, are not “particles” at all, but “strings”. This takes care of the problem of fluctuations at the smallest scales, since there is a limit to how small the scale can be, set by the length of the strings. It is perhaps worth noting at this point that the next most popular contender to string theory, loop quantum gravity, tackles this problem by postulating that space itself is not continuous, but “discretized” into units of a certain length. For both theories, this length is predicted to be around 10^{-35} meters, a constant quantity which is known as the Planck length.

Over time, as string theory was developed, it became more ambitious, aiming to provide not only the unification of quantum mechanics and general relativity, but also the unification of the four fundamental forces – electromagnetism, the weak nuclear force, the strong nuclear force, and gravity, under one “theory of everything“. At the same time, it needed more ingredients – to be able to account for bosons, the particles carrying “forces”, such as photons and gravitons, and the fermions, particles that make up matter, such as electrons, protons, and neutrons, a new ingredient had to be added, called supersymmetry. In addition, it worked not in the four dimensions of spacetime that we are used to, but instead required ten dimensions (for the “bosonic” string theory, before supersymmetry, the number of dimensions required was a staggering twenty-six)!

How do we explain spacetime having ten dimensions, when we experience only four? It turns out, even before string theory, the idea of extra dimensions was already explored by the physicists Theodor Kaluza and Oskar Klein. They proposed a theory unifying electromagnetism and gravity by postulating an “extra” dimension which was “curled up” into a loop so small we could never notice it. The usual analogy is that of an ant crossing a wire – when the radius of the wire is big, the ant realizes that it can go sideways along the wire, but when the radius of the wire is small, it is as if there is only one dimension that the ant can move along.

So we now have this idea of six curled up dimensions of spacetime, in addition to the usual four. It turns out that there are so many ways that these dimensions can be curled up. This phenomenon is called the string theory landscape, and it is one of the biggest problems facing string theory today. What could be the specific “shape” in which these dimensions are curled up, and why are they not curled up in some other way? Some string theorists answer this by resorting to the controversial idea of a multiverse, so that there are actually several existing universes, each with its own way of how the extra six dimensions are curled up, and we just happen to be in this one because, perhaps, this is the only one where the laws of physics (determined by the way the dimensions are curled up) are able to support life. This kind of reasoning is called the anthropic principle.

In addition to the string theory landscape, there was also the problem of having several different versions of string theory. These problems were perhaps alleviated by the discovery of mysterious dualities. For example, there is the so-called T-duality, where a compactification (a “curling up”) with a bigger radius gives the same laws of physics as a compactification with a smaller, “reciprocal” radius. Not only do these dualities connect the different ways in which the extra dimensions are curled up, they also connect the several different versions of string theory! In 1995, the physicist Edward Witten conjectured that this is perhaps because all these different versions of string theory come from a single “mother theory”, which he called “M-theory“.

In 1991, physicists Philip Candelas, Xenia de la Ossa, Paul Green, and Linda Parkes used these dualities to solve a mathematical problem that had occupied mathematicians for decades, that of counting curves on a certain manifold (a manifold is a shape without sharp corners or edges) known as a Calabi-Yau manifold. In the context of Calabi-Yau manifolds, which are some of the shapes in which the extra dimensions of spacetime are postulated to be curled up, these dualities are known as mirror symmetry. With the success of Candelas, de la Ossa, Green, and Parkes, mathematicians would take notice of mirror symmetry and begin to study it as a subject of its own.

Calabi-Yau manifolds are but special cases of Kahler manifolds, which themselves are very interesting mathematical objects because they can be studied using three aspects of differential geometry – Riemannian geometry, symplectic geometry, and complex geometry.

We have already encountered examples of Kahler manifolds on this blog – the elliptic curves (see Elliptic Curves and The Moduli Space of Elliptic Curves). In fact elliptic curves are not only Kahler manifolds but also Calabi-Yau manifolds, and they are the only compact two-dimensional Calabi-Yau manifolds (we sometimes refer to them as “one-dimensional” when we are considering “complex dimensions”, as is common practice in algebraic geometry – this apparent “discrepancy” in counting dimensions arises because we need two real numbers to specify a complex number). In string theory of course we consider six-dimensional (three-dimensional when considering complex dimensions) Calabi-Yau manifolds, since there are six extra curled up dimensions of spacetime, but it is often fruitful to also study the other cases, especially the simpler ones, since they can serve as our guide for the study of the more complicated cases.

Riemannian geometry studies Riemannian manifolds, which are manifolds equipped with a metric tensor, which intuitively corresponds to an “infinitesimal distance formula” dependent on where we are on the manifold. We have already encountered Riemannian geometry before in Geometry on Curved Spaces and Connection and Curvature in Riemannian Geometry. There we have seen that Riemannian geometry is very important in the mathematical formulation of general relativity, since in this theory gravity is just the curvature of spacetime, and the metric tensor expresses this curvature by showing how the formula for the infinitesimal distance between two points (actually the infinitesimal spacetime interval between two events) changes as we move around the manifold.

Symplectic geometry, meanwhile, studies symplectic manifolds. If Riemannian manifolds are equipped with a metric tensor that measures “distances”, symplectic manifolds are equipped with a symplectic form that measures “areas”. The origins of symplectic geometry are actually related to William Rowan Hamilton’s formulation of classical mechanics (see Lagrangians and Hamiltonians), as developed later on by Henri Poincare. There the object of study is phase space, which gives the state of a system based on the position and momentum of the objects that comprise it. It is this phase space that is expressed as a symplectic manifold.

Complex geometry, following our pattern, studies complex manifolds. These are manifolds which locally look like \mathbb{C}^{n}, in the same way that ordinary differentiable manifolds locally look like \mathbb{R}^{n}. Just as Riemannian geometry has metric tensors and symplectic geometry has symplectic forms, complex geometry has complex structures, mappings of tangent spaces with the property that applying them twice is the same as multiplication by -1, mimicking the usual multiplication by the imaginary unit i on the complex plane.
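The simplest complex structure is the one on \mathbb{R}^{2} itself, which identifies \mathbb{R}^{2} with \mathbb{C}; a quick sketch verifies the defining property J^{2}=-1 and the analogy with multiplication by i:

```python
import numpy as np

# The standard complex structure on R^2, mimicking multiplication by i on C:
J = np.array([[0.0, -1.0],
              [1.0,  0.0]])

# Applying J twice is the same as multiplication by -1:
assert np.allclose(J @ J, -np.eye(2))

# J sends the vector representing x + iy to the one representing i(x + iy) = -y + ix:
v = np.array([3.0, 4.0])              # represents 3 + 4i
assert np.allclose(J @ v, [-4.0, 3.0])  # represents i(3 + 4i) = -4 + 3i
```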

Complex manifolds are not only part of differential geometry, they are also often studied using the methods of algebraic geometry! We recall (see Basics of Algebraic Geometry) that algebraic geometry studies varieties and schemes, which are shapes such as lines, conic sections (parabolas, hyperbolas, ellipses, and circles), and elliptic curves, that can be described by polynomials (their modern definitions are generalizations of this concept). In fact, all Calabi-Yau manifolds can be described by polynomials, such as the following example, due to user Andrew J. Hanson of Wikipedia:


This is a visualization (actually a sort of “cross section”, since we can only display two dimensions and this object is actually six-dimensional) of the Calabi-Yau manifold described by the following polynomial equation:

\displaystyle V^{5}+W^{5}+X^{5}+Y^{5}+Z^{5}=0

This polynomial equation (known as the Fermat quintic) actually describes the Calabi-Yau manifold in projective space using homogeneous coordinates. This means that we are using the concepts of projective geometry (see Projective Geometry) to include “points at infinity”.
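As a small check (a sketch using the sympy computer algebra system, not part of the original discussion), we can verify that the Fermat quintic polynomial is homogeneous of degree 5, which is exactly what makes its zero set well-defined in projective space, where coordinates are only determined up to an overall scale factor:

```python
import sympy as sp

V, W, X, Y, Z, lam = sp.symbols('V W X Y Z lambda')
F = V**5 + W**5 + X**5 + Y**5 + Z**5   # the Fermat quintic

# Homogeneity of degree 5: F(λV, ..., λZ) = λ^5 F(V, ..., Z), so whether
# F vanishes at a point is independent of the overall scale, and the zero
# set is well-defined on the points [V:W:X:Y:Z] of projective space.
scaled = F.subs({s: lam * s for s in (V, W, X, Y, Z)})
print(sp.expand(scaled - lam**5 * F))   # 0
```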

We note at this point that Kahler manifolds and Calabi-Yau manifolds are interesting in their own right, even outside of the context of string theory. For instance, we have briefly mentioned in Algebraic Cycles and Intersection Theory the Hodge conjecture, one of the seven “Millennium Problems” for which the Clay Mathematics Institute is currently offering a million-dollar prize, and it concerns Kahler manifolds. Perhaps most importantly, the study of these manifolds “unifies” several different branches of mathematics; as we have already seen, it involves Riemannian geometry, symplectic geometry, complex geometry, and algebraic geometry. The more recent version of mirror symmetry called homological mirror symmetry further adds category theory and homological algebra to the mix.

Now what mirror symmetry more specifically states is that a version of string theory called Type IIA string theory, on a spacetime with extra dimensions compactified onto a certain Calabi-Yau manifold V, is the same as another version of string theory, called Type IIB string theory, on a spacetime with extra dimensions compactified onto another Calabi-Yau manifold W, which is “mirror” to the Calabi-Yau manifold V.

The statement of homological mirror symmetry (which is still conjectural, but mathematically proven in certain special cases) expresses the idea of the previous paragraph as follows (quoted verbatim from the paper Homological Algebra of Mirror Symmetry by Maxim Kontsevich):

Let (V,\omega) be a 2n-dimensional symplectic manifold with c_{1}(V)=0 and W be a dual n-dimensional complex algebraic manifold.

The derived category constructed from the Fukaya category F(V) (or a suitably enlarged one) is equivalent to the derived category of coherent sheaves on a complex algebraic variety W.

The statement makes use of the language of category theory and homological algebra (see Category Theory, More Category Theory: The Grothendieck Topos, Even More Category Theory: The Elementary Topos, Exact Sequences, More on Chain Complexes, and The Hom and Tensor Functors), but the idea that it basically expresses is that there exists a relation between the symplectic aspects of the Calabi-Yau manifold V, as encoded in its Fukaya category, and the complex aspects of the Calabi-Yau manifold W, as encoded in its category of coherent sheaves (see Sheaves and More on Sheaves). As we have said earlier, the subjects of algebraic geometry and complex geometry are closely related, and hence the language of sheaves shows up in (and is an important part of) both subjects. The concept of derived categories, which generalize derived functors like the Ext and Tor functors, allows us to relate the two categories, which otherwise would be expressing different concepts. Inspired by string theory, therefore, we now have a deep and beautiful idea in geometry, relating its different aspects.

Is string theory the correct way towards a complete theory of quantum gravity, or the so-called “theory of everything”? As of the moment, we don’t know. Quantum gravity is a very difficult problem, and the scales involved are still far out of our reach – in order to probe smaller and smaller scales we need particle accelerators with higher and higher energies, and right now the technologies that we have are still very, very far from the scales which are relevant to quantum gravity. Still, it is hoped that whatever we find in experiments in the near future, not only in the particle accelerators but also in the radio telescopes that look out into space, will at least guide us towards the correct path.

There are some who believe that, in the absence of definitive experimental evidence, mathematical beauty is our next best guide. And, without a doubt, string theory is related to, and has inspired, some very beautiful and very interesting mathematics, including that which we have discussed in this post. Still, physics, like all natural science, is empirical (based on evidence and observation), and hence it is ultimately physical evidence that will be the judge of correctness. It may yet turn out that string theory is wrong, and that it is a different theory which describes the fundamental physical laws of nature, or that it needs drastic modifications to its ideas. This will not invalidate the mathematics that we have described here, any more than the discoveries of Copernicus invalidated the mathematics behind the astronomical model of Ptolemy – in fact this mathematics not only outlived the astronomy of Ptolemy, but served the theories of Copernicus, and his successors, just as well. Hence we cannot really say that the efforts of Ptolemy were wasted, since even though his scientific ideas were shown to be wrong, still his mathematical methods were found very useful by those who succeeded him. Thus, while our current technological limitations prohibit us from confirming or ruling out proposals for a theory of quantum gravity such as string theory, there is still much to be gained from such continued efforts on the part of theory, while experiment is still in the process of catching up.

Our search for truth continues. Meanwhile, we have beauty to cultivate.


String Theory on Wikipedia

Mirror Symmetry on Wikipedia

Homological Mirror Symmetry on Wikipedia

Calabi-Yau Manifold on Wikipedia

Kahler Manifold on Wikipedia

Riemannian Geometry on Wikipedia

Symplectic Geometry on Wikipedia

Complex Geometry on Wikipedia

Fukaya Category on Wikipedia

Coherent Sheaf on Wikipedia

Derived Category on Wikipedia

Image by User Andrew J. Hanson of Wikipedia

Homological Algebra of Mirror Symmetry by Maxim Kontsevich

The Elegant Universe: Superstrings, Hidden Dimensions, and the Quest for the Ultimate Theory by Brian Greene

String Theory by Joseph Polchinski

String Theory and M-Theory: A Modern Introduction by Katrin Becker, Melanie Becker, and John Schwarz

Differential Forms

Differential forms are important concepts in differential geometry and mathematical physics. For example, they can be used to express Maxwell’s equations (see Some Basics of (Quantum) Electrodynamics) in a very elegant form. In this post, however, we will introduce these mathematical objects as generalizing certain aspects of integral calculus (see An Intuitive Introduction to Calculus), allowing us to perform integration over surfaces, volumes, or their higher-dimensional analogues.

We recall from An Intuitive Introduction to Calculus the statement of the fundamental theorem of calculus:

\displaystyle \int_{a}^{b}\frac{df}{dx}dx=f(b)-f(a).

Regarding the left hand side of this equation, we usually say that we integrate over the interval from a to b; we may therefore also write it more suggestively as

\displaystyle \int_{[a,b]}\frac{df}{dx}dx=f(b)-f(a).

We note that a and b form the boundary of the interval [a,b]. We denote the boundary of some “shape” M by \partial M. Therefore, in this case, \partial [a,b]=\{a\}\cup\{b\}.

Next we are going to perform some manipulations on the notation, which, while we will not thoroughly justify in this post, are meant to be suggestive and provide intuition for the discussion on differential forms. First we need the notion of orientation. We can imagine, for example, an “arrow” pointing from a to b; this would determine one orientation. Another would be determined by an “arrow” pointing from b to a. This is important because we need a notion of integration “from a to b” or “from b to a“, and the two are not the same. In fact,

\displaystyle \int_{a}^{b}\frac{df}{dx}dx=-\int_{b}^{a}\frac{df}{dx}dx

i.e. there is a change of sign if we “flip” the orientation. Although an interval such as [a,b] is one-dimensional, the notion of orientation continues to make sense in higher dimensions. If we have a surface, for example, we may consider going “clockwise” or “counterclockwise” around the surface. Alternatively we may consider an “arrow” indicating which “side” of the surface we are on. For three dimensions or higher it is harder to visualize, but we will be able to make this notion more concrete later on with differential forms.

Given the notion of orientation, let us now denote the boundary of the interval [a,b], taken with orientation, for instance, “from a to b“, by \{a\}^{-}\cup\{b\}^{+}.

Let us now write

\displaystyle \frac{df}{dx}dx=df

and then we can write the fundamental theorem of calculus as

\displaystyle \int_{[a,b]}df=f(b)-f(a).

Then we consider the idea of “integration over points”, by which we refer to simply evaluating the function at those points, with the orientation taken into account, such that we have

\displaystyle \int_{\{a\}^{-}\cup\{b\}^{+}}f=f(b)-f(a)

Recalling that \partial [a,b]=\{a\}^{-}\cup\{b\}^{+}, this now gives us the following expression for the fundamental theorem of calculus:

\displaystyle \int_{[a,b]}df=\int_{\{a\}^{-}\cup\{b\}^{+}}f

\displaystyle \int_{[a,b]}df=\int_{\partial [a,b]}f

Things may still be confusing to the reader at this point – for instance, the integral on the right hand side looks rather unusual – but we will hopefully make things more concrete shortly. For now, the rough idea that we want to keep in mind is the following:

The integral of a “differential” of some function over some shape is equal to the integral of the function over the boundary of the shape.

In one dimension, this is of course the fundamental theorem of calculus as we have stated it earlier. For two dimensions, there is a famous theorem called Green’s theorem. In three dimensions, there are two manifestations of this idea, known as Stokes’ theorem and the divergence theorem. The more “concrete” version of this statement, which we want to discuss in this post, is the following:
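The one-dimensional statement can be checked numerically (a quick sketch, with f(x)=x^{3} as an arbitrary sample choice) by approximating the integral of the derivative with a trapezoidal sum:

```python
import numpy as np

# Check ∫_[a,b] (df/dx) dx = f(b) - f(a) numerically,
# with the arbitrary sample choice f(x) = x^3.
f = lambda x: x**3
df = lambda x: 3 * x**2      # df/dx

a, b = 1.0, 2.0
x = np.linspace(a, b, 100001)
y = df(x)
# trapezoidal approximation of the integral of df/dx over [a, b]
integral = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))
print(integral, f(b) - f(a))   # both approximately 7.0
```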

The integral of the exterior derivative of a differential form over a manifold with boundary is equal to the integral of the differential form over the boundary.

We now discuss what these differential forms are. Instead of the formal definitions, we will start with special cases, develop intuition with examples, and attempt to generalize. The more formal definitions will be left to the references. We will start with the so-called 1-forms, which are “linear combinations” of the “differentials” of the coordinates, with coefficients which are functions:

\displaystyle f_{1}dx+f_{2}dy+f_{3}dz

We can think of these “differentials” as merely symbols for now, or perhaps consider them analogous to “infinitesimal quantities” in calculus. In differential geometry, however, they are actually “dual” to vectors, mapping vectors to numbers in the same way that row matrices map column matrices to the numbers which serve as their scalars (see Matrices).

From now on, to generalize, instead of the coordinates x, y, and z we will use x^{1}, x^{2}, x^{3}, and so on. We will write exponents as (x^{1})^{2}, to hopefully avoid confusion.

From these 1-forms we can form 2-forms by taking the wedge product. In ordinary multivariable calculus, the following expression

\displaystyle dxdy

represents an “infinitesimal area”, and so for example the integral

\displaystyle \int_{0}^{1}\int_{0}^{1}dxdy

gives us the area of a square with vertices at (0,0), (1,0), (0,1), and (1,1). The wedge product expresses this same idea (in fact the wedge product dx\wedge dy is often also called the area form, mirroring the idea expressed by dxdy earlier), except that we want to include the concept of orientation that we discussed earlier. Therefore, in order to express this idea of orientation, we require the wedge product to satisfy the following property called antisymmetry:

\displaystyle dx^{1}\wedge dx^{2}=-dx^{2}\wedge dx^{1}

Note that antisymmetry implies the following relation:

\displaystyle dx^{i}\wedge dx^{i}=-dx^{i}\wedge dx^{i}

\displaystyle dx^{i}\wedge dx^{i}=0

In other words, the wedge product of such a differential form with itself is equal to zero.
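To make the antisymmetry concrete, here is a small sketch (our own illustration, not a standard library) representing a 1-form by its coefficient vector, and the wedge product of two 1-forms by the antisymmetric matrix of coefficients (a\wedge b)_{ij}=a_{i}b_{j}-a_{j}b_{i}:

```python
import numpy as np

# Represent the 1-form a_1 dx^1 + a_2 dx^2 + a_3 dx^3 by its coefficient
# vector (a_1, a_2, a_3).  The wedge product of two 1-forms then has the
# antisymmetric components (a ∧ b)_{ij} = a_i b_j - a_j b_i.
def wedge(a, b):
    return np.outer(a, b) - np.outer(b, a)

dx = np.array([1.0, 0.0, 0.0])   # coefficients of dx^1
dy = np.array([0.0, 1.0, 0.0])   # coefficients of dx^2

print(np.array_equal(wedge(dx, dy), -wedge(dy, dx)))   # True: antisymmetry
print(np.all(wedge(dx, dx) == 0))                      # True: dx^i ∧ dx^i = 0
```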

We can also form 3-forms, 4-forms, etc. using the wedge product. The collection of all these n-forms, for every n, is the algebra of differential forms. This means that we can add, subtract, and form wedge products of differential forms. Ordinary functions themselves form the 0-forms.

We can also take what is called the exterior derivative of differential forms. If, for example, we have a differential form \omega given by the following expression,

\displaystyle \omega=f dx^{a}

then the exterior derivative of \omega, written d\omega, is given by

\displaystyle d\omega=\sum_{i=1}^{n}\frac{\partial f}{\partial x^{i}}dx^{i}\wedge dx^{a}.

We note that the exterior derivative of an n-form is an (n+1)-form. We also note that the exterior derivative of an exterior derivative is always zero, i.e. d(d\omega)=0 for any differential form \omega. A differential form which is the exterior derivative of some other differential form is called exact. A differential form whose exterior derivative is zero is called closed. The statement d(d\omega)=0 can also be expressed as follows:

All exact forms are closed.

However, not all closed forms are exact. This is reminiscent of the discussion in Homology and Cohomology, and in fact the study of closed forms which are not exact leads to the theory of de Rham cohomology, which is a very important part of modern mathematics and mathematical physics.
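We can verify the statement d(df)=0 symbolically for a function of two variables: if \omega=df=\frac{\partial f}{\partial x}dx+\frac{\partial f}{\partial y}dy, then the coefficient of dx\wedge dy in d\omega is \frac{\partial^{2}f}{\partial x\partial y}-\frac{\partial^{2}f}{\partial y\partial x}, which vanishes by the equality of mixed partial derivatives. A quick sympy check (with an arbitrary sample function):

```python
import sympy as sp

x, y = sp.symbols('x y')
f = sp.exp(x) * sp.sin(x * y)   # an arbitrary smooth sample function

# The exact 1-form df = (∂f/∂x) dx + (∂f/∂y) dy
fx, fy = sp.diff(f, x), sp.diff(f, y)

# d(df) has a single component, the coefficient of dx ∧ dy:
# ∂fy/∂x - ∂fx/∂y, which vanishes by equality of mixed partials
d_df = sp.simplify(sp.diff(fy, x) - sp.diff(fx, y))
print(d_df)   # 0
```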

Given the idea of the exterior derivative, the general form of the fundamental theorem of calculus is now given by the generalized Stokes’ theorem (sometimes simply called Stokes’ theorem; historically, however, as alluded to earlier, the original Stokes’ theorem refers only to a special case in three dimensions):

\displaystyle \int_{M}d\omega=\int_{\partial M}\omega

This is the idea we alluded to earlier, relating the integral of the exterior derivative of a differential form over some “shape” to the integral of the differential form (which includes functions as 0-forms) over the boundary of that “shape”.
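In two dimensions this is Green’s theorem, and we can check it symbolically on the unit square for a sample 1-form \omega=P\,dx+Q\,dy, for which d\omega=(\partial Q/\partial x-\partial P/\partial y)\,dx\wedge dy (the particular P and Q below are arbitrary choices):

```python
import sympy as sp

x, y = sp.symbols('x y')
P = x**2 * y     # arbitrary sample coefficients of ω = P dx + Q dy
Q = x + y**3

# Left-hand side: ∫∫ (∂Q/∂x - ∂P/∂y) dx dy over the unit square
lhs = sp.integrate(sp.diff(Q, x) - sp.diff(P, y), (x, 0, 1), (y, 0, 1))

# Right-hand side: ∮ P dx + Q dy counterclockwise around the boundary
bottom = sp.integrate(P.subs(y, 0), (x, 0, 1))    # (0,0) → (1,0)
right = sp.integrate(Q.subs(x, 1), (y, 0, 1))     # (1,0) → (1,1)
top = -sp.integrate(P.subs(y, 1), (x, 0, 1))      # (1,1) → (0,1)
left = -sp.integrate(Q.subs(x, 0), (y, 0, 1))     # (0,1) → (0,0)
rhs = bottom + right + top + left

print(lhs, rhs)   # both equal 2/3
```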

There is much more to the theory of differential forms than we have discussed here. For example, although we have referred to these “shapes” as manifolds with boundary, more generally they are “chains” (see also Homology and Cohomology – the similarities are not coincidental!). There are restrictions on these chains in order for the integral to give a function; for example, an n-form must be integrated over an n-dimensional chain (or simply n-chain) to give a function, otherwise the result will be some other differential form: an m-form integrated over an n-chain gives an (m-n)-form. Also, more rigorously, the concept of integration on more complicated spaces involves the notion of “pullback”. We will leave these concepts to the references for now, contenting ourselves with the discussion of the wedge product and exterior derivative in this post. The application of differential forms to physics is discussed in the very readable book Gauge Fields, Knots and Gravity by John Baez and Javier P. Muniain.


Differential Forms on Wikipedia

Green’s Theorem on Wikipedia

Divergence Theorem on Wikipedia

Stokes’ Theorem on Wikipedia

De Rham Cohomology on Wikipedia

Calculus on Manifolds by Michael Spivak

Gauge Fields, Knots and Gravity by John Baez and Javier P. Muniain

Geometry, Topology, and Physics by Mikio Nakahara

Connection and Curvature in Riemannian Geometry

In Geometry on Curved Spaces, we showed how different geometry can be when we are working on curved space instead of flat space, which we are usually more familiar with. We used the concept of a metric to express how the distance formula changes depending on where we are on this curved space. This gives us some way to “measure” the curvature of the space.

We also described the concept of parallel transport, which is in some way even more general than the metric, and can also be used to provide us with some measure of the curvature of a space. Although we can use concepts analogous to parallel transport even without the metric, if we do have a metric on the space and an expression for it, we can relate the concept of parallel transport to the metric, which is perhaps more intuitive. In this post, we formalize the concept of parallel transport by defining the Christoffel symbol and the Riemann curvature tensor, both of which we can obtain given the form of the metric. The Christoffel symbol and the Riemann curvature tensor are examples of the more general concepts of a connection and a curvature form, respectively, which need not be obtained from the metric.

Some Basics of Tensor Notation

First we establish some notation. We have already seen some tensor notation in Some Basics of (Quantum) Electrodynamics, but we explain a little bit more of that notation here, since it will be the language we will work in. Many of the ordinary vectors we are used to, such as the position, will be indexed by superscripts. We refer to these vectors as contravariant vectors. A common convention is to use Latin letters, such as i or j, as indices when we are working with space, and Greek letters, such as \mu and \nu, as indices when we are working with spacetime. Let us consider, for example, spacetime. An event in this spacetime is specified by its 4-position x^{\mu}, where x^{0}=ct, x^{1}=x, x^{2}=y, and x^{3}=z.

We will use the symbol g_{\mu\nu} for our metric, and we will also often express it as a matrix. For the case of flat spacetime, our metric is given by the Minkowski metric \eta_{\mu\nu}:

\displaystyle \eta_{\mu\nu}=\left(\begin{array}{cccc}-1&0&0&0\\0&1&0&0\\0&0&1&0\\ 0&0&0&1\end{array}\right)

We can use the metric to “raise” and “lower” indices. This is done by multiplying the metric and a vector, and summing over a common index (one will be a superscript and the other a subscript). We have introduced the Einstein summation convention in Some Basics of (Quantum) Electrodynamics, where repeated indices always imply summation, unless explicitly stated otherwise, and we will continue to use this convention for posts discussing differential geometry and the theory of relativity.

Here is an example of “lowering” the index of x^{\nu} in flat spacetime using the metric \eta_{\mu\nu} to obtain a new quantity x_{\mu}:

\displaystyle x_{\mu}=\eta_{\mu\nu}x^{\nu}

Explicitly, the components of the quantity x_{\mu} are given by x_{0}=-ct, x_{1}=x, x_{2}=y, and x_{3}=z. Note that the “time” component x_{0} has changed sign; this is because \eta_{00}=-1. A quantity such as x_{\mu}, which has a subscript index, is called a covariant vector.

In order to “raise” indices, we need the “inverse metric” g^{\mu\nu}. For the Minkowski metric \eta_{\mu\nu}, the inverse metric \eta^{\mu\nu} has the exact same components as \eta_{\mu\nu}, but for more general metrics this may not be the case. The general procedure for obtaining the inverse metric is to solve the equation

\displaystyle g^{\rho\nu}g_{\nu\mu}=\delta_{\mu}^{\rho}
where \delta_{\mu}^{\rho} is the Kronecker delta, a quantity that can be expressed as the matrix

\displaystyle \delta_{\mu}^{\rho}=\left(\begin{array}{cccc}1&0&0&0\\0&1&0&0\\0&0&1&0\\ 0&0&0&1\end{array}\right).

As a demonstration of what our notation can do, we recall the formula for the invariant spacetime interval:

\displaystyle (ds)^2=-(cdt)^2+(dx)^2+(dy)^2+(dz)^2

Using tensor notation combined with the Einstein summation convention, this can be written simply as

\displaystyle (ds)^2=\eta_{\mu\nu}dx^{\mu}dx^{\nu}.
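As a small numerical illustration (the displacement components below are arbitrary sample numbers), we can carry out the lowering of an index and the computation of the interval with numpy:

```python
import numpy as np

# Minkowski metric η_{μν} with signature (-, +, +, +)
eta = np.diag([-1.0, 1.0, 1.0, 1.0])

# A sample displacement dx^μ = (c dt, dx, dy, dz); the numbers are arbitrary
dx_up = np.array([2.0, 1.0, 0.5, 0.0])

# Lowering the index: dx_μ = η_{μν} dx^ν (summing over ν)
dx_down = np.einsum('mn,n->m', eta, dx_up)
print(dx_down)   # the time component flips sign: [-2., 1., 0.5, 0.]

# The invariant interval (ds)^2 = η_{μν} dx^μ dx^ν
ds2 = np.einsum('mn,m,n->', eta, dx_up, dx_up)
print(ds2)       # -(2)^2 + (1)^2 + (0.5)^2 + 0 = -2.75
```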

The Christoffel Symbol and the Covariant Derivative

We now come back to the Christoffel symbol \Gamma^{\mu}_{\nu\lambda}. The idea behind the Christoffel symbol is that it is used to define the covariant derivative \nabla_{\nu}V^{\mu} of a vector V^{\mu}.

The covariant derivative is a very important concept in differential geometry (and not just in Riemannian geometry). When we take derivatives, we are actually comparing two vectors. To further explain what we mean, we recall that individually the components of the vectors can be thought of as functions on the space, and we recall the expression for the derivative from An Intuitive Introduction to Calculus:

\displaystyle \frac{df}{dx}=\frac{f(x+\epsilon)-f(x)}{(x+\epsilon)-(x)} when \epsilon is extremely small (essentially negligible)

More formally, we can write

\displaystyle \frac{df}{dx}=\lim_{\epsilon\to 0}\frac{f(x+\epsilon)-f(x)}{(x+\epsilon)-(x)}.

Therefore, employing the language of partial derivatives, we could have written the following partial derivative of the \mu-th component of an m-dimensional vector V^{\mu} on an m-dimensional space with respect to the coordinate x^{\nu}:

\displaystyle \frac{\partial V^{\mu}}{\partial x^{\nu}}=\lim_{\Delta x^{\nu}\to 0}\frac{V^{\mu}(x^{1},...,x^{\nu}+\Delta x^{\nu},...,x^{m})-V^{\mu}(x^{1},...,x^{\nu},...,x^{m})}{(x^{\nu}+\Delta x^{\nu})-(x^{\nu})}

The problem is that we are comparing vectors from different vector spaces. Recall from Vector Fields, Vector Bundles, and Fiber Bundles that we can think of a vector bundle as having a vector space for every point on the base space. The vector V^{\mu}(x^{1},...,x^{\nu}+\Delta x^{\nu},...,x^{m}) belongs to the vector space on the point (x^{1},...,x^{\nu}+\Delta x^{\nu},...,x^{m}), while the vector V^{\mu}(x^{1},...,x^{\nu},...,x^{m}) belongs to the vector space on the point (x^{1},...,x^{\nu},...,x^{m}). To be able to compare the two vectors we need to “transport” one to the other in the “correct” way, by which we mean parallel transport. Now we have seen in Geometry on Curved Spaces that parallel transport can have weird effects on vectors, and these weird effects are what the Christoffel symbol expresses.

Let \tilde{V}^{\mu}(x^{1},...,x^{\nu}+\Delta x^{\nu},...,x^{m}) denote the vector V^{\mu}(x^{1},...,x^{\nu},...,x^{m}) parallel transported from its original vector space on (x^{1},...,x^{\nu},...,x^{m}) to the vector space on (x^{1},...,x^{\nu}+\Delta x^{\nu},...,x^{m}). The vector \tilde{V}^{\mu}(x^{1},...,x^{\nu}+\Delta x^{\nu},...,x^{m}) is given by the following expression:

\displaystyle \tilde{V}^{\mu}(x^{1},...,x^{\nu}+\Delta x^{\nu},...,x^{m})=V^{\mu}(x^{1},...,x^{\nu},...,x^{m})-V^{\lambda}(x^{1},...,x^{\nu},...,x^{m})\Gamma^{\mu}_{\nu\lambda}(x^{1},...,x^{\nu},...,x^{m})\Delta x^{\nu}

Therefore the Christoffel symbol provides a “correction” for what happens when we parallel transport a vector from one point to another. This is an example of the concept of a connection, which, like the covariant derivative, is part of more general differential geometry beyond Riemannian geometry. The object that is to be parallel transported may not be a vector, for example when we have more general fiber bundles instead of vector bundles. However, in Riemannian geometry we will usually focus on vector bundles, in particular a special kind of vector bundle called the tangent bundle, which consists of the tangent vectors at a point.

Now there is more than one way to parallel transport a mathematical object, which means that there are many choices of a connection. However, in Riemannian geometry there is a special kind of connection that we will prefer. This is the connection that satisfies the following two properties:

\displaystyle \Gamma^{\mu}_{\nu\lambda}=\Gamma^{\mu}_{\lambda\nu}    (torsion-free)

\displaystyle \nabla_{\rho}g_{\mu\nu}=0    (metric compatibility)

The connection that satisfies these two properties is the one that can be obtained from the metric via the following formula:

\displaystyle \Gamma^{\mu}_{\nu\lambda}=\frac{1}{2}g^{\mu\sigma}(\partial_{\nu}g_{\sigma\lambda}+\partial_{\lambda}g_{\sigma\nu}-\partial_{\sigma}g_{\nu\lambda}).

The covariant derivative is then defined as

\displaystyle \nabla_{\nu}V^{\mu}=\lim_{\Delta x^{\nu}\to 0}\frac{V^{\mu}(x^{1},...,x^{\nu}+\Delta x^{\nu},...,x^{m})-\tilde{V}^{\mu}(x^{1},...,x^{\nu}+\Delta x^{\nu},...,x^{m})}{(x^{\nu}+\Delta x^{\nu})-(x^{\nu})}.

We are now comparing vectors belonging to the same vector space, and evaluating the expression above leads to the formula for the covariant derivative:

\displaystyle \nabla_{\nu}V^{\mu}=\partial_{\nu}V^{\mu}+\Gamma^{\mu}_{\nu\lambda}V^{\lambda}.
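To see the connection formula in action, here is a small sympy sketch (our own implementation of the torsion-free, metric-compatible connection, not a library routine) that computes the Christoffel symbols of the flat Euclidean plane in polar coordinates, where the metric is ds^{2}=dr^{2}+r^{2}d\theta^{2}:

```python
import sympy as sp

r, th = sp.symbols('r theta', positive=True)
coords = [r, th]
n = 2

# Flat Euclidean plane in polar coordinates: ds^2 = dr^2 + r^2 dθ^2
g = sp.diag(1, r**2)
ginv = g.inv()

# Γ^μ_{νλ} = (1/2) g^{μσ} (∂_ν g_{σλ} + ∂_λ g_{σν} - ∂_σ g_{νλ})
def christoffel(mu, nu, lam):
    return sp.simplify(sum(
        sp.Rational(1, 2) * ginv[mu, s]
        * (sp.diff(g[s, lam], coords[nu]) + sp.diff(g[s, nu], coords[lam])
           - sp.diff(g[nu, lam], coords[s]))
        for s in range(n)))

print(christoffel(0, 1, 1))   # Γ^r_{θθ} = -r
print(christoffel(1, 0, 1))   # Γ^θ_{rθ} = 1/r
print(christoffel(0, 0, 0))   # Γ^r_{rr} = 0
```

Even though the plane is flat, the Christoffel symbols are nonzero in polar coordinates: they record how the coordinate basis vectors themselves rotate and stretch from point to point.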

The Riemann Curvature Tensor

Next we consider the quantity known as the Riemann curvature tensor. It is once again related to parallel transport, in the following manner. Consider parallel transporting a vector V^{\sigma} through an “infinitesimal” distance specified by another vector A^{\mu}, and after that, through another infinitesimal distance specified by yet another vector B^{\nu}. Then we parallel transport it again in the direction opposite to A^{\mu}, and finally in the direction opposite to B^{\nu}. The path forms a parallelogram, and when the vector V^{\sigma} returns to its starting point it will then be changed by an amount \delta V^{\rho}. We can think of the Riemann curvature tensor as the quantity that relates all of these:

\displaystyle \delta V^{\rho}=R^{\rho}_{\ \sigma\mu\nu}V^{\sigma}A^{\mu}B^{\nu}.

Another way to put this is to consider taking covariant derivatives of the vector V^{\rho} in the two different orders along the sides of this parallelogram; the failure of the covariant derivatives to commute is measured by the Riemann curvature tensor, as follows:

\displaystyle \nabla_{\mu}\nabla_{\nu}V^{\rho}-\nabla_{\nu}\nabla_{\mu}V^{\rho}=R^{\rho}_{\ \sigma\mu\nu}V^{\sigma}.

Expanding the left hand side, and using the torsion-free property of the Christoffel symbol, we will find that

\displaystyle R^{\rho}_{\ \sigma\mu\nu}=\partial_{\mu}\Gamma^{\rho}_{\nu\sigma}-\partial_{\nu}\Gamma^{\rho}_{\mu\sigma}+\Gamma^{\rho}_{\mu\lambda}\Gamma^{\lambda}_{\nu\sigma}-\Gamma^{\rho}_{\nu\lambda}\Gamma^{\lambda}_{\mu\sigma}.

For connections other than the torsion-free one that we chose, there will be another part of the expansion of the expression \nabla_{\mu}\nabla_{\nu}-\nabla_{\nu}\nabla_{\mu} called the torsion tensor. For our case, however, we need not worry about it and we can focus on the Riemann curvature tensor.

There is another quantity that can be obtained from the Riemann curvature tensor called the Ricci tensor, denoted by R_{\mu\nu}. It is given by

\displaystyle R_{\mu\nu}=R^{\lambda}_{\ \mu\lambda\nu}.

Following the Einstein summation convention, we sum over the repeated index \lambda, and therefore the resulting quantity will have only two indices instead of four. This is an example of the operation on tensors called contraction. If we raise one index using the metric and contract again, we obtain a quantity called the Ricci scalar, denoted R:

\displaystyle R=R^{\mu}_{\ \mu}

Example: The 2-Sphere

To provide an explicit example of the concepts discussed, we show their specific expressions for the case of a 2-sphere. We will only give the final results here. The explicit computations can be found among the references, but the reader may gain some practice, especially on manipulating tensors, by performing the calculations and checking only the answers here. In any case, since the metric is given, it is only a matter of substituting the relevant quantities into the formulas already given above.

We have already given the expression for the metric of the 2-sphere in Geometry on Curved Spaces. We recall that, in matrix form, it is given by (we change our notation for the radius of the 2-sphere to R_{0} to avoid confusion with the symbol for the Ricci scalar)

\displaystyle g_{mn}= \left(\begin{array}{cc}R_{0}^{2}&0\\ 0&R_{0}^{2}\text{sin}(\theta)^{2}\end{array}\right)

Individually, the components are (we will use \theta and \varphi instead of the numbers 1 and 2 for the indices)

\displaystyle g_{\theta\theta}=R_{0}^{2}

\displaystyle g_{\varphi\varphi}=R_{0}^{2}(\text{sin}(\theta))^{2}

The other components (g_{\theta\varphi} and g_{\varphi\theta}) are all equal to zero.

The Christoffel symbols are therefore given by

\displaystyle \Gamma^{\theta}_{\varphi\varphi}=-\text{sin}(\theta)\text{cos}(\theta)

\displaystyle \Gamma^{\varphi}_{\theta\varphi}=\text{cot}(\theta)

\displaystyle \Gamma^{\varphi}_{\varphi\theta}=\text{cot}(\theta)

The other components (\Gamma^{\theta}_{\theta\theta}, \Gamma^{\theta}_{\theta\varphi}, \Gamma^{\theta}_{\varphi\theta}, \Gamma^{\varphi}_{\theta\theta}, and \Gamma^{\varphi}_{\varphi\varphi}) are all equal to zero.

The components of the Riemann curvature tensor are given by

\displaystyle R^{\theta}_{\ \varphi\theta\varphi}=(\text{sin}(\theta))^{2}

\displaystyle R^{\theta}_{\ \varphi\varphi\theta}=-(\text{sin}(\theta))^{2}

\displaystyle R^{\varphi}_{\ \theta\theta\varphi}=-1

\displaystyle R^{\varphi}_{\ \theta\varphi\theta}=1

The other components (there are still twelve of them, so I won’t bother writing all their symbols down here anymore) are all equal to zero.

The components of the Ricci tensor are

\displaystyle R_{\theta\theta}=1

\displaystyle R_{\varphi\varphi}=(\text{sin}(\theta))^{2}

The other components (R_{\theta\varphi} and R_{\varphi\theta}) are all equal to zero.

Finally, the Ricci scalar is

\displaystyle R=\frac{2}{R_{0}^{2}}

We note that the larger the radius of the 2-sphere, the smaller the curvature. We can see this intuitively, for example, when it comes to the surface of our planet, which appears flat because the radius is so large. If our planet were much smaller, this would not be the case.
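All of the results above for the 2-sphere can be verified symbolically. The following sympy sketch (our own implementation of the formulas given earlier, not a library routine) computes the Christoffel symbols, the Riemann curvature tensor, the Ricci tensor, and the Ricci scalar directly from the metric:

```python
import sympy as sp

th, ph, R0 = sp.symbols('theta varphi R_0', positive=True)
coords = [th, ph]
n = 2
g = sp.diag(R0**2, R0**2 * sp.sin(th)**2)   # metric of the 2-sphere
ginv = g.inv()

# Christoffel symbols Γ^μ_{νλ} computed from the metric
Gamma = [[[sp.simplify(sum(
    sp.Rational(1, 2) * ginv[mu, s]
    * (sp.diff(g[s, lam], coords[nu]) + sp.diff(g[s, nu], coords[lam])
       - sp.diff(g[nu, lam], coords[s]))
    for s in range(n)))
    for lam in range(n)] for nu in range(n)] for mu in range(n)]

# Riemann curvature tensor R^ρ_{σμν}
def riemann(rho, sig, mu, nu):
    expr = (sp.diff(Gamma[rho][nu][sig], coords[mu])
            - sp.diff(Gamma[rho][mu][sig], coords[nu])
            + sum(Gamma[rho][mu][l] * Gamma[l][nu][sig]
                  - Gamma[rho][nu][l] * Gamma[l][mu][sig]
                  for l in range(n)))
    return sp.simplify(expr)

# Ricci tensor R_{σν} = R^λ_{σλν} and Ricci scalar R = g^{σν} R_{σν}
Ricci = sp.Matrix(n, n, lambda s, v: sum(riemann(l, s, l, v) for l in range(n)))
Rscalar = sp.simplify(sum(ginv[s, v] * Ricci[s, v]
                          for s in range(n) for v in range(n)))

print(Gamma[0][1][1])        # Γ^θ_{φφ}, equal to -sin(θ)cos(θ)
print(riemann(0, 1, 0, 1))   # R^θ_{φθφ}, equal to sin(θ)^2
print(Ricci)                 # diagonal entries 1 and sin(θ)^2
print(Rscalar)               # 2/R_0**2
```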

Bonus: The Einstein Field Equations of General Relativity

Given what we have discussed in this post, we can now write down here the expression for the Einstein field equations (also known simply as Einstein’s equations) of general relativity. It is given in terms of the Ricci tensor and the metric (of spacetime) via the following equation:

\displaystyle R_{\mu\nu}-\frac{1}{2}Rg_{\mu\nu}+\Lambda g_{\mu\nu}=\frac{8\pi}{c^{4}} GT_{\mu\nu}

where G is the gravitational constant, the same constant that appears in Newton’s law of universal gravitation (to which Einstein’s equations reduce under certain limiting conditions), c is the speed of light in a vacuum, and T_{\mu\nu} is the energy-momentum tensor (also known as the stress-energy tensor), which gives the “density” of energy and momentum, as well as certain other related concepts, such as the pressure and shear stress. The symbol \Lambda refers to what is known as the cosmological constant, which was not there in Einstein’s original formulation but was later added to support his view of an unchanging universe. With the advent of Georges Lemaitre’s theory of an expanding universe, later known as the Big Bang theory, the cosmological constant was abandoned. More recently, the universe was found to not only be expanding, but expanding at an accelerating rate, necessitating the return of the cosmological constant, with an interpretation in terms of the “vacuum energy”, also known as “dark energy”. Today the nature of the cosmological constant remains one of the great mysteries of modern physics.

Bonus: Connection and Curvature in Quantum Electrodynamics

The concepts of connection and curvature also appear in quantum field theory, in particular quantum electrodynamics (see Some Basics of (Quantum) Electrodynamics). It is the underlying concept in gauge theory, of which quantum electrodynamics is probably the simplest example. However, it is an example of differential geometry which does not make use of the metric. We consider a fiber bundle, where the base space is flat spacetime (also known as Minkowski spacetime), and the fiber is \text{U}(1), which is the group formed by the complex numbers with magnitude equal to 1, with law of composition given by multiplication (we can also think of this as a circle).

We want the group \text{U}(1) to act on the wave function (or field operator) \psi(x), so that the wave function has a “phase”, i.e. we have e^{i\phi(x)}\psi(x), where e^{i\phi(x)} is a complex number which depends on the location x in spacetime. The values of the wave function at different points in spacetime will therefore have different “phases”. In order to compare them, we need a connection and a covariant derivative.

The connection we want is given by

\displaystyle i\frac{q}{\hbar c}A_{\mu}

where q is the charge of the electron, \hbar is the reduced Planck constant, c is the speed of light in a vacuum, and A_{\mu} is the four-potential of electrodynamics.

The covariant derivative (here written using the symbol D_{\mu}) is

\displaystyle D_{\mu}\psi(x)=\partial_{\mu}\psi(x)+i\frac{q}{\hbar c}A_{\mu}\psi(x)
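To see concretely why this combination is the right one, we can check with a short symbolic computation that D_{\mu}\psi transforms with the same phase factor as \psi itself under a gauge transformation. This is only an illustrative sketch using the sympy library, with a single coordinate x standing in for all four spacetime coordinates; the function and variable names here are our own.

```python
import sympy as sp

# One coordinate x stands in for all four spacetime coordinates.
x = sp.symbols('x', real=True)
q, hbar, c = sp.symbols('q hbar c', positive=True)
psi = sp.Function('psi')(x)   # the wave function
A = sp.Function('A')(x)       # one component A_mu of the four-potential
phi = sp.Function('phi')(x)   # the local phase

def D(f, a):
    # Covariant derivative: D_mu f = d_mu f + i (q/(hbar c)) A_mu f
    return sp.diff(f, x) + sp.I*(q/(hbar*c))*a*f

# Gauge transformation: psi -> e^{i phi} psi, A_mu -> A_mu - (hbar c/q) d_mu phi
psi_new = sp.exp(sp.I*phi)*psi
A_new = A - (hbar*c/q)*sp.diff(phi, x)

# D_mu psi transforms covariantly: it picks up the same phase factor e^{i phi}.
diff = sp.simplify(D(psi_new, A_new) - sp.exp(sp.I*phi)*D(psi, A))
print(diff)  # -> 0
```

The ordinary derivative \partial_{\mu}\psi alone would not transform this way; the extra term involving A_{\mu} is exactly what cancels the derivative of the phase.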

We will also have a concept analogous to the Riemann curvature tensor, called the field strength tensor, denoted F_{\mu\nu}. Of course, our “curvature” in this case is not the literal curvature of spacetime, as we have already specified that our spacetime is flat, but an abstract notion of “curvature” that specifies how the phase of our wave function changes as we move around the spacetime. This field strength tensor is given by the following expression:

\displaystyle F_{\mu\nu}=\partial_{\mu}A_{\nu}-\partial_{\nu}A_{\mu}
This may be compared to the expression for the Riemann curvature tensor, where the connection is given by the Christoffel symbols. The first two terms of both expressions are very similar. The difference is that the expression for the Riemann curvature tensor has some extra terms that the expression for the field strength tensor does not have. However, a generalization of this procedure for quantum electrodynamics to groups other than \text{U}(1), called Yang-Mills theory, does feature extra terms in the expression for the field strength tensor, which perhaps makes the two more similar.

The concepts we have discussed here can be used to derive the theory of quantum electrodynamics simply from requiring that the Lagrangian (from which we can obtain the equations of motion, see also Lagrangians and Hamiltonians) be invariant under \text{U}(1) transformations, i.e. even if we change the “phase” of the wave function at every point the Lagrangian remains the same. This is an example of what is known as gauge symmetry. Generalized to other groups such as \text{SU}(2) and \text{SU}(3), this is the idea behind gauge theories, which include Yang-Mills theory and lead to the standard model of particle physics.


Christoffel Symbols on Wikipedia

Riemannian Curvature Tensor on Wikipedia

Einstein Field Equations on Wikipedia

Gauge Theory on Wikipedia

Riemann Tensor for Surface of a Sphere on Physics Pages

Ricci Tensor and Curvature Scalar for a Sphere on Physics Pages

Spacetime and Geometry by Sean Carroll

Geometry, Topology, and Physics by Mikio Nakahara

Introduction to Elementary Particle Physics by David J. Griffiths

Introduction to Quantum Field Theory by Michael Peskin and Daniel V. Schroeder

Geometry on Curved Spaces

Differential geometry is the branch of mathematics used by Albert Einstein when he formulated the general theory of relativity, where gravity is the curvature of spacetime. It was originally invented by Carl Friedrich Gauss to study the curvature of hills and valleys in the Kingdom of Hanover.

From what I described, one may guess that differential geometry has something to do with curvature. The geometry we learn in high school only occurs on a flat surface. There we can put coordinates x and y and compute distances, angles, areas, and so on.

To imagine what geometry on curved spaces looks like, imagine a globe. Instead of x and y coordinates, we can use latitude and longitude. One can now see just how different geometry is on this globe. Vertical lines (the lines of constant x) on a flat surface are always the same distance apart. On a globe, the analogues of these vertical lines, the lines of constant longitude, are closer near the poles than they are near the equator.

Other weird things happen on our globe: one can have triangles whose angles sum to more than 180 degrees. Run two mutually perpendicular line segments from the north pole to the equator. Each will meet the equator at a right angle, and together with the equator they form a triangle with three right angles, for a total of 270 degrees. Also, on the globe, the ratio of the circumference of a circle to its diameter may no longer be equal to the number \pi.

To make things more explicit, we will introduce the concept of a metric (the word “metric” refers to a variety of mathematical concepts related to the notion of distance – in this post we use it in the sense of differential geometry, to refer to what is also called the metric tensor). The metric is an example of a mathematical object called a tensor, which we will not discuss much in this post. Instead, we will think of the metric as expressing a kind of “distance formula” for our space, which may be curved. The part of differential geometry that makes use of the metric is called Riemannian geometry, named after the mathematician Bernhard Riemann, a student of Gauss who extended his results on curved spaces to higher dimensions.

We recall from From Pythagoras to Einstein several important versions of the “distance formula”, from the case of 2D space, to the case of 4D spacetime. We will focus on the simple case of 2D space in this post, since it is much easier to visualize; in fact, we have already given an example of a 2D space earlier, the globe, which we shall henceforth technically refer to as the 2-sphere. As we have learned in From Pythagoras to Einstein, a knowledge of the most simple cases can go very far toward the understanding of more complicated ones.

We will make a little change in our notation so as to stay consistent with the literature. Instead of the latitude, we will make use of the colatitude, written using the symbol \theta, and defined as the complementary angle to the latitude, i.e. the colatitude is 90 degrees minus the latitude. We will keep using the longitude, and we write it using the symbol \varphi. Note that even though we colloquially express our angles in degrees, for calculations we will always use radians, as is usual practice in mathematics and physics.

On a flat 2D space, the distance formula is given by

\displaystyle (\Delta x)^{2}+(\Delta y)^{2}=(\Delta s)^{2}.

It will be productive for us to work with extremely small quantities for now; from them we can obtain larger quantities later on using the language of calculus (see An Intuitive Introduction to Calculus). Adopting the notation of this language, we write

\displaystyle (dx)^{2}+(dy)^{2}=(ds)^{2}

We now give the distance formula for a 2-sphere:

\displaystyle R^{2}(d\theta)^{2}+R^{2}\text{sin}(\theta)^{2}(d\varphi)^{2}=(ds)^{2}

where R is the radius of the 2-sphere. This formula agrees with our intuition: the same difference in latitude and longitude results in a bigger distance on a bigger 2-sphere than on a smaller one, and the same difference in longitude results in a bigger distance for points near the equator than for points near the poles.
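As a small computational sketch (in Python, with helper names of our own choosing), we can add up this infinitesimal distance formula along a path on the 2-sphere and recover a familiar length; here we measure a quarter of the equator of a unit 2-sphere, which should come out to \pi/2.

```python
import math

def sphere_ds(R, theta, dtheta, dphi):
    """Infinitesimal distance on a 2-sphere of radius R:
    ds^2 = R^2 (dtheta)^2 + R^2 sin(theta)^2 (dphi)^2."""
    return R * math.sqrt(dtheta**2 + math.sin(theta)**2 * dphi**2)

def path_length(R, path, n=10000):
    """Approximate the length of a path (theta(t), phi(t)), t in [0, 1],
    by summing ds over n small steps."""
    theta, phi = path
    total, dt = 0.0, 1.0/n
    for i in range(n):
        t = (i + 0.5)*dt
        dtheta = theta(t + dt/2) - theta(t - dt/2)
        dphi = phi(t + dt/2) - phi(t - dt/2)
        total += sphere_ds(R, theta(t), dtheta, dphi)
    return total

# A quarter of the equator (theta = pi/2, phi from 0 to pi/2) on a unit sphere:
L = path_length(1.0, (lambda t: math.pi/2, lambda t: t*math.pi/2))
print(L)  # -> very close to pi/2, about 1.5708
```

The same function can be used on paths that are not great circles, where the answer is no longer obvious from elementary geometry.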

The idea behind the concept of the metric is that it expresses how the distance formula changes depending on the coordinates. It is often written as a matrix (see Matrices) whose entries are the “coefficients” of the distance formula. Hence, for a flat 2D space it is given by

\displaystyle \left(\begin{array}{cc}1&0\\ 0&1\end{array}\right)

while for a 2-sphere it is given by

\displaystyle \left(\begin{array}{cc}R^{2}&0\\ 0&R^{2}\text{sin}(\theta)^{2}\end{array}\right).
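In code, we can treat the metric as a small matrix-valued function of the coordinates and recover ds^{2} as the quadratic form built from its entries. The following is only an illustrative sketch in Python; the function names are our own.

```python
import math

def flat_metric(point):
    """Metric of flat 2D space in (x, y) coordinates (independent of the point)."""
    return [[1.0, 0.0],
            [0.0, 1.0]]

def sphere_metric(point, R=1.0):
    """Metric of the 2-sphere of radius R in (theta, phi) coordinates."""
    theta, phi = point
    return [[R**2, 0.0],
            [0.0, (R*math.sin(theta))**2]]

def ds2(metric, point, d):
    """ds^2 as the quadratic form: sum over i, j of g_ij d^i d^j."""
    g = metric(point)
    return sum(g[i][j]*d[i]*d[j] for i in range(2) for j in range(2))

# The same coordinate displacement gives different distances at different latitudes:
step = (0.0, 0.01)  # a small step in longitude only
print(ds2(sphere_metric, (math.pi/2, 0.0), step))  # at the equator: about 1e-4
print(ds2(sphere_metric, (0.1, 0.0), step))        # near the pole: much smaller
```

This matches the intuition above: near the poles the lines of constant longitude crowd together, so the same step in \varphi covers less actual distance.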

We have seen that the metric can express how a space is curved. There are several other quantities related to the metric (and which can be derived from it), such as the Christoffel symbol and the Riemann curvature tensor, which express ideas related to curvature – however, unlike the metric which expresses curvature in terms of the distance formula, the Christoffel symbol and the Riemann curvature tensor express curvature in terms of how vectors (see Vector Fields, Vector Bundles, and Fiber Bundles) change as they move around the space.

The main equations of Einstein’s general theory of relativity, called the Einstein equations, relate the Riemann curvature tensor of 4D spacetime to the distribution of mass (or, more properly, the distribution of energy and momentum), expressed via the so-called energy-momentum tensor (also known as the stress-energy tensor).

The application of differential geometry is not limited to general relativity of course, and its objects of study are not limited to the metric. For example, in particle physics, gauge theories such as electrodynamics (see Some Basics of (Quantum) Electrodynamics) use the language of differential geometry to express forces like the electromagnetic force as a kind of “curvature”, even though a metric is not used to express this more “abstract” kind of curvature. Instead, a generalization of the concept of “parallel transport” is used. Parallel transport is the idea behind objects like the Christoffel symbol and the Riemann curvature tensor – it studies how vectors change as they move around the space. To generalize this, we replace vector bundles by more general fiber bundles (see Vector Fields, Vector Bundles, and Fiber Bundles).

To give a rough idea of parallel transport, we give a simple example again in 2D space – this 2D space will be the surface of our planet. Now space itself is 3D (with time it forms a 4D spacetime). But we will ignore the up/down dimension for now and focus only on the north/south and east/west dimensions. In other words, we will imagine ourselves as 2D beings, like the characters in the novel Flatland by Edwin Abbott. The discussion below will not make references to the third up/down dimension.

Imagine that you are somewhere at the Equator, holding a spear straight in front of you, facing north. Now imagine you take a step forward with this spear. The spear will therefore remain parallel to its previous direction. You take another step, and another, walking forward (ignoring obstacles and bodies of water) until you reach the North Pole. Now at the North Pole, without turning, you take a step to the right. The spear is still parallel to its previous direction, because you did not turn. You just keep stepping to the right until you reach the Equator again. You are not at your previous location of course. To go back you need to walk backwards, which once again keeps the spear parallel to its previous direction.

When you finally come back to your starting location, you will find that you are not facing the same direction as when you first started. In fact, you (and the spear) will be facing the east, which is offset by 90 degrees clockwise from the direction you were facing at the beginning, despite the fact that you were keeping the spear parallel all the time.

This would not have happened on a flat space; this “turning” is an indicator that the space (the surface of our planet) is curved. The amount of turning depends, among other things, on the curvature of the space. Hence the idea of parallel transport gives us a way to actually measure this curvature. It is this idea, generalized to mathematical objects other than vectors, which leads to the abstract notion of curvature – it is a measure of the changes that occur in certain mathematical objects when you move around a space in a certain way, which would not have happened if you were on a flat space.
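The spear-carrying journey above can also be simulated numerically. The sketch below (in Python, with helper functions of our own) uses the fact that parallel transport along a great-circle arc of a sphere amounts to a rotation about the axis of that great circle; following the loop from the equator to the pole and back rotates a north-pointing vector into an east-pointing one, just as described.

```python
import math

def cross(a, b):
    return (a[1]*b[2]-a[2]*b[1], a[2]*b[0]-a[0]*b[2], a[0]*b[1]-a[1]*b[0])

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return tuple(x/n for x in v)

def rotate(v, k, angle):
    """Rotate v about the unit axis k by the given angle (Rodrigues' formula)."""
    c, s = math.cos(angle), math.sin(angle)
    kxv = cross(k, v)
    return tuple(v[i]*c + kxv[i]*s + k[i]*dot(k, v)*(1 - c) for i in range(3))

def transport(v, p, q):
    """Parallel transport of a tangent vector v along the great-circle arc
    from p to q: a rotation about the arc's axis by the arc's angle."""
    return rotate(v, normalize(cross(p, q)), math.acos(dot(p, q)))

# The loop in the text: equator -> north pole -> equator (90 degrees away) -> back.
A, N, B = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 0.0)
spear = (0.0, 0.0, 1.0)  # pointing north at the starting point A
for p, q in [(A, N), (N, B), (B, A)]:
    spear = transport(spear, p, q)
print(spear)  # -> approximately (0, 1, 0): the spear now points east
```

The 90 degree holonomy here is exactly the solid angle enclosed by the loop (one eighth of the sphere), a special case of the general relation between holonomy and curvature.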

In closing, I would like to note that although differential geometry is probably most famous for its applications in physics (another interesting application in physics, by the way, is the so-called Berry’s phase in quantum mechanics), it is by no means limited to these applications alone, as already reflected in its historical origins, which barely have anything to do with physics. It has even found applications in number theory, via Arakelov theory. Still, it has an especially important role in physics, with much of modern physics written in its language, and many prospects for future theories depending on it. Whether in pure mathematics or theoretical physics, it is one of the most fruitful and active fields of research in modern times.


Since we have restricted ourselves to 2D spaces in this post, here is an example of a metric in 4D spacetime – this is the Schwarzschild metric, which describes the curved spacetime around objects like stars or black holes (it makes use of spherical polar coordinates):

\displaystyle \left(\begin{array}{cccc}-(1-\frac{2GM}{rc^{2}})&0&0&0\\0&(1-\frac{2GM}{rc^{2}})^{-1}&0&0\\0&0&r^{2}&0\\ 0&0&0&r^{2}\text{sin}(\theta)^{2}\end{array}\right)

In other words, the “infinitesimal distance formula” for this curved spacetime is given by

\displaystyle -(1-\frac{2GM}{rc^{2}})(d(ct))^{2}+(1-\frac{2GM}{rc^{2}})^{-1}(dr)^{2}+r^{2}(d\theta)^{2}+r^{2}\text{sin}(\theta)^{2}(d\varphi)^{2}=(ds)^{2}

where G is the gravitational constant and M is the mass. Note also that as a matter of convention the time coordinate is “scaled” by the constant c (the speed of light in a vacuum).
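As a small sketch (in Python, with the constants in SI units and our own function name), we can write this metric down as a function of the coordinates and check that far from the mass it approaches the flat metric, whose time-time component is -1:

```python
import math

G = 6.674e-11   # gravitational constant (m^3 kg^-1 s^-2)
c = 2.998e8     # speed of light in a vacuum (m/s)

def schwarzschild_metric(M, r, theta):
    """Schwarzschild metric components in coordinates (ct, r, theta, phi)."""
    f = 1.0 - 2.0*G*M/(r*c**2)
    return [[-f,  0.0,   0.0,  0.0],
            [0.0, 1.0/f, 0.0,  0.0],
            [0.0, 0.0,   r**2, 0.0],
            [0.0, 0.0,   0.0,  (r*math.sin(theta))**2]]

# One solar mass, evaluated roughly one light-year away: spacetime there is
# very nearly flat, so the time-time component is very close to -1.
g = schwarzschild_metric(M=1.989e30, r=9.461e15, theta=math.pi/2)
print(g[0][0])  # very close to -1
```

Closer to the mass, the deviation of g_{00} from -1 is what produces effects like gravitational time dilation.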


Differential Geometry on Wikipedia

Riemannian Geometry on Wikipedia

Metric Tensor on Wikipedia

Parallel Transport on Wikipedia

Differential Geometry of Curves and Surfaces by Manfredo P. do Carmo

Geometry, Topology, and Physics by Mikio Nakahara

Vector Fields, Vector Bundles, and Fiber Bundles

In physics we have the concept of a vector field. Intuitively, a vector field is given by specifying a vector (in the sense of a quantity with magnitude and direction) at every point in a certain “space”. For instance, the wind velocity on the surface of our planet is a vector field. If we neglect the upward or downward dimension, and look only at the northward, southward, eastward, and westward directions, we have what we usually see on weather maps on the news. In one city the wind might be blowing strongly to the north, in another city it might be blowing weakly to the east, and in a third city it might be blowing moderately to the southwest.

If, instead of just a single vector, we specify an entire vector space (see Vector Spaces, Modules, and Linear Algebra) at every point, we obtain the concept of a vector bundle. Given a vector bundle, we can recover a vector field by choosing one vector from the vector space at each point. More technically, we say that a vector field is a section of the vector bundle.

A vector space can be thought of as just a certain kind of space; in our example of wind velocities on the surface of the Earth, the vector space that we attach to every point is the plane \mathbb{R}^{2} endowed with an intuitive vector space structure. Given a point on the plane, we draw an “arrow” with its “tail” at the chosen origin of the plane and its “head” at the given point. We can then add and scale these arrows to obtain other arrows, hence, these arrows form a vector space. This “graphical” method of studying vectors (again in the sense of a quantity with magnitude and direction) is in fact one of the most common ways of introducing the concept of vectors in physics.

If, instead of a vector space such as the plane \mathbb{R}^{2}, we attach to every point other kinds of spaces, such as the circle S^{1}, we obtain the notion of a fiber bundle. A vector bundle is therefore just a special case of a fiber bundle. In Category Theory, we described the torus as a fiber bundle, obtained by “gluing” a circle to every point of another circle. The shape that is glued is called the “fiber”, and the shape to which the fibers are glued is called the “base”.

Simply gluing spaces to the points of another space does not automatically mean that the space obtained is a fiber bundle, however. There is another requirement. Consider, for example, a cylinder. This can be described as a fiber bundle, with the fibers given by lines, and the base given by a circle (this can also be done the other way around, but we use this description for the moment because we will use it to describe an important condition for a space to be a fiber bundle). However, another fiber bundle can be obtained from lines (as the fibers) and a circle (as the base). This other fiber bundle can be obtained by “twisting” the lines as we “glue” them to the points of a circle, resulting in the very famous shape known as the Mobius strip.

The cylinder, which exhibits no “twisting”, is the simplest kind of fiber bundle, called a trivial bundle. Still, even if the Mobius strip has some kind of “twisting”, if we look at them “locally”, i.e. only on small enough areas, there is no difference between the cylinder and the Mobius strip. It is only when we look at them “globally” that we can distinguish the two. This is the important requirement for a space to be a fiber bundle. Locally, they must “look like” the trivial bundle. This condition is related to the notion of continuity (see Basics of Topology and Continuous Functions).
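We can make the “twisting” explicit with parametrizations. In the sketch below (Python, with parametrizations of our own choosing: base circle of radius 1, fiber coordinate t in [-1, 1]), going once around the base of the Mobius strip returns to the same base point but with the fiber reversed, while on the cylinder nothing changes.

```python
import math

def cylinder(u, t):
    """Trivial bundle: base circle (angle u), fiber coordinate t, no twisting."""
    return (math.cos(u), math.sin(u), t)

def mobius(u, t):
    """Mobius strip: the fiber is rotated by u/2 as it is glued along the base,
    so going once around the base (u -> u + 2*pi) reverses the fiber."""
    x = (1 + t*math.cos(u/2))*math.cos(u)
    y = (1 + t*math.cos(u/2))*math.sin(u)
    z = t*math.sin(u/2)
    return (x, y, z)

print(cylinder(0.0, 0.3), cylinder(2*math.pi, 0.3))  # same point on the cylinder
print(mobius(0.0, 0.3), mobius(2*math.pi, 0.3))      # fiber coordinate flipped
```

Restricted to any small range of u, the two parametrizations look alike (a little arc of circle times a line segment); only going all the way around detects the twist, which is exactly the local-versus-global distinction above.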

The concept of fiber bundles can be found everywhere in physics, and forms the language for many of its branches. We have already stated an example, with vector fields on a space. Aside from wind velocities (and the velocities of other fluids), the concept of the vector field is also used to express quantities such as electric and magnetic fields.

Fiber bundles can also be used to express ideas that are not so easily visualized. For example, in My Favorite Equation in Physics we mentioned the concept of a phase space, whose coordinates represent the position and momentum of a system, which is used in the Hamiltonian formulation of classical mechanics. The phase space of a system is an example of a kind of fiber bundle called a cotangent bundle. Meanwhile, in Einstein’s general theory of relativity, the concept of a tangent bundle is used to study the curvature of spacetime (which in the theory is what we know as gravity, and is related to mass, or more generally, energy and momentum).

More generally, the tangent bundle can be used to study the curvature of objects aside from spacetime, including more ordinary objects like a sphere, or hills and valleys on a landscape. This leads to a further generalization of the notion of “curvature” involving other kinds of fiber bundles aside from tangent bundles. This more general idea of curvature is important in the study of gauge theories, which is an important part of the standard model of particle physics. A good place to start for those who want to understand curvature in the context of tangent bundles and fiber bundles is by looking up the idea of parallel transport.

Meanwhile, in mathematics, fiber bundles are also very interesting in their own right. For example, vector bundles on a space can be used to study the topology of a space. One famous result involving this idea is the “hairy ball theorem“, which is related to the observation that on our planet every typhoon must have an “eye”. However, on something that is shaped like a torus instead of a sphere (like, say, a space station with an artificial atmosphere), one can have a typhoon with no eye, simply by running the wind along the walls of the torus. Replacing wind velocities by magnetic fields, this becomes the reason why fusion reactors that use magnetic fields to contain the very hot plasma are shaped like a torus instead of like a sphere. We recall, of course, that the sphere and the torus are topologically inequivalent, and this is reflected in the very different characteristics of vector fields on them.
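The torus-shaped “typhoon with no eye” can be checked directly. Below is a sketch in Python (with a parametrization of our own choosing) of the vector field that runs the wind along the walls of the torus; its magnitude never drops to zero anywhere, which the hairy ball theorem says is impossible for a vector field on the sphere.

```python
import math

def torus_wind(u, v, R=2.0, r=1.0):
    """Wind along the walls of a torus parametrized by
    (x, y, z) = ((R + r cos v) cos u, (R + r cos v) sin u, r sin v);
    the field is the tangent vector in the u (around-the-hole) direction."""
    return (-(R + r*math.cos(v))*math.sin(u),
            (R + r*math.cos(v))*math.cos(u),
            0.0)

# Sample the field over the whole torus: its magnitude is always at least R - r > 0.
norms = [math.sqrt(sum(w*w for w in torus_wind(2*math.pi*i/60, 2*math.pi*j/60)))
         for i in range(60) for j in range(60)]
print(min(norms))  # about 1.0, and never zero: this typhoon has no eye
```

Trying the analogous construction on a sphere (say, wind blowing due east everywhere) necessarily fails at the poles, where “east” is undefined, and that failure is exactly the “eye”.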

The use of vector bundles in topology has led to such subjects of mathematics as the study of characteristic classes and K-theory. The concept of mathematical objects “living on” spaces should be reminiscent of the ideas discussed in Presheaves and Sheaves; in fact, in algebraic geometry the two ideas are very much related. Since algebraic geometry serves as a “bridge” between ideas from geometry and ideas from abstract algebra, this then leads to the subject called algebraic K-theory, where ideas from topology get carried over to abstract algebra and linear algebra (and even number theory).


Fiber Bundle on Wikipedia

Vector Bundle on Wikipedia

Vector Field on Wikipedia

Parallel Transport on Wikipedia

What is a Field? at Rationalizing the Universe

Algebraic Geometry by Andreas Gathmann

Algebraic Topology by Allen Hatcher

A Concise Course in Algebraic Topology by J. P. May

Geometrical Methods of Mathematical Physics by Bernard F. Schutz

Geometry, Topology and Physics by Mikio Nakahara