3 Introducing Riemannian Geometry

We have yet to meet the star of the show. There is one object that we can place on a manifold whose importance dwarfs all others, at least when it comes to understanding gravity. This is the metric.

The existence of a metric brings a whole host of new concepts to the table which, collectively, are called Riemannian geometry. In fact, strictly speaking we will need a slightly different kind of metric for our study of gravity, one which, like the Minkowski metric, has some strange minus signs. This is referred to as Lorentzian Geometry and a slightly better name for this section would be “Introducing Riemannian and Lorentzian Geometry”. However, for our immediate purposes the differences are minor. The novelties of Lorentzian geometry will become more pronounced later in the course when we explore some of the physical consequences such as horizons.

3.1 The Metric

In Section 1, we informally introduced the metric as a way to measure distances between points. It does, indeed, provide this service but it is not its initial purpose. Instead, the metric is an inner product on each vector space $T_p(M)$.


Definition: A metric g is a (0,2) tensor field that is:

  • Symmetric: $g(X,Y)=g(Y,X)$.

  • Non-Degenerate: If, for any $p\in M$, $g(X,Y)|_p=0$ for all $Y\in T_p(M)$ then $X_p=0$.

With a choice of coordinates, we can write the metric as

$$g = g_{\mu\nu}(x)\,dx^\mu\otimes dx^\nu$$

The object $g$ is often written as a line element $ds^2$ and this expression is abbreviated as

$$ds^2 = g_{\mu\nu}(x)\,dx^\mu dx^\nu$$

This is the form that we saw previously in (1.5). The metric components can be extracted by evaluating the metric on a pair of basis elements,

$$g_{\mu\nu}(x) = g\left(\frac{\partial}{\partial x^\mu},\frac{\partial}{\partial x^\nu}\right)$$

The metric $g_{\mu\nu}$ is a symmetric matrix. We can always pick a basis $e_\mu$ of each $T_p(M)$ so that this matrix is diagonal. The non-degeneracy condition above ensures that none of these diagonal elements vanish. Some are positive, some are negative. Sylvester’s law of inertia is a theorem in algebra which states that the number of positive and negative entries is independent of the choice of basis. (This theorem has nothing to do with inertia. But Sylvester thought that if Newton could have a law of inertia, there should be no reason he couldn’t.) The number of negative entries is called the signature of the metric.

3.1.1 Riemannian Manifolds

For most applications of differential geometry, we are interested in manifolds in which all diagonal entries of the metric are positive. A manifold equipped with such a metric is called a Riemannian manifold. The simplest example is Euclidean space $\mathbf{R}^n$ which, in Cartesian coordinates, is equipped with the metric

$$g = dx^1\otimes dx^1+\ldots+dx^n\otimes dx^n$$

The components of this metric are simply $g_{\mu\nu}=\delta_{\mu\nu}$.

A general Riemannian metric gives us a way to measure the length of a vector X at each point,

$$|X| = \sqrt{g(X,X)}$$

It also allows us to measure the angle between any two vectors $X$ and $Y$ at each point, using

$$g(X,Y) = |X|\,|Y|\cos\theta$$

The metric also gives us a way to measure the distance between two points $p$ and $q$ along a curve in $M$. The curve is parameterised by $\sigma:[a,b]\to M$, with $\sigma(a)=p$ and $\sigma(b)=q$. The distance is then

$$\text{distance} = \int_a^b dt\,\sqrt{g(X,X)\big|_{\sigma(t)}}$$

where $X$ is a vector field that is tangent to the curve. If the curve has coordinates $x^\mu(t)$, the tangent vector is $X^\mu=dx^\mu/dt$, and the distance is

$$\text{distance} = \int_a^b dt\,\sqrt{g_{\mu\nu}(x)\frac{dx^\mu}{dt}\frac{dx^\nu}{dt}}$$

Importantly, this distance does not depend on the choice of parameterisation of the curve; this is essentially the same calculation that we did in Section 1.2 when showing the reparameterisation invariance of the action for a particle.
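The reparameterisation invariance can be checked numerically. The following is a minimal sketch, not part of the original notes: it assumes the round metric on a unit sphere, $ds^2=d\theta^2+\sin^2\theta\,d\phi^2$, and measures a circle of latitude using two different parameterisations. The helper names (`curve_length`, `sphere_metric`, `curve_uniform`, `curve_cubic`) are illustrative choices.

```python
import math

def curve_length(metric, curve, a, b, steps=20000):
    """Approximate the integral of sqrt(g_{mu nu} dx^mu/dt dx^nu/dt) over [a, b].

    metric(x) returns the matrix g_{mu nu} at the point x;
    curve(t) returns the coordinates x^mu(t).
    Uses the midpoint rule, with central differences for the tangent vector.
    """
    h = (b - a) / steps
    eps = 1e-6
    total = 0.0
    for i in range(steps):
        t = a + (i + 0.5) * h
        x = curve(t)
        x_plus, x_minus = curve(t + eps), curve(t - eps)
        v = [(p - m) / (2 * eps) for p, m in zip(x_plus, x_minus)]
        g = metric(x)
        n = len(v)
        speed_squared = sum(g[m][k] * v[m] * v[k] for m in range(n) for k in range(n))
        total += math.sqrt(speed_squared) * h
    return total

def sphere_metric(x):
    # Round metric on a unit sphere in coordinates (theta, phi): an assumed example.
    theta, _phi = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

LATITUDE = math.pi / 3

def curve_uniform(t):
    # phi = t, with t in [0, 2 pi]
    return [LATITUDE, t]

def curve_cubic(s):
    # The same circle with phi = 2 pi s^3, s in [0, 1]
    return [LATITUDE, 2 * math.pi * s ** 3]

length_uniform = curve_length(sphere_metric, curve_uniform, 0.0, 2 * math.pi)
length_cubic = curve_length(sphere_metric, curve_cubic, 0.0, 1.0)
```

Both numbers should agree with the exact circumference $2\pi\sin(\pi/3)$ up to discretisation error, even though the second parameterisation traverses the curve at a very non-uniform rate.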

3.1.2 Lorentzian Manifolds

For the purposes of general relativity, we will be working with a manifold in which one of the diagonal entries of the metric is negative. A manifold equipped with such a metric is called Lorentzian.

The simplest example of a Lorentzian metric is Minkowski space. This is $\mathbf{R}^n$ equipped with the metric

$$\eta = -dx^0\otimes dx^0+dx^1\otimes dx^1+\ldots+dx^{n-1}\otimes dx^{n-1}$$

The components of the Minkowski metric are $\eta_{\mu\nu}={\rm diag}(-1,+1,\ldots,+1)$. As this example shows, on a Lorentzian manifold we usually take the coordinate index on $x^\mu$ to run over $0,1,\ldots,n-1$.

At any point $p$ on a general Lorentzian manifold, it is always possible to find an orthonormal basis $\{e_\mu\}$ of $T_p(M)$ such that, locally, the metric looks like the Minkowski metric

$$g_{\mu\nu}\big|_p = \eta_{\mu\nu} \tag{3.93}$$

This fact is closely related to the equivalence principle; we’ll describe the coordinates that allow us to do this in Section 3.3.2.

In fact, if we find one set of coordinates in which the metric looks like Minkowski space at $p$, it is simple to exhibit other coordinates. Consider a different basis of vector fields related by

$$\tilde e_\mu = \Lambda_\mu{}^\nu e_\nu$$

Then, in this basis the components of the metric are

$$\tilde g_{\mu\nu} = \Lambda_\mu{}^\rho\Lambda_\nu{}^\sigma g_{\rho\sigma}$$

This leaves the metric in Minkowski form at $p$ if

$$\eta_{\mu\nu} = \Lambda_\mu{}^\rho(p)\,\Lambda_\nu{}^\sigma(p)\,\eta_{\rho\sigma} \tag{3.94}$$

This is the defining equation for a Lorentz transformation that we saw previously in (1.15). We see that viewed locally – which here means at a point p – we recover some basic features of special relativity. Note, however, that if we choose coordinates so that the metric takes the form (3.93) at some point p, it will likely differ from the Minkowski metric as we move away from p.

Figure 21: The lightcone at a point p, with three different types of tangent vectors.

The fact that, locally, the metric looks like the Minkowski metric means that we can import some ideas from special relativity. At any point $p$, a vector $X_p\in T_p(M)$ is said to be timelike if $g(X_p,X_p)<0$, null if $g(X_p,X_p)=0$, and spacelike if $g(X_p,X_p)>0$.
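As a small illustration of this classification, here is a sketch that evaluates $g(X,X)$ for a few vectors. It assumes the four-dimensional Minkowski metric with signature $(-,+,+,+)$ as the metric at the point in question; the function names and the tolerance `tol` are illustrative choices, not anything fixed by the text.

```python
def inner_product(g, X, Y):
    """Return g(X, Y) = g_{mu nu} X^mu Y^nu for a metric given as a matrix."""
    n = len(X)
    return sum(g[m][k] * X[m] * Y[k] for m in range(n) for k in range(n))

def classify(g, X, tol=1e-12):
    """Label a tangent vector by the sign of g(X, X)."""
    norm_squared = inner_product(g, X, X)
    if norm_squared < -tol:
        return "timelike"
    if norm_squared > tol:
        return "spacelike"
    return "null"

# Minkowski metric diag(-1, +1, +1, +1), used here as the metric at a point p.
ETA = [
    [-1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]

at_rest = classify(ETA, [1, 0, 0, 0])      # points purely in the time direction
light_ray = classify(ETA, [1, 1, 0, 0])    # lies on the lightcone
separation = classify(ETA, [0, 1, 0, 0])   # points purely in a space direction
```

On a curved manifold the same test applies point by point, with `ETA` replaced by the components $g_{\mu\nu}$ evaluated at that point.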

At each point on M, we can then draw lightcones, which are the null tangent vectors at that point. There are both past-directed and future-directed lightcones at each point, as shown in Figure 21. The novelty is that the directions of these lightcones can vary smoothly as we move around the manifold. This specifies the causal structure of spacetime, which determines who can be friends with whom. We’ll see more of this later in the lectures.

We can again use the metric to determine the length of curves. The nature of a curve at a point is inherited from the nature of its tangent vector. A curve is called timelike if its tangent vector is everywhere timelike. In this case, we can again use the metric to measure the distance along the curve between two points $p$ and $q$. Given a parametrisation $x^\mu(t)$, this distance is,

$$\tau = \int_a^b dt\,\sqrt{-g_{\mu\nu}\frac{dx^\mu}{dt}\frac{dx^\nu}{dt}}$$

This is called the proper time. It is, in fact, something we’ve met before: it is precisely the action (1.28) for a point particle moving in the spacetime with metric $g_{\mu\nu}$.

3.1.3 The Joys of a Metric

Whether we’re on a Riemannian or Lorentzian manifold, there are a number of bounties that the metric brings.

The Metric as an Isomorphism

First, the metric gives us a natural isomorphism between vectors and covectors, $g:T_p(M)\to T_p^*(M)$ for each $p$, with the one-form constructed from the contraction of $g$ and a vector field $X$.

In a coordinate basis, we write $X=X^\mu\partial_\mu$. This is mapped to a one-form which, because this is a natural isomorphism, we also call $X$. This notation is less annoying than you might think; in components the one-form is written as $X=X_\mu dx^\mu$. The components are then related by

$$X_\mu = g_{\mu\nu}X^\nu$$

Physicists usually say that we use the metric to lower the index from $X^\mu$ to $X_\mu$. But in their heart, they mean “the metric provides a natural isomorphism between a vector space and its dual”.

Because $g$ is non-degenerate, the matrix $g_{\mu\nu}$ is invertible. We denote the inverse as $g^{\mu\nu}$, with $g^{\mu\nu}g_{\nu\rho}=\delta^\mu{}_\rho$. Here $g^{\mu\nu}$ can be thought of as the components of a symmetric $(2,0)$ tensor $\hat g=g^{\mu\nu}\,\partial_\mu\otimes\partial_\nu$. More importantly, the inverse metric allows us to raise the index on a one-form to give us back the original tangent vector,

$$X^\mu = g^{\mu\nu}X_\nu$$

In Euclidean space, with Cartesian coordinates, the metric is simply $g_{\mu\nu}=\delta_{\mu\nu}$ which is so simple it hides the distinction between vectors and one-forms. This is the reason we didn’t notice the difference between these spaces when we were five.
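The distinction becomes visible as soon as the metric is not the identity. The sketch below assumes the flat metric in plane polar coordinates, $g_{\mu\nu}={\rm diag}(1,r^2)$, evaluated at $r=2$, and lowers and then raises the index on a sample vector. The function names are illustrative.

```python
def contract(matrix, components):
    """Return matrix[m][n] * components[n], summed over n."""
    size = len(components)
    return [sum(matrix[m][n] * components[n] for n in range(size)) for m in range(size)]

def inverse_2x2(g):
    """Inverse of a 2x2 matrix; it exists because the metric is non-degenerate."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

r = 2.0
g_polar = [[1.0, 0.0], [0.0, r ** 2]]   # g_{mu nu} in coordinates (r, phi), an assumed example
g_polar_inv = inverse_2x2(g_polar)      # g^{mu nu}

X_up = [3.0, 0.5]                            # components X^mu of a vector
X_down = contract(g_polar, X_up)             # X_mu = g_{mu nu} X^nu
X_up_again = contract(g_polar_inv, X_down)   # X^mu = g^{mu nu} X_nu
```

Here the angular components of the vector and the one-form differ by a factor of $r^2=4$, while raising the index with the inverse metric recovers the original vector.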

The Volume Form

The metric also gives us a natural volume form on the manifold M. On a Riemannian manifold, this is defined as

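$$v = \sqrt{\det g_{\mu\nu}}\;dx^1\wedge\ldots\wedge dx^n$$

The determinant is usually simply written as $\sqrt{g}=\sqrt{\det g_{\mu\nu}}$. On a Lorentzian manifold, the determinant is negative and we instead have

$$v = \sqrt{-g}\;dx^0\wedge\ldots\wedge dx^{n-1} \tag{3.95}$$

As defined, the volume form looks coordinate dependent. Importantly, it is not. To see this, introduce some rival coordinates $\tilde x^\mu$, with

$$dx^\mu = A^\mu{}_\nu\,d\tilde x^\nu \quad\text{where}\quad A^\mu{}_\nu = \frac{\partial x^\mu}{\partial\tilde x^\nu}$$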

In the new coordinates, the wedgey part of the volume form becomes

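$$dx^1\wedge\ldots\wedge dx^n = A^1{}_{\mu_1}\ldots A^n{}_{\mu_n}\,d\tilde x^{\mu_1}\wedge\ldots\wedge d\tilde x^{\mu_n}$$

We can rearrange the one-forms into the order $d\tilde x^1\wedge\ldots\wedge d\tilde x^n$. We pay a price of $+1$ or $-1$ depending on whether $\{\mu_1,\ldots,\mu_n\}$ is an even or odd permutation of $\{1,\ldots,n\}$. Since we’re summing over all indices, this is the same as summing over all permutations $\pi$ of $\{1,\ldots,n\}$, and we have

$$\begin{aligned} dx^1\wedge\ldots\wedge dx^n &= \sum_{{\rm perms}\ \pi}{\rm sign}(\pi)\,A^1{}_{\pi(1)}\ldots A^n{}_{\pi(n)}\;d\tilde x^1\wedge\ldots\wedge d\tilde x^n \\ &= \det(A)\;d\tilde x^1\wedge\ldots\wedge d\tilde x^n \end{aligned}$$

where $\det(A)>0$ if the change of coordinates preserves the orientation. This factor of $\det(A)$ is the usual Jacobian factor that one finds when changing the measure in an integral.

Meanwhile, the metric components transform as

$$g_{\mu\nu} = \frac{\partial\tilde x^\rho}{\partial x^\mu}\frac{\partial\tilde x^\sigma}{\partial x^\nu}\,\tilde g_{\rho\sigma} = (A^{-1})^\rho{}_\mu\,(A^{-1})^\sigma{}_\nu\,\tilde g_{\rho\sigma}$$

and so the determinant becomes

$$\det g_{\mu\nu} = (\det A^{-1})^2\,\det\tilde g_{\mu\nu} = \frac{\det\tilde g_{\mu\nu}}{(\det A)^2}$$

We see that the factors of $\det A$ cancel, and we can equally write the volume form as

$$v = \sqrt{|\tilde g|}\;d\tilde x^1\wedge\ldots\wedge d\tilde x^n$$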

The volume form (3.95) may look more familiar if we write it as

$$v = \frac{1}{n!}\,v_{\mu_1\ldots\mu_n}\,dx^{\mu_1}\wedge\ldots\wedge dx^{\mu_n}$$

Here the components $v_{\mu_1\ldots\mu_n}$ are given in terms of the totally anti-symmetric object $\epsilon_{\mu_1\ldots\mu_n}$ with $\epsilon_{1\ldots n}=+1$ and other components determined by the sign of the permutation,

$$v_{\mu_1\ldots\mu_n} = \sqrt{|g|}\;\epsilon_{\mu_1\ldots\mu_n} \tag{3.96}$$

Note that $v_{\mu_1\ldots\mu_n}$ is a tensor, which means that $\epsilon_{\mu_1\ldots\mu_n}$ can’t quite be a tensor: instead, it is a tensor divided by $\sqrt{|g|}$. It is sometimes said to be a tensor density. The anti-symmetric tensor density arises in many places in physics. In all cases, it should be viewed as a volume form on the manifold. (In nearly all cases, this volume form arises from a metric as here.)

As with other tensors, we can use the metric to raise the indices and construct the volume form with all indices up

$$v^{\mu_1\ldots\mu_n} = g^{\mu_1\nu_1}\ldots g^{\mu_n\nu_n}\,v_{\nu_1\ldots\nu_n} = \pm\frac{1}{\sqrt{|g|}}\,\epsilon^{\mu_1\ldots\mu_n}$$

where we get a $+$ sign for a Riemannian manifold, and a $-$ sign for a Lorentzian manifold. Here $\epsilon^{\mu_1\ldots\mu_n}$ is again a totally anti-symmetric tensor density with $\epsilon^{1\ldots n}=+1$. Note, however, that while we raise the indices on $v_{\mu_1\ldots\mu_n}$ using the metric, this statement doesn’t quite hold for $\epsilon_{\mu_1\ldots\mu_n}$ which takes values $\pm1$ or $0$ regardless of whether the indices are all down or all up. This reflects the fact that it is a tensor density, rather than a genuine tensor.

The existence of a natural volume form means that, given a metric, we can integrate any function f over the manifold. We will sometimes write this as

$$\int_M f\,v = \int_M d^nx\,\sqrt{\pm g}\;f$$

The metric $\sqrt{\pm g}$ provides a measure on the manifold that tells us what regions of the manifold are weighted more strongly than the others in the integral.

The Hodge Dual

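On an oriented manifold $M$, we can use the totally anti-symmetric tensor $\epsilon_{\mu_1\ldots\mu_n}$ to define a map which takes a $p$-form $\omega\in\Lambda^p(M)$ to an $(n-p)$-form, denoted $(\star\,\omega)\in\Lambda^{n-p}(M)$, defined by

$$(\star\,\omega)_{\mu_1\ldots\mu_{n-p}} = \frac{1}{p!}\sqrt{|g|}\;\epsilon_{\mu_1\ldots\mu_{n-p}\nu_1\ldots\nu_p}\,\omega^{\nu_1\ldots\nu_p} \tag{3.97}$$

This map is called the Hodge dual. It is independent of the choice of coordinates.

It’s not hard to check that,

$$\star\,(\star\,\omega) = \pm(-1)^{p(n-p)}\,\omega \tag{3.98}$$

where the $+$ sign holds for Riemannian manifolds and the $-$ sign for Lorentzian manifolds. (To prove this, it’s useful to first show that $v^{\mu_1\ldots\mu_p\rho_1\ldots\rho_{n-p}}\,v_{\nu_1\ldots\nu_p\rho_1\ldots\rho_{n-p}}=\pm\,p!\,(n-p)!\;\delta^{\mu_1}_{[\nu_1}\ldots\delta^{\mu_p}_{\nu_p]}$, again with the $\pm$ sign for Riemannian/Lorentzian manifolds.)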

It’s worth returning to some high school physics and viewing it through the lens of our new tools. We are very used to taking two vectors in $\mathbf{R}^3$, say $\mathbf{a}$ and $\mathbf{b}$, and taking the cross-product to find a third vector

$$\mathbf{a}\times\mathbf{b}=\mathbf{c}$$

In fact, we really have objects that live in three different spaces here, related by the Euclidean metric $\delta_{\mu\nu}$. First we use this metric to relate the vectors to one-forms. The cross-product is then really a wedge product which gives us back a 2-form. We then use the metric twice more, once to turn the two-form back into a one-form using the Hodge dual, and again to turn the one-form into a vector. Of course, none of these subtleties bothered us when we were 15. But when we start thinking about curved manifolds, with a non-trivial metric, these distinctions become important.
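The chain "vectors, then wedge product, then Hodge dual" can be spelled out in components. The following sketch assumes $\mathbf{R}^3$ with Cartesian coordinates, so that $\sqrt{|g|}=1$ and index positions don't matter, and uses the component convention $(a\wedge b)_{ij}=a_ib_j-a_jb_i$. The function names are illustrative.

```python
def levi_civita(i, j, k):
    """Totally anti-symmetric symbol with epsilon_{012} = +1 (indices run over 0, 1, 2)."""
    return (i - j) * (j - k) * (k - i) // 2

def wedge(a, b):
    """Components (a ^ b)_{ij} = a_i b_j - a_j b_i of the 2-form built from two one-forms."""
    return [[a[i] * b[j] - a[j] * b[i] for j in range(3)] for i in range(3)]

def hodge_dual_of_2form(omega):
    """(*omega)_k = (1/2!) epsilon_{kij} omega^{ij}, i.e. the Hodge dual with p = 2, n = 3."""
    return [
        0.5 * sum(levi_civita(k, i, j) * omega[i][j] for i in range(3) for j in range(3))
        for k in range(3)
    ]

def cross_via_forms(a, b):
    # With the Euclidean metric delta_{mu nu}, lowering and raising indices is trivial,
    # so the only non-trivial steps are the wedge product and the Hodge dual.
    return hodge_dual_of_2form(wedge(a, b))

c = cross_via_forms([1, 2, 3], [4, 5, 6])
```

For these inputs the result agrees with the familiar determinant formula for $\mathbf{a}\times\mathbf{b}$, namely $(-3, 6, -3)$.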

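The Hodge dual allows us to define an inner product on each $\Lambda^p(M)$. If $\omega,\eta\in\Lambda^p(M)$, we define

$$\langle\eta,\omega\rangle = \int_M \eta\wedge\star\,\omega$$

which makes sense because $\star\,\omega\in\Lambda^{n-p}(M)$ and so $\eta\wedge\star\,\omega$ is a top form that can be integrated over the manifold.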

With such an inner product in place, we can also start to play the kind of games that are familiar from quantum mechanics and look at operators on $\Lambda^p(M)$ and their adjoints. The one operator that we have introduced on the space of forms is the exterior derivative, defined in Section 2.4.1. Its adjoint is defined by the following result:


Claim: For $\omega\in\Lambda^p(M)$ and $\alpha\in\Lambda^{p-1}(M)$,

$$\langle d\alpha,\omega\rangle = \langle\alpha,d^\dagger\omega\rangle \tag{3.99}$$

where the adjoint operator $d^\dagger:\Lambda^p(M)\to\Lambda^{p-1}(M)$ is given by

$$d^\dagger = \pm(-1)^{np+n-1}\,\star d\,\star$$

with, again, the $\pm$ sign for Riemannian/Lorentzian manifolds respectively.


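Proof: This is simply the statement of integration-by-parts for forms. On a closed manifold $M$, Stokes’ theorem tells us that

$$0 = \int_M d(\alpha\wedge\star\,\omega) = \int_M d\alpha\wedge\star\,\omega + (-1)^{p-1}\,\alpha\wedge d\star\omega$$

The first term is simply $\langle d\alpha,\omega\rangle$. The second term also takes the form of an inner product which, up to a sign, is proportional to $\langle\alpha,\star\,d\star\omega\rangle$. To determine the sign, note that $d\star\omega\in\Lambda^{n-p+1}(M)$ so, using (3.98), we have $\star\star d\star\omega=\pm(-1)^{(n-p+1)(p-1)}\,d\star\omega$. Putting this together gives

$$\langle d\alpha,\omega\rangle = \pm(-1)^{np+n-1}\,\langle\alpha,\star\,d\star\omega\rangle$$

as promised.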

3.1.4 A Sniff of Hodge Theory

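We can combine $d$ and $d^\dagger$ to construct the Laplacian, $\triangle:\Lambda^p(M)\to\Lambda^p(M)$, defined as

$$\triangle = (d+d^\dagger)^2 = dd^\dagger+d^\dagger d$$

where the second equality follows because $d^2=d^{\dagger\,2}=0$. The Laplacian can be defined on both Riemannian manifolds, where it is positive definite, and Lorentzian manifolds. Here we restrict our discussion to Riemannian manifolds.

Acting on functions $f$, we have $d^\dagger f=0$ (because $\star f$ is a top form so $d\star f=0$). That leaves us with,

$$\begin{aligned} \triangle(f) &= -\star d\star(\partial_\mu f\,dx^\mu) \\ &= -\frac{1}{(n-1)!}\star d\left((\partial_\mu f)\,g^{\mu\nu}\sqrt{|g|}\;\epsilon_{\nu\rho_1\ldots\rho_{n-1}}\,dx^{\rho_1}\wedge\ldots\wedge dx^{\rho_{n-1}}\right) \\ &= -\frac{1}{(n-1)!}\star\,\partial_\sigma\left(\sqrt{|g|}\,g^{\mu\nu}\partial_\mu f\right)\epsilon_{\nu\rho_1\ldots\rho_{n-1}}\,dx^\sigma\wedge dx^{\rho_1}\wedge\ldots\wedge dx^{\rho_{n-1}} \\ &= -\star\,\partial_\nu\left(\sqrt{|g|}\,g^{\mu\nu}\partial_\mu f\right)dx^1\wedge\ldots\wedge dx^n \\ &= -\frac{1}{\sqrt{|g|}}\,\partial_\nu\left(\sqrt{|g|}\,g^{\mu\nu}\partial_\mu f\right) \end{aligned}$$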

This form of the Laplacian, acting on functions, appears fairly often in applications of differential geometry.
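As a sanity check on this formula, including its overall minus sign, here is a finite-difference sketch. It assumes a diagonal metric and uses flat space in plane polar coordinates, where $\sqrt{|g|}=r$ and $g^{\mu\nu}={\rm diag}(1,1/r^2)$. The function names and the step size `h` are illustrative choices.

```python
import math

def laplacian(f, sqrt_g, g_inv_diag, x, h=1e-4):
    """Evaluate -(1/sqrt g) d_nu( sqrt g g^{nu nu} d_nu f ) at x for a diagonal metric.

    f(x) is the function, sqrt_g(x) is sqrt|g|, and g_inv_diag(x) lists the diagonal
    entries of the inverse metric. All derivatives are central differences.
    """
    def shifted(y, index, step):
        z = list(y)
        z[index] += step
        return z

    def flux(nu, y):
        # sqrt(g) g^{nu nu} d_nu f, evaluated at the point y
        df = (f(shifted(y, nu, h)) - f(shifted(y, nu, -h))) / (2 * h)
        return sqrt_g(y) * g_inv_diag(y)[nu] * df

    divergence = sum(
        (flux(nu, shifted(x, nu, h)) - flux(nu, shifted(x, nu, -h))) / (2 * h)
        for nu in range(len(x))
    )
    return -divergence / sqrt_g(x)

def polar_sqrt_g(x):
    return x[0]                      # sqrt(g) = r in coordinates (r, phi)

def polar_g_inv_diag(x):
    return [1.0, 1.0 / x[0] ** 2]    # g^{rr} = 1, g^{phi phi} = 1/r^2

point = [1.5, 0.7]

# f = r^2 = x^2 + y^2 has ordinary Laplacian +4, so the positive-definite operator gives -4.
lap_r_squared = laplacian(lambda x: x[0] ** 2, polar_sqrt_g, polar_g_inv_diag, point)

# f = r^2 cos(2 phi) = x^2 - y^2 is harmonic, so the result should vanish.
lap_harmonic = laplacian(lambda x: x[0] ** 2 * math.cos(2 * x[1]),
                         polar_sqrt_g, polar_g_inv_diag, point)
```

The minus sign relative to the usual $\nabla^2$ is the price of working with the operator $dd^\dagger+d^\dagger d$.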

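There is a particularly nice story involving $p$-forms $\gamma$ that obey

$$\triangle\gamma = 0$$

Such forms are said to be harmonic. A harmonic form is necessarily closed, meaning $d\gamma=0$, and co-closed, meaning $d^\dagger\gamma=0$. This follows by writing

$$\langle\gamma,\triangle\gamma\rangle = \langle d\gamma,d\gamma\rangle+\langle d^\dagger\gamma,d^\dagger\gamma\rangle = 0$$

and noting that the inner product is positive-definite.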

There are some rather pretty facts that relate the existence of harmonic forms to de Rham cohomology. The space of harmonic $p$-forms on a manifold $M$ is denoted ${\rm Harm}^p(M)$. First, the Hodge decomposition theorem, which we state without proof: any $p$-form $\omega$ on a compact, Riemannian manifold can be uniquely decomposed as

$$\omega = d\alpha+d^\dagger\beta+\gamma$$

where $\alpha\in\Lambda^{p-1}(M)$ and $\beta\in\Lambda^{p+1}(M)$ and $\gamma\in{\rm Harm}^p(M)$. This result can then be used to prove:


Hodge’s Theorem: There is an isomorphism

$${\rm Harm}^p(M)\cong H^p(M)$$

where $H^p(M)$ is the de Rham cohomology group introduced in Section 2.4.3. In particular, the Betti numbers can be computed by counting the number of linearly independent harmonic forms,

$$B_p = \dim{\rm Harm}^p(M)$$

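Proof: First, let’s show that any harmonic form $\gamma$ provides a representative of $H^p(M)$. As we saw above, any harmonic $p$-form is closed, $d\gamma=0$, so $\gamma\in Z^p(M)$. But the unique nature of the Hodge decomposition tells us that $\gamma\neq d\beta$ for some $\beta$.

Next, we need to show that any equivalence class $[\omega]\in H^p(M)$ can be represented by a harmonic form. We decompose $\omega=d\alpha+d^\dagger\beta+\gamma$. By definition $[\omega]\in H^p(M)$ means that $d\omega=0$ so we have

$$0 = \langle d\omega,\beta\rangle = \langle\omega,d^\dagger\beta\rangle = \langle d\alpha+d^\dagger\beta+\gamma,\,d^\dagger\beta\rangle = \langle d^\dagger\beta,d^\dagger\beta\rangle$$

where, in the final step, we “integrated by parts” and used the fact that $dd\alpha=d\gamma=0$. Because the inner product is positive definite, we must have $d^\dagger\beta=0$ and, hence, $\omega=\gamma+d\alpha$. Any other representative $\tilde\omega\sim\omega$ of $[\omega]\in H^p(M)$ differs by $\tilde\omega=\omega+d\eta$ and so, by the Hodge decomposition, is associated to the same harmonic form $\gamma$.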

3.2 Connections and Curvature

We’ve already met one version of differentiation in these lectures. A vector field X is, at heart, a differential operator and provides a way to differentiate a function f. We write this simply as X(f).

As we saw previously, differentiating higher tensor fields is a little more tricky because it requires us to subtract tensor fields at different points. Yet tensors evaluated at different points live in different vector spaces, and it only makes sense to subtract these objects if we can first find a way to map one vector space into the other. In Section 2.2.4, we used the flow generated by $X$ as a way to perform this mapping, resulting in the idea of the Lie derivative $\mathcal{L}_X$.

There is, however, a different way to take derivatives, one which ultimately will prove more useful. The derivative is again associated to a vector field $X$. However, this time we introduce a different object, known as a connection, to map the vector spaces at one point to the vector spaces at another. The result is an object, distinct from the Lie derivative, called the covariant derivative.

3.2.1 The Covariant Derivative

A connection is a map $\nabla:\mathfrak{X}(M)\times\mathfrak{X}(M)\to\mathfrak{X}(M)$. We usually write this as $\nabla(X,Y)=\nabla_XY$ and the object $\nabla_X$ is called the covariant derivative. It satisfies the following properties for all vector fields $X$, $Y$ and $Z$,

  • $\nabla_X(Y+Z)=\nabla_XY+\nabla_XZ$

  • $\nabla_{fX+gY}Z=f\nabla_XZ+g\nabla_YZ$ for all functions $f,g$.

  • $\nabla_X(fY)=f\nabla_XY+(\nabla_Xf)Y$ where we define $\nabla_Xf=X(f)$

The covariant derivative endows the manifold with more structure. To elucidate this, we can evaluate the connection in a basis $\{e_\mu\}$ of $\mathfrak{X}(M)$. We can always express this as

$$\nabla_{e_\rho}e_\nu = \Gamma^\mu_{\rho\nu}\,e_\mu \tag{3.100}$$

with $\Gamma^\mu_{\rho\nu}$ the components of the connection. It is no coincidence that these are denoted by the same Greek letter that we used for the Christoffel symbols in Section 1. However, for now, you should not conflate the two; we’ll see the relationship between them in Section 3.2.3.

The name “connection” suggests that $\nabla$, or its components $\Gamma^\mu_{\nu\rho}$, connect things. Indeed they do. We will show in Section 3.3 that the connection provides a map from the tangent space $T_p(M)$ to the tangent space at any other point $T_q(M)$. This is what allows the connection to act as a derivative.

We will use the notation

$$\nabla_\mu = \nabla_{e_\mu}$$

This makes the covariant derivative $\nabla_\mu$ look similar to a partial derivative. Using the properties of the connection, we can write a general covariant derivative of a vector field as

$$\begin{aligned} \nabla_XY &= \nabla_X(Y^\mu e_\mu) \\ &= X(Y^\mu)\,e_\mu+Y^\mu\nabla_Xe_\mu \\ &= X^\nu e_\nu(Y^\mu)\,e_\mu+X^\nu Y^\mu\nabla_\nu e_\mu \\ &= X^\nu\left(e_\nu(Y^\mu)+\Gamma^\mu_{\nu\rho}Y^\rho\right)e_\mu \end{aligned}$$

The fact that we can strip off the overall factor of $X^\nu$ means that it makes sense to write the components of the covariant derivative as

$$\nabla_\nu Y = \left(e_\nu(Y^\mu)+\Gamma^\mu_{\nu\rho}Y^\rho\right)e_\mu$$

Or, in components,

$$(\nabla_\nu Y)^\mu = e_\nu(Y^\mu)+\Gamma^\mu_{\nu\rho}Y^\rho \tag{3.101}$$

Note that the covariant derivative coincides with the Lie derivative on functions, $\nabla_Xf=\mathcal{L}_Xf=X(f)$. It also coincides with the old-fashioned partial derivative: $\nabla_\mu f=\partial_\mu f$. However, its action on vector fields differs. In particular, the Lie derivative $\mathcal{L}_XY=[X,Y]$ depends on both $X$ and the first derivative of $X$ while, as we have seen above, the covariant derivative depends only on $X$. This is the property that allows us to write $\nabla_X=X^\nu\nabla_\nu$ and think of $\nabla_\mu$ as an operator in its own right. In contrast, there is no way to write “$\mathcal{L}_X=X^\mu\mathcal{L}_\mu$”. While the Lie derivative has its uses, the ability to define $\nabla_\mu$ means that this is best viewed as the natural generalisation of the partial derivative to curved space.

Differentiation as Punctuation

In a coordinate basis, in which $e_\mu=\partial_\mu$, the covariant derivative (3.101) becomes

$$(\nabla_\nu Y)^\mu = \partial_\nu Y^\mu+\Gamma^\mu_{\nu\rho}Y^\rho \tag{3.102}$$

We will differentiate often. To save ink, we use the sloppy, and sometimes confusing, notation

$$(\nabla_\nu Y)^\mu = \nabla_\nu Y^\mu$$

This means, in particular, that $\nabla_\nu Y^\mu$ is the $\mu^{\rm th}$ component of $\nabla_\nu Y$, rather than the differentiation of the function $Y^\mu$.

Covariant differentiation is sometimes denoted using a semi-colon

$$\nabla_\nu Y^\mu = Y^\mu{}_{;\nu}$$

In this convention, the partial derivative is denoted using a mere comma, $\partial_\mu Y^\nu=Y^\nu{}_{,\mu}$. The expression (3.102) then reads

$$Y^\mu{}_{;\nu} = Y^\mu{}_{,\nu}+\Gamma^\mu_{\nu\rho}Y^\rho$$

I’m proud to say that we won’t adopt the “semi-colon = differentiation” notation in these lectures. Because it’s stupid.

The Connection is Not a Tensor

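The $\Gamma^\mu_{\rho\nu}$ defining the connection are not components of a tensor. We can see this immediately from the definition $\nabla(X,fY)=\nabla_X(fY)=f\nabla_XY+(X(f))Y$. This is not linear in the second argument, which is one of the requirements of a tensor.

To illustrate this, we can ask what the connection looks like in a different basis,

$$\tilde e_\nu = A^\mu{}_\nu\,e_\mu \tag{3.103}$$

for some invertible matrix $A$. If $e_\mu$ and $\tilde e_\mu$ are both coordinate bases, then

$$A^\mu{}_\nu = \frac{\partial x^\mu}{\partial\tilde x^\nu}$$

We know from (2.78) that the components of a $(1,2)$ tensor transform as

$$\tilde T^\mu{}_{\nu\rho} = (A^{-1})^\mu{}_\tau\,A^\lambda{}_\nu\,A^\sigma{}_\rho\,T^\tau{}_{\lambda\sigma} \tag{3.104}$$

We can now compare this to the transformation of the connection components $\Gamma^\mu_{\rho\nu}$. In the basis $\tilde e_\mu$, we have

$$\nabla_{\tilde e_\rho}\tilde e_\nu = \tilde\Gamma^\mu_{\rho\nu}\,\tilde e_\mu$$

Substituting in the transformation (3.103), we have

$$\tilde\Gamma^\mu_{\rho\nu}\,\tilde e_\mu = \nabla_{(A^\sigma{}_\rho e_\sigma)}(A^\lambda{}_\nu e_\lambda) = A^\sigma{}_\rho\nabla_{e_\sigma}(A^\lambda{}_\nu e_\lambda) = A^\sigma{}_\rho A^\lambda{}_\nu\Gamma^\tau_{\sigma\lambda}\,e_\tau+A^\sigma{}_\rho\,e_\lambda\,\partial_\sigma A^\lambda{}_\nu$$

We can write this as

$$\begin{aligned} \tilde\Gamma^\mu_{\rho\nu}\,\tilde e_\mu &= \left(A^\sigma{}_\rho A^\lambda{}_\nu\Gamma^\tau_{\sigma\lambda}+A^\sigma{}_\rho\,\partial_\sigma A^\tau{}_\nu\right)e_\tau \\ &= \left(A^\sigma{}_\rho A^\lambda{}_\nu\Gamma^\tau_{\sigma\lambda}+A^\sigma{}_\rho\,\partial_\sigma A^\tau{}_\nu\right)(A^{-1})^\mu{}_\tau\,\tilde e_\mu \end{aligned}$$

Stripping off the basis vectors $\tilde e_\mu$, we see that the components of the connection transform as

$$\tilde\Gamma^\mu_{\rho\nu} = (A^{-1})^\mu{}_\tau\,A^\sigma{}_\rho\,A^\lambda{}_\nu\,\Gamma^\tau_{\sigma\lambda}+(A^{-1})^\mu{}_\tau\,A^\sigma{}_\rho\,\partial_\sigma A^\tau{}_\nu \tag{3.105}$$

The first term coincides with the transformation of a tensor (3.104). But the second term, which is independent of $\Gamma$, but instead depends on $\partial A$, is novel. This is the characteristic transformation property of a connection.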

Differentiating Other Tensors

We can use the Leibnizarity of the covariant derivative to extend its action to any tensor field. It’s best to illustrate this with an example.

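Consider a one-form $\omega$. If we differentiate $\omega$, we will get another one-form $\nabla_X\omega$ which, like any one-form, is defined by its action on vector fields $Y\in\mathfrak{X}(M)$. To construct this, we will insist that the connection obeys the Leibnizarity in the modified sense that

$$\nabla_X(\omega(Y)) = (\nabla_X\omega)(Y)+\omega(\nabla_XY)$$

But $\omega(Y)$ is simply a function, which means that we can also write this as

$$\nabla_X(\omega(Y)) = X(\omega(Y))$$

Putting these together gives

$$(\nabla_X\omega)(Y) = X(\omega(Y))-\omega(\nabla_XY)$$

In coordinates, we have

$$\begin{aligned} X^\mu(\nabla_\mu\omega)_\nu Y^\nu &= X^\mu\partial_\mu(\omega_\nu Y^\nu)-\omega_\nu X^\mu\left(\partial_\mu Y^\nu+\Gamma^\nu_{\mu\rho}Y^\rho\right) \\ &= X^\mu\left(\partial_\mu\omega_\rho-\Gamma^\nu_{\mu\rho}\,\omega_\nu\right)Y^\rho \end{aligned}$$

where, crucially, the $\partial Y$ terms cancel in going from the first to the second line. This means that the overall result is linear in $Y$ and we may define $\nabla_X\omega$ without reference to the vector field $Y$ on which it acts. In components, we have

$$(\nabla_\mu\omega)_\rho = \partial_\mu\omega_\rho-\Gamma^\nu_{\mu\rho}\,\omega_\nu$$

As for vector fields, we also write this as

$$(\nabla_\mu\omega)_\rho \equiv \nabla_\mu\omega_\rho \equiv \omega_{\rho;\mu} = \omega_{\rho,\mu}-\Gamma^\nu_{\mu\rho}\,\omega_\nu$$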

This kind of argument can be extended to a general tensor field of rank (p,q), where the covariant derivative is defined by,

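$$\begin{aligned} T^{\mu_1\ldots\mu_p}{}_{\nu_1\ldots\nu_q;\rho} &= T^{\mu_1\ldots\mu_p}{}_{\nu_1\ldots\nu_q,\rho}+\Gamma^{\mu_1}_{\rho\sigma}\,T^{\sigma\mu_2\ldots\mu_p}{}_{\nu_1\ldots\nu_q}+\ldots+\Gamma^{\mu_p}_{\rho\sigma}\,T^{\mu_1\ldots\mu_{p-1}\sigma}{}_{\nu_1\ldots\nu_q} \\ &\quad -\Gamma^\sigma_{\rho\nu_1}\,T^{\mu_1\ldots\mu_p}{}_{\sigma\nu_2\ldots\nu_q}-\ldots-\Gamma^\sigma_{\rho\nu_q}\,T^{\mu_1\ldots\mu_p}{}_{\nu_1\ldots\nu_{q-1}\sigma} \end{aligned}$$

The pattern is clear: for every upper index $\mu$ we get a $+\Gamma T$ term, while for every lower index we get a $-\Gamma T$ term.

Now that we can differentiate tensors, we will also need to extend our punctuation notation slightly. If two subscripts follow a semi-colon (or, indeed, a comma) then we differentiate with respect to both, doing the one on the left first. So, for example, $X^\mu{}_{;\nu\rho}=\nabla_\rho\nabla_\nu X^\mu$.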

3.2.2 Torsion and Curvature

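Even though the connection is not a tensor, we can use it to construct two tensors. The first is a rank $(1,2)$ tensor $T$ known as torsion. It is defined to act on $X,Y\in\mathfrak{X}(M)$ and $\omega\in\Lambda^1(M)$ by

$$T(\omega;X,Y) = \omega\left(\nabla_XY-\nabla_YX-[X,Y]\right)$$

The other is a rank $(1,3)$ tensor $R$, known as curvature. It acts on $X,Y,Z\in\mathfrak{X}(M)$ and $\omega\in\Lambda^1(M)$ by

$$R(\omega;X,Y,Z) = \omega\left(\nabla_X\nabla_YZ-\nabla_Y\nabla_XZ-\nabla_{[X,Y]}Z\right)$$

The curvature tensor is also called the Riemann tensor.

Alternatively, we could think of torsion as a map $T:\mathfrak{X}(M)\times\mathfrak{X}(M)\to\mathfrak{X}(M)$, defined by

$$T(X,Y) = \nabla_XY-\nabla_YX-[X,Y]$$

Similarly, the curvature $R$ can be viewed as a map from $\mathfrak{X}(M)\times\mathfrak{X}(M)$ to a differential operator acting on $\mathfrak{X}(M)$,

$$R(X,Y) = \nabla_X\nabla_Y-\nabla_Y\nabla_X-\nabla_{[X,Y]} \tag{3.106}$$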

Checking Linearity

To demonstrate that $T$ and $R$ are indeed tensors, we need to show that they are linear in all arguments. Linearity in $\omega$ is straightforward. For the others, there are some small calculations to do. For example, we must show that $T(\omega;fX,Y)=fT(\omega;X,Y)$. To see this, we just run through the definitions of the various objects,

$$T(\omega;fX,Y) = \omega\left(\nabla_{fX}Y-\nabla_Y(fX)-[fX,Y]\right)$$

We then use $\nabla_{fX}Y=f\nabla_XY$ and $\nabla_Y(fX)=f\nabla_YX+Y(f)X$ and $[fX,Y]=f[X,Y]-Y(f)X$. The two $Y(f)X$ terms cancel, leaving us with

$$\begin{aligned} T(\omega;fX,Y) &= f\,\omega\left(\nabla_XY-\nabla_YX-[X,Y]\right) \\ &= fT(\omega;X,Y) \end{aligned}$$

Similarly, for the curvature tensor we have

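$$\begin{aligned} R(\omega;fX,Y,Z) &= \omega\left(\nabla_{fX}\nabla_YZ-\nabla_Y\nabla_{fX}Z-\nabla_{[fX,Y]}Z\right) \\ &= \omega\left(f\nabla_X\nabla_YZ-\nabla_Y(f\nabla_XZ)-\nabla_{(f[X,Y]-Y(f)X)}Z\right) \\ &= \omega\left(f\nabla_X\nabla_YZ-f\nabla_Y\nabla_XZ-Y(f)\nabla_XZ-\nabla_{f[X,Y]}Z+\nabla_{Y(f)X}Z\right) \\ &= \omega\left(f\nabla_X\nabla_YZ-f\nabla_Y\nabla_XZ-Y(f)\nabla_XZ-f\nabla_{[X,Y]}Z+Y(f)\nabla_XZ\right) \\ &= f\,\omega\left(\nabla_X\nabla_YZ-\nabla_Y\nabla_XZ-\nabla_{[X,Y]}Z\right) \\ &= fR(\omega;X,Y,Z) \end{aligned}$$

Linearity in $Y$ follows from linearity in $X$. But we still need to check linearity in $Z$,

$$\begin{aligned} R(\omega;X,Y,fZ) &= \omega\left(\nabla_X\nabla_Y(fZ)-\nabla_Y\nabla_X(fZ)-\nabla_{[X,Y]}(fZ)\right) \\ &= \omega\Big(\nabla_X\left(f\nabla_YZ+Y(f)Z\right)-\nabla_Y\left(f\nabla_XZ+X(f)Z\right) \\ &\qquad\ -f\nabla_{[X,Y]}Z-[X,Y](f)Z\Big) \\ &= \omega\Big(f\nabla_X\nabla_YZ+X(f)\nabla_YZ+Y(f)\nabla_XZ+X(Y(f))Z \\ &\qquad\ -f\nabla_Y\nabla_XZ-Y(f)\nabla_XZ-X(f)\nabla_YZ-Y(X(f))Z \\ &\qquad\ -f\nabla_{[X,Y]}Z-[X,Y](f)Z\Big) \\ &= fR(\omega;X,Y,Z) \end{aligned}$$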

Thus, both torsion and curvature define new tensors on our manifold.

Components

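We can evaluate these tensors in a coordinate basis $\{e_\mu\}=\{\partial_\mu\}$, with the dual basis $\{f^\mu\}=\{dx^\mu\}$. The components of the torsion are

$$\begin{aligned} T^\rho{}_{\mu\nu} &= T(f^\rho;e_\mu,e_\nu) \\ &= f^\rho\left(\nabla_\mu e_\nu-\nabla_\nu e_\mu-[e_\mu,e_\nu]\right) \\ &= f^\rho\left(\Gamma^\sigma_{\mu\nu}e_\sigma-\Gamma^\sigma_{\nu\mu}e_\sigma\right) \\ &= \Gamma^\rho_{\mu\nu}-\Gamma^\rho_{\nu\mu} \end{aligned}$$

where we’ve used the fact that, in a coordinate basis, $[e_\mu,e_\nu]=[\partial_\mu,\partial_\nu]=0$. We learn that, even though $\Gamma^\rho_{\mu\nu}$ is not a tensor, the anti-symmetric part $\Gamma^\rho_{[\mu\nu]}$ does form a tensor. Clearly the torsion tensor is anti-symmetric in the lower two indices

$$T^\rho{}_{\mu\nu} = -T^\rho{}_{\nu\mu}$$

Connections which are symmetric in the lower indices, so $\Gamma^\rho_{\mu\nu}=\Gamma^\rho_{\nu\mu}$, have $T^\rho{}_{\mu\nu}=0$. Such connections are said to be torsion-free.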

The components of the curvature tensor are given by

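$$R^\sigma{}_{\rho\mu\nu} = R(f^\sigma;e_\mu,e_\nu,e_\rho)$$

Note the slightly counterintuitive, but standard ordering of the indices; the indices $\mu$ and $\nu$ that are associated to covariant derivatives $\nabla_\mu$ and $\nabla_\nu$ go at the end. We have

$$\begin{aligned} R^\sigma{}_{\rho\mu\nu} &= f^\sigma\left(\nabla_\mu\nabla_\nu e_\rho-\nabla_\nu\nabla_\mu e_\rho-\nabla_{[e_\mu,e_\nu]}e_\rho\right) \\ &= f^\sigma\left(\nabla_\mu\nabla_\nu e_\rho-\nabla_\nu\nabla_\mu e_\rho\right) \\ &= f^\sigma\left(\nabla_\mu(\Gamma^\lambda_{\nu\rho}e_\lambda)-\nabla_\nu(\Gamma^\lambda_{\mu\rho}e_\lambda)\right) \\ &= f^\sigma\left((\partial_\mu\Gamma^\lambda_{\nu\rho})e_\lambda+\Gamma^\lambda_{\nu\rho}\Gamma^\tau_{\mu\lambda}e_\tau-(\partial_\nu\Gamma^\lambda_{\mu\rho})e_\lambda-\Gamma^\lambda_{\mu\rho}\Gamma^\tau_{\nu\lambda}e_\tau\right) \\ &= \partial_\mu\Gamma^\sigma_{\nu\rho}-\partial_\nu\Gamma^\sigma_{\mu\rho}+\Gamma^\lambda_{\nu\rho}\Gamma^\sigma_{\mu\lambda}-\Gamma^\lambda_{\mu\rho}\Gamma^\sigma_{\nu\lambda} \end{aligned} \tag{3.107}$$

Clearly the Riemann tensor is anti-symmetric in its last two indices

$$R^\sigma{}_{\rho\mu\nu} = -R^\sigma{}_{\rho\nu\mu}$$

Equivalently, $R^\sigma{}_{\rho\mu\nu}=R^\sigma{}_{\rho[\mu\nu]}$. There are a number of further identities of the Riemann tensor of this kind. We postpone this discussion to Section 3.4.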

The Ricci Identity

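There is a closely related calculation in which both the torsion and Riemann tensors appear. We look at the commutator of covariant derivatives acting on vector fields. Written in an orgy of anti-symmetrised notation, this calculation gives

$$\begin{aligned} \nabla_{[\mu}\nabla_{\nu]}Z^\sigma &= \partial_{[\mu}(\nabla_{\nu]}Z^\sigma)+\Gamma^\sigma_{[\mu|\lambda|}\nabla_{\nu]}Z^\lambda-\Gamma^\rho_{[\mu\nu]}\nabla_\rho Z^\sigma \\ &= \partial_{[\mu}\partial_{\nu]}Z^\sigma+(\partial_{[\mu}\Gamma^\sigma_{\nu]\rho})Z^\rho+(\partial_{[\mu}Z^\rho)\Gamma^\sigma_{\nu]\rho}+\Gamma^\sigma_{[\mu|\lambda|}\partial_{\nu]}Z^\lambda \\ &\quad +\Gamma^\sigma_{[\mu|\lambda|}\Gamma^\lambda_{\nu]\rho}Z^\rho-\Gamma^\rho_{[\mu\nu]}\nabla_\rho Z^\sigma \end{aligned}$$

The first term vanishes, while the third and fourth terms cancel against each other. We’re left with

$$2\nabla_{[\mu}\nabla_{\nu]}Z^\sigma = R^\sigma{}_{\rho\mu\nu}Z^\rho-T^\rho{}_{\mu\nu}\nabla_\rho Z^\sigma \tag{3.108}$$

where the torsion tensor is $T^\rho{}_{\mu\nu}=2\Gamma^\rho_{[\mu\nu]}$ and the Riemann tensor appears as

$$R^\sigma{}_{\rho\mu\nu} = 2\partial_{[\mu}\Gamma^\sigma_{\nu]\rho}+2\Gamma^\sigma_{[\mu|\lambda|}\Gamma^\lambda_{\nu]\rho}$$

which coincides with (3.107). The expression (3.108) is known as the Ricci identity.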

3.2.3 The Levi-Civita Connection

So far, our discussion of the connection has been entirely independent of the metric. However, something nice happens if we have both a connection and a metric. This something nice is called the fundamental theorem of Riemannian geometry. (Happily, it’s also true for Lorentzian geometries.)


Theorem: There exists a unique, torsion free, connection that is compatible with a metric $g$, in the sense that

$$\nabla_Xg = 0$$

for all vector fields $X$.


Proof: We start by showing uniqueness. Suppose that such a connection exists. Then, by Leibniz

$$X(g(Y,Z)) = \nabla_X(g(Y,Z)) = (\nabla_Xg)(Y,Z)+g(\nabla_XY,Z)+g(Y,\nabla_XZ)$$

Since $\nabla_Xg=0$, this becomes

$$X(g(Y,Z)) = g(\nabla_XY,Z)+g(\nabla_XZ,Y)$$

By cyclic permutation of $X$, $Y$ and $Z$, we also have

$$\begin{aligned} Y(g(Z,X)) &= g(\nabla_YZ,X)+g(\nabla_YX,Z) \\ Z(g(X,Y)) &= g(\nabla_ZX,Y)+g(\nabla_ZY,X) \end{aligned}$$

Since the torsion vanishes, we have

$$\nabla_XY-\nabla_YX = [X,Y]$$

We can use this to write the cyclically permuted equations as

$$\begin{aligned} X(g(Y,Z)) &= g(\nabla_YX,Z)+g(\nabla_XZ,Y)+g([X,Y],Z) \\ Y(g(Z,X)) &= g(\nabla_ZY,X)+g(\nabla_YX,Z)+g([Y,Z],X) \\ Z(g(X,Y)) &= g(\nabla_XZ,Y)+g(\nabla_ZY,X)+g([Z,X],Y) \end{aligned}$$

Add the first two of these equations, and subtract the third. We find

$$\begin{aligned} g(\nabla_YX,Z) = \frac{1}{2}\Big[&X(g(Y,Z))+Y(g(Z,X))-Z(g(X,Y)) \\ &-g([X,Y],Z)-g([Y,Z],X)+g([Z,X],Y)\Big] \end{aligned} \tag{3.109}$$

But with a non-degenerate metric, this specifies the connection uniquely. We’ll give an expression in terms of components in (3.110) below.

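It remains to show that the object $\nabla$ defined this way does indeed satisfy the properties expected of a connection. The tricky one turns out to be the requirement that $\nabla_{fX}Y=f\nabla_XY$. We can see that this is indeed the case as follows:

$$\begin{aligned} g(\nabla_{fY}X,Z) &= \frac{1}{2}\Big[X(g(fY,Z))+fY(g(Z,X))-Z(g(X,fY)) \\ &\qquad -g([X,fY],Z)-g([fY,Z],X)+g([Z,X],fY)\Big] \\ &= \frac{1}{2}\Big[fX(g(Y,Z))+X(f)g(Y,Z)+fY(g(Z,X))-fZ(g(X,Y)) \\ &\qquad -Z(f)g(X,Y)-fg([X,Y],Z)-X(f)g(Y,Z)-fg([Y,Z],X) \\ &\qquad +Z(f)g(Y,X)+fg([Z,X],Y)\Big] \\ &= g(f\nabla_YX,Z) \end{aligned}$$

The other properties of the connection follow similarly.

The connection (3.109), compatible with the metric, is called the Levi-Civita connection. We can compute its components in a coordinate basis $\{e_\mu\}=\{\partial_\mu\}$. This is particularly simple because $[\partial_\mu,\partial_\nu]=0$, leaving us with

$$g(\nabla_\nu e_\mu,e_\rho) = \Gamma^\lambda_{\nu\mu}\,g_{\lambda\rho} = \frac{1}{2}\left(\partial_\mu g_{\nu\rho}+\partial_\nu g_{\mu\rho}-\partial_\rho g_{\mu\nu}\right)$$

Multiplying by the inverse metric gives

$$\Gamma^\lambda_{\mu\nu} = \frac{1}{2}g^{\lambda\rho}\left(\partial_\mu g_{\nu\rho}+\partial_\nu g_{\mu\rho}-\partial_\rho g_{\mu\nu}\right) \tag{3.110}$$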

The components of the Levi-Civita connection are called the Christoffel symbols. They are the objects (1.32) we met already in Section 1 when discussing geodesics in spacetime. For the rest of these lectures, when discussing a connection we will always mean the Levi-Civita connection.

An Example: Flat Space

In flat space $\mathbf{R}^d$, endowed with either Euclidean or Minkowski metric, we can always pick Cartesian coordinates, in which case the Christoffel symbols vanish. However, in other coordinates this need not be the case. For example, in Section 1.1.1, we computed the flat space Christoffel symbols in polar coordinates (1.11). They don’t vanish. But because the Riemann tensor is a genuine tensor, if it vanishes in one coordinate system then it must vanish in all of them. Given some horrible coordinate system, with $\Gamma^\rho_{\mu\nu}\neq0$, we can always compute the corresponding Riemann tensor to see if the space is actually flat after all.

Another Example: The Sphere 𝐒2

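Consider $\mathbf{S}^2$ with radius $r$ and the round metric

$$ds^2 = r^2\left(d\theta^2+\sin^2\theta\,d\phi^2\right)$$

We can extract the Christoffel symbols from those of flat space in polar coordinates (1.11). The non-zero components are

$$\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta\ ,\qquad \Gamma^\phi_{\theta\phi}=\Gamma^\phi_{\phi\theta} = \frac{\cos\theta}{\sin\theta} \tag{3.111}$$

From these, it is straightforward to compute the components of the Riemann tensor. They are most simply expressed as $R_{\sigma\rho\mu\nu}=g_{\sigma\lambda}R^\lambda{}_{\rho\mu\nu}$ and are given by

$$R_{\theta\phi\theta\phi} = R_{\phi\theta\phi\theta} = -R_{\theta\phi\phi\theta} = -R_{\phi\theta\theta\phi} = r^2\sin^2\theta \tag{3.112}$$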

with the other components vanishing.
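These components can be cross-checked numerically. The sketch below is only an illustration, not the method used in the notes: it builds the Christoffel symbols from the coordinate-basis formula $\Gamma^\lambda_{\mu\nu}=\frac{1}{2}g^{\lambda\rho}(\partial_\mu g_{\nu\rho}+\partial_\nu g_{\mu\rho}-\partial_\rho g_{\mu\nu})$ using central differences, and then the curvature from $R^\sigma{}_{\rho\mu\nu}=\partial_\mu\Gamma^\sigma_{\nu\rho}-\partial_\nu\Gamma^\sigma_{\mu\rho}+\Gamma^\lambda_{\nu\rho}\Gamma^\sigma_{\mu\lambda}-\Gamma^\lambda_{\mu\rho}\Gamma^\sigma_{\nu\lambda}$. The radius `R_SPHERE`, the sample point and the step sizes are arbitrary assumptions.

```python
import math

R_SPHERE = 2.0  # illustrative radius; any positive value works

def metric(x):
    """Round metric g_{mu nu} on the sphere in coordinates x = (theta, phi)."""
    theta, _phi = x
    return [[R_SPHERE ** 2, 0.0], [0.0, R_SPHERE ** 2 * math.sin(theta) ** 2]]

def inverse_2x2(g):
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]

def shifted(x, index, step):
    y = list(x)
    y[index] += step
    return y

def christoffel(x, h=1e-5):
    """gam[lam][mu][nu] approximates Gamma^lam_{mu nu} of the Levi-Civita connection."""
    g_inv = inverse_2x2(metric(x))
    dg = []  # dg[rho][mu][nu] = partial_rho g_{mu nu}
    for rho in range(2):
        g_plus, g_minus = metric(shifted(x, rho, h)), metric(shifted(x, rho, -h))
        dg.append([[(g_plus[m][n] - g_minus[m][n]) / (2 * h) for n in range(2)]
                   for m in range(2)])
    return [[[0.5 * sum(g_inv[lam][rho] * (dg[mu][nu][rho] + dg[nu][mu][rho] - dg[rho][mu][nu])
                        for rho in range(2))
              for nu in range(2)] for mu in range(2)] for lam in range(2)]

def riemann(x, h=1e-4):
    """riem[s][r][m][n] approximates R^s_{r m n}."""
    gam = christoffel(x)
    dgam = []  # dgam[m][s][n][r] = partial_m Gamma^s_{n r}
    for m in range(2):
        g_plus, g_minus = christoffel(shifted(x, m, h)), christoffel(shifted(x, m, -h))
        dgam.append([[[(g_plus[s][n][r] - g_minus[s][n][r]) / (2 * h) for r in range(2)]
                      for n in range(2)] for s in range(2)])
    riem = [[[[0.0, 0.0] for _ in range(2)] for _ in range(2)] for _ in range(2)]
    for s in range(2):
        for r in range(2):
            for m in range(2):
                for n in range(2):
                    value = dgam[m][s][n][r] - dgam[n][s][m][r]
                    value += sum(gam[l][n][r] * gam[s][m][l] - gam[l][m][r] * gam[s][n][l]
                                 for l in range(2))
                    riem[s][r][m][n] = value
    return riem

point = [1.0, 0.3]                 # (theta, phi); index 0 is theta, index 1 is phi
gam = christoffel(point)
riem = riemann(point)
# Lower the first index: R_{theta phi theta phi} = g_{theta theta} R^theta_{phi theta phi}
R_lower = metric(point)[0][0] * riem[0][1][0][1]
```

At this point, `gam[0][1][1]` should approximate $-\sin\theta\cos\theta$, `gam[1][0][1]` should approximate $\cos\theta/\sin\theta$, and `R_lower` should approximate $r^2\sin^2\theta$. Replacing `metric` by the flat metric in polar coordinates gives a quick way to see the curvature vanish even though the Christoffel symbols do not.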

3.2.4 The Divergence Theorem

Gauss’ Theorem, also known as the divergence theorem, states that if you integrate a total derivative, you get a boundary term. There is a particular version of this theorem in curved space that we will need for later applications.

As a warm-up, we have the following result:


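Lemma: The contraction of the Christoffel symbols can be written as

$$\Gamma^\mu_{\mu\nu} = \frac{1}{\sqrt{g}}\,\partial_\nu\sqrt{g} \tag{3.113}$$

On Lorentzian manifolds, we should replace $\sqrt{g}$ with $\sqrt{|g|}$.


Proof: From (3.110), we have

$$\Gamma^\mu_{\mu\nu} = \frac{1}{2}g^{\mu\rho}\,\partial_\nu g_{\mu\rho} = \frac{1}{2}{\rm tr}\left(g^{-1}\partial_\nu g\right) = \frac{1}{2}{\rm tr}\left(\partial_\nu\log g\right)$$

However, there’s a useful identity for the log of any diagonalisable matrix: they obey

$${\rm tr}\log A = \log\det A$$

This is clearly true for a diagonal matrix, since the determinant is the product of eigenvalues while the trace is the sum. But both trace and determinant are invariant under conjugation, so this is also true for diagonalisable matrices. Applying it to our metric formula above, we have

$$\Gamma^\mu_{\mu\nu} = \frac{1}{2}{\rm tr}\left(\partial_\nu\log g\right) = \frac{1}{2}\partial_\nu\log\det g = \frac{1}{2}\frac{1}{\det g}\,\partial_\nu\det g = \frac{1}{\sqrt{\det g}}\,\partial_\nu\sqrt{\det g}$$

which is the claimed result.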
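The lemma is easy to test on the round sphere, where $\sqrt{g}=r^2\sin\theta$ and the only non-vanishing contraction is $\Gamma^\mu_{\mu\theta}=\Gamma^\theta_{\theta\theta}+\Gamma^\phi_{\phi\theta}=\cos\theta/\sin\theta$. The following sketch compares this against a numerical derivative of $\log\sqrt{g}$; the sample angle and step size are arbitrary choices.

```python
import math

def log_sqrt_g(theta, r=1.0):
    """log of sqrt(det g) = r^2 sin(theta) for the round metric on the sphere."""
    return math.log(r * r * math.sin(theta))

def contracted_christoffel_theta(theta):
    # Gamma^theta_{theta theta} + Gamma^phi_{phi theta} = 0 + cos(theta)/sin(theta)
    return math.cos(theta) / math.sin(theta)

theta0 = 0.8
step = 1e-6

# (1/sqrt g) d_theta sqrt g = d_theta log sqrt g, estimated by a central difference
numerical_derivative = (log_sqrt_g(theta0 + step) - log_sqrt_g(theta0 - step)) / (2 * step)
lemma_mismatch = abs(numerical_derivative - contracted_christoffel_theta(theta0))
```

The radius drops out, as it must: a constant rescaling of the metric changes $\sqrt{g}$ by a constant factor and leaves $\partial_\nu\log\sqrt{g}$ untouched.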

With this in hand, we can now prove the following:


Divergence Theorem: Consider a region of a manifold M with boundary M. Let nμ be an outward-pointing, unit vector orthogonal to M. Then, for any vector field Xμ on M, we have

MdnxgμXμ=Mdn-1xγnμXμ

where γij is the pull-back of the metric to M, and γ=detγij. On a Lorentzian manifold, a version of this formula holds only if M is purely timelike or purely spacelike, which ensures that γ0 at any point. Proof: Using the lemma above, the integrand is

gμXμ=g(μXμ+ΓμνμXν)=g(μXμ+Xν1gνg)=μ(gXμ)

The integral is then

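$$\int_M d^nx\,\sqrt{g}\,\nabla_\mu X^\mu = \int_M d^nx\,\partial_\mu\left(\sqrt{g}\,X^\mu\right)$$

which now is an integral of an ordinary partial derivative, so we can apply the usual divergence theorem that we are familiar with. It remains only to evaluate what’s happening at the boundary $\partial M$. For this, it is useful to pick coordinates so that the boundary $\partial M$ is a surface of constant $x^n$. Furthermore, we will restrict to metrics of the form

$$g_{\mu\nu} = \begin{pmatrix} \gamma_{ij} & 0 \\ 0 & N^2 \end{pmatrix}$$

Then by our usual rules of integration, we have

$$\int_M d^nx\,\partial_\mu\left(\sqrt{g}\,X^\mu\right) = \int_{\partial M} d^{n-1}x\,\sqrt{\gamma N^2}\,X^n$$

The unit normal vector $n^\mu$ is given by $n^\mu = (0,0,\ldots,1/N)$, which satisfies $g_{\mu\nu}n^\mu n^\nu = 1$ as it should. We then have $n_\mu = g_{\mu\nu}n^\nu = (0,0,\ldots,N)$, so we can write

$$\int_M d^nx\,\sqrt{g}\,\nabla_\mu X^\mu = \int_{\partial M} d^{n-1}x\,\sqrt{\gamma}\,n_\mu X^\mu$$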

which is the result we need. As the final expression is a covariant quantity, it is true in general.
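As a sanity check, the theorem can be tested numerically in a simple setting. The sketch below takes the flat plane in polar coordinates $(r,\phi)$, so that $\sqrt{g}=r$, with the region a disc of radius $R$. On the boundary $r=R$ we have $\sqrt{\gamma}=R$ and $n_r=1$. The vector field and the grid sizes are arbitrary illustrative choices.

```python
import math

R = 1.0  # radius of the disc (hypothetical choice)

def sqrt_g(r):
    # Flat plane in polar coordinates: g = diag(1, r^2), so sqrt(g) = r.
    return r

def X(r, phi):
    # Hypothetical vector field, coordinate-basis components (X^r, X^phi).
    return (r * r * (1.0 + 0.5 * math.cos(phi)), r * math.sin(phi))

def bulk_integrand(r, phi, h=1e-5):
    # d_mu(sqrt(g) X^mu) by central differences; this equals sqrt(g) times the divergence.
    d_r = (sqrt_g(r + h) * X(r + h, phi)[0] - sqrt_g(r - h) * X(r - h, phi)[0]) / (2 * h)
    d_phi = sqrt_g(r) * (X(r, phi + h)[1] - X(r, phi - h)[1]) / (2 * h)
    return d_r + d_phi

def bulk_integral(n=200):
    # Midpoint rule over 0 < r < R, 0 < phi < 2 pi.
    dr, dphi = R / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            total += bulk_integrand(r, (j + 0.5) * dphi) * dr * dphi
    return total

def boundary_integral(n=200):
    # Boundary r = R: sqrt(gamma) = R and n_mu X^mu = X^r.
    dphi = 2 * math.pi / n
    return sum(R * X(R, (j + 0.5) * dphi)[0] * dphi for j in range(n))

bulk, boundary = bulk_integral(), boundary_integral()
print(bulk, boundary)
```

For this particular field both sides can also be done by hand and equal $2\pi R^3$.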

In Section 2.4.5, we advertised Stokes’ theorem as the mother of all integral theorems. It’s perhaps not surprising to hear that the divergence theorem is a special case of Stokes’ theorem. To see this, here’s an alternative proof that uses the language of forms.


Another Proof: Given the volume form $v$ on $M$, and a vector field $X$, we can contract the two to define an $(n-1)$-form $\omega = \iota_X v$. (This is the interior product that we previously met in (2.85).) It has components

$$\omega_{\mu_1\ldots\mu_{n-1}} = \sqrt{g}\,\epsilon_{\mu_1\ldots\mu_n}X^{\mu_n}$$

If we now take the exterior derivative, dω, we have a top-form. Since the top form is unique up to multiplication, dω must be proportional to the volume form. Indeed, it’s not hard to show that

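$$(d\omega)_{\mu_1\ldots\mu_n} = \sqrt{g}\,\epsilon_{\mu_1\ldots\mu_n}\nabla_\nu X^\nu$$

This means that, in form language, the integral over $M$ that we wish to consider can be written as

$$\int_M d^nx\,\sqrt{g}\,\nabla_\mu X^\mu = \int_M d\omega$$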

Now we invoke Stokes’ theorem, to write

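$$\int_M d\omega = \int_{\partial M}\omega$$

We now need to massage $\omega$ into the form needed. First, we introduce a volume form $\hat{v}$ on $\partial M$, with components

$$\hat{v}_{\mu_1\ldots\mu_{n-1}} = \sqrt{\gamma}\,\epsilon_{\mu_1\ldots\mu_{n-1}}$$

This is related to the volume form on $M$ by

$$\frac{1}{n}\,v_{\mu_1\ldots\mu_{n-1}\nu} = \hat{v}_{[\mu_1\ldots\mu_{n-1}}\,n_{\nu]}$$

where $n^\mu$ is the unit normal vector that we introduced previously. We then have

$$\omega_{\mu_1\ldots\mu_{n-1}} = \sqrt{\gamma}\,(n_\nu X^\nu)\,\tilde{\epsilon}_{\mu_1\ldots\mu_{n-1}}$$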

The divergence theorem then follows from Stokes’ theorem.

3.2.5 The Maxwell Action

Let’s briefly turn to some physics. We take the manifold M to be spacetime. In classical field theory, the dynamical degrees of freedom are objects that take values at each point in M. We call these objects fields. The simplest such object is just a function which, in physics, we call a scalar field.

As we described in Section 2.4.2, the theory of electromagnetism is described by a one-form field A. In fact, there is a little more structure because we ask that the theory is invariant under gauge transformations

$$A \to A + d\alpha$$

To achieve this, we construct a field strength F=dA which is indeed invariant under gauge transformations. The next question to ask is: what are the dynamics of these fields?

The most elegant and powerful way to describe the dynamics of classical fields is provided by the action principle. The action is a functional of the fields, constructed by integrating over the manifold. The differential geometric language that we’ve developed in these lectures tells us that there are, in fact, very few actions one can write down.

To see this, suppose that our manifold has only the 2-form F but is not equipped with a metric. If spacetime has dimension dim(M)=4 (it does!) then we need to construct a 4-form to integrate over M. There is only one of these at our disposal, suggesting the action

$$S_{\rm top} = -\frac{1}{2}\int F\wedge F$$

If we expand this out in the electric and magnetic fields using (2.87), we find

$$S_{\rm top} = \int dx^0\,dx^1\,dx^2\,dx^3\;\mathbf{E}\cdot\mathbf{B}$$

Actions of this kind, which are independent of the metric, are called topological. They are typically unimportant in classical physics. Indeed, we can locally write $F\wedge F = d(A\wedge F)$, so the action is a total derivative and does not affect the classical equations of motion. Nonetheless, topological actions often play subtle and interesting roles in quantum physics. For example, the action $S_{\rm top}$ underlies the theory of topological insulators. You can read more about this in Section 1 of the lectures on Gauge Theory.

To construct an action that gives rise to interesting classical dynamics, we need to introduce a metric. The existence of a metric allows us to introduce a second two-form, $\star F$, and construct the action

$$S_{\rm Maxwell} = -\frac{1}{2}\int F\wedge\star F = -\frac{1}{4}\int d^4x\,\sqrt{-g}\,g^{\mu\nu}g^{\rho\sigma}F_{\mu\rho}F_{\nu\sigma} = -\frac{1}{4}\int d^4x\,\sqrt{-g}\,F^{\mu\nu}F_{\mu\nu}$$

This is the Maxwell action, now generalised to a curved spacetime. If we restrict to flat Minkowski space, the components are $F_{\mu\nu}F^{\mu\nu} = 2(\mathbf{B}^2 - \mathbf{E}^2)$. As we saw in our lectures on Electromagnetism, varying this action gives the remaining two Maxwell equations. In the elegant language of differential geometry, these take the simple form

$$d\star F = 0$$
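The flat-space statement $F_{\mu\nu}F^{\mu\nu} = 2(\mathbf{B}^2-\mathbf{E}^2)$ quoted above can be checked by brute-force index contraction. The sketch below assumes the signature $(-,+,+,+)$ together with the identifications $F_{0i} = -E_i$ and $F_{ij} = \epsilon_{ijk}B_k$; these conventions are assumptions made for the illustration, and the sample field values are arbitrary. The result does not depend on the overall sign chosen for $E_i$.

```python
# Assumed conventions (not fixed by the text above): signature (-,+,+,+),
# F_{0i} = -E_i and F_{ij} = epsilon_{ijk} B_k.
ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]  # equal to its own inverse

def levi_civita3(i, j, k):
    # Totally anti-symmetric symbol for indices in {0, 1, 2}.
    return (i - j) * (j - k) * (k - i) // 2

def field_strength(E, B):
    F = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        F[0][i + 1] = -E[i]
        F[i + 1][0] = E[i]
        for j in range(3):
            F[i + 1][j + 1] = sum(levi_civita3(i, j, k) * B[k] for k in range(3))
    return F

def f_squared(F, ginv=ETA):
    # g^{mu nu} g^{rho sigma} F_{mu rho} F_{nu sigma}
    return sum(ginv[m][n] * ginv[r][s] * F[m][r] * F[n][s]
               for m in range(4) for n in range(4)
               for r in range(4) for s in range(4))

E = (1.0, 2.0, 3.0)    # arbitrary sample values
B = (0.5, -1.0, 2.0)
E2 = sum(e * e for e in E)
B2 = sum(b * b for b in B)
contraction = f_squared(field_strength(E, B))
print(contraction, 2 * (B2 - E2))
```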

We can also couple the gauge field to an electric current. This is described by a one-form J, and we write the action

$$S = \int -\frac{1}{2}\,F\wedge\star F + A\wedge\star J$$

We require that this action is invariant under gauge transformations $A\to A + d\alpha$. The action transforms as

$$S \to S + \int d\alpha\wedge\star J$$

After an integration by parts, the second term vanishes provided that

$$d\star J = 0$$

which is the requirement of current conservation expressed in the language of forms. The Maxwell equations now have a source term, and read

$$d\star F = \star J \qquad (3.114)$$

We see that the rigid structure of differential geometry leads us by the hand to the theories that govern our world. We’ll see this again in Section 4 when we discuss gravity.

Electric and Magnetic Charges

To define electric and magnetic charges, we integrate over submanifolds. For example, consider a three-dimensional spatial submanifold Σ. The electric charge in Σ is defined to be

$$Q_e = \int_\Sigma \star J$$

It’s simple to check that this agrees with our usual definition $Q_e = \int d^3x\,J^0$ in flat Minkowski space. Using the equation of motion (3.114), we can translate this into an integral of the field strength

$$Q_e = \int_\Sigma d\star F = \int_{\partial\Sigma}\star F \qquad (3.115)$$

where we have used Stokes’ theorem to write this as an integral over the boundary $\partial\Sigma$. The result is the general form of Gauss’ law, relating the electric charge in a region to the electric field piercing the boundary of the region. Similarly, we can define the magnetic charge

$$Q_m = \int_{\partial\Sigma} F$$

When we first meet Maxwell theory, we learn that magnetic charges do not exist, courtesy of the identity dF=0. However, this can be evaded in topologically more interesting spaces. We’ll see a simple example in Section 6.2.1 when we discuss charged black holes.

Figure 22:

The statement of current conservation $d\star J=0$ means that the electric charge $Q_e$ in a region cannot change unless current flows in or out of that region. This fact, familiar from Electromagnetism, also has a nice expression in terms of forms. Consider a cylindrical region of spacetime $V$, ending on two spatial hypersurfaces $\Sigma_1$ and $\Sigma_2$ as shown in the figure. The boundary of $V$ is then

$$\partial V = \Sigma_1\cup\Sigma_2\cup B$$

where $B$ is the cylindrical timelike hypersurface.

We require that $\star J=0$ on $B$, which is the statement that no current flows in or out of the region. Then we have

$$Q_e(\Sigma_1) - Q_e(\Sigma_2) = \int_{\Sigma_1}\star J - \int_{\Sigma_2}\star J = \int_{\partial V}\star J = \int_V d\star J = 0$$

which tells us that the electric charge remains constant in time.

Maxwell Equations Using Connections

The form of the Maxwell equations given above makes no reference to a connection. It does, however, use the metric, buried in the definition of the Hodge $\star$.

There is an equivalent formulation of the Maxwell equation using the covariant derivative. This will also serve to highlight the relationship between the covariant and exterior derivatives. First note that, given a one-form $A\in\Lambda^1(M)$, we can define the field strength as

$$F_{\mu\nu} = \nabla_\mu A_\nu - \nabla_\nu A_\mu = \partial_\mu A_\nu - \partial_\nu A_\mu$$

where the Christoffel symbols have cancelled out by virtue of the anti-symmetry. This is what allowed us to define the exterior derivative without the need for a connection.

Next, consider the current one-form J. We can recast the statement of current conservation as follows:


Claim:

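$$d\star J = 0 \quad\Leftrightarrow\quad \nabla_\mu J^\mu = 0$$

Proof: We have

$$\nabla_\mu J^\mu = \partial_\mu J^\mu + \Gamma^\mu_{\mu\rho}J^\rho = \frac{1}{\sqrt{-g}}\,\partial_\mu\left(\sqrt{-g}\,J^\mu\right)$$

where, in the second equality, we have used our previous result (3.113): $\Gamma^\mu_{\mu\nu} = \partial_\nu\log\sqrt{|g|}$. But this final form is proportional to $\star\,d\star J$, with the Hodge dual defined in (3.97).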

As an aside, in Riemannian signature the formula

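$$\nabla_\mu J^\mu = \frac{1}{\sqrt{g}}\,\partial_\mu\left(\sqrt{g}\,J^\mu\right)$$

provides a quick way of computing the divergence in different coordinate systems (if you don’t have the inside cover of Jackson to hand). For example, in spherical polar coordinates on $\mathbf{R}^3$, we have $g = r^4\sin^2\theta$. Plug this into the expression above to immediately find

$$\nabla\cdot\mathbf{J} = \frac{1}{r^2}\,\partial_r\left(r^2J^r\right) + \frac{1}{\sin\theta}\,\partial_\theta\left(\sin\theta\,J^\theta\right) + \partial_\phi J^\phi$$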
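Here is a small numerical check of this formula. The components $J^r$, $J^\theta$, $J^\phi$ are taken with respect to the coordinate basis $\{\partial_r,\partial_\theta,\partial_\phi\}$, not the unit-normalised basis used in many textbooks. As a test field we take the Cartesian vector field $\mathbf{J}=(0,0,z)$, whose divergence is 1; transforming its components gives $J^r = r\cos^2\theta$, $J^\theta = -\sin\theta\cos\theta$, $J^\phi = 0$. The field and the sample point are illustrative choices.

```python
import math

def J(r, th, ph):
    # Coordinate-basis components (J^r, J^theta, J^phi) of the Cartesian field (0, 0, z).
    return (r * math.cos(th) ** 2, -math.sin(th) * math.cos(th), 0.0)

def sqrt_g(r, th):
    # Spherical polars on R^3: g = r^4 sin^2(theta).
    return r * r * math.sin(th)

def divergence(field, r, th, ph, h=1e-5):
    # (1/sqrt g) d_mu (sqrt g J^mu), with derivatives taken by central differences.
    d_r = (sqrt_g(r + h, th) * field(r + h, th, ph)[0]
           - sqrt_g(r - h, th) * field(r - h, th, ph)[0]) / (2 * h)
    d_th = (sqrt_g(r, th + h) * field(r, th + h, ph)[1]
            - sqrt_g(r, th - h) * field(r, th - h, ph)[1]) / (2 * h)
    d_ph = sqrt_g(r, th) * (field(r, th, ph + h)[2] - field(r, th, ph - h)[2]) / (2 * h)
    return (d_r + d_th + d_ph) / sqrt_g(r, th)

div_value = divergence(J, 1.3, 0.8, 2.1)
print(div_value)
```

The printed value should be close to 1 at any point away from the coordinate singularities at $r=0$ and $\theta=0,\pi$.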

The Maxwell equation (3.114) can also be written in terms of the covariant derivative


Claim:

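$$d\star F = \star J \quad\Leftrightarrow\quad \nabla_\mu F^{\mu\nu} = J^\nu \qquad (3.116)$$

Proof: We have

$$\begin{aligned} \nabla_\mu F^{\mu\nu} &= \partial_\mu F^{\mu\nu} + \Gamma^\mu_{\mu\rho}F^{\rho\nu} + \Gamma^\nu_{\mu\rho}F^{\mu\rho} \\ &= \frac{1}{\sqrt{-g}}\,\partial_\mu\left(\sqrt{-g}\,F^{\mu\nu}\right) + \Gamma^\nu_{\mu\rho}F^{\mu\rho} = \frac{1}{\sqrt{-g}}\,\partial_\mu\left(\sqrt{-g}\,F^{\mu\nu}\right) \end{aligned}$$

where, in the second equality, we’ve again used (3.113) and in the final equality we’ve used the fact that $\Gamma^\nu_{\mu\rho}$ is symmetric while $F^{\mu\rho}$ is anti-symmetric. To complete the proof, you need to chase down the definitions of the Hodge dual (3.97) and the exterior derivative (2.81). (If you’re struggling to match factors of $\sqrt{-g}$, then remember that the volume form $v = \sqrt{-g}\,\epsilon$ is a tensor, while the epsilon symbol $\epsilon_{\mu_1\ldots\mu_4}$ is a tensor density.)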

3.3 Parallel Transport

Although we have now met a number of properties of the connection, we have not yet explained its name. What does it connect?

The answer is that the connection connects tangent spaces, or more generally any tensor vector space, at different points of the manifold. This map is called parallel transport. As we stressed earlier, such a map is necessary to define differentiation.

Take a vector field $X$ and consider some associated integral curve $C$, with coordinates $x^\mu(\tau)$, such that

$$X^\mu\big|_C = \frac{dx^\mu(\tau)}{d\tau} \qquad (3.117)$$

We say that a tensor field $T$ is parallel transported along $C$ if

$$\nabla_X T = 0 \qquad (3.118)$$

Suppose that the curve $C$ connects two points, $p\in M$ and $q\in M$. The requirement (3.118) provides a map from the vector space defined at $p$ to the vector space defined at $q$.

To illustrate this, consider the parallel transport of a second vector field Y. In components, the condition (3.118) reads

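$$X^\nu\left(\partial_\nu Y^\mu + \Gamma^\mu_{\nu\rho}Y^\rho\right) = 0$$

If we now evaluate this on the curve $C$, we can think of $Y^\mu = Y^\mu(x(\tau))$, which obeys

$$\frac{dY^\mu}{d\tau} + X^\nu\,\Gamma^\mu_{\nu\rho}\,Y^\rho = 0 \qquad (3.119)$$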

These are a set of coupled, ordinary differential equations. Given an initial condition at, say τ=0, corresponding to point p, these equations can be solved to find a unique vector at each point along the curve.
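To see these equations in action, the sketch below integrates (3.119) numerically around a circle of constant latitude $\theta=\theta_0$ on the unit sphere, using a standard fourth-order Runge-Kutta step. It assumes the usual round-sphere Christoffel symbols $\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^\phi_{\theta\phi} = \cot\theta$; the curve, the initial vector and the step count are illustrative choices.

```python
import math

def rhs(Y, theta0):
    # Curve: theta = theta0, phi = tau, so X^theta = 0 and X^phi = 1.
    # dY^mu/dtau = - X^nu Gamma^mu_{nu rho} Y^rho
    Yt, Yp = Y
    gamma_t_pp = -math.sin(theta0) * math.cos(theta0)   # Gamma^theta_{phi phi}
    gamma_p_pt = math.cos(theta0) / math.sin(theta0)    # Gamma^phi_{phi theta}
    return (-gamma_t_pp * Yp, -gamma_p_pt * Yt)

def transport_around_latitude(Y0, theta0, steps=4000):
    # Fourth-order Runge-Kutta from tau = 0 to tau = 2 pi.
    h = 2.0 * math.pi / steps
    Y = Y0
    for _ in range(steps):
        k1 = rhs(Y, theta0)
        k2 = rhs((Y[0] + 0.5 * h * k1[0], Y[1] + 0.5 * h * k1[1]), theta0)
        k3 = rhs((Y[0] + 0.5 * h * k2[0], Y[1] + 0.5 * h * k2[1]), theta0)
        k4 = rhs((Y[0] + h * k3[0], Y[1] + h * k3[1]), theta0)
        Y = (Y[0] + h / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
             Y[1] + h / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
    return Y

theta0 = math.pi / 3
Y_final = transport_around_latitude((1.0, 0.0), theta0)
# Length squared with the unit-sphere metric diag(1, sin^2 theta).
norm_sq = Y_final[0] ** 2 + math.sin(theta0) ** 2 * Y_final[1] ** 2
print(Y_final, norm_sq)
```

For this curve the equations can be solved by hand: the vector rotates through an angle $2\pi\cos\theta_0$ relative to the coordinate directions, so at $\theta_0=\pi/3$ it should return pointing in the opposite direction, with its length unchanged.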

Parallel transport is path dependent. It depends on both the connection, and the underlying path which, in this case, is characterised by the vector field X.

This is the second time we’ve used a vector field $X$ to construct maps between tensors at different points in the manifold. In Section 2.2.2, we used $X$ to generate a flow $\sigma_t: M\to M$, which we could then use to pull-back or push-forward tensors from one point to another. This was the basis of the Lie derivative. This is not the same as the present map. Here, we’re using $X$ only to define the curve, while the connection does the work of relating vector spaces along the curve.

3.3.1 Geodesics Revisited

A geodesic is a curve tangent to a vector field X that obeys

$$\nabla_X X = 0 \qquad (3.120)$$

Along the curve $C$, we can substitute the expression (3.117) into (3.119) to find

$$\frac{d^2x^\mu}{d\tau^2} + \Gamma^\mu_{\rho\nu}\,\frac{dx^\rho}{d\tau}\frac{dx^\nu}{d\tau} = 0 \qquad (3.121)$$

This is precisely the geodesic equation (1.31) that we derived in Section 1 by considering the action for a particle moving in spacetime. In fact, we find that the condition (3.120) results in geodesics with affine parameterisation.

For the Levi-Civita connection, we have $\nabla_X g = 0$. This ensures that for any vector field $Y$ parallel transported along a geodesic $X$, so $\nabla_X Y = \nabla_X X = 0$, we have

$$\frac{d}{d\tau}\,g(X,Y) = 0$$

This tells us that the vector field Y makes the same angle with the tangent vector along each point of the geodesic.

3.3.2 Normal Coordinates

Geodesics lend themselves to the construction of a particularly useful coordinate system. On a Riemannian manifold, in the neighbourhood of a point $p\in M$, we can always find coordinates such that

$$g_{\mu\nu}(p) = \delta_{\mu\nu} \quad\text{and}\quad g_{\mu\nu,\rho}(p) = 0 \qquad (3.122)$$

The same holds for Lorentzian manifolds, now with $g_{\mu\nu}(p) = \eta_{\mu\nu}$. These are referred to as normal coordinates. Because the first derivative of the metric vanishes, normal coordinates have the property that, at the point $p$, the Christoffel symbols vanish: $\Gamma^\mu_{\nu\rho}(p) = 0$. Generally, away from $p$ we will have $\Gamma^\mu_{\nu\rho}\neq 0$. Note, however, that it is not generally possible to ensure that the second derivatives of the metric also vanish. This, in turn, means that it’s not possible to pick coordinates such that the Riemann tensor vanishes at a given point.

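There are a number of ways to demonstrate the existence of coordinates (3.122). The brute force way is to start with some metric $\tilde{g}_{\mu\nu}$ in coordinates $\tilde{x}^\mu$ and try to find a change of coordinates to $x^\mu(\tilde{x})$ which does the trick. In the new coordinates,

$$\frac{\partial\tilde{x}^\rho}{\partial x^\mu}\frac{\partial\tilde{x}^\sigma}{\partial x^\nu}\,\tilde{g}_{\rho\sigma} = g_{\mu\nu} \qquad (3.123)$$

We’ll take the point $p$ to be the origin in both sets of coordinates. Then we can Taylor expand

$$\tilde{x}^\rho = \left.\frac{\partial\tilde{x}^\rho}{\partial x^\mu}\right|_{x=0}x^\mu + \frac{1}{2}\left.\frac{\partial^2\tilde{x}^\rho}{\partial x^\mu\partial x^\nu}\right|_{x=0}x^\mu x^\nu + \ldots$$

We insert this into (3.123), together with a Taylor expansion of $\tilde{g}_{\rho\sigma}$, and try to solve the resulting equations, order by order, to find the coefficients $\partial\tilde{x}/\partial x$ and $\partial^2\tilde{x}/\partial x^2$ that do the job. For example, the first requirement is

$$\left.\frac{\partial\tilde{x}^\rho}{\partial x^\mu}\right|_{x=0}\left.\frac{\partial\tilde{x}^\sigma}{\partial x^\nu}\right|_{x=0}\tilde{g}_{\rho\sigma}(p) = \delta_{\mu\nu}$$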

Given any $\tilde{g}_{\rho\sigma}(p)$, it’s always possible to find $\partial\tilde{x}/\partial x$ so that this is satisfied. In fact, a little counting shows that there are many such choices. If $\dim M = n$, then there are $n^2$ independent coefficients in the matrix $\partial\tilde{x}/\partial x$. The equation above puts $\frac{1}{2}n(n+1)$ conditions on these. That still leaves $\frac{1}{2}n(n-1)$ parameters unaccounted for. But this is to be expected: this is precisely the dimension of the rotation group $SO(n)$ (or the Lorentz group $SO(1,n-1)$) that leaves the flat metric unchanged.

We can do a similar counting at the next order. There are $\frac{1}{2}n^2(n+1)$ independent elements in the coefficients $\partial^2\tilde{x}^\rho/\partial x^\mu\partial x^\nu$. This is exactly the same as the number of conditions in the requirement $g_{\mu\nu,\rho}(p) = 0$.

We can also see why we shouldn’t expect to set the second derivative of the metric to zero. Requiring $g_{\mu\nu,\rho\sigma} = 0$ imposes $\frac{1}{4}n^2(n+1)^2$ constraints. Meanwhile, the next term in the Taylor expansion is $\partial^3\tilde{x}^\rho/\partial x^\mu\partial x^\nu\partial x^\lambda$ which has $\frac{1}{6}n^2(n+1)(n+2)$ independent coefficients. We see that the numbers no longer match. This time we fall short, leaving

$$\frac{1}{4}n^2(n+1)^2 - \frac{1}{6}n^2(n+1)(n+2) = \frac{1}{12}n^2(n^2-1)$$

unaccounted for. This, therefore, is the number of ways to characterise the second derivative of the metric in a manner that cannot be undone by coordinate transformations. Indeed, it is not hard to show that this is precisely the number of independent coefficients in the Riemann tensor. (For $n=4$, there are 20 coefficients of the Riemann tensor.)
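The counting above is easy to automate. The sketch below uses the fact that a totally symmetric object with $k$ indices in $n$ dimensions has $\binom{n+k-1}{k}$ independent components; the helper names are our own.

```python
from math import comb

def sym_count(n, k):
    # Independent components of a totally symmetric object with k indices in n dimensions.
    return comb(n + k - 1, k)

def first_order(n):
    # Conditions g_{mu nu, rho}(p) = 0 versus coefficients d^2 x~^rho / dx^mu dx^nu.
    return n * sym_count(n, 2), n * sym_count(n, 2)

def second_order(n):
    # Conditions g_{mu nu, rho sigma} = 0 versus coefficients d^3 x~^rho / dx^mu dx^nu dx^lambda.
    constraints = sym_count(n, 2) ** 2
    coefficients = n * sym_count(n, 3)
    return constraints, coefficients, constraints - coefficients

def riemann_count(n):
    # n^2 (n^2 - 1) / 12, always an integer.
    return n * n * (n * n - 1) // 12

for n in range(1, 6):
    print(n, second_order(n), riemann_count(n))
```

For $n=4$ this gives 100 constraints, 80 coefficients, and a shortfall of 20.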

Figure 23: Start with a tangent vector, and follow the resulting geodesic to get the exponential map.

The Exponential Map

There is a rather pretty, direct way to construct the coordinates (3.122). This uses geodesics. The rough idea is that, given a tangent vector $X_p\in T_p(M)$, there is a unique affinely parameterised geodesic through $p$ with tangent vector $X_p$ at $p$. We then label any point $q$ in the neighbourhood of $p$ by the components of the tangent vector whose geodesic takes us to $q$ in some fixed amount of time. It’s like throwing a ball in all possible directions, and labelling points by the initial velocity needed for the ball to reach that point in, say, 1 second.

Let’s put some flesh on this. We introduce any coordinate system (not necessarily normal coordinates) $\tilde{x}^\mu$ in the neighbourhood of $p$. Then the geodesic we want solves the equation (3.121) subject to the requirements

$$\left.\frac{d\tilde{x}^\mu}{d\tau}\right|_{\tau=0} = \tilde{X}^\mu_p \quad\text{with}\quad \tilde{x}^\mu(\tau=0) = 0$$

There is a unique solution.

This observation means that we can define a map,

$$\mathrm{Exp}: T_p(M)\to M$$

Given $X_p\in T_p(M)$, construct the appropriate geodesic and then follow it for some affine distance which we take to be $\tau=1$. This gives a point $q\in M$. This is known as the exponential map and is illustrated in Figure 23.

There is no reason that the exponential map covers all of the manifold $M$. It could well be that there are points which cannot be reached from $p$ by geodesics. Moreover, it may be that there are tangent vectors $X_p$ for which the exponential map is ill-defined. In general relativity, this occurs if the spacetime has singularities. Neither of these issues is relevant for our current purpose.

Now pick a basis $\{e_\mu\}$ of $T_p(M)$. The exponential map means that a tangent vector $X_p = X^\mu e_\mu$ defines a point $q$ in the neighbourhood of $p$. We simply assign this point coordinates

$$x^\mu(q) = X^\mu$$

These are the normal coordinates.

If we pick the initial basis $\{e_\mu\}$ to be orthonormal, then the geodesics will point in orthogonal directions which ensures that the metric takes the form $g_{\mu\nu}(p) = \delta_{\mu\nu}$.

To see that the first derivative of the metric also vanishes, we first fix a point $q$ associated to a given tangent vector $X\in T_p(M)$. This tells us that the point $q$ sits a distance $\tau=1$ along the geodesic. We can now ask: what tangent vector will take us a different distance along this same geodesic? Because the geodesic equation (3.121) is homogeneous in $\tau$, if we halve the length of $X$ then we will travel only half the distance along the geodesic, i.e. to $\tau=1/2$. In general, the tangent vector $\tau X$ will take us a distance $\tau$ along the geodesic

$$\mathrm{Exp}: \tau X_p \mapsto x^\mu(\tau) = \tau X^\mu$$

This means that the geodesics in these coordinates take the particularly simple form

$$x^\mu(\tau) = \tau X^\mu$$

Since these are geodesics, they must solve the geodesic equation (3.121). But, for trajectories that vary linearly in time, this is just

$$\Gamma^\mu_{\rho\nu}(x(\tau))\,X^\rho X^\nu = 0$$

This holds at any point along the geodesic. At most points $x(\tau)$, this equation only holds for those choices of $X^\rho$ which take us along the geodesic in the first place. However, at $x(\tau)=0$, corresponding to the point $p$ of interest, this equation must hold for any tangent vector $X^\mu$. This means that $\Gamma^\mu_{(\rho\nu)}(p) = 0$ which, for a torsion free connection, ensures that $\Gamma^\mu_{\rho\nu}(p) = 0$.

Vanishing Christoffel symbols means that the derivative of the metric vanishes. This follows for the Levi-Civita connection by writing $2g_{\mu\sigma}\Gamma^\sigma_{\rho\nu} = g_{\mu\rho,\nu} + g_{\mu\nu,\rho} - g_{\rho\nu,\mu}$. Symmetrising over $(\mu\rho)$ means that the last two terms cancel, leaving us with $g_{\mu\rho,\nu} = 0$ when evaluated at $p$.

The Equivalence Principle

Normal coordinates play an important conceptual role in general relativity. Any observer at point p who parameterises her immediate surroundings using coordinates constructed by geodesics will experience a locally flat metric, in the sense of (3.122).

This is the mathematics underlying the Einstein equivalence principle. This principle states that any freely falling observer, performing local experiments, will not experience a gravitational field. Here “freely falling” means the observer follows geodesics, as we saw in Section 1, and will naturally use normal coordinates. In this context, the coordinates are called a local inertial frame. The lack of gravitational field is the statement that $g_{\mu\nu}(p) = \eta_{\mu\nu}$.

Key to understanding the meaning and limitations of the equivalence principle is the word “local”. There is a way to distinguish whether there is a gravitational field at p: we compute the Riemann tensor. This depends on the second derivative of the metric and, in general, will be non-vanishing. However, to measure the effects of the Riemann tensor, one typically has to compare the result of an experiment at p with an experiment at a nearby point q: this is considered a “non-local” observation as far as the equivalence principle goes. In the next two subsections, we give examples of physics that depends on the Riemann tensor.

3.3.3 Path Dependence: Curvature and Torsion

Take a tangent vector $Z_p\in T_p(M)$, and parallel transport it along a curve $C$ to some point $r\in M$. Now parallel transport it along a different curve $C'$ to the same point $r$. How do the resulting vectors differ?

To answer this, we construct each of our curves $C$ and $C'$ from two segments, generated by linearly independent vector fields, $X$ and $Y$ satisfying $[X,Y]=0$ as shown in Figure 24. To make life easy, we’ll take the point $r$ to be close to the original point $p$.

We pick normal coordinates $x^\mu = (\tau,\sigma,0,\ldots)$ so that the starting point is at $x^\mu(p)=0$ while the tangent vectors are aligned along the coordinates, $X=\partial/\partial\tau$ and $Y=\partial/\partial\sigma$. The other corner points are then $x^\mu(q) = (\delta\tau,0,0,\ldots)$, $x^\mu(r) = (\delta\tau,\delta\sigma,0,\ldots)$ and $x^\mu(s) = (0,\delta\sigma,0,\ldots)$ where $\delta\tau$ and $\delta\sigma$ are taken to be small. This set-up is shown in Figure 24.

Figure 24: Parallel transporting a vector Zp along two different paths does not give the same answer.

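First we parallel transport $Z_p$ along $X$ to $Z_q$. Along the curve, $Z^\mu$ solves (3.119)

$$\frac{dZ^\mu}{d\tau} + X^\nu\,\Gamma^\mu_{\rho\nu}\,Z^\rho = 0 \qquad (3.124)$$

We Taylor expand the solution as

$$Z^\mu_q = Z^\mu_p + \left.\frac{dZ^\mu}{d\tau}\right|_{\tau=0}\delta\tau + \frac{1}{2}\left.\frac{d^2Z^\mu}{d\tau^2}\right|_{\tau=0}\delta\tau^2 + \mathcal{O}(\delta\tau^3)$$

From (3.124), we have $dZ^\mu/d\tau|_0 = 0$ because, in normal coordinates, $\Gamma^\mu_{\rho\nu}(p)=0$. We can calculate the second derivative by differentiating (3.124) to find

$$\begin{aligned} \left.\frac{d^2Z^\mu}{d\tau^2}\right|_{\tau=0} &= -\left.\left(X^\nu Z^\rho\,\frac{d\Gamma^\mu_{\rho\nu}}{d\tau} + \frac{dX^\nu}{d\tau}\,Z^\rho\,\Gamma^\mu_{\rho\nu} + X^\nu\,\frac{dZ^\rho}{d\tau}\,\Gamma^\mu_{\rho\nu}\right)\right|_p \\ &= -\left.X^\nu Z^\rho\,\frac{d\Gamma^\mu_{\rho\nu}}{d\tau}\right|_p \\ &= -\left(X^\nu X^\sigma Z^\rho\,\Gamma^\mu_{\rho\nu,\sigma}\right)_p \end{aligned}$$

Here the second line follows because we’re working in normal coordinates at $p$, and the final line because $\tau$ is the parameter along the integral curve of $X$, so $d/d\tau = X^\sigma\partial_\sigma$. We therefore have

$$Z^\mu_q = Z^\mu_p - \frac{1}{2}\left(X^\nu X^\sigma Z^\rho\,\Gamma^\mu_{\rho\nu,\sigma}\right)_p\delta\tau^2 + \ldots \qquad (3.126)$$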

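Now we parallel transport once more, this time along $Y$ to $Z^\mu_r$. The Taylor expansion now takes the form

$$Z^\mu_r = Z^\mu_q + \left.\frac{dZ^\mu}{d\sigma}\right|_q\delta\sigma + \frac{1}{2}\left.\frac{d^2Z^\mu}{d\sigma^2}\right|_q\delta\sigma^2 + \mathcal{O}(\delta\sigma^3) \qquad (3.127)$$

We can again evaluate the first derivative $dZ^\mu/d\sigma|_q$ using the analog of the parallel transport equation (3.124),

$$\left.\frac{dZ^\mu}{d\sigma}\right|_q = -\left(Y^\nu Z^\rho\,\Gamma^\mu_{\rho\nu}\right)_q$$

Since we’re working in normal coordinates about $p$ and not $q$, we no longer get to argue that this term vanishes. Instead we Taylor expand about $p$ to get

$$\left(Y^\nu Z^\rho\,\Gamma^\mu_{\rho\nu}\right)_q = \left(Y^\nu Z^\rho X^\sigma\,\Gamma^\mu_{\rho\nu,\sigma}\right)_p\delta\tau + \ldots$$

Note that in principle we should also Taylor expand $Y^\nu$ and $Z^\rho$ but, at leading order, these will multiply $\Gamma^\mu_{\rho\nu}(p)=0$, so they only contribute at next order. The second order term in the Taylor expansion (3.127) involves $d^2Z^\mu/d\sigma^2|_q$, for which there is an expression similar to the one found above for $d^2Z^\mu/d\tau^2|_{\tau=0}$. To leading order the $dY^\nu/d\sigma$ and $dZ^\rho/d\sigma$ terms are again absent because they are multiplied by $\Gamma^\mu_{\rho\nu}(q) = d\Gamma^\mu_{\rho\nu}/d\tau|_p\,\delta\tau$. We therefore have

$$\begin{aligned} \left.\frac{d^2Z^\mu}{d\sigma^2}\right|_q &= -\left(Y^\nu Y^\sigma Z^\rho\,\Gamma^\mu_{\rho\nu,\sigma}\right)_q + \ldots \\ &= -\left(Y^\nu Y^\sigma Z^\rho\,\Gamma^\mu_{\rho\nu,\sigma}\right)_p + \ldots \end{aligned}$$

where we replaced the point $q$ with point $p$ because they differ only by subleading terms proportional to $\delta\tau$. The upshot is that this time the difference between $Z^\mu_r$ and $Z^\mu_q$ involves two terms,

$$Z^\mu_r = Z^\mu_q - \left(Y^\nu Z^\rho X^\sigma\,\Gamma^\mu_{\rho\nu,\sigma}\right)_p\delta\tau\,\delta\sigma - \frac{1}{2}\left(Y^\nu Y^\sigma Z^\rho\,\Gamma^\mu_{\rho\nu,\sigma}\right)_p\delta\sigma^2 + \ldots$$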

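Finally, we can relate $Z^\mu_q$ to $Z^\mu_p$ using the expression (3.126) that we derived previously. We end up with

$$Z^\mu_r = Z^\mu_p - \frac{1}{2}\left(\Gamma^\mu_{\rho\nu,\sigma}\right)_p\left[X^\nu X^\sigma Z^\rho\,\delta\tau^2 + 2\,Y^\nu Z^\rho X^\sigma\,\delta\sigma\,\delta\tau + Y^\nu Y^\sigma Z^\rho\,\delta\sigma^2\right]_p + \ldots$$

where $\ldots$ denotes any terms cubic or higher in small quantities.

Now suppose we go along the path $C'$, first visiting point $s$ and then making our way to $r$. We can read the answer off directly from the result above, simply by swapping $X$ and $Y$ and $\sigma$ and $\tau$; only the middle term changes,

$$Z'^\mu_r = Z^\mu_p - \frac{1}{2}\left(\Gamma^\mu_{\rho\nu,\sigma}\right)_p\left[X^\nu X^\sigma Z^\rho\,\delta\tau^2 + 2\,X^\nu Z^\rho Y^\sigma\,\delta\sigma\,\delta\tau + Y^\nu Y^\sigma Z^\rho\,\delta\sigma^2\right]_p + \ldots$$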

We find that

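$$\begin{aligned} \Delta Z^\mu_r = Z^\mu_r - Z'^\mu_r &= -\left(\Gamma^\mu_{\rho\nu,\sigma} - \Gamma^\mu_{\rho\sigma,\nu}\right)_p\left(Y^\nu Z^\rho X^\sigma\right)_p\delta\sigma\,\delta\tau + \ldots \\ &= \left(R^\mu{}_{\rho\sigma\nu}\,X^\nu Z^\rho Y^\sigma\right)_p\delta\sigma\,\delta\tau + \ldots \end{aligned}$$

where, in the final equality, we’ve used the expression for the Riemann tensor in components (3.107), which simplifies in normal coordinates as $\Gamma^\mu_{\rho\sigma}(p)=0$, together with the anti-symmetry of the Riemann tensor in its final two indices. Note that, to the order we’re working, we could equally well evaluate $R^\mu{}_{\rho\sigma\nu}\,X^\nu Z^\rho Y^\sigma$ at the point $r$; the two differ only by higher order terms.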

Although our calculation was performed with a particular choice of coordinates, the end result is written as an equality between tensors and must, therefore, hold in any coordinate system. This is a trick that we will use frequently throughout these lectures: calculations are considerably easier in normal coordinates. But if the resulting expression relates tensors then the final result must be true in any coordinate system.

We have discovered a rather nice interpretation of the Riemann tensor: it tells us the path dependence of parallel transport. The calculation above is closely related to the idea of holonomy. Here, one transports a vector around a closed curve C and asks how the resulting vector compares to the original. This too is captured by the Riemann tensor. A particularly simple example of non-trivial holonomy comes from parallel transport of a vector on a sphere: the direction that you end up pointing in depends on the path you take.

The Meaning of Torsion

We discarded torsion almost as soon as we met it, choosing to work with the Levi-Civita connection which has vanishing torsion, $\Gamma^\rho_{\mu\nu} = \Gamma^\rho_{\nu\mu}$. Moreover, as we will see in Section 4, torsion plays no role in the theory of general relativity which makes use of the Levi-Civita connection. Nonetheless, it is natural to ask: what is the geometric meaning of torsion? There is an answer to this that makes use of the kind of parallel transport arguments we used above.

Figure 25:

This time, we start with two vectors $X,Y\in T_p(M)$. We pick coordinates $x^\mu$ and write these vectors as $X = X^\mu\partial_\mu$ and $Y = Y^\mu\partial_\mu$. Starting from $p\in M$, we can use these two vectors to construct two points infinitesimally close to $p$. We call these points $r$ and $s$ respectively: they have coordinates

$$r:\ x^\mu + X^\mu\epsilon \quad\text{and}\quad s:\ x^\mu + Y^\mu\epsilon$$

where $\epsilon$ is some infinitesimal parameter.

We now parallel transport the vector $X\in T_p(M)$ along the direction of $Y$ to give a new vector $X'\in T_s(M)$. Similarly, we parallel transport $Y$ along the direction of $X$ to get a new vector $Y'\in T_r(M)$. These new vectors have components

$$X' = \left(X^\mu - \epsilon\,\Gamma^\mu_{\nu\rho}Y^\nu X^\rho\right)\partial_\mu \quad\text{and}\quad Y' = \left(Y^\mu - \epsilon\,\Gamma^\mu_{\nu\rho}X^\nu Y^\rho\right)\partial_\mu$$

Each of these tangent vectors now defines a new point. Starting from point $s$, and moving in the direction of $X'$, we see that we get a new point $q$ with coordinates

$$q:\ x^\mu + (X^\mu + Y^\mu)\epsilon - \epsilon^2\,\Gamma^\mu_{\nu\rho}Y^\nu X^\rho$$

Meanwhile, if we sit at point $r$ and move in the direction of $Y'$, we get to a typically different point, $t$, with coordinates

$$t:\ x^\mu + (X^\mu + Y^\mu)\epsilon - \epsilon^2\,\Gamma^\mu_{\nu\rho}X^\nu Y^\rho$$

We see that if the connection has torsion, so $\Gamma^\mu_{\nu\rho}\neq\Gamma^\mu_{\rho\nu}$, then the two points $q$ and $t$ do not coincide. In other words, torsion measures the failure of the parallelogram shown in Figure 25 to close.

3.3.4 Geodesic Deviation

Consider now a one-parameter family of geodesics, with coordinates $x^\mu(\tau;s)$. Here $\tau$ is the affine parameter along the geodesics, all of which are tangent to the vector field $X$ so that, along the surface spanned by $x^\mu(\tau;s)$, we have

$$X^\mu = \left.\frac{\partial x^\mu}{\partial\tau}\right|_s$$

Meanwhile, $s$ labels the different geodesics, as shown in Figure 26. We take the tangent vector in the $s$ direction to be generated by a second vector field $S$ so that

$$S^\mu = \left.\frac{\partial x^\mu}{\partial s}\right|_\tau$$

The tangent vector $S^\mu$ is sometimes called the deviation vector; it takes us from one geodesic to a nearby geodesic with the same affine parameter $\tau$.

The family of geodesics sweeps out a surface embedded in the manifold. This gives us some freedom in the way we assign coordinates $s$ and $\tau$. In fact, we can always pick coordinates $s$ and $t$ on the surface such that $S=\partial/\partial s$ and $X=\partial/\partial t$, ensuring that

$$[S,X]=0$$

Roughly speaking, we can do this if we use $\tau$ and $s$ as coordinates on some submanifold of $M$. Then the vector fields can be written simply as $X=\partial/\partial\tau$ and $S=\partial/\partial s$ and $[X,S]=0$.

Figure 26: The black lines are geodesics generated by X. The red lines label constant τ and are generated by S, with [X,S]=0.

We can ask how neighbouring geodesics behave. Do they converge? Or do they move further apart? Now consider a connection $\Gamma$ with vanishing torsion, so that $\nabla_X S - \nabla_S X = [X,S]$. Since $[X,S]=0$, we have

$$\nabla_X\nabla_X S = \nabla_X\nabla_S X = \nabla_S\nabla_X X + R(X,S)X$$

where, in the second equality, we’ve used the expression (3.106) for the Riemann tensor as a differential operator. But $\nabla_X X=0$ because $X$ is tangent to geodesics, and we have

$$\nabla_X\nabla_X S = R(X,S)X$$

In index notation, this is

$$X^\nu\nabla_\nu\left(X^\rho\nabla_\rho S^\mu\right) = R^\mu{}_{\nu\rho\sigma}\,X^\nu X^\rho S^\sigma$$

If we further restrict to an integral curve $C$ associated to the vector field $X$, as in (3.117), this equation is sometimes written as

$$\frac{D^2S^\mu}{D\tau^2} = R^\mu{}_{\nu\rho\sigma}\,X^\nu X^\rho S^\sigma \qquad (3.128)$$

where $D/D\tau$ is the covariant derivative along the curve $C$, defined by $D/D\tau = \frac{\partial x^\mu}{\partial\tau}\nabla_\mu$. The left-hand-side tells us how the deviation vector $S^\mu$ changes as we move along the geodesic. In other words, it is the relative acceleration of neighbouring geodesics. We learn that this relative acceleration is controlled by the Riemann tensor.

Experimentally, such geodesic deviations are called tidal forces. We met a simple example in Section 1.2.4.

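An Example: the Sphere $\mathbf{S}^2$ Again

It is simple to determine the geodesics on the sphere $\mathbf{S}^2$ of radius $r$. Using the Christoffel symbols (3.111), the geodesic equations are

$$\frac{d^2\theta}{d\tau^2} = \sin\theta\cos\theta\left(\frac{d\phi}{d\tau}\right)^2 \quad\text{and}\quad \frac{d^2\phi}{d\tau^2} = -2\,\frac{\cos\theta}{\sin\theta}\,\frac{d\phi}{d\tau}\frac{d\theta}{d\tau}$$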

The solutions are great circles. The general solution is a little awkward in these coordinates, but there are two simple solutions.

  • We can set $\theta=\pi/2$ with $\dot\theta=0$ and $\dot\phi=\text{constant}$. This is a solution in which the particle moves around the equator. Note that this solution doesn’t work for other values of $\theta$.

  • We can set $\dot\phi=0$ and $\dot\theta=\text{constant}$. These are paths of constant longitude and are geodesics for any constant value of $\phi$. Note, however, that our coordinates go a little screwy at the poles $\theta=0$ and $\theta=\pi$.
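The claim that every solution is a great circle can be checked numerically. The sketch below integrates the two geodesic equations above with a fourth-order Runge-Kutta step, embeds the unit sphere in $\mathbf{R}^3$, and monitors the vector $\mathbf{L} = \mathbf{x}\times d\mathbf{x}/d\tau$. A curve on the sphere is a great circle traversed at constant speed precisely when $\mathbf{L}$ is constant. The initial data and step size are arbitrary illustrative choices, kept away from the poles.

```python
import math

def deriv(state):
    # state = (theta, phi, dtheta/dtau, dphi/dtau); right-hand side of the geodesic equations.
    th, ph, dth, dph = state
    return (dth,
            dph,
            math.sin(th) * math.cos(th) * dph * dph,
            -2.0 * (math.cos(th) / math.sin(th)) * dph * dth)

def rk4_step(state, h):
    def shifted(k, c):
        return tuple(s + c * ki for s, ki in zip(state, k))
    k1 = deriv(state)
    k2 = deriv(shifted(k1, 0.5 * h))
    k3 = deriv(shifted(k2, 0.5 * h))
    k4 = deriv(shifted(k3, h))
    return tuple(s + h / 6.0 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def angular_momentum(state):
    # Embed the unit sphere in R^3 and form L = x cross dx/dtau.
    th, ph, dth, dph = state
    x = (math.sin(th) * math.cos(ph), math.sin(th) * math.sin(ph), math.cos(th))
    v = (math.cos(th) * math.cos(ph) * dth - math.sin(th) * math.sin(ph) * dph,
         math.cos(th) * math.sin(ph) * dth + math.sin(th) * math.cos(ph) * dph,
         -math.sin(th) * dth)
    return (x[1] * v[2] - x[2] * v[1],
            x[2] * v[0] - x[0] * v[2],
            x[0] * v[1] - x[1] * v[0])

state = (1.0, 0.0, 0.3, 0.7)   # hypothetical generic initial data
L_start = angular_momentum(state)
for _ in range(5000):
    state = rk4_step(state, 1e-3)
L_end = angular_momentum(state)
drift = max(abs(a - b) for a, b in zip(L_start, L_end))
print(drift)
```

The drift in $\mathbf{L}$ should be at the level of the integration error, confirming that the trajectory stays in a fixed plane through the origin.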

To illustrate geodesic deviation, we’ll look at the second class of solutions; the particle moves along $\theta = v\tau$, with the angle $\phi$ specifying the geodesic. This set-up is simple enough that we don’t need to use any fancy Riemann tensor techniques: we can just understand the geodesic deviation using simple geometry. The distance between the geodesic at $\phi=0$ and the geodesic at some other longitude $\phi$ is

$$s(\tau) = r\phi\sin\theta = r\phi\sin(v\tau) \qquad (3.129)$$

Now let’s re-derive this result using our fancy technology. The geodesics are generated by the vector field $X^\theta = v$. Meanwhile, the separation between geodesics at a fixed $\tau$ is $S^\phi = s(\tau)$. The geodesic deviation equation in the form (3.128) is

$$\frac{d^2s}{d\tau^2} = v^2\,R^\phi{}_{\theta\theta\phi}\,s(\tau)$$

We computed the Riemann tensor for $\mathbf{S}^2$ in (3.112); the relevant component is

$$R_{\phi\theta\theta\phi} = -r^2\sin^2\theta \quad\Rightarrow\quad R^\phi{}_{\theta\theta\phi} = g^{\phi\phi}R_{\phi\theta\theta\phi} = -1 \qquad (3.130)$$

and the geodesic deviation equation becomes simply

$$\frac{d^2s}{d\tau^2} = -v^2 s$$

which is indeed solved by (3.129).
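The separation (3.129) is the distance measured along the circle of constant latitude. For nearby meridians it should agree, to leading order in the longitude difference, with the true great-circle distance between the two points. The sketch below compares the two; the radius, speed and longitude offset are arbitrary illustrative values.

```python
import math

def great_circle_distance(r, th1, ph1, th2, ph2):
    # Exact geodesic distance on a sphere of radius r, from the spherical law of cosines.
    c = (math.cos(th1) * math.cos(th2)
         + math.sin(th1) * math.sin(th2) * math.cos(ph1 - ph2))
    return r * math.acos(max(-1.0, min(1.0, c)))

def deviation(r, phi, v, tau):
    # Leading-order separation between the meridians at longitudes 0 and phi.
    return r * phi * math.sin(v * tau)

r, phi, v = 2.0, 1e-3, 0.5   # hypothetical sample values
relative_errors = []
for tau in (0.5, 1.0, 2.0, 3.0):
    theta = v * tau
    exact = great_circle_distance(r, theta, 0.0, theta, phi)
    approx = deviation(r, phi, v, tau)
    relative_errors.append(abs(exact - approx) / approx)
print(relative_errors)
```

The relative differences are of order $\phi^2$, as expected for a statement about infinitesimally separated geodesics.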

3.4 More on the Riemann Tensor and its Friends

Recall that the components of the Riemann tensor are given by (3.107),

$$R^\sigma{}_{\rho\mu\nu} = \partial_\mu\Gamma^\sigma_{\nu\rho} - \partial_\nu\Gamma^\sigma_{\mu\rho} + \Gamma^\lambda_{\nu\rho}\Gamma^\sigma_{\mu\lambda} - \Gamma^\lambda_{\mu\rho}\Gamma^\sigma_{\nu\lambda} \qquad (3.131)$$

We can immediately see that the Riemann tensor is anti-symmetric in the final two indices

$$R^\sigma{}_{\rho\mu\nu} = -R^\sigma{}_{\rho\nu\mu}$$

However, there are also a number of more subtle symmetry properties satisfied by the Riemann tensor when we use the Levi-Civita connection. Logically, we could have discussed this back in Section 3.2. However, it turns out that a number of statements are substantially simpler to prove using normal coordinates introduced in Section 3.3.2.


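Claim: If we lower an index on the Riemann tensor, and write $R_{\sigma\rho\mu\nu} = g_{\sigma\lambda}R^\lambda{}_{\rho\mu\nu}$ then the resulting object also obeys the following identities

  • $R_{\sigma\rho\mu\nu} = -R_{\sigma\rho\nu\mu}$.

  • $R_{\sigma\rho\mu\nu} = -R_{\rho\sigma\mu\nu}$.

  • $R_{\sigma\rho\mu\nu} = R_{\mu\nu\sigma\rho}$.

  • $R_{\sigma[\rho\mu\nu]} = 0$.

Proof: We work in normal coordinates, with $\Gamma^\lambda_{\mu\nu}=0$ at a point. The Riemann tensor can then be written as

$$\begin{aligned} R_{\sigma\rho\mu\nu} &= g_{\sigma\lambda}\left(\partial_\mu\Gamma^\lambda_{\nu\rho} - \partial_\nu\Gamma^\lambda_{\mu\rho}\right) \\ &= \frac{1}{2}\Big(\partial_\mu\left(\partial_\nu g_{\sigma\rho} + \partial_\rho g_{\nu\sigma} - \partial_\sigma g_{\nu\rho}\right) - \partial_\nu\left(\partial_\mu g_{\sigma\rho} + \partial_\rho g_{\mu\sigma} - \partial_\sigma g_{\mu\rho}\right)\Big) \\ &= \frac{1}{2}\left(\partial_\mu\partial_\rho g_{\nu\sigma} - \partial_\mu\partial_\sigma g_{\nu\rho} - \partial_\nu\partial_\rho g_{\mu\sigma} + \partial_\nu\partial_\sigma g_{\mu\rho}\right) \end{aligned}$$

where, in going to the second line, we used the fact that $\partial_\mu g^{\lambda\sigma}=0$ in normal coordinates. The first three symmetries are manifest; the final one follows from a little playing. (It is perhaps quicker to see the final symmetry if we return to the Christoffel symbols where, in normal coordinates, we have $R^\sigma{}_{\rho\mu\nu} = \partial_\mu\Gamma^\sigma_{\rho\nu} - \partial_\nu\Gamma^\sigma_{\rho\mu}$.) But since the symmetry equations are tensor equations, they must hold in all coordinate systems.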
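The “little playing” can be delegated to a computer. The sketch below fills in random values for the second derivatives $\partial_\alpha\partial_\beta g_{\gamma\delta}$ at the point $p$, symmetric in each pair of indices, builds $R_{\sigma\rho\mu\nu}$ from the normal-coordinate formula in the proof, and measures how badly each of the four identities is violated. The random data stand in for a generic metric and are purely illustrative.

```python
import itertools
import random

n = 4
random.seed(0)

# Hypothetical second derivatives d_a d_b g_{cd} at p: symmetric in (a, b) and in (c, d).
d2g = {}
for a, b, c, d in itertools.product(range(n), repeat=4):
    key = (min(a, b), max(a, b), min(c, d), max(c, d))
    if key not in d2g:
        d2g[key] = random.uniform(-1.0, 1.0)

def D(a, b, c, d):
    # d_a d_b g_{cd}
    return d2g[(min(a, b), max(a, b), min(c, d), max(c, d))]

def riemann(s, r, m, v):
    # R_{s r m v} = (1/2)(d_m d_r g_{v s} - d_m d_s g_{v r} - d_v d_r g_{m s} + d_v d_s g_{m r})
    return 0.5 * (D(m, r, v, s) - D(m, s, v, r) - D(v, r, m, s) + D(v, s, m, r))

def max_violation():
    worst = 0.0
    for s, r, m, v in itertools.product(range(n), repeat=4):
        R = riemann(s, r, m, v)
        checks = (R + riemann(s, r, v, m),                       # anti-symmetry in last pair
                  R + riemann(r, s, m, v),                       # anti-symmetry in first pair
                  R - riemann(m, v, s, r),                       # symmetry under pair exchange
                  R + riemann(s, m, v, r) + riemann(s, v, r, m)) # cyclic identity
        worst = max(worst, max(abs(c) for c in checks))
    return worst

violation = max_violation()
print(violation)
```

The violation should vanish up to floating point rounding, whatever random values are drawn.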


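Claim: The Riemann tensor also obeys the Bianchi identity

$$\nabla_{[\lambda}R_{\sigma\rho]\mu\nu} = 0 \qquad (3.132)$$

Alternatively, we can anti-symmetrise on the final two indices, in which case this can be written as $R^\sigma{}_{\rho[\mu\nu;\lambda]} = 0$.


Proof: We again use normal coordinates, where $\nabla_\lambda R_{\sigma\rho\mu\nu} = \partial_\lambda R_{\sigma\rho\mu\nu}$ at the point $p$. Schematically, we have $R = \partial\Gamma + \Gamma\Gamma$, so $\partial R = \partial^2\Gamma + \Gamma\partial\Gamma$ and the final $\Gamma\partial\Gamma$ term is absent in normal coordinates. This means that we just have $\partial R = \partial^2\Gamma$ which, in its full coordinated glory, is

$$\partial_\lambda R_{\sigma\rho\mu\nu} = \frac{1}{2}\,\partial_\lambda\left(\partial_\mu\partial_\rho g_{\nu\sigma} - \partial_\mu\partial_\sigma g_{\nu\rho} - \partial_\nu\partial_\rho g_{\mu\sigma} + \partial_\nu\partial_\sigma g_{\mu\rho}\right)$$

Now anti-symmetrise on the three appropriate indices to get the result.

For completeness, we should mention that the identities $R_{\sigma[\rho\mu\nu]} = 0$ and $\nabla_{[\lambda}R_{\sigma\rho]\mu\nu} = 0$ (sometimes called the first and second Bianchi identities respectively) are more general, in the sense that they hold for an arbitrary torsion free connection. In contrast, the other two identities, $R_{\sigma\rho\mu\nu} = -R_{\rho\sigma\mu\nu}$ and $R_{\sigma\rho\mu\nu} = R_{\mu\nu\sigma\rho}$ hold only for the Levi-Civita connection.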

3.4.1 The Ricci and Einstein Tensors

There are a number of further tensors that we can build from the Riemann tensor.

First, given a rank (1,3) tensor, we can always construct a rank (0,2) tensor by contraction. If we start with the Riemann tensor, the resulting object is called the Ricci tensor. It is defined by

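$$R_{\mu\nu} = R^\rho{}_{\mu\rho\nu}$$

The Ricci tensor inherits its symmetry from the Riemann tensor. We write $R_{\mu\nu} = g^{\sigma\rho}R_{\sigma\mu\rho\nu} = g^{\rho\sigma}R_{\rho\nu\sigma\mu}$, giving us

$$R_{\mu\nu} = R_{\nu\mu}$$

We can go one step further and create a function $R$ over the manifold. This is the Ricci scalar,

$$R = g^{\mu\nu}R_{\mu\nu}$$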
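As a quick illustration of these contractions, the sketch below computes the Ricci tensor and Ricci scalar of the sphere $\mathbf{S}^2$ of radius $r$. It takes as input the single component $R_{\phi\theta\theta\phi} = -r^2\sin^2\theta$ quoted in (3.130), fills in the remaining components using the symmetries of the Riemann tensor, and assumes the round metric $g = r^2\,\mathrm{diag}(1,\sin^2\theta)$. The function name and sample point are our own.

```python
import math

def ricci_sphere(r, theta):
    # Coordinates (theta, phi) are labelled by indices (0, 1).
    s2 = math.sin(theta) ** 2
    g_inv = [[1.0 / (r * r), 0.0],
             [0.0, 1.0 / (r * r * s2)]]
    # Start from R_{phi theta theta phi} and use anti-symmetry in each index pair.
    base = -r * r * s2
    R = [[[[0.0] * 2 for _ in range(2)] for _ in range(2)] for _ in range(2)]
    R[1][0][0][1] = base
    R[1][0][1][0] = -base
    R[0][1][0][1] = -base
    R[0][1][1][0] = base
    # Ricci tensor: R_{mu nu} = g^{sigma rho} R_{sigma mu rho nu}
    ricci = [[sum(g_inv[s][p] * R[s][m][p][v] for s in range(2) for p in range(2))
              for v in range(2)] for m in range(2)]
    # Ricci scalar: R = g^{mu nu} R_{mu nu}
    scalar = sum(g_inv[m][v] * ricci[m][v] for m in range(2) for v in range(2))
    return ricci, scalar

ricci, scalar = ricci_sphere(2.0, 0.9)
print(ricci, scalar)
```

With these inputs one finds $R_{\theta\theta}=1$, $R_{\phi\phi}=\sin^2\theta$ and $R = 2/r^2$, independent of position as expected for a sphere.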

The Bianchi identity (3.132) has a nice implication for the Ricci tensor. If we write the Bianchi identity out in full, we have

λRσρμν+σRρλμν+ρRλσμν=0
×gμλgρν    μRμσ-σR+νRνσ=0

which means that

μRμν=12νR

This motivates us to introduce the Einstein tensor,

$$G_{\mu\nu}=R_{\mu\nu}-\frac{1}{2}Rg_{\mu\nu}$$

which has the property that it is covariantly conserved, meaning

$$\nabla^\mu G_{\mu\nu}=0 \tag{3.133}$$
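The check is a single line. It is sketched here using metric compatibility, $\nabla_\lambda g_{\mu\nu}=0$, together with the contracted Bianchi identity derived just before the Einstein tensor was introduced.

```latex
\nabla^\mu G_{\mu\nu}
  = \nabla^\mu R_{\mu\nu} - \frac{1}{2}\,g_{\mu\nu}\nabla^\mu R
  = \frac{1}{2}\nabla_\nu R - \frac{1}{2}\nabla_\nu R
  = 0
```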

We’ll be seeing much more of the Ricci and Einstein tensors in the next section.

3.4.2 Connection 1-forms and Curvature 2-forms

Calculating the components of the Riemann tensor is straightforward but extremely tedious. It turns out that there is a slightly different way of repackaging the connection and the torsion and curvature tensors using the language of forms. This not only provides a simple way to actually compute the Riemann tensor, but also offers some useful conceptual insight.

Vielbeins

Until now, we have typically worked with a coordinate basis $\{e_\mu\}=\{\partial_\mu\}$. However, we could always pick a basis of vector fields that has no such interpretation. For example, a linear combination of a coordinate basis, say

$$\hat{e}_a=e_a{}^\mu\partial_\mu$$

will not, in general, be a coordinate basis itself.

Given a metric, there is a non-coordinate basis that will prove particularly useful for computing the curvature tensor. This is the basis such that, on a Riemannian manifold,

$$g(\hat{e}_a,\hat{e}_b)=g_{\mu\nu}e_a{}^\mu e_b{}^\nu=\delta_{ab}$$

Alternatively, on a Lorentzian manifold we take

$$g(\hat{e}_a,\hat{e}_b)=g_{\mu\nu}e_a{}^\mu e_b{}^\nu=\eta_{ab} \tag{3.134}$$

The components $e_a{}^\mu$ are called vielbeins or tetrads. (On an $n$-dimensional manifold, these objects are usually called “German word for $n$”-beins. For example, one-dimensional manifolds have einbeins; four-dimensional manifolds have vierbeins.)

This is reminiscent of our discussion in Section 3.1.2 where we mentioned that we can always find coordinates so that any metric will look flat at a point. In (3.134), we’ve succeeded in making the manifold look flat everywhere (at least in a patch covered by a chart). There are no coordinates that do this, but there’s nothing to stop us picking a basis of vector fields that does the job. In what follows, $\mu,\nu$ indices are raised/lowered with the metric $g_{\mu\nu}$ while $a,b$ indices are raised/lowered with the flat metric $\delta_{ab}$ or $\eta_{ab}$. We will phrase our discussion in the context of Lorentzian manifolds, with an eye to later applications to general relativity.

The vielbeins aren’t unique. Given a set of vielbeins, we can always find another set related by

$$\tilde{e}_a{}^\mu=e_b{}^\mu(\Lambda^{-1})^b{}_a\quad\text{with}\quad\Lambda_a{}^c\Lambda_b{}^d\eta_{cd}=\eta_{ab} \tag{3.135}$$

These are Lorentz transformations. However, now they are local Lorentz transformations, because $\Lambda$ can vary over the manifold. These local Lorentz transformations are a redundancy in the definition of the vielbeins in (3.134).
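As a small numerical illustration of the condition $\Lambda_a{}^c\Lambda_b{}^d\eta_{cd}=\eta_{ab}$, the sketch below checks it for a boost mixing the 0 and 1 directions. The helper names `boost_01` and `preserves_eta` are hypothetical, and the boost is only one example of an allowed $\Lambda$; in the local setting the rapidity would be a function of position.

```python
import math

# Minkowski metric eta_{ab} = diag(-1, +1, +1, +1), as used in the text.
ETA = [[0.0] * 4 for _ in range(4)]
ETA[0][0] = -1.0
ETA[1][1] = ETA[2][2] = ETA[3][3] = 1.0

def boost_01(rapidity):
    """Return the matrix lam[a][c] = Lambda_a^c for a boost in the 0-1 plane."""
    ch, sh = math.cosh(rapidity), math.sinh(rapidity)
    lam = [[1.0 if a == c else 0.0 for c in range(4)] for a in range(4)]
    lam[0][0], lam[0][1] = ch, sh
    lam[1][0], lam[1][1] = sh, ch
    return lam

def preserves_eta(lam, tol=1e-9):
    """Check Lambda_a^c Lambda_b^d eta_{cd} = eta_{ab} entry by entry."""
    for a in range(4):
        for b in range(4):
            total = sum(lam[a][c] * lam[b][d] * ETA[c][d]
                        for c in range(4) for d in range(4))
            if abs(total - ETA[a][b]) > tol:
                return False
    return True

# A "local" transformation: the rapidity depends on the point (here on r).
ok_everywhere = all(preserves_eta(boost_01(0.3 * r)) for r in (1.0, 2.0, 5.0))
```

A matrix that merely rescales a basis vector, such as diag(2, 1, 1, 1), fails the same check, which is why such rescalings are not part of the redundancy.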

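The dual basis of one-forms $\{\hat{\theta}^a\}$ is defined by $\hat{\theta}^a(\hat{e}_b)=\delta^a_b$. They are related to the coordinate basis by

$$\hat{\theta}^a=e^a{}_\mu dx^\mu$$

Note the different placement of indices: $e^a{}_\mu$ is the inverse of $e_a{}^\mu$, meaning it satisfies $e^a{}_\mu e_b{}^\mu=\delta^a_b$ and $e^a{}_\mu e_a{}^\nu=\delta_\mu^\nu$. In the non-coordinate basis, the metric on a Lorentzian manifold takes the form

$$g=g_{\mu\nu}dx^\mu\otimes dx^\nu=\eta_{ab}\hat{\theta}^a\otimes\hat{\theta}^b\quad\Rightarrow\quad g_{\mu\nu}=e^a{}_\mu e^b{}_\nu\eta_{ab}$$

For Riemannian manifolds, we replace $\eta_{ab}$ with $\delta_{ab}$.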

The Connection One-Form

Given a non-coordinate basis $\{\hat{e}_a\}$, we can define the components of a connection in the usual way (3.100)

$$\nabla_{\hat{e}_c}\hat{e}_b=\Gamma^a_{cb}\hat{e}_a$$

Note that, annoyingly, these are not the same functions as $\Gamma^\mu_{\rho\nu}$, which are the components of the connection computed in the coordinate basis! You need to pay attention to whether the components are Greek $\mu,\nu$ etc., which tells you that we’re in the coordinate basis, or Roman $a,b$ etc., which tells you we’re in the vielbein basis.

We then define the matrix-valued connection one-form as

$$\omega^a{}_b=\Gamma^a_{cb}\hat{\theta}^c \tag{3.136}$$

This is sometimes referred to as the spin connection because of the role it plays in defining spinors on curved spacetime. We’ll describe this in Section 4.5.6.

The connection one-forms don’t transform covariantly under local Lorentz transformations (3.135). Instead, in the new basis, the components of the connection one-form are defined as $\nabla_{\hat{\tilde{e}}_b}\hat{\tilde{e}}_c=\tilde{\Gamma}^a_{bc}\hat{\tilde{e}}_a$. You can check that the connection one-form transforms as

$$\tilde{\omega}^a{}_b=\Lambda^a{}_c\,\omega^c{}_d\,(\Lambda^{-1})^d{}_b+\Lambda^a{}_c\,(d\Lambda^{-1})^c{}_b \tag{3.137}$$

The second term reflects the fact that the original connection components $\Gamma^\mu_{\nu\rho}$ do not transform as a tensor, but with an extra term involving the derivative of the coordinate transformation (3.105). This now shows up as an extra term involving the derivative of the local Lorentz transformation.

There is a rather simple way to compute the connection one-forms, at least for a torsion free connection. This follows from the first of two Cartan structure relations:


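Claim: For a torsion free connection,

$$d\hat{\theta}^a+\omega^a{}_b\wedge\hat{\theta}^b=0 \tag{3.138}$$

Proof: We first look at the second term,

$$\omega^a{}_b\wedge\hat{\theta}^b=\Gamma^a_{cb}\,(e^c{}_\mu dx^\mu)\wedge(e^b{}_\nu dx^\nu)$$

The components $\Gamma^a_{cb}$ are related to the coordinate basis components by

$$\Gamma^a_{cb}=e^a{}_\rho e_c{}^\mu\left(\partial_\mu e_b{}^\rho+e_b{}^\nu\Gamma^\rho_{\mu\nu}\right)=e^a{}_\rho e_c{}^\mu\nabla_\mu e_b{}^\rho \tag{3.139}$$

So

$$\begin{aligned}
\omega^a{}_b\wedge\hat{\theta}^b&=e^a{}_\rho e_c{}^\lambda e^c{}_\mu e^b{}_\nu\left(\partial_\lambda e_b{}^\rho+e_b{}^\sigma\Gamma^\rho_{\lambda\sigma}\right)dx^\mu\wedge dx^\nu\\
&=e^a{}_\rho e^b{}_\nu\,\partial_\mu e_b{}^\rho\,dx^\mu\wedge dx^\nu
\end{aligned}$$

where, in the second line, we’ve used $e_c{}^\lambda e^c{}_\mu=\delta^\lambda_\mu$ and the fact that the connection is torsion free so $\Gamma^\rho_{[\mu\nu]}=0$. Now we use the fact that $e_b{}^\rho e^b{}_\nu=\delta^\rho_\nu$, so $e^b{}_\nu\partial_\mu e_b{}^\rho=-e_b{}^\rho\partial_\mu e^b{}_\nu$. We have

$$\begin{aligned}
\omega^a{}_b\wedge\hat{\theta}^b&=-e^a{}_\rho e_b{}^\rho\,\partial_\mu e^b{}_\nu\,dx^\mu\wedge dx^\nu\\
&=-\partial_\mu e^a{}_\nu\,dx^\mu\wedge dx^\nu=-d\hat{\theta}^a
\end{aligned}$$

which completes the proof.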

The discussion above was for a general torsion free connection. For the Levi-Civita connection, we have a stronger result


Claim: For the Levi-Civita connection, the connection one-form is anti-symmetric

$$\omega_{ab}=-\omega_{ba} \tag{3.140}$$

Proof: This follows from the explicit expression (3.139) for the components $\Gamma^a_{bc}$. Lowering an index, we have

$$\Gamma_{abc}=\eta_{ad}e^d{}_\rho e_b{}^\mu\nabla_\mu e_c{}^\rho=-\eta_{ad}e_c{}^\rho e_b{}^\mu\nabla_\mu e^d{}_\rho=-\eta_{cf}e^f{}_\sigma e_b{}^\mu\nabla_\mu\left(\eta_{ad}g^{\rho\sigma}e^d{}_\rho\right)$$

where, in the final equality, we’ve used the fact that the connection is compatible with the metric to raise the indices of $e^d{}_\rho$ inside the covariant derivative. Finishing off the derivation, we then have

$$\Gamma_{abc}=-\eta_{cf}e^f{}_\rho e_b{}^\mu\nabla_\mu e_a{}^\rho=-\Gamma_{cba}$$

The result then follows from the definition $\omega_{ab}=\Gamma_{acb}\hat{\theta}^c$.

The Cartan structure equation (3.138), together with the anti-symmetry condition (3.140), gives a quick way to compute the spin connection. It’s instructive to do some counting to see how these two equations uniquely define $\omega_{ab}$. In particular, since $\omega_{ab}$ is anti-symmetric, one might think that it has $\frac{1}{2}n(n-1)$ independent components, and these can’t possibly be fixed by the $n$ Cartan structure equations (3.138). But this is missing the fact that $\omega_{ab}$ are not numbers, but are one-forms. So the true number of components in $\omega_{ab}$ is $n\times\frac{1}{2}n(n-1)$. Furthermore, the Cartan structure equation is an equation relating 2-forms, each of which has $\frac{1}{2}n(n-1)$ components. This means that it’s really $n\times\frac{1}{2}n(n-1)$ equations. We see that the counting does work, and the two fix the spin connection uniquely.
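The counting can be tabulated for a few dimensions. The function name `spin_connection_counting` below is hypothetical; the sketch only reproduces the arithmetic of the paragraph above and says nothing about whether the equations are independent.

```python
def spin_connection_counting(n):
    """Return (unknowns, equations) for the spin connection in n dimensions.

    unknowns:  n(n-1)/2 anti-symmetric pairs (a, b), each a one-form
               with n components.
    equations: n values of the index a in the structure equation, each a
               2-form equation with n(n-1)/2 components.
    """
    antisymmetric_pairs = n * (n - 1) // 2
    unknowns = antisymmetric_pairs * n
    two_form_components = n * (n - 1) // 2
    equations = n * two_form_components
    return unknowns, equations

# Tabulate the count for dimensions 2 to 5.
counts = {n: spin_connection_counting(n) for n in range(2, 6)}
```

In four dimensions this gives 24 unknown components matched by 24 equations.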

The Curvature Two-Form

We can compute the components of the Riemann tensor in our non-coordinate basis,

$$R^a{}_{bcd}=R(\hat{\theta}^a;\hat{e}_c,\hat{e}_d,\hat{e}_b)$$

The anti-symmetry of the last two indices, $R^a{}_{bcd}=-R^a{}_{bdc}$, makes this ripe for turning into a matrix of two-forms,

$$\mathcal{R}^a{}_b=\frac{1}{2}R^a{}_{bcd}\,\hat{\theta}^c\wedge\hat{\theta}^d \tag{3.141}$$

The second of the two Cartan structure relations states that this can be written in terms of the connection one-form as

$$\mathcal{R}^a{}_b=d\omega^a{}_b+\omega^a{}_c\wedge\omega^c{}_b \tag{3.142}$$

The proof of this is mechanical and somewhat tedious. It’s helpful to define the quantities $[\hat{e}_a,\hat{e}_b]=f_{ab}{}^c\hat{e}_c$ along the way, since they appear on both left and right-hand sides.

3.4.3 An Example: the Schwarzschild Metric

The connection one-form and curvature two-form provide a slick way to compute the curvature tensor associated to a metric. The reason for this is that computing exterior derivatives takes significantly less effort than computing covariant derivatives. We will illustrate this for metrics of the form

$$ds^2=-f(r)^2dt^2+f(r)^{-2}dr^2+r^2(d\theta^2+\sin^2\theta\,d\phi^2) \tag{3.143}$$

For later applications, it will prove useful to compute the Riemann tensor for this metric with general $f(r)$. However, if we want to restrict to the Schwarzschild metric we can take

$$f(r)=\sqrt{1-\frac{2GM}{r}} \tag{3.144}$$

The basis of non-coordinate one-forms is

$$\hat{\theta}^0=f\,dt,\quad\hat{\theta}^1=f^{-1}dr,\quad\hat{\theta}^2=r\,d\theta,\quad\hat{\theta}^3=r\sin\theta\,d\phi \tag{3.145}$$

Note that the one-forms $\hat{\theta}$ should not be confused with the angular coordinate $\theta$! In this basis, the metric takes the simple form

$$ds^2=\eta_{ab}\hat{\theta}^a\otimes\hat{\theta}^b$$
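A quick numerical sanity check that the one-forms (3.145) really do reproduce the metric (3.143) via $g_{\mu\nu}=e^a{}_\mu e^b{}_\nu\eta_{ab}$ can be sketched as follows. The function names are hypothetical, and $f$ is passed in as a plain number so that the check holds for any choice of $f(r)$ at the sample point.

```python
import math

# Minkowski metric eta_{ab} = diag(-1, +1, +1, +1).
ETA = [[0.0] * 4 for _ in range(4)]
ETA[0][0] = -1.0
ETA[1][1] = ETA[2][2] = ETA[3][3] = 1.0

def vielbein(r, theta, f):
    """Matrix e[a][mu] = e^a_mu read off from (3.145); coordinates (t, r, theta, phi)."""
    diag = [f, 1.0 / f, r, r * math.sin(theta)]
    return [[diag[a] if a == mu else 0.0 for mu in range(4)] for a in range(4)]

def metric_from_vielbein(e):
    """g_{mu nu} = e^a_mu e^b_nu eta_{ab}."""
    return [[sum(e[a][mu] * e[b][nu] * ETA[a][b]
                 for a in range(4) for b in range(4))
             for nu in range(4)] for mu in range(4)]

def metric_direct(r, theta, f):
    """Components of (3.143) written down directly."""
    diag = [-f ** 2, f ** -2, r ** 2, (r * math.sin(theta)) ** 2]
    return [[diag[mu] if mu == nu else 0.0 for nu in range(4)] for mu in range(4)]

# Compare the two at an arbitrary sample point with an arbitrary value of f.
g_viel = metric_from_vielbein(vielbein(3.0, 0.9, 0.6))
g_line = metric_direct(3.0, 0.9, 0.6)
agree = all(math.isclose(g_viel[m][n], g_line[m][n], abs_tol=1e-12)
            for m in range(4) for n in range(4))
```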

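We now compute $d\hat{\theta}^a$. Calculationally, this is straightforward. In particular, it’s substantially easier than computing the covariant derivative because there’s no messy connection to worry about. The exterior derivatives are simply

$$d\hat{\theta}^0=f'\,dr\wedge dt,\quad d\hat{\theta}^1=0,\quad d\hat{\theta}^2=dr\wedge d\theta,\quad d\hat{\theta}^3=\sin\theta\,dr\wedge d\phi+r\cos\theta\,d\theta\wedge d\phi$$

The first Cartan structure relation, $d\hat{\theta}^a=-\omega^a{}_b\wedge\hat{\theta}^b$, can then be used to read off the connection one-form. The first equation tells us that $\omega^0{}_1=f'f\,dt=f'\hat{\theta}^0$. We then use the anti-symmetry (3.140), together with raising and lowering by the Minkowski metric $\eta=\mathrm{diag}(-1,+1,+1,+1)$, to get $\omega^1{}_0=\omega_{10}=-\omega_{01}=\omega^0{}_1$. The Cartan structure equation then gives $d\hat{\theta}^1=-\omega^1{}_0\wedge\hat{\theta}^0+\ldots$ and the $\omega^1{}_0\wedge\hat{\theta}^0$ contribution happily vanishes because it is proportional to $\hat{\theta}^0\wedge\hat{\theta}^0=0$.

Next, we take $\omega^2{}_1=f\,d\theta=(f/r)\hat{\theta}^2$ to solve the $d\hat{\theta}^2$ structure equation. The anti-symmetry (3.140) gives $\omega^1{}_2=-\omega^2{}_1=-(f/r)\hat{\theta}^2$ and this again gives a vanishing contribution to the $d\hat{\theta}^1$ structure equation.

Finally, the $d\hat{\theta}^3$ equation suggests that we take $\omega^3{}_1=f\sin\theta\,d\phi=(f/r)\hat{\theta}^3$ and $\omega^3{}_2=\cos\theta\,d\phi=(1/r)\cot\theta\,\hat{\theta}^3$. The anti-symmetric partners $\omega^1{}_3=-\omega^3{}_1$ and $\omega^2{}_3=-\omega^3{}_2$ do nothing to spoil the $d\hat{\theta}^1$ and $d\hat{\theta}^2$ structure equations, so we’re home dry. The final result is

$$\begin{aligned}
\omega^0{}_1&=\omega^1{}_0=f'\hat{\theta}^0,&\qquad\omega^2{}_1&=-\omega^1{}_2=\frac{f}{r}\hat{\theta}^2\\
\omega^3{}_1&=-\omega^1{}_3=\frac{f}{r}\hat{\theta}^3,&\qquad\omega^3{}_2&=-\omega^2{}_3=\frac{\cot\theta}{r}\hat{\theta}^3
\end{aligned}$$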
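As a check on the last step, the $d\hat{\theta}^3$ structure equation can be verified explicitly. The sketch below substitutes the connection one-forms found above, written in the coordinate basis, and uses $\omega^3{}_0=0$.

```latex
\begin{aligned}
-\omega^3{}_b\wedge\hat{\theta}^b
  &= -\omega^3{}_1\wedge\hat{\theta}^1-\omega^3{}_2\wedge\hat{\theta}^2\\
  &= -\left(f\sin\theta\,d\phi\right)\wedge\left(f^{-1}dr\right)
     -\left(\cos\theta\,d\phi\right)\wedge\left(r\,d\theta\right)\\
  &= \sin\theta\,dr\wedge d\phi+r\cos\theta\,d\theta\wedge d\phi
   = d\hat{\theta}^3
\end{aligned}
```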

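Now we can use this to compute the curvature two-form. We will focus on

$$\mathcal{R}^0{}_1=d\omega^0{}_1+\omega^0{}_c\wedge\omega^c{}_1$$

We have

$$d\omega^0{}_1=f'\,d\hat{\theta}^0+f''\,dr\wedge\hat{\theta}^0=\left((f')^2+f''f\right)dr\wedge dt$$

The second term in the curvature 2-form is $\omega^0{}_c\wedge\omega^c{}_1=\omega^0{}_1\wedge\omega^1{}_1=0$. So we’re left with

$$\mathcal{R}^0{}_1=\left((f')^2+f''f\right)dr\wedge dt=\left((f')^2+f''f\right)\hat{\theta}^1\wedge\hat{\theta}^0$$

The other curvature 2-forms can be computed in a similar fashion. We can now read off the components of the Riemann tensor in the non-coordinate basis using (3.141). (We should remember that we get a contribution from both $R^0{}_{101}$ and $R^0{}_{110}=-R^0{}_{101}$, which cancels the factor of 1/2 in (3.141).) After lowering an index, we find that the non-vanishing components of the Riemann tensor are

$$\begin{aligned}
R_{0101}&=ff''+(f')^2\\
R_{0202}&=\frac{ff'}{r}\\
R_{0303}&=\frac{ff'}{r}\\
R_{1212}&=-\frac{ff'}{r}\\
R_{1313}&=-\frac{ff'}{r}\\
R_{2323}&=\frac{1-f^2}{r^2}
\end{aligned}$$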
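To illustrate how one of the other components arises, here is a sketch of the same computation for $\mathcal{R}^2{}_3$, using $\omega^2{}_3=-\cos\theta\,d\phi$, $\omega^2{}_1=f\,d\theta$ and $\omega^1{}_3=-f\sin\theta\,d\phi$ from the connection one-forms above. The only non-zero term in $\omega^2{}_c\wedge\omega^c{}_3$ is the one with $c=1$.

```latex
\begin{aligned}
\mathcal{R}^2{}_3
  &= d\omega^2{}_3+\omega^2{}_1\wedge\omega^1{}_3\\
  &= \sin\theta\,d\theta\wedge d\phi-f^2\sin\theta\,d\theta\wedge d\phi\\
  &= \frac{1-f^2}{r^2}\,\hat{\theta}^2\wedge\hat{\theta}^3
\end{aligned}
```

Comparing with (3.141) gives $R^2{}_{323}=(1-f^2)/r^2$, and lowering the first index with $\eta_{22}=+1$ reproduces the entry for $R_{2323}$ in the list.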

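We can also convert this back to the coordinates $x^\mu=(t,r,\theta,\phi)$ using

$$R_{\mu\nu\rho\sigma}=e^a{}_\mu e^b{}_\nu e^c{}_\rho e^d{}_\sigma R_{abcd}$$

This is particularly easy in this case because the matrices $e^a{}_\mu$ defining the one-forms (3.145) are diagonal. We then have

$$\begin{aligned}
R_{trtr}&=ff''+(f')^2\\
R_{t\theta t\theta}&=f^3f'r\\
R_{t\phi t\phi}&=f^3f'r\sin^2\theta\\
R_{r\theta r\theta}&=-\frac{f'r}{f}\\
R_{r\phi r\phi}&=-\frac{f'r}{f}\sin^2\theta\\
R_{\theta\phi\theta\phi}&=(1-f^2)r^2\sin^2\theta
\end{aligned} \tag{3.146}$$

Finally, if we want to specialise to the Schwarzschild metric with $f(r)$ given by (3.144), we have

$$\begin{aligned}
R_{trtr}&=-\frac{2GM}{r^3}\\
R_{t\theta t\theta}&=\frac{GM(r-2GM)}{r^2}\\
R_{t\phi t\phi}&=\frac{GM(r-2GM)}{r^2}\sin^2\theta\\
R_{r\theta r\theta}&=-\frac{GM}{r-2GM}\\
R_{r\phi r\phi}&=-\frac{GM\sin^2\theta}{r-2GM}\\
R_{\theta\phi\theta\phi}&=2GMr\sin^2\theta
\end{aligned}$$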
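The specialisation can be spot-checked numerically. The sketch below evaluates the general-$f$ expressions (3.146) with $f=\sqrt{1-2GM/r}$ and compares them with the closed forms just listed. The function and key names are hypothetical, and the derivatives of $f$ are obtained analytically from $(f^2)'=2GM/r^2$ and $(f^2)''=-4GM/r^3$.

```python
import math

def schwarzschild_f(r, GM):
    """Return f, f', f'' for f = sqrt(1 - 2GM/r), valid for r > 2GM."""
    f = math.sqrt(1.0 - 2.0 * GM / r)
    fp = GM / (r ** 2 * f)                        # from 2 f f' = 2GM/r^2
    fpp = -2.0 * GM / (r ** 3 * f) - fp ** 2 / f  # from 2 f'^2 + 2 f f'' = -4GM/r^3
    return f, fp, fpp

def riemann_general(r, theta, f, fp, fpp):
    """Coordinate-basis components (3.146) for general f."""
    s2 = math.sin(theta) ** 2
    return {
        "trtr": f * fpp + fp ** 2,
        "t_th_t_th": f ** 3 * fp * r,
        "t_ph_t_ph": f ** 3 * fp * r * s2,
        "r_th_r_th": -fp * r / f,
        "r_ph_r_ph": -fp * r * s2 / f,
        "th_ph_th_ph": (1.0 - f ** 2) * r ** 2 * s2,
    }

def riemann_schwarzschild(r, theta, GM):
    """Closed forms quoted in the text for the Schwarzschild metric."""
    s2 = math.sin(theta) ** 2
    return {
        "trtr": -2.0 * GM / r ** 3,
        "t_th_t_th": GM * (r - 2.0 * GM) / r ** 2,
        "t_ph_t_ph": GM * (r - 2.0 * GM) * s2 / r ** 2,
        "r_th_r_th": -GM / (r - 2.0 * GM),
        "r_ph_r_ph": -GM * s2 / (r - 2.0 * GM),
        "th_ph_th_ph": 2.0 * GM * r * s2,
    }

def components_agree(r, theta, GM):
    """Compare the two sets of components at one sample point."""
    general = riemann_general(r, theta, *schwarzschild_f(r, GM))
    closed = riemann_schwarzschild(r, theta, GM)
    return all(math.isclose(general[k], closed[k], rel_tol=1e-9) for k in closed)
```

Calling `components_agree(5.0, 1.1, 1.0)` compares the two at a single point outside $r=2GM$; it is a consistency check, not a derivation.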

Although the calculation is a little lengthy, it turns out to be considerably quicker than first computing the Levi-Civita connection and subsequently motoring through to get the Riemann tensor components.
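With the vielbein-basis components in hand, the Ricci tensor of Section 3.4.1 is a short contraction, $R_{bd}=\eta^{ac}R_{abcd}$. The sketch below forms the diagonal entries from the six components listed above; the off-diagonal entries vanish here because every non-zero component has the form $R_{abab}$. The function names are hypothetical. For the Schwarzschild choice of $f$ the result should vanish, since this metric solves the vacuum Einstein equations, a fact assumed here rather than derived in this section.

```python
import math

ETA_INV = [-1.0, 1.0, 1.0, 1.0]  # diagonal of eta^{ab}

def riemann_orthonormal(r, f, fp, fpp):
    """Components R_{abab} with a < b in the vielbein basis, as listed in the text."""
    return {
        (0, 1): f * fpp + fp ** 2,
        (0, 2): f * fp / r,
        (0, 3): f * fp / r,
        (1, 2): -f * fp / r,
        (1, 3): -f * fp / r,
        (2, 3): (1.0 - f ** 2) / r ** 2,
    }

def ricci_diagonal(riem):
    """R_{bb} = sum over a != b of eta^{aa} R_{abab}, using R_{abab} = R_{baba}."""
    return [sum(ETA_INV[a] * riem[(min(a, b), max(a, b))]
                for a in range(4) if a != b)
            for b in range(4)]

def schwarzschild_f(r, GM):
    """Return f, f', f'' for f = sqrt(1 - 2GM/r), valid for r > 2GM."""
    f = math.sqrt(1.0 - 2.0 * GM / r)
    fp = GM / (r ** 2 * f)
    fpp = -2.0 * GM / (r ** 3 * f) - fp ** 2 / f
    return f, fp, fpp

# Diagonal Ricci components at a sample radius outside r = 2GM.
ricci = ricci_diagonal(riemann_orthonormal(5.0, *schwarzschild_f(5.0, 1.0)))
```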

3.4.4 The Relation to Yang-Mills Theory

It is no secret that the force of gravity is geometrical. However, the other forces are equally geometrical. The underlying geometry is that of a fibre bundle, rather than the geometry of spacetime.

We won’t describe fibre bundles in this course, but we can exhibit a clear similarity between the structures that arise in general relativity and the structures that arise in the other forces, which are described by Maxwell theory and its generalisation to Yang-Mills theory.

Yang-Mills theory is based on a Lie group $G$ which, for this discussion, we will take to be $SU(N)$ or $U(N)$. If we take $G=U(1)$, then Yang-Mills theory reduces to Maxwell theory. The theory is described in terms of an object that physicists call a gauge potential. This is a spacetime “vector” $A_\mu$ which lives in the Lie algebra of $G$. In more down to earth terms, each component is an anti-Hermitian $N\times N$ matrix, $(A_\mu)^a{}_b$, with $a,b=1,\ldots,N$. In fact, as we saw above, this “vector” is really a one-form. The novelty is that it’s a Lie algebra-valued one-form.

Mathematicians don’t refer to $A_\mu$ as a gauge potential. Instead, they call it a connection (on a fibre bundle). This relationship becomes clearer if we look at how $A_\mu$ changes under a gauge transformation

$$\tilde{A}_\mu=\Omega A_\mu\Omega^{-1}+\Omega\,\partial_\mu\Omega^{-1}$$

where $\Omega(x)\in G$. This is identical to the transformation property (3.137) of the connection one-form under local Lorentz transformations.

In Yang-Mills, as in Maxwell theory, we construct a field strength. In components, this is given by

$$(F_{\mu\nu})^a{}_b=\partial_\mu(A_\nu)^a{}_b-\partial_\nu(A_\mu)^a{}_b+[A_\mu,A_\nu]^a{}_b$$

Alternatively, in the language of forms, the field strength becomes

$$F^a{}_b=dA^a{}_b+A^a{}_c\wedge A^c{}_b$$

Again, there is an obvious similarity with the curvature 2-form introduced in (3.142). Mathematicians refer to the Yang-Mills field strength as the “curvature”.

A particularly quick way to construct the Yang-Mills field strength is to take the commutator of two covariant derivatives. It is simple to check that

$$[\mathcal{D}_\mu,\mathcal{D}_\nu]=F_{\mu\nu}$$

where I’ve suppressed the $a,b$ indices on both sides. This is the gauge theory version of the Ricci identity (3.108): for a torsion free connection,

$$[\nabla_\mu,\nabla_\nu]Z^\sigma=R^\sigma{}_{\rho\mu\nu}Z^\rho$$