General Relativity and the Principle of Inertia

In his appraisals of the theory of relativity Einstein often claimed that the general theory remedied an inherent defect in both Newtonian mechanics and special relativity, namely, the reliance on the principle of inertia, and the corresponding “inertial coordinate systems” for rendering intelligible the laws of physics. Of course, it’s always possible to analyze physical processes in terms of arbitrary coordinate systems, but Newtonian mechanics and special relativity seem to give a preferred status to inertial coordinate systems, which are distinguished by the fact that the laws of physics can be expressed in particularly simple form in terms of such coordinates. This is ultimately because the fundamental notions of spatial distance and temporal duration underlying those physical laws are essentially defined in terms of inertial coordinate systems. It’s possible to take a description of phenomena in terms of inertial coordinates and translate it into a description in terms of any arbitrary coordinate system (given the mapping between the two systems), and it’s even possible to express the laws of physics in terms of this arbitrary coordinate system, but the translated laws will usually be very complicated, containing many terms that we would regard as artifacts of the chosen coordinate system. (A well-known example is the “Coriolis acceleration” terms that appear when Newtonian mechanics is expressed in non-inertial coordinates.) In contrast, Einstein says, the laws of physics (mechanics and gravitation) in the general theory of relativity do not give a preferred status to any class of coordinate systems, so (he believed) the general theory finally frees us from the unjustified reliance on the principle of inertia and its “distinguished” systems of coordinates.
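The appearance of such coordinate artifacts can be checked numerically. The following is only a minimal sketch, assuming a planar frame rotating at a constant rate: a force-free particle moves uniformly in inertial coordinates, yet in the rotating coordinates it shows a non-zero acceleration that is accounted for entirely by the Coriolis and centrifugal terms. The names and numerical values (OMEGA, Z0, V0, the step size h) are illustrative assumptions, and positions are encoded as complex numbers x + iy purely for brevity.

```python
import cmath

# Illustrative values only; none of these come from the text.
OMEGA = 0.7               # assumed constant angular velocity of the rotating frame
Z0 = complex(1.0, 2.0)    # assumed initial position in the inertial frame (x + iy)
V0 = complex(0.3, -0.5)   # assumed constant velocity in the inertial frame


def rotating_position(t):
    """Position z' = x' + iy' of the free particle in the rotating coordinates."""
    z_inertial = Z0 + V0 * t          # uniform straight-line motion, no forces
    return cmath.exp(-1j * OMEGA * t) * z_inertial


def rotating_velocity(t, h=1e-4):
    """Central-difference estimate of dz'/dt."""
    return (rotating_position(t + h) - rotating_position(t - h)) / (2 * h)


def rotating_acceleration(t, h=1e-4):
    """Central-difference estimate of the second derivative of z'."""
    return (rotating_position(t + h) - 2 * rotating_position(t)
            + rotating_position(t - h)) / h ** 2


def fictitious_acceleration(t):
    """Coriolis term (-2i*OMEGA*v') plus centrifugal term (OMEGA^2 * z')."""
    return (-2j * OMEGA * rotating_velocity(t)
            + OMEGA ** 2 * rotating_position(t))


def residual(t):
    """Size of the acceleration not explained by the coordinate terms."""
    return abs(rotating_acceleration(t) - fictitious_acceleration(t))


if __name__ == "__main__":
    for t in (0.0, 1.0, 2.5):
        print(t, abs(rotating_acceleration(t)), residual(t))
```

The residual is zero up to finite-difference error, so every bit of the apparent acceleration in the rotating description comes from the choice of coordinates rather than from any physical force.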

However, the field equations of general relativity are actually not applicable to arbitrary coordinate systems; they are applicable “only” to the members of one specific class of coordinate systems that are all diffeomorphically equivalent to each other. This is admittedly a very large class – much larger than (for example) the class of coordinate systems related by linear transformations – but it’s still just an infinitesimally small fraction of all possible coordinate systems. In order for the field equations to have any applicability we must first select at least one member of the appropriate equivalence class, and general relativity does not tell us how to make this selection. We must simply rely on an intuitively plausible choice, based on our pre-existing notions of the topological arrangement of events and our identification of persistent entities through time. This amounts to a reliance, once again, on the principle of inertia (albeit in a weaker sense) as the organizing principle for interpreting our raw experiences and rendering them nominally intelligible.

It might be argued that reducing our reliance on the principle of inertia to this weaker topological form is, in itself, a notable achievement, and that perhaps this is all Einstein intended to claim. However, general relativity actually relies on more than just the topology of inertia; it relies on the metrical attributes of inertia just as much as do Newtonian mechanics and special relativity, the only difference being that in general relativity the inertial geodesics imply a curved manifold, whereas the inertial geodesics in the earlier theories were consistent with a globally flat manifold. Nevertheless, general relativity assumes each small region of space and time asymptotically approaches “flatness”, and in these infinitesimal regions the metrical properties are those of special relativity, which we’ve already seen are based epistemologically on the principle of inertia. The overall metrical properties in general relativity are simply pieced together (in a possibly curved pattern) from these individually flat regions. The fact that these regions may be assembled in a curved pattern admittedly introduces a new degree of freedom to the metrical properties of the manifold (with the additional freedom constrained by the field equations), but it doesn’t change the fact that the elementary quantifiable metrical attributes of the manifold are still based on the principle of inertia, just as much as they are in special relativity.

Given that general relativity still relies on inertial geodesics as the basis of its applicability, we might ask why the field equations of general relativity appear to be uniformly applicable to arbitrary systems of coordinates (within the appropriate diffeomorphism class), regardless of whether they are locally inertial or not. Should we not expect to find that coordinate systems based on geodesics are “distinguished” in the sense that the laws take a particularly simple form when expressed in terms of such (locally) inertial “free-falling” coordinate systems? Indeed we should, and we do, but this fact is obscured by tensor notation. This should not be surprising, because the tensor formalism was designed explicitly for the purpose of systematizing the effects of general coordinate transformations (up to diffeomorphism) so they are automatically handled, allowing us to operate on a “higher” level of abstraction. In a sense, the tensor formalism is a highly efficient system of abbreviations and “shorthand” notations.

This is best illustrated by an example. Consider the fully covariant Riemann curvature tensor, denoted as R_abcd. At the origin of Riemann normal coordinates, which are defined based on the geodesics passing through the origin, the curvature tensor can be expressed in terms of partial derivatives of the metric coefficients as

$$R_{abcd} = g_{ad,bc} - g_{ac,bd} \tag{1}$$

where subscripts following commas signify partial differentiation with respect to the coordinates with those indices. The simplicity of this equation is due to the special symmetries that apply at the origin of Riemann normal coordinates, which are basically projected from a “tangent space” at that point, so the first derivatives of the metric all vanish (i.e., the Christoffel symbols are all zero), and the metric satisfies the special symmetry g_ab,cd = g_cd,ab. Without this last special symmetry, but still using coordinates such that the first metric derivatives vanish, the curvature tensor must be expressed in the more general form

$$R_{abcd} = \frac{1}{2}\left( g_{ad,bc} + g_{bc,ad} - g_{ac,bd} - g_{bd,ac} \right) \tag{2}$$

But this is still based on very specialized coordinates aligned with and asymptotic to the geodesics of the manifold passing through that point, and such that all the Christoffel symbols vanish. Thus it applies only to “freely-falling” locally inertial coordinate systems. The completely general form of the Riemann curvature tensor, applicable to arbitrary curvilinear coordinates (related to the inertial geodesics by a diffeomorphism), is actually

$$R_{abcd} = \frac{1}{2}\left( g_{ad,bc} + g_{bc,ad} - g_{ac,bd} - g_{bd,ac} \right) + g^{\mu\nu}\left( \Gamma_{\mu bc}\,\Gamma_{\nu ad} - \Gamma_{\mu bd}\,\Gamma_{\nu ac} \right) \tag{3}$$

where the Γ are Christoffel symbols of the first kind, defined as

$$\Gamma_{abc} = \frac{1}{2}\left( g_{ab,c} + g_{ac,b} - g_{bc,a} \right) \tag{4}$$

If we substitute these expressions for the Christoffel symbols into the general expression (3) for the curvature tensor, it obviously gives a very elaborate formula, hardly recognizable as representing essentially nothing but equation (1). All the additional machinery in (3) combined with (4) is solely for the purpose of automatically compensating for arbitrary diffeomorphic transformations of the coordinates. The Einstein tensor appearing in the field equations of general relativity is defined in terms of contractions of the Riemann curvature tensor, so the same degree of complication arises when we consider general curvilinear coordinates versus normalized geodesic (i.e., inertial) coordinates.
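To get a feel for how the general coordinate expression works, it can be evaluated numerically. The following is only a sketch under stated assumptions: it takes the general formula in the common form R_abcd = (1/2)(g_ad,bc + g_bc,ad − g_ac,bd − g_bd,ac) + g^mn (Γ_mbc Γ_nad − Γ_mbd Γ_nac), with Γ_abc = (1/2)(g_ab,c + g_ac,b − g_bc,a), which is one standard sign and index convention and may differ from the one intended here. The unit 2-sphere in (θ, φ) coordinates, the finite-difference step, and all function names are illustrative choices.

```python
import math


def sphere_metric(x):
    """Metric of the unit 2-sphere in (theta, phi) coordinates (assumed example)."""
    theta, _phi = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]


def shifted(x, steps):
    """Copy of point x with each (index, delta) in steps applied."""
    y = list(x)
    for i, delta in steps:
        y[i] += delta
    return y


def d1(metric, x, a, b, c, h=1e-4):
    """Central-difference estimate of the first derivative g_ab,c."""
    plus = metric(shifted(x, [(c, h)]))[a][b]
    minus = metric(shifted(x, [(c, -h)]))[a][b]
    return (plus - minus) / (2 * h)


def d2(metric, x, a, b, c, d, h=1e-4):
    """Central-difference estimate of the second derivative g_ab,cd."""
    total = 0.0
    for sc in (1, -1):
        for sd in (1, -1):
            point = shifted(x, [(c, sc * h), (d, sd * h)])
            total += sc * sd * metric(point)[a][b]
    return total / (4 * h * h)


def christoffel_first(metric, x, a, b, c):
    """Gamma_abc = (g_ab,c + g_ac,b - g_bc,a) / 2 (first kind)."""
    return 0.5 * (d1(metric, x, a, b, c) + d1(metric, x, a, c, b)
                  - d1(metric, x, b, c, a))


def inverse_2x2(g):
    """Inverse of a 2x2 matrix; this sketch is limited to two dimensions."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]


def riemann_parts(metric, x, a, b, c, d):
    """Return (second-derivative part, Christoffel-product part) of R_abcd."""
    second = 0.5 * (d2(metric, x, a, d, b, c) + d2(metric, x, b, c, a, d)
                    - d2(metric, x, a, c, b, d) - d2(metric, x, b, d, a, c))
    ginv = inverse_2x2(metric(x))
    quad = 0.0
    for m in range(2):
        for n in range(2):
            quad += ginv[m][n] * (
                christoffel_first(metric, x, m, b, c)
                * christoffel_first(metric, x, n, a, d)
                - christoffel_first(metric, x, m, b, d)
                * christoffel_first(metric, x, n, a, c))
    return second, quad


def riemann(metric, x, a, b, c, d):
    """Full component R_abcd as the sum of both parts."""
    second, quad = riemann_parts(metric, x, a, b, c, d)
    return second + quad


if __name__ == "__main__":
    for theta in (1.0, math.pi / 2):
        parts = riemann_parts(sphere_metric, [theta, 0.5], 0, 1, 0, 1)
        print(theta, parts, sum(parts), math.sin(theta) ** 2)
```

For the unit sphere the component with indices θφθφ should equal sin²θ. At a generic latitude both parts contribute, whereas on the equator (θ = π/2) the first derivatives of the metric vanish, so the Christoffel products drop out and the second-derivative part alone gives the curvature, which is the situation described by the simpler expression valid when the first metric derivatives vanish.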

There is a huge difference in complexity (and intelligibility) between equation (1) and the combination of equations (3) and (4), and yet they can both be represented by the symbol R_abcd. By expressing the field equations in tensor notation, we do not change the fact that there are highly distinguished systems of coordinates, we simply disguise this fact by defining the symbols to include whatever terms are necessary to compensate for general diffeomorphic transformations of the coordinates. Similarly we can express the laws of Newtonian mechanics in a way that applies to general coordinate systems if we simply define symbols to represent not only the “actual” accelerations but also whatever additional terms (e.g., Coriolis) may arise due to the use of non-geodesic (i.e., non-inertial) coordinates. Tensor calculus was invented specifically to systematize this process of compensating for the effects of diffeomorphic transformations.

It’s well known that Einstein’s emphasis on the concept of “general covariance” as a heuristic guide is more subtle – and more problematic – than it appears at first glance, because essentially any set of physical laws (including the laws of Newtonian mechanics and the laws of special relativity) can be expressed in generally covariant form. Hence “general covariance” can serve as a guide only if combined with some other concept, such as the notion that physical laws ought to have particularly simple representations when expressed in generally covariant form. However, even this is too broad to be useful, because we could devise many different generally covariant formalisms, of which the tensor formalism is only one, and each of these would have its own set of “simply expressible” laws. (Compare this with the impossibility of proving the absolute complexity of a string of binary digits.) Einstein eventually settled on the idea that the tensor formalism in particular is somehow the preferred formalism of nature, so we should seek laws that can be simply expressed as tensor equations. This was the position taken by Einstein in 1918 when he wrote (responding to Kretschmann’s critique of his use of the principle of general covariance):

Although it is true that one can put every empirical law in a generally covariant form, yet the principle [of general covariance] possesses great heuristic power… Of two theoretical systems, both of which are in agreement with experience, that one is to be preferred which, from the point of view of the absolute differential calculus is the simpler and more transparent.

Fair enough (although if two laws are equally in agreement with experience, the significance of our “preference” is debatable), but this is quite distinct from the question of whether there exists a distinguished class of coordinate systems for general relativity – distinguished in the sense that the laws take on a particularly simple form when expressed in terms of such coordinates. Clearly the Equivalence Principle leads us to expect such distinguished systems of coordinates to exist, because the principle asserts that the spacetime manifold is asymptotically Minkowskian at each point. Indeed this is confirmed by noting the drastic simplification of the explicit field equations at the origin of Riemann normal coordinates as compared with general coordinates. This shows that the principle of inertia continues to play a crucial role. In this regard it’s worth remembering that although we can formally express the laws of general relativity as a tensor equation without any explicit choice of coordinate system, it is nevertheless true that the physical implications of those laws for measurable quantities ultimately rely on our ability to choose physically meaningful coordinates. We never observe abstract tensors; we observe the individual components of tensors with respect to some explicit basis.

Einstein’s claim that general relativity dispenses with the principle of inertia might seem to be supported by the fact that, uniquely among field theories, the equations of motion in general relativity need not be postulated separately from the field equations. In other words, we don’t need to assume what amounts to the principle of inertia, i.e., that material particles follow geodesic paths in spacetime. The field equations themselves imply that free particles follow geodesic paths. This can easily be overlooked when dealing only with test particles of negligible mass moving in a vacuum, because we typically don’t treat such objects as sources; we just apply the vacuum field equations R_mn = 0, augmented with the assumption that the test particles follow geodesic paths. But the need to make this separate assumption arises only because we have neglected the role of the particles as sources. If we treat the particles correctly as contributors to the source term, we must apply the full field equations, and those equations do indeed imply that the particles must follow geodesic paths (in the absence of non-gravitational forces).

However, it’s somewhat misleading to say that the equations of motion emerge from the field equations without having been imposed as a separate assumption. They follow as a direct consequence of the fact that particles follow “straight and uniform” inertial paths in each infinitesimal region of spacetime, and this in turn is a direct consequence of the local conservation of energy-momentum. It’s true that the field equations of general relativity imply this conservation, as can be seen by the vanishing of the covariant divergence of the Einstein tensor

$$\nabla_{\mu} G^{\mu\nu} = \nabla_{\mu}\left( R^{\mu\nu} - \frac{1}{2}\, g^{\mu\nu} R \right) = 0$$

The field equations simply equate this to the energy-momentum tensor T_μν, so the covariant divergence of the latter must also vanish, hence energy-momentum is locally conserved, hence particles follow geodesics. But the local conservation of mass-energy didn’t arise automatically; it was specifically designed into the field equations by the inclusion of the “trace” term (the term with coefficient 1/2) in the Einstein tensor. In fact, one of the early attempts of Einstein and Grossmann to formulate generally relativistic field equations led to the result R_μν = T_μν, but not surprisingly this is unsatisfactory, precisely because the covariant divergence does not vanish. After a great deal of searching, Einstein finally realized that the natural conservation laws – and hence the law of inertia – are recovered if we include the trace term. David Hilbert arrived at this same conclusion almost simultaneously (in November 1915), although his route was much more direct, since he proceeded from a Lagrangian, which automatically leads to conservation laws. In view of this, it’s surely disingenuous to claim that the field equations of general relativity allow us to dispense with the independent assumption of equations of motion; the equations of motion essentially were imposed as a separate assumption, corresponding to the assumed conservation of energy-momentum that was intentionally built into the equations by the inclusion of the trace term.
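The chain of implications here can be written out in a few lines. This is only a sketch under stated assumptions: the contracted Bianchi identity is quoted rather than derived, the constant of proportionality in the field equations is written as κ, and matter is idealized as pressureless dust with density ρ and four-velocity u^μ (none of these symbols or idealizations appear in the text above).

```latex
% Contracted Bianchi identity (quoted, not derived here):
\nabla_{\mu} R^{\mu\nu} = \tfrac{1}{2}\,\nabla^{\nu} R
% Hence the Ricci tensor alone has a non-vanishing divergence in general,
% while the combination including the trace term is divergence-free:
\nabla_{\mu}\!\left( R^{\mu\nu} - \tfrac{1}{2}\, g^{\mu\nu} R \right)
  = \tfrac{1}{2}\,\nabla^{\nu} R - \tfrac{1}{2}\,\nabla^{\nu} R = 0
% Field equations (kappa is an assumed coupling constant):
G^{\mu\nu} = \kappa\, T^{\mu\nu}
  \quad\Longrightarrow\quad \nabla_{\mu} T^{\mu\nu} = 0
% Assumed dust model: T^{\mu\nu} = \rho\, u^{\mu} u^{\nu},
% with u_{\nu} u^{\nu} constant, so that u_{\nu} \nabla_{\mu} u^{\nu} = 0.
\nabla_{\mu}\!\left( \rho\, u^{\mu} u^{\nu} \right)
  = u^{\nu}\, \nabla_{\mu}\!\left( \rho\, u^{\mu} \right)
  + \rho\, u^{\mu} \nabla_{\mu} u^{\nu} = 0
% Contracting with u_{\nu} removes the second term and leaves
\nabla_{\mu}\!\left( \rho\, u^{\mu} \right) = 0
% and substituting this back gives the geodesic equation:
u^{\mu} \nabla_{\mu} u^{\nu} = 0
```

Written this way, it is clear that the geodesic motion of the dust is a consequence of the vanishing divergence, which in turn depends entirely on the presence of the trace term.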

In another note we commented on the remarkable parallel between Einstein’s addition of this term and Maxwell’s addition of the “displacement current” to Ampère’s law. The latter is necessary for Maxwell’s equations to be consistent with the conservation of charge, just as the trace term in Einstein’s field equations is necessary for the equations to be consistent with conservation of mass-energy. The two cases differ with regard to the equations of motion only because the conservation of electric charge does not, by itself, constrain the motion of a massive particle. It’s possible for the inertial mass of an electrically charged particle to be accelerated in a variety of ways while still conserving electric charge and current. In contrast, the conservation of energy-momentum fully constrains the motion of a massive particle (in the absence of non-gravitational forces), because the inertial mass is identical to the conserved gravitational “charge”. The principle of inertia is really just another name for the conservation of energy and momentum.
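The electromagnetic half of this parallel takes only two lines to verify. The sketch below assumes SI units and the usual vector-calculus form of Maxwell’s equations, with ρ the charge density and J the current density; these notational choices are assumptions, not taken from the text.

```latex
% Ampere's law without the displacement current; taking the divergence of
% both sides (the divergence of a curl vanishes identically) would force
\nabla \times \mathbf{B} = \mu_0 \mathbf{J}
  \quad\Longrightarrow\quad \nabla \cdot \mathbf{J} = 0
% which forbids any local change of charge density.
% With the displacement current included, and using Gauss's law
% \nabla \cdot \mathbf{E} = \rho / \varepsilon_0 :
\nabla \times \mathbf{B}
  = \mu_0 \mathbf{J} + \mu_0 \varepsilon_0\, \frac{\partial \mathbf{E}}{\partial t}
  \quad\Longrightarrow\quad
  0 = \mu_0 \left( \nabla \cdot \mathbf{J} + \frac{\partial \rho}{\partial t} \right)
% i.e. the continuity equation expressing local conservation of charge.
```

In both cases a single added term makes the divergence of one side of the field equations vanish identically, and the corresponding conservation law for the source follows.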

The elaborate investigations of Einstein, Infeld, and Hoffmann into the “equations of motion” for isolated pole singularities (eschewing the representation of matter by means of the stress-energy tensor) admittedly lead to the very interesting result that such isolated singularities are consistent with the field equations only if they “move” along geodesic paths. There is, undeniably, a sense in which the field equations of general relativity are “stronger” and more comprehensive than those of (say) electromagnetism. This is especially true in view of the fact that the equations of motion arise solely from the non-linearity of Einstein’s field equations, whereas all other classical field theories are strictly linear. However, the representation of mass particles as singularities of the field is problematic (as Einstein himself was the first to admit), and leads to ambiguities in the interpretation of the results. EIH were unavoidably guided by the actual inertial behavior of physical entities in forming their interpretations. Moreover, the whole basis of the analysis was a series of successive linearized approximations beginning with flat Minkowski spacetime, and then applying small perturbations. This procedure explicitly relies on the identification of a background of flat Minkowski spacetime, since otherwise the process of evaluation cannot even get started. Thus the epistemological reliance on the principle of inertia (as embodied in Minkowski spacetime) is carried over into this analysis.

So, despite Einstein’s hopes, general relativity does not in any way explain or obviate the principle of inertia. Granted, if the field equations didn’t include the trace term (so that the covariant divergence didn’t vanish), the resulting theory would have many problems and be subject to many objections, but this goes without saying. No one disputes that the principle of inertia is extremely well-founded in observation. It is an extremely well-justified postulate – but it is still a postulate. General relativity does not explain inertia, nor does it dispense with the need to organize our spatio-temporal theories on the topology and morphology implicit in the principle of inertia and the associated distinguished coordinate systems.
