1.6  A More Practical Arrangement


It is known that Maxwell’s electrodynamics – as usually understood at the present time – when applied to moving bodies, leads to asymmetries which do not appear to be inherent in the phenomena.

                                                                                                A. Einstein, 1905


In the first section of his 1905 paper "On the Electrodynamics of Moving Bodies" Einstein noted that we could assign to each event the time reading of a clock resting at the origin of a system of spatial coordinates when light from that event reaches the clock, or (alternatively) when light from the clock reaches that event. Both of these methods would give time coordinates that are offset – either leading or lagging – by the light transit time. These methods have the disadvantage of making the time coordinates dependent on the choice of spatial origin. If we were to stipulate that the speed of light is the same in all directions (relative to our chosen coordinates), we could remedy this problem by assigning the origin clock time to any event that is temporally half-way between the leading and lagging events at any fixed spatial location. This would correspond to Lorentz’s concept of "true" time, but it would be valid only for a system of coordinates at rest in the ether. Lorentz would define this as the true time coordinate for every other system of coordinates as well. However, for every frame except the original one (in which the time coordinates are determined by isotropic light speed) these coordinates would not constitute an inertial coordinate system, because mechanical inertia would not be isotropic in terms of such coordinates, which would severely limit their utility, even if we had some way of identifying them (which we do not). Einstein says "We arrive at a much more practical arrangement by means of the following considerations".


Einstein argued that the most natural and useful class of coordinate systems would satisfy two basic principles, each derived from experience. The first is nothing other than Galileo's classical principle of inertial relativity, which asserts that for any material object in any state of motion there exists a system of space and time coordinates, called inertial coordinates, with respect to which the object is instantaneously at rest and inertia is homogeneous and isotropic (the latter being necessary for the usual equations of Newtonian mechanics to hold good at least quasi-statically). This principle, by itself, is not sufficient to establish the kinematics of inertial coordinate systems. We must also determine how the description of events with respect to one system of inertial coordinates is related to the description of those same events with respect to another, relatively moving, system of inertial coordinates. Rather than simply assuming a relationship based on some prior conception of inertial measures of space and time, Einstein realized that the correct relationship between relatively moving systems of inertial coordinates could only be determined empirically. Noting "the unsuccessful attempts to discover any motion of the earth relatively to the 'light medium'" (i.e., the failure to detect any anisotropy in the speed of light), he argued that, since we define speeds in terms of inertial coordinates, these experiments imply that the speed of light is isotropic in terms of the very same class of coordinate systems for which mechanical inertia is isotropic. On the other hand, all the experimental results that are consistent with Maxwell's equations imply that the propagation speed of light (with respect to any inertial coordinate system) is independent of the state of motion of the emitting source. Einstein’s achievement was to explain clearly how these seemingly contradictory facts of experience may be reconciled.


As an aside, notice that light-speed isotropy with respect to the rest frame of the source is what we would expect if light were a stream of inertial corpuscles (as suggested by Newton), whereas the independence of the speed of light from the motion of its source is what we would expect if light were a wave in a medium. Thus, just as in quantum mechanics, we need to account for the fact that light behaves in some respects like a classical wave and in other respects like a classical particle. It is not a mere coincidence that Einstein wrote his seminal paper on light quanta almost simultaneously with his paper on the electrodynamics of moving bodies. He might actually have chosen to combine the two into a single paper, discussing general heuristic considerations arising from the observed properties of light, reconciling what would classically seem to be the irreconcilable wave-like and particle-like attributes of light.


From the empirical facts that (a) light in vacuum propagates isotropically with respect to every system of inertial coordinates (which is essentially just an extension of Galileo's principle of relativity), and that (b) the speed of propagation of light with respect to any system of inertial coordinates is independent of the motion of the emitting source, it follows that the speed of light is invariant with respect to every system of inertial coordinates. From these facts we can deduce the correct relationship between relatively moving systems of inertial coordinates.


To establish the form of the relationships between this "more practical" class of coordinate systems (i.e., the class of inertial coordinate systems), Einstein notes that if x,y,z,t is a system of inertial coordinates, and a pulse of light is emitted from location x0 along the x axis at time t0 toward a distant location x1, where it arrives and is reflected at time t1, and if this reflected pulse is received back at location x2 (the same as x0) at time t2, then t1 = (t0 + t2)/2. In other words, since light is isotropic with respect to the same class of coordinate systems in which mechanical inertia is isotropic, the light pulse takes the same amount of time, (t2 − t0)/2, to travel each way when expressed in terms of any system of inertial coordinates. By the same reasoning the spatial distance between the emission and reflection events is x1 − x0 = c(t2 − t0)/2.
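
The synchronization rule just described can be sketched in a few lines of code. This is only an illustration: the function name `radar_coordinates` and the sample emission and return times are hypothetical, and the only physics assumed is the isotropy of light speed in the chosen inertial coordinates.

```python
C = 299_792_458.0  # speed of light in vacuum, metres per second


def radar_coordinates(t0, t2, x0=0.0, c=C):
    """Assign (t1, x1) to a reflection event from the emission time t0
    and the return time t2, both read on the clock at rest at x0.

    Assumes light takes equal times out and back (isotropic light speed).
    """
    t1 = (t0 + t2) / 2.0            # reflection is temporally half-way
    x1 = x0 + c * (t2 - t0) / 2.0   # one-way transit time multiplied by c
    return t1, x1


# Hypothetical reading: pulse leaves at t0 = 0 and returns 2 microseconds later.
t1, x1 = radar_coordinates(t0=0.0, t2=2.0e-6)
print(t1, x1)
```

With these sample numbers the reflection event is assigned a time of one microsecond and a distance of roughly 300 metres from the origin clock.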


Naturally the invariance of light speed with respect to inertial coordinates is implicit in the principles on which special relativity is based, but we must not make the mistake of thinking that this invariance is therefore tautological, or merely an arbitrary definition. Inertial coordinates are not arbitrary, and they are definable without explicit reference to the phenomenon of light. The real content of Einstein's principles is that light is an inertial phenomenon (despite its wavelike attributes). The stationary ether posited by Lorentz did not interact mechanically with ordinary matter at all, and yet we know that light conveys momentum to material objects. The coupling between the supposed ether and ordinary matter was always problematic for ether theories, and indeed for any classical wavelike theory of light. Einstein’s paper on the photo-electric effect was a crucial step in recognizing the localized ballistic aspects of electromagnetic radiation, and this theme persists, just under the surface, in his paper on electrodynamics. Oddly enough, the clearest statement of this insight came only as an afterthought, appearing in Einstein's second paper on relativity in 1905, in which he explicitly concluded that "radiation carries inertia between emitting and absorbing bodies". Light conveys not only momentum, but inertia. For example, after a body has absorbed an elementary pulse of light, it has not only received a “kick” from the momentum of the light, but the internal inertia (i.e., the inertial mass) of the body has actually increased.


Once it is posited that light is inertial, Galileo's principle of relativity automatically implies that light propagates isotropically from the source, regardless of the source's state of uniform motion. Consequently, if we elect to use space and time coordinates in terms of which light speed is not isotropic (which we are certainly free to do), we will necessarily find that no inertial processes are isotropic. For example, we will find that two identical marbles expelled from a tube in opposite directions by an explosive charge located between them will not fly away at equal speeds, i.e., momentum will not be conserved. Conversely, if we use ordinary mechanical inertial processes together with the conservation of momentum (and if we decline to assign any momentum or reaction to unobservable and/or immovable entities), we will necessarily arrive at clock synchronizations that are identical with those given by light pulses. Thus, Einstein's "more practical arrangement" is based on (and ensures) isotropy not just for light propagation, but for all inertial phenomena.
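
The marble example can be illustrated numerically. The sketch below assumes a deliberately skewed time coordinate tʹ = t + kx (with x unchanged), where k is a hypothetical synchronization offset per unit length. The function name and the numerical values are chosen only for illustration, and units with c = 1 are used.

```python
def resynced_velocity(u, k):
    """Coordinate velocity dx/dt' of a body whose velocity is dx/dt = u
    in isotropic coordinates, when the time coordinate is t' = t + k*x."""
    return u / (1.0 + k * u)


c = 1.0   # units in which the isotropic light speed is 1
k = 0.5   # hypothetical synchronization skew (must satisfy |k| < 1/c)
u = 0.1   # speed of each marble in the isotropic coordinates

# Light pulses sent in the +x and -x directions
light_right = resynced_velocity(c, k)
light_left = resynced_velocity(-c, k)

# Identical marbles expelled in the +x and -x directions
marble_right = resynced_velocity(u, k)
marble_left = resynced_velocity(-u, k)

print(abs(light_right), abs(light_left))    # unequal light speeds
print(abs(marble_right), abs(marble_left))  # unequal marble speeds
```

Any nonzero k that makes the two light speeds unequal also makes the two marble speeds unequal, and setting k = 0 restores both symmetries together.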


If a uniformly moving observer uses pairs of identical material objects thrown with equal force in opposite directions to establish spaces of simultaneity, he will find that his synchronization agrees with that produced by Einstein's assumed isotropic light rays. The special attribute of light in this regard is due to the fact that, although light is inertial, it has no mass of its own, and therefore no rest frame. It can be regarded entirely as nothing but an interaction along a null interval between two massive bodies, the emitter and absorber. From this follows the indefinite metric of spacetime, and light's seemingly paradoxical combination of wavelike and inertial properties. (This is discussed more fully in Section 9.9.)


It's also worth noting that when Einstein invoked the operational definitions of time and distance based on light propagation, he commented that "we assume this definition of synchronization is free from contradictions, and possible for any number of points". This is crucial for understanding why a set of definitions based on the propagation of light is tenable, in contrast with a similar set of definitions based on non-inertial signals, such as acoustical waves or postal messages. A set of definitions based on any non-inertial signal can't possibly preserve inertial isotropy. Furthermore, a signal requiring an ordinary material medium for its propagation would obviously not be suitable for a universal definition of time, because it would be inapplicable across regions devoid of that substance. Moreover, even if we posited an omni-present substance, a signal consisting of (or carried by) any material substance would be unsuitable because such objects do not exhibit any particular fixed characteristic of motion, as shown by the fact that they can be brought to rest with respect to some inertial system of reference. Furthermore, if there exist any signals faster than those on which we base our definitions of temporal synchronization, those definitions will be easily falsified. The fact that Einstein's principles are empirically viable at all, far from being vacuous or tautological, is actually somewhat miraculous.


In fact, if we were to describe the kind of physical phenomenon that would be required in order for us to have a consistent capability of defining a coherent basis of temporal synchronization for spatially separate events, clearly it could be neither a material object, nor a disturbance in a material medium, and yet it must exhibit some fixed characteristic quality of motion that exceeds the motion of any other object or signal. We hardly have any right to expect, a priori, that such a phenomenon exists. On the other hand, it could be argued that Einstein's second principle is just as classical as his first, because sight has always been the de facto arbiter of simultaneity (as well as of straightness, as in "uniform motion in a straight line"). Even in Galileo's day it was widely presumed that vision was instantaneous, so it automatically was taken to define simultaneity. (We review the historical progress of understanding the speed of light in Section 3.3.) The difference between this and the modern view is not so much the treatment of light as the means of defining simultaneity, but simply the realization that light propagates at a finite speed, and therefore the spacetime manifold is only partially ordered.


The derivation of the Lorentz transformation in Einstein's 1905 paper is formally based on two empirically-based propositions, which he expressed as follows:


1. The laws by which the conditions of physical systems change are independent of which of two coordinate systems in homogeneous translational movement relative to each other these changes in status are referred.

2. Each ray of light moves in "the resting" coordinate system with the definite speed c, independently of whether this ray of light is emitted from a resting or moving body. Here speed = (optical path) / (length of time), where "length of time" is to be understood in the sense of the definition in § 1.


In the first of these propositions we are to understand that the “coordinate systems” are all such that Newton’s laws of motion hold good (in a suitable limiting sense), as alluded to at the beginning of the paper’s §1. This is crucial, because without this stipulation, the proposition is false. For example, coordinate systems related by Galilean transformations are “in homogeneous translational movement relative to each other”, and yet the laws by which physical systems change (e.g., Maxwell’s equations) are manifestly not independent of the choice of such coordinate systems. So the restriction to coordinate systems in terms of which the laws of mechanics hold good is essential. However, once we have imposed this restriction, the proposition becomes tautological, at least for the laws of mechanics. The real content of Einstein’s first “principle” is therefore the assertion that the other laws of physics (e.g., the laws of electrodynamics) hold good in precisely the same set of coordinate systems in terms of which the laws of mechanics hold good. (This is also the empirical content of the failure of the attempts to detect the Earth’s absolute motion through the electromagnetic ether.) Thus Einstein’s first principle simply re-asserts Galileo’s claim that all effects of uniform rectilinear motion can be “transformed away” by a suitable choice of coordinate systems.
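
The remark about Galilean transformations can be made concrete with a simple special case, offered here only as an illustration and not taken from Einstein's paper. Consider the one-dimensional vacuum wave equation that follows from Maxwell's equations, and apply the Galilean substitution xʹ = x − vt, tʹ = t. By the chain rule, ∂/∂x = ∂/∂xʹ and ∂/∂t = ∂/∂tʹ − v ∂/∂xʹ, so the wave operator becomes

$$\frac{\partial^2}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2}{\partial t^2} = \left(1 - \frac{v^2}{c^2}\right)\frac{\partial^2}{\partial x'^2} + \frac{2v}{c^2}\,\frac{\partial^2}{\partial x'\,\partial t'} - \frac{1}{c^2}\frac{\partial^2}{\partial t'^2}$$

The mixed-derivative term and the altered coefficient of the first term vanish only for v = 0, so the wave equation does not keep its form under a Galilean change of coordinates.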


It might seem that Einstein’s second principle is implied by the first, at least if Maxwell's equations are regarded as laws governing the changes of physical systems, because Maxwell's equations prescribe the speed of light propagation independent of the source's motion. (Indeed, Einstein alluded to this very point at the beginning of his 1905 paper on the inertia of energy.) However, it’s not clear a priori whether Maxwell’s equations are valid in terms of relatively moving systems of coordinates, nor whether the permittivity of the vacuum is independent of the frame of reference in terms of which it is evaluated. Moreover, as discussed above, by 1905 Einstein already doubted the absolute validity of Maxwell's equations, having recently completed his paper on the photo-electric effect which introduced the idea of photons, i.e., light propagating as discrete packets of energy, a concept which cannot be represented as a solution of Maxwell's linear equations. Einstein also realized that a purely electromagnetic theory of matter based on Maxwell's equations was impossible, because those equations by themselves could never explain the equilibrium of electric charge that constitutes a charged particle. "Only different, nonlinear field equations could possibly accomplish such a thing." This observation shows how unjustified was the "molecular force hypothesis" of Lorentz, according to which all the forces of nature were assumed to transform exactly as do electromagnetic forces as described by Maxwell's linear equations. Knowing that the molecular forces responsible for the equilibrium of charged particles must necessarily be of a fundamentally different character than the forces of electromagnetism, and certainly knowing that the stability of matter may not even have a description in the form of a continuous field theory at all, it's clear that Lorentz's hypothesis has no constructive basis, and is simply tantamount to the adoption of Einstein’s two principles.


Thus, Einstein's contribution was to recognize that "the bearing of the Lorentz transformation transcended its connection with Maxwell's equations and was concerned with the nature of space and time in general". Instead of basing special relativity on an assumption of the absolute validity of Maxwell's equations, Einstein based it on the particular characteristic exhibited by those equations, namely Lorentz invariance, that he intuited was the more fundamental principle, one that could serve as an organizing principle analogous to the conservation of energy in thermodynamics, and one that could encompass all physical laws, even if they turned out to be completely dissimilar to Maxwell's equations. Remarkably, this has turned out to be the case. Lorentz invariance is a key aspect of the modern theory of quantum electrodynamics, which replaced Maxwell’s equations.


Just as Einstein’s first principle relies on the restriction to coordinate systems in which the laws of mechanics hold good, his second principle relies crucially on the requirement that time intervals are “to be understood in the sense of the definition given in §1”. And, again, once this condition is recognized, the principle itself becomes tautological, although in this case the tautology is complete. The second principle states that light always propagates at the speed c, assuming we define the time intervals in accord with §1, which defines time intervals as whatever they must be in order for the speed of light to be c. This unfortunately has led some critics to assert that special relativity is purely tautological, merely a different choice of conventions. Einstein’s presentation somewhat obscures the real physical content of the theory, which is that mechanical inertia and the propagation speed of light are isotropic and invariant with respect to precisely the same set of coordinate systems. This is a non-trivial fact. It then remains to determine how these distinguished coordinate systems are related to each other.


Although Einstein explicitly highlighted just two principles as the basis of special relativity in his 1905 paper (consciously patterned after the two principles of thermodynamics), his derivation of the Lorentz transformation also invoked “the properties of homogeneity that we attribute to space and time” to establish the linearity of the transformations. (Linear fractional transformations are ruled out by stipulating continuity of the coordinates of a particle, or simply by requiring that finite coordinates map to finite coordinates.) In addition, he tacitly assumed spatial isotropy, i.e., that there is no preferred direction in space, so the intrinsic properties of ideal rods and clocks do not depend on their spatial orientations. Lastly, he assumed memorylessness, i.e., that the extrinsic properties of rods and clocks may be functions of their current positions and states of motion, but not of their previous positions or states of motion. This last assumption is needed to exclude the possibility that every elementary particle may somehow "remember" its entire history of accelerations, and thereby "know" its present absolute velocity relative to a common fixed reference. (Einstein explicitly listed these extra assumptions in an exposition written in 1920. He may have gained an appreciation of the importance of the independence of measuring rods and clocks from their past history after considering Weyl’s unified field theory, which Einstein rejected precisely because it violated this premise.)


The actual detailed derivation of the Lorentz transformation presented in Einstein’s 1905 paper is somewhat obscure and circuitous, but it’s worthwhile to follow his reasoning, partly for historical interest, and partly to contrast it with the more direct and compelling derivations that will be presented in subsequent sections.


Following Einstein’s original derivation, we begin with an inertial (and Cartesian) coordinate system called K, with the coordinates x, y, z, t, and we posit another system of inertial coordinates denoted as k, with the coordinates ξ, η, ζ, τ. The spatial axes of these two systems are aligned, and the spatial origin of k is moving in the positive x direction with speed v in terms of K. We then consider a particle at rest in the k system, and note that for such a particle the x and t coordinates (i.e., the coordinates in terms of the K system) are related by xʹ = x − vt for some constant xʹ. We also know the y and z coordinates of such a particle are constant. Hence each stationary spatial position in the k system corresponds to a set of three constants (xʹ,y,z), and we can also assign the time coordinate t to each event.


Interestingly, the system of variables xʹ,y,z,t constitutes a complete coordinate system, related to the original system K by a Galilean transformation xʹ = x − vt, yʹ = y, zʹ = z, tʹ = t. Thus, just as Lorentz did in 1892, Einstein began by essentially applying a Galilean transformation to the original “rest frame” coordinates to give an intermediate system of coordinates, although Einstein’s paper makes it clear that this is not an inertial coordinate system.


Now we consider the values of the τ coordinate of the k system as a function of xʹ,y,z,t for any stationary point in the k system. Suppose a pulse of light is emitted from the origin of the k system in the positive x direction at time τ0, it reaches the point corresponding to xʹ,y,z at time τ1, where it is reflected, arriving back at the origin of the k system at time τ2. This is depicted in the figure below.



Recall that the ξ, η, ζ, τ coordinates are defined as inertial coordinates, meaning that inertia is homogeneous and isotropic in terms of these coordinates. Also, all experimental evidence (such as all "the unsuccessful attempts to discover any motion of the earth relatively to the 'light medium'") indicates that the speed of light is isotropic in terms of any inertial coordinate system. Therefore, we have τ1 = (τ0 + τ2)/2, so the τ coordinate as a function of xʹ,y,z,t satisfies the relation

$$\frac{1}{2}\left[\tau(0,0,0,t) + \tau\!\left(0,0,0,\;t + \frac{x'}{c-v} + \frac{x'}{c+v}\right)\right] = \tau\!\left(x',0,0,\;t + \frac{x'}{c-v}\right)$$

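Differentiating both sides with respect to the parameter xʹ, we get (using the chain rule)

$$\frac{1}{2}\left(\frac{1}{c-v} + \frac{1}{c+v}\right)\frac{\partial\tau}{\partial t} = \frac{\partial\tau}{\partial x'} + \frac{1}{c-v}\,\frac{\partial\tau}{\partial t}$$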

Now, it should be noted here that the partial derivatives are being evaluated at different points, so we would not, in general, be justified in treating them interchangeably. However, Einstein has stipulated that the transformation equations are linear (due to homogeneity of space and time), so the partial derivatives are all constants and unique (for any given v). Simplifying the above equation gives

$$\frac{\partial\tau}{\partial x'} + \frac{v}{c^2 - v^2}\,\frac{\partial\tau}{\partial t} = 0$$

At this point, Einstein alludes to analogous reasoning for the y and z directions, but doesn’t give the details. Presumably we are to consider a pulse of light emanating from the origin and reflecting at a point xʹ = 0, y, z = 0, and returning to the origin. In this case the isotropy of light propagation in terms of inertial coordinates implies

$$\frac{1}{2}\left[\tau(0,0,0,t) + \tau\!\left(0,0,0,\;t + \frac{2y}{\sqrt{c^2 - v^2}}\right)\right] = \tau\!\left(0,y,0,\;t + \frac{y}{\sqrt{c^2 - v^2}}\right)$$

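In this equation we have made use of the fact that the y component of the speed of the light pulse (in terms of the K system) as it travels in either direction between these points, which are stationary in the k system, is (c² − v²)^(1/2). Differentiating both sides with respect to y, we get

$$\frac{1}{\sqrt{c^2 - v^2}}\,\frac{\partial\tau}{\partial t} = \frac{\partial\tau}{\partial y} + \frac{1}{\sqrt{c^2 - v^2}}\,\frac{\partial\tau}{\partial t}$$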

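and therefore ∂τ/∂y = 0. The same reasoning shows that ∂τ/∂z = 0. Now the total differential of τ(xʹ,y,z,t) is, by definition

$$d\tau = \frac{\partial\tau}{\partial x'}\,dx' + \frac{\partial\tau}{\partial y}\,dy + \frac{\partial\tau}{\partial z}\,dz + \frac{\partial\tau}{\partial t}\,dt$$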

and we know the partial derivatives with respect to y and z are zero, and those with respect to xʹ and t are in a known ratio, so for any given v we can write

$$d\tau = a(v)\left[dt - \frac{v}{c^2 - v^2}\,dx'\right]$$

where a(v) is as yet an undetermined function. Incidentally, Einstein didn’t write this expression in terms of differentials, but he did state that he was “letting xʹ be infinitesimally small”, so he was essentially dealing with differentials. On the other hand, the distinction between differentials and finite quantities matters little in this context, because the relations are linear, and hence the partial derivatives are constants, so the differentials can be trivially integrated. Thus we have

$$\tau = a(v)\left[t - \frac{v}{c^2 - v^2}\,x'\right]$$

Einstein then used this result to determine the transformation equations for the spatial coordinates. The ξ coordinate of a pulse of light emitted from the origin in the positive x direction is related to the τ coordinate by ξ = cτ (since experience has shown that light propagates with the speed c in all directions when expressed in terms of any system of inertial coordinates). Substituting for τ from the preceding formula gives, for the ξ coordinate of this light pulse, the expression

$$\xi = a(v)\,c\left[t - \frac{v}{c^2 - v^2}\,x'\right]$$

We also know that, for this light pulse, the parameters t and xʹ are related by t = xʹ/(c−v), so we can substitute for t in the above expression and simplify to give the relation between ξ and xʹ (both of which, we remember, are constants for any point at rest in k)

$$\xi = a(v)\,\frac{c^2}{c^2 - v^2}\,x'$$

We can choose xʹ to be anything we like, so this represents the general relation between these two parameters. Similarly the η coordinate of a pulse of light emanating from the origin in the η direction is

$$\eta = c\,\tau = a(v)\,c\left[t - \frac{v}{c^2 - v^2}\,x'\right]$$

but in this case we have xʹ = 0 and, as noted previously, t = y/(c² − v²)^(1/2), so we have

$$\eta = a(v)\,\frac{c}{\sqrt{c^2 - v^2}}\;y$$

and by the same token

$$\zeta = a(v)\,\frac{c}{\sqrt{c^2 - v^2}}\;z$$

If we define the function

$$\phi(v) = a(v)\,\frac{c}{\sqrt{c^2 - v^2}}$$

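and substitute x − vt for xʹ, the preceding results can be summarized as

$$\tau = \phi(v)\,\frac{t - vx/c^2}{\sqrt{1 - v^2/c^2}}, \qquad \xi = \phi(v)\,\frac{x - vt}{\sqrt{1 - v^2/c^2}}, \qquad \eta = \phi(v)\,y, \qquad \zeta = \phi(v)\,z$$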

At this point Einstein observes that a sphere of light expanding with the speed c in terms of the x, y, z, t coordinates transforms to a sphere of light expanding with speed c in terms of the ξ, η, ζ, τ coordinates. In other words,

$$x^2 + y^2 + z^2 = c^2 t^2 \quad\Longrightarrow\quad \xi^2 + \eta^2 + \zeta^2 = c^2\tau^2$$

As Einstein says, this “shows that our two fundamental principles are compatible”, i.e., it is possible for light to propagate isotropically with respect to two relatively moving systems of inertial coordinates, provided we allow the possibility that the transformation from one inertial coordinate system to another is not exactly as Galileo and Newton surmised.
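
This compatibility can be spot-checked numerically. The sketch below assumes the transformation in the form τ = ϕβ(t − vx/c²), ξ = ϕβ(x − vt), η = ϕy, ζ = ϕz with β = 1/(1 − v²/c²)^(1/2), leaving ϕ as an arbitrary number since it is not yet determined at this stage of the argument. The function names and the sample event are hypothetical, and units with c = 1 are used by default.

```python
import math


def einstein_transform(x, y, z, t, v, phi=1.0, c=1.0):
    """Map K coordinates (x, y, z, t) to k coordinates (xi, eta, zeta, tau),
    keeping the as-yet undetermined factor phi explicit."""
    beta = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    xi = phi * beta * (x - v * t)
    tau = phi * beta * (t - v * x / c**2)
    return xi, phi * y, phi * z, tau


def light_cone_residual(x, y, z, t, c=1.0):
    """Zero exactly when the event lies on the light sphere emitted at the origin."""
    return x * x + y * y + z * z - (c * t) ** 2


# An event on the expanding light sphere in K: 2^2 + 1^2 + 2^2 = 3^2
event = (2.0, 1.0, 2.0, 3.0)
image = einstein_transform(*event, v=0.6, phi=1.7)

print(light_cone_residual(*event))  # zero in K
print(light_cone_residual(*image))  # zero (to rounding) in k, whatever phi is
```

The residual vanishes in both systems for any nonzero ϕ, which is why the light principle alone cannot fix ϕ(v) and a further argument is needed.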


To complete the derivation of the Lorentz transformation, it remains to determine the function ϕ(v). Einstein considers a two-fold application of the transformation, once with the speed v in the positive x direction, and then again with the speed v in the negative x direction. The result should be the identity transformation, i.e., we should get back to the original coordinate system. (Strictly speaking, this assumes the property of “memorylessness”.) If we apply the above transformation twice, once with parameter v and once with parameter −v, each coordinate is ϕ(v)ϕ(−v) times the original coordinate, so we must have

$$\phi(v)\,\phi(-v) = 1$$

Finally, Einstein concludes by “inquiring into the signification of ϕ(v)”. He notes that a segment of the η axis moving with speed v perpendicular to its length (i.e., in the positive x direction) has the length y = η/ϕ(v) in terms of the K system coordinates, and by “reasons of symmetry” (i.e., spatial isotropy) this must equal η/ϕ(−v), because it doesn’t matter whether this segment of the y axis is moving in the positive or the negative x direction. Consequently we have ϕ(v) = ϕ(−v), and therefore ϕ(v) = 1 (taking the positive root, so that the transformation reduces to the identity at v = 0), so he arrives at the Lorentz transformation

$$\tau = \frac{t - vx/c^2}{\sqrt{1 - v^2/c^2}}, \qquad \xi = \frac{x - vt}{\sqrt{1 - v^2/c^2}}, \qquad \eta = y, \qquad \zeta = z$$

This somewhat laborious and awkward derivation is interesting in several respects. For one thing, one gets the impression that Einstein must have been experimenting with various methods of presentation, and changed his nomenclature during the drafting of the paper. For example, at one point he says “a is a function ϕ(v) at present unknown”, but subsequently a(v) and ϕ(v) are defined as different functions. At another point he defines x′ as a Galilean transform of x (without explicitly identifying it as such), but subsequently uses the symbol x′ as part of the inertial coordinate system resulting from the two-fold application of the Lorentz transformation. In addition, he somewhat tacitly makes use of the invariance of the light-like relation x² + y² = c²t² in his derivation of the transformation equations for the y coordinate, but doesn’t seem to realize that he could just as well have invoked the invariance of x² + y² + z² = c²t² to make short work of the entire derivation. Instead, he presents this invariance as a consequence of the transformation equations – despite the fact that he has tacitly used the invariance as the basis of the derivation (which of course he was entitled to do, since that invariance simply expresses his “light principle”).


Perhaps not surprisingly, some readers have been confused as to the significance of the functions a(v) and ϕ(v). For example, in a review of Einstein’s paper, A. I. Miller writes


Then, without prior warning Einstein replaced a(v) with ϕ(v)/(1 − (v/c)²)^(1/2)… But why did Einstein make this replacement? It seems as if he knew beforehand the correct form of the set of relativistic transformations… How did Einstein know that he had to make [this substitution] in order to arrive at those space and time transformations in agreement with the postulates of relativity?


This suggests a misunderstanding, because the substitution in question is purely formal, and has no effect on the content of the equations. The transformations that Einstein had derived by that point, prior to replacing a(v), were already consistent with the postulates of relativity (as can be verified by substituting them into the Minkowski invariant). It is simply more convenient to express the equations in terms of ϕ(v), which is the entire coefficient of the transformations for y and z. One naturally expects this coefficient to equal unity.
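
That the substitution is purely formal can be checked directly. The sketch below assumes the pre-substitution equations in the form τ = a[t − vxʹ/(c² − v²)], ξ = a c² xʹ/(c² − v²), η = a c y/(c² − v²)^(1/2) (and likewise for ζ), together with the definition ϕ(v) = a(v) c/(c² − v²)^(1/2). The function names, the value chosen for a, and the sample event are all hypothetical, with c = 1.

```python
import math


def transform_with_a(x, y, z, t, v, a, c=1.0):
    """Intermediate form of the transformation, written with the factor a(v)."""
    xp = x - v * t                       # Galilean intermediate coordinate x'
    tau = a * (t - v * xp / (c**2 - v**2))
    xi = a * c**2 * xp / (c**2 - v**2)
    eta = a * c * y / math.sqrt(c**2 - v**2)
    zeta = a * c * z / math.sqrt(c**2 - v**2)
    return xi, eta, zeta, tau


def transform_with_phi(x, y, z, t, v, phi, c=1.0):
    """Same transformation rewritten in terms of phi(v)."""
    beta = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return (phi * beta * (x - v * t), phi * y, phi * z,
            phi * beta * (t - v * x / c**2))


v, a = 0.6, 0.9                          # arbitrary illustrative values
phi = a / math.sqrt(1.0 - v**2)          # phi = a*c/sqrt(c^2 - v^2) with c = 1
event = (0.3, -1.1, 2.4, 0.7)            # an arbitrary event, not on the light cone

first = transform_with_a(*event, v=v, a=a)
second = transform_with_phi(*event, v=v, phi=phi)
same = all(math.isclose(p, q, rel_tol=1e-12, abs_tol=1e-12)
           for p, q in zip(first, second))
print(same)
```

Both forms assign the same k coordinates to every event, so rewriting a(v) in terms of ϕ(v) changes only the notation, not the transformation.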


Even aside from the inadvertent changes in nomenclature, Einstein’s derivation is undeniably clumsy, especially in first applying what amounts to a Galilean transformation, and then deriving the further transformation needed to arrive at a system of inertial coordinates. It seems clear that he was influenced by Lorentz’s writings, even to the point of using the same symbol β for the quantity 1/(1 − (v/c)²)^(1/2), which Lorentz used in his 1904 paper. (Surprisingly, years later Einstein wrote to Carl Seelig that in 1905 he had known only of Lorentz’s 1895 paper, but not his subsequent papers, and none of Poincare’s papers on the subject.)


In a review article published in 1907 Einstein had already adopted a more economical derivation, dispensing with the intermediate Galilean system of coordinates, and making direct use of the lightlike invariant expression, similar to the standard derivation presented in most introductory texts today. To review this now standard derivation, consider (again) Einstein’s two systems of inertial coordinates K and k, with coordinates denoted by (x,y,z,t) and (ξ,η,ζ,τ) respectively, and oriented so that the x and ξ axes coincide, and the xy plane coincides with the ξη plane. Also, as before, the system k is moving in the positive x direction with fixed speed v relative to the system K, and the origins of the two systems momentarily coincide at time t = τ = 0.


According to the principle of homogeneity, the relationship between the two sets of coordinates must be linear, so there must be constants A1 and A2 (for a given v) such that ξ = A1x + A2t. Furthermore, if an object is stationary relative to k, and if it passes through the point (x,t) = (0,0), then its position in general satisfies x = vt, from the definition of velocity, and the ξ coordinate of that point with respect to the k system is 0. Therefore we have ξ = A1(vt) + A2t = 0. Since this must be true for non-zero t, we must have A1v + A2 = 0, and so A2 = −A1v. Consequently, there is a single constant A (for any given v) such that ξ = A(x − vt). Similarly there must be constants B and C such that η = By and ζ = Cz. Also, invoking isotropy and homogeneity, we know that τ is independent of y and z, so it must be of the form τ = Dx + Et for some constants D and E (for a given v). It only remains to determine the values of the constants A, B, C, D, and E in these expressions.
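
The condition these constants must meet is that an expanding sphere of light in K is also an expanding sphere of light in k. As a rough sketch, the hypothetical helper below evaluates ξ² + η² + ζ² − c²τ² for the linear forms just stated, so that candidate constants can be tried. The Lorentz values used in the second call (A = E = 1/(1 − v²/c²)^(1/2), D = −Av/c², B = C = 1) are the standard result, assumed here rather than derived. Units with c = 1 are used.

```python
import math


def light_sphere_residual(A, B, C, D, E, v, x, y, z, t, c=1.0):
    """xi^2 + eta^2 + zeta^2 - c^2 tau^2 for xi = A(x - vt), eta = By,
    zeta = Cz, tau = Dx + Et. Zero means the event is on the light sphere in k."""
    xi = A * (x - v * t)
    eta, zeta = B * y, C * z
    tau = D * x + E * t
    return xi**2 + eta**2 + zeta**2 - (c * tau) ** 2


v = 0.6
event = (2.0, 1.0, 2.0, 3.0)   # on the light sphere in K: 4 + 1 + 4 = 9
gamma = 1.0 / math.sqrt(1.0 - v**2)

# Galilean choice: A = B = C = E = 1, D = 0
galilean = light_sphere_residual(1.0, 1.0, 1.0, 0.0, 1.0, v, *event)

# Standard Lorentz choice (assumed, see above)
lorentz = light_sphere_residual(gamma, 1.0, 1.0, -gamma * v, gamma, v, *event)

print(galilean, lorentz)
```

The Galilean constants leave a residual far from zero for this event, while the Lorentz constants reduce it to zero up to rounding.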


Suppose at the instant when the spatial origins of K and k coincide a spherical wave of light is emitted from their common origin. At a subsequent time t in the first frame of reference the sphere of light must be the locus of points satisfying the equation

\[ x^2 + y^2 + z^2 - c^2 t^2 = 0 \tag{1} \]

and likewise, according to our principles, in the second frame of reference the spherical wave at time τ must be the locus of points described by

\[ \xi^2 + \eta^2 + \zeta^2 - c^2 \tau^2 = 0 \tag{2} \]

Substituting from the previous expressions for the k coordinates into this equation, we get

\[ A^2 (x - vt)^2 + B^2 y^2 + C^2 z^2 - c^2 (Dx + Et)^2 = 0 \]

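Expanding these terms and rearranging gives

\[ (A^2 - c^2 D^2)\,x^2 + B^2 y^2 + C^2 z^2 - 2(A^2 v + c^2 DE)\,xt - (c^2 E^2 - A^2 v^2)\,t^2 = 0 \tag{3} \]
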
The equality of the speed of light in terms of both systems of coordinates implies that an expanding spherical wave of light in one system is also an expanding spherical wave of light in the other system, so the coefficients of equation (3) must be proportional to the coefficients of equation (1). Strictly speaking, the constant of proportionality is arbitrary, representing a simple re-scaling, so we are free to impose an additional condition, namely, that the transformation with parameter +v followed by the transformation with parameter −v yields the original coordinates, and by the isotropy of space these two transformations, which differ only in direction, must have the same constant of proportionality. Thus the corresponding coefficients of equations (1) and (3) must not only be proportional, they must be equal, so we have

\[ A^2 - c^2 D^2 = 1, \qquad B^2 = 1, \qquad C^2 = 1, \qquad 2(A^2 v + c^2 DE) = 0, \qquad c^2 E^2 - A^2 v^2 = c^2 \]

Clearly we can take B = C = 1 (rather than −1, since we choose not to reflect the y and z directions). Dividing the fourth of these equations by 2, we're left with three equations in the three unknowns A, D, and E:

\[ A^2 - c^2 D^2 = 1, \qquad A^2 v + c^2 DE = 0, \qquad c^2 E^2 - A^2 v^2 = c^2 \]

Solving the first equation for A² and substituting this into the second and third equations gives

\[ v\,(1 + c^2 D^2) + c^2 DE = 0, \qquad c^2 E^2 - v^2 (1 + c^2 D^2) = c^2 \]

Solving the first for E and substituting into the second gives a single quadratic equation in D, with the roots

\[ D = \pm \frac{v}{c^2 \sqrt{1 - v^2/c^2}} \]

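The elimination that leads to these roots is not spelled out in the text. One way to carry it out, offered here only as a sketch of the intermediate algebra, starts from the two reduced equations v(1 + c²D²) + c²DE = 0 and c²E² − v²(1 + c²D²) = c²:

\[
\begin{aligned}
E &= -\frac{v\,(1 + c^2 D^2)}{c^2 D} \\
\frac{v^2 (1 + c^2 D^2)^2}{c^2 D^2} - v^2 (1 + c^2 D^2) &= c^2 \\
v^2 (1 + c^2 D^2)\left[(1 + c^2 D^2) - c^2 D^2\right] &= c^4 D^2 \\
v^2 (1 + c^2 D^2) &= c^4 D^2 \\
D^2 &= \frac{v^2}{c^4 \left(1 - v^2/c^2\right)}
\end{aligned}
\]

The third line follows from the second on multiplying through by c²D², and the terms of fourth degree in D cancel, which is why the result is only a quadratic in D.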
Substituting this into the second of the previous equations and solving the resulting quadratic for E gives

\[ E = \pm \frac{1}{\sqrt{1 - v^2/c^2}} \]

Note that the equations require opposite signs for D and E (for positive v), since A²v + c²DE = 0. Now, for small values of v/c we expect to find E approaching +1 (as in Galilean relativity), so we choose the positive root for E and the negative root for D. Finally, from the relation A² − c²D² = 1 we get

\[ A = \pm \frac{1}{\sqrt{1 - v^2/c^2}} \]

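Again selecting the positive root, we have the Lorentz transformation

\[ \xi = \frac{x - vt}{\sqrt{1 - v^2/c^2}}, \qquad \eta = y, \qquad \zeta = z, \qquad \tau = \frac{t - vx/c^2}{\sqrt{1 - v^2/c^2}} \]
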
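As a numerical sanity check on the derivation, the following Python sketch evaluates the coefficients A, D, and E for a given v, confirms that they satisfy the three coefficient equations, and applies the resulting transformation. The function names (`lorentz_coefficients`, `residuals`, `boost`) and the sample values are illustrative choices, not part of the original text.

```python
import math

def lorentz_coefficients(v, c=1.0):
    """Coefficients (A, D, E) of xi = A(x - v t), tau = D x + E t, for |v| < c."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    A = gamma                # positive root for A
    E = gamma                # positive root, so that E -> 1 as v/c -> 0
    D = -v * gamma / c ** 2  # sign opposite to E when v > 0
    return A, D, E

def residuals(v, c=1.0):
    """Left side minus right side of the three coefficient equations."""
    A, D, E = lorentz_coefficients(v, c)
    return (A ** 2 - c ** 2 * D ** 2 - 1.0,
            A ** 2 * v + c ** 2 * D * E,
            c ** 2 * E ** 2 - A ** 2 * v ** 2 - c ** 2)

def boost(event, v, c=1.0):
    """Map K coordinates (x, y, z, t) to k coordinates (xi, eta, zeta, tau)."""
    x, y, z, t = event
    A, D, E = lorentz_coefficients(v, c)
    return (A * (x - v * t), y, z, D * x + E * t)

if __name__ == "__main__":
    v, c = 0.6, 1.0                      # assumed sample values
    print(lorentz_coefficients(v, c))    # A, D, E
    print(residuals(v, c))               # all three should be close to zero
    event = (2.0, -1.0, 0.5, 3.0)
    # Boosting by +v and then by -v should return the original coordinates.
    print(boost(boost(event, v, c), -v, c))
```

The round trip at the end mirrors the condition imposed earlier, that the transformation with parameter +v followed by the one with parameter −v restores the original coordinates.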
Naturally with this transformation we can easily verify that

\[ \xi^2 + \eta^2 + \zeta^2 - c^2 \tau^2 = x^2 + y^2 + z^2 - c^2 t^2 \]

so this quantity is the squared "absolute distance" from the origin to the point with K coordinates (x,y,z,t) and the corresponding k coordinates (ξ,η,ζ,τ), which confirms that the absolute spacetime interval between two points is the same in both frames. Notice that equations (1) and (2) already implied this relation for null intervals. In other words, the original premise was that if x² + y² + z² − c²t² equals zero, then ξ² + η² + ζ² − c²τ² also equals zero. The above reasoning shows that a consequence of this premise is that, for any arbitrary real number s², if x² + y² + z² − c²t² equals s², then ξ² + η² + ζ² − c²τ² also equals s². Therefore, this quadratic form represents an absolute invariant quantity associated with the interval from the origin to the event (x,y,z,t).
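The invariance of this quadratic form for non-null intervals can be illustrated numerically. The sketch below assumes units with c = 1 by default and a boost speed of 0.6; the helper names (`boost`, `interval_squared`) and the three sample events are hypothetical and chosen only for illustration.

```python
import math

def boost(event, v, c=1.0):
    """Lorentz transformation of (x, y, z, t) for a boost v along the x axis."""
    x, y, z, t = event
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return (gamma * (x - v * t), y, z, gamma * (t - v * x / c ** 2))

def interval_squared(event, c=1.0):
    """The quadratic form x^2 + y^2 + z^2 - c^2 t^2 for an event."""
    x, y, z, t = event
    return x ** 2 + y ** 2 + z ** 2 - (c * t) ** 2

# One event of each kind: s^2 = 0, s^2 > 0, and s^2 < 0.
events = {
    "null": (3.0, 4.0, 0.0, 5.0),
    "spacelike": (2.0, 0.0, 0.0, 1.0),
    "timelike": (1.0, 0.0, 0.0, 2.0),
}

if __name__ == "__main__":
    for name, ev in events.items():
        s2_K = interval_squared(ev)                # value in K coordinates
        s2_k = interval_squared(boost(ev, 0.6))    # value in k coordinates
        print(name, s2_K, s2_k)
```

The two printed values for each event should agree up to floating-point rounding, whether s² is zero, positive, or negative.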


Given the Lorentz transformation it is easy to determine the full velocity composition law for two systems of aligned coordinates K and k, the latter moving in the positive x direction with velocity v relative to the former. We can without loss of generality make the origins of the two systems both coincide with a point P₀ on the subject worldline, and let P₁ denote a subsequent point on that worldline with K system coordinates dt, dx, dy, dz. The velocity components of that worldline with respect to K are u_x = dx/dt, u_y = dy/dt, and u_z = dz/dt. The coordinates of P₁ with respect to the k system are given by the Lorentz transformation (in units with c = 1) for a simple boost v in the x direction:

\[ d\tau = \frac{dt - v\,dx}{\sqrt{1 - v^2}}, \qquad d\xi = \frac{dx - v\,dt}{\sqrt{1 - v^2}}, \qquad d\eta = dy, \qquad d\zeta = dz \]

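Therefore, the velocity components of the worldline with respect to the k system are

\[ u_\xi = \frac{d\xi}{d\tau} = \frac{u_x - v}{1 - v u_x}, \qquad u_\eta = \frac{d\eta}{d\tau} = \frac{u_y \sqrt{1 - v^2}}{1 - v u_x}, \qquad u_\zeta = \frac{d\zeta}{d\tau} = \frac{u_z \sqrt{1 - v^2}}{1 - v u_x} \]
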
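The composition law can be tried out with a short Python sketch. It assumes units with c = 1, as in the text, and the names `transform_velocity` and `speed` as well as the sample velocities are illustrative only.

```python
import math

def transform_velocity(u, v):
    """Map velocity components (ux, uy, uz) in K to components in k (c = 1)."""
    ux, uy, uz = u
    denom = 1.0 - v * ux            # common denominator from d(tau)
    root = math.sqrt(1.0 - v * v)   # factor acting on the transverse components
    return ((ux - v) / denom, uy * root / denom, uz * root / denom)

def speed(u):
    """Magnitude of a velocity vector."""
    return math.sqrt(sum(comp * comp for comp in u))

if __name__ == "__main__":
    v = 0.6  # assumed boost speed
    # An object moving with k is at rest in k.
    print(transform_velocity((0.6, 0.0, 0.0), v))
    # A light pulse along the y axis of K still has unit speed in k.
    print(speed(transform_velocity((0.0, 1.0, 0.0), v)))
    # An object moving at 0.6 in the negative x direction stays below light speed.
    print(transform_velocity((-0.6, 0.0, 0.0), v))
```

The second case shows the transverse components at work: the direction of the light pulse changes between the two systems, but its speed does not.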
This illustrates the value of Einstein’s “more practical arrangement”, based on the recognition that inertial coordinate systems are related by Lorentz transformations. From this single dynamical premise, many results follow by pure kinematics.

