|
Did Einstein Misunderstand Aberration? |
|
|
|
Tony Rothman's 2003 book entitled "Everything's Relative" presents an interesting variation on this familiar theme. Rothman agrees with the anti-relativist claim that Einstein's treatment of aberration was faulty, but he (Rothman) exonerates special relativity itself, and concludes instead that Einstein simply "didn't understand his own theory". Furthermore, he alleges that "Einstein's mistake" was repeated by Pauli and a host of others. |
|
|
|
Rothman has alluded to this subject more than once. For example, in response to comments on another of his essays ("Lost in Einstein's Shadow") in "The New Scientist", he wrote |
|
|
|
There are mistakes in the 1905 relativity paper, including a serious conceptual error regarding the bending of starlight ("aberration"), one of the phenomena Einstein created his theory to explain. |
|
|
|
It’s ironic to find the phrase “serious conceptual error” followed so closely by a description of aberration as “the bending of starlight”, considering that aberration doesn’t involve “bending”. In any case, Rothman’s charge is particularly significant in view of the fact that, according to E. T. Whittaker (a well-known proponent of the view that special relativity was chiefly due to Lorentz and Poincaré) the relativistic Doppler/aberration formula was Einstein’s only noteworthy original contribution to special relativity. Moreover, it was the Doppler/aberration effects that led Einstein to the mass-energy equivalence relation (E = mc²), and that also confirmed the correspondence between energy and frequency (E = hν) of light quanta. |
|
|
|
Is Rothman correct? Did Einstein misunderstand and/or misrepresent stellar aberration in his 1905 paper? One point in Rothman's argument can be answered immediately. He says “no one recognizes the rather large blunder in Einstein’s exposition” because "no one bothers to read the original papers any more". But surely this is preposterous. Einstein's paper "On the Electrodynamics of Moving Bodies" has been and continues to be one of the most avidly scrutinized and dissected documents ever written. Whole books have been devoted to parsing its every word. (Has any physicist not read it?) If the paper contains a "serious conceptual error" regarding one of its central results, and if this has gone unnoticed by scientists and scholars for 100 years, it cannot be attributed to neglect of the original text. |
|
Turning to the substantive issue, the specific aspect of stellar aberration that Rothman believes Einstein misunderstood is the physical meaning of the velocity “v” appearing in the formulas. There seems to have been some modulation of Rothman’s view of Einstein’s actual culpability, because in the book “Everything’s Relative” he says “Einstein clearly assumes that the velocity involved is the relative velocity between the Earth and the star”, whereas in the paper “Reference Frames for Stellar Aberration” he is more circumspect, saying only that |
|
|
|
Einstein gets aberration from the Doppler shift and does not further define the velocity in his equation, leading one to wonder whether his own lack of clarity was the source of the subsequent confusion. |
|
|
|
Presumably Rothman was trying to say that Einstein derives both the Doppler shift and the aberration formula from the same transformation, but the above statement leads me to wonder if Rothman was describing himself when he said “no one bothers to read the original papers any more”. Einstein was actually fairly careful and explicit about the relevant value of v appearing in the Doppler and aberration formulas. He wrote |
|
|
|
In the system K, very far from the origin of coordinates, let there be a source of electrodynamic waves, which in a part of space containing the origin of coordinates may be represented to a sufficient degree of approximation by the equations [A = A₀ sin(Φ)] where |
|
|
|
$$\Phi = \omega\left(t - \frac{lx + my + nz}{c}\right)$$ |
|
|
|
Here [A] are the vectors defining the amplitude of the wave train, and l,m,n the direction-cosines of the wave normals. We wish to know the constitution of these waves when they are examined by an observer at rest in the moving system k [defined previously as moving at the speed v in the positive x direction with respect to K]. |
|
|
|
(For brevity, I’ve used “A” to denote the components of the electric and magnetic vectors, which Einstein spelled out explicitly.) As an aside, notice that Einstein does not say the source of the waves is at rest in K, whereas he does explicitly stipulate that the observer is at rest in k. Of course, the source of the waves may be at rest in K, but if we apply this interpretation we should also stipulate that it is continually (not just momentarily) at rest in K. In any case, this has no bearing on Einstein’s derivation. He defines a plane wave near the origin in terms of the K coordinate system (x,y,z,t), giving both its frequency ω and the direction-cosines l,m,n of the wave normals with respect to these coordinates, and then proceeds to determine the frequency ω’ and direction-cosines l’,m’,n’ with respect to the k coordinate system (ξ,η,ζ,τ), which is moving at the speed v relative to K. He does this simply by applying the appropriate transformations to the field components and the space and time coordinates, giving the result |
|
|
|
$$\Phi' = \omega'\left(\tau - \frac{l'\xi + m'\eta + n'\zeta}{c}\right)$$ |
|
|
|
where |
|
|
$$\omega' = \omega\,\beta\left(1 - \frac{lv}{c}\right), \qquad \beta = \frac{1}{\sqrt{1 - v^2/c^2}}$$ |
|
|
and |
|
|
$$l' = \frac{l - v/c}{1 - lv/c}, \qquad m' = \frac{m}{\beta\left(1 - lv/c\right)}, \qquad n' = \frac{n}{\beta\left(1 - lv/c\right)}$$ |
|
|
This is clear and unambiguous. A single transformation determines both the Doppler shift and the aberration effect. In neither case does the position or state of motion of the source (at the moment of observation) enter into consideration, although we may assume for convenience that it is (continually) at rest relative to K. Einstein simply describes how a plane wave at a certain time near the origin, characterized by a certain frequency and direction with respect to one system of inertial coordinates (K), is characterized by a different frequency and direction with respect to another system of inertial coordinates (k), given that the latter is moving with velocity v relative to the former. This is the only meaning of the “v” appearing in these formulas. We can use these formulas to determine the aberration between any two inertial coordinate systems. This is a perfectly clear and correct derivation of relativistic aberration. |
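The way a single substitution yields both effects can be sketched by following the phase of the plane wave alone. This is only an outline under stated assumptions: it treats the phase Φ = ω(t − (lx + my + nz)/c) as having the same value in both systems, writes (ξ, η, ζ, τ) for the coordinates of k, uses β for 1/√(1 − v²/c²) as Einstein does, and omits the transformation of the field amplitudes that Einstein also carries out.

```latex
\begin{align*}
t &= \beta\left(\tau + \frac{v\xi}{c^{2}}\right), \qquad
x = \beta\left(\xi + v\tau\right), \qquad y = \eta, \qquad z = \zeta, \\[4pt]
\Phi &= \omega\left(t - \frac{lx + my + nz}{c}\right) \\
     &= \omega\left[\beta\left(1 - \frac{lv}{c}\right)\tau
        - \frac{\beta\left(l - \frac{v}{c}\right)\xi + m\eta + n\zeta}{c}\right] \\
     &= \omega'\left[\tau - \frac{l'\xi + m'\eta + n'\zeta}{c}\right], \\[4pt]
\omega' &= \omega\,\beta\left(1 - \frac{lv}{c}\right), \qquad
l' = \frac{l - v/c}{1 - lv/c}, \qquad
m' = \frac{m}{\beta\left(1 - lv/c\right)}, \qquad
n' = \frac{n}{\beta\left(1 - lv/c\right)}.
\end{align*}
```

Matching the coefficient of τ gives the Doppler shift, and matching the coefficients of ξ, η, ζ gives the aberration. Nothing about the source appears in either step; only the velocity v of k relative to K enters.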
|
|
|
Of course, the attributes of a plane wave near the origin at the given time depend on the position and state of motion of the source at some earlier time (when the light was emitted), and it’s conventional when defining “the” aberration angle to stipulate that the source has been at rest in K throughout that interval of time, but this has no bearing on the derivation as Einstein has described it. The velocity v is explicitly stated to be the observer’s velocity with respect to the frame of reference K in terms of which the plane wave takes a specified form near the origin. This is unobjectionable, and lends no support to the claim that there is “a serious conceptual error” in Einstein’s understanding of aberration. |
|
|
|
Notice that if we assume the source is (and has been) continually at rest in K, then v is also the velocity of the observer relative to the rest frame of the source, but this is just a straightforward logical implication. The assertion that {If the source is continually at rest in K, then the plane wave has the stated form near the origin at the given time} does not imply the converse {If the plane wave has the stated form at the origin at the given time, then the source is continually at rest in K}. If someone is confused about this, then their confusion is over simple logic, not over a subtle point of special relativity. I see no evidence that Einstein was confused about this. |
|
|
|
Given the above results describing how a plane wave transforms from one system of coordinates to another, Einstein enunciates a corollary, saying |
|
|
|
…it follows that if an observer moving with velocity v relatively to an infinitely distant source of light… in such a way that the connecting line “source-observer” makes an angle φ with the velocity of the observer referred to a system of coordinates which is at rest relatively to the source of light… [and] if we call the angle between the wave normal (direction of the ray) in the moving system and the connecting line “source-observer” φ’, the equation for l’ assumes the form |
|
$$\cos\varphi' = \frac{\cos\varphi - v/c}{1 - (v/c)\cos\varphi}$$ |
|
This equation expresses the law of aberration in its most general form. |
|
|
|
The symbols l and l’ were already defined as the direction cosines of the wave normal with respect to the two relatively moving systems of coordinates, so there is nothing new in this equation. Einstein has merely re-stated the previously derived equation, writing the direction cosines explicitly in terms of the angles. |
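As a numerical illustration of the point that the angle form is just the equation for l’ rewritten, the following minimal sketch evaluates both forms and checks that they agree. The function names are hypothetical, and the orbital speed of 29.8 km/s used for the Earth is an assumed round value chosen only to give a familiar order of magnitude.

```python
import math

def transform_wave(omega, l, m, n, v_over_c):
    """Frequency and direction cosines in k, given those in K.

    v_over_c is the speed of k relative to K along +x, in units of c.
    beta is Einstein's symbol for 1/sqrt(1 - v^2/c^2).
    """
    beta = 1.0 / math.sqrt(1.0 - v_over_c ** 2)
    d = 1.0 - l * v_over_c
    return omega * beta * d, (l - v_over_c) / d, m / (beta * d), n / (beta * d)

def aberrated_angle(phi, v_over_c):
    """Angle phi' from the angle form of the aberration law (radians)."""
    c = math.cos(phi)
    return math.acos((c - v_over_c) / (1.0 - v_over_c * c))

# Assumed illustrative values: Earth's orbital speed and a ray at 90 degrees in K.
v_over_c = 29.8 / 299792.458
phi = math.radians(90.0)

omega_p, l_p, m_p, n_p = transform_wave(1.0, math.cos(phi), math.sin(phi), 0.0, v_over_c)
phi_p = aberrated_angle(phi, v_over_c)

norm = l_p ** 2 + m_p ** 2 + n_p ** 2          # should remain a unit vector
shift_arcsec = math.degrees(phi_p - phi) * 3600.0

print(f"l' = {l_p:.12f}, cos(phi') = {math.cos(phi_p):.12f}")
print(f"|direction|^2 = {norm:.12f}")
print(f"aberration at 90 degrees: {shift_arcsec:.2f} arcseconds")
```

With these assumed inputs the shift comes out at about 20.5 arcseconds, the familiar size of annual stellar aberration, and the transformed direction cosines still form a unit vector.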
|
|
|
However, there does appear to be a problem here, because φ is the angle between the light ray and the observer’s velocity, both evaluated with respect to K, and φ’ ought to be this same angle evaluated with respect to k, whereas the words define φ’ as the angle between the light ray and the “connecting line source-observer”. (I bolded these words in the quote above). This doesn’t make any sense, and looks like a transcription error, especially because the words “connecting line source-observer” appear in the immediately preceding sentence as well. The assessment of this as a typo is confirmed by the existence of a pre-print of Einstein’s 1905 paper (in possession of G. Holton) showing some corrections in his own hand, and the phrase “connecting line light source-observer” was crossed out and replaced with “direction of motion”. The page also shows another correction, one that actually found its way into the later published versions, but the above correction evidently did not. In fact, it wasn’t included in the widely re-printed “Methuen/Dover” collection (1923), although it has been noted in other editions, including those of A. I. Miller, John Stachel, and in Einstein’s collected papers. Note also that in Einstein’s review paper of 1907, containing essentially the same derivation, the typo doesn’t appear. |
|
|
|
This typo in the 1905 paper surely doesn’t suggest a “serious conceptual error”, and it is not mentioned by Rothman, et al. In fact, the “Reference Frames” paper doesn’t actually examine Einstein’s paper, so we can only surmise which precise statements they considered objectionable. Presumably they would argue that Einstein blundered in the above passage by specifying the observer’s velocity v relative to the source of light (which is to say, relative to the inertial coordinates in terms of which the source is at rest). Of course, the velocity v has also been clearly specified as being the velocity of k relative to the frame K in terms of which the local plane wave normal direction makes an angle of φ with the observer’s velocity. This is the operative definition of v. It is merely convenient to stipulate that the source is continually at rest in K, so that the direction of the wave normal coincides with the direction of the “source-observer” line as judged from K. |
|
|
|
As is the case throughout Einstein’s paper, unless specifically stated otherwise, objects are considered to be moving inertially, and to have been moving inertially throughout the relevant time period. Hence it is entirely consistent with the rest of the paper to stipulate that the source is un-accelerated for at least the duration of the transit of light from the source to the observer. This is the only way in which it makes sense to categorically identify an object with a system of inertial coordinates, since the latter covers the entirety of space and time. The important point is that, if the source is (and has been since the emission of the light presently reaching the observer) in inertial motion, then the relevant velocity is indeed the velocity of the observer with respect to the rest frame of the source. This ceases to be the case only if the source of light has accelerated during the transit of the light. For example, the components of a binary star system are constantly accelerated, so the direction of the source-observer line at any given time is very different from the direction of the current wave normal as judged from the momentary rest frame of an arbitrarily distant source of light. This does not contradict anything Einstein has said. His description applies to an inertially moving source, which is exactly the context in which it is presented. |
|
|
|
Furthermore, the derivation makes it quite clear that φ and φ’ are the angles, evaluated with respect to K and k respectively, between the wave normal at a specified location and the direction of the velocity of k relative to K. Naturally it isn’t necessary for either K or k to be the rest frame of either the source or the observer. The aberration formula simply relates the specified angle as evaluated with respect to two different systems of inertial coordinates. It is only by convention that we typically define aberration with K and k taken as the rest frames of the source and the receiver, respectively, but this convention (dating back to when Bradley first discovered aberration in 1727, when stars were thought to be stationary) assumes the source is unaccelerated during the transit of light. In that case the wave normal and the line from source to receiver (with respect to the rest frame of the source) are the same. Needless to say, if the source is accelerating, these two directions may be different. This fact is not unique to relativistic theory; it is basic kinematics. |
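That the formula relates any two inertial coordinate systems, with no special role for a source frame, can be checked numerically. The sketch below is an illustration under stated assumptions: three frames K, k and a hypothetical third frame k2 all move along the same x axis, speeds are in units of c, and the function names are invented for the example. Applying the aberration formula twice should agree with applying it once using the relativistically composed speed.

```python
import math

def aberrate(cos_phi, u):
    """Cosine of the wave-normal angle in a frame moving at speed u (units of c)
    along +x relative to the frame in which the cosine is cos_phi."""
    return (cos_phi - u) / (1.0 - u * cos_phi)

def add_velocities(u1, u2):
    """Relativistic composition of two collinear speeds (units of c)."""
    return (u1 + u2) / (1.0 + u1 * u2)

# Assumed illustrative values: a ray at 60 degrees in K, k moving at 0.3c
# relative to K, and k2 moving at 0.5c relative to k.
cos_phi_K = math.cos(math.radians(60.0))
u1, u2 = 0.3, 0.5

two_step = aberrate(aberrate(cos_phi_K, u1), u2)         # K -> k -> k2
one_step = aberrate(cos_phi_K, add_velocities(u1, u2))   # K -> k2 directly

print(f"two steps: {two_step:.12f}")
print(f"one step:  {one_step:.12f}")
```

Neither K, k nor k2 needs to be the rest frame of anything in particular for this agreement to hold; the only inputs are the relative velocities of the coordinate systems.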
|
|
|
By the way, I emphasized the words “infinitely distant” in the above passage, because this condition ensures that any transverse motion of the source has no effect on the direction of the “source-observer” line. Thus the direction of the light ray is unambiguous, independent of any bounded transverse oscillatory motions of the source. The worst we can say about Einstein’s statement of the aberration law is that he expressed it in terms of inertial sources, but this is no evidence of a misunderstanding. The paper is, after all, concerned with the relationship between descriptions of physical processes in terms of relatively moving systems of inertial coordinates. In only one case (the accelerated clock at the equator versus the pole) did Einstein explicitly address accelerated motion in this paper, but the application is straightforward. It’s also worth noting that binary star systems are not the only sources for which the insignificance of bounded oscillatory motion is pertinent, because the same issue arises when considering the agitated motions of the individual atoms comprising any source of electromagnetic waves, not to mention the rotations of macroscopic objects, and so on. Again, this consideration is not unique to relativity. The same kinematic effects must be taken into account when working with pre-relativistic theories of aberration as well. Hence if someone fails to understand this, they are not failing to understand special relativity, they are failing to understand basic kinematics. |
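The role of the “infinitely distant” condition can be illustrated with a purely geometric sketch. It assumes a source displaced transversely by a fixed amount d from its mean position at distance D from the observer, with d and D in the same arbitrary units; the function name and the sample distances are hypothetical.

```python
import math

def direction_offset_arcsec(d, D):
    """Angular change in the source-observer direction caused by a
    transverse displacement d of a source at distance D."""
    return math.degrees(math.atan2(d, D)) * 3600.0

# Assumed sample distances, in units of the transverse displacement.
distances = (1e2, 1e4, 1e6, 1e8)
offsets = [direction_offset_arcsec(1.0, D) for D in distances]

for D, off in zip(distances, offsets):
    print(f"D = {D:.0e} displacements -> offset = {off:.6f} arcseconds")
```

The offset shrinks in proportion to 1/D, so in the limit of an infinitely distant source any bounded transverse motion leaves the direction of the “source-observer” line unchanged.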
|
|
|
In summary, it seems to me that Einstein’s analysis of aberration in his 1905 paper on special relativity is unobjectionable – in fact, it’s fairly elegant, especially in comparison with all earlier (and many later) treatments. Of course, the subjects of relativistic aberration, transverse Doppler, etc., are somewhat subtle, especially if we introduce accelerations of the source, and it’s certainly possible for a reader to become confused when they attempt to apply the formulas for themselves. However, it’s inappropriate to project that confusion onto the paper’s author. |
|
|
|
Ironically, after declaring Einstein guilty of a “serious conceptual error” and of not understanding his own theory as it pertains to aberration, the description of aberration advanced by Rothman, et al, is (in my opinion) not notably perspicuous. They conclude rather laboriously that |
|
|
|
…the parameter v in the Lorentz transformation is interpreted as the difference between the velocity of the earth at the different points in spacetime in which the observations are made. |
|
|
|
Unfortunately this conflates the (albeit conventional) concept of aberration with observed changes in aberration - the issue discussed previously. If an observer alternately slows down and speeds up along a certain line of travel, and the aberration consequently varies between (say) 4 and 5 degrees, then according to Rothman the aberration is actually 1 degree. Clearly the word “aberration” is being used in two different senses. To unravel this, we must ask what it means to say the aberration is varying between 4 and 5 degrees. Absolute aberration is conventionally defined as the difference in angles with respect to the two coordinate systems in terms of which the source and the receiver (respectively) are “at rest”. Bear in mind that aberration applies in all circumstances, not just for light emitted by stars (which Einstein never explicitly mentions), and in many circumstances both the source and the receiver are in known states of fairly uniform motion over the length of time required for the transit of light between them. Of course, at the microscopic level, all real sources of light are accelerating, so the source of light is never continuously at rest with respect to any inertial coordinate system. In view of this, it might seem as if the conventional definition of aberration has no applicability. However, the reason the conventional definition exists is that, in fact, it has a very wide range of useful applicability. |
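The two senses of the word can be separated with a small numerical sketch. It assumes a ray whose wave normal is at 90 degrees to the line of travel in the coordinates in which the source is at rest, and it picks the two observer speeds so that the absolute aberration comes out at 4 and 5 degrees; the function name and the chosen speeds are illustrative only.

```python
import math

def aberration_deg(phi_deg, u):
    """Absolute aberration phi' - phi in degrees, for an observer moving at
    speed u (units of c) relative to the coordinates in which the angle is phi."""
    c = math.cos(math.radians(phi_deg))
    phi_p = math.degrees(math.acos((c - u) / (1.0 - u * c)))
    return phi_p - phi_deg

# At 90 degrees the law reduces to cos(phi') = -u, so u = sin(4 deg) and
# u = sin(5 deg) give aberrations of 4 and 5 degrees respectively.
u_slow = math.sin(math.radians(4.0))
u_fast = math.sin(math.radians(5.0))

a_slow = aberration_deg(90.0, u_slow)
a_fast = aberration_deg(90.0, u_fast)
change = a_fast - a_slow

print(f"absolute aberration (slow): {a_slow:.6f} degrees")
print(f"absolute aberration (fast): {a_fast:.6f} degrees")
print(f"change in aberration:       {change:.6f} degrees")
```

The 4 and 5 degree figures are absolute aberrations referred to the rest coordinates of the source, while the 1 degree figure is the change between two states of the observer. Both are legitimate quantities, but they are not the same quantity.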
|
|
|
Quite generally, sources of light are found to move uniformly on average over some suitable time scale. If a light source is undergoing bounded oscillations with a period that is short compared with the transit time of the light, then the “rest frame of the source” can be well defined in terms of the average state of motion over those oscillations. This is true whether we are talking about the agitated motions of the molecules on the surface of a single radiating body, or the orbital motions of the components of a binary star system. By convention, this is how absolute aberration (as opposed to changes in aberration) is defined. Hence, if we’re trying to express aberration in terms that are recognizable to readers familiar with the conventional definition (as Einstein was trying to do), the velocity parameter v in the aberration equation is to be interpreted as the velocity of the observer with respect to the inertial coordinates in terms of which the source is “at rest” on average over the time scale of the light transit time. Notice that, for example, aberration of the light from one component of a binary star system as judged from a planet orbiting that star would be defined relative to the rest frame of the star itself during a small portion of its orbit, whereas judged from the Earth many light years away it is defined relative to the center of mass of the binary system. The time scales for the light transit are very different in these two cases, so the mean motion of the source is different. |
|
|
|
Admittedly it isn’t logically necessary to define absolute aberration, recognizing that it has no intuitive meaning for objects that are secularly accelerating. We can merely compute how directions are transformed under changes in reference systems (which is after all what Einstein’s derivation does), but in nearly all applications it is both convenient and meaningful to define absolute aberration based on the relative velocity between source and observer, with the understanding that bounded oscillations in the source’s motion, with periods that are small compared with the transit time of the light, are to be averaged out. This applies to all kinds of bounded oscillations, from sub-atomic vibrations to galactic rotations. On this basis, absolute aberration (as conventionally defined) intuitively represents the angle between the actual and the perceived direction of the light. Of course, there are really two different conventions, depending on whether we take the source or the observer to be at rest. This leads to the crucially important fact in special relativity that “transverseness” of motion is not transitive. In a sense, the equivalence of mass and energy can be traced directly to this fact – which of course did not escape Einstein’s notice, and led to a brief follow-up paper in September of 1905. This is why the charge that Einstein misunderstood relativistic aberration strikes me as so absurd. The profound understanding of the relativistic aberration/Doppler effect and its implications was arguably his greatest achievement. |
|
|
|
Regarding the charge that subsequent expositors of special relativity, notably Pauli, have “repeated Einstein’s mistake”, a review of these expositions shows that the confusion is over the meaning of the phrase “Consider a distant source of light at rest in system K”. When Pauli (for example) says this, we are to understand that the source is continually, not just momentarily, at rest in the system K. On this basis, everything that follows is perfectly correct. It would perhaps have been clearer had he included the word “continually”, but in discussions of special relativity, when an object is identified with a reference frame, it is usually understood that the object is unaccelerated for the duration in question, unless specifically stated otherwise. Readers are expected to realize that the wave normal (with respect to K) at some distance away from the source is parallel to the radial direction from the source in K only if the source has been at rest in K during the transit of the wave. Again, this is basic kinematics and geometry, not some abstruse subtlety of relativistic aberration. |
|
|