3.9 Constructing the Principles

Hypotheses are only the pieces of scaffolding which are erected round a building during the course of its construction, and which are taken away as soon as the edifice is completed. To the workman they are indispensable, but he must be careful not to confound the scaffolding with the building.

Goethe, 1832

In an essay on the history of physics included in the book “The Value of Science” published in 1904, Poincare described how, following Newton’s success with celestial mechanics, the concept of central forces acting between material particles was used almost exclusively as the basis for constructing physical theories (the exception being Fourier’s theory of heat). Poincare expressed an appreciation for this constructive approach to physics.

This conception was not without grandeur; it was seductive, and many among us have not finally renounced it; they know that one will attain the ultimate elements of things only by patiently disentangling the complicated skein that our senses give us; that it is necessary to advance step by step, neglecting no intermediary; that our fathers were wrong in wishing to skip stations; but they believe that when one shall have arrived at these ultimate elements, there again will be found the majestic simplicity of celestial mechanics.

Poincare then proceeded to a section called “The Physics of Principles”, where he wrote:

Nevertheless, a day arrived when the conception of central forces no longer appeared sufficient… What was done then? The attempt to penetrate into the detail of the structure of the universe, to isolate the pieces of this vast mechanism, to analyse one by one the forces which put them in motion, was abandoned, and we were content to take as guides certain general principles, the express object of which is to spare us this minute study…

He then listed what he regarded as the most important general principles for physics: conservation of energy, Carnot’s principle of entropy increase, Newton’s principle of equal action and reaction, the principle of relativity, conservation of mass, and the principle of least action.

These principles are results of experiments boldly generalized; but they seem to derive from their very generality a high degree of certainty. In fact, the more general they are, the more frequent are the opportunities to check them, and the verifications multiplying, taking the most varied, the most unexpected forms, end by no longer leaving place for doubt… Thus they came to be regarded as experimental truths; the conception of central forces became then a useless support, or rather an embarrassment, since it made the principles partake of its hypothetical character.

Einstein is known to have been an avid reader of Poincare’s writings, so it seems likely that he would have read Poincare’s views about “the physics of principles” as distinct from constructive theories. Following the sensational reports of the eclipse observations of 1919 Einstein instantly became a world-wide celebrity, and suddenly there was intense public interest in everything having to do with “Einstein’s theory”. The London Times asked him to explain his mysterious theory to its readers. He accommodated with a short essay that is notable for its description of what he regarded as two fundamentally different kinds of physical theories, echoing Poincare’s previous discussion. Einstein wrote:

We can distinguish various kinds of theories in physics. Most of them are constructive. They attempt to build up a picture of the more complex phenomena out of the materials of a relatively simple formal scheme from which they start out. Thus the kinetic theory of gases seeks to reduce mechanical, thermal, and diffusional processes to movements of molecules -- i.e., to build them up out of the hypothesis of molecular motion. When we say that we have succeeded in understanding a group of natural processes, we invariably mean that a constructive theory has been found which covers the processes in question.

Along with this most important class of theories there exists a second, which I will call "principle-theories." These employ the analytic, not the synthetic, method. The elements which form their basis and starting-point are not hypothetically constructed but empirically discovered ones, general characteristics of natural processes, principles that give rise to mathematically formulated criteria which the separate processes or the theoretical representations of them have to satisfy. Thus the science of thermodynamics seeks by analytical means to deduce necessary conditions, which separate events have to satisfy, from the universally experienced fact that perpetual motion is impossible.

The advantages of the constructive theory are completeness, adaptability, and clearness, those of the principle theory are logical perfection and security of the foundations. The theory of relativity belongs to the latter class.

He mentions three sets of alternative characteristics, all treated as representing essentially the same dichotomy. We're told that constructive theories proceed synthetically on the basis of hypothetical premises, whereas principle theories proceed analytically on the basis of empirical premises. He cites statistical thermodynamics as an example of a constructive theory, and classical thermodynamics as an example of a principle theory. His view of these two different approaches to thermodynamics may have been influenced by the debate concerning the reality of atoms, which Mach disdainfully called the "atomistic doctrine". The idea that matter is composed of finite irreducible entities was regarded as purely hypothetical, even in the late 1800s. Einstein himself spent a great deal of time and effort trying to establish the reality of atoms. This was the explicit motivation for his paper on Brownian motion. Within this context, it's not surprising that he classified the premises of statistical thermodynamics as purely hypothetical, and the development of the theory as synthetic.

However, in another sense, it could be argued that the idea of atoms actually arises empirically, and represents an extreme analytic approach to observed phenomena. Literally the analytic method is to "take apart" the subject into smaller and smaller sub-components, until arriving at the elementary constituents. We regard macroscopic objects not as indivisible wholes, but as composed of sub-parts, each of which is composed of still smaller parts, and we continue this process of analysis at least until we can no longer directly resolve the sub-parts (empirically) into smaller entities. At this point we may resort to some indirect methods of inference to carry on the process of empirical analysis. Indeed, Einstein's work on Brownian motion did exactly this, insofar as he was attempting to analyze the smallest directly observable entities, and to infer, based on empirical observations, an even finer level of structure. It was apparently Einstein's view that, at this stage, a reversal of methodology is required, because direct observation no longer provides unique answers, and thus the inferences are necessarily indirect, i.e., they can only proceed by adopting a somewhat free hypothesis about the underlying structure, then synthetically working out the observable implications of this hypothesis, and comparing these with what we actually observe.

So Einstein's conception of a constructive (hypothetically based, synthetic) physical theory was of a theory arrived at by hypothesizing or postulating some underlying structure (consistent with all observations, of course), and then working out the logical consequences of those postulates to see how well they account for the whole range of observable phenomena. At this point we might expect Einstein to classify special relativity as a constructive theory, because it's well known that the whole theory of special relativity - with all its observable consequences - can be constructed synthetically based on the exceedingly elementary hypothesis that energy has inertia, from which it follows that inertial coordinate systems are related by Lorentz transformations. However, Einstein's whole point in drawing the distinction between constructive theories and principle theories was to argue that relativity is not a constructive theory, but is instead a theory of principle. He took the reverse order of implication, by postulating that inertial coordinate systems are related by Lorentz transformations, and then inferring the inertia of energy.

It's clear that Einstein's original conception of special relativity was based on the model of classical thermodynamics, even to the extent that he proposed exactly two principles on which to base the theory, consciously imitating the first and second laws of thermodynamics. Some indication of the ambiguity in the classification scheme can be seen in the various terms that Einstein applied to these two propositions. He variously referred to them as postulates, principles, stipulations, assumptions, hypotheses, definitions, etc. Now, recalling that a "constructive theory" is based on hypotheses, whereas a "principle theory" is based on principles, we can see that the distinction between principles and postulates (hypotheses) is significant for correctly classifying a theory, and yet Einstein was not very careful (at least originally) to clarify the actual role of his two foundational propositions.

Nevertheless, he consistently viewed special relativity as a theory of principle, with the invariance of light speed playing a role analogous to the conservation of energy in classical thermodynamics, both regarded as high-level empirical propositions rather than low-level elementary hypotheses. Indeed, it's possible to make this more than just an analogy, because in place of the invariance of light speed (with respect to all inertial coordinate systems) we could just as well posit conservation of total mass-energy (with the conversion E = mc²), and use this conservation, together with the original principle of relativity (essentially carried over from Newtonian physics), as the basis for special relativity. In his 1949 autobiographical notes (which he wryly called his "scientific obituary") Einstein wrote that

Gradually I despaired of the possibility of discovering the true laws by means of constructive efforts based on known facts. The longer and the more desperately I tried, the more I came to the conviction that only the discovery of a universal formal principle could lead us to assured results…

Here Einstein refers to "constructive efforts based on known facts", whereas in the 1919 article he indicated that constructive theories are based on "a relatively simple formal scheme" such as the hypothesis of molecular motion (i.e., the atomistic doctrine that Mach, for one, rejected as unempirical), and principle theories are based on empirical facts. In other words, the distinguishing characteristics that Einstein attributed to the two kinds of theories have been reversed. This illustrates one of the problematic aspects of Einstein's classification scheme: every (nominally viable) theory is based on what might be called known facts, i.e., it is connected to empirical results, and our principles (e.g., conservation laws) are also distillations of empirical results. Einstein was certainly well aware of this, as shown by the following comment in defense of his methodological approach:

A basic conceptual distinction, which is a necessary prerequisite of scientific and pre-scientific thinking, is the distinction between "sense-impressions" (and the recollection of such) on the one hand and mere ideas on the other. There is no such thing as a conceptual definition of this distinction (aside from circular definitions, i.e., of such as make a hidden use of the object to be defined). Nor can it be maintained that at the base of this distinction there is a type of evidence, such as underlies, for example, the distinction between red and blue. Yet, one needs this distinction in order to be able to overcome solipsism.

In view of this, what ultimately is the distinction between what Einstein called constructive theories and principle theories? It seems that the distinction can only be based on the conceptual level of the hypotheses, so that constructive theories are based on "low level" hypotheses, and principle theories are based on "high level" hypotheses. In this respect the original examples (classical thermodynamics and statistical thermodynamics) cited by Einstein are probably the clearest, because they represent two distinct approaches to essentially the same subject matter. In a sense, they can be regarded as just two different interpretations of a single theory (much as special relativity and Lorentz's (amended) ether theory can be seen as two different interpretations of the same theory). Now, statistical thermodynamics was founded on hypotheses − such as the existence of atoms − that may be considered "low level", whereas the hypothesis of energy conservation in classical thermodynamics can plausibly be described as "high level". On the other hand, the premises of statistical thermodynamics include the idea that the molecules obey certain postulated equations of motion (e.g., Newton's laws) which are essentially just expressions of conservation principles, so the "constructive" approach differs from the "theory of principle" only in so far as its principles are applied to very low-level entities. The conservation principles are explicitly assumed only for elementary molecules in statistical thermodynamics, and then they are inferred for high-level aggregates like a volume of gas. In contrast, the principle theory simply observes the conservation of energy at the level of gases, and adopts it as a postulate.

In the case of special relativity, it's clear that Einstein originally developed the theory from a "high-level" standpoint, based on the observation that light propagates at the same speed with respect to every system of inertial coordinates. He himself felt that a constructive model or interpretation for this fact was lacking. In January of 1908 he wrote to Sommerfeld

A physical theory can be satisfactory only if its structures are composed of elementary foundations. The theory of relativity is ultimately just as unsatisfactory as, for example, classical thermodynamics was before Boltzmann interpreted entropy as probability.

Just eight months later, Minkowski delivered his famous lecture at Cologne, in which he showed how the theory of special relativity follows naturally from just a simple fundamental hypothesis about the physical metric of space and time. In one sense, there can hardly be a lower conceptual level than this, i.e., some assumption about the physical metric of space and time is seemingly a pre-requisite for any description − scientific or otherwise − of the phenomena of our experience. Kant even went further, and suggested that one particular metrical structure (Euclidean) was a sine qua non of rational thought. We no longer subscribe to such a restrictive view, and it may even be possible to imagine physical ideas prior to any spatio-temporal conceptions, but nevertheless the fact remains that such conceptions are among the most primitive that we possess. For example, the posited structure of space and time is more primitive than the notion of atoms moving in a void, because we cannot even conceive of "moving in a void" without some idea of the structure of space and time. Hence, if a physical theory can be based entirely on nothing other than the hypothesis of one simple form for the natural measures of space and time, such a theory could be regarded as "constructive". From this point of view, we might say that Minkowski’s spacetime interpretation does for special relativity what Boltzmann’s statistical interpretation did for thermodynamics, namely, it provides an elementary constructive foundation for the theory.

Einstein was not immediately very appreciative of his former instructor's contribution, describing it as "superfluous learnedness", and joking that "since the mathematicians have tackled the relativity theory, I myself no longer understand it any more". He seems to have been at least partly serious when he later said "The people in Gottingen [where both Minkowski and Hilbert resided] sometimes strike me not as if they wanted to help one formulate something clearly, but as if they wanted only to show us physicists how much brighter they are than we". Of course, Einstein's appreciation subsequently increased when he found it necessary to use Minkowski's conceptual framework in order to develop general relativity. Still, even in his autobiographical notes, Einstein gives no indication that he thought Minkowski’s approach represented a profound transformation of special relativity.

Minkowski's important contribution to the theory lies in the following: Before Minkowski's investigation it was necessary to carry out a Lorentz transformation on a law in order to test its invariance under Lorentz transformations; but he succeeded in introducing a formalism so that the mathematical form of the law itself guarantees its invariance under Lorentz transformations.

In other words, Minkowski's contribution was merely the introduction of a convenient mathematical formalism. Einstein then added, almost as an afterthought,

He [Minkowski] also showed that the Lorentz transformation (apart from a different algebraic sign due to the special character of time) is nothing but a rotation of the coordinate system in the four-dimensional space.

Actually it was Poincare who first noted (in his Palermo paper, published in 1906) that the Lorentz transformation is simply a rotation in the four-dimensional space consisting of the three space dimensions and an imaginary time dimension. Poincare had also noted the invariance of the quadratic form x²+ y²+ z²− c²t², although he didn’t fully develop the pseudo-metrical structure to the extent that Minkowski did. Whether this represents a profound insight or a convenient formalism is debatable.
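The two observations credited to Poincare here can be checked numerically. What follows is only a minimal sketch under stated assumptions, not a reconstruction of anything in the Palermo paper: the function names (boost, interval, boost_as_rotation) are hypothetical, the relative motion is taken along the x axis, and units with c = 1 are assumed by default. The rotation is taken in the plane of x and the imaginary coordinate ict, through the imaginary angle iφ with tanh φ = v/c.

```python
import cmath
import math


def boost(t, x, v, c=1.0):
    """Standard Lorentz boost along x with relative speed v; returns (t', x')."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (t - v * x / c**2), gamma * (x - v * t)


def interval(t, x, y, z, c=1.0):
    """The quadratic form x^2 + y^2 + z^2 - c^2 t^2."""
    return x * x + y * y + z * z - (c * t) ** 2


def boost_as_rotation(t, x, v, c=1.0):
    """The same boost, written as an ordinary rotation of the pair (x, w),
    where w = i*c*t is the imaginary time coordinate and the rotation
    angle is the imaginary number i*phi, with tanh(phi) = v/c."""
    phi = math.atanh(v / c)
    theta = 1j * phi                      # imaginary rotation angle
    w = 1j * c * t                        # imaginary time coordinate
    x_new = x * cmath.cos(theta) + w * cmath.sin(theta)
    w_new = -x * cmath.sin(theta) + w * cmath.cos(theta)
    # Convert w' = i*c*t' back to a real time; imaginary parts cancel.
    return (w_new / (1j * c)).real, x_new.real
```

Calling boost and boost_as_rotation on the same event should give the same transformed coordinates up to rounding, and interval should return the same value before and after the transformation. The y and z coordinates are untouched by a boost along x, so they are passed to interval unchanged.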

The modern theory of special relativity passed through several phases of development. The first step (1892-1904) was the discovery of the Lorentz invariance of Maxwell’s equations, which might be regarded as constructive scaffolding. However, it should be remembered that those equations, as received by Maxwell, were not Lorentz invariant. They became Lorentz invariant only with the addition of the “displacement current” to Ampere’s law. Whether this term was added constructively or as a matter of principle is unclear. Some argue that it arose constructively from a mechanistic model of the luminiferous ether that Maxwell had in mind at the time (and later abandoned), while others argue that he was motivated by the fact that the term is necessary to satisfy the principle of charge conservation (which of course it is). Even if we stipulate that the Lorentz invariance of electromagnetism was discovered constructively, this by itself does not constitute a theory of relativity, because on this basis we would expect a divergence between the electromagnetic and the mechanical invariance groups. It was only the failure of all attempts to discern any such difference that motivated the sweeping high-level postulate of complete relativity, i.e., Lorentz invariance for all physical effects, including mechanical inertia. At the time, this might have been seen as a constructive step, because some researchers thought all inertia might be electromagnetic in origin, but we now know this is not the case. Thus the only valid warrant that Lorentz and his contemporaries had for postulating Lorentz invariance of mechanical inertia was the relativity principle itself. Although often advertised as a constructive theory, Lorentz’s theory actually achieved its mature form (and consistency with all experimental results) only by positing the principle of relativity, albeit in a convoluted way that obscured the foundations of the theory.
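The remark that the displacement current is necessary for charge conservation can be made explicit. The following short derivation is a sketch in modern vector notation and SI units, which are assumptions of presentation (Maxwell's own formulation was different). It uses only Gauss's law and the identity that the divergence of a curl vanishes.

```latex
% Ampere's law without the displacement current:
\nabla \times \mathbf{B} = \mu_0 \mathbf{J}
\quad\Longrightarrow\quad
0 = \nabla \cdot (\nabla \times \mathbf{B}) = \mu_0 \, \nabla \cdot \mathbf{J}
% so the current density must be divergence-free, which holds only for
% steady currents and fails whenever the charge density changes.

% With the displacement current added:
\nabla \times \mathbf{B}
  = \mu_0 \mathbf{J} + \mu_0 \varepsilon_0 \frac{\partial \mathbf{E}}{\partial t}

% Taking the divergence and using Gauss's law,
% \nabla \cdot \mathbf{E} = \rho / \varepsilon_0 :
0 = \mu_0 \, \nabla \cdot \mathbf{J}
    + \mu_0 \varepsilon_0 \frac{\partial}{\partial t} (\nabla \cdot \mathbf{E})
  = \mu_0 \left( \nabla \cdot \mathbf{J} + \frac{\partial \rho}{\partial t} \right)

% which is the continuity equation expressing local charge conservation:
\nabla \cdot \mathbf{J} + \frac{\partial \rho}{\partial t} = 0
```

The derivation shows only that the added term is required for consistency with charge conservation. It does not settle the historical question of which consideration actually motivated Maxwell.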

The second phase was Einstein's explicit theory of principle (1905), in which he isolated the small set of premises sufficient to imply the Lorentz invariance of all physical phenomena, highlighted the complete symmetry between relatively moving systems of inertial coordinates (rendering superfluous the stipulation of a hypothetical ether frame), and connected this with the inertia of all forms of energy, showing how all of this follows from a clear understanding of the inertial measures of space and time. With the latter he unified two of Poincare’s original fundamental principles, conservation of energy and conservation of mass, into one. In fact, excluding Carnot’s principle, one could argue that special relativity unifies all of the principles that Poincare listed. The inertia of energy, which was noticed only belatedly by Einstein in a follow-up paper in September of 1905, could plausibly have served as a constructive basis for the theory, but Einstein preferred to see it as the most important consequence of the theory.

The third phase was Minkowski's spacetime theory (1908), which might be regarded as a constructive theory if we take the invariant spacetime interval with Minkowski metric as fundamental. This is sometimes even presented as a sufficient explanation for complete Lorentz invariance. As Minkowski put it, when physics is formulated in accordance with the world-postulate “the disturbing lack of harmony between Newtonian mechanics and modern electrodynamics disappears of its own accord”. However, despite its intuitive appeal, this interpretation is problematic from an epistemological standpoint (just as are attempts to found physical geometry on a priori properties of space), since we arguably have no direct experience of space and time intervals, but only of the behavior of mass-energy in terms of coordinate systems (labels) that are assigned based on standard processes that are themselves operations involving the behavior of mass-energy. In view of this, we may say that the Minkowski metric of spacetime merely encodes in a convenient form the physical fact of Lorentz invariance (i.e., the inertia of energy).

The fourth phase was the development of relativistic quantum mechanics, beginning with the Dirac equation (1928) and quantum field theory (discussed in Sections 9.4 and 9.10), according to which the advance of the quantum phase of any physical system is proportional to the lapse of proper time, as given by the Minkowski metric. Each stage represented a significant advance in clarity, with the end result being the establishment of a new fundamental principle – Lorentz invariance – which can be constructively understood as a consequence of the inertia of energy.
