On Olbers’ Paradox


Prior to the Copernican revolution, the stars were commonly supposed to be embedded in a sphere that rotated once each day, with the motionless Earth at its center. This conception of the heavens entailed a finite number of stars distributed more or less uniformly over the surface of the celestial sphere. However, from the Copernican point of view it’s much more natural to conceive of each star (like the Sun) occupying a place in a three-dimensional space. This shift in the conception of the heavens corresponds to the change in the analytical basis of most scientific thought from circular/angular to rectilinear, as in the evolving concepts of inertia and natural motion. This, in turn, leads to the idea of stars (perhaps of infinite age) being distributed more or less uniformly within an infinite three-dimensional space. The first to comment on this conception of the universe seems to have been Thomas Digges in 1576, who also noted an apparent paradox that arises from this conception.


The solid angle subtended by any given star is inversely proportional to the square of the distance, but the number of stars at a distance between r and r + Δr is proportional to the square of the distance r. From this it seems to follow that each incremental distance Δr covers the same fraction of the sky with the surface of a star, and since there are infinitely many such increments (extending to infinite distances), the entire sky ought to be as bright as the Sun. Digges thought that perhaps some stars are simply too distant to see, whereas Kepler argued in 1610 for a spatially finite universe. Subsequent writers, such as Halley (1705), Cheseaux (1744), and Olbers (1826) thought the paradox might be resolved by interstellar dust obscuring the most distant stars, but in 1831 Herschel pointed out that the dust would simply be heated until it glowed as brightly as the stars.
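
The counting argument above can be written out explicitly. The following is only a sketch under assumed notation that does not appear in the original argument: n is a uniform number density of stars, R is a common stellar radius, and overlaps between stellar disks are ignored. With r much larger than R, the fraction of the whole sky (4π steradians) covered by the stars in one shell is

```latex
\[
\Delta f
  \;=\; \frac{1}{4\pi}\,
        \underbrace{\left(4\pi r^{2}\, n\, \Delta r\right)}_{\text{stars in the shell}}
        \;\underbrace{\frac{\pi R^{2}}{r^{2}}}_{\text{solid angle per star}}
  \;=\; n\,\pi R^{2}\,\Delta r ,
\]
```

which is independent of r, so every shell of the same thickness contributes the same coverage.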


Oddly enough, it seems that no one carried Herschel’s thought process to its conclusion, which is that the sky should be infinitely bright. If we neglect the blocking or absorption of light by stars, then the intensity of light reaching us from each concentric shell of thickness Δr is the same, and there are infinitely many such shells, so the intensity of light impinging on each point in space would be infinite. Now, if we allow for some absorption of light by other stars, we see that Herschel’s argument leads to the conclusion that this absorption can only lead to the ultimate equilibrium condition in which each star is infinitely bright. This should come as no surprise, since the premise of the paradox is that the stars are absolute sources of energy, and have been in existence for an infinitely long time, so they must have introduced infinite energy into the universe. Needless to say, these premises violate the conservation of energy, but that principle wasn’t formulated until the mid-19th century, and the energy source of stars wasn’t understood until the 20th century. Recognizing that all the mass-energy which is emitted from a star over its lifetime actually was put into the star during its formation leads to an immediate resolution of the paradox, even in an infinitely old universe. (This is discussed further below.)
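
The difference between the two bookkeeping schemes (shells that never block one another versus shells in which nearer stars hide farther ones) can be illustrated numerically. This is a minimal sketch of the geometry only, not of Herschel’s heating argument. The function name and the parameter n_sigma (number density times stellar cross-section) are hypothetical, introduced just for this illustration.

```python
def accumulate_shells(n_sigma, dr, shells):
    """Sum shell contributions out to a distance of shells * dr.

    n_sigma is an assumed (hypothetical) parameter: the number density of
    stars times the cross-sectional area of one star, in units of 1/length.
    Returns (unblocked, covered):
      unblocked: total light if nearer stars never block farther ones,
                 in units where one full sky of stellar surface equals 1
      covered:   fraction of the sky covered when nearer stars do block
    """
    unblocked = 0.0
    covered = 0.0
    for _ in range(shells):
        step = n_sigma * dr                  # the same for every shell
        unblocked += step                    # grows without bound
        covered += (1.0 - covered) * step    # only uncovered sky can be filled
    return unblocked, covered


if __name__ == "__main__":
    for shells in (10, 100, 1000, 10000):
        print(shells, accumulate_shells(0.01, 1.0, shells))
```

In this sketch the unblocked sum grows in proportion to the number of shells, whereas the covered fraction approaches but never exceeds 1, which corresponds to a sky uniformly as bright as a stellar surface.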


Olbers, the German physician after whom the paradox is named, interpreted the situation in terms of “visual rays”, arguing that every visual ray must intersect with the surface of a star. It’s interesting that this conception became the standard way of expressing the paradox, considering that the concept of visual rays had long since been discarded in the field of optics. It harkens back to the ancient notion that sight originates in the eye, which reaches out with visual rays to “touch” distant objects. It is perhaps not surprising that this formulation is the most effective for stating the paradox, because it most effectively misdirects our attention, and obscures the crucial facts about light and energy transfer that resolve the paradox. In particular, the finite propagation speed of light (and of everything else) is the basic fact that was not known to Digges and Kepler. On the other hand, it certainly was (or should have been) known to Cheseaux and Olbers, but even with this knowledge, there remains a paradox if we posit stars with infinite lives.


In 1901 Kelvin argued that the solution of the paradox was the finite life span of the stars. Ironically, this was an expression of a more general belief of Kelvin’s that bodies including the Earth could not have existed for more than about 100 million years, based on the amount of heat flow from the interior. He engaged in a spirited debate with geologists and biologists, who assumed that the Earth had existed for billions of years. We now know Kelvin was quite wrong in his estimates of the age of the Earth, largely because he didn’t account for the heat of radioactive elements within the Earth. Likewise his lack of understanding of the nuclear reactions powering the stars made his estimates of the life spans of stars far too short. Nevertheless, he was correct in his belief that the life spans were finite, and this is enough to resolve Olbers’ paradox, even in an infinite universe with uniformly distributed stars at all times.


Many different resolutions of Olbers’ paradox have been proposed. Some have argued that the answer is the finite age (and/or size) of the universe. Others argued that it is the finite life spans of stars. Some claim the expansion of the universe is the answer, red-shifting the light from the most distant stars. Still others (e.g., Edgar Allan Poe) have pointed to the hierarchical clustering of stars, galaxies, galaxy clusters, and so on, to resolve the paradox. (This suggestion has sometimes been erroneously attributed to Mandelbrot.) The night sky actually does glow from the direction of the Milky Way when viewed from our perspective out near the rim, and presumably the entire sky would be illuminated if the stars everywhere had the density of the center of a galaxy. Hence there is certainly some truth to this hierarchical explanation. It’s also worth remembering that the entire sky actually does radiate, in the microwave regime, so this too shows that Olbers’ paradox is not entirely an idle fallacy. Oddly enough, the principle of conservation of mass-energy has rarely if ever been invoked in these discussions.


Clearly the paradox relies on several premises, and so, in the abstract, it can be resolved by the denial of any one or more of those premises. Hence it isn’t surprising that there have been multiple suggestions for how to resolve the paradox. The question of which premise(s) are to be denied in the real world must be answered empirically. (This applies even to the conservation of energy, which is after all an empirically based principle.) At present the most widely accepted resolution of Olbers’ paradox is based on denying the infinite age of the universe. Along with this, some authors also mention the expansion of the universe with the consequent red-shifting, but if the universe is closed and destined to enter a collapsing phase, would those authors claim the sky will shine brightly during that phase?


It’s worth noting that the resolution based on conservation of energy does not require a finite age for the universe, nor even a finite age for “stars”. It merely requires that all the mass-energy leaving a star must have first entered the star. In other words, we deny that stars are absolute sources of energy. They are merely concentrations of energy. To examine in more detail how this resolves the paradox, let us first consider an infinite and eternal universe filled with an omni-directional flux of point-like particles, each moving with the speed c in some direction chosen from a uniform distribution. To make this more definite, suppose we construct a Cartesian coordinate system [x,y,z] throughout the entire infinite space, and partition the space into cubical regions Δx, Δy, Δz where each of these increments has length L. At some time t, we posit that within each of these cubical regions there are exactly N particles, each moving with speed c in a direction selected at random from a uniform distribution. Thus, each of these small regions can be regarded as a source of radiation, since all the particles contained in the region will emerge at the speed c.


Now let’s consider some arbitrary small region in space at an arbitrary time in the future, and ask how many particles will be contained within that region at that time. Every other region of space will contribute some expected number of particles, and the expected number received from each region will be proportional to the inverse square of the distance. However, as Digges observed, the number of regions at a certain distance is proportional to the square of the distance, so each incremental distance contributes the same expected number of particles, and hence (for a time infinitely in the future) the number of particles in our selected region is infinite, just as Olbers’ paradox suggests that the night sky should be infinitely bright. But of course this reasoning is fallacious. It’s true that each incremental distance contributes the same number of expected particles, but these contributions are not mutually exclusive. In fact, they are the very same particles, passing through successively closer and closer regions. The particles in the region near the origin (for example) were c meters away one second previously, and those same particles were 2c meters away two seconds previously, and so on. Conversely, it is true that the regions at each distance at a given instant t contain an equal expected number of particles destined to reach our selected region, but they will not reside in the selected region simultaneously. They will pass through the region sequentially. The overall density of particles is conserved, as is the finite intensity of the flux at every point. This shows, in a simplified way, why Olbers’ paradox is fallacious, even in an infinitely old universe of infinite extent and with “stars” that are infinitely old. In this model, each small region is regarded as a mild “star”, meaning that it emits particles, but it also absorbs particles at the same rate.
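
The claim that the count in any given region stays finite can be checked with a crude simulation. The following is only a sketch under simplifying assumptions that are not part of the model described above: it works in two dimensions rather than three, it replaces the infinite space by a finite grid with periodic (wrap-around) boundaries, and the function name and default parameter values are hypothetical.

```python
import math
import random


def simulate_counts(grid=8, n_per_cell=50, speed=1.0,
                    times=(0.0, 1.0, 5.0, 25.0), seed=1):
    """Return {time: grid x grid table of particle counts per unit cell}.

    Assumptions (not from the text): 2D instead of 3D, cells of edge 1,
    and a periodic box standing in for infinite space.
    """
    rng = random.Random(seed)
    particles = []
    for i in range(grid):
        for j in range(grid):
            for _ in range(n_per_cell):
                theta = rng.uniform(0.0, 2.0 * math.pi)   # uniform direction
                particles.append((i + rng.random(), j + rng.random(),
                                  math.cos(theta), math.sin(theta)))
    history = {}
    for t in times:
        counts = [[0] * grid for _ in range(grid)]
        for x0, y0, ux, uy in particles:
            # Free streaming at constant speed, wrapped into the box.
            x = (x0 + speed * t * ux) % grid
            y = (y0 + speed * t * uy) % grid
            # The extra "% grid" guards against x or y rounding up to grid.
            counts[int(x) % grid][int(y) % grid] += 1
        history[t] = counts
    return history


if __name__ == "__main__":
    for t, counts in simulate_counts().items():
        print(t, counts[0][0], sum(map(sum, counts)))
```

In this sketch the total number of particles is fixed by construction, and the count in any one cell merely fluctuates about N as particles stream through it; it does not accumulate with time.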


It might be objected that this simplified model doesn’t accurately represent real stars, because the inflow and outflow of mass-energy for each region of space is always equal, whereas for real physical stars there is a period of formation during which the inflow exceeds the outflow, and then a period of shining, and then perhaps a period of going supernova when the material is re-emitted. However, the simplified model actually does not entail equal inflow and outflow, because it is inherently discrete, so the rates are zero most of the time, and then infinite (positive or negative) at each instant when a particle crosses a boundary. Of course, these average out to fairly uniform and equal flows over some suitable spatial and temporal intervals, but the same applies to actual regions of space containing real physical stars. On a large enough scale, and for sufficiently long periods of time, it can be argued that the inflow and outflow rates average out… as they must if mass-energy is conserved.


In terms of the simplified model, it’s interesting to consider how the density profile would evolve over time. We postulate that at the initial time t = 0 each cubical region of space contains exactly N particles, and the directions of their motions are randomly selected from a uniform distribution, but the finite number of particles in each cell implies that the directions don’t exhibit perfect spherical symmetry. In fact, a small number of cells will contain distinctly non-symmetrical sets of directions. Hence, as we extrapolate forward in time, it will no longer always be true that each cubical region contains N particles. Some regions will contain more, and others less. The transient concentrations could be regarded as analogs of “stars” in this simple model. What is the long-term distribution of particles? Is there an asymptotic variance, or does the minimum scale on which the distribution is uniform increase indefinitely?
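
One way to begin exploring these questions is to measure the variance of the cell counts at various times. The sketch below does not answer the questions posed above: it uses a small periodic box in place of infinite space, so its long-term behavior need not match the infinite model. The function name, grid size, and other defaults are hypothetical choices made for illustration.

```python
import math
import random
from statistics import pvariance


def cell_count_variance(t, grid=6, n_per_cell=20, speed=1.0, seed=2):
    """Variance of the number of particles per unit cubical cell at time t.

    Assumption (not from the text): a periodic grid x grid x grid box
    stands in for infinite space.
    """
    rng = random.Random(seed)
    counts = {}
    for i in range(grid):
        for j in range(grid):
            for k in range(grid):
                for _ in range(n_per_cell):
                    # Uniform direction on the sphere: uniform z-component
                    # and uniform azimuth.
                    uz = rng.uniform(-1.0, 1.0)
                    phi = rng.uniform(0.0, 2.0 * math.pi)
                    s = math.sqrt(1.0 - uz * uz)
                    direction = (s * math.cos(phi), s * math.sin(phi), uz)
                    start = (i + rng.random(), j + rng.random(),
                             k + rng.random())
                    # Cell index after free streaming for time t (wrapped).
                    cell = tuple(int((p + speed * t * u) % grid) % grid
                                 for p, u in zip(start, direction))
                    counts[cell] = counts.get(cell, 0) + 1
    all_counts = [counts.get((i, j, k), 0)
                  for i in range(grid)
                  for j in range(grid)
                  for k in range(grid)]
    return pvariance(all_counts)


if __name__ == "__main__":
    for t in (0.0, 0.25, 1.0, 3.0, 10.0):
        print(t, cell_count_variance(t))
```

At t = 0 the variance is zero, since every cell holds exactly N particles. For comparison, if the particles were placed independently and uniformly at random, the count in each cell would be roughly Poisson distributed, with variance close to N.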


On a related subject, there’s an interesting fact about random placements and the effect of re-scaling. First, consider a regular placement of points on the plane, such as is given by repeating the “square” pattern shown below.

[Figure: points arranged in a regular square pattern]



If we re-scale the vertical axis, shrinking all the vertical distances, the pattern of points changes accordingly, as the original square pattern changes to a rectangular pattern, as shown below.

[Figure: the same points after vertical compression, forming a rectangular pattern]



Of course, the mean “partial density” is necessarily the same in both directions, because the number of particles per unit distance along one axis (for a fixed unit of distance along the other axis) is the same. Nevertheless, this regular arrangement gives distinct patterns depending on the scaling of the axes. In contrast, if we postulate a random arrangement of particles with positions drawn from a uniform distribution, we find that this continues to be a valid description of the pattern, regardless of re-scaling along any axis. To determine the placements of N points in a typical region, we can consider a square region with edge lengths L such that N is fairly large, and then draw N real numbers in the range 0 to L from a uniform distribution over this interval. Call these the x coordinates of the points. Then draw another N real numbers in the same range from a uniform distribution, and call these the y coordinates of the points. This can be repeated for neighboring square regions to cover the entire plane with any given density of particles. (Because of boundary effects, the results of piecing four L x L squares together versus doubling L for a single square with 4N particles, are not exactly identical, but they converge as L increases.)
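
The placement procedure just described can be sketched in a few lines. The function names and the seed parameter are hypothetical, introduced only for this illustration.

```python
import random


def random_points(n, edge, rng):
    """Draw n points in an edge x edge square: n uniform x coordinates,
    then n uniform y coordinates, paired up in order."""
    xs = [rng.uniform(0.0, edge) for _ in range(n)]
    ys = [rng.uniform(0.0, edge) for _ in range(n)]
    return list(zip(xs, ys))


def tile_plane(tiles_per_side, n, edge, seed=0):
    """Repeat the procedure for neighboring squares, covering a larger
    square region with n points in each edge x edge tile."""
    rng = random.Random(seed)
    points = []
    for a in range(tiles_per_side):
        for b in range(tiles_per_side):
            for x, y in random_points(n, edge, rng):
                points.append((a * edge + x, b * edge + y))
    return points


if __name__ == "__main__":
    pts = tile_plane(tiles_per_side=2, n=100, edge=1.0)
    print(len(pts), pts[:3])
```

Here each tile receives exactly n points, which is the source of the boundary effects mentioned above: a single larger square with the same total number of points would let the number in each sub-square fluctuate.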


If we now re-scale the vertical axis, as we did previously with the rectangular array of particles, the resulting configuration is still randomly distributed in both directions, and the mean partial densities in both directions have increased in the same proportion. This increased density in the vertical direction is self-evident from the fact that we have reduced all the distances in that direction. It may be slightly less obvious that the density in the orthogonal direction increases in the same proportion, but this follows because more particles have been brought into the unit square by the contraction in the vertical direction, and each of these has an x coordinate as well as a y coordinate. The same was true in the case of regular arrangements, i.e., the mean densities remained equal in both directions, but the basic pattern became distinctly rectangular instead of square. In contrast, the x and y coordinates of randomly arranged particles continue to be describable as having been drawn randomly from uniform distributions in both directions. Therefore, in this formal sense, a random uniform arrangement of points is invariant under re-scaling in any direction. The density changes, but the configuration remains isotropic.
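
The effect of re-scaling on a random arrangement can be checked numerically. This is a rough sketch rather than a proof. The function name, the squeeze factor k, and the other defaults are hypothetical. It counts the points in the unit square before and after shrinking the vertical axis by the factor k, and then checks that the points now in the unit square are still split about evenly between its left and right halves and between its lower and upper halves.

```python
import random


def density_after_squeeze(k=4.0, density=500.0, width=4.0, height=8.0,
                          seed=3):
    """Return (before, after, left_fraction, lower_fraction).

    before / after: number of points in the unit square before and after
    dividing every y coordinate by k.
    left_fraction / lower_fraction: share of the points in the squeezed
    unit square that have x < 0.5, respectively y < 0.5.
    """
    rng = random.Random(seed)
    n = int(density * width * height)
    pts = [(rng.uniform(0.0, width), rng.uniform(0.0, height))
           for _ in range(n)]
    before = sum(1 for x, y in pts if x < 1.0 and y < 1.0)
    squeezed = [(x, y / k) for x, y in pts]          # shrink vertical axis
    inside = [(x, y) for x, y in squeezed if x < 1.0 and y < 1.0]
    left = sum(1 for x, _ in inside if x < 0.5)
    lower = sum(1 for _, y in inside if y < 0.5)
    return before, len(inside), left / len(inside), lower / len(inside)


if __name__ == "__main__":
    print(density_after_squeeze())
```

With these defaults one would expect roughly 500 points in the unit square before the squeeze and roughly 2000 afterward, with both fractions near one half, although the exact figures depend on the random draw.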


An interesting variation on the idea behind Olbers’ paradox is involved in the Wheeler-Feynman “absorber theory” of classical electrodynamics. This theory attempted to represent the electromagnetic force as a retarded distant action between charged particles, without any fields. In order to account for the radiation reaction force that is observed whenever a charged particle is accelerated, Wheeler and Feynman suggested that the retarded potentials of an accelerating charge cause surrounding charges to accelerate in response, and those accelerating charges are accompanied by advanced potentials that act back on the original charged particle, thereby producing the damping effect that we conventionally interpret as radiation reaction. By the effects of constructive and destructive interference, they were able to show why only retarded waves would appear to exist. But to make this theory work, they found that it required complete absorption of the retarded potential in the future. In other words, there must be an absorbing charge in every direction over the entire sky, which then “radiates” back onto each particle (as fluctuations in the advanced potential). One of the drawbacks of this theory is that the universe doesn’t appear to satisfy the required conditions. In fact, one could argue that (in an open universe) the direction of complete absorption is actually in the past, rather than the future, so we ought to observe only advanced waves rather than retarded waves. For this and other reasons (including inability to quantize it) the Wheeler-Feynman theory has never been widely accepted, but it’s interesting that, in a sense, it relies on a form of Olbers’ hypothesis being true, which is ironic, considering that most theories strive to provide an explanation for why it is false.


Oddly enough, the original motivation of Feynman in pursuing the distant-action approach to electrodynamics was to avoid the infinities that arise when considering the self-action of a particle’s field on the particle itself. He thought this could be done by having particles interacting only with each other directly, without introducing any fields. However, when all the effects of absorption and radiation reaction were correctly modeled in this way, he arrived at a theory that was (at best) equivalent to the standard formulations of classical electrodynamics, meaning that the infinities still appear, but in a different form, namely the form of Olbers’ paradox, as every ray direction emanating from a particle eventually intersects with a charged particle. This is exactly analogous to how astronomers reasoned that in an infinite universe every ray direction in our sight ought to terminate on the surface of a star.

