1.2 Systems of Reference (Reflections on Relativity)
Any one who will try to imagine the state of a mind conscious of knowing the absolute
position of a point will ever after be content with our relative knowledge.
James Clerk Maxwell, 1877
|
|
There are many theories of
relativity, each of which can be associated with some arbitrariness in our
descriptions of events. For example, suppose we describe the spatial relations
between stationary particles on a line by assigning a real-valued coordinate
to each particle, such that the distance between any two particles equals the
difference between their coordinates. There is a degree of arbitrariness in
this description due to the fact that all the coordinates could be increased
by some arbitrary constant without affecting any of the relations between the
particles. Symbolically this translational relativity can be expressed by
saying that if x is a suitable system of coordinates for describing the
relations between the particles, then so is x + k for any constant k. Likewise
if we describe the spatial relations between stationary particles on a plane
by assigning an ordered pair of real-valued coordinates to each particle,
such that the squared distance between any two particles equals the sum of
the squares of the differences between their respective coordinates, then
there is a degree of arbitrariness in the description (in addition to the
translational relativity of each individual coordinate) due to the fact that
we could rotate the coordinates of every particle by an arbitrary constant
angle without affecting any of the relations between the particles. This relativity
of orientation is expressed symbolically by saying that if (x,y) is a
suitable system of coordinates for describing the positions of particles on a
plane, then so is (ax-by, bx+ay) where a^2 + b^2 = 1.
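The two formal relativities described above can be checked numerically. The following is a minimal sketch, not part of the original text: the function names, the sample positions, the shift, and the angle are all arbitrary choices made for illustration.

```python
import math

def distance(p, q):
    """Euclidean distance between two points given as (x, y) pairs."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def translate(p, kx, ky):
    """Shift each coordinate by a constant: (x, y) -> (x + kx, y + ky)."""
    return (p[0] + kx, p[1] + ky)

def rotate(p, a, b):
    """Apply (x, y) -> (a*x - b*y, b*x + a*y); assumes a^2 + b^2 = 1."""
    return (a * p[0] - b * p[1], b * p[0] + a * p[1])

def pairwise_distances(points):
    """All distances between distinct pairs, in a fixed order."""
    return [distance(points[i], points[j])
            for i in range(len(points)) for j in range(i + 1, len(points))]

particles = [(0.0, 0.0), (3.0, 4.0), (-2.0, 1.5)]  # hypothetical sample positions
theta = 0.7                                          # arbitrary rotation angle
a, b = math.cos(theta), math.sin(theta)              # guarantees a^2 + b^2 = 1

# Translate every particle by the same constants, then rotate them all.
moved = [rotate(translate(p, 5.0, -2.0), a, b) for p in particles]

unchanged = all(math.isclose(d0, d1)
                for d0, d1 in zip(pairwise_distances(particles),
                                  pairwise_distances(moved)))
print(unchanged)  # True
```

The comparison uses `math.isclose` rather than exact equality because the rotated coordinates are floating-point approximations.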
These relativities are purely formal, in the sense that they are tautological consequences of the premises, regardless of whether they have any physical applicability. Our first premise was that it’s possible to assign a single real-valued coordinate to each particle on a line such that the distance between any two particles equals the difference between their coordinates. If this premise is satisfied, the invariance of relations under coordinate transformations from x to x + k follows trivially, but if the pairwise distances between three given particles were, say, 5, 3, and 12 units, then no three numbers could be assigned to the particles such that the pairwise differences equal the distances, because for three points on a line the largest distance must equal the sum of the other two, whereas 5 + 3 = 8. This shows that the n(n-1)/2 pairwise distances between n particles cannot be independent of each other if those distances can be encoded unambiguously by just n coordinates in one dimension or, more generally, by kn coordinates in k dimensions. A suitable system of coordinates in one dimension exists only if the distances between particles satisfy a very restrictive condition. Letting d(A,B) denote the signed distance from A to B, the condition that must be satisfied is that for every three particles A,B,C we have d(A,B) + d(B,C) + d(C,A) = 0. Of course, this is essentially the definition of co-linearity, but we have no a priori reason to expect this definition to have any applicability in the world of physical objects. The fact that it has wide applicability is a non-trivial aspect of our experience, albeit one that we ordinarily take for granted.
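To make the restrictive condition concrete, here is a small sketch that tries to assign one coordinate per particle from a table of unsigned pairwise distances. The function name `line_coordinates` and the tolerance parameter are assumptions introduced for this illustration. The sketch fixes particle 0 at the origin, which uses up the translational freedom, and then tests the only two possible positions for each further particle.

```python
def line_coordinates(dist, tol=1e-9):
    """Try to place particles 0..n-1 on a line so that |x_i - x_j| = dist[i][j].

    Returns a list of coordinates, or None if no assignment exists.
    """
    n = len(dist)
    if n == 0:
        return []
    coords = [0.0]  # particle 0 fixes the origin (translational freedom)
    for k in range(1, n):
        # Particle k must sit at +dist[0][k] or -dist[0][k]; accept the first
        # candidate that agrees with every particle placed so far.
        for candidate in (dist[0][k], -dist[0][k]):
            if all(abs(abs(candidate - coords[j]) - dist[j][k]) <= tol
                   for j in range(k)):
                coords.append(candidate)
                break
        else:
            return None  # neither position is consistent
    return coords

# Distances 5, 3, 12 (the example in the text): no coordinates exist.
print(line_coordinates([[0, 5, 3], [5, 0, 12], [3, 12, 0]]))  # None

# Distances 5, 3, 8: consistent, since 5 + 3 = 8.
print(line_coordinates([[0, 5, 3], [5, 0, 8], [3, 8, 0]]))    # [0.0, 5, -3]
```

Choosing the positive candidate first is harmless: until some particle has a nonzero coordinate the two candidates differ only by a reflection of the whole line, and after that at most one of them can fit.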
|
Likewise for particles in a
region of three dimensional space the premise that we can assign three
numbers to each particle such that the squared distance between any two
particles equals the sum of the squares of the differences between their
respective coordinates is true only under a very restrictive condition, because
there are only 3n degrees of freedom in the n(n-1)/2
pairwise distances between n particles.
|
Just as we found relativity of orientation for the pair of spatial coordinates x and y, we also find the same relativity for each of the pairs x,z and y,z in three dimensional space. Thus we have translational relativity for each of the four coordinates x,y,z,t, and we have rotational relativity for each pair of spatial coordinates (x,y), (x,z), and (y,z). This leaves the pairs of coordinates (x,t), (y,t) and (z,t). Not surprisingly we find that there is an analogous arbitrariness in these coordinate pairs, which can be expressed (for the x,t pair) by saying that the relations between the instances of particles on a line as a function of time are unaffected if we replace the x and t coordinates with ax - bt and -bx + at respectively, where a^2 - b^2 = 1. These transformations (rotations in the x,t plane through an imaginary angle), which characterize the theory of special relativity, are based on the premise that it is possible to assign pairs of values, x and t, to each instance of each particle on the x axis such that the squared spacetime distance equals the difference between the squares of the differences between the respective coordinates.
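The invariance under the (x,t) transformation can be illustrated in the same way as the spatial rotation. The sketch below is hypothetical: it assumes units in which x and t are directly comparable, takes the squared interval with the sign convention (dx)^2 - (dt)^2, and parameterizes a and b by a hyperbolic angle so that a^2 - b^2 = 1 holds automatically. The names and sample events are illustrative only.

```python
import math

def boost(x, t, a, b):
    """Apply (x, t) -> (a*x - b*t, -b*x + a*t); assumes a^2 - b^2 = 1."""
    return (a * x - b * t, -b * x + a * t)

def squared_interval(e1, e2):
    """Difference of the squared coordinate differences: (dx)^2 - (dt)^2."""
    dx, dt = e1[0] - e2[0], e1[1] - e2[1]
    return dx * dx - dt * dt

phi = 0.5                              # arbitrary hyperbolic angle
a, b = math.cosh(phi), math.sinh(phi)  # cosh^2 - sinh^2 = 1

event1, event2 = (1.0, 2.0), (4.0, 3.0)  # hypothetical (x, t) pairs
before = squared_interval(event1, event2)
after = squared_interval(boost(*event1, a, b), boost(*event2, a, b))

print(before)                        # 8.0
print(math.isclose(before, after))   # True
```

The individual differences dx and dt both change under the transformation; only their combination (dx)^2 - (dt)^2 is preserved.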
|
Each of the above examples
represents an invariance of physically measurable relations under certain
classes of linear transformations. Extending this idea, Einstein’s general
theory of relativity shows how the laws of physics, suitably formulated, are
invariant under an even larger class of transformations of space and time
coordinates, including non-linear transformations, and how these
transformations subsume the phenomena of gravity. In general relativity the
metrical properties of space and time are not constant, so the simple
premises on which we based the primitive relativities described above turn
out not to be satisfied globally. However, it remains true that those simple
premises are satisfied locally, i.e., over sufficiently small regions of
space and time, so they continue to be of fundamental importance.
|
|
As mentioned previously,
the relativities described above are purely formal and tautological, but it
turns out that each of them is closely related to a non-trivial physical symmetry.
There exists a large class of
identifiable objects whose lengths maintain a fixed proportion to each other
under the very same set of transformations that characterize the relativities
of the coordinates. In other words, just as we can translate the coordinates
on the x axis without affecting the length of any object, we also find a large
class of objects that can be individually translated along the x axis without
affecting their lengths. The same applies to rotations and boosts. Such
changes are physically distinct from purely formal shifts of the entire
coordinate system, because when we move individual objects we are actually changing
the relations between objects, since we are moving only a subset of all the
coordinated objects. (Also, moving an object from one stationary position to
another requires acceleration.) Thus for each formal arbitrariness in the
system of coordinates there exists a physical symmetry, i.e., a large class
of entities whose extents remain in constant proportions to each other when
subjected individually to the same transformations.
|
|
We refer to these relations
as physical symmetries rather than physical invariances, because (for
example) we have no basis for asserting that the length of a solid object or
the duration of a physical process is invariant under changes in position,
orientation or state of motion. We have no way of assessing the truth of such
a statement, because our measures of length and duration are all comparative.
We can say only that the spatial and temporal extents of all the “stable”
physical entities and processes are affected (if at all) in exactly the same
proportion by changes in position, orientation, and state of motion. Of
course, given this empirical fact, it is often convenient to speak as if the
spatial and temporal extents are invariant, but we shouldn’t forget that,
from an epistemological standpoint, we can assert only symmetry, not
invariance.
|
In his original
presentation of special relativity in 1905 Einstein took measuring rods and
clocks as primitive elements, even though he realized the weakness of this
approach. He later wrote of the special theory:
|
|
It is striking that the theory
introduces two kinds of physical things, i.e., (1) measuring rods and clocks,
and (2) all other things, e.g., the electromagnetic field, the material
point, etc. This, in a certain sense, is inconsistent; strictly speaking,
measuring rods and clocks should emerge as solutions of the basic equations
(objects consisting of moving atomic configurations), not, as it were, as
theoretically self-sufficient entities. The procedure was justified, however,
because it was clear from the very beginning that the postulates of the
theory are not strong enough to deduce from them equations for physical
events sufficiently complete and sufficiently free from arbitrariness to form
the basis of a theory of measuring rods and clocks.
|
|
This is quite similar to
the view he expressed many years earlier:
|
|
…the solid body and the clock
do not in the conceptual edifice of physics play the part of irreducible
elements, but that of composite structures, which may not play any independent
part in theoretical physics. But it is my conviction that in the present
stage of development of theoretical physics these ideas must still be
employed as independent ideas; for we are still far from possessing such
certain knowledge of theoretical principles as to be able to give exact
theoretical constructions of solid bodies and clocks.
|
|
The first quote is from his
Autobiographical Notes in 1949, whereas the second is from his essay on
Geometry and Experience published in 1921. It’s interesting how little his
views had changed during the intervening 28 years, despite the fact that
those years saw the advent of quantum mechanics, which many would say
provided the very theoretical principles underlying the construction of solid
bodies and clocks that Einstein felt had been lacking. Whether or not the principles of quantum mechanics
are adequate to justify our conceptions of reference lengths and time
intervals, the characteristic spatial and temporal extents of quantum
phenomena are used today as the basis for all such references.
|
|
Considering the arbitrariness
of absolute coordinates, one might think our spatio-temporal descriptions could
be better expressed in purely relational terms, such as by specifying only the
mutual distances (minimum path lengths) between objects. Nevertheless, the most
common method of description is to assign absolute coordinates (three spatial
and one temporal) to each object, with reference to an established system of
coordinates, while recognizing that the choice of coordinate systems is to
some extent arbitrary. The relations between objects are then inferred from
these absolute (though somewhat arbitrary) coordinates. This may seem to be
a round-about process, but there are several reasons for using absolute
coordinate systems to encode the relations between objects, rather than explicitly
specifying the relations themselves.
|
|
One reason is that this
approach enables us to take advantage of the efficiency made possible by the
finite dimensionality of space. As discussed in Section 1.1, if there were no
limit to the dimensionality of space, then we would expect a set of n
particles to have n(n-1)/2 independent pairwise spatial relations, so to
explicitly specify all the distances between particles would require n-1 numbers for each particle, representing the distances to each of
the other particles. For a large number of particles (to say nothing of a
potentially infinite number) this would be impractical. Fortunately the
spatial relations between the objects of our experience are not mutually
independent. The nth particle essentially adds only three (rather than n-1) degrees of freedom to the relational configuration. In physical
terms this restriction can be clearly seen from the fact that the maximum
number of mutually equidistant particles in D-dimensional space is D+1.
Experience teaches us that in our physical space we can arrange four, but not
five or more, particles such that they are all mutually equidistant, so we
conclude that our space has three dimensions.
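The claim about mutually equidistant particles can be checked with a short calculation. The sketch below relies on a standard result from distance geometry that the text itself does not state: a set of points with prescribed Euclidean distances fits in D dimensions only if the Gram matrix of the vectors drawn from one point to the others has rank at most D. The function names are assumptions made for illustration, and exact fractions are used to avoid rounding issues in the rank computation.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a matrix of Fractions, by Gaussian elimination."""
    rows = [row[:] for row in matrix]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivot rows found so far
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def dimensions_needed(n):
    """Number of dimensions required for n mutually equidistant points."""
    # Gram matrix of the vectors from point 0 to points 1..n-1, with all
    # distances equal to 1: G[i][j] = (1 + 1 - d_ij^2) / 2, which is
    # 1 on the diagonal and 1/2 everywhere else.
    gram = [[Fraction(1) if i == j else Fraction(1, 2) for j in range(n - 1)]
            for i in range(n - 1)]
    return rank(gram)

for n in range(2, 7):
    print(n, dimensions_needed(n))
```

Each row of the output pairs n with n - 1, so four equidistant particles need three dimensions and five would need four, in agreement with the D+1 rule quoted above.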
|
|
Historically the use of
absolute coordinates rather than explicit relations may also have been partly
due to the fact that analytic geometry and Cartesian coordinates were
invented (by Fermat, Descartes and others) at almost the same time that the new
science of mechanics needed them, just as tensor analysis was invented, three
hundred years later, at the very moment when it was needed to facilitate the
development of general relativity. (Of course, such coincidences are not
accidental; contrivances requiring new materials tend to be invented soon
after the material becomes available.) The coordinate systems of Descartes
were not merely efficient, they were also consistent with the ancient Aristotelian
belief (also held by Descartes) that there is no such thing as empty space or
vacuum, and that continuous substance permeates the universe. In this context
we cannot even contemplate explicitly specifying each individual distance
between substantial points, because space is regarded as a continuum of
substance. For Aristotle and Descartes, every spatial extent is a measure of
the length of some substance, not a pure distance between particles as
contemplated by atomists. In this sense we can say that the continuous
absolute coordinate systems inherited by modern science from Aristotle and
Descartes are a remnant of the Cartesian natural philosophy.
|
|
Another, perhaps more compelling,
reason for the adoption of abstract coordinate systems in the descriptions of
physical phenomena was the need to account for acceleration. As Newton explained
with the example of a “spinning pail”, the mutual relations between a set of
material particles in an instant are not adequate to fully characterize a
physical situation – at least not if we are considering only a small subset
of all the particles in the universe. (Whether the mutual relations would be
adequate if all the matter in the universe were taken into account is an open
question.) In retrospect, there were other possible alternatives, such as characterizing
not just the relations between particles at a specific instant, but over some
temporal span of existence, but this would have required the unification of
spatial and temporal measures, which did not occur until much later.
Originally the motions of objects were represented simply by allowing the
spatial coordinates of each persistent object to be continuous single-valued
functions of one real variable, the time coordinate.
|
|
Incidentally, one
consequence of the use of absolute coordinates is that it automatically
entails a breaking of the alleged translational symmetry. We said previously
that the coordinate system x could be replaced by x + k for any real number
k, implying that every real value of k is in some sense equally suitable.
However, from a strictly mathematical point of view there does not exist a
uniform distribution over the real numbers, so this form of representation
does not exactly entail the perfect symmetry of position in an infinite
space, even if the space is completely empty.
|
|
The set of all combinations
of values for the three spatial coordinates and one time coordinate is
assumed to give a complete coordination not only of the spatial positions of
each entity at each time, but of all possible spatial positions at all
possible times. Any definite set of space and time coordinates constitutes a
system of reference. There are infinitely many distinct ways in which such
coordinates can be assigned, but they are not entirely arbitrary, because we
limit the range of possibilities by requiring contiguous physical entities to
be assigned contiguous coordinates. This imposes a definite structure on the
system, so it is more than merely a set of labels; it represents the most
primitive laws of physics.
|
|
One way of specifying an
entire model of a world consisting of n (classical) particles would be to
explicitly give the 3n functions x_j(t), y_j(t), z_j(t)
for j = 1 to n. In this form, the un-occupied points of space would be
irrelevant, since only the actual paths of actual physical entities have any
meaning. In fact, it could be argued that only the intersections of these
paths have physical significance, so the paths followed by the particles
in between their mutual intersections could be regarded as merely hypothetical.
Following this approach we might end up with a purely combinatorial
specification of discrete interactions, with no need for the notion of a
continuous physical space within which entities reside and move. However, the
hypothesis that physical objects have continuous positions as functions of
time with respect to a specified system of reference has proven to be
extremely useful, especially for formulating simple laws by which
the observable interactions can be efficiently described and predicted.
|
|
An important class of
physical laws that make use of the full spatio-temporal framework consists of
laws that are expressed in terms of fields. A field is regarded as
existing at each point within the system of coordinates, even those points
that are not occupied by a material particle. Therefore, each continuous
field existing throughout time has, potentially, far more degrees of freedom
than does a discrete particle, or even infinitely many discrete particles.
Arguably, we never actually observe fields; we merely observe effects
attributed to fields. It’s ironic that we can simplify the descriptions of
particles by introducing hypothetical entities (fields) with far more degrees
of freedom, but the laws governing the behavior of these fields (e.g.,
Maxwell’s equations for the electromagnetic field) along with symmetries and
simple boundary conditions suffice to constrain the fields so that they actually
do provide a simplification. (Fields also provide a way of maintaining
conservation laws for interactions “at a distance”.) Whether the usefulness
of the concepts of continuous space, time, and fields suggests that they
possess some ontological status is debatable, but the concepts are undeniably
useful.
|
|
These systems of reference are
more than simple labeling. The numerical values of the coordinates are
intended to connote physical properties of order and measure. In fact, we
might even suppose that the sequences of states of all particles are uniformly
parameterized by the time coordinate of our system of reference, but therein
lies an ambiguity, because it isn't clear how the temporal states of one
particle are to be placed in correspondence with the temporal states of
another. Here we must make an important decision about how our model of the
world is to be constructed. We might choose to regard the totality of all
entities as comprising a single element in a succession of universal temporal
states, in which case the temporal correspondence between entities is
unambiguous. In such a universe the temporal coordinate induces a total
ordering of events, which is to say, if we let the symbol ≤ denote temporal precedence or equality, then for every three events
a,b,c we have
|
|
(i) a ≤ a
(ii) if a ≤ b and b ≤ a, then a = b
(iii) if a ≤ b and b ≤ c, then a ≤ c
(iv) either a ≤ b or b ≤ a
|
|
However, this is not the
only possible choice. We might choose instead to regard the temporal state of
each individual particle as an independent quantity, bearing in mind that
orderings of the elements of a set are not necessarily total. For example,
consider the subsets of a flat plane, and the ordering induced by the
inclusion relation ⊆. Obviously the first three axioms of a total
ordering are satisfied, because for any three subsets a,b,c of the plane we
have (i) a ⊆ a, (ii) if a ⊆ b and b ⊆ a, then a = b, and (iii) if a ⊆ b and b ⊆ c, then a ⊆ c. However, the fourth axiom is not satisfied,
because it's entirely possible to have two sets neither of which is included
in the other. An ordering of this type is called a partial ordering, and we
should allow for the possibility that the temporal relations between events
induce a partial rather than a total ordering. In fact, we have no a
priori reason to expect that temporal relations induce even a partial
ordering. It is safest to assume that each entity possesses its own temporal
state, and let our observations teach us how those states are mutually
related, if at all. (Similar caution should be applied when modeling the
relations between the spatial states of particles.)
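The difference between a total and a partial ordering can be tested by brute force on a small example. In the sketch below, the function name `check_order_axioms` and the sample data are assumptions made for illustration. Subsets of a three-element set stand in for subsets of the plane, and a handful of numbers stand in for instants on a single universal clock.

```python
from itertools import product

def check_order_axioms(elements, leq):
    """Report which of the four ordering axioms hold for `leq` on `elements`."""
    elements = list(elements)
    return {
        # (i) a <= a
        "reflexive": all(leq(a, a) for a in elements),
        # (ii) a <= b and b <= a imply a = b
        "antisymmetric": all(a == b or not (leq(a, b) and leq(b, a))
                             for a, b in product(elements, repeat=2)),
        # (iii) a <= b and b <= c imply a <= c
        "transitive": all(leq(a, c) or not (leq(a, b) and leq(b, c))
                          for a, b, c in product(elements, repeat=3)),
        # (iv) either a <= b or b <= a
        "total": all(leq(a, b) or leq(b, a)
                     for a, b in product(elements, repeat=2)),
    }

# Set inclusion: for frozensets, `a <= b` means "a is a subset of b".
subsets = [frozenset(s) for s in ([], [1], [2], [1, 2], [1, 2, 3])]
inclusion = check_order_axioms(subsets, lambda a, b: a <= b)

# Instants on one universal clock, ordered by the usual numeric <=.
instants = [0.0, 1.5, 2.0, 3.25]
universal_time = check_order_axioms(instants, lambda a, b: a <= b)

print(inclusion)       # first three axioms True, "total" False
print(universal_time)  # all four axioms True
```

Inclusion fails only the fourth axiom, because neither of the sets {1} and {2} contains the other.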
|
|
Given any system of space
and time coordinates we can define infinitely many others such that speeds
are preserved. This represents an equivalence relation, and we can then define
a reference frame as an equivalence class of coordinate systems such that the
speed of each object has the same value in terms of each coordinate system in
that class. Thus within a reference frame we can speak of the speed of an
object, without needing to specify any particular coordinate system. Of
course, just as our coordinate systems are generally valid only locally, so
too are the reference frames.
|
|
Purely kinematic relativity
contains enough degrees of freedom that we can simply define our systems of
reference (i.e., coordinate systems) to satisfy the additivity of velocity.
In other words, we can adopt velocity additivity as a principle, and this is
essentially what scientists had tacitly done since ancient times. The great
insight of Galileo and his successors was that this principle is inadequate
to single out the physically meaningful reference systems. A new principle
was necessary, namely, the principle of inertia, to be discussed in
the next section.
|