
Experiment in Physics

Physics, and natural science in general, is a reasonable enterprise based on valid experimental evidence, criticism, and rational discussion. It provides us with knowledge of the physical world, and it is experiment that provides the evidence that grounds that knowledge. Experiment plays many roles in science. One of its important roles is to test theories and to provide the basis for scientific knowledge. It can also call for a new theory, either by showing that an accepted theory is incorrect, or by exhibiting a new phenomenon that needs explanation. Experiment can provide hints toward the structure or mathematical form of a theory and it can provide evidence for the existence of the entities involved in our theories. Finally, it may also have a life of its own, independent of theory. Scientists may investigate a phenomenon just because it looks interesting; such work also provides evidence for a future theory to explain. Examples of these different roles will be presented below. As we shall see, a single experiment may play several of these roles at once.

If experiment is to play these important roles in science then we must have good reasons to believe experimental results, for science is a fallible enterprise. Theoretical calculations, experimental results, or the comparison between experiment and theory may all be wrong. Science is more complex than "The scientist proposes, Nature disposes." It may not always be clear what the scientist is proposing. Theories often need to be articulated and clarified. It also may not be clear how Nature is disposing. Experiments may not always give clear-cut results, and may even disagree for a time.

In what follows, the reader will find an epistemology of experiment, a set of strategies that provides reasonable belief in experimental results. Scientific knowledge can then be reasonably based on these experimental results.


I. Experimental Results

A. The Case For Learning From Experiment

1. An Epistemology of Experiment

It has been almost two decades since Ian Hacking asked, "Do we see through a microscope?" (Hacking 1981). Hacking’s question was really about how we come to believe in an experimental result obtained with a complex experimental apparatus, and how we distinguish between a valid result and an artifact created by that apparatus. If experiment is to play all of the important roles in science mentioned above and to provide the evidential basis for scientific knowledge, then we must have good reasons to believe in those results. Hacking provided an extended answer in the second half of Representing and Intervening (1983). He pointed out that even though an experimental apparatus is laden with, at the very least, the theory of the apparatus, observations remain robust despite changes in the theory of the apparatus or in the theory of the phenomenon. His illustration was the continued belief in microscope images despite the major change in the theory of the microscope when Abbe pointed out the importance of diffraction in its operation. One reason Hacking gave for this is that in making such observations the experimenters intervened: they manipulated the object under observation. Thus, in looking at a cell through a microscope one might inject fluid into the cell or stain the specimen. One expects the cell to change shape or color when this is done. Observing the predicted effect strengthens our belief in both the proper operation of the microscope and in the observation. This is true in general. Observing the predicted effect of an intervention strengthens our belief in both the proper operation of the experimental apparatus and in the observations made with it.

Hacking also discussed the strengthening of one’s belief in an observation by independent confirmation. The fact that the same pattern of dots, dense bodies in cells, is seen with "different" microscopes, i.e. ordinary, polarizing, phase-contrast, fluorescence, interference, electron, acoustic etc., argues for the validity of the observation. One might question whether or not "different" is theory laden. After all, it is our theory of light and of the microscope that allows us to consider these microscopes "different." Nevertheless, the argument goes through. Hacking correctly argues that it would be a preposterous coincidence if the same pattern of dots were produced in two totally different kinds of physical systems. Different apparatuses have different backgrounds and systematic errors, making the coincidence, if it is an artifact, most unlikely. If it is a correct result, and the instruments are working properly, the coincidence of results is understandable.

Hacking’s answer is correct as far as it goes. It is, however, incomplete. What happens when one can perform the experiment with only one type of apparatus, such as an electron microscope or a radio telescope, or when intervention is either impossible or extremely difficult? Other strategies are needed to validate the observation. These may include:

1) Experimental checks and calibration, in which the experimental apparatus reproduces known phenomena. For example, if we wished to argue that the spectrum of a substance obtained with a new type of spectrometer is correct, we might check that this new spectrometer could reproduce the known Balmer Series in hydrogen. If we correctly observe the Balmer Series then we strengthen our belief that the spectrometer is working properly. This also strengthens our belief in the results obtained with that spectrometer. If the check fails then we have good reason to question the results obtained with that apparatus.

2) Reproducing artifacts that are known in advance to be present. An example of this comes from experiments to measure the infrared spectra of organic molecules (Randall et al. 1949). It was not always possible to prepare a pure sample of such material. Sometimes one had to place the substance in an oil paste or in solution. In such cases, one expects to observe, superimposed on the spectrum of the substance, the spectrum of the oil or the solvent, which one can compare with the known spectrum of the oil or the solvent. Observation of this artifact gives confidence in other measurements made with the spectrometer.

3) Elimination of plausible sources of error and alternative explanations of the result (the Sherlock Holmes strategy). Thus, when scientists claimed to have observed electric discharges in the rings of Saturn, they argued for their result by showing that it could not have been caused by defects in the telemetry, by interaction with the environment of Saturn, by lightning, or by dust. The only remaining explanation of their result was that it was due to electric discharges in the rings. There was no other plausible explanation of the observation. In addition, the same result was observed by both Voyager 1 and Voyager 2. This provided independent confirmation. Often, several epistemological strategies are used in the same experiment.

4) Using the results themselves to argue for their validity. Consider the problem of Galileo’s telescopic observations of the moons of Jupiter. Although one might very well believe that his early telescope might have created spots of light, it would have been extremely implausible that the telescope would create them so that they would appear to be a small planetary system with eclipses and other consistent motions. It would have been even more implausible to believe that the created spots would satisfy Kepler’s Third Law (R³/T² = constant); a numerical check of this regularity, using modern values for the moons’ orbits, is sketched after this list. A similar argument was used by Robert Millikan to support his observation of the quantization of electric charge and his measurement of the charge of the electron. Millikan remarked, "The total number of changes which we have observed would be between one and two thousand, and in not one single instance has there been any change which did not represent the advent upon the drop of one definite invariable quantity of electricity or a very small multiple of that quantity" (Millikan 1911, p. 360). In both of these cases one is arguing that there was no plausible malfunction of the apparatus, or background, that would explain the observations.

5) Using an independently well-corroborated theory of the phenomena to explain the results. This was illustrated in the discovery of the W±, the charged intermediate vector boson required by the Weinberg-Salam unified theory of electroweak interactions. Although these experiments used very complex apparatuses and employed other epistemological strategies (see Franklin 1986, pp. 170-72, for details), I believe that the agreement of the observations with the theoretical predictions of the particle properties helped to validate the experimental results. In this case the particle candidates were observed in events that contained an electron with high transverse momentum and in which there were no particle jets, just as predicted by the theory. In addition, the measured particle masses of 81 ± 5 GeV/c² and 80 +10/-6 GeV/c², found in the two experiments (note the independent confirmation also), were in good agreement with the theoretical prediction of 82 ± 2.4 GeV/c². It was very improbable that any background effect, which might mimic the presence of the particle, would be in agreement with theory.

6) Using an apparatus based on a well-corroborated theory. In this case the support for the theory passes on to the apparatus based on that theory. This is the case with both the electron microscope and the radio telescope, whose proper operation is based on a well-supported theory, although other strategies are also used to validate the observations.

7) Using statistical arguments. An interesting example of this arose in the 1960s when the search for new particles and resonances occupied a substantial fraction of the time and effort of those physicists working in experimental high-energy physics. The usual technique was to plot the number of events observed as a function of the invariant mass of the final-state particles and to look for bumps above a smooth background. The usual informal criterion for the presence of a new particle was that it resulted in a three standard-deviation effect above the background, a result that had a probability of 0.27% of occurring in a single bin. This criterion was later changed to four standard deviations, which had a probability of 0.0064%, when it was pointed out that the number of graphs plotted each year by high-energy physicists made it rather probable, on statistical grounds, that a three standard-deviation effect would be observed. (The arithmetic behind these probabilities is sketched below.)
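
The arithmetic behind these significance criteria is simple enough to sketch. The short calculation below is a minimal illustration, assuming a particular (hypothetical) number of mass bins examined per year; it reproduces the two-sided Gaussian tail probabilities quoted above in 7) and shows why a three standard-deviation bump somewhere becomes quite probable when many histograms are plotted.

    import math

    def two_sided_tail(n_sigma):
        # Probability that a Gaussian fluctuation exceeds n_sigma in either direction.
        return math.erfc(n_sigma / math.sqrt(2))

    p3 = two_sided_tail(3.0)   # about 0.0027, the 0.27% quoted above
    p4 = two_sided_tail(4.0)   # about 6.3e-5, roughly the 0.0064% quoted above
    print("3 sigma: {:.4%}   4 sigma: {:.4%}".format(p3, p4))

    # With many independent bins examined, a spurious three standard-deviation
    # bump somewhere becomes likely.  The bin count here is purely illustrative.
    n_bins = 5000
    p_at_least_one = 1 - (1 - p3) ** n_bins
    print("Chance of at least one 3-sigma bump in {} bins: {:.1%}".format(n_bins, p_at_least_one))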
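
Similarly, the consistency argument in strategy 4) can be given numerical form. The sketch below checks Kepler’s Third Law, R³/T² = constant, against approximate modern values for the orbital radii and periods of the four Galilean moons; the numbers are modern ones used purely for illustration, not Galileo’s own data.

    # Kepler's Third Law check, R^3/T^2 = constant, for the Galilean moons,
    # using approximate modern values (illustrative; not Galileo's data).
    moons = {
        # name: (mean orbital radius in km, orbital period in days)
        "Io":       (421800, 1.769),
        "Europa":   (671100, 3.551),
        "Ganymede": (1070400, 7.155),
        "Callisto": (1882700, 16.689),
    }

    for name, (radius_km, period_days) in moons.items():
        ratio = radius_km ** 3 / period_days ** 2
        print("{:9s}  R^3/T^2 = {:.3e} km^3/day^2".format(name, ratio))

    # All four ratios come out near 2.4e16 km^3/day^2.  Spots of light that were
    # mere artifacts of the telescope would have had to conspire to satisfy this
    # regularity, which is the point of the argument above.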

These strategies, along with Hacking’s intervention and independent confirmation, constitute an epistemology of experiment. They provide us with good reasons for belief in experimental results. They do not, however, guarantee that the results are correct. There are many experiments in which these strategies are applied, but whose results are later shown to be incorrect (examples will be presented below). Experiment is fallible.

2. Galison’s Elaboration

In How Experiments End (1987), Peter Galison extended the discussion of experiment to more complex situations. In his histories of the measurements of the gyromagnetic ratio of the electron, of the discovery of the muon, and of the discovery of weak neutral currents, he considered a series of experiments measuring a single quantity, a set of different experiments culminating in a discovery, and two high energy physics experiments performed by large groups with complex experimental apparatus.

Galison’s view is that experiments end when the experimenters believe that they have a result that will stand up in court, a result that, I believe, will include, and has included, the use of the epistemological strategies discussed earlier. Thus, David Cline, one of the weak neutral current experimenters, remarked, "At present I don’t see how to make these effects [the weak neutral current event candidates] go away" (Galison, 1987, p. 235).

Galison emphasizes that, within a large experimental group, different members of the group may find different pieces of evidence most convincing. In the Gargamelle weak neutral current experiment, several group members found the single photograph of a neutrino-electron scattering event particularly important, whereas for others the difference in spatial distribution between the observed neutral current candidates and the neutron background was decisive. Galison attributes this, in large part, to differences in experimental traditions, in which scientists develop skill in using certain types of instruments or apparatus. In particle physics, for example, there is the tradition of visual detectors, such as the cloud chamber or the bubble chamber, in contrast to the electronic tradition of Geiger and scintillation counters and spark chambers. Scientists within the visual tradition tend to prefer "golden events" that clearly demonstrate the phenomenon in question, whereas those in the electronic tradition tend to find statistical arguments more persuasive and important than individual events. (For further discussion of this issue see Galison (1997)).

Galison points out that major changes in theory and in experimental practice and instruments do not necessarily occur at the same time. This persistence of experimental results provides continuity across these conceptual changes. The experiments on the gyromagnetic ratio spanned classical electromagnetism, Bohr’s old quantum theory, and the new quantum mechanics of Heisenberg and Schrödinger. Robert Ackermann has offered a similar view in his discussion of scientific instruments.

The advantages of a scientific instrument are that it cannot change theories. Instruments embody theories, to be sure, or we wouldn’t have any grasp of the significance of their operation....Instruments create an invariant relationship between their operations and the world, at least when we abstract from the expertise involved in their correct use. When our theories change, we may conceive of the significance of the instrument and the world with which it is interacting differently, and the datum of an instrument may change in significance, but the datum can nonetheless stay the same, and will typically be expected to do so. An instrument reads 2 when exposed to some phenomenon. After a change in theory, it will continue to show the same reading, even though we may take the reading to be no longer important, or to tell us something other than what we thought originally (Ackermann 1985, p. 33).

Galison also discusses other aspects of the interaction between experiment and theory. Theory may influence what is considered to be a real effect, demanding explanation, and what is considered background. In his discussion of the discovery of the muon, he argues that the calculation of Oppenheimer and Carlson, which showed that showers were to be expected in the passage of electrons through matter, left the penetrating particles, later shown to be muons, as the problem. Prior to their work, physicists thought the showering particles were the problem, whereas the penetrating particles seemed to be understood.

The role of theory as an "enabling theory," one that allows calculation or estimation of the size of the expected effect and also the size of expected backgrounds, is also discussed by Galison. (See also Franklin 1995b and the discussion of the Stern-Gerlach experiment below.) Such a theory can help to determine whether or not an experiment is feasible. He also emphasizes that elimination of background that might simulate or mask an effect is central to the experimental enterprise, and not a peripheral activity. In the case of the weak neutral current experiments the existence of the currents depended crucially on showing that the event candidates could not all be due to neutron background.

There is also a danger that the design of an experiment may preclude observation of a phenomenon. Galison points out that the original design of one of the neutral current experiments, which included a muon trigger, would not have allowed the observation of neutral currents. In its original form the experiment was designed to observe charged currents, which produce a high-energy muon; neutral currents do not. Therefore, having a muon trigger precluded their observation. Only after the theoretical importance of the search for neutral currents was emphasized to the experimenters was the trigger changed. Changing the design did not, of course, guarantee that neutral currents would be observed.

Galison also shows that the theoretical presuppositions of the experimenters may enter into the decision to end an experiment and report the result. Einstein and de Haas ended their search for systematic errors when their value for the gyromagnetic ratio of the electron, g = 1, agreed with their theoretical model of orbiting electrons. This effect of presuppositions might cause one to be skeptical of both experimental results and their role in theory evaluation. Galison’s history shows, however, that, in this case, the importance of the measurement led to many repetitions of the measurement. This resulted in an agreed upon result that disagreed with theoretical expectations. Scientists do not always find what they are looking for.

B. The Case Against Learning From Experiment

1. Collins and the Experimenters’ Regress

Collins, Pickering, and others, have raised objections to the view that experimental results are accepted on the basis of epistemological arguments. They point out that "a sufficiently determined critic can always find a reason to dispute any alleged ‘result’" (MacKenzie 1989, p. 412). Harry Collins, for example, is well known for his skepticism concerning both experimental results and evidence. He develops an argument that he calls the "experimenters’ regress" (Collins 1985, chapter 4, pp. 79-111): What scientists take to be a correct result is one obtained with a good, that is, properly functioning, experimental apparatus. But a good experimental apparatus is simply one that gives correct results. Collins claims that there are no formal criteria that one can apply to decide whether or not an experimental apparatus is working properly. In particular, he argues that calibrating an experimental apparatus by using a surrogate signal cannot provide an independent reason for considering the apparatus to be reliable.

In Collins’ view the regress is eventually broken by negotiation within the appropriate scientific community, a process driven by factors such as the career, social, and cognitive interests of the scientists, and the perceived utility for future work, but one that is not decided by what we might call epistemological criteria, or reasoned judgment. Thus, Collins concludes that his regress raises serious questions concerning both experimental evidence and its use in the evaluation of scientific hypotheses and theories. Indeed, if no way out of the regress can be found then he has a point.

Collins’ strongest candidate for an example of the experimenters’ regress is presented in his history of the early attempts to detect gravitational radiation, or gravity waves. (For more detailed discussion of this episode see Collins 1985; 1994; Franklin 1994; 1997a.) In this case, the physics community was forced to compare Weber’s claims that he had observed gravity waves with the reports from six other experiments that failed to detect them. On the one hand, Collins argues that the decision between these conflicting experimental results could not be made on epistemological or methodological grounds. He claims that the six negative experiments could not legitimately be regarded as replications and hence become less impressive. On the other hand, Weber’s apparatus, precisely because the experiments used a new type of apparatus to try to detect a hitherto unobserved phenomenon, could not be subjected to standard calibration techniques.

The results presented by Weber’s critics were not only more numerous, but they had also been carefully cross-checked. The groups had exchanged both data and analysis programs and confirmed their results. The critics had also investigated whether or not their analysis procedure, the use of a linear algorithm, could account for their failure to observe Weber’s reported results. They had used Weber’s preferred procedure, a nonlinear algorithm, to analyze their own data, and still found no sign of an effect. They had also calibrated their experimental apparatuses by inserting acoustic pulses of known energy and finding that they could detect a signal. Weber, on the other hand, as well as his critics using his analysis procedure, could not detect such calibration pulses.

There were, in addition, several other serious questions raised about Weber’s analysis procedures. These included an admitted programming error that generated spurious coincidences between Weber’s two detectors, possible selection bias by Weber, Weber’s report of coincidences between two detectors when the data had been taken four hours apart, and whether or not Weber’s experimental apparatus could produce the narrow coincidences claimed.

It seems clear that the critics’ results were far more credible than Weber’s. They had checked their results by independent confirmation, which included the sharing of data and analysis programs. They had also eliminated a plausible source of error, that of the pulses being longer than expected, by analyzing their results using the nonlinear algorithm and by explicitly searching for such long pulses. They had also calibrated their apparatuses by injecting pulses of known energy and observing the output.

Contrary to Collins, I believe that the scientific community made a reasoned judgment and rejected Weber’s results and accepted those of his critics. Although no formal rules were applied (e.g., if you make four errors, rather than three, your results lack credibility; or if there are five, but not six, conflicting results, your work is still credible), the procedure was reasonable.

Pickering argues that the reasons for accepting results are the future utility of such results for both theoretical and experimental practice and the agreement of such results with the existing community commitments. In discussing the discovery of weak neutral currents, Pickering states,

Quite simply, particle physicists accepted the existence of the neutral current because they could see how to ply their trade more profitably in a world in which the neutral current was real. (1984b, p. 87)

Scientific communities tend to reject data that conflict with group commitments and, obversely, to adjust their experimental techniques to tune in on phenomena consistent with those commitments. (1981, p. 236)

The emphasis on future utility and existing commitments is clear. These two criteria do not necessarily agree. For example, there are episodes in the history of science in which more opportunity for future work is provided by the overthrow of existing theory. (See, for example, the history of the overthrow of parity conservation and of CP symmetry discussed below and in (Franklin 1986, Ch. 1, 3)).

2. Pickering on Communal Opportunism and Plastic Resources

Pickering has recently offered a different view of experimental results. In his view the material procedure (including the experimental apparatus itself along with setting it up, running it, and monitoring its operation), the theoretical model of that apparatus, and the theoretical model of the phenomena under investigation are all plastic resources that the investigator brings into relations of mutual support (Pickering 1987; Pickering 1989). He says:
Achieving such relations of mutual support is, I suggest, the defining characteristic of the successful experiment. (1987, p. 199)
His example is Morpurgo’s search for free quarks, or fractional charges of 1/3 e or 2/3 e, where e is the charge of the electron. (See also (Gooding 1992)). Morpurgo used a modern Millikan-type apparatus and initially found a continuous distribution of charge values. Following some tinkering with the apparatus, Morpurgo found that if he separated the capacitor plates he obtained only integral values of charge. "After some theoretical analysis, Morpurgo concluded that he now had his apparatus working properly, and reported his failure to find any evidence for fractional charges" (Pickering 1987, p. 197).

Pickering has made the important point that experimental apparatuses rarely work properly when they are first operated, and that some adjustment, or tinkering, is required before they do. He has also correctly pointed out that the theory of the apparatus and the theory of the phenomena can, and do, form part of the argument for the validity of an experimental result. He has, I believe, overemphasized theory. It was known, from Millikan onwards, that fractional charges, if they exist at all, are very rare in comparison with integral charges. The failure of Morpurgo’s apparatus to find integral charges indicated quite strongly that, despite his initial theoretical analysis, it was not an accurate charge-measuring device. Only after tinkering, when the apparatus measured integral charges, and thus passed a crucial experimental check, could one legitimately trust its measurements of charge. Although the modified theoretical analysis may have helped to clarify this, it was the experimental check that was crucial. There is more to an experimental apparatus than its theoretical analysis.

3. Critical Responses to Pickering

Ackermann has offered a modification of Pickering’s view. He suggests that the experimental apparatus itself is a less plastic resource than either the theoretical model of the apparatus or that of the phenomenon.
To repeat, changes in A [the apparatus] can often be seen (in real time, without waiting for accommodation by B [the theoretical model of the apparatus]) as improvements, whereas ‘improvements’ in B don’t begin to count unless A is actually altered and realizes the improvements conjectured. It’s conceivable that this small asymmetry can account, ultimately, for large scale directions of scientific progress and for the objectivity and rationality of those directions. (Ackermann 1991, p. 456)

Hacking (1992) has also offered a more complex version of Pickering’s later view. He suggests that the results of mature laboratory science achieve stability and are self-vindicating when the elements of laboratory science are brought into mutual consistency and support. These are (1) ideas: questions, background knowledge, systematic theory, topical hypotheses, and modeling of the apparatus; (2) things: target, source of modification, detectors, tools, and data generators; and (3) marks and the manipulation of marks: data, data assessment, data reduction, data analysis, and interpretation.

Stable laboratory science arises when theories and laboratory equipment evolve in such a way that they match each other and are mutually self-vindicating. (1992, p. 56)

We invent devices that produce data and isolate or create phenomena, and a network of different levels of theory is true to these phenomena. Conversely we may in the end count them only as phenomena when the data can be interpreted by theory. (pp. 57-8)

One might ask whether or not such mutual adjustment between theory and experimental results can always be achieved. What happens when an experimental result is produced by an apparatus on which several of the epistemological strategies, discussed earlier, have been successfully applied, and the result is in disagreement with our theory of the phenomenon? Accepted theories can be refuted. Several examples will be presented below.

Hacking himself worries about what happens when a laboratory science that is true to the phenomena generated in the laboratory, thanks to mutual adjustment and self-vindication, is successfully applied to the world outside the laboratory. Does this argue for the truth of the science? In Hacking’s view it does not. If laboratory science does produce happy effects in the "untamed world,... it is not the truth of anything that causes or explains the happy effects" (1992, p. 60).

There is a rather severe disagreement on the reasons for the acceptance of experimental results. For some, like Galison and myself, it is because of epistemological arguments. For others, like Pickering, the reasons are utility for future practice and agreement with existing theoretical commitments. Although the history of science shows that the overthrow of a well-accepted theory leads to an enormous amount of theoretical and experimental work, proponents of this view seem to accept it as unproblematical that it is always agreement with existing theory that has more future utility. Hacking and Pickering also suggest that experimental results are accepted on the basis of the mutual adjustment of elements, which include the theory of the phenomenon.

Nevertheless, everyone seems to agree that a consensus does arise on experimental results. The question then is: how are these results used?

II. The Roles of Experiment

A. A Life of Its Own

Although experiment often takes its importance from its relation to theory, Hacking pointed out that it often has a life of its own, independent of theory. He notes the pristine observations of Caroline Herschel’s discovery of comets, William Herschel’s work on "radiant heat," and Davy’s observation of the gas emitted by algae and the flaring of a taper in that gas. In none of these cases did the experimenter have any theory of the phenomenon under investigation. One may also note the nineteenth-century measurements of atomic spectra and the work on the masses and properties of elementary particles during the 1960s. Both of these sequences were conducted without any guidance from theory.

In deciding what experimental investigation to pursue, scientists may very well be influenced by the equipment available and their own ability to use that equipment (McKinney 1992). Thus, when the Mann-O’Neill collaboration was doing high energy physics experiments at the Princeton-Pennsylvania Accelerator during the late 1960s, the sequence of experiments was (1) measurement of the K+ decay rates, (2) measurement of the K+e3 branching ratio and decay spectrum, (3) measurement of the K+e2 branching ratio, and (4) measurement of the form factor in K+e3 decay. These experiments were performed with basically the same experimental apparatus, but with relatively minor modifications for each particular experiment. By the end of the sequence the experimenters had become quite expert in the use of the apparatus and knowledgeable about the backgrounds and experimental problems. This allowed the group to successfully perform the technically more difficult experiments later in the sequence. We might refer to this as "instrumental loyalty" and the "recycling of expertise" (Franklin 1997b). This meshes nicely with Galison’s view of experimental traditions. Scientists, both theorists and experimentalists, tend to pursue experiments and problems in which their training and expertise can be used.

Hacking also remarks on the "noteworthy observations" on Iceland Spar by Bartholin, on diffraction by Hooke and Grimaldi, and on the dispersion of light by Newton. "Now of course Bartholin, Grimaldi, Hooke, and Newton were not mindless empiricists without an ‘idea’ in their heads. They saw what they saw because they were curious, inquisitive, reflective people. They were attempting to form theories. But in all these cases it is clear that the observations preceded any formulation of theory" (Hacking 1983, p. 156). In all of these cases we may say that these were observations waiting for, or perhaps even calling for, a theory. The discovery of any unexpected phenomenon calls for a theoretical explanation.

B. Confirmation and Refutation

Nevertheless several of the important roles of experiment involve its relation to theory. Experiment may confirm a theory, refute a theory, or give hints to the mathematical structure of a theory.

1. The Discovery of Parity Nonconservation: A Crucial Experiment

Let us consider first an episode in which the relation between theory and experiment was clear and straightforward. This was a "crucial" experiment, one that decided unequivocally between two competing theories, or classes of theory. The episode was that of the discovery that parity, mirror-reflection symmetry or left-right symmetry, is not conserved in the weak interactions. (For details of this episode see Franklin (1986, Ch. 1) and Appendix 1.) Experiments showed that in the beta decay of nuclei the number of electrons emitted in the same direction as the nuclear spin was different from the number emitted opposite to the spin direction. This was a clear demonstration of parity violation in the weak interactions.

2. The Discovery of CP Violation: A Persuasive Experiment

After the discovery of parity and charge conjugation nonconservation, and following a suggestion by Landau, physicists considered CP (combined parity and particle-antiparticle symmetry), which was still conserved in the experiments, as the appropriate symmetry. One consequence of this scheme, if CP were conserved, was that the K₁⁰ meson could decay into two pions, whereas the K₂⁰ meson could not. Thus, observation of the decay of the K₂⁰ into two pions would indicate CP violation. The decay was observed by a group at Princeton University. Although several alternative explanations were offered, experiments eliminated each of the alternatives, leaving only CP violation as an explanation of the experimental result. (For details of this episode see Franklin (1986, Ch. 3) and Appendix 2.)

3. The Discovery of Bose-Einstein Condensation: Confirmation After 70 Years

In both of the episodes discussed previously, those of parity nonconservation and of CP violation, we saw a decision between two competing classes of theories. This episode, the discovery of Bose-Einstein condensation (BEC), illustrates the confirmation of a specific theoretical prediction 70 years after it was first made. Bose (1924) and Einstein (1924; 1925) predicted that a gas of noninteracting bosonic atoms will, below a certain temperature, suddenly develop a macroscopic population in the lowest energy quantum state. (For details of this episode see Appendix 3.)
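
The quantitative content of this prediction can be indicated with the textbook formula for the transition temperature of a uniform, noninteracting Bose gas, T_c = (2πℏ²/(m·k_B)) (n/ζ(3/2))^(2/3). The sketch below simply evaluates this formula for an assumed atomic mass and number density chosen for illustration; the atomic gases in which BEC was eventually observed are held in traps and require a modified, trap-dependent expression.

    import math

    hbar = 1.054571817e-34   # reduced Planck constant, J*s
    k_B = 1.380649e-23       # Boltzmann constant, J/K
    amu = 1.66053907e-27     # atomic mass unit, kg
    zeta_3_2 = 2.612         # Riemann zeta(3/2)

    def bec_critical_temperature(mass_kg, density_per_m3):
        # Transition temperature of a uniform, noninteracting Bose gas.
        return (2 * math.pi * hbar ** 2 / (mass_kg * k_B)) * (density_per_m3 / zeta_3_2) ** (2.0 / 3.0)

    # Illustrative inputs only: an atom of mass 87 u at a number density of
    # 1e20 per cubic metre.  These are assumptions for the sake of the example,
    # not the parameters of any particular experiment.
    T_c = bec_critical_temperature(87 * amu, 1e20)
    print("T_c is roughly {:.0f} nanokelvin".format(T_c * 1e9))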

C. Complications

In the three episodes discussed in the previous section, the relation between experiment and theory was clear. The experiments gave unequivocal results and there was no ambiguity about what theory was predicting. None of the conclusions reached has since been questioned. Parity and CP symmetry are violated in the weak interactions and Bose-Einstein condensation is an accepted phenomenon. In the practice of science things are often more complex. Experimental results may be in conflict, or may even be incorrect. Theoretical calculations may also be in error or a correct theory may be incorrectly applied. There are even cases in which both experiment and theory are wrong. As noted earlier, science is fallible. In this section I will briefly discuss several episodes which illustrate these complexities.

1. The Fall of the Fifth Force

The episode of the fifth force is a case of the refutation of a hypothesis, but only after a disagreement between experimental results was resolved. The "Fifth Force" was a proposed modification of Newton’s Law of Universal Gravitation. The initial experiments gave conflicting results: one supported the existence of the Fifth Force whereas the other argued against it. After numerous repetitions of the experiment, the discord was resolved and a consensus reached that the Fifth Force did not exist. (For details of this episode see Appendix 4.)

2. Right Experiment, Wrong Theory: The Stern-Gerlach Experiment

The Stern-Gerlach experiment was regarded as crucial at the time it was performed, but, in fact, wasn’t. In the view of the physics community it decided the issue between two theories, refuting one and supporting the other. In the light of later work, however, the refutation stood, but the confirmation was questionable. In fact, the experimental result posed problems for the theory it had seemingly confirmed. A new theory was proposed and although the Stern-Gerlach result initially also posed problems for the new theory, after a modification of that new theory, the result confirmed it. In a sense, it was crucial after all. It just took some time.

The Stern-Gerlach experiment provides evidence for the existence of electron spin. These experimental results were first published in 1922, although the idea of electron spin wasn’t proposed by Goudsmit and Uhlenbeck until 1925 (1925; 1926). One might say that electron spin was discovered before it was invented. (For details of this episode see Appendix 5).

3. Sometimes Refutation Doesn’t Work: The Double-Scattering of Electrons

In the last section we saw some of the difficulty inherent in experiment-theory comparison. One is sometimes faced with the question of whether the experimental apparatus satisfies the conditions required by theory, or conversely, whether the appropriate theory is being compared to the experimental result. A case in point is the history of experiments on the double-scattering of electrons by heavy nuclei (Mott scattering) during the 1930s and the relation of these results to Dirac’s theory of the electron, an episode in which the question of whether or not the experiment satisfied the conditions of the theoretical calculation was central. Initially, experiments disagreed with Mott’s calculation, casting doubt on the underlying Dirac theory. After more than a decade of work, both experimental and theoretical, it was realized that there was a background effect in the experiments that masked the predicted effect. When the background was eliminated experiment and theory agreed. (Appendix 6)

D. Other Roles

1. Evidence for a New Entity: J.J. Thomson and the Electron

Experiment can also provide us with evidence for the existence of the entities involved in our theories. J.J. Thomson’s experiments on cathode rays provided grounds for belief in the existence of electrons. (For details of this episode see Appendix 7).

2. The Articulation of Theory: Weak Interactions

Experiment can also help to articulate a theory. Experiments on beta decay from the 1930s to the 1950s determined the precise mathematical form of Fermi’s theory of beta decay. (For details of this episode see Appendix 8.)

III. Conclusion

In this essay varying views on the nature of experimental results have been presented. Some argue that the acceptance of experimental results is based on epistemological arguments, whereas others base acceptance on future utility, social interests, or agreement with existing community commitments. Everyone agrees, however, that, for whatever reasons, a consensus is reached on experimental results. These results then play many important roles in physics, and we have examined several of these roles, although certainly not all of them. We have seen experiment deciding between two competing theories, calling for a new theory, confirming a theory, refuting a theory, providing evidence that determined the mathematical form of a theory, and providing evidence for the existence of an elementary particle involved in an accepted theory. We have also seen that experiment has a life of its own, independent of theory. If, as I believe, epistemological procedures provide grounds for reasonable belief in experimental results, then experiment can legitimately play the roles I have discussed and can provide the basis for scientific knowledge.

Bibliography

Principal Works:

Other Suggested Reading

Other Internet Resources

[Please contact the author with suggestions.]

Related Entries

confirmation | logic: inductive | rationalism vs. empiricism | scientific method | scientific realism

Copyright © 1998 by
Allan Franklin
Allan.Franklin@Colorado.edu



First published: October 5, 1998
Content last modified: October 26, 1998