Stanford Encyclopedia of Philosophy
This is a file in the archives of the Stanford Encyclopedia of Philosophy.

Counterfactual Theories of Causation

First published Wed Jan 10, 2001; substantive revision Sun Mar 30, 2008

The basic idea of counterfactual theories of causation is that the meaning of causal claims can be explained in terms of counterfactual conditionals of the form “If A had not occurred, C would not have occurred”. While counterfactual analyses have been given of type-causal concepts, most counterfactual analyses have focused on singular causal or token-causal claims of the form “event c caused event e”. Analyses of token-causation have become popular in the last thirty years, especially since the development in the 1970's of possible world semantics for counterfactuals. The best known counterfactual analysis of causation is David Lewis's (1973b) theory. However, intense discussion over thirty years has cast doubt on the adequacy of any simple analysis of singular causation in terms of counterfactuals. Recent years have seen a proliferation of different refinements of the basic idea to achieve a closer match with commonsense judgements about causation.


1. Early Counterfactual Theories

The first explicit definition of causation in terms of counterfactuals was, surprisingly enough, given by Hume, when he wrote: “We may define a cause to be an object followed by another, and where all the objects, similar to the first, are followed by objects similar to the second. Or, in other words, where, if the first object had not been, the second never had existed.” (1748, Section VII). It is difficult to understand how Hume could have confused the first, regularity definition with the second, very different counterfactual definition.

At any rate, Hume never explored the alternative counterfactual approach to causation. In this, as in much else, he was followed by generations of empiricist philosophers. The chief obstacle in empiricists' minds to explaining causation in terms of counterfactuals was the obscurity of counterfactuals themselves, owing chiefly to their reference to unactualised possibilities. Starting with J. S. Mill (1843), empiricists tried to analyse counterfactuals ‘metalinguistically’ in terms of implication relations between statements. The rough idea is that a counterfactual of the form “If it had been the case that A, it would have been the case that C” is true if and only if there is an auxiliary set S of true statements consistent with the antecedent A, such that the members of S, when conjoined with A, imply the consequent C. Much debate centred around the issue of the precise specification of the set S. (See N. Goodman 1947.) Most empiricists agreed that S would have to include statements of laws of nature, while some thought that it would have to include statements of singular causation. While the truth conditions of counterfactuals remained obscure in these ways, few empiricists thought it worthwhile to try to explain causation via counterfactuals.

Indeed, the first real attempts to present rigorous counterfactual analyses of causation came only in the late 1960's. (See A. Lyon 1967.) Typical of these attempts was J. L. Mackie's counterfactual analysis in Chapter 2 of his seminal book The Cement of the Universe (1974). As well as offering a sophisticated regularity theory of causation ‘in the objects’, Mackie presented a counterfactual account of the concept of a cause as “what makes the difference in relation to some background or causal field” (1980, p.xi). Mackie's account of the concept of causation is rich in insights, especially concerning its relativity to a field of background conditions. However, his account never gained as much attention as his regularity theory of causation ‘in the objects’, no doubt because his view of counterfactuals (in his (1973)), as condensed arguments that do not have truth values, compounded empiricists' scepticism about counterfactuals.

The true potential of the counterfactual approach to causation did not become clear until counterfactuals became better understood through the development of possible world semantics in the early 1970's.

2. Lewis's 1973 Counterfactual Analysis

The best known and most thoroughly elaborated counterfactual theory of causation is David Lewis's theory in his (1973b), which was refined and extended in articles subsequently collected in his (1986a). In response to doubts about the theory's treatment of preemption, Lewis subsequently proposed a fairly radical revision of the theory. (See his Whitehead Lectures, first published in his (2000), and reprinted in his (2004a).) In this section we shall confine our attention to the original 1973 theory, deferring the later changes he proposed for consideration below.

2.1 Counterfactuals and Causal Dependence

Like most contemporary counterfactual theories, Lewis's theory employs a possible world semantics for counterfactuals. Such a semantics states truth conditions for counterfactuals in terms of similarity relations between possible worlds. Lewis famously espouses a realism about possible worlds, according to which non-actual possible worlds are real concrete entities on a par with the actual world. (See Lewis's defence of modal realism in his (1986e).) However, most contemporary philosophers would seek to deploy the explanatorily fruitful possible worlds framework while distancing themselves from full-blown realism about possible worlds themselves. For example, many would propose to understand possible worlds as maximally consistent sets of propositions; or even to treat them instrumentally as useful theoretical entities having no independent reality.

The central notion of a possible world semantics for counterfactuals is a relation of comparative similarity between worlds (Lewis 1973a). One world is said to be closer to actuality than another if the first resembles the actual world more than the second does. Shortly we shall consider the respects of similarity that Lewis says are important for the counterfactuals linked to causation. For now we simply note two formal constraints he imposes on this similarity relation. First, the relation of similarity produces a weak ordering of worlds so that any two worlds can be ordered with respect to their closeness to the actual world, with allowance being made for ties in closeness. Secondly, the actual world is closest to actuality, resembling itself more than any other world resembles it.

In terms of this similarity relation, the truth condition for the counterfactual “If A were (or had been) the case, C would be (or have been) the case”, is stated as follows:

(1) “If A were the case, C would be the case” is true in the actual world if and only if (i) there are no possible A-worlds; or (ii) some A-world where C holds is closer to the actual world than is any A-world where C does not hold.

We shall ignore the first case in which the counterfactual is vacuously true. The fundamental idea of this analysis is that the counterfactual “If A were the case, C would be the case” is true just in case it takes less of a departure from actuality to make the antecedent true along with the consequent than to make the antecedent true without the consequent.
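Truth condition (1) can be made concrete for a finite stock of worlds. The following is a minimal sketch, not Lewis's own formalism: it assumes that comparative similarity can be encoded as an integer rank (called `distance` here, with the actual world at 0), and the match-striking worlds are invented purely for illustration.

```python
from typing import Any, Callable, Dict, List

# Assumption: a world is a dict of facts plus an integer "distance" rank,
# where a smaller rank means closer to actuality (ties are allowed).
World = Dict[str, Any]
Proposition = Callable[[World], bool]


def would(worlds: List[World], antecedent: Proposition, consequent: Proposition) -> bool:
    """Evaluate "If A were the case, C would be the case" by condition (1)."""
    a_worlds = [w for w in worlds if antecedent(w)]
    if not a_worlds:
        return True  # clause (i): no A-worlds, so vacuously true
    with_c = [w["distance"] for w in a_worlds if consequent(w)]
    without_c = [w["distance"] for w in a_worlds if not consequent(w)]
    if not with_c:
        return False  # no A-world where C holds
    if not without_c:
        return True  # every A-world is a C-world
    # clause (ii): some A-and-C world is closer than any A-and-not-C world
    return min(with_c) < min(without_c)


# Invented toy model: the match is struck and lights in the actual world.
WORLDS: List[World] = [
    {"name": "actual", "distance": 0, "struck": True, "lit": True},
    {"name": "w1", "distance": 1, "struck": False, "lit": False},
    {"name": "w2", "distance": 2, "struck": False, "lit": True},
]


def struck(w: World) -> bool:
    return w["struck"]


def not_struck(w: World) -> bool:
    return not w["struck"]


def lit(w: World) -> bool:
    return w["lit"]


def not_lit(w: World) -> bool:
    return not w["lit"]


print(would(WORLDS, not_struck, not_lit))  # True: w1 is closer than w2
print(would(WORLDS, not_struck, lit))      # False
print(would(WORLDS, struck, lit))          # True: true antecedent, true consequent
```

In this toy model the closest world without the striking is also a world without the lighting, so the counterfactual "If the match had not been struck, it would not have lit" comes out true.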

In terms of counterfactuals, Lewis defines a notion of causal dependence between events, which plays a central role in his theory (1973b).

(2) Where c and e are two distinct possible events, e causally depends on c if and only if, if c were to occur e would occur; and if c were not to occur e would not occur.

This condition states that whether e occurs or not depends on whether c occurs or not. Where c and e are actual occurrent events, this truth condition can be simplified somewhat. For in this case it follows from the second formal condition on the comparative similarity relation that the counterfactual “If c were to occur e would occur” is automatically true: this formal condition implies that a counterfactual with true antecedent and true consequent is itself true. Consequently, the truth condition for causal dependence becomes:

(3) Where c and e are two distinct actual events, e causally depends on c if and only if, if c were not to occur e would not occur.

The right hand side of this condition is, of course, Hume's second definition of causation. (As we shall see shortly, Lewis's official definition of causation differs from it, as he defines causation not in terms of causal dependence directly, but in terms of chains of causal dependence.) Why is it plausible to think that causation is conceptually linked with counterfactuals in the way specified by this definition of causal dependence? One reason is that the idea of a cause is conceptually linked with the idea of something that makes a difference and this idea in turn is best understood in terms of counterfactuals. In Lewis's words: “We think of a cause as something that makes a difference, and the difference it makes must be a difference from what would have happened without it. Had it been absent, its effects — some of them, at least, and usually all — would have been absent as well.” (1973b, p.161)

There are three important things to note about the definition of causal dependence. First, it takes the primary relata of causal dependence to be events. Lewis's own theory of events (1986b) construes events as classes of possible spatiotemporal regions. However, very different conceptions of events are compatible with the basic definition. Indeed, it even seems possible to formulate it in terms of facts rather than events. (For instance, see Mellor 1996, 2004.)

Secondly, the definition requires the causally dependent events to be distinct from each other. Distinctness means that the events are not identical, neither overlaps the other, and neither implies the other. This qualification is important if spurious non-causal dependences are to be ruled out. (For this point see Kim 1973 and Lewis 1986b.) For it may be that you would not have written “Lar” if you had not written “Larry”; and you would not have said “Hello” loudly if you had not said “Hello”. But neither dependence counts as a causal dependence since the paired events are not distinct from each other in the required sense.

Thirdly, the counterfactuals that are employed in the analysis are to be understood according to what Lewis calls the standard interpretation. There are several possible ways of interpreting counterfactuals; and some interpretations give rise to spurious non-causal dependences between events. For example, suppose that the events c and e are effects of a common cause d. It is tempting to reason that there must be a causal dependence between c and e by engaging in the following piece of counterfactual reasoning: if c had not occurred, then it would have to have been the case that d did not occur, in which case e would not have occurred. But Lewis says these counterfactuals, which he calls backtracking counterfactuals, are not to be used in the assessment of causal dependence. The right counterfactuals to be used are non-backtracking counterfactuals that typically hold the past fixed up until the time at which the counterfactual antecedent is supposed to obtain.
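The contrast between backtracking and non-backtracking reasoning in the common-cause case can be illustrated with a deliberately crude toy model. It is not Lewis's similarity semantics: it simply assumes that d fixes both c and e, and that a non-backtracking supposition changes c directly while leaving d as it actually was. The function `run_model` and its parameter `c_override` are hypothetical names introduced for this sketch.

```python
from typing import Dict, Optional


def run_model(d: bool, c_override: Optional[bool] = None) -> Dict[str, bool]:
    """Toy common-cause structure: d brings about both c and e.

    c_override, when given, sets c directly without touching d,
    standing in for a supposition that holds the past fixed.
    """
    c = d if c_override is None else c_override
    e = d
    return {"d": d, "c": c, "e": e}


actual = run_model(d=True)

# Non-backtracking: keep d as it actually was and suppose only that c is absent.
non_backtracking = run_model(d=actual["d"], c_override=False)

# Backtracking: reason that c's absence would have required d's absence.
backtracking = run_model(d=False)

print(non_backtracking["e"])  # True: e still occurs, so e does not depend on c
print(backtracking["e"])      # False: the spurious dependence of e on c
```

On the non-backtracking reading e still occurs when c is supposed absent, so no causal dependence of e on c is generated; the backtracking reading produces the spurious dependence described above.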

2.2 The Temporal Asymmetry of Causal Dependence

What constitutes the direction of the causal relation? Why is this direction typically aligned with the temporal direction from past to future? In answer to these questions, Lewis (1979) argues that the direction of causation is the direction of causal dependence; and it is typically true that events causally depend on earlier events but not on later events. He emphasises the contingency of the latter fact because he regards backwards or time-reversed causation as a conceptual possibility that cannot be ruled out a priori. Accordingly, he dismisses any analysis of counterfactuals that would deliver the temporal asymmetry by conceptual fiat.

Lewis's explanation of the temporal asymmetry of counterfactual dependence is based on a de facto asymmetry about the actual world. He defines a determinant for an event as any set of conditions jointly sufficient, given the laws of nature, for the event's occurrence. (Determinants of an event may be causes or traces of the event.) He claims it is contingently true that events typically have very few earlier determinants but very many later determinants. As an illustration, he cites Popper's (1956) example of a spherical wavefront expanding outwards from a point source. This is a process where each sample of the wave postdetermines what happens at the point at which the wave is emitted. He says the reverse process in which a spherical wave contracts inward with each sample of wave predetermining what happens at the point the wave is absorbed would obey the laws of nature but seldom happens in actual fact.

Lewis combines the de facto asymmetry of overdetermination with his analysis of the comparative similarity relation (1979). According to this analysis, there are several respects of similarity to be taken into account in evaluating non-backtracking counterfactuals: similarity with respect to laws of nature and also similarity with respect to particular matters of fact. Worlds are more similar to the actual world the fewer miracles or violations of the actual laws of nature they contain. Again, worlds are more similar to the actual world the greater the spatio-temporal region of perfect match of particular fact they have with the actual world. If the actual world is governed by deterministic laws, these rules will clash in assessing which counterfactual worlds are more similar to the actual world. For a world that makes a counterfactual antecedent true must differ from the actual world either in allowing some violation of the actual laws, or in differing from the actual world in particular matters of fact. Lewis's analysis allows a tradeoff between these competing respects of similarity in such cases. It implies that worlds with an extensive region of perfect match of particular fact can be considered very similar to the actual world provided that the match in particular facts with the actual world is achieved at the cost of a small, local miracle, but not at the cost of a big, diverse miracle. Taken by itself, this account contains no built-in time asymmetry. That comes only when it is combined with the asymmetry of overdetermination.

To see how the two parts combine, consider the famous example of Nixon and the Nuclear Holocaust. An early objection to Lewis's account of counterfactuals (Fine 1975) was that, counterintuitively, it makes this counterfactual false:

(4) If Nixon had pressed the button, there would have been a nuclear war.

The argument is that a world in which Nixon pressed the button, but some minute violation of the laws then prevented a nuclear war, is much more like the actual world than one in which Nixon pressed the button and a nuclear war took place. Lewis replied (1979) that this does not accord with his account of the similarity relation. On this account, a button-pressing world that diverges from the actual world by virtue of a miracle is more like the actual world than a button-pressing world that converges with the actual world by virtue of a miracle. For in view of the asymmetry of overdetermination, the divergence miracle that allows Nixon to press the button need only be a small, local miracle, but the convergence miracle required to wipe out the traces of Nixon's pressing the button must be a very big, diverse miracle. Of course, if the asymmetry of overdetermination went in the opposite temporal direction, the very same standards of similarity would dictate the opposite verdict.

In general, then, the symmetric analysis of similarity, combined with the de facto asymmetry of overdetermination, implies that worlds that accommodate counterfactual changes by preserving the actual past and allowing for divergence miracles are more similar to the actual world than worlds that accommodate such changes by allowing for convergence miracles that preserve the actual future. This fact in turn implies that, where the asymmetry of overdetermination obtains, the present counterfactually depends on the past, but not on the future.

2.3 Transitivity and Preemption

According to Lewis, causal dependence between actual events is sufficient for causation, but not necessary (1973b): it is possible to have causation without causal dependence. This can happen in the following way. Suppose that c causes d in virtue of the fact that d causally depends on c, and d causes e in virtue of the fact that e causally depends on d. Then because causation is transitive, Lewis insists, c must cause e. However, because causal dependence is not transitive like causation, the causal relation between c and e may not be matched by a causal dependence. (We shall shortly consider an example of this kind.)

To overcome this problem Lewis extends causal dependence to a transitive relation by taking its ancestral. He defines a causal chain as a finite sequence of actual events c, d, e,… where d causally depends on c, e on d, and so on throughout the sequence. Then causation is finally defined in these terms:

(5) c is a cause of e if and only if there exists a causal chain leading from c to e.

This definition not only ensures the transitivity of causation, but it also appears to solve an additional problem to do with preemption that is illustrated by the following example. Suppose that two crack marksmen conspire to assassinate a hated dictator, agreeing that one or other will shoot the dictator on a public occasion. Acting side-by-side, assassins A and B find a good vantage point, and, when the dictator appears, both take aim. A pulls his trigger and fires a shot that hits its mark, but B desists from firing when he sees A pull his trigger. Here assassin A's actions are the actual cause of the dictator's death, while B's actions are a preempted potential cause. (Lewis distinguishes such cases of preemption from cases of symmetrical overdetermination in which two processes terminate in the effect, with neither process preempting the other. Lewis believes that these cases are not suitable test cases for a theory of causation since they do not elicit clear judgements.) The problem raised by this example of preemption is that both actions are on a par from the point of view of causal dependence: if neither A nor B acted, then the dictator would not have died; and if either had acted without the other, the dictator would have died.

However, given the definition of causation in terms of causal chains, Lewis is able to distinguish the preempting actual cause from the preempted potential cause. There is a causal chain running from A's actions to the dictator's death, but no such chain running from B's actions to the dictator's death. Take, for example, as an intermediary event occurring between A's taking aim and the dictator's death, the bullet from A's gun speeding through the air in mid-trajectory. The speeding bullet causally depends on A's action since the bullet would not have been in mid-trajectory without A's action; and the dictator's death causally depends on the speeding bullet since by the time the bullet is in mid-trajectory B has refrained from firing so that the dictator would not have died without the presence of the speeding bullet. (Notice that this case illustrates the failure of transitivity of causal dependence since the dictator's death does not causally depend on A's actions.) Hence, we have a causal chain, and so causation. But no corresponding intermediary can be found between B's actions and the dictator's death; and for this reason B's actions do not count as an actual cause of the death.
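Definition (5) treats causation as the ancestral of causal dependence, which amounts to reachability in a graph whose edges are one-step dependences. The sketch below assumes the dependence facts of the two-assassin case are simply given as a table (the event labels and the `DEPENDS` mapping are hypothetical); it does not derive them from counterfactuals.

```python
from collections import deque
from typing import Dict, Set

# Assumed one-step causal dependences: each key maps to the events that
# causally depend on it. Note that the death does not depend directly on A's action.
DEPENDS: Dict[str, Set[str]] = {
    "A_acts": {"bullet_in_flight"},
    "bullet_in_flight": {"dictator_dies"},
    "B_acts": set(),
}


def is_cause(c: str, e: str, dependence: Dict[str, Set[str]]) -> bool:
    """Return True if a chain of one or more dependence steps leads from c to e."""
    seen: Set[str] = set()
    queue = deque(dependence.get(c, set()))
    while queue:
        event = queue.popleft()
        if event == e:
            return True
        if event not in seen:
            seen.add(event)
            queue.extend(dependence.get(event, set()))
    return False


print("dictator_dies" in DEPENDS["A_acts"])          # False: no direct dependence
print(is_cause("A_acts", "dictator_dies", DEPENDS))  # True: chain via the bullet
print(is_cause("B_acts", "dictator_dies", DEPENDS))  # False: no chain from B
```

The search finds the two-step chain through the bullet in flight even though the death does not depend directly on A's action, and it finds no chain from B's action.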

2.4 Chancy Causation

So far we have considered how the counterfactual theory of causation works under the assumption of determinism. But what about causation when determinism fails? Lewis (1986c) argues that chancy causation is a conceptual possibility that must be accommodated by a theory of causation. Indeed, contemporary physics tells us the actual world abounds with probabilistic processes that are causal in character. To take a familiar example (Lewis 1986c): suppose that you mischievously hook up a bomb to a radioactive source and geiger counter in such a way that the bomb explodes when the counter registers a certain number of clicks. If it happens that the counter registers the required number of clicks and the bomb explodes, your act caused the explosion, even though there is no deterministic connection between them.

In order to accommodate chancy causation, Lewis (1986c) defines a more general notion of causal dependence in terms of chancy counterfactuals. These counterfactuals are of the form “If A were the case Pr (C) would be x”, where the counterfactual is an ordinary would-counterfactual, interpreted according to the semantics above, and the Pr operator is a probability operator with narrow scope confined to the consequent of the counterfactual. Lewis interprets the probabilities involved as temporally indexed single-case chances. (See his (1980) for the theory of single-case chance.)

The more general notion of causal dependence reads:

(6) Where c and e are distinct actual events, e causally depends on c if and only if, if c had not occurred, the chance of e's occurring would be much less than its actual chance.

This definition covers cases of deterministic causation in which the chance of the effect with the cause is 1 and the chance of the effect without the cause is 0. But it also allows for cases of irreducible probabilistic causation where these chances can take non-extreme values. It is similar to the central notion of probabilistic relevance used in probabilistic theories of type-causation, except that it employs chancy counterfactuals rather than conditional probabilities. (See the discussion in Lewis 1986c for the advantages of the counterfactual approach over the probabilistic one. Also see the entry “Probabilistic Causation”.)

The rest of the theory of chancy causation follows the outlines of the theory of deterministic causation. Causal dependence is extended to a transitive notion by taking its ancestral. As before, we have causation when we have one or more steps of causal dependence.

2.5 The Theory's Advantages

Before turning to survey some of the problems confronting Lewis's theory of causation, it is worthwhile pausing to consider some of the advantages it affords.

At the time that Lewis advanced his original theory, regularity theories of causation were the orthodoxy. Taking Hume's first definition as their point of departure, these theories defined causation in terms of subsumption under lawful regularities. A typical formulation went like this: c is a cause of e if and only if c belongs to a minimal set of conditions that are jointly sufficient for e, given the laws. It was well known that theories of this kind were faced with a number of recalcitrant counterexamples. Thus, while c might belong to a minimal set of sufficient conditions for e when c is a genuine cause of e, this might also be true when c is an effect of e, that is, an effect which could not have occurred, given the laws and the actual circumstances, except by being caused by e. Or it might be true when c and e are joint effects of a common deterministic cause. Or when c is a preempted potential cause of e: something that did not cause e, but would have done so if the actual cause had been absent.

In contrast, Lewis's counterfactual analysis of causation is not subject to the same counterexamples, so long as the counterfactuals in the definition of causal dependence and causation are interpreted in a non-backtracking fashion. The theory implies that even if c belongs to a minimal set of sufficient conditions for e, e will not causally depend on c when c occurs after e as its effect, since earlier events do not typically causally depend on later events. Nor will e causally depend on c when c and e are joint effects of a common cause, since the non-backtracking counterfactual “If c had not occurred, e would still have occurred” will be true in view of the fact that it holds fixed the presence of the common cause. Nor will c count as a cause of e when c is a preempted potential cause of e in a typical case of preemption. For, as we have seen, c will not be connected to e by a chain of causal dependences.

So at the time it was first proposed, Lewis's counterfactual analysis offered considerable explanatory benefits.

3. Problems for Lewis's Counterfactual Theory

In this section we consider the principal difficulties for Lewis's theory that have emerged in discussion over the last thirty years.

3.1 Context-sensitivity

One relatively overlooked aspect of the concept of causation is its sensitivity to contextual factors. In so far as Lewis's theory overlooks this context-sensitivity, it represents a problem for the theory.

The theory assumes that causation is an absolute relation whose nature does not vary from one context to another. (This follows from the way the counterfactuals that define the central notion of causal dependence are governed by a unique, context-invariant system of weighted respects of similarity.) According to the theory, any event but for which an effect would not have occurred is one of the effect's causes. But this generates some absurd results. For example, suppose a camper lights a fire, a sudden gust of wind fans the fire, the fire gets out of control and the forest burns down. It is true that if the camper had not lit the fire, the forest fire would not have occurred. But it is also true that the forest fire would not have occurred if any of a vast number of contingencies, including the camper's birth and his failure to be struck down by a meteor before striking the match, had not occurred. But commonsense draws a distinction between causes and background conditions, ranking the camper's lighting of the fire among the former, and his birth and his failure to be struck down by a meteor, among the latter.

H. L. A. Hart and A. Honoré (1965; 2nd ed 1985) argue that the distinction between causes and conditions is relative to context in at least two different ways. One form of relativity might be called relativity to the context of occurrence. If a forest is destroyed by fire, the presence of oxygen would be cited as a mere condition of the forest's destruction. On the other hand, if a fire breaks out in a laboratory where oxygen is deliberately excluded, it may be appropriate to cite the presence of oxygen as a cause of the fire. The second form of relativity might be called relativity to the context of enquiry. For example, the cause of a great famine in India may be identified by an Indian farmer as the drought, but the World Food Authority may identify the Indian government's failure to build up reserves as the cause, and the drought as a mere condition.

For the most part, Lewis ignores these subtle context-sensitive distinctions, as he says he is interested in a broad notion of cause. In his view (1986d), every event has an objective causal history consisting of a vast structure of events ordered by causal dependence. The human mind may select parts of the causal history for attention, perhaps different parts for different purposes of enquiry. However, Lewis does not specify the ‘principles of invidious selection’ by which some parts of the causal history are selected for attention, except to mention the relevance of Grice's maxims of conversation. But Grice's maxims of conversation, as general principles of rational information exchange, are not well suited to explaining the causation-specific distinctions we draw. As several philosophers have pointed out (A. Garfinkel (1981); C. Hitchcock (1996a, 1996b); P. Lipton (1990); J. Woodward (1984); and B. Van Fraassen (1981)), some of the contextual principles behind our causal judgements seem to rely on considerations concerning which class of situations the effect is contrasted with.

Thus, in the example of the Indian famine, we contrast the actual situation in which a famine occurs with another situation in which normal conditions prevail and a famine does not occur. A cause is then thought of as a factor that makes the difference between these situations; and the background conditions are thought of as those factors that are common to the two situations. In different contexts of enquiry, the contrast situation is framed in different terms. A farmer may take the contrast situation to be the normal situation in which the government does not stockpile food reserves but there is no famine. In this case it would be reasonable for the farmer to identify the drought as the factor that makes the difference between this contrast situation and the actual situation in which there is famine. On the other hand, an official of the World Food Authority with a different conception of what normally happens may take the contrast situation to be one in which governments build up food reserves as a precaution against droughts. Consequently, it would be reasonable for the official to see the failure of the government to build up food reserves as the factor that makes the difference between the contrast situation and the actual situation in which there is a famine. (For discussion of the relevance of contrastive explanation to the causes/conditions distinction see Menzies 2004a; 2007.)

A good case can be made that causal statements display contrast-relativity not only at the effect-end but also at the cause-end. (See Hitchcock 1996a, 1996b; Maslen 2004; Schaffer 2005.) Recognising this helps to deal with a problem affecting Lewis's original theory. In evaluating whether an event c caused an event e, Lewis's theory says we have to consider what would have happened in those closest worlds in which c did not occur. For example, in evaluating whether the camper's lighting of the fire caused the forest fire, we have to consider what would have happened in those closest possible worlds in which the camper's action of lighting the fire did not occur. Are these worlds in which the camper does not light the fire but does something else instead, or are they worlds in which he lights the fire in a slightly different manner (perhaps with a lighter instead of matches) or at a slightly different time (three minutes later when the wind died down)? In order to answer such questions, Lewis says it is necessary to say how much of a change or a delay it takes for an event to become an altogether different event, rather than a different version of the same event. (Lewis sometimes discusses this issue as the question of how fragile events are: a modally fragile event is one which cannot occur in a different manner or at a different time from its actual manner and time of occurrence. See Lewis 1986b.) The problem, as he sees it, is that there is no unique principled way in which we do this: there is linguistic indeterminacy about what event nominals refer to. He writes: “We have not made up our minds: and if we presuppose sometimes one answer and sometimes another answer, we are entirely within our linguistic rights. This is itself a big problem for a counterfactual analysis of causation, quite apart from the problem of preemption.” (2000, p.186)

However, if we recognise that the cause-end of causal statements displays contrast-relativity as well as the effect-end, we can obviate the need to provide an account of the identity of events under counterfactual changes. For example, suppose we are interested in why the forest fire took one path P1 rather than another path P2. Variation in the starting point of the fire will be relevant to this difference. So it would be appropriate to say that the camper's lighting the fire in location L1 rather than another location L2 caused the forest fire to take path P1 rather than P2. On the other hand, suppose that we are interested in why the forest fire occurred rather than not occurring at all. Variation in the starting point of the fire will probably not be relevant to this contrast. Rather the appropriate causal statement will be one that says the camper's lighting the fire (in some or other location) rather than his not lighting it (in any location) caused the fire to occur rather than not to occur. Such causal statements reveal the relevant contrasts at both the cause- and the effect-ends. Sometimes, such contrasts are indicated by the use of emphasis as in “The fire's starting in location L1 caused the fire to take path P1”. But more often than not the surface form of causal statements does not disclose the contrasts that are intended and they must be supplied by context. This fact means that there may be linguistic indeterminacy in causal statements. But it is not indeterminacy about the reference of event nominals, but rather about the situations that are intended as contrasts for the cause and the effect. Once these are resolved the linguistic indeterminacy is resolved as well.

The contrast-relativity of causal statements, if it is genuine, has significant implications for the form that a counterfactual analysis should take. Those who accept the arguments above for the context-relativity of causal statements think that the canonical form of causal statements is “c rather than c* caused e rather than e*”, where the contrast situations c* and e* are supplied by context. This suggests that the definition of causal dependence should not be formulated in terms of the counterfactual “If c had not occurred, e would not have occurred”, but in terms of the more specific counterfactual “If c* had occurred instead of c, then e* would have occurred instead of e”. This formulation has several advantages over the old formulation. (See Schaffer 2005.) Its chief advantage from the point of view of our discussion is that it obviates the need for the counterfactual theory to provide an account of the identity of events under hypothetical changes. With this new formulation, there is no need to work out whether c* and e* are identical with, or different from, c and e, respectively. It is simply stipulated on the basis of contextual considerations that c* and e* are intended to act as contrasts to c and e.

3.2 Temporal Asymmetry

There have been several important critical discussions of Lewis's explanation of the temporal asymmetry of causation. (See A. Elga 2000; M. Frisch 2005; D. Hausman 1998, Chap. 6; P. Horwich 1987, Chap. 10; and H. Price 1996, Chap. 6.)

One kind of criticism has focused on the psychological implausibility of Lewis's explanation. (See Horwich 1987.) Recall that the explanation appeals, on the one hand, to a system of weighted respects of similarity between possible worlds that is delivered by a priori conceptual analysis and, on the other hand, to an asymmetry of overdetermination that is claimed to be a contingent a posteriori truth about the actual world. The two-part explanation is supposed to employ facts that are sufficiently well known to play a role in the explanation of our linguistic use of counterfactuals. However, it is psychologically implausible that the intricate system of weighted respects of similarity involving comparison of miracles of different sizes could capture the intuitive similarity relation used in counterfactual reasoning. Why should we have developed such a baroque notion of similarity? Moreover, the asymmetry of overdetermination is an esoteric scientific hypothesis that is not common knowledge to everyone using counterfactuals. So it is very unlikely that this hypothesis could account for ordinary speakers' mastery of the temporal asymmetry of counterfactuals. (For Lewis's reply to this criticism see Postscript E to “Counterfactual Dependence and Time's Arrow” in his (1986a, p. 66).)

Another criticism is that the asymmetry of overdetermination does not exist in the form required to support Lewis's explanation of the temporal asymmetry of counterfactuals. Lewis's idea is that any event e has many postdeterminants and few predeterminants, where a predeterminant or postdeterminant of an event is a set of conditions that are jointly sufficient, given the laws of nature, for the occurrence of the event. But if Lewis is assuming that the laws involved are like those of classical mechanics, he is mistaken on this score. For a theory that is time symmetric and deterministic in both the forward and backward direction will imply that for any local event e and any time t, there is a unique set of conditions obtaining at t that are necessary and sufficient, given the laws, for the occurrence of the event e. The conditions may not be localized conditions that are typically regarded as events, but nonetheless they will qualify as predeterminants or postdeterminants. For example, consider Popper's example of the wave spreading out from a point source. If there is a process that postdetermines what happens at the point at which the wave is emitted, there is also a process, perhaps a very unlocalized process, that predetermines this. Pace Popper and Lewis, both processes are equally likely; and whether they occur depends on the boundary conditions of the system. (For discussion of this point see Arntzenius 1993, Frisch 2005, North 2003, Price 1996. Also see the entry “Thermodynamic Asymmetry in Time”.)

A related criticism concerns the asymmetry of miracles that is central to Lewis's account of the temporal asymmetry of causation. The asymmetry of miracles consists in the fact that a miracle that realises a counterfactual antecedent about particular facts at time t by having a possible world diverge from the actual world just before the time t is smaller and less diverse than a miracle that realises the same counterfactual antecedent and makes a possible world converge to the actual world after the time t. Adam Elga (2000) has argued that the asymmetry of miracles does not hold in many cases.

Elga's argument proceeds by way of an example: Gretta cracks an egg into a hot frying pan at 8:00am and at 8:05am the egg is cooked. Consider the process that occurs in the period from 8:00am to 8:05am, run backwards in time: a cooked egg sits in the frying pan; it coalesces into a raw egg and leaps upward; and a shell closes around it. The laws of thermodynamics allow that this process is physically possible but extremely rare. These laws also state that the process is very sensitive to its initial conditions: even the slightest changes in the molecules making up the state of the cooked egg would result in the process evolving in such a way that the cooked egg continues to sit in the pan rather than coalescing into a raw egg and leaping upwards. But this is, Elga points out, exactly the kind of change that would make for a “convergence miracle”. Take the state of the actual world at 8:05am, holding fixed its future after this point; make some small changes to the molecules making up this state; and then run the laws of thermodynamics backwards in time, and we will almost certainly arrive at a state in which the egg sits in the pan growing colder. This state will be one in which Gretta does not crack the egg. The small change in the state of the actual world at 8:05am is a “convergence miracle” that yields a possible world that realises the counterfactual proposition that Gretta does not crack the egg at 8:00am while holding fixed the actual future after 8:05am. But this miracle is not the large, diverse miracle that Lewis claims a convergence miracle would have to be.

3.3 Transitivity

As we have seen, Lewis builds transitivity into causation by defining it in terms of chains of causal dependence. The transitivity of causation fits with some of our explanatory practices. For example, historians wishing to explain some significant historical event will trace the explanation back through a number of causal links, concluding that the event at the beginning of the causal chain is responsible for the event being explained. On the other hand, a number of counter-examples have been presented which cast doubt on transitivity. (Lewis 2004a presents a short catalogue of these counterexamples.) Here is a sample of three counterexamples.

First, an example due to Michael McDermott (1995). A and B each have a switch in front of them, which they can move to the left or right. If both switches are thrown into the same position, a third person C receives a shock. A does not want to shock C. Seeing B's switch in the left position, A moves her switch to the right. B does want to shock C. Seeing A's switch thrown to the right, she now moves her switch to the right as well. C receives a shock. Clearly, A's throwing her switch to the right causes B to throw her switch to the right, which in turn causes C to receive the shock. But A attempted to prevent the shock so that it seems unreasonable to say that A's move causes C to be shocked.

Second, an example due to Ned Hall (2004).  A person is walking along a mountain trail, when a boulder high above is dislodged and comes careering down the mountain slopes. The walker notices the boulder and ducks at the appropriate time. The careering boulder causes the walker to duck and this, in turn, causes his continued stride. (This second causal link involves double prevention: the duck prevents the collision between walker and boulder which, had it occurred, would have prevented the walker's continued stride.) However, the careering boulder is the sort of thing that would prevent the walker's continued stride and so it seems counterintuitive to say that it causes the stride.

Third, an example due to Douglas Ehring (1987). Jones puts some potassium salts into a hot fire. Because potassium compounds produce a purple flame when heated, the flame changes to a purple colour, though everything else remains the same. The purple flame ignites some flammable material nearby. Here we judge that putting the potassium salts in the fire caused the purple flame, which in turn caused the flammable material to ignite. But it seems implausible to judge that putting the potassium salts in the fire caused the flammable material to ignite.

Various replies have been made to these counterexamples. The last counterexample seems the most easily deflected. For example, Maslen (2004), who endorses the contrast-relativity of causal statements, has argued that this example is misdiagnosed as a counterexample to transitivity, as the contrast situation at the effect-end of the first causal statement does not match up with the contrast situation at the cause-end of the second causal statement. Thus, the first causal statement should be interpreted as saying that Jones's putting potassium salts in the fire rather than not doing so caused the flame to turn purple rather than yellow; but the second causal statement should be interpreted as saying that the purple fire's occurring rather than not occurring caused the flammable material to ignite rather than not to ignite. Where there is a mismatch of this kind, we do not have a genuine counterexample to transitivity. L. Paul (2004) offers a similar diagnosis of the last example, though her diagnosis proceeds in terms of event aspects, which she takes to be the primary relata of causation. She argues similarly that there is a mismatch between the event aspect that is the effect of the first causal link (the flame's being a purple colour) and the event aspect that is the cause of the second causal link (the flame's touching the flammable material).

The first and second examples cannot be handled in the same way. Some defenders of transitivity have replied that our intuitions about the intransitivity of causation in these examples are misleading. For instance, Ned Hall (2000) has argued that we should suspect our intuition in the second example because it involves double prevention, which he claims is not a genuine kind of causation. Thus, he denies that the walker's ducking caused his continued stride since this holds only by double prevention. (This also commits him to denying that causal dependence is sufficient for causation, since the walker's continued stride causally depends on his ducking the boulder.) He offers a different diagnosis of why our intuitions go awry in the first example. Lewis (2004a) adopts a similar strategy of trying to explain away the force of our intuitions in these examples. He points out that the counterexamples to transitivity typically involve a structure in which a c-type event generally prevents an e-type event but in the particular case the c-event actually causes another event that counters the threat and causes the e-event. If we mix up questions of what is generally conducive to what, with questions about what caused what in this particular case, he says, we may think that it is reasonable to deny that c causes e. But if we keep the focus sharply on the particular case, we must insist that c does in fact cause e.

The debate about the transitivity of causation is not easily settled, partly because it is tied up with the issue of how it is best for a counterfactual theory to deal with examples of preemption. As we have seen, Lewis's counterfactual theory relies on the transitivity of causation to handle cases of preemption. If such cases could be handled in some other way, that would take some of the theoretical pressure off the theory, allowing it to concede the persuasive counterexamples to transitivity without succumbing to the difficulties posed by preemption. (For more on this point see Hitchcock 2001.)

3.4 Preemption

As we have seen, Lewis employs his strategy of defining causation in terms of chains of causal dependence not only to make causation transitive, but also to deal with preemption examples. However, there are preemption examples that this strategy cannot deal with satisfactorily. Difficulties concerning preemption have proven to be the biggest bugbear for Lewis's theory.

In his (1986c), Lewis distinguishes cases of early and late preemption. In early preemption examples, the process running from the preempted alternative is cut short before the main process running from the preempting cause has gone to completion. The example of the two assassins, given above, is an example of this sort. The theory of causation in terms of chains of causal dependence can handle this sort of example. In contrast, cases of late preemption are ones in which the process running from the preempted cause is cut short only after the main process has gone to completion and brought about the effect. The following is an example of late preemption due to Hall (2004).

Billy and Suzy throw rocks at a bottle. Suzy throws first so that her rock arrives first and shatters the glass. Without Suzy's throw, Billy's throw would have shattered the bottle. However, Suzy's throw is the actual cause of the shattered bottle, while Billy's throw is merely a preempted potential cause. This is a case of late preemption because the alternative process (Billy's throw) is cut short after the main process (Suzy's throw) has actually brought about the effect.

Lewis's theory cannot explain the judgement that Suzy's throw was the actual cause of the shattering of the bottle. For there is no causal dependence between Suzy's throw and the shattering, since even if Suzy had not thrown her rock, the bottle would have shattered due to Billy's throw. Nor is there a chain of stepwise dependences running from cause to effect, because there is no event intermediate between Suzy's throw and the shattering that links them up into a chain of dependences. Take, for instance, Suzy's rock in mid-trajectory. Certainly, this event depends on Suzy's initial throw, but the problem is that the shattering of the bottle does not depend on it, because even without it the bottle would still have shattered because of Billy's throw.
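The first step of this argument can be made concrete with a small sketch. It assumes a deliberately coarse toy model, not found in Lewis or Hall, in which the bottle shatters just in case at least one rock is thrown; the function names are hypothetical, and flipping a single input stands in only roughly for evaluation at the closest possible worlds.

```python
def bottle_shatters(suzy_throws: bool, billy_throws: bool) -> bool:
    # Toy assumption: either rock on its own suffices to shatter the bottle.
    return suzy_throws or billy_throws


def depends_on(outcome, actual: dict, variable: str) -> bool:
    """Simple counterfactual dependence: does flipping `variable` alone,
    with everything else held at its actual value, flip the outcome?"""
    counterfactual = dict(actual, **{variable: not actual[variable]})
    return outcome(**actual) != outcome(**counterfactual)


# The actual situation: both children throw.
actual = {"suzy_throws": True, "billy_throws": True}

shatter_depends_on_suzy = depends_on(bottle_shatters, actual, "suzy_throws")
shatter_depends_on_billy = depends_on(bottle_shatters, actual, "billy_throws")
```

On these assumptions both results are False, so simple dependence fails to single out Suzy's throw. If Billy's throw is removed from the actual situation, the same test reports that the shattering does depend on Suzy's throw.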

To be sure, the bottle shattering that would have occurred without Suzy's throw would be different from the bottle shattering that actually occurred with Suzy's throw. For a start, it would have occurred later. This observation suggests that one solution to the problem of late preemption might be to insist that the events involved should be construed as fragile events. Accordingly, it will be true rather than false that if Suzy had not thrown her rock, then the actual bottle shattering, taken as a fragile event with an essential time and manner of occurrence, would not have occurred. Lewis himself does not endorse this response on the grounds that a uniform policy of construing events as fragile would go against our usual practices, and would generate many spurious causal dependences. For example, suppose that a poison kills its victim more slowly and painfully when taken on a full stomach. Then, the victim's eating dinner before he drinks the poison would count as a cause of his death since the time and manner of the death depend on the eating of the dinner. (For discussion of the limitations of this response see Lewis 1986c, 2000.)

When we turn from preemption examples involving deterministic causation to those involving chancy causation, we see that the problems for Lewis's theory multiply. One particularly recalcitrant problem is described in Menzies 1989. (See also Woodward 1990.) Suppose that two systems can produce the same effect, perhaps at the same time and in the same manner. (It does not matter whether this is an example of early or late preemption.) However, one system is much more reliable than the other. The reliable system starts and, left to itself, will very probably produce the effect. But you do not leave it to itself. You throw a switch that shuts down the reliable system and turns on the unreliable one. As luck would have it, the unreliable system works and brings about the effect. This kind of example presents a problem for the probabilistic generalisation of the counterfactual theory because the preempting actual cause decreases the chance of the effect while the preempted potential cause increases its chance. In addition to the problem of explaining how the preempting cause qualifies as a cause when the effect does not causally depend on it, the probabilistic counterfactual theory faces the problem of explaining how the preempted cause is not really a cause when the effect does causally depend on it. (Examples of this kind have been the subject of extensive discussion in the context of both counterfactual and probabilistic theories of causation. For discussions about how best to deal with them within theories admitting of indeterminism, see Barker 2004; Beebee 2004; Dowe 2000, 2004; Hitchcock 2004; Kvart 2004; Noordhof 1999, 2004; Ramachandran 1997, 2004.)

4. Later Developments

In this section we shall consider some recent developments of the counterfactual approach to causation, which have been motivated by the desire to overcome the deficiencies in Lewis's 1973 theory, especially with respect to preemption.

4.1 Lewis's 2000 Theory

In an attempt to deal with the various problems facing his 1973 theory, Lewis developed a new version of the counterfactual theory, which he first presented in his Whitehead Lectures at Harvard University in March 1999. (A shortened version of the lectures appeared as his (2000). The full lectures are published as his (2004a).)

Counterfactuals play a central role in the new theory, as in the old. But the counterfactuals it employs do not simply state dependences of whether one event occurs on whether another event occurs. The counterfactuals state dependences of whether, when, and how one event occurs on whether, when, and how another event occurs. A key idea in the formulation of these counterfactuals is that of an alteration of an event. This is an actualised or unactualised event that occurs at a slightly different time or in a slightly different manner from the given event. An alteration is, by definition, a very fragile event that could not occur at a different time, or in a different manner without being a different event. Lewis intends the terminology to be neutral on the issue of whether an alteration of an event is a version of the same event or a numerically different event.

The central notion of the new theory is that of influence.

(7) Where c and e are distinct events, c influences e if and only if there is a substantial range c1, c2, … of different not-too-distant alterations of c (including the actual alteration of c) and there is a range e1, e2, … of alterations of e, at least some of which differ, such that if c1 had occurred, e1 would have occurred, and if c2 had occurred, e2 would have occurred, and so on.

Where one event influences another, there is a pattern of counterfactual dependence of whether, when, and how upon whether, when, and how. As before, causation is defined as an ancestral relation.

(8) c causes e if and only if there is a chain of stepwise influence from c to e.

One of the points Lewis advances in favour of this new theory is that it handles cases of late as well as early preemption. (The theory is restricted to deterministic causation and so does not address the example of probabilistic preemption described in section 3.4.) Reconsider, for instance, the example of late preemption involving Billy and Suzy throwing rocks at a bottle. The theory is supposed to explain why Suzy's throw, and not Billy's throw, is the cause of the shattering of the bottle. If we take an alteration in which Suzy's throw is slightly different (the rock is lighter, or she throws sooner), while holding fixed Billy's throw, we find that the shattering is different too. But if we make similar alterations to Billy's throw while holding Suzy's throw fixed, we find that the shattering is unchanged.
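The pattern of influence just described can be sketched in a toy model. The model is an illustrative assumption rather than anything in Lewis's text: each throw is represented only by the time at which its rock would reach the bottle, the bottle shatters when the first rock arrives, and the arrival times and ranges of alterations are hypothetical values chosen for the example.

```python
def shatter_time(suzy_arrival: float, billy_arrival: float) -> float:
    # Toy assumption: the bottle shatters when the first rock arrives.
    return min(suzy_arrival, billy_arrival)


# Hypothetical actual arrival times (seconds); Suzy's rock arrives first.
ACTUAL_SUZY, ACTUAL_BILLY = 1.0, 1.5


def effect_alterations(alterations, vary_suzy: bool) -> set:
    """Collect the shattering times produced by each alteration of one throw,
    holding the other throw fixed at its actual value."""
    if vary_suzy:
        return {shatter_time(t, ACTUAL_BILLY) for t in alterations}
    return {shatter_time(ACTUAL_SUZY, t) for t in alterations}


# Small ("not-too-distant") alterations of each throw, including the actual one.
suzy_effects = effect_alterations([0.9, 1.0, 1.1], vary_suzy=True)
billy_effects = effect_alterations([1.4, 1.5, 1.6], vary_suzy=False)

# Influence requires that at least some of the effect alterations differ.
suzy_influences = len(suzy_effects) > 1
billy_influences = len(billy_effects) > 1
```

Under these assumptions the alterations of Suzy's throw yield three different shattering times, while the alterations of Billy's throw all yield the same one. The verdict plainly turns on which alterations are admitted as not-too-distant, which the sketch treats as an input rather than settling.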

Another point in favour of the new theory is that it handles a type of preemption that has come to be called trumping. (Trumping was first described by Jonathan Schaffer: see his (2000).) Lewis gives an example involving a major and a sergeant who are shouting orders at the soldiers. The major and sergeant simultaneously shout “Advance”; the soldiers hear them both and advance. Since the soldiers obey the superior officer, they advance because the major orders them to, not because the sergeant does. So the major's command preempts or trumps the sergeant's. Where other theories have difficulty with trumping cases, Lewis argues his new theory handles them with ease. Altering the major's command while holding fixed the sergeant's, the soldiers' response would be correspondingly altered. In contrast, altering the sergeant's command, while holding fixed the major's, would make no difference at all.

There is, however, some reason for scepticism about whether the new theory handles the examples of late preemption and trumping completely satisfactorily. In the example of late preemption, Billy's throw has some degree of influence on the shattering of the bottle. For if Billy had thrown his rock earlier (so that it preceded Suzy's throw) and in a different manner, the bottle would have shattered earlier and in a different manner. Likewise, the sergeant's command has some degree of influence on the soldiers' advance in that if the sergeant had shouted earlier than the major with a different command, the soldiers would have obeyed his order. In response to these points, Lewis must say that these alterations of the events are too distant to be considered relevant. But some metric of distance in alterations is required, since it seems that similar alterations of Suzy's throw and the major's command are relevant to their having causal influence.

It has also been argued that the new theory generates a great number of spurious instances of causation. (For discussion see Collins 2000; Kvart 2001.) The theory implies that any event that influences another event to a certain degree counts as one of its causes. But commonsense is more discriminating about causes. To take an example of Jonathan Bennett (1987): rain in December delays a forest fire; if there had been no December rain, the forest would have caught fire in January rather than when it actually did in February. The rain influences the fire with respect to its timing, location, rapidity, and so forth. But commonsense denies that the rain was a cause of the fire, though it allows that it is a cause of the delay in the fire. Similarly, in the example of the poison victim discussed above, the victim's ingesting poison on a full stomach influences the time and manner of his death (making it a slow and painful death), but commonsense refuses to countenance his eating dinner as a cause of his death, though it may countenance it as a cause of its being a slow and painful death. Pace Lewis, commonsense does not take anything that affects the time and manner of an event to be a cause of the event simpliciter.

4.2 Causation as Intrinsic Relation

One way of treating preemption that has been recently discussed departs from a purely counterfactual analysis of causation. It has been argued that preemption examples highlight the intuitive idea that causation is an intrinsic relation between events, which is to say it is a local relation depending on the intrinsic properties of the events and what goes on between them, and nothing else. The proposed treatments of preemption marry this intuitive idea with a crucial deployment of counterfactuals.

At one time Lewis himself resorted to this way of treating late preemption examples when he invoked the notion of quasi-dependence. (See his (1986c).) To explain this notion consider a case that resembles the case of Billy and Suzy throwing rocks at a bottle. Suzy throws a rock and shatters the bottle in exactly the same way in which she does in the original case. But in this case Billy and his rock are entirely absent. Lewis argued that since the process in the original case and the process in the comparison case are intrinsically alike (and also obey the same laws), both or neither must be causal. However, the comparison process is surely a causal process since, thanks to Billy's absence, it exhibits a causal dependence. Accordingly, the process in the original case must be a causal process too, even though it does not exhibit a causal dependence. In such examples Lewis has said that the actual process that does not exhibit causal dependence is, nonetheless, causal by courtesy: it exhibits quasi-dependence in virtue of its intrinsic resemblance to the causal process in the comparison case.

A related idea is pursued in Menzies (1996; 1999). Menzies argues that there is an element in our concept of causation that resists capture in purely counterfactual terms. This element consists in the idea that causation is a structural relation that underlies and supports causal dependences. This idea can be captured by treating the concept of causation as the concept of a theoretical entity. Applying a standard treatment of theoretical concepts, he argues that causation should be defined as the unique occupant of a certain characteristic role given by the platitudes of the folk theory of causation. One platitude is that causation is an intrinsic relation between events. Another platitude is that it is typically, but not invariably, accompanied by causal dependence. Accordingly, causation is defined in the following way:

(9) c causes e if and only if the intrinsic relation that typically accompanies causal dependence holds between c and e.

On this account, causation is not constituted by causal dependence. It is, in fact, a distinct relation for which causal dependence is, at best, a defeasible marker. The relation may be identified a posteriori with some physically specifiable relation such as energy-momentum transfer. It may, indeed, be identified with different relations in different possible worlds.

This definition is supposed to explain commonsense intuitions about preemption examples. For example, Suzy's throw, and not Billy's throw, caused the shattering of the bottle, because the intrinsic relation that typically accompanies causal dependence connects Suzy's throw, but not Billy's throw, with the shattering of the bottle.

Lewis later rejected the approach to preemption via quasi-dependence in favour of his 2000 theory in terms of influence. In Lewis 2004a and 2004b, he claims that theories of causation as an intrinsic relation do not do justice to the full range of our intuitions about causation. (For related points see Hall 2002, 2004.) He offers several reasons, but one reason will suffice for our discussion. The intuition that causation is an intrinsic matter does not apply to cases of double prevention. Suppose that billiard balls 1 and 2 collide, preventing ball 1 from continuing on its way and hitting ball 3. If the collision of balls 1 and 3 had occurred, ball 3 would not have later collided with ball 4. So, we have double prevention: the collision of balls 1 and 2 prevented the collision of balls 1 and 3, which would have prevented the later collision of balls 3 and 4. Here it seems reasonable to say that the collision of balls 1 and 2 was a cause of the later collision of balls 3 and 4. Lewis observes that the causation in such cases of double prevention is partly an extrinsic matter. If there had been some other obstruction that would have stopped ball 1 from hitting ball 3, the collision of 3 and 4 would not have depended on the collision of 1 and 2. Moreover, he notes that much of the spatiotemporal region between the collision of balls 1 and 2 and the collision of balls 3 and 4 is simply empty so that there is no chain of events to serve as a connecting process between cause and effect. The intuition that causation is an intrinsic relation does not apply in this case. More generally, he argues that theories of causation as an intrinsic relation are overhasty generalisations of one specific kind of causation, and they fail to do justice to our intuitions about causation involving absences (as causes, effects or intermediaries).

4.3 The Structural Equations Framework

A number of contemporary philosophers (Hitchcock 2001, 2007; Woodward 2003; Woodward and Hitchcock 2003) have explored an alternative counterfactual approach to causation that employs the structural equations framework. This framework, which has been used in the social sciences and biomedical sciences since the 1930s and 1940s, received its state-of-the-art formulation in Judea Pearl's landmark 2000 book. Hitchcock and Woodward acknowledge their debt to Pearl's work and to the related work on causal Bayes nets by Peter Spirtes, Clark Glymour, and Richard Scheines (1993). However, while Pearl and Spirtes, Glymour and Scheines focus on issues to do with causal discovery and inference, Woodward and Hitchcock focus on issues of the meaning of causal claims. For this reason, their formulations of the structural equations framework are better suited to the purposes of this discussion. The exposition of this section follows that of Hitchcock 2001, in particular. While philosophical work using this framework has only just begun, the framework looks likely to rival Lewis's framework in terms of its theoretical richness and fruitfulness.

The structural equations framework describes the causal structure of a system in terms of a causal model of the system, which is identified as an ordered pair <V, E>, where V is a set of variables and E a set of structural equations stating deterministic relations among the variables. (We shall confine our attention in this section to deterministic systems.) The variables in V describe the different possible states of the system in question. While they can take any number of values, in the simple examples to be considered here the variables are typically binary variables that take the value 1 if some event occurs and the value 0 if the event does not occur. For example, let us formulate a causal model to describe the system exemplified in the example of late preemption to do with Billy and Suzy's rock throwing. We might describe the system using the following set of variables:

  • BT = 1 if Billy throws a rock, 0 otherwise;
  • ST = 1 if Suzy throws a rock, 0 otherwise;
  • BH = 1 if Billy's rock hits the bottle, 0 otherwise;
  • SH = 1 if Suzy's rock hits the bottle, 0 otherwise;
  • BS = 1 if the bottle shatters, 0 otherwise.

Here the variables are binary. But a different model might have used many-valued variables to represent the different ways in which Billy and Suzy threw their rocks, their rocks hit the bottle, or the bottle shattered.

The structural equations in a model describe the dynamical evolution of the system being modelled. There is a structural equation for each variable. The form taken by a structural equation for a variable depends on which kind of variable it is. The structural equation for an exogenous variable (the values of which are determined by factors outside of the model) takes the form of Y = y, which simply states the actual value of the variable. The structural equation for an endogenous variable (the values of which are determined by factors within the model) states how the value of the variable is determined by the values of the other variables. It takes the form:

Y = f(X1,…, Xn)

What does this structural equation mean? There are in fact competing interpretations. The interpretation favoured by Woodward and Hitchcock is that the equation for an endogenous variable encodes a set of counterfactuals of the following form:

If it were the case that X1 = x1, X2 = x2,…, Xn = xn, then it would be the case that Y = f(x1,…,xn).

As this form of counterfactual suggests, the structural equations are to be read from right to left: the antecedent of the counterfactual states possible values of the variables X1 through to Xn and the consequent states the corresponding value of the endogenous variable Y. There is a counterfactual of this kind for every combination of possible values of the variables X1 through to Xn. It is important to note that a structural equation of this kind is not, strictly speaking, an identity that is equivalent to f(X1,…, Xn) = Y: there is a right-to-left asymmetry built into the equation. Another important feature of the structural equations for endogenous variables is that they must be complete in the sense that the equation for a variable Y must express the value of Y as a function of all and only the variables Xi on which it counterfactually depends given the values of the other variables. A crucial question for those interested in the semantic and metaphysical foundations of the structural equations framework is the status of the counterfactuals encoded by the structural equations. Are they semantically and metaphysically primitive so that the structural equations are simply a summary of the more basic counterfactuals? Or are the structural equations themselves to be taken as the conceptual and metaphysical primitives, with the counterfactuals having a secondary, derivative status? So far there is no consensus on the best way to answer these questions.

As an illustration, consider the set of structural equations that might be used to model the late preemption example of Billy and Suzy. Given the variables listed above, the structural equations might be stated as follows:

  • ST = 1;
  • BT = 1;
  • SH = ST;
  • BH = BT & ~SH;
  • BS = SH v BH.

In these equations logical symbols are used to represent mathematical functions on binary variables: ~X = 1 − X; X v Y = max{X, Y}; X & Y = min{X, Y}. The first two equations simply state the actual values of the exogenous variables ST and BT. The third equation encodes two counterfactuals, one for each possible value of ST. It states that if Suzy threw a rock, her rock hit the bottle; and if she didn't throw a rock, her rock didn't hit the bottle. The fourth equation encodes four counterfactuals, one for each possible combination of values for BT and SH. It states that if Billy threw a rock and Suzy's rock didn't hit the bottle, Billy's rock hit the bottle; but that it didn't do so if either of these conditions was not met. The fifth equation encodes four counterfactuals, one for each possible combination of values for SH and BH. It states that if one or other (or possibly both) of Suzy's rock or Billy's rock hit the bottle, the bottle shattered; but if neither rock hit the bottle, the bottle didn't shatter.
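
As a rough illustration, the five equations can be evaluated mechanically. The following Python sketch is only one possible encoding: the names EQUATIONS, solve and bh_table are hypothetical, and the fifth equation is taken to make BS a function of SH and BH, as the explanation of that equation describes.

```python
from itertools import product

# Logical symbols as functions on binary values, as defined in the text.
def neg(x):
    return 1 - x

def disj(x, y):
    return max(x, y)

def conj(x, y):
    return min(x, y)

# Hypothetical encoding: one function per variable, listed so that every
# variable appears after the variables on the right-hand side of its equation.
EQUATIONS = {
    "ST": lambda v: 1,
    "BT": lambda v: 1,
    "SH": lambda v: v["ST"],
    "BH": lambda v: conj(v["BT"], neg(v["SH"])),
    "BS": lambda v: disj(v["SH"], v["BH"]),  # assumes BS is a function of SH and BH
}

def solve(equations):
    """Compute every variable's value by applying the equations in order."""
    values = {}
    for name, equation in equations.items():
        values[name] = equation(values)
    return values

actual = solve(EQUATIONS)

# The four counterfactuals encoded by the equation for BH, keyed by (BT, SH).
bh_table = {(bt, sh): conj(bt, neg(sh)) for bt, sh in product((0, 1), repeat=2)}
```

On this encoding, actual assigns BH the value 0 and every other variable the value 1, and bh_table gives BH = 1 only for the combination BT = 1, SH = 0.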

The structural equations above can be represented in terms of a directed graph. The variables in the set V are represented as nodes in the graph. An arrow directed from one node X to another Y represents the fact that the variable X appears on the right-hand side of the structural equation for Y. In this case, X is said to be a parent of Y. Exogenous variables are represented by nodes that have no arrows directed towards them. A directed path from X to Y in a graph is a sequence of arrows that connect X with Y. The directed graph of the model of the Billy and Suzy example described above is depicted in Figure 1 below:

Figure 1

The arrows in this figure tell us that the bottle's shattering is a function of Suzy's rock hitting the bottle and Billy's rock hitting the bottle; that Billy's rock hitting the bottle is a function of Billy's throwing a rock and Suzy's rock hitting the bottle; and that Suzy's rock hitting the bottle is a function of her throwing the rock. (The existence of an arrow from one variable to another does not always signify a stimulatory connection. For example, the arrow directed from SH to BH is inhibitory.)
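
The graph just described can also be written down as a table of parents. The sketch below is a hypothetical Python encoding (PARENTS, exogenous and directed_paths are illustrative names, not part of the framework); it lists each variable's parents as read off the arrows described above and enumerates the directed paths between two nodes.

```python
# Each variable is mapped to its parents, i.e. the variables with arrows into it.
PARENTS = {
    "ST": [],
    "BT": [],
    "SH": ["ST"],
    "BH": ["BT", "SH"],
    "BS": ["SH", "BH"],
}

def exogenous(parents):
    """Exogenous variables are the nodes with no arrows directed towards them."""
    return sorted(name for name, ps in parents.items() if not ps)

def directed_paths(parents, start, end):
    """All directed paths from start to end, following arrows from parent to child."""
    if start == end:
        return [[start]]
    paths = []
    for child, ps in parents.items():
        if start in ps:
            for tail in directed_paths(parents, child, end):
                paths.append([start] + tail)
    return paths

paths_from_st = directed_paths(PARENTS, "ST", "BS")
paths_from_bt = directed_paths(PARENTS, "BT", "BS")
```

On this encoding there are two directed paths from ST to BS, one direct through SH and one running through SH and BH, but only one directed path from BT to BS.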

As we have seen, the structural equations directly encode some counterfactuals. However, some counterfactuals that are not directly encoded can be derived from them. Consider, for example, the counterfactual “If Suzy's rock had not hit the bottle, it would still have shattered”. As a matter of fact, Suzy's rock did hit the bottle. But we can determine what would have happened if it hadn't done so, by replacing the structural equation for the endogenous variable SH with the equation SH = 0, keeping all the other equations unchanged. So, instead of having its value determined in the ordinary way by the variable ST, the value of SH is set “miraculously”. Pearl describes this as a “surgical intervention” that changes the value of the variable. In terms of its graphical representation, this amounts to wiping out the arrow from the variable ST to the variable SH and treating SH as if it were an exogenous variable. After this operation, the value of the variable BS can be computed and shown to be equal to 1: given that Billy had thrown his rock, his rock would have hit the bottle and shattered it. So this particular counterfactual is true. This procedure for evaluating counterfactuals has direct affinities with Lewis's non-backtracking interpretation of counterfactuals: the surgical intervention that sets the variable SH at its hypothetical value but keeps all other equations unchanged is similar in its effects to Lewis's small miracle that realises the counterfactual antecedent but preserves the past.
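
The replacement of the equation for SH can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names solve and intervene are hypothetical, and BS is taken to be a function of SH and BH, in line with the graph described above.

```python
# Hypothetical encoding of the Billy and Suzy model; each variable is listed
# after the variables on the right-hand side of its equation.
EQUATIONS = {
    "ST": lambda v: 1,
    "BT": lambda v: 1,
    "SH": lambda v: v["ST"],
    "BH": lambda v: min(v["BT"], 1 - v["SH"]),
    "BS": lambda v: max(v["SH"], v["BH"]),  # assumes BS is a function of SH and BH
}

def solve(equations):
    """Compute every variable's value by applying the equations in order."""
    values = {}
    for name, equation in equations.items():
        values[name] = equation(values)
    return values

def intervene(equations, settings):
    """Return a copy in which each listed variable's equation is a constant.

    All other equations are left unchanged, mirroring the surgical intervention.
    """
    altered = dict(equations)
    for name, value in settings.items():
        altered[name] = lambda v, value=value: value
    return altered

actual = solve(EQUATIONS)
hypothetical = solve(intervene(EQUATIONS, {"SH": 0}))
```

Here hypothetical keeps ST and BT at 1, sets SH to 0, and yields BH = 1 and BS = 1, which is the verdict that the bottle would still have shattered.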

In general, to evaluate a counterfactual, say “If it were the case that X1,…,Xn, then …”, one replaces the original equation for each variable Xi with a new equation stipulating its hypothetical value, while keeping the other equations unchanged; then one computes the values for the remaining variables to see whether they make the consequent true. This technique of replacing an equation with a hypothetical value set by a “surgical intervention” enables us to capture the notion of counterfactual dependence between variables:

(10) A variable Y counterfactually depends on a variable X in a model if and only if it is actually the case that X = x and Y = y and there exist values x' ≠ x and y' ≠ y such that replacing the equation for X with X = x' yields Y = y'.
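
Definition (10) can be checked by brute force for binary variables. The following Python sketch is illustrative only: solve, intervene and depends are hypothetical names, the variables are assumed to take only the values 0 and 1, and BS is assumed to be a function of SH and BH.

```python
# Hypothetical encoding of the Billy and Suzy model.
EQUATIONS = {
    "ST": lambda v: 1,
    "BT": lambda v: 1,
    "SH": lambda v: v["ST"],
    "BH": lambda v: min(v["BT"], 1 - v["SH"]),
    "BS": lambda v: max(v["SH"], v["BH"]),  # assumes BS is a function of SH and BH
}

def solve(equations):
    """Compute every variable's value by applying the equations in order."""
    values = {}
    for name, equation in equations.items():
        values[name] = equation(values)
    return values

def intervene(equations, settings):
    """Replace the equations of the listed variables with constants."""
    altered = dict(equations)
    for name, value in settings.items():
        altered[name] = lambda v, value=value: value
    return altered

def depends(equations, y, x, possible_values=(0, 1)):
    """Definition (10): some non-actual value of x yields a non-actual value of y."""
    actual = solve(equations)
    for alternative in possible_values:
        if alternative == actual[x]:
            continue
        if solve(intervene(equations, {x: alternative}))[y] != actual[y]:
            return True
    return False
```

Applied to this model, the test finds that SH depends on ST and BH depends on SH, but that BS depends on neither ST nor BT, which is why simple counterfactual dependence does not by itself single out Suzy's throw.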

How does the structural equations framework deal with examples of late preemption that pose such problems for Lewis's counterfactual theory? Can this framework deliver the intuitively correct verdicts in the example about Suzy and Billy? Halpern and Pearl (2001, 2005), Hitchcock (2001), and Woodward (2003a) all give roughly the same treatment of examples of late preemption. The key to their treatment is the employment of a certain procedure for testing the existence of a causal relation. The procedure is to look for an intrinsic process connecting the putative cause and effect; suppress the influence of their non-intrinsic surroundings by “freezing” those surroundings as they actually are; and then subject the putative cause to a counterfactual test. So, for example, to test whether Suzy's throwing a rock caused the bottle to shatter, we should examine the process running from ST through SH to BS; hold fixed at its actual value the variable BH, which is extrinsic to this process; and then wiggle the variable ST to see if it changes the value of BS. The last steps involve evaluating the counterfactual “If Suzy hadn't thrown a rock and Billy's rock hadn't hit the bottle, the bottle would not have shattered”. It is easy to see that this counterfactual is true. In contrast, when we carry out a similar procedure to test whether Billy's throwing a rock caused the bottle to shatter, we are required to consider the counterfactual “If Billy hadn't thrown his rock and Suzy's rock had hit the bottle, the bottle would not have shattered”. This counterfactual is false. It is the difference in the truth-value of these two counterfactuals that explains the fact that it was Suzy's rock throwing, and not Billy's, that caused the bottle to shatter.

Hitchcock (2001) presents a useful regimentation of this reasoning. He defines a route between two variables X and Z in the set V to be an ordered sequence of variables <X, Y1,…, Yn, Z> such that each variable in the sequence is in V and is a parent of its successor in the sequence. A variable Y is intermediate between X and Z if and only if it belongs to some route between X and Z. Then he introduces the new concept of an active causal route:

(11) The route <X, Y1,…, Yn, Z> is active in the causal model <V, E> if and only if Z depends counterfactually on X within the new system of equations E' constructed from E as follows: for all Y in V, if Y is intermediate between X and Z but does not belong to the route <X, Y1,…, Yn, Z>, then replace the equation for Y with a new equation that sets Y equal to its actual value in E. (If there are no intermediate variables that do not belong to this route, then E' is just E.)

This definition generalises the informal idea sketched in the example of Suzy and Billy. There is an active causal route going from Suzy's throwing her rock through her rock hitting the bottle to the bottle shattering: when we hold fixed Billy's rock not hitting the bottle, which is the actual value of the only intermediate variable BH that is not on this route, we see that the bottle's shattering counterfactually depends on Suzy's throwing her rock. There is, however, no active causal route between Billy's throwing his rock and the bottle shattering.
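
The test in definition (11) can be sketched for the Billy and Suzy model as follows. The Python below is a rough illustration under stated assumptions, not Hitchcock's own formulation: all function names are hypothetical, variables are assumed binary, and BS is assumed to be a function of SH and BH.

```python
# Hypothetical encoding of the model: equations plus the parents of each variable.
EQUATIONS = {
    "ST": lambda v: 1,
    "BT": lambda v: 1,
    "SH": lambda v: v["ST"],
    "BH": lambda v: min(v["BT"], 1 - v["SH"]),
    "BS": lambda v: max(v["SH"], v["BH"]),  # assumes BS is a function of SH and BH
}
PARENTS = {"ST": [], "BT": [], "SH": ["ST"], "BH": ["BT", "SH"], "BS": ["SH", "BH"]}

def solve(equations):
    """Compute every variable's value by applying the equations in order."""
    values = {}
    for name, equation in equations.items():
        values[name] = equation(values)
    return values

def intervene(equations, settings):
    """Replace the equations of the listed variables with constants."""
    altered = dict(equations)
    for name, value in settings.items():
        altered[name] = lambda v, value=value: value
    return altered

def depends(equations, y, x):
    """Counterfactual dependence of y on x for binary variables."""
    actual = solve(equations)
    flipped = solve(intervene(equations, {x: 1 - actual[x]}))
    return flipped[y] != actual[y]

def routes(parents, x, z):
    """All routes from x to z: sequences in which each variable is a parent of the next."""
    if x == z:
        return [[x]]
    found = []
    for child, ps in parents.items():
        if x in ps:
            found.extend([x] + tail for tail in routes(parents, child, z))
    return found

def is_active(equations, parents, route):
    """Freeze off-route intermediate variables at actual values, then test dependence."""
    x, z = route[0], route[-1]
    actual = solve(equations)
    intermediates = {v for r in routes(parents, x, z) for v in r[1:-1]}
    off_route = intermediates - set(route)
    frozen = intervene(equations, {v: actual[v] for v in off_route})
    return depends(frozen, z, x)
```

On this encoding the route <ST, SH, BS> comes out active once BH is frozen at its actual value 0, whereas the only route from BT to BS, namely <BT, BH, BS>, does not.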

In terms of the notion of an active causal route, Hitchcock defines actual or token causation in the following terms:

(12) If c and e are distinct actual events and X and Z are binary variables whose values represent the occurrence and non-occurrence of these events, then c is a cause of e if and only if there is an active causal route from X to Z in an appropriate causal model <V, E>.

A crucial notion in this definition is that of “an appropriate” model. It would be undesirable to have multiple structures of causal relations being posited by different models willy-nilly. So Hitchcock insists that causal relations are revealed only by “appropriate models”. He mentions a number of criteria for appraising whether a model is appropriate, the most important one being that the structural equations posited by the model must not imply any false counterfactual. In order to deal with examples of symmetric overdetermination, Hitchcock (2001) defines a notion of a weakly active route, the essential idea being that there is a weakly active route between X and Y just when Y counterfactually depends on X under the freezing of some possible, not necessarily actual, values of the variables that are not on the route from X to Y. As we shall not be considering any examples of symmetric overdetermination, we shall focus on the stronger notion of an active causal route.

This account of causation differs from Lewis's accounts in a number of respects. One difference is that the account does not appeal to the transitivity of causation to deal with preemption examples, in contrast to Lewis's accounts, both early and late. Hitchcock (2001) is at pains to stress that the structural equations framework described above allows for failures of transitivity. Another difference between the accounts is that the structural equations account appeals to special counterfactuals with complex antecedents in order to handle preemption examples. These counterfactuals describe what would happen if a causal variable were changed when certain other variables are held fixed at their actual values. (Hitchcock calls these “explicitly nonforetracking counterfactuals”.) Lewis's accounts do not make use of such counterfactuals, relying as they do on counterfactuals with simple antecedents that describe single changes in the causal variables. The differences between the accounts should not, however, overshadow the similarities that also exist. Both accounts make central use of non-backtracking counterfactuals and they interpret these counterfactuals in roughly the same fashion. Setting aside complications to do with backwards causation, Lewis's account and the structural equations account have us evaluate a non-backtracking counterfactual in much the same way: we are to hold fixed the past history of the system, imagine that the antecedent is realised “miraculously” by a surgical intervention from outside the system, and then consider how the new state of the system would evolve in conformity with the structural equations or laws of the system without any further interventions.

How plausible is this new counterfactual approach to causation? It is too early to say with any confidence, as the approach is still being developed and it has not been subjected to sustained, rigorous testing. Nonetheless, some early problems have emerged. (See Hall 2007; Hitchcock 2007; and Menzies 2004b.) Consider, for instance, the following example, which is a variant of one described by Hitchcock (2007). An assassin puts poison in the king's coffee. The bodyguard responds by pouring an antidote in the king's coffee. If the bodyguard had not poured the antidote in the coffee, the king would have died. On the other hand, the antidote is fatal when taken by itself; and if the poison had not been poured in first, the antidote would have killed the king. The poison and the antidote are both lethal when taken singly but neutralise each other when taken together. In fact, the king drinks the coffee and survives.

Suppose we model this scenario using the following variables:

  • A = 1 if the assassin pours poison into the king's coffee, 0 otherwise;
  • G = 1 if the bodyguard responds by pouring antidote into the coffee, 0 otherwise;
  • S = 1 if the king survives, 0 otherwise.

And also suppose that we employ these structural equations:

  • A = 1;
  • G = A;
  • S = (A & G) v (~A & ~G).

The directed graph for this model is depicted in Figure 2.

Figure 2

Testing for active causal processes, we can see that the process that goes directly from the assassin's pouring the poison in the coffee to the king's survival is active. Holding fixed the fact that the bodyguard poured the lethal antidote into the coffee, we note that the king would not have survived if the assassin had not put the poison in the coffee first. So the theory licenses the verdict that the assassin's pouring in the poison caused the king to survive. However, many regard this as a mistaken causal verdict: putting poison in the king's coffee is exactly the kind of thing that is likely to kill the king. It might be argued that the causal verdict is justified in view of the fact that the assassin's action caused the bodyguard's action, which in turn caused the king's survival. But this appeal to the transitivity of causation is not open to the defenders of this theory, who deny the validity of transitivity.
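
The test applied to the direct route from A to S can be reproduced in a short Python sketch. It is an illustration only: king_model is a hypothetical function, and its optional argument g stands for an intervention that sets G independently of A.

```python
def king_model(a=1, g=None):
    """Solve the poison and antidote model.

    a is the value of the exogenous variable A (actually 1). If g is given,
    the equation G = A is replaced by G = g, as in a surgical intervention.
    """
    A = a
    G = A if g is None else g
    # S = (A & G) v (~A & ~G), with & as min, v as max, and ~X as 1 - X.
    S = max(min(A, G), min(1 - A, 1 - G))
    return {"A": A, "G": G, "S": S}

actual = king_model()

# Direct route <A, S>: freeze the off-route variable G at its actual value, flip A.
frozen_flip = king_model(a=0, g=actual["G"])
direct_route_active = frozen_flip["S"] != actual["S"]

# Route <A, G, S>: nothing lies off the route, so nothing is frozen; just flip A.
plain_flip = king_model(a=0)
indirect_route_active = plain_flip["S"] != actual["S"]
```

On this encoding direct_route_active comes out true and indirect_route_active comes out false, so the verdict that the poisoning caused the king's survival rests on the direct route alone.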

One counterexample by itself is not enough to disprove the whole structural equations framework. Strictly speaking, it only casts doubt on the theory of causation that defines causation in terms of the presence of an active causal route. Perhaps there are alternative definitions within the structural equations framework that fare better. Indeed, a number of philosophers have explored the possibility of framing a better theory by appealing to a distinction between what Hitchcock has called “default” and “deviant” values of variables. (See Hitchcock 2007.) The default value of some variable represents a normal or to-be-expected state of the system, whereas a deviant value represents an abnormal or unusual state of the system. The correlative notion of the default course of evolution for a system can be characterised as a temporally-ordered sequence of values that the variables in a model take when the default values of the exogenous variables are plugged into the structural equations of the model. Thus, if we set the value of the exogenous variable A in the example above at its default value 0 instead of its actual value 1, we can see that the scenario described above will evolve in the following way: the assassin doesn't put the poison in the coffee, the bodyguard doesn't put the antidote into the coffee, and the king survives. Now if we evaluate counterfactual dependences with counterfactuals centred on the default course of evolution rather than the actual course of evolution, we can see that the bodyguard's action counterfactually depends on the assassin's action and the king's survival depends on the bodyguard's action, but the king's survival doesn't depend on the assassin's action. If counterfactual dependences centred on the default course of evolution are taken to indicate causal relations, these counterfactual dependences accurately reflect our intuitive causal judgements. (For further discussion of this idea, see Menzies 2004a, 2004b, 2007.)
It remains to be seen whether the various attempts to augment the structural equations framework with a distinction between default and deviant values are successful or not. (For other attempts see Hall 2007; Hitchcock 2007; and for discussion of the role of the default/deviant distinction in causal judgements see Maudlin 2004.)
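
The dependences centred on the default course of evolution can be worked through in a short Python sketch. It rests on assumptions that go beyond the text: king_model is a hypothetical function, and “centred on the default course” is read here as intervening on the run of the model in which A takes its default value 0.

```python
def king_model(a, g=None):
    """Solve the poison and antidote model for a given value of A.

    If g is given, the equation G = A is replaced by G = g (an intervention).
    """
    A = a
    G = A if g is None else g
    # S = (A & G) v (~A & ~G), with & as min, v as max, and ~X as 1 - X.
    S = max(min(A, G), min(1 - A, 1 - G))
    return {"A": A, "G": G, "S": S}

# Default course of evolution: A takes its default value 0.
default = king_model(a=0)

# Does G depend on A? Set A to 1 and let the other equations run.
g_depends_on_a = king_model(a=1)["G"] != default["G"]

# Does S depend on G? Keep A at its default and set G to 1 by intervention.
s_depends_on_g = king_model(a=0, g=1)["S"] != default["S"]

# Does S depend on A? Set A to 1; G follows A, and the king still survives.
s_depends_on_a = king_model(a=1)["S"] != default["S"]
```

On this reading the bodyguard's action depends on the assassin's, the king's survival depends on the bodyguard's action, and the king's survival does not depend on the assassin's action.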

Bibliography

Related Entries

backwards causation | causation: causal processes | causation: probabilistic | causation and manipulability | conditionals: counterfactual | determinism: causal | events | facts | Hume, David | implicature | intrinsic vs. extrinsic properties | possible worlds | probability, interpretations of | rationalism vs. empiricism | scientific explanation | the metaphysics of causation | time: thermodynamic asymmetry in