Ladder of causation

I’ve read a piece on Twitter from the always excellent Kareem Carr on the ladder of causation. I found it very interesting, because it allows you to go beyond the mantra “correlation is not causation”, and links statistics to the concept of falsifiability that Karl Popper puts as central to science.

The Ladder of Causation

The Ladder of Causation has three levels:

1. Association. This involves the prediction of outcomes as a passive observer of a system.

2. Intervention. This involves the prediction of the consequences of taking actions to alter the behavior of a system.

3. Counterfactuals. This involves prediction of the consequences of taking actions to alter the behavior of a system had circumstances been different.

I even read the book from which the idea comes: “The Book of Why” [Full book on the Internet Archive] by Judea Pearl, a Turing Award recipient who worked on Bayesian networks. The book is quite illuminating, although it mentions a bit too often dark figures such as Galton, Pearson and Fisher (it seems statisticians get really high on their own supply).

This certainly raises the question: “Why not?”

The book is interesting in that it brings causality back to the table and explains that this is how humans naturally perceive the world. An excerpt:

As I re-read Genesis for the hundredth time, I noticed a nuance that had somehow eluded my attention for all those years. When God finds Adam hiding in the garden, he asks: “Have you eaten from the tree which I forbade you?” And Adam answers: “The woman you gave me for a companion, she gave me fruit from the tree and I ate.” “What is this you have done?” God asks Eve. She replies: “The serpent deceived me, and I ate.”

As we know, this blame game did not work very well on the Almighty, who banished both of them from the garden. The interesting thing, though, is that God asked what and they answered why. God asked for the facts, and they replied with explanations. Moreover, both were thoroughly convinced that naming causes would somehow paint their actions in a different color. Where did they get this idea?

The Ladder of Causation

This is a fundamental idea that everybody who cares about data needs to know about.

The levels of causation are defined by the kinds of data to which we have access as experimenters.

Imagine a machine that can spit out data. You can observe what comes out of the machine and analyze the information, but you can’t touch the machine. This is the level of Association. It is the level of most statistical analyses and AI algorithms.

Now imagine that you are allowed to touch the machine. You can set it to generate specific datasets that you would like in the future. This is the level of Intervention. It is the kind of data that we get out of standard scientific experiments where we fully control the environment.

The final level of the ladder of causation is our own imaginations. It is the ability to imagine the machine as it used to be, and to imagine what we would have gotten from it had we pushed different buttons. We have to use our imaginations because we can never directly observe what would have happened in the past had we done things differently. This is the level of Counterfactuals.

It is the hardest kind of science to do because we can’t do experiments to answer the questions that we really want to answer.

The counterfactual imagination is central to our science, our morality and even our storytelling.

The counterfactual level is the most conceptually difficult level because we can never get counterfactual data even in theory. It is also the most fascinating level.

It is the level of thought experiments like the one that led Einstein to invent the theory of relativity. He imagined himself riding along a light beam and took that idea to its logical conclusion.

Most of the laws of physics are counterfactual laws. They tell us what will happen in all situations including worlds with histories that are different from our own. This is one of the reasons that physics is such a powerful science.

The counterfactual level is also the level of our moral imaginations. It is core to our ability to reason morally that we can imagine what would have happened had we or others acted differently.

The ability to consider counterfactuals is central to our ability to read and write stories.

Arguably, it is the ability to understand counterfactuals that sets human beings apart from other animals.

https://twitter.com/kareem_carr/status/1703793915717878140
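To make the three rungs of the thread concrete for myself, I find a toy simulation helpful. The sketch below is only an illustration under made-up assumptions: the machine has a hidden internal state z, a button x and an output y, and every name and probability in it (run_machine, do_x, the 0.9 / 0.1 / 0.3 / 0.4 figures) is invented for the example and comes neither from the thread nor from the book. It compares what a passive observer would conclude (association), what you measure when you press the button yourself (intervention), and what the same runs would have produced had the button not been pressed (counterfactual).

```python
import random


def sample_noise(rng):
    """Background randomness for one run of the (hypothetical) machine."""
    return (rng.random(), rng.random(), rng.random())


def run_machine(noise, do_x=None):
    """Return (x, y) for one run.

    z is a hidden internal state that influences both the button x and the
    output y. Passing do_x overrides the button, which is the intervention.
    """
    u_z, u_x, u_y = noise
    z = u_z < 0.5
    if do_x is None:
        # Left alone, the button mostly follows the hidden state.
        x = u_x < (0.9 if z else 0.1)
    else:
        x = do_x
    # Assumed mechanism: the button adds 0.3 to the chance of y, z adds 0.4.
    y = u_y < 0.2 + 0.3 * x + 0.4 * z
    return x, y


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def ladder(n=100_000, seed=0):
    rng = random.Random(seed)
    noises = [sample_noise(rng) for _ in range(n)]

    # Rung 1, association: watch the machine without touching it.
    observed = [run_machine(u) for u in noises]
    association = (mean(y for x, y in observed if x)
                   - mean(y for x, y in observed if not x))

    # Rung 2, intervention: set the button ourselves on every run.
    intervention = (mean(run_machine(u, do_x=True)[1] for u in noises)
                    - mean(run_machine(u, do_x=False)[1] for u in noises))

    # Rung 3, counterfactual: take the runs where the button was on and the
    # output was on, replay them with the SAME noise but the button off.
    # Only a simulation can do this; with a real machine the noise is never
    # observed, which is why this rung needs a model (our "imagination").
    on_and_lit = [u for u, (x, y) in zip(noises, observed) if x and y]
    counterfactual = mean(run_machine(u, do_x=False)[1] for u in on_and_lit)

    return association, intervention, counterfactual


if __name__ == "__main__":
    association, intervention, counterfactual = ladder()
    print("observed difference in y between x on and x off:", association)
    print("difference in y when we set x ourselves:", intervention)
    print("share of 'on and lit' runs still lit had x been off:", counterfactual)
```

In this made-up setup the observed difference comes out roughly twice as large as the effect of actually pressing the button, because the hidden state drives both the button and the output. The third number cannot be obtained from any dataset the machine produces; it exists only because the simulation lets us replay a run that already happened.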

To finish, another excerpt from “The Book of Why”, quoting the Bible:

Another fascinating and revealing instance of counterfactual reasoning occurs in the book of Genesis in the Bible. Abraham is talking with God about the latter’s intention to destroy the cities of Sodom and Gomorrah as retribution for their evil ways.

And Abraham drew near, and said, Wilt thou really destroy the righteous with the wicked?

Suppose there be fifty righteous within the city: wilt thou also destroy and not spare the place for the sake of the fifty righteous that are therein?…

And the Lord said, If I find in Sodom fifty righteous within the city, then I will spare all the place for their sakes.

But the story does not end there. Abraham is not satisfied and asks the Lord, what if there are only forty-five righteous men? Or forty? Or thirty? Or twenty? Or even ten? Each time he receives an affirmative answer, and God ultimately assures him that he will spare Sodom even for the sake of ten righteous men, if he can find that many.

What is Abraham trying to accomplish with this haggling and bargaining? Surely he does not doubt God’s ability to count. And of course, Abraham knows that God knows how many righteous men live in Sodom. He is, after all, omniscient.

Knowing Abraham’s obedience and devotion, it is hard to believe that the questions are meant to convince the Lord to change his mind. Instead, they are meant for Abraham’s own comprehension. He is reasoning just as a modern scientist would, trying to understand the laws that govern collective punishment. What level of wickedness is sufficient to warrant destruction? Would thirty righteous men be enough to save a city? Twenty? We do not have a complete causal model without such information. A modern scientist might call it a dose-response curve or a threshold effect.

Now that we live in an age defined by neural networks, I like toying with the idea that the way these networks are structured is actually a way to let causality emerge naturally, hence explaining why they work so well. The trouble is that the question “why” depends a lot on the context in which the question is asked, but that’s another story.