> Or do I need to provide that graph to this function?
You need to do that, and the math can help you measure how much each arrow contributes. The idea that you need to provide your model of the world is strangely not a key part of most introductions, but it’s crucial.
> outdoor temperatures and ice cream sales
That’s too simple: a simple regression can handle that. Causal inference can handle cases with three variables, assuming you provide an interaction graph. Say your ice cream truck goes either to a fancy neighborhood or to a working-class plaza. After observing the weather, you decide where to go, so you know that wealth and weather influence sales, but sales can’t influence the other two. Assuming you have data for all cases (sunny/poor, sunny/rich, rainy/poor, rainy/rich), you can separate the two effects.
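To make the truck example concrete, here is a minimal sketch in Python. The numbers are made up for illustration, and the graph (weather influences both the choice of neighborhood and sales; neighborhood influences sales) is an assumption you supply, not something the code discovers. Given that graph, comparing neighborhoods within each weather condition and then averaging over the weather distribution (the backdoor adjustment) separates the neighborhood effect from the weather effect.

```python
# Hypothetical aggregated observations, not real data:
# (weather, neighborhood) -> (days observed, mean sales per day)
cells = {
    ("sunny", "rich"): (30, 120.0),
    ("sunny", "poor"): (10, 80.0),
    ("rainy", "rich"): (5, 60.0),
    ("rainy", "poor"): (25, 20.0),
}


def mean_sales(cells, neighborhood):
    """Average daily sales in one neighborhood, pooling over weather."""
    days = sum(n for (_, h), (n, _s) in cells.items() if h == neighborhood)
    total = sum(n * s for (_, h), (n, s) in cells.items() if h == neighborhood)
    return total / days


def naive_effect(cells):
    """Rich minus poor, ignoring that weather drives where the truck goes."""
    return mean_sales(cells, "rich") - mean_sales(cells, "poor")


def adjusted_effect(cells):
    """Compare neighborhoods within each weather, then weight by P(weather)."""
    total_days = sum(n for n, _s in cells.values())
    effect = 0.0
    for weather in ("sunny", "rainy"):
        weather_days = cells[(weather, "rich")][0] + cells[(weather, "poor")][0]
        gap = cells[(weather, "rich")][1] - cells[(weather, "poor")][1]
        effect += (weather_days / total_days) * gap
    return effect


naive = naive_effect(cells)
adjusted = adjusted_effect(cells)
print(f"naive rich-vs-poor gap:    {naive:.2f}")
print(f"weather-adjusted gap:      {adjusted:.2f}")
```

In this toy data the truck mostly visits the rich neighborhood on sunny days, so the naive gap mixes the weather effect into the neighborhood effect and comes out larger than the adjusted one.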
> > outdoor temperatures and ice cream sales
> That’s too simple: a simple regression can handle that.
Not quite. Regression by itself will not answer the causal (or equivalently, the counterfactual) question.
I strongly suspect you already know this and were elaborating on a related point. But just for the sake of exposition, let me add a few words for the HN audience at large.
Let me give an example. In an email corpus, mails that begin with "Honey sweetheart," will likely have a higher than baseline open rate. A regression on word features will latch on to that. However, if your regular employer starts leading with "Honey sweetheart" that will not increase the open rate of corporate communications.
Causal or counterfactual estimation is fundamentally about how a dependent variable responds to interventional changes in a causal variable. Regression and, relatedly, conditional probabilities are about 'filtering' the population on some predicate.
An email corpus when filtered upon the opening phrase "Honey sweetheart" may have disproportionately high email open rates, but that does not mean that adding or adopting such a leading phrase will increase the open rate.
Similarly, regressing dark hair as a feature against skin cancer propensity will catch an anti-correlation effect. Dyeing blonde hair dark will not reduce melanoma propensity.
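The gap between filtering and intervening in the email example can be sketched with exact probabilities in Python. Everything here is a hypothetical toy model: the sender types, the probabilities, and the assumption that the open rate depends only on who sent the mail and not on the opening phrase.

```python
# Hypothetical toy model, not measured data.
P_SENDER = {"partner": 0.10, "employer": 0.90}
# How often each sender type opens with "Honey sweetheart,".
P_PHRASE_GIVEN_SENDER = {"partner": 0.80, "employer": 0.01}
# Assumption: opening depends on the sender only; the phrase has no causal effect.
P_OPEN_GIVEN_SENDER = {"partner": 0.95, "employer": 0.30}


def p_open_given_phrase():
    """Filtering: open rate among mails that happen to contain the phrase."""
    p_phrase = sum(P_SENDER[s] * P_PHRASE_GIVEN_SENDER[s] for s in P_SENDER)
    p_open_and_phrase = sum(
        P_SENDER[s] * P_PHRASE_GIVEN_SENDER[s] * P_OPEN_GIVEN_SENDER[s]
        for s in P_SENDER
    )
    return p_open_and_phrase / p_phrase


def p_open_do_phrase():
    """Intervention: force the phrase into every mail; the sender mix is unchanged."""
    return sum(P_SENDER[s] * P_OPEN_GIVEN_SENDER[s] for s in P_SENDER)


def p_open_baseline():
    """Overall open rate with no filtering and no intervention."""
    return sum(P_SENDER[s] * P_OPEN_GIVEN_SENDER[s] for s in P_SENDER)


conditional = p_open_given_phrase()
interventional = p_open_do_phrase()
baseline = p_open_baseline()
print(f"P(open | phrase)     = {conditional:.3f}")
print(f"P(open | do(phrase)) = {interventional:.3f}")
print(f"P(open) baseline     = {baseline:.3f}")
```

Filtering on the phrase mostly selects mails from partners, so the conditional open rate is high. Forcing the phrase into every mail leaves the sender mix alone, so under this model the interventional open rate equals the baseline.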