

Causal Inference in Python: Applying Causal Inference in the Tech Industry



D**G
An almost complete guide to the tools you need to apply causal inference in tech
Context: I’ve worked as an applied scientist specializing in causal inference for the past 10 years and taught a graduate course on data science in industry at Columbia for several years. Most recently I worked at Lyft.
I don’t usually go out of my way to write a review, but I cannot speak highly enough of Matheus Facure’s book Causal Inference in Python: Applying Causal Inference in the Tech Industry. There are several books on causal inference that do a great job of covering the theory but fail to help the reader in three areas: 1) Understanding what kinds of problems researchers actually encounter in industry. 2) Understanding the difficulties that arise with real data. Many books abstract these problems away with generic variables and by assuming well-behaved oracle models, but in reality it’s often hard or impossible to verify many of these assumptions, which requires the kind of careful thinking that Matheus goes through for each example. 3) Coding everything from scratch (in Python! Many statisticians still prefer R, but if you’re writing any kind of production code, Python is the language of choice). This last point is essential, in my opinion, to helping the reader really understand the theory, and it is something I included in my own teaching philosophy when I was teaching Data Science in Industry at Columbia.
I can attest that the problems and methods in this book are reflective of what you really need to know for a large variety of industry applications.
For anyone working in tech who wants to get a hands-on understanding of how to apply causal inference in industry, I would recommend picking up a copy of this book.
J**.
Lovely, practical, and well-illustrated
It's a great book for getting started in causal inference. The explanations and illustrations are excellent. As a data scientist, I have found that the book has directly impacted my job, since I can apply and explore the techniques explained here. I had several "a-ha!" moments, and the author also exposes readers to not-so-well-known techniques in data analysis. A must-read if you are interested in causal analysis!
F**B
Great resource for technical readers
Several years ago, I was part of a book reading club in my data science department where we read Judea Pearl's Book of Why. In it, he argued that causal inference was a revolution in progress in analytics. I joked that it couldn't be that much of a revolution if there was no O'Reilly book about it! This prompted me to reach out to O'Reilly and ultimately write their first book about applied causal inference in business, "Behavioral Data Analysis with R and Python".
Two years later, it is great to see Matheus Facure continue that path. My book was targeted at junior data analysts and therefore kept things accessible, at the price of numerous simplifications; for data scientists with an advanced degree in a quantitative field, Matheus' book is in my opinion the best one, bar none. Its breadth and depth of coverage are impressive, and he manages to provide illuminating intuitions on advanced methods. At the same time, he manages to keep the tone conversational and engaging throughout.
As an economics PhD in business, I used to rely on "Mostly Harmless Econometrics" (Angrist & Pischke) and "Field Experiments: Design, Analysis, and Interpretation" (Gerber & Green) as references to refresh my memory when facing thorny inference questions. I suspect this book will become my new go-to reference.
J**O
Good book, but lack of color print makes graphs hard to read
While the material is awesome and the examples are useful, the pictures and graphs don't have any color, making the book difficult to read.
P**Z
Astonishing Balance of Theory and Application
From causal attribution 30 years ago, through Judea Pearl, to the gaffes in KBS and expert systems in the '80s and '90s, causal inference has long been the golden fleece of machine learning, as it is at the core of all AI: prediction.
The two top talents in ML at Google and Amazon, friends of mine on Zoom calls, give today's "two biggest challenges in ML" as load and bias.
In the early days of robotics we used to strap a video camera on a robot, attach it to actuators, and wonder why our logic gates couldn't make it walk.
The then-bizarre answer was discovered: it is all statistics; our unconscious motor system thinks in odds driven by calculus!
Enter causality. Are stats our doom or our savior? The rub is that learning is both observational and experimental, both study and practice. Any book purporting to help us actually USE CI in practice has to deal with that issue, which is far from resolved: can huge vector data sets ever have an empirical side that is ethical, sees bias, and balances the boosting and bagging tradeoffs of high/low variance in variance vs. noise vs. bias?
Kahneman's Noise (reprising Silver's The Signal and the Noise, just as Thinking, Fast and Slow reprised bounded rationality) gives a great philosophical foundation for bias vs. noise in prediction. In essence, the causal issue ends up being an energy/entropy problem!
By throwing the word Python in the title, given the above, do we really believe that the state of the art of CI is at the coding level, or is the publisher assuming the recommender engine will find me writing this and you reading it as hopeful but naive data scientists?
Focus to the rescue. The author does an amazing job of limiting the topics to areas where experimentation is possible and ethical. It won't help you build your medical causal real-time generative recommender system, but it will greatly assist you in developing the one that brought you to this book.
But unpacking data causally won't give us causal prediction even there; it will give us the A/B tests that get the weights and odds up for a family of causes.
Point bias is the culprit. Killing Archduke Ferdinand is a causal war event with lynchpin conditional point bias, but as with all data history, 1,000 causes lie under that cause, including that ubiquitous Brazilian butterfly.