A causal inference framework for analyzing large administrative healthcare databases with a focus on multiple sclerosis

Provincial health authorities routinely collect patient information on a massive scale, but health researchers face the challenge of exploring cause-and-effect relationships using these non-randomized, population-based data sources. Machine learning methods are increasingly used to analyze these large datasets, but they do not inherently account for causal structures (i.e., how the variables affect each other) and may therefore lead to suboptimal or even erroneous conclusions.

Health researchers urgently need new big-data analytic methods that are geared towards extracting causal explanations rather than merely increasing prediction accuracy. This project will develop innovative biostatistical methodologies that will better equip health researchers to infer causation from big-data sources.

As a motivating problem, Dr. Karim will investigate the potential benefits of disease-modifying drugs in multiple sclerosis patients 50 years of age or older, with a focus on reducing bias. Ultimately, this methodological development will enable health researchers to convert information into actionable knowledge for other common chronic conditions, leading to cost-effective medical decision-making and improving the health of Canadians.