Dealing with partially observed covariates in propensity score analysis of observational data
Observational data, such as electronic health records, are a valuable source of information for researchers seeking to answer health-related questions. Since treatment allocation is not typically randomised in studies using observational data, there is confounding: systematic differences in the characteristics of patients in different treatment groups. Propensity score analysis (PSA) can be used to handle confounding by modelling the probability of being allocated to a particular treatment, based on patient characteristics. However, a common issue in analyses of observational data is missing data. In general, not dealing appropriately with missing data can lead to loss of efficiency and biased estimates of the treatment effect. Furthermore, having partially observed covariate data can complicate the estimation of the propensity score. The missingness pattern approach (MPA) has been proposed to handle partially observed covariate data in PSA. One key objective of my thesis is to understand when the approach is appropriate, by exploring its underlying assumptions. I began by comparing different statements of the MPA's underlying assumptions given in the literature. I considered the plausibility of the MPA's assumptions in simple scenarios, finding that they are distinct from the conventional classification of missingness mechanisms. I used d-separation (a rule for testing conditional independence statements) with single world intervention graphs, representing a variety of scenarios, in order to develop guidance for when the assumptions seem plausible. I also explored the connection between using the MPA and using missing indicators in the context of PSA, finding that the use of missing indicators is a simplification of the MPA. I extended this work to outcome regression, mathematically proving that using missing indicators is valid under the MPA's assumptions together with an additional simplifying assumption.
I also conducted simulation studies to assess bias when using missing indicators to handle partially observed covariate data in outcome regression.
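
As a rough illustration of the idea behind the MPA, the sketch below estimates a propensity score separately within each missingness pattern, using only the covariate values observed in that pattern. It assumes a single binary covariate, so that a saturated propensity model reduces to the proportion treated in each stratum. The function name `mpa_propensity_scores`, the toy records, and the use of stratum proportions in place of a fitted regression model are assumptions made for illustration and are not taken from the thesis.

```python
from collections import defaultdict


def stratum_key(x):
    """Return the stratum for a covariate value.

    A missing value (None) forms its own stratum, defined by the
    missingness pattern alone; an observed value is stratified by
    the pattern together with the value itself.
    """
    return ("missing", None) if x is None else ("observed", x)


def mpa_propensity_scores(records):
    """Estimate a propensity score for each record.

    records: list of (treated, x) pairs, where treated is 0 or 1 and
    x is a binary covariate (0 or 1) or None when it is missing.
    This is a hypothetical helper: it uses the proportion treated in
    each stratum, which matches a saturated model for one binary
    covariate but is not a general-purpose implementation.
    """
    counts = defaultdict(lambda: [0, 0])  # stratum -> [n_treated, n_total]
    for treated, x in records:
        key = stratum_key(x)
        counts[key][0] += treated
        counts[key][1] += 1
    return [counts[stratum_key(x)][0] / counts[stratum_key(x)][1]
            for _, x in records]


# Toy data (invented for illustration): (treated, covariate value).
toy_records = [
    (1, 1), (1, 1), (0, 1),                  # covariate observed, x = 1
    (0, 0), (0, 0), (1, 0),                  # covariate observed, x = 0
    (1, None), (0, None), (0, None), (0, None),  # covariate missing
]
toy_scores = mpa_propensity_scores(toy_records)
```

With more covariates, or continuous ones, the stratum proportions would normally be replaced by a regression model fitted within each missingness pattern; the sketch only shows how the pattern enters the estimation.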
Item Type | Thesis (Doctoral) |
---|---|
Thesis Type | Doctoral |
Thesis Name | PhD |
Contributors | Williamson, E; Leyrat, C; Carpenter, J |
Research Group | Missing Data Interest Group |
Copyright Holders | Helen Abigail Blake |