Effect of imbalanced sampling and missing data on associations between gender norms and risk of adolescent HIV.

Ribhav Gupta; Safa Abdalla; Valerie Meausoone; Nikitha Vicas; Iván Mejía-Guevara; Ann M Weber; Beniamino Cislaghi; Gary L Darmstadt (2022) Effect of imbalanced sampling and missing data on associations between gender norms and risk of adolescent HIV. eClinicalMedicine, 50. 101513. ISSN 2589-5370. DOI: 10.1016/j.eclinm.2022.101513

BACKGROUND: Despite strides towards gender equality, inequalities persist or remain unstudied, potentially because of data gaps. Although key data gaps have been mapped, their effects remain unknown. This study provides a framework to measure the effects of gender- and age-imbalanced sampling and missing covariate data on gender-health research. The framework is demonstrated using a previously studied pathway for effects of pre-marital sex norms among adults on adolescent HIV risk.

METHODS: After identifying gender-age-imbalanced Demographic and Health Survey (DHS) datasets, we resampled responses and restricted covariate data from a relatively complete, balanced dataset derived from the 2007 Zambian DHS to replicate imbalanced gender-age sampling and covariate missingness. Differences in model outcomes due to sampling were measured using tests for interaction. Missing covariate effects were measured by comparing the fit of fully adjusted and reduced models.

FINDINGS: We simulated data from 25 DHS surveys across 20 countries from 2005-2014 on four sex-stratified models for pathways of adult attitude-behaviour discordance regarding pre-marital sex and adolescent risk of HIV. On average, across gender-age-imbalanced surveys, males comprised 29.6% of responses, compared with 45.3% in the gender-balanced dataset. Gender-age-imbalanced sampling significantly affected regression coefficients in 40% of model-scenarios (N = 40 of 100) and biased relative-risk estimates away from gender-age-balanced sampling outcomes in 46% (N = 46) of model-scenarios. Model fit was robust to covariate removal, with minor effects on male HIV models. No consistent trends were observed between sampling distribution and risk of biased outcomes.

INTERPRETATION: Gender-health model outcomes may be affected by gender-age-imbalanced sampling and, to a lesser extent, by missing covariates. Although occasionally attenuated, the effect magnitude of gender-age-imbalanced sampling is variable and may mask true associations, thus misinforming policy dialogue. We recommend that future surveys improve gender-age balance in sampling to promote research reliability.

FUNDING: Bill & Melinda Gates Foundation grant OPP1140262 to Stanford University.
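As an illustration of the resampling idea described under METHODS, the following is a minimal sketch and not the authors' implementation. It assumes a toy balanced dataset and a hypothetical target gender-age distribution; the function name `resample_to_distribution`, the field names `sex` and `age_group`, the age bands, and the target shares are all invented for this example. Records are drawn with replacement within each gender-age stratum so that the resampled data reproduce the target shares.

```python
import random


def resample_to_distribution(records, target_shares, n_total, seed=0):
    """Draw about n_total records, with replacement within each stratum,
    so each (sex, age_group) stratum's share matches target_shares.

    All names here are hypothetical; this is an illustrative sketch only.
    """
    rng = random.Random(seed)

    # Group the balanced records by gender-age stratum.
    by_stratum = {}
    for rec in records:
        by_stratum.setdefault((rec["sex"], rec["age_group"]), []).append(rec)

    sample = []
    for stratum, share in target_shares.items():
        k = round(share * n_total)  # number of draws for this stratum
        sample.extend(rng.choices(by_stratum[stratum], k=k))
    return sample


# Toy balanced dataset: 100 records in each gender-age stratum (assumed data).
balanced = [
    {"sex": sex, "age_group": age, "id": i}
    for sex in ("M", "F")
    for age in ("15-19", "20-49")
    for i in range(100)
]

# Hypothetical imbalanced target in which males make up 30% of responses.
target = {
    ("M", "15-19"): 0.10,
    ("M", "20-49"): 0.20,
    ("F", "15-19"): 0.30,
    ("F", "20-49"): 0.40,
}

imbalanced = resample_to_distribution(balanced, target, n_total=1000, seed=42)
male_share = sum(rec["sex"] == "M" for rec in imbalanced) / len(imbalanced)
print(f"male share after resampling: {male_share:.1%}")
```

In a workflow of this kind, the same model would then be fitted to the balanced and the resampled data and the estimates compared, for instance with a test for interaction between the exposure and a dataset indicator. That comparison step is not shown here.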


Published Version
Available under Creative Commons: 4.0
