Incorporating Baseline Outcome Data in Individual Participant Data Meta-Analysis of Non-randomized Studies

Background: In non-randomized studies (NRSs) where a continuous outcome variable (e.g., depressive symptoms) is assessed at baseline and follow-up, it is common to observe imbalance in baseline values between the treatment/exposure group and the control group. Such imbalance may bias the study estimate and, consequently, the meta-analysis (MA) estimate. These estimates may also differ depending on the statistical method used to handle the imbalance. Analysis of individual participant data (IPD) allows standardization of methods across studies. We aimed to identify the methods used in published IPD-MAs of NRSs for continuous outcomes, and to compare different methods of accounting for baseline values of outcome variables in IPD-MA of NRSs using two empirical examples from the Thyroid Studies Collaboration (TSC).

Methods: For the first aim, we systematically searched MEDLINE, EMBASE, and Cochrane from inception to February 2021 to identify published IPD-MAs of NRSs that adjusted for baseline outcome measures in the analysis of continuous outcomes. For the second aim, we applied analysis of covariance (ANCOVA), the change score, the propensity score, and the naïve approach (which ignores the baseline outcome data) in IPD-MAs of NRSs on the association of subclinical hyperthyroidism with depressive symptoms and with renal function. We estimated the study and meta-analytic mean difference (MD) and relative standard error (SE). We used both fixed- and random-effects MA.
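
As an illustration only, the four approaches named above can be sketched in Python on a single simulated study. Everything in the sketch is an assumption rather than the authors' implementation: the data are simulated (not TSC data), the function names ols_effect and propensity are hypothetical, and the propensity score is applied here as inverse probability weighting on the baseline outcome, which is one of several possible propensity score analyses.

```python
import numpy as np

def ols_effect(covariates, y, exposed, weights=None):
    """Mean difference for `exposed` and its SE from (weighted) least squares.

    Unweighted fits use the model-based SE; weighted fits use a robust
    (sandwich) SE that treats the weights as known.
    """
    X = np.column_stack([np.ones_like(y), exposed] + list(covariates))
    w = np.ones_like(y) if weights is None else weights
    bread = np.linalg.inv(X.T @ (X * w[:, None]))
    beta = bread @ X.T @ (w * y)
    resid = y - X @ beta
    if weights is None:
        cov = bread * (resid @ resid) / (len(y) - X.shape[1])
    else:
        meat = X.T @ (X * ((w * resid) ** 2)[:, None])
        cov = bread @ meat @ bread
    return beta[1], np.sqrt(cov[1, 1])

def propensity(baseline, exposed, n_iter=25):
    """P(exposed | baseline) from a logistic model fitted by Newton-Raphson."""
    z = (baseline - baseline.mean()) / baseline.std()  # rescale for stability
    X = np.column_stack([np.ones_like(z), z])
    b = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ b))
        hessian = X.T @ (X * (p * (1.0 - p))[:, None])
        b = b + np.linalg.solve(hessian, X.T @ (exposed - p))
    return 1.0 / (1.0 + np.exp(-X @ b))

# Simulated study (assumption): exposure is more likely at higher baseline
# values, so the groups are imbalanced at baseline by construction.
rng = np.random.default_rng(0)
n, true_md = 2000, 1.0
baseline = rng.normal(10.0, 3.0, n)
p_exposed = 1.0 / (1.0 + np.exp(-0.3 * (baseline - 10.0)))
exposed = (rng.random(n) < p_exposed).astype(float)
followup = 2.0 + 0.7 * baseline + true_md * exposed + rng.normal(0.0, 2.0, n)

ps = propensity(baseline, exposed)
ipw = exposed / ps + (1.0 - exposed) / (1.0 - ps)

results = {
    "naive": ols_effect([], followup, exposed),                  # ignores baseline
    "change score": ols_effect([], followup - baseline, exposed),
    "ANCOVA": ols_effect([baseline], followup, exposed),         # baseline as covariate
    "propensity score (IPW)": ols_effect([], followup, exposed, weights=ipw),
}
for name, (md, se) in results.items():
    print(f"{name:>24}: MD = {md:6.2f}, SE = {se:.2f}")
```

Because the simulated groups differ at baseline and the follow-up value depends on the baseline value with a slope below one, the naïve and change score estimates would be expected to move away from the simulated effect in opposite directions, while ANCOVA and the weighted estimate should stay close to it. The sketch says nothing about the size or direction of effects in the TSC cohorts.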

Results: Ten of the 18 included studies (56%) used the change score method, seven (39%) used ANCOVA, and one (5%) used the propensity score. Study estimates were similar across methods in studies in which the groups were balanced at baseline with regard to the outcome variable, but differed in studies with baseline imbalance. In our empirical examples, ANCOVA and the change score yielded study estimates in the same direction, whereas the propensity score did not. In our applications, ANCOVA provided more precise estimates than the other methods at both the study and meta-analytic levels. Heterogeneity was highest when the change score was used as the outcome, moderate with ANCOVA, and null with the propensity score.
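
For readers who want to see how study-level estimates translate into pooled estimates and a heterogeneity measure, the following is a minimal Python sketch of inverse-variance fixed-effect pooling and random-effects pooling. The abstract does not state which between-study variance estimator was used; the DerSimonian-Laird estimator is assumed here, heterogeneity is summarized as I², the function name pool is hypothetical, and the input numbers are invented for illustration.

```python
import numpy as np

def pool(md, se):
    """Pool study mean differences; returns fixed, random, tau^2 and I^2 (%)."""
    md, se = np.asarray(md, float), np.asarray(se, float)
    w = 1.0 / se**2                                # inverse-variance weights
    fixed = np.sum(w * md) / np.sum(w)
    fixed_se = np.sqrt(1.0 / np.sum(w))

    q = np.sum(w * (md - fixed) ** 2)              # Cochran's Q
    df = len(md) - 1
    c = np.sum(w) - np.sum(w**2) / np.sum(w)
    tau2 = max(0.0, (q - df) / c)                  # DerSimonian-Laird (assumed)

    w_re = 1.0 / (se**2 + tau2)                    # random-effects weights
    random = np.sum(w_re * md) / np.sum(w_re)
    random_se = np.sqrt(1.0 / np.sum(w_re))

    i2 = 100.0 * max(0.0, (q - df) / q) if q > 0 else 0.0
    return {"fixed": (fixed, fixed_se), "random": (random, random_se),
            "tau2": tau2, "I2": i2}

# Invented study-level estimates (MD, SE) for one analysis method.
example = pool(md=[0.10, 0.45, -0.05, 0.30], se=[0.12, 0.20, 0.15, 0.25])
print(example)
```

Running pool once per analysis method, on the MDs and SEs that each method produces for the same set of studies, gives a like-for-like comparison of pooled precision and I² across methods.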

Conclusion: ANCOVA provided the most precise estimates at both the study and meta-analytic levels and thus seems preferable for the meta-analysis of IPD from non-randomized studies. In studies where the groups were well balanced at baseline, the change score and ANCOVA performed similarly.