Perils of Flawed Meta-Analytic Methodology
Ignoring explainable heterogeneity within a 'random-effects' model can have dire consequences.
The purpose of this post is to provide an addendum to my other critiques of a meta-analytic review by Chris Ferguson — see Fundamental Flaws in Meta-Analytical Review of Social Media Experiments and Social Media Experiments and Weighted Averages for reference.
Flawed Methodology
Ferguson failed to uphold basic requirements of systematic reviews, especially in how he designed and misrepresented his ‘random-effects’ model.
Ferguson built his model on a severe conflation of disparate effects, such as:
The impact on self-esteem of viewing your own Instagram for 5 minutes.
The impact on momentary moods (such as feeling bored or irritated) of one-day smartphone abstinence in school.
The impact on subjective loneliness of increasing the frequency of Facebook status updates for one week.
The impact on depression of reducing SM by half for 3 weeks.
What is some weighted average of such disparate effects supposed to mean? What can be inferred from it beyond some vague relationship with general well-being?
Furthermore, Ferguson declared that his model provides a legitimate test of the theories of Haidt and Twenge that SM time reductions would improve mental health. This is mistaken.
First, the primary thesis of Haidt and Twenge regarding SM time reductions is that these would reduce the prevalence of depression and anxiety among young people — not that they would improve every possible aspect of mental well-being, such as whether you felt bored yesterday or whether you are satisfied with the conditions of your life.
Since Ferguson does not examine the impacts of SM time reductions on depression and anxiety, his model is invalid as a test of Haidt and Twenge’s theories.
Second, Haidt and Twenge hold that young people have become psychologically dependent on smartphones and social media, and therefore predict that sudden abstinence would produce temporary ‘withdrawal’ symptoms such as worsened mood.
Ferguson does not acknowledge this point, even though he includes a number of studies measuring the impact of short-term abstinence on momentary mood. His model then treats these studies as evidence against the theories of Haidt and Twenge, even though their outcomes actually confirm the predicted withdrawal effects.
This is akin to Ferguson constructing a model in which withdrawal symptoms during initial stages of heroin abstinence are interpreted as evidence that heroin is good and that heroin abstinence is harmful.
Explainable Heterogeneity
Ferguson designed a ‘random-effects’ model that obscured and negated the impacts of SM time reductions on mental health outcomes such as depression and anxiety, while misrepresenting the model as a valid test of Haidt and Twenge’s theories.
A random-effects model is meant to handle unexplained heterogeneity. If some of the differences in outcomes depend on identifiable factors, one needs to investigate them, typically through a subgroup analysis or a meta-regression.
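To make this concrete, here is a minimal Python sketch of the standard DerSimonian-Laird random-effects calculation (illustrative code of my own, not Ferguson’s actual analysis). Note that the between-study variance, tau-squared, is estimated and folded into the weights, but nothing in the procedure attempts to explain where that variance comes from:

```python
import numpy as np

def random_effects_pool(effects, variances):
    """DerSimonian-Laird random-effects pooled estimate.

    Heterogeneity (tau^2) is quantified and absorbed into the
    weights; this procedure never explains its source.
    """
    effects = np.asarray(effects, dtype=float)
    variances = np.asarray(variances, dtype=float)
    w = 1.0 / variances                              # fixed-effect weights
    mean_fixed = np.sum(w * effects) / np.sum(w)
    # Cochran's Q: total dispersion around the fixed-effect mean
    q = np.sum(w * (effects - mean_fixed) ** 2)
    df = len(effects) - 1
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - df) / c)                    # between-study variance
    w_re = 1.0 / (variances + tau2)                  # random-effects weights
    pooled = np.sum(w_re * effects) / np.sum(w_re)
    se = np.sqrt(1.0 / np.sum(w_re))
    return pooled, se, tau2
```

The weighted average this produces is only interpretable if the leftover variation really is random noise around a single underlying effect; it is not a license to average effects that differ for known, systematic reasons.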
Borenstein et al. 2010, in their guide A basic introduction to fixed-effect and random-effects models, caution:
The third caveat is that a random-effects meta-analysis can be misleading if the assumed random distribution for the effect sizes across studies does not hold.
The authors also explain:
While the random-effects model takes account of variation in effect sizes, it simply incorporates this variation into the weighting scheme—it makes no attempt to explain this variation. This is appropriate when the variation is assumed to be random, in the sense that it depends on unknown factors. By contrast, if we anticipate that some of the variance can be explained by specific variables, then we can study the relationship between these putative moderators and effect size. This has the effect of reducing the unexplained (residual) between-studies variation.
Note: all emphases are mine.
The authors also provided the following example:
For example, suppose the analysis includes 20 studies, where 10 employed one variant of the intervention while 10 employed another variant. We can perform a subgroup analysis, computing a summary effect within each set of 10 studies, and then comparing the two summary effects. This allows us to partition the total variance (of all effects vs the overall mean) into variance between the two subgroup means and variance within the subgroups.
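In code, the partition they describe is straightforward. The sketch below (again my own illustration, using simple inverse-variance weights) splits Cochran's Q statistic into a within-subgroup component and a between-subgroup component; a large between-subgroup component is exactly the kind of explainable heterogeneity a subgroup analysis exists to expose:

```python
import numpy as np

def partition_heterogeneity(effects, variances, groups):
    """Split total dispersion (Cochran's Q) into within-subgroup and
    between-subgroup components, as in Borenstein et al.'s example."""
    effects = np.asarray(effects, dtype=float)
    groups = np.asarray(groups)
    w = 1.0 / np.asarray(variances, dtype=float)
    grand_mean = np.sum(w * effects) / np.sum(w)
    q_total = np.sum(w * (effects - grand_mean) ** 2)
    q_within = 0.0
    for g in np.unique(groups):
        m = groups == g
        sub_mean = np.sum(w[m] * effects[m]) / np.sum(w[m])
        q_within += np.sum(w[m] * (effects[m] - sub_mean) ** 2)
    q_between = q_total - q_within   # dispersion explained by the subgrouping
    return q_total, q_within, q_between
```

If q_between is large relative to its degrees of freedom (the number of subgroups minus one), lumping everything into one average is concealing real differences between the variants.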
In parts 1 and 2, Zach Rausch and Jon Haidt did what Chris Ferguson failed to do: they investigated the obviously non-random variation among the outcomes within his ill-designed model. We do the same in this article, investigating the studies that included a measure of depression or anxiety, the primary focus of scholars like Haidt and Twenge.
To better understand the importance of such investigations, let’s consider an example that shows how the failure to acknowledge non-random distributions within a random-effects model could have terrible consequences.
Why Proper Analysis Matters
Imagine that a researcher is working for a pharmaceutical company that manufactures a weight-loss drug X.
The company conducts numerous experimental studies: there are 15 studies indicating that drug X improves body image and 10 studies providing statistically significant evidence that drug X increases suicidal ideation.
This researcher combines the effect sizes from the 15 body image studies and the 10 suicidal ideation studies into a ‘weighted average’ in a random-effects model and obtains a result that points toward harm but is statistically non-significant.
A subgroup analysis would have revealed that the 10 suicidal ideation studies provide statistically significant evidence of possibly fatal mental health harm, but the researcher omits any hint of this when he publishes his meta-analysis.
Instead, the researcher declares that his meta-analytic results ‘undermine’ any concerns about the effects of the weight-loss drug X on mental health and even announces publicly that his findings show that drug X has “no impacts on mental health”.
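A small simulation with invented numbers shows how this plays out. Reusing the random_effects_pool sketch from above, 15 fabricated ‘benefit’ effects and 10 fabricated ‘harm’ effects pool into an overall estimate near zero, even though the harm subgroup on its own is large and statistically significant:

```python
import numpy as np

# Invented effect sizes for the drug X story; assumes the
# random_effects_pool() sketch defined earlier is in scope.
rng = np.random.default_rng(0)
body_image = rng.normal(+0.30, 0.05, 15)   # benefit: improved body image
ideation = rng.normal(-0.50, 0.05, 10)     # harm: increased suicidal ideation
effects = np.concatenate([body_image, ideation])
variances = np.full(25, 0.02)

pooled, se, tau2 = random_effects_pool(effects, variances)
print(f"overall: {pooled:+.2f} +/- {1.96 * se:.2f}")           # CI straddles zero

harm, harm_se, _ = random_effects_pool(ideation, variances[:10])
print(f"harm subgroup: {harm:+.2f} +/- {1.96 * harm_se:.2f}")  # CI excludes zero
```

The overall confidence interval straddles zero; the subgroup interval does not. That gap is precisely what the hypothetical researcher’s “no impacts on mental health” claim conceals.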
Such conduct would be wrong. Proper analysis is a necessity in any meta-analytic review, especially when public health is at risk.