1) Past research from Amy Cuddy’s lab (amongst others) has shown that “power posing” (e.g., expansive postures of the type associated with dominance) increases people’s sense of power. This research also reported effects of power posing on testosterone levels, cortisol levels, and risk-taking behavior.
2) A replication attempt, using a much bigger sample, found no evidence for effects on testosterone, cortisol, or risk taking. However, the replication did find an effect on self-reported sense of power.
3) Cuddy replied to this replication, suggesting several moderating variables. For example, a major difference between the original study and the replication was that in the replication participants were told that power posing could affect their behavior.
4) Cuddy also provided a qualitative review arguing that lots of other work has found the same effects she reported.
5) Uri Simonsohn and Joe Simmons attempted to analyze the situation in a recent blog post, trying to figure out why the replication failed (http://datacolada.org/2015/05/08/37-power-posing-reassessing-the-evidence-behind-the-most-popular-ted-talk/). All of the relevant papers are linked there.
Here is a critical read of their analysis:
6) Using a method developed by Simonsohn (the “small telescopes” method) to quantify the detectability of an effect, they show that, in retrospect, the original study had only about a 5% chance of detecting an effect on risk taking of the size estimated in the replication.
In other words, it seems that the original study didn’t employ methods that could support its claims about risk taking. The effects on cortisol and testosterone apparently fare no better.
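To make the small-telescopes logic concrete, here is a minimal power-calculation sketch (a normal approximation to a two-sample t-test; the effect size d = 0.1 and n = 21 per group are illustrative stand-ins I chose for this sketch, not the actual numbers from the papers):

```python
# Sketch of the power calculation behind the "small telescopes" argument,
# using a normal approximation to a two-sample t-test.
# All numbers are illustrative, not taken from the papers.
import math

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def power_two_sample(d, n_per_group, alpha_crit=1.959964):
    """Chance that a two-sample test with n per group detects a true effect
    of size d (upper tail only; the lower tail is negligible for d > 0)."""
    ncp = d * math.sqrt(n_per_group / 2.0)  # noncentrality of the z statistic
    return 1.0 - phi(alpha_crit - ncp)

# With ~21 per group, a small true effect (d = 0.1) is detected only ~5%
# of the time -- barely above the false-positive rate itself.
print(round(power_two_sample(0.1, 21), 3))
```

The point of the exercise: if the replication's estimate is the true effect, a study this small was essentially guaranteed not to detect it, which is what "only a 5% chance" means.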
This definitely raises doubts. But here is my question:
Carney and Cuddy use one manipulation and show an effect on 4 different measures (risk taking, testosterone levels, cortisol levels, and sense of power).
If the effect were random noise, what are the odds of observing it on all 4 measures? It seems that a different application of the “small telescopes” approach, one that accounts for the conjunction of events, is needed.
Alternatively, it could mean that the correlation between the 4 outcome measures is extremely high, and that, by luck, high-testosterone participants (who also feel more powerful, take more risks, and have lower cortisol) ended up in the power-pose group.
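Both possibilities can be checked with a quick simulation. The sketch below uses assumptions of my own (21 participants per group, four equicorrelated standard-normal outcomes, and a z approximation in place of an exact t-test) to estimate how often a null manipulation comes out significant on all four measures at once, for uncorrelated vs. highly correlated outcomes:

```python
# Monte Carlo sketch: chance that a null effect reaches p < .05 on all
# four outcome measures simultaneously. Assumptions (all illustrative):
# 21 participants per group, equicorrelated standard-normal outcomes,
# and a z approximation (|z| > 1.96) in place of an exact t-test.
import numpy as np

rng = np.random.default_rng(0)
N_PER_GROUP = 21

def joint_false_positive_rate(rho, n_sims=10_000):
    """Fraction of simulated null experiments significant on all 4 measures."""
    cov = np.full((4, 4), rho)
    np.fill_diagonal(cov, 1.0)
    # Draw every simulated experiment at once: shape (n_sims, n, 4) per group.
    a = rng.multivariate_normal(np.zeros(4), cov, size=(n_sims, N_PER_GROUP))
    b = rng.multivariate_normal(np.zeros(4), cov, size=(n_sims, N_PER_GROUP))
    se = np.sqrt(a.var(axis=1, ddof=1) / N_PER_GROUP
                 + b.var(axis=1, ddof=1) / N_PER_GROUP)
    z = (a.mean(axis=1) - b.mean(axis=1)) / se
    return np.mean(np.all(np.abs(z) > 1.96, axis=1))

print(joint_false_positive_rate(0.0))  # essentially never: on the order of 0.05**4
print(joint_false_positive_rate(0.9))  # much more often, but still well under .05
```

With uncorrelated measures, the joint false-positive rate is on the order of 0.05^4, about 6 in a million; strong correlation raises that by orders of magnitude but still keeps it well below the single-test 5%.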
7) Simonsohn and Simmons then moved on to examine Cuddy’s claim that the effect was replicated by other people. Apparently, a p-curve analysis shows that these previous studies seem to suffer from publication bias.
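For intuition about what p-curve looks for (this is a crude simplification of mine, not the actual p-curve procedure): under the null, p-values are uniform, so among significant results, p < .025 and .025 < p < .05 are equally likely; a true effect skews the distribution toward small p-values, while a literature whose p-values huddle just under .05 suggests selective reporting. A sketch under a two-sample z approximation:

```python
# Crude intuition behind p-curve (not the published method): among
# significant results, what share should have p < alpha/2 if the effect
# is real? Null => 0.5; a true effect pushes the share above 0.5; an
# observed literature sitting below 0.5 hints at selective reporting.
# Two-sample z approximation; d and n are illustrative.
import math
from statistics import NormalDist

N = NormalDist()

def share_p_below_half_alpha(d, n_per_group, alpha=0.05):
    ncp = d * math.sqrt(n_per_group / 2.0)   # noncentrality of the z statistic
    z_alpha = N.inv_cdf(1 - alpha / 2)       # ~1.96: two-sided p < .05
    z_half = N.inv_cdf(1 - alpha / 4)        # ~2.24: two-sided p < .025
    # Upper-tail probabilities (the lower tail is negligible for d > 0
    # and cancels symmetrically under the null).
    p_sig = 1 - N.cdf(z_alpha - ncp)
    p_strong = 1 - N.cdf(z_half - ncp)
    return p_strong / p_sig

# With a real medium effect, well over half of significant p-values
# should fall below .025.
print(round(share_p_below_half_alpha(0.5, 21), 2))
```

A p-hacked or selectively published literature shows the opposite pattern, which is roughly what Simonsohn and Simmons report for the studies Cuddy cites.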
This seems to undercut Cuddy’s claims that the failed replication is just a drop of water in an ocean of significant results.
But was the failed replication indeed a failed replication?
Ranehill et al. successfully replicated the effect of posture on self-reported power. They didn’t replicate the effects on hormones or risk taking.
Simonsohn and Simmons refer to this as a “manipulation check”, following Carney and Cuddy’s own words. However, when we take a step back and think about it, isn’t this the effect that we are talking about when we think of this literature? That striking a pose makes you feel more powerful? Feeling more powerful must have meaningful effects; the only question is what those effects are.
Now, we know from a lot of other studies, in humans and in non-humans, that a sense of social dominance is correlated with increased testosterone, lower cortisol, and increased risk taking. We don’t need to trust Carney and Cuddy on this one.
If a sense of power is indeed correlated with all those things, then, given that the effect on self-reported power is real, some aspect of Ranehill’s design (e.g., when testosterone was sampled, or something about the population and its stress levels) may indeed be at fault for the failure to find the correlation between sense of power and testosterone.
If a sense of power is not correlated with testosterone, cortisol, risk taking, this brings me back to the comment made in point 6 (above):
In such a case it is unlikely that Carney and Cuddy lucked out on 4 out of 4 uncorrelated measures, and the metrics used to assess the level of evidence in the original study are biased against it. A small-telescopes examination that accounts for the conjunction of 4 outcome measures is needed. No?
If I’m wrong, I’d be happy to hear comments.