How the choice of method influences results
04/14/2026

A large-scale study shows: when hundreds of researchers re-analyse the same data, they often arrive at different results. This analytical variability must therefore be given greater consideration.
A little over ten years ago, the so-called reproducibility crisis sparked heated debate in the scientific community. At the time, there was great concern that public trust in science could be shaken. What had happened? In 2015, the Reproducibility Project: Psychology published the results of a large-scale study. The scientists involved had attempted to replicate 100 studies from the field of psychology - with the result that only around a third of the findings could be confirmed.
Since then, the social and behavioural sciences have undergone significant reforms with the aim of making research more transparent, rigorous and reliable. Measures such as pre-registration, replication studies and checks on analytical reproducibility are intended to help reduce the incidence of chance findings and biased results. However, one important question has received relatively little attention: To what extent do research results depend on the specific way in which data are analysed?
Analytical robustness under the magnifying glass
An international research team has now investigated this question. Their study "Investigating the analytical robustness of the social and behavioural sciences", published in the journal Nature, finds that scientific conclusions can vary considerably depending on who carries out the analysis and how. Two scientists from the University of Würzburg were also involved in the study: Dr Martin Weiß (Chair of Psychology I - Clinical Psychology and Psychotherapy) and Dr Marcel Schreiner (Chair of Psychology III - Cognition & Behaviour).
"In common scientific practice, a data set is usually analysed by a single researcher or a research team, and the resulting publication presents the outcome of a specific analysis path," says Martin Weiß, explaining the background to the study. Although reviewers assess methodological acceptability before publication, it rarely matters what results would have emerged from alternative but equally justifiable statistical decisions.
"However, empirical research involves numerous decision points: how data is cleaned, how variables are defined, which statistical models or software are used and how results are interpreted," adds Marcel Schreiner. Together, these decisions form what is known as analytical variability - the flexibility that can fundamentally affect the final conclusions.
More than 500 independent re-analyses
To measure the extent of analytical variability objectively, the team organised an international crowd initiative. It randomly selected 100 studies from the social and behavioural sciences that had been published between 2009 and 2018. A total of 457 researchers took part in the project and conducted 504 independent re-analyses. They all received the same data set and the same central research question, but were free to choose their own analytical approach. The aim was to re-examine each study's central scientific claim using the original data.
The key finding was that although most of the new analyses largely supported the main claims of the original studies, effect sizes, statistical estimates and degrees of uncertainty often differed considerably. The details are as follows:
- Statistical variability and effect sizes: 81 per cent of the analysts used methods that differed from the original, and only 34 per cent arrived at exactly the same statistical result as the original publication. The average effect size in the re-analyses was significantly lower than in the original studies.
- Scientific conclusions: Despite these numerical deviations, 74 per cent of the re-analyses confirmed the original core statement, while 24 per cent found no clear effects and 2 per cent came to the opposite conclusion.
- Influence of study design: Experimental studies proved to be robust to alternative analyses in 47 per cent of cases, whereas this figure fell to 27 per cent for observational studies. This indicates that more complex data structures allow for greater analytical flexibility - and therefore greater uncertainty.
- Influence of expertise: Discrepancies were not due to a lack of expertise. Experienced researchers with sound statistical knowledge arrived at different results just as often as others.
What does this mean from a scientific point of view?
Balázs Aczél, professor at Eötvös Loránd University (Budapest) and one of the study leaders, concludes: "These results do not call into question the credibility of previous research. Rather, they draw attention to the fact that the presentation of a single analysis often does not reflect the actual degree of empirical uncertainty and that ignoring analytical variability can lead to unwarranted confidence in scientific conclusions."
Barnabás Szászi, Assistant Professor at Eötvös Loránd University and Corvinus University (Budapest), adds: "We argue in favour of a broader application of multi-analyst and 'multiverse' approaches, especially to questions of high scientific or societal importance. Rather than looking for a single true answer, these approaches reveal how robust - or fragile - scientific conclusions actually are."
The study was conducted as part of the Systematizing Confidence in Open Research and Evidence (SCORE) programme, funded by DARPA, the Defense Advanced Research Projects Agency of the US Department of Defense.
Original publication
Aczel, B., Szaszi, B., Clelland, H.T. et al. Investigating the analytical robustness of the social and behavioural sciences. Nature 652, 135-142 (2026). https://doi.org/10.1038/s41586-025-09844-9
Contact
Dr Martin Weiß, Chair of Psychology I - Clinical Psychology and Psychotherapy
T +49 931 31-82378, martin.weiss@uni-wuerzburg.de
Dr Marcel Schreiner, Chair of Psychology III - Cognition & Behaviour
T +49 931 31-83380, marcel.schreiner@uni-wuerzburg.de
