Hot Topic

Assuring the Quality of Psychological Research

Research in psychology, and in the behavioral sciences more generally, is currently facing a debate about research practices and how to improve replicability. When replicated, many findings seem to either diminish in magnitude or to disappear altogether, as, for instance, recently shown in the Reproducibility Project: Psychology. Several reasons for false-positive results in psychology have been identified (e.g., p-hacking, selective reporting, underpowered studies), and they call for reforms across the whole range of academic practices. These range from (1) new journal policies promoting an open research culture, to (2) hiring, tenure, and funding criteria that reward credibility and replicability rather than sexiness and quantity, to (3) actions for increasing transparent and open research practices within individual labs.

Keynote

Brian Nosek, University of Virginia & Center for Open Science

"Addressing the Reproducibility of Psychological Science"

The currency of science is publishing. Producing novel, positive, and clean results maximizes the likelihood of publishing success because those are the best kinds of results. There are multiple ways to produce such results: (1) be a genius, (2) be lucky, (3) be patient, or (4) employ flexible analytic and selective reporting practices to manufacture beauty. In a competitive marketplace with minimal accountability, it is hard to resist (4). But there is a way. With results, beauty is contingent on what is known about their origin. With methodology, if it looks beautiful, it is beautiful. The only way to be rewarded for something other than the results is to make transparent how they were obtained. With openness, I won’t stop aiming for beautiful papers, but when I get them, it will be clear that I earned them.

Brian Nosek is a Professor of Psychology at the University of Virginia and Executive Director of the Center for Open Science (http://cos.io/). His research and practical interests focus on the gap between values and practices.

Panel Discussion

Assuring the Quality of Psychological Research
Chair: Manfred Schmitt
Discussants: Manfred Schmitt, Andrea Abele-Brehm, Klaus Fiedler, Kai Jonas, Brian Nosek, Felix Schönbrodt, Rolf Ulrich, Jelte Wicherts

When replicated, many findings seem to either diminish in magnitude or to disappear altogether, as, for instance, recently shown in the Reproducibility Project: Psychology. Several reasons for false-positive results in psychology have been identified (e.g., p-hacking, selective reporting, underpowered studies), and they call for reforms across the whole range of academic practices. These range from (1) new journal policies promoting an open research culture, to (2) hiring, tenure, and funding criteria that reward credibility and replicability rather than sexiness and quantity, to (3) actions for increasing transparent and open research practices within and across individual labs. Following Brian Nosek's (Center for Open Science) keynote, titled "Addressing the Reproducibility of Psychological Science," this panel discussion aims to explore the various ways in which our field may take advantage of the current debate. That is, the focus of the discussion will be on effective ways of improving the quality of psychological research in the future. Seven invited discussants will provide insights into different current activities aimed at improving scientific practice and will discuss their potential. The audience will be invited to contribute to the discussion.

Invited Symposium

Reproducibility and trust in psychological science
Chair: Jelte Wicherts (Tilburg University)

In this symposium we discuss issues related to reproducibility and trust in psychological science. In the first talk, Jelte Wicherts will present empirical results from meta-science that may lower trust in psychological science. Next, Coosje Veldkamp will discuss results bearing on actual public trust in psychological science and in psychologists from an international perspective. After that, Felix Schönbrodt and Chris Chambers will present innovations that could strengthen reproducibility in psychology. Felix Schönbrodt will present Sequential Bayes Factors as a novel method to collect and analyze psychological data, and Chris Chambers will discuss Registered Reports as a means to prevent p-hacking and publication bias. We end with a general discussion.

  1. Reproducibility problems in psychology: what would Wundt think?
    Jelte Wicherts (Tilburg University)

    In this introductory talk I present an overview of recent evidence from the emerging field of meta-science bearing on the reproducibility of results published in the psychological literature. I compare the current state of psychological science with a (perhaps overly romantic) view of how Wilhelm Wundt and his contemporaries conducted science and reported their findings. I present results that concern common failures to publish, analytic flexibility and selective reporting, statistical reporting errors, lack of transparency, failures to replicate, small-study effects (underpowered studies showing larger effects), measurement problems, and suboptimal inferences drawn from data. I conclude that Wundt and his contemporaries would have difficulty appreciating current problems, because many of these originate from statistical methods that were developed after their time. Yet we could certainly learn from how psychological science was done over a century ago.

  2. Trust in psychology and psychologists
    Coosje Veldkamp (Tilburg University) 

    Despite mounting retractions due to scientific misconduct and increasing doubts about the reproducibility of findings in many scientific fields, public trust in science and scientists remains high. However, cases of research misconduct that are covered extensively in the media may affect public trust in specific areas of science or specific groups of scientists. Two years after the fraud of Dutch psychologist Diederik Stapel became a major news item in The Netherlands and abroad, we examined public trust in psychology and psychologists in The Netherlands, Germany, and the United Kingdom. We found that trust in psychology was much lower than trust in physics, genetics, and science in general in all three countries. The same was found for trust in psychologists: it was again considerably lower than trust in physicists, geneticists, and scientists in general. In addition, people in all three countries attributed less integrity to psychologists than to physicists, geneticists, and scientists in general. Overall, trust in psychologists and the integrity attributed to psychologists were highest in the UK and lowest in Germany.

  3. Never underpowered again: Sequential Bayes Factors guarantee compelling evidence
    Felix Schönbrodt (University of Munich)

    Unplanned optional stopping has been criticized for inflating false-positive inference under the NHST paradigm. Nonetheless, this research practice is not uncommon, probably because it appeals to researchers' intuition to collect more data in order to push an indecisive result into a decisive region. The Sequential Bayes Factor (SBF) design, in contrast, allows optional stopping with unlimited multiple testing, even after each participant. In this hypothesis testing design, sample sizes are adaptively increased until the Bayes factor reaches the desired level of evidence, either for H0 or for H1. Compared to an optimal NHST design, this leads on average to 50-70% smaller sample sizes, while maintaining the same error rates.
    The SBF design has properties that are particularly useful in the light of the current replication debate: Inconclusive results (the "p=.08 problem") can be pushed into a decisive region, and sample size can be adaptively extended until there is much stronger evidence than typically achieved (which reduces the rate of false positive results). Furthermore, in the case of replication attempts, it is not necessary to commit to shaky effect size (ES) guesses in a power analysis. As the amount of upward bias of the reported ES is unknown, seemingly properly powered replication attempts often turn out to be actually underpowered. In the SBF design, one simply starts collecting data and stops adaptively when there is enough evidence for either hypothesis. In this way, every replication attempt will be a success (in the sense that it is guaranteed to provide compelling evidence, either for H0 or H1).
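    As an illustration (not code from the talk), a minimal Python sketch of such a sequential stopping rule is given below. It computes a default JZS Bayes factor for a two-sample t test (Rouder et al., 2009) after each batch of participants and stops once the evidence passes a threshold for H1 or H0; the threshold of 10, the batch size of 10, the Cauchy prior scale r = sqrt(2)/2, and the simulated data are illustrative assumptions, not prescriptions from the abstract.

    import numpy as np
    from scipy import stats
    from scipy.integrate import quad

    def jzs_bf10(t, n1, n2=None, r=np.sqrt(2) / 2):
        """Default JZS Bayes factor (Rouder et al., 2009) for a t statistic.
        Returns BF10, the evidence for H1 (effect != 0) over H0 (effect = 0)."""
        if n2 is None:                      # one-sample / paired design
            n_eff, df = n1, n1 - 1
        else:                               # independent two-sample design
            n_eff, df = n1 * n2 / (n1 + n2), n1 + n2 - 2

        def integrand(g):                   # marginal likelihood under H1 (Cauchy prior on d)
            a = 1.0 + n_eff * g * r ** 2
            return (a ** -0.5
                    * (1.0 + t ** 2 / (a * df)) ** (-(df + 1) / 2.0)
                    * (2.0 * np.pi) ** -0.5 * g ** -1.5 * np.exp(-1.0 / (2.0 * g)))

        m1, _ = quad(integrand, 0, np.inf)
        m0 = (1.0 + t ** 2 / df) ** (-(df + 1) / 2.0)   # likelihood under H0 (d = 0)
        return m1 / m0

    def sequential_bf_ttest(sample_a, sample_b, n_min=20, batch=10,
                            n_max=500, threshold=10.0):
        """Collect data in batches; stop once BF10 >= threshold (evidence for H1)
        or BF10 <= 1/threshold (evidence for H0). n_max is a practical resource limit."""
        a, b = list(sample_a(n_min)), list(sample_b(n_min))
        while True:
            t_stat, _ = stats.ttest_ind(a, b)
            bf10 = jzs_bf10(t_stat, len(a), len(b))
            if bf10 >= threshold:
                return "evidence for H1", bf10, len(a) + len(b)
            if bf10 <= 1.0 / threshold:
                return "evidence for H0", bf10, len(a) + len(b)
            if len(a) >= n_max:
                return "inconclusive (n_max reached)", bf10, len(a) + len(b)
            a.extend(sample_a(batch))       # optional stopping: simply keep sampling
            b.extend(sample_b(batch))

    # Simulated replication attempt with a true effect of d = 0.5 between two groups.
    rng = np.random.default_rng(2016)
    decision, bf, n_total = sequential_bf_ttest(
        sample_a=lambda k: rng.normal(0.0, 1.0, k),
        sample_b=lambda k: rng.normal(0.5, 1.0, k),
    )
    print(decision, round(bf, 2), "total N =", n_total)

    Testing after every participant rather than after every batch is equally valid in this design; the batch size only trades off logistical convenience against the final sample size.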

  4. The Registered Reports project: Three years on
    Chris Chambers (Cardiff University)

    In 2013 the journal Cortex became the first outlet to offer Registered Reports, a format of pre-registered empirical publication in which peer review happens prior to data collection and analysis (see https://osf.io/8mpji/wiki/home/). The philosophy of Registered Reports is that in order to counteract publication bias and various forms of researcher bias (such as p-hacking and HARKing), the publishability of a scientific study should be decided by the importance of the research question and rigour of the methodology, and never based on the results of hypothesis testing. In this talk I will provide an update on the progress of Registered Reports at Cortex and beyond, including uptake by more than 20 journals. I will focus in particular on some of the emerging challenges of the format as it has expanded, together with insights it has offered into forms of bias that pervade both research and the peer review process. Together with allied initiatives, Registered Reports are helping to reshape the incentive structure of the life sciences to place transparency and reproducibility on par with conventional indicators of scientific quality.