Sunday, 29 January 2012

The Trials & Tribulations of the Journal of Null Results


Ok - straight to the point again. There will never be a Journal of Null Results. Recently Richard Wiseman tweeted “Psychologists: I am thinking of starting a journal that only publishes expts that obtained null results. Wld you submit to it?” I replied – Yes, I have drawers full of studies with null results. But – having said that - I struggle with even thinking about writing them up, let alone submitting them – and the reality is that I know that no editor would contemplate accepting them for publication. But I have great sympathy for creating a Journal of Null Results.

So what would a Journal of Null Results look like? Is it going to be swamped with submissions from people who didn’t get significant results? That would be nice for all our young and up-and-coming researchers. It would mean that those 50% or so of PhD studies that never achieve significance would see the light of day; it would mean that all undergraduate projects – at a sweep – would be suddenly publishable. But I’m not sure that’s what’s intended. One of the cornerstones of science is replication, but we still seem quite bad at deciding when replication is necessary and deciding when non-replication is publishable. These are the decisions upon which science – according to the principle of replication – should be based. I haven’t looked closely at the literature, but I suspect that most journal editors would not be interested in publishing (1) a straight replication of a study that is already published, and (2) a study that fails to replicate an already published study – especially if the results are null ones.

There are two issues that immediately come to mind here. First, we can never quite know why a study has generated null results. As a good scientist, you certainly wouldn’t design your experiment to generate predictions that would support the null hypothesis. We know that null results can be produced by low power, inappropriate control conditions, poor experimental environments, experimenter bias, and so on – so null-result studies have an inherent disadvantage unless they are obsessively faithful reproductions of studies that have produced significant results in the past.

As I said earlier, there are two distinct but related issues here: who decides whether a finding is so significant that it needs replicating, and is a replication going to generate enough citations to make an editor believe it is worth publishing? For example, in 2011 Daryl Bem published an article in the Journal of Personality & Social Psychology – one of the flagship journals of the APA – apparently showing that ESP existed and that people could predict the future. The normal process for the vast majority of research findings would be that this finding – published in one of the major psychological journals – would now simply enter into core psychological knowledge. It would find its way into Introduction to Psychology textbooks and be taught as psychological fact. But it is controversial. Richard Wiseman, Chris French & Stuart Ritchie have had the salutary experience of trying to publish a replication of this finding, and their failure to get their null results published in that same journal is documented in their blog. Quite strangely, the Journal of Personality & Social Psychology told them ‘no, we don’t ever accept straight replication attempts’. I assume the implication is that whatever they publish is such rock-hard fact – because they are such a high-impact journal – that they have no need to justify the scientific integrity of what they publish? So where do you publish replications if not in the same journal as the original study? Wiseman et al. did submit their replication to another journal, but the reasons for not publishing it there also seem, from that same account, to be bizarre. Once a finding does get published – no matter how bizarre and leftfield – there are editors bending over backwards to find reasons not to publish attempted replications.

One reason why replications may not get published, and specifically why null findings may not get published, is that editors are nowadays being pressurized not only to find reasons to reject quite scientifically acceptable studies (because the journal simply has to manage demand), but also to reject studies that may not generate citations. This is certainly true of successful replications, which we assume will never be as heavily cited as the original study. In the last couple of weeks I had an article rejected by an APA journal, and one of the reasons given by the action editor was “we must make decisions based not only on the scientific merit of the work but also with an eye to the potential level of impact for the findings”. Whoa! – that is very worrying. Who is making decisions about whether an article is citable or not?

Finally, I should recount one of my own experiences of trying to get the research community to hear about null findings. During the early 2000s, my colleague Andy Field and I did quite a bit of research on evaluative conditioning. There was a small experimental literature in the 1990s suggesting that evaluative conditioning was real, important, and possessed characteristics that made it distinct from standard Pavlovian conditioning – such as occurring outside of consciousness and being resistant to extinction. Interestingly, multinational corporations suddenly got interested in this research because it suggested that evaluative conditioning might be a useful paradigm for changing people’s preferences, and so had important ramifications for advertising generally. Andy’s PhD thesis rather eloquently demonstrated that the evaluative conditioning effects found in most of the 1990s studies could be construed as artifacts of the procedures used in those earlier studies and need not necessarily be seen as examples of associative learning (Field & Davey, 1999). We spent many subsequent years trying to demonstrate evaluative conditioning in artifact-free paradigms, but many very highly powered studies resulted in failure after failure to obtain significant effects. Evaluative conditioning was much more difficult to obtain than the published literature suggested. So what should we do with these bucketloads of null results? No one really wanted to publish them. In fact, we didn’t want to publish them, because we thought we hadn’t quite got it right yet! We kept on thinking there must be a procedure that produces robust evaluative conditioning but that we hadn’t quite refined it. In 1998 I was awarded a 3-year BBSRC project grant worth £144K of taxpayers’ money to investigate evaluative conditioning.
We found that it was extremely difficult to demonstrate evaluative conditioning – and it was also very difficult to publish the fact that we couldn’t demonstrate evaluative conditioning! Andy and I attempted to publish all of our evaluative conditioning null findings in relatively mainstream journals – summarizing a total of 12 studies largely with null results. No luck there, but we did eventually manage to publish these findings in the Netherlands Journal of Psychology (Field, Lascelles, Lester, Askew & Davey, 2008). I must admit, I don’t know what the impact factor of that journal is, and I don’t know how many people have ever read that article. But I suspect it will never make headlines in any review article on evaluative conditioning.

The problem for null results is that they lie in that no man’s land between flawed design and genuine refutation. Our existing peer-review process leads us to believe that what is published is published because this is the way we verify scientific fact. It then becomes significantly more difficult to produce evidence that a ‘scientific fact’ is wrong – and more difficult still when that evidence consists of null results.

Field, A. P., & Davey, G. C. L. (1999). Re-evaluating evaluative conditioning: A nonassociative explanation of conditioning effects in the visual evaluative conditioning paradigm. Journal of Experimental Psychology: Animal Behavior Processes, 25, 211-224.

Field, A. P., Lascelles, K. R. R., Lester, K. J., Askew, C., & Davey, G. C. L. (2008). Evaluative conditioning: Missing, presumed dead. Netherlands Journal of Psychology, 64, 46-64.



  1. People have been thinking about this for a while.

    Important sources that should not be ignored are:
    (1) Jonah Lehrer's wonderful article in the New Yorker;
    (2) Obviously, Ioannidis' papers (e.g. 2005 "Why most published research findings are false", 2011a "Excess Significance Bias in the Literature on Brain Volume Abnormalities", etc.)
    (3) Simmons, Nelson & Simonsohn 2011 ("False-positive Psychology").

    The main point you are making about null results - "that they lie in that no man’s land between flawed design and genuine refutation" - is a problem indeed. But I don't see why it should be a bigger problem than the problem that 5% of all studies are false positives. Or a problem that is harder to handle. And obviously, it doesn't mean that all undergrad papers should get published.

    No one wants ALL null results to be published.

    But if a study fulfills the same scientific criteria as a positive result, there is no scientific reason not to publish it. And we know all fields have problems here. Most modern meta-analyses include funnel plots, nicely showing the insanely strong publication bias.


  2. There are plenty of examples of null results being eagerly accepted by editors, and in fact, Jacob Cohen wrote an entire book bearing on how poorly designed published null-finding behavioral science research could be: _Statistical Power Analysis for the Behavioral Sciences_.

    There are also likely plenty of times when prevailing political and social attitudes encourage publication of null-finding research that has very obviously been designed to enhance the likelihood of a null finding.

    Two examples are meditation research published during the Carter and Reagan Administrations, and more recently, research purporting to show that genetically modified organisms show no significant differences in safety. A plethora of null-finding studies on each topic have been published, with as few as 7 experimental subjects in the case of one null-finding meditation study I have seen, and as few as 3 experimental animals in the case of one GMO safety study.

    It is trivially easy to get such studies published when the political climate is such that null-findings are deemed desirable.