a3nm's blog

Double-blind reviewing

More and more conferences in theoretical computer science (TCS) are moving to double-blind peer review. This is the form of peer review in which the reviewers who evaluate submitted articles do not know the identity of the paper's authors, at least initially. The two major database theory conferences have adopted double-blind peer review (at least in an "experimental" way): PODS since 2023 and ICDT since 2024. Among general theoretical computer science conferences, we have ESA since 2019, STACS since 2021, ICALP since 2022, FOCS since 2022, SODA since 2022, STOC since 2024, and ITCS since 2024. An up-to-date list is maintained on double-blind.org. However, double-blind reviewing is not yet used in all conferences in all subareas of TCS. Further, it is also not commonly used in journals. In fact, I do not know of TCS journals that use double-blind peer review, except TODS, which has used it since 2006 with a detailed rationale. See also the discussion of this issue at SoCG, which has used double-blind reviewing since 2023.

I think that this move to double-blind reviewing is a welcome change, and thought I'd try to summarize some of my thoughts about it.

First: should conferences and journals adopt double-blind reviewing? The implicit premise behind double-blind reviewing is that reviewers should evaluate articles based on their intrinsic merits, so that the identity and affiliations of the authors are not relevant for evaluation. When discussing double-blind reviewing, I think it is important to check first if all parties agree about this point. Indeed, this view is not universal: some researchers insist that it would be normal for conferences to hold submissions by newcomers to higher standards (see, for instance, this answer on academia.SE); or, on the contrary, that conferences could be more welcoming towards outsiders [1]. However, if you believe, like TODS, that "every submission should be judged on its own merits", then information about the authors and their affiliations is indeed irrelevant to reviewers (assuming they have no conflicts of interest; see below). The question becomes: does hiding this information make a difference, and is the difference sufficient to justify the change?

Second: how much of a difference does double-blinding make? This is not, in fact, straightforward to evaluate. There have been decades of debate on this topic, and policies have been adopted by scholarly associations in many fields, given that double-blind peer review has been around since the 1950s (according to Wikipedia). Further, many scientific studies have attempted to quantify its effectiveness. The TODS rationale contains some pointers to this literature, and after 2006 one can find many other articles arguing for or against double-blind peer review (see for instance this one and the many articles that it cites or that cite it). Of course, the effectiveness of double-blind reviewing may depend on the community, or on how specifically it is implemented. In computer science, one influential data point was the 2017 WSDM experiment, in which submissions were scored by double-blind reviewers and by single-blind reviewers (i.e., reviewers who knew the identity and affiliation of authors). The experiment found that "single-blind reviewing confers a significant advantage to papers with famous authors and authors from high-prestige institutions". Here I must confess that I have not myself read all these works: my point is just that the issue is not simple, and that you cannot dismiss double-blind reviewing simply because you are unfamiliar with it or because you are personally convinced that it does not work.

Third: how should double-blind reviewing be implemented? Here, there is an idea I'd really like to get across: making reviewing "double-blind" does not necessarily mean that it should be impossible for reviewers to deanonymize authors. Indeed, essentially all [2] conferences in the list above are using one specific implementation, called lightweight double-blind reviewing. This means that, while papers should not contain identifying details, and while reviewers are advised not to try to identify authors, it is OK if reviewers happen to find out who the authors are. In particular, authors are encouraged to post their work, e.g., on preprint servers, even if this means a reviewer may find it (deliberately, or accidentally, e.g., by stumbling upon the work before reviewing). Lightweight double-blind reviewing still offers two important benefits:

  • Reviewers are not immediately biased by reading the author names and affiliations on the first page of the paper. In my personal experience as a reviewer, seeing familiar names or affiliations on a paper will immediately affect my impression of the paper and my expectation about the outcome ("probably accept unless the contents are a bad surprise" vs "probably reject unless the contents are a good surprise"). I would like not to be influenced by this, but I doubt I can avoid it, so I prefer not to see the information [3].
  • Authors who worry about discrimination and do not trust the reviewers can choose to take steps to be completely anonymous, e.g., they can choose not to post preprints of their work.

By contrast, I do not like double-blind policies that try to guarantee complete anonymity of submissions at all costs, e.g., by prohibiting or discouraging the posting of papers as preprints, informal talks, etc. I believe this is harmful, because it means that papers are only available when they are finally published [4], whereas preprints are immediately accessible to everyone. Unfortunately, there are conferences (especially practical conferences) that follow this route, e.g., SIGMOD 2024 [5]. Disallowing preprints can have a chilling effect on an entire field: as papers often end up rejected from a conference and resubmitted at another, authors may eschew the posting of preprints because of the risk that some later, unspecified conference may disqualify their work on these grounds.

Fourth: what are the real problems with double-blind reviewing? Many of the criticisms I have heard do not make sense to me:

  • "Anonymizing papers is tedious work." This I really don't understand: removing author names is trivial, writing self-citations in the third person (e.g., "The previous work of Myself et al." vs "Our previous work") feels weird but takes a few seconds... Altogether the impact seems to be minimal [6].
  • "It is complicated to host/submit supplementary data, e.g., source code or experiments, in an anonymous fashion." But there are tools like Anonymous GitHub, and guides on what you can do.
  • "Double-blind reviewing is not perfect, and reviewers can guess who the authors are." Well, first, guessing is not the same as being sure; second, it's still useful if you can remove bias in some cases. An improvement can be valuable even if it is not a perfect solution.
  • "It's unnecessary, everyone is honest and free of bias." Even assuming that reviewers try to be fair, this misses the point that bias is often unconscious.

To me, the real (minor) shortcomings of double-blind reviewing are:

  • Some journals (and conferences?) are "epijournals", where papers are first submitted to a preprint server, and then reviewed by the journal (and endorsed if they pass peer review). I think this is a great practice that neatly separates the hosting of papers (which is done for free by preprint servers) from the evaluation process. Unfortunately the interaction with double-blind peer review is not perfect: you cannot send the preprint to reviewers, because of course it includes author information. The fix is obvious, just a bit inelegant: simply ask authors for an extra blinded version of the paper when they submit it for evaluation by the journal.
  • The management of conflicts of interest (COIs) is more complicated. Many conferences and journals have policies to ensure that papers are not reviewed by colleagues, collaborators, supervisors, or personal friends of the authors (or, of course, by the authors themselves!). When reviewers know who the authors are, they can easily detect COIs and recuse themselves from reviewing the paper. With double-blind reviewing, this is more complicated. Typical solutions include asking authors upon submission to disclose with which members of the program committee (PC) they are in COI, and/or asking PC members to disclose with which authors they are in COI (during the bidding phase). But these solutions are not easy to adapt to journals, where papers are typically sent to editors not affiliated with the journal. For conferences, they also do not address subreviewing, where PC members delegate a paper to an outside expert: this expert could end up being in COI with the authors, or be one of the authors, which is especially embarrassing. This is typically handled either by unblinding papers to PC members if they decide to have a paper subreviewed, or by making subreview invitations pass through an unblinded party (e.g., the PC, or a specific organizer not involved in reviewing) who can check for the absence of COI [7].

I hope this post can help clarify some points about double-blind reviewing, and maybe encourage more conferences and journals to consider it!

  1. To put it another way: there are many academic events that are invitation-only, and many public conferences include some invited talks whose speakers have been selected by name. Whether this is OK or not, and what is the right balance, is a different question. But conferences that make an open call for contributions should be clear about whether they commit to treating submissions from everyone in the same way, or whether some authors are more equal than others.

  2. The exceptions are ICDT, which uses the same concept but does not give the name; and FOCS, which does not give specific details about implementation but which I would expect to follow the lead of other conferences. 

  3. I don't know if all reviewers approach their job in the same way, but personally, I'm always worried about misjudging a paper, e.g., being the lone reviewer to advocate for rejection when the other reviewers have better reasons to accept it, or vice versa. (On reviewing platforms I have used, reviewers typically cannot see the other reviews on a submission before inputting their own, thus forcing them to come up with their own independent assessment.) Of course, it doesn't all come down to the grade, but an important question remains as I read the paper: how will I evaluate it in the end? And I observe myself looking for clues that could give away how other reviewers are likely to evaluate it, e.g., the formatting, the writing quality, or indeed the identity of the authors. The same goes for originality: I suspect that a paper that looks very unconventional may end up being rejected just because the reviewers will think that others will not take it seriously (a kind of reverse Schelling point). Unfortunately, I further suspect that this phenomenon is encouraging papers to be more conformist and to avoid standing out, in a kind of herd-like behavior.

  4. Of course, the final published version will often not actually be available to everyone. So it is especially important to encourage authors to post preprints (and postprints!) of their work. 

  5. The specific wording is: "we request that the authors refrain from publicizing and uploading versions of their submitted manuscripts to pre-publication servers, such as arXiv". That said, later, the call for papers grudgingly concedes that papers can still be submitted if there is an online preprint.

  6. If we're trying to streamline the publication process, one more immediate area for improvement would be to fix the duplication of effort between submitting preprints (e.g., on arXiv), and submitting "camera-ready versions" of papers that often have limited page counts and other inessential differences. Or: harmonizing paper stylesheets (e.g., using LIPIcs) and page limits (or removing strict page limits altogether), so as to avoid tedious reformatting work when resubmitting a paper to a different conference. I have spent orders of magnitude more time on busywork of this kind than I have ever spent on the anonymous-vs-nonanonymous question.

  7. About COIs, by the way: I hope we will eventually have a platform to handle these more automatically. Knowing that researchers are often (alas, not always) disambiguated with a unique ORCID identifier, and that there are bibliographic databases (e.g., DBLP for computer science, or Crossref or BASE) with co-authorship information, many COIs nowadays can be detected from public, structured data. This can be refined with affiliation information (sometimes from email addresses, or from ORCID records, or sometimes from DBLP), and supervision information (e.g., the Mathematics Genealogy Project). Sure, there are also COIs arising from close personal friendships or other undetectable causes... but it would be nice if manual handling were limited to these, or if these too could be declared by authors and stored by some trusted third party. Such a system would be useful, much like how the Toronto Paper Matching System (TPMS) (cf. reference) is streamlining the paper bidding phase for large AI conferences.
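
To make the idea in footnote 7 a bit more concrete, here is a minimal sketch of what such an automated check could look like. Everything in it is an assumption made up for illustration: the function name find_cois, the person identifiers (which stand in for ORCID identifiers), and the three inputs (a list of co-authored papers, an affiliation table, and a supervision table) that a real system would have to extract from sources like DBLP or ORCID records. It does not call any real service.

```python
from itertools import product


def find_cois(authors, pc_members, papers, affiliation, advisor):
    """Return {(author, pc_member): sorted list of reasons} for detected COIs.

    All inputs are hypothetical, pre-extracted data:
      papers      -- iterable of sets of person ids (one set per publication)
      affiliation -- dict: person id -> institution name
      advisor     -- dict: person id -> person id of their supervisor
    """
    # Index co-authorship: person id -> set of people they have published with.
    coauthors = {}
    for paper_authors in papers:
        for person in paper_authors:
            coauthors.setdefault(person, set()).update(
                other for other in paper_authors if other != person
            )

    conflicts = {}
    for author, member in product(authors, pc_members):
        reasons = set()
        if author == member:
            reasons.add("same person")
        if member in coauthors.get(author, set()):
            reasons.add("co-authors")
        institution = affiliation.get(author)
        if institution is not None and institution == affiliation.get(member):
            reasons.add("same affiliation")
        if advisor.get(author) == member or advisor.get(member) == author:
            reasons.add("supervision")
        if reasons:
            conflicts[(author, member)] = sorted(reasons)
    return conflicts


# Made-up example data: short names stand in for ORCID identifiers.
papers = [{"alice", "bob"}, {"bob", "carol"}]
affiliation = {"alice": "Univ A", "dave": "Univ A", "carol": "Univ B"}
advisor = {"erin": "carol"}

conflicts = find_cois(
    authors=["alice", "erin"],
    pc_members=["bob", "carol", "dave", "frank"],
    papers=papers,
    affiliation=affiliation,
    advisor=advisor,
)
for pair, reasons in sorted(conflicts.items()):
    print(pair, reasons)
```

A check like this would only catch the COIs that are visible in structured data. Personal friendships and similar cases would still need to be declared by hand, and a real system would probably also want to ignore very old co-authorships, which this sketch does not attempt.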

comments welcome at a3nm<REMOVETHIS>@a3nm.net