I am a strong proponent of signed reviews, the practice of
reviewers disclosing their identities by including their names in their peer
reviews. I have explained my reasoning in a previous
blog post and editorial when discussing open
science practices at Emerging Adulthood.
It is my personal opinion that any potential costs associated with the practice
are outweighed by the benefits afforded through openness and transparency.
People seem quick to cite the potential for retaliation, especially for early
career scholars and those from under-represented communities. I am honestly
confused by this response because it a) assumes the worst of our colleagues and
b) as Hilda Bastian nicely articulated, if there are indeed instances of retaliation,
then as a community we should address the perpetrators directly rather than
cower in fear. Moreover, masked review does not prevent retaliation: authors
may be certain they know who the reviewers are, even when they are wrong, and
may treat those presumed reviewers accordingly. Signing reviews brings all of
these dynamics into the open.
All of that said, my feelings and beliefs are likely to be
of little interest to most people. Rather, what we all want are some data that
speak to the issue. There have been some studies on the topic, but they are all
limited in some way. For example, van
Sambeek and Lakens (2020) found that reviewers were more likely to sign their
reviews when making positive recommendations (accept/minor revisions) compared
with negative recommendations. The majority of the data, however, came from
manuscripts that were ultimately published, and thus their analyses were based
on a biased sample. Moreover, they only examined the recommendation made by the
editor or the reviewers, and not the content of the reviews. In a self-report
survey study, Lynam et al. (2019)
found that respondents believed that signing their reviews would lead to
reviews that were less critical and more civil, and that took longer to
complete, all with rather small effects.
These studies collectively rely on self-report, a biased
sample of reviews, or incomplete information about the content of the reviews.
To address these limitations, I dug into my own archives to analyze the content
of my reviews. Since my first peer review in 2008, I have kept a Word file of
every review I have completed. I am sure that I am missing some, but any
missing reviews would not be due to any kind of systematic bias, but rather to
a simple lack of organization. While there are of course many limitations to
analyzing reviews from a single person, there are also many benefits. Indeed,
this analysis is a direct within-person behavioral test of the patterns
reported by Lynam et al.
My archive includes 203 reviews completed between 2008 and
2019. I started signing my reviews in 2015, which resulted in 147 unsigned
reviews and 56 signed reviews. I have fewer signed reviews because around the
time I started signing, I also became an Associate Editor and then
Editor, and thus conducted fewer ad-hoc reviews for other
journals. My editorial letters are not included in this analysis, as those are
a different beast altogether.
I ran my reviews through the Linguistic Inquiry and Word Count (LIWC)
program to analyze their content, and whether that content changed
as a function of whether or not I signed them. My version of LIWC is on
the older side, so it is based on the 2007 dictionary. The data file includes
everything from the dictionary, although clearly some features are more
relevant than others, so I do not describe all of the data here. I only examine
four features of the reviews that I thought were most relevant: word count,
positive emotion words, negative emotion words, and cognitive mechanism words.
The full LIWC results, R code, and the LIWC 2007 dictionary are freely
available at osf.io/uf63k/. The raw reviews
are not included in that data file since some people might feel that is a
violation of privacy and the peer review process, but I have an identical file
that includes the text of the reviews, which I am happy to send upon request.
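For readers who want to poke at the OSF file themselves, here is a minimal sketch of how the LIWC output could be loaded and trimmed to those four features. The file name and the year/signed columns are placeholders I am assuming for illustration; the actual variable names are in the data and R code at osf.io/uf63k/.

```r
# Minimal sketch: load the LIWC 2007 output and keep the four features of interest.
# "liwc_reviews.csv", "year", and "signed" are hypothetical names for illustration;
# WC, posemo, negemo, and cogmech are the standard LIWC 2007 category labels.
liwc <- read.csv("liwc_reviews.csv")
reviews <- liwc[, c("year", "signed", "WC", "posemo", "negemo", "cogmech")]
```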
I was interested in two questions:
Did signing my
reviews lead to a change in the content of the reviews?
The answer is a pretty clear no. There were no differences
between unsigned and signed reviews in terms of the length of my reviews (t = 0.41, p = .68), positive emotion words (t = -0.96, p = .33), negative
emotion words (t = 1.07, p = .29), or cognitive mechanism words (t = -0.49, p = .62). These non-differences are clear in the violin plots:
It is worth noting, I think, that the frequency of positive emotion
words is higher than the frequency of negative emotion words.
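For the curious, the unsigned versus signed comparisons amount to a handful of two-sample t-tests. Here is a rough sketch, assuming the hypothetical reviews data frame from above with its signed indicator column; the analysis I actually ran is in the R code on the OSF page.

```r
# Unsigned vs. signed comparisons (R's t.test defaults to Welch two-sample t-tests).
t.test(WC      ~ signed, data = reviews)  # review length
t.test(posemo  ~ signed, data = reviews)  # positive emotion words
t.test(negemo  ~ signed, data = reviews)  # negative emotion words
t.test(cogmech ~ signed, data = reviews)  # cognitive mechanism words
```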
Signing my reviews is confounded with time; I started
signing at a specific time and have signed almost every review since, with the
exception of a couple that I did not sign because I co-reviewed them with a student. Thus,
in many ways unsigned/signed is a binary version of a potentially
interesting continuous variable: time.
Did the content of my
reviews change over time?
The answer here is also a pretty clear no, for the most
part. There was a negative correlation between time and word count (r = -.22, p = .002), indicating that my reviews have gotten a little shorter over
time. This is consistent with the oft-quoted remark that review length is negatively
correlated with time in the field. However, looking at the scatterplot below
shows that the association is quite noisy. The correlations between time and positive
emotion words (r = .05, p = .45), negative emotion words (r = .002, p = .98), and cognitive mechanism words (r = .05, p = .48) were
all very small.
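The time analyses are equally simple. A sketch, again assuming the hypothetical reviews data frame outlined earlier with year as the time variable (the OSF code has the real details):

```r
# Correlations between time and each LIWC feature (Pearson, R's default).
cor.test(reviews$year, reviews$WC)       # review length over time
cor.test(reviews$year, reviews$posemo)   # positive emotion words
cor.test(reviews$year, reviews$negemo)   # negative emotion words
cor.test(reviews$year, reviews$cogmech)  # cognitive mechanism words
```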
So there you have it. Signing my reviews did not seem to
change the content much, at least with respect to these few indicators that I
examined. Yes, I could have analyzed more data, or analyzed these data more
appropriately, but I did this work while avoiding several other more pressing
tasks, so I was not looking to put in maximal effort. The data are available, dear
reader, so have at it!