Thursday, May 7, 2020

Managing Preprints across the Publication Pipeline


(Note. This post was updated 05/10/20 to include the section, Be clear about the publication status)

The @PsyArXivBot Twitter account tweets out preprints posted to PsyArXiv (https://psyarxiv.com/), an OSF-hosted preprint server focused on psychological science. The bot was quiet for about 10 days in mid-April, and manual searches on the PsyArXiv site were not turning up new preprints. Something was wrong with the system, which was disruptive for a project I am working on, but more than that it brought a simple fact into stark relief: I have become dependent on preprints.

I felt like I was being shut out from learning about the latest research. In recent years I have switched from primarily learning about new articles via journal table of contents alerts to primarily learning about new articles via preprint servers. A large majority of papers of interest that I see in journals, or that colleagues send to me, I had read many months earlier as preprints. At the same time, for the last several years I have posted just about all of my own papers as preprints, and have (slowly) been posting older articles of mine on preprint servers. Preprints have really changed the way I interact with the scientific literature, and more simply, how I practice science.

But I realize that is not the case for everybody. Some of you reading this may not even know what I mean by this word preprint. Others don’t really know how to use them, as authors or as readers. Still others think they know how to use them, but don’t really follow best practices. This post is intended to be useful for all comers.

Preprints have become increasingly popular across the sciences (Abdill & Blekhman, 2019; Lin, 2018), and the COVID-19 pandemic has not only resulted in an uptick in use, but also heightened discussion of their benefits and potential liabilities (see this Nature Biotechnology editorial). There are many useful articles that elaborate on the myriad reasons why researchers should post preprints (e.g., Bourne et al., 2017; Publons, 2018; Sarabipour et al., 2019; Soderberg et al., 2020; Speidel, 2018). Accordingly, I will not rehash the motivational arguments. Rather, my intention here is to provide some practical advice on how to manage preprints across the publication pipeline. I focus on PsyArXiv, but most of my suggestions apply equally to other preprint servers.

First, a note on terminology. The term “preprints” traditionally referred to versions of manuscripts that were publicly posted or circulated prior to submission to a journal for peer review, with the primary purpose being to receive helpful comments and catch errors prior to submission. Preprints can also refer to manuscripts that are currently under review, and sometimes to author-formatted versions of manuscripts that have been accepted for publication. This latter category is more accurately referred to as “post-prints” or more colloquially as an “open-access (OA) version,” because preprint servers are not protected by paywalls. These technicalities aside, preprints have come to mean any document that is posted on a preprint server, and thus the term alone does not tell you much about the status of the paper (i.e., draft, under review, accepted). These variations also suggest that authors can make use of preprints across the publication pipeline.  

Posting a Preprint Prior to Submission to a Journal

Posting a preprint and circulating it for comment is a great way to receive critical feedback prior to submission. There is some debate about what shape the manuscript should be in when posted for feedback. Some researchers post complete drafts that have not been read by anyone outside of the authorship team, whereas others wait until they have a solid version that has been vetted by trusted colleagues prior to sending it out to the world. Researchers vary not only in these practices but also in their endorsement of prescriptive norms about what others ought to do.

I recommend that you block out that noise; when to post a preprint is up to individual preferences and may depend on the paper in question. Do whatever feels right to you and works best for your workflow. Regardless of your view, if you post a preprint you should be ready and willing to receive comments, including critical ones. That said, the vast majority of preprints are posted silently in the night, never to be thought about again.

A common worry about posting a preprint prior to and/or during submission to a journal is that a journal will consider the preprint as a “published” version of the paper and thus ineligible for submission. For the majority of journals this is not the case. However, before posting a preprint you should absolutely consult Sherpa/RoMEO, which has a simple journal search function to determine what is permissible.

Posting a Preprint Upon Submission to a Journal

An alternative approach is to post a preprint version of your manuscript at the same time as you submit to a journal. PsyArXiv now has a pilot program where authors can submit their preprints to certain APA journals, thereby integrating the posting and submitting steps.

Posting a preprint upon submission to a journal can still yield helpful comments and feedback that can be integrated during the revision process, but posting at this phase is a bit more focused on dissemination. The journal review process is slow, and posting the preprint version is one way to get work out more quickly. Again, you would be wise to consult Sherpa/RoMEO to see what is permissible by the journal.

Posting a Preprint (Post-print) Upon Acceptance to a Journal

Posting versions of accepted articles has many benefits, most notably quicker dissemination and wider access. Be sure to check Sherpa/RoMEO before doing so, but just about every journal permits posting of author-formatted versions of articles (i.e., the final version created using your word-processing software), whereas very few will allow public posting of publisher-formatted versions (i.e., the final typeset version).

Uploading a post-print version of a book chapter is an excellent way to increase access. Book chapters are notoriously difficult to access, which is one of the reasons they are often maligned, but in my view the unconstrained format often leads to more interesting and stimulating reads. Again, you will want to check the contract to be sure that you are legally permitted to post an author-formatted version, but I have found that in nearly all cases it is allowed. So post your chapters and let your wild ideas run free!

Suggestions for Managing Preprints across the Publication Pipeline

Keep it current – Regardless of when you first post a preprint, it is good practice to post a new version with every substantive update: as new drafts are prepared, following a revision resubmitted to a journal, upon acceptance, etc. Preprints are easily editable, so you can be sure that the most current version is always available. Moreover, preprint servers have version control built in, so the earlier versions remain available and preserve the history of the paper.

Put the date on it – The version date of the manuscript should always be included on the cover page. This is especially important if you will be updating the preprint across the publication pipeline. One potential downside of updating preprints as the paper develops is that there could be multiple versions floating out there in the world. Someone sees a preprint, downloads and saves it to their reference library (side note, use Zotero), and does not keep up with how the paper has been updated. There is little that authors can do to control this problem, but including the version date on the cover page would allow readers to know what version they are working with.

Add the DOI for the published version – Preprints are assigned their own DOI (Digital Object Identifier) upon posting, and this DOI is maintained across updates. Once the article is published, however, a new DOI will be assigned by the publisher and associated with the published version. PsyArXiv includes a field labeled “Peer-reviewed Publication DOI” where this information can be entered. Taking this small step once the paper is published helps to keep the research record organized.

Be clear about the publication status – There is much hand-wringing about the possibility of the media reporting on preprints that have yet to be peer-reviewed. Although I agree this could be a problem, the argument often implicitly suggests that peer-reviewed research can be trusted, which is clearly not the case. Nevertheless, it is good practice to have a clear statement about the publication status on the title page of the preprint. This can be changed at the same time as uploading an updated version, changing the date, etc. "Not yet submitted for publication," "Submitted for publication," "Resubmitted for publication," and "Published" (including the full citation) are simple statements that increase transparency about the status of the work.

Consider the preprint file name – I have come across preprints on PsyArXiv with file names such as “final draft.docx” or “dreaded.paper.FINAL.ACTUAL.REAL-ms-kcm-comments-FINAL.pdf.” These are not good file names. In general, you and your lab should have clear file naming conventions for all files. For preprints, this could include the first author last name, version date, and something brief but informative about the substance of the paper (e.g., Syed&Fish,2018-EriksonRace-05.07.20.pdf). Not long ago each version of the preprint on PsyArXiv was required to have the same file name, but this is no longer the case, so you should definitely embed the date in the file name.
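
If you like to automate this sort of thing, here is a minimal Python sketch of one way to build such a file name. The function name, its arguments, and the exact pattern are my own illustration, not anything PsyArXiv requires; I am assuming the "2018" in the example above is the year of the paper and that the version date is written as month.day.year.

```python
from datetime import date
import re


def preprint_filename(authors, paper_year, short_title, version_date, ext="pdf"):
    """Build a file name such as 'Syed&Fish,2018-EriksonRace-05.07.20.pdf'.

    This is a hypothetical helper: the pattern simply mirrors the example
    in the text and is not an official naming standard.
    """
    # Join author last names with '&', as in the example.
    author_part = "&".join(name.strip() for name in authors)
    # Drop spaces and punctuation so the short title stays compact.
    title_part = re.sub(r"[^A-Za-z0-9]", "", short_title)
    # Version date as MM.DD.YY (an assumed reading of '05.07.20').
    date_part = version_date.strftime("%m.%d.%y")
    return f"{author_part},{paper_year}-{title_part}-{date_part}.{ext}"


name = preprint_filename(["Syed", "Fish"], 2018, "Erikson Race", date(2020, 5, 7))
print(name)
```

Passing a new version date each time you update the preprint gives every version its own clearly dated file name.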

Merge citations in Google Scholar – One of the advantages of preprints is getting the work out more quickly so that it can be of use to other scholars. This means that it could potentially be cited more quickly initially and gather more citations in the long run (Fraser et al., 2019), which can be of particular interest to graduate students and early career researchers. However, this also means that you will potentially have two sets of citations, one for the preprint and one for the published version. Thankfully this is easily managed. PsyArXiv is indexed by Google Scholar, so both versions should appear in your profile. Follow this guide to merge the two citations together.

Make a #prettypreprint – Some authors do not enjoy the unsightly appearance of a document created using default settings in word-processing software. Brenton Wiernik has created a set of templates for typesetting preprints to make them look similar to the layout used by major publishers (https://osf.io/hsv6a/). This is particularly useful for articles that have been accepted for publication, as the author-formatted version will look very similar to the publisher-formatted version, but will not violate copyright law. To be honest, I have been too lazy to do this myself, but you should definitely give it a try. If you do, be sure to cite Brenton for his work (the OSF page has a DOI) and include the #prettypreprint hashtag if sharing the paper on Twitter.

There are of course many other issues to consider, but following these suggestions will help you make effective use of preprints. Go post a paper RIGHT NOW!