The perception of reproducibility in a small cohort of scientists in Europe

Reproducibility is an essential feature of all scientific outcomes. Scientific evidence can only reach its true status as reliable if replicated, but replication studies face an uphill battle to be performed, and little attention has been dedicated to publishing the results of replication attempts. We therefore asked a small cohort of researchers about their attempts to replicate results from other groups, as well as from their own laboratories, and about their general perception of the issues concerning reproducibility in their field. We also asked how they perceive the venues, i.e. journals, available to communicate and discuss the results of these attempts. To this aim, we pre-registered and shared a questionnaire among scientists at diverse career levels. The results indicate that, in general, replication attempts of respondents' own protocols are quite successful (over 80% reported never or rarely having problems with their own protocols). Although the majority of respondents had tried to replicate a study or experiment from other labs (75.4%), the median success score was 3 (on a 1-5 scale), while the median estimate of replication success in their field was 5 (on a 1-10 scale). The majority of respondents (70.2%) also perceive journals as unwelcoming of replication studies. Related Objects: Dataset https://doi.org/10.17605/OSF.IO/T9P42


Introduction
Reproducibility, as a general concept of agreement among experimental outcomes, is a core component of science. General theories about how nature operates, informed by the outcomes of scientific discoveries, can only be appropriately evaluated if such discoveries are confirmed through replication. In a narrower definition, reproducibility refers to precisely obtaining the same result under the same conditions, and the term is usually applied in computer science [1], while replicability refers to obtaining similar (or the same) results by repeating the research procedures [2]. However, reproducibility and replicability go beyond the common idea of repeating experiments. As argued by Nosek and Errington, the purpose of replication is to advance a theory by confronting existing understanding with new evidence [3]. In this sense, communicating the outcomes of replication attempts is essential to allow comparisons and discussions about the generalizability of a theory, hypothesis or model. Our current model of scholarly communication relies heavily on the peer review of articles published in a journal-based system. However, this same system of incentives promotes a relentless pursuit of novelty, in which scientists are pushed to pursue and publish new and impactful results. This scenario creates an unfriendly environment for attempts to replicate previous results, since scientists, institutions and journals depend on and feed the current system. The consequences are observed as reproducibility issues in several fields, such as psychology [4], cancer biology [5,6], functional magnetic resonance imaging [7] and biomarkers in psychiatry [8], to cite some of the most evident.
Contributing to this scenario of little incentive to promote reproducibility, journals are not clear in their policies regarding replication studies. For instance, in neuroscience only 6.6% of journals (31/465) explicitly state whether they accept submissions of replication studies [9]. In psychology this number is even lower: 3% of journals (33/1151) explicitly accept submissions of replication studies [10]. The outcome of this lack of incentives and information is a publication bias in favor of novel findings, creating another barrier to accessing the results of replication studies.

Methods
We asked scientists about their perception of reproducibility issues in their field. We pre-registered a questionnaire on the Open Science Framework platform (see Data Availability section). The questions were shared on social media platforms, such as the Twitter account of the Journal for Reproducibility in Neuroscience (@jrepneurosci), and directly through institutional email lists. The respondents were anonymous and the results were stored on the same platform. Since this was an anonymous survey and no data related to the participants was collected or stored, no approval by an ethics committee was required. The survey was distributed globally but, given the number of responses per geographical region, we opted to compile only those from Europe in the present report, as Europe accounts for the largest number of responses.

Have you ever had problems replicating protocols from your own lab/group?
As seen in Figure 1A, the majority of respondents did not report major issues replicating protocols from their own laboratories: 57.9% rarely had problems and 22.8% had no problems, while 19.3% reported frequent problems replicating their own protocols.

Have you ever attempted to replicate a study or single experiment published by another group in your field?
The majority of respondents reported attempting to replicate studies or experiments from different research groups: 68.4% attempted more than once, 7% only once, while 24.6% never attempted to replicate a study from the literature, Figure 1C.

How was your replication attempt of results from other groups?
The respondents were asked to score the success of their replication attempts of studies from the literature on a scale from 1 (unsuccessful) to 5 (successful). The results indicated a median of 3 (Q1= 2, Q3= 4), with none of the answers scoring 5 (Mean/SD= 2.84/0.97 of the 43 respondents who attempted replications), as seen in Figure 1B. No difference was found between the major groups [PIs, PDR and PhD students; one-way ANOVA F(2,37)= 0.2596, p= 0.7728]; however, different interpretations of "success" between groups, due to different levels of experience and understanding of the field, need to be taken into consideration.
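As an illustration of how summary statistics and a group comparison of this kind can be computed, the sketch below uses NumPy and SciPy on hypothetical response vectors. The group labels and scores here are invented for demonstration only; the actual per-respondent data are in the OSF dataset linked in the Data Availability section.

```python
import numpy as np
from scipy import stats

def summarize(scores):
    """Median, quartiles, mean and sample SD for a vector of Likert-style scores."""
    a = np.asarray(scores, dtype=float)
    return {
        "median": float(np.median(a)),
        "q1": float(np.percentile(a, 25)),
        "q3": float(np.percentile(a, 75)),
        "mean": round(float(a.mean()), 2),
        "sd": round(float(a.std(ddof=1)), 2),  # ddof=1 -> sample standard deviation
    }

# Hypothetical 1-5 success scores split by career stage (NOT the survey data)
groups = {
    "PI":  [2, 3, 3, 4, 2, 3],
    "PDR": [3, 2, 4, 3, 3],
    "PhD": [2, 3, 4, 2, 3, 4],
}

# Pooled descriptive statistics, analogous to the median/Q1/Q3 reported above
pooled = [s for g in groups.values() for s in g]
print(summarize(pooled))

# One-way ANOVA across the three groups, the test reported in the text
f_stat, p_val = stats.f_oneway(*groups.values())
print(f"F = {f_stat:.4f}, p = {p_val:.4f}")
```

With ordinal Likert-style data a non-parametric alternative such as a Kruskal-Wallis test (`stats.kruskal`) could also be considered; the ANOVA here simply mirrors the test reported in the text.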

What is your estimation of the replication success in your field?
The respondents were asked to estimate the success of replication attempts in their field from 1 (very bad) to 10 (very good), and the results indicated a median of 5 (Q1= 3.5; Q3= 6; Mean/SD= 5.00/1.75, n=57), Figure 1D, with none of the answers scoring 9 or 10. Again, no difference was observed between the major groups [F(2,49)= 1.8986, p= 0.1611]. As in the previous question, any possible difference in the interpretation of "replication success" between groups could not be detected in this questionnaire, so this lack of inter-group difference needs to be considered with caution.

Have you ever tried to PUBLISH the replication of a study or experiment?
As seen in Figure 2A, the majority of respondents never attempted to publish the results of their replication studies (77.2%), or were unsuccessful when doing so (7.0%), while 15.8% succeeded in the process.

Do you think your field suffers from reproducibility issues?
The majority of the respondents see a 'crisis' in their field of work (56.1%), with 36.8% manifesting concerns although not perceiving a 'crisis'; 7% of the participants do not see reproducibility issues in their field of study, as seen in Figure 2B.

What is your perception about journals' policy for results of replication attempts?
The results indicate that respondents do not see journals as welcoming venues for publishing the results of replication attempts, with 70.2% answering that journals are not friendly toward publishing such results. Only a small number of respondents (3.5%) answered that journals accept the results of replication attempts even when contradictory, and 1.8% see journals as welcoming only confirmatory results of replication studies. Surprisingly, almost a quarter of the respondents (24.6%) did not know about journals' policies (Figure 2C).

Data Availability
The questions used in the survey and the obtained dataset are available on OSF (https://osf.io/) under the DOI: 10.17605/OSF.IO/T9P42. All material is available under a CC BY license.

Discussion
In the present study we evaluated the responses of a small cohort of scientists in Europe to questions related to the reproducibility of experiments. The survey was shared via institutional mailing lists and social media, leading to a potential exclusion of scientists not present on social media and, therefore, a sampling bias. Statistics in the literature suggest that the majority of scientists present on social media are between 21 and 49 years old [11]. Thus, it is likely that the answers to this survey under-represent more "senior" scientists, which fits our demographic data, with PhD students alone representing 45.6% of respondents.
The questionnaire was clearly advertised as being about "reproducibility", so it is safe to assume that only people familiar with the topic answered the survey. However, since this questionnaire was not meant to assess the percentage of scientists replicating experiments but rather their perception of the topic, this potential bias should not represent a major limitation. We observed that although the majority of respondents did not report major issues replicating their own protocols, the replication of studies from the literature is less straightforward. This discrepancy between intra- and inter-group reproducibility might be explained by differences in sources and in the level of methodological detail. When trying to replicate results from the literature, one often has only the final article or, in very rare cases, a study protocol [12]. However, when trying to replicate results from the same lab, one usually has access to protocols, lab notebooks with notes and observations, often the exact same setup and same batch of reagents, and, importantly, sometimes the expertise and knowledge of the researcher who obtained those results in the first place. It is important to consider that we did not address the frequency of replications of lab protocols or studies from the literature, which may limit our conclusions about the success of these attempts.
Hypothesizing a best-case scenario where every published result is genuinely produced by honest and well-conducted experiments, the inability to fully replicate these results in different laboratories has two possible implications. First, the originally published results are weak and need a highly specific and narrow context to occur, depending on variables that the authors themselves might not be aware of. This first implication raises questions about the actual relevance of these published results: if an observation requires unique conditions to be observable, it has little relevance for proposing a generalizable theory behind a given scientific phenomenon.
The second implication is that the low rate of replicability derives from poor description of the materials and methods used. One important limitation to a satisfactory protocol report is the word limit often imposed by scientific journals. Although this limit helps prevent long and wordy articles, there is no real reason not to exempt the methods section from it, which would allow a detailed description of the methodologies used, especially in an online environment. It is important to highlight that these two implications are not mutually exclusive; they can both be true. However, while there is little we can do regarding the first scenario, there is a lot that can and must be done regarding the transparency and completeness of materials and methods descriptions.
Nosek and Errington highlight that an exact (the authors call it 'direct') replication of a study is often very useful in the case of results with weak predictability (which they define as 'immature') [3]. When a theory is immature, it can be quite difficult to predict under which conditions the observed results would re-occur and when they would not, because of the lack of a deep theoretical understanding of the phenomenon. In these cases, being able to make replications as close to the original experiment as possible can be crucial for understanding how generalizable specific results are, by identifying the minimum variables necessary to observe said results. Hence the critical importance of detailed and meticulous reporting of protocols in promoting, and even enabling, replication studies.
Publication and registration of detailed protocols via protocol repositories such as Protocols.io [13] or Nature Protocols (ISSN:1750-2799) can certainly be expected to make direct replications much easier. By providing a detailed description of procedures and of the variables to take into account, they might significantly improve the reproducibility rate of results, while also working as a deterrent against questionable research practices by the authors.
However, if a set of results requires unique conditions to occur, even the most detailed protocol won't be enough and every attempted replication will fail. And that is fine. Failed replications are as valid as successful ones in investigating the solidity of a scientific claim [3]. Of course, the very definition of a 'successful replication' is not easy to pin down. For instance, a rigorous, well-conducted experiment may not reach the same outcome as the original assay; in this sense, it is ambiguous whether the result constitutes a successful or a failed replication. Different scientists might interpret the definition of "successful replication" in different ways (successful replication of the protocol regardless of outcome, successful replication of the results, etc.); therefore, caution is needed when interpreting the results of our survey.
An additional fundamental strategy to improve reproducibility is to increase and standardize data sharing. Although many scientists have declared themselves more than willing to share data if provided with appropriate platforms [14][15][16], when this goodwill was put to the test the results showed a very poor outcome, with the majority of investigators denying a request to share their data [15]. Implementing mandatory data sharing as part of the submission process to scientific journals would not only allow better and easier replications but would boost research as a whole, contributing to a "symbiotic research" [17] in which researchers can build on each other's data.
The answers compiled in Figure 1C suggest that the problem with reproducibility is not necessarily the lack of replication attempts, or at least not only that. Although it is hard to estimate what an 'ideal number' of attempts would be, our results suggest that a large part of the problem lies in the lack of visibility for the results of replication attempts. The majority of respondents (75.4%) attempted to replicate a study from the literature at least once, but most of them never even tried to publish the outcomes. The answer might be at least partially suggested by the data in Figure 2C, according to which more than 70% of respondents think that journals do not welcome submissions of replication attempts. And they might be right. Or at least, they might not be 'too wrong'. As of 2017, according to Yeung's study of a sample of 465 neuroscience journals, only 6% explicitly stated that they welcomed replication studies, and 0.6% explicitly stated that they reject them.
The remaining 93.3% simply did not state their position on the matter, with a small percentage (8.6%) implicitly discouraging the submission of replication studies [9]. Although stating that "journals do not welcome replication papers" might be an unjust generalization, it is also true that, given the lack of a clear position from most journals, focusing on investigating and publishing solely novel results is clearly the safest choice for authors. This lack of information contributes to a publication bias in favor of novel results. Having journals clearly state their positions on publishing replication studies would not only make the outcome less obscure for the majority of researchers but would also promote a normalization of publishing such results in mainstream science journals.
Recently, several approaches have been developed to overcome the novelty-focused publishing system. The most direct is posting preprints [18,19] and preregistering the attempts through registered reports [20][21]. Registered reports are detailed descriptions of the experiments and analyses to be conducted, submitted before data collection, which are peer-reviewed and 'accepted in principle' by the journal. However, to date fewer than 300 journals across all fields of science accept registered reports [22]. Both preprints and preregistrations can tackle the publication bias against negative or contradictory results. However, the lack of peer review of preprints might be seen as a limitation, as peer review provides an additional level of rigor-checking, especially in cases of direct replications. Micropublications (ISSN:2578-9430) can represent a valid compromise, allowing independent, non-novelty-focused publications without renouncing a peer review phase, in a simplified and speedy process. Special issues of existing journals allow, from time to time, the publication of replication studies in journals that would not normally welcome them. This, however, might promote a vision of replication studies as something "exceptional", out of the ordinary, an attitude that may actually be contributing to the so-called "reproducibility crisis". Lastly, the appearance of new journals entirely dedicated to publishing replication studies [23] may hopefully give visibility to all the replications made but never considered for sharing. The most likely scenario is that improving the communication of reproducibility results is not a 'one-size-fits-all' solution, and different but complementary approaches must be tested, paving the way to more reliable science.

Declarations
Within the scientific community, many are aware of issues with data reproducibility within and outside of laboratories. However, the current publication system biases scientists toward reporting novel experimental findings and makes it challenging to publish replication studies. The present study explored the perception of reproducibility among scientists, primarily in Europe, by sharing a questionnaire through social media platforms. Most scientists reported rarely having issues replicating data within their own laboratory. Many have attempted to replicate data from other laboratories, with moderate success. The majority of those surveyed have not attempted to publish a replication of a study and are not aware of journals' policies on publishing replications. Overall, the majority of those surveyed agree that there are issues with data reproducibility in science.
The main weakness of this article is the small sample size, which resulted in a lack of diversity among respondents; that is, respondents were mainly from European institutions. Thus, the generalizability of this article's findings to other countries would be interesting to investigate in the future. There are other matters that the present paper could improve upon, such as administering a more detailed questionnaire to determine, for example, reproducibility issues in different scientific fields. Nevertheless, these limitations do not diminish the value of the present study but rather demonstrate the interesting avenues that can be addressed in future studies.
This article does a great job of describing the importance of reproducibility in science, for example for understanding the generalizability of theories. I particularly enjoyed the fact that the authors discussed why reproducibility issues may arise and provided suggestions for how such issues could be overcome. For instance, the details of experimental procedures are often limited in journal articles, which can pose a challenge for researchers from other groups trying to replicate methods and results. The simple act of registering detailed protocols could aid in enhancing data reproducibility. Many scientists are familiar with the issues of reproducibility and of publishing replication studies in our respective fields. At times it seems these issues have been normalized. This article makes the case that these problems need to be addressed and can be fixed.