Why aren’t articles on arXiv, or any other open access repository, formally credited as publications? What exactly separates open access repositories from publishers? The simple answer is that publications in journals come with an amorphous quality indicator associated with the journal’s perceived prestige. Articles posted on a repository, on the other hand, are considered to be “provided at the reader’s own risk”, as they are not accompanied by any measurable guarantee of their scientific merit. We think the time has come to change all that.
With the support of OpenAIRE, Open Scholar will coordinate a consortium of five partners to develop the first Open Peer Review Module (OPRM) for open access repositories. The partners that will deliver this work are:
1. The institutional repository of the Spanish National Research Council (DIGITAL.CSIC)
2. The repository of the Spanish Oceanographic Institute (e-IEO)
3. The Artificial Intelligence Research Institute (IIIA) in Catalonia
4. The Multidisciplinary Laboratory of Library and Computer Sciences (SECABA) in Granada, and
5. A company providing professional DSpace development and services (ARVO)
Our project envisions the gradual conversion of existing open access repositories into fully functional evaluation platforms that will provide the capacity needed to return research quality control to the community of scholars and help bridge the gap between academic institutions and publishers. The OPRM will initially be developed as a plugin for repositories using the DSpace software package, but will be designed in a way that facilitates subsequent adaptation to other repository software suites such as EPrints and Invenio.
The installation of the OPRM on an institutional or other open access repository will enable the formal review of any digital research work hosted in this repository —including data, software code and monographs— by an unlimited number of peers. Reviews of this digital content will consist of a qualitative assessment in the form of text, and quantitative measures that will be used to build the work’s reputation. Importantly, this evaluation system will be open and transparent. Open means that the full text of the reviews will be publicly available along with the original research work. Transparent means that the identity of the reviewers will be disclosed to the authors and to the public.
The OPRM will also incorporate a reviewer reputation system based on the assessment of the reviews themselves, both by the community of users and by other reviewers. This will allow the weight of each review in the overall assessment of a research work to be scaled according to the reputation of the reviewer. The complex issue of creating reliable reputation metrics for research works, authors, reviews and reviewers will be tackled by the combined expertise of two prominent research groups with ample experience in opinion-based reputation modelling (IIIA) and group decision making with non-homogeneous experts (SECABA). Both approaches to reputation assessment, a) probabilistic modelling of opinions and b) group decision making, assume that peer reviewers do not all have the same confidence and expertise in the topic on which they offer their opinion. The aggregation of reviews must therefore take this heterogeneity into account, and greater expertise must weigh more heavily in the global aggregation. The reputation assessment model will thus be based on the concept that the reputation of the opinion source affects the reliability of the opinion itself.
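To make the idea of reputation-weighted aggregation concrete, here is a minimal sketch in Python. It is only an illustration, not the OPRM model: the `Review` class, the 1 to 5 score scale, the reputation range of 0 to 1 and the plain weighted mean are all assumptions chosen for simplicity.

```python
from dataclasses import dataclass


@dataclass
class Review:
    score: float       # quantitative rating on a hypothetical 1-5 scale
    reputation: float  # reviewer reputation in [0, 1] (assumed range)


def weighted_score(reviews):
    """Aggregate scores so that higher-reputation reviewers weigh more."""
    total_weight = sum(r.reputation for r in reviews)
    if total_weight == 0:
        raise ValueError("at least one review must carry non-zero reputation")
    return sum(r.score * r.reputation for r in reviews) / total_weight


# One highly reputed reviewer rates the work 5, one low-reputation reviewer rates it 2.
reviews = [Review(score=5.0, reputation=0.9), Review(score=2.0, reputation=0.1)]
print(weighted_score(reviews))  # about 4.7, versus an unweighted mean of 3.5
```

In this toy case the aggregate stays close to the opinion of the more reputed reviewer, which is the behaviour the paragraph above describes.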
In addition, the model will be flexible about its opinion sources: it will use both explicit opinions (offered by peers in the form of formal reviews) and, in situations where explicit opinions are sparse, implicit opinions that can be extracted from user behaviour (such as indirect quality indicators encoded in the number of visits and downloads). This will partly address the cold-start issue caused by the delay before expert reviews start to accumulate in the system. Furthermore, our reputation model will include consensus measures to further strengthen the validity of aggregation outcomes. This means that greater consensus on the evaluation of a research work will count positively for its reputation: five reviewers agreeing that a paper is “good” is different from two saying it is “poor”, two “excellent” and one “good”, even though the aggregated average is the same.
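The five-reviewer example can be checked with a short Python sketch. The three-point numeric scale and the consensus formula (one minus the normalised standard deviation) are assumptions made for illustration only; the consensus measures the project will actually adopt are not specified here.

```python
from statistics import mean, pstdev

# Hypothetical three-point scale; the OPRM's actual rating scale may differ.
SCALE = {"poor": 1, "good": 2, "excellent": 3}


def average(labels):
    """Plain average of the numeric scores behind the labels."""
    return mean(SCALE[label] for label in labels)


def consensus(labels):
    """Return a value in [0, 1]: 1 means unanimous, 0 means maximally split."""
    scores = [SCALE[label] for label in labels]
    # On a bounded scale the population standard deviation is at most half the range.
    max_spread = (max(SCALE.values()) - min(SCALE.values())) / 2
    return 1 - pstdev(scores) / max_spread


unanimous = ["good"] * 5
split = ["poor", "poor", "excellent", "excellent", "good"]

print(average(unanimous), average(split))      # both averages equal 2
print(consensus(unanimous), consensus(split))  # 1.0 versus roughly 0.11
```

Both sets of reviews have the same average, but only the unanimous one scores high on consensus, so a model that rewards consensus can tell them apart.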
The OPRM will initially be tested on the two repositories that form our consortium:
- e-IEO, which is the institutional open access repository of the Spanish Institute of Oceanography, a public research organization with 10 research centres and an annual budget of 56 million euros, and
- DIGITAL.CSIC, which is the largest institutional repository of a research performing organization in Spain (the Spanish National Research Council, CSIC) and ranks 5th in the European classification in the latest edition of the Ranking Web of Repositories.
The potential impact of this testing is high, given the sheer volume and diversity of research output types available on the platform, its multidisciplinarity, and the repository’s track record in enriching its infrastructure with value-added services that measure research impact along traditional and emerging lines. In fact, with over 110,000 works, DIGITAL.CSIC offers an ideal platform to test the OPRM prototype on a wide digital collection comprising publications, grey literature, datasets, software code, conference objects, working papers and reports, policy documents, theses, blog postings, preprint articles and many more. Further, this variety of research outputs spans 8 broad scientific areas ranging from hard sciences to social sciences and humanities, which will allow us to experiment with the emerging review approaches in very diverse disciplines and thus identify likely discipline-specific behaviour and community patterns.
All the code generated in this project will be made available in public code repositories (GitHub), documented, made configurable enough that others can adapt it to their own setups, and developed following accepted best practices for code contributions in open source projects. We strongly advocate releasing the code under the same license as that used for the general DSpace code, i.e. a BSD License, explained here: http://owl.li/NPB6V. It is important to note that the code will not include third-party software, libraries or code dependencies that are incompatible with this license.
In summary, the OPRM aims to capitalise on the existing infrastructure offered by open access repositories and to enable their conversion into fully functional evaluation platforms with associated quality metrics. This repository-based evaluation process can run in parallel to traditional journal peer review and will:
- enable the peer review of any research work deposited in a repository, including data, code and monographs
- provide novel metrics for the quantitative assessment of research quality
- create a sophisticated reputation system for reviewers
- allow the weighting of reviews based on the quality of previous reviewer contributions
- facilitate the selection of relevant content from digital repositories by distinguishing material that has been validated by reviewers using tags and advanced search filters
- engage the research community in an open and transparent dialogue over the soundness and usefulness of research material
While the OPRM is expected to be ready in December 2015, we already invite interested repositories to contact us to discuss the prospect of future collaboration.