Crossref Similarity Check news: iThenticate v2.0 ready for launch
Last year, we announced the upcoming launch of a new version of iThenticate, the product from Turnitin that powers Crossref Similarity Check. We know some of you have been waiting a long time for this upgrade and we are very happy to share with you that we are now ready to release it.
We will be rolling out this new version in stages, so not everyone will be able to upgrade to the new version immediately.
TL;DR We missed an error that led to resource resolution URLs of some 500,000+ records to be incorrectly updated. We have reverted the incorrect resolution URLs affected by this problem. And, we’re putting in place checks and changes in our processes to ensure this does not happen again.
How we got here Our technical support team was contacted in late June by Wiley about updating resolution URLs for their content. It’s a common request of our technical support team, one meant to make the URL update process more efficient, but this was a particularly large request.
Crossref Conversations is an audio blog we’re trying out that will cover various topics important to our community. This conversation is between colleagues Anna Tolwinska and Rosa Morais Clark, discussing how we can make research happen faster, with fewer hurdles, and how Crossref can help. Our members have been asking us how Crossref can support open science, and we have a few insights to share. So we invite you to have a listen.
We’ve just added to our input schema the ability to include affiliation information using ROR identifiers. Members who register content using XML can now include ROR IDs, and we’ll add the capability to our manual content registration form, participation reports, and metadata retrieval APIs in the near future. And we are inviting members to a Crossref/ROR webinar on 29th September at 3pm UTC.
The background We’ve been working on the Research Organization Registry (ROR) as a community initiative for the last few years.
Crossref strives for balance. Different people have always wanted different things from us and, since our founding, we have brought together diverse organizations to have discussions—sometimes contentious—to agree on how to help make scholarly communications better. Being inclusive can mean slow progress, but we’ve been able to advance by being flexible, fair, and forward-thinking.
We have been helped by the fact that Crossref’s founding organizations defined a clear purpose in our original certificate of incorporation, which reads:
“To promote the development and cooperative use of new and innovative technologies to speed and facilitate scientific and other scholarly research.”
As Crossref prepares to turn 20 in January 2020, it’s an opportunity to reflect on achievements and highlights from 2018-19 and also ponder the preceding decades. Change is a constant at Crossref but the organization has never strayed from its initial defined purpose. Our services and value now extend well beyond persistent identifiers and reference linking, and our connected open infrastructure benefits our 11,000+ membership as well as all those involved in scholarly research. This expansion is exactly what was envisioned to meet the goal of “speeding and facilitating” research.
This year’s annual report is different from previous years’; it has been expanded into a ‘fact file’ so that we can invite comments on the path ahead, based on transparent access to data about our membership, activities, and finances. As we were pulling together the charts and tables for this annual report we noticed stark differences in where Crossref is today compared to years past.
The rate of membership growth has accelerated and we now have over 180 new members joining every month, leading to one of the most striking changes we found. The lowest three membership tiers now account for 46% of revenue (up from 25% in 2011) while the highest three tiers account for 36% (down from 56% in 2011).
Today, the typical Crossref member has just a few hundred registered content items.
One way we have been able to accommodate this growth efficiently is by collaborating with sponsors in different countries. Very small members can join via a local sponsor that is able to provide technical, financial, language, and administrative support. We now have more members joining via sponsors, who otherwise would largely not be able to join at all. While you’d need to be a millionaire by US standards to join directly from Indonesia in our lowest fee tier (calculated using Purchasing Power Parity), the sponsor program—supported often by government investment in science and education—has enabled Indonesian organizations to join Crossref in large numbers, supporting their aim to become one of the fastest-growing nations in open research, and to help that research be discovered.
Crossref has repeatedly stayed ahead of developments in the community
In 2007, when the Similarity Check working group discussions and pilot started, there was disagreement on the board about whether Crossref should provide such a service and whether it was a strategic priority for members. By the end of the pilot, when the decision came to launch a production service, it was seen as essential and a top priority. This conclusion has been borne out in recent research into the value of Crossref; Similarity Check is one of the services of most importance to members.
Adding preprints as a content type was controversial at the time. The board discussed the topic of “duplicative works” for about two years with strong opinions on all sides. The working group delivered a good set of policies and technical specifications and in the July 2015 board meeting there was a majority—but not 100%—agreement on the motion to approve. We implemented preprints as a content type just in time to accommodate the snowballing of preprint servers emerging from existing and new members.
Another example of a former—and current—area of contention is the approach to metadata. When Crossref first launched, there were lengthy discussions about what metadata we should collect. The initial focus was on the minimal set of metadata to enable reference matching in support of reference linking. In the beginning, neither article titles, lists of authors, references, nor abstracts were included in the minimal metadata set. We supported them as optional but most members opted out. However, the huge set of metadata that Crossref collects and disseminates now is seen as essential, providing a lot of value for members in terms of discoverability.
Today, Crossref enables metadata retrieval on a large scale—an average of more than 600 million queries per month—through a variety of interfaces, most notably the REST API (Public, Polite, and Plus versions). The metadata is used by thousands of organizations and services—both commercial and not-for-profit—increasing the discoverability of member content. In fact, members of all stripes have long initiated projects to expand the metadata Crossref is able to collect and disseminate: from facilitating text mining (through license and full-text URLs); to enabling better connections with and evidence of contributions (through Funder IDs, ORCID iDs, and soon CRediT roles and ROR IDs).
These are all examples of where Crossref has successfully “promoted the cooperative use of new and innovative technologies” and where we are meeting our mission to make scholarly communications a little bit better. As ever, we need to thank our brilliant staff for their unfailing resilience, balance, and diligence, in these times of dynamic change.
Considering the value and future of Crossref
Research is global, and supporting a diverse global community is a challenge. This year, we conducted our first wide-ranging investigation into what people value from Crossref. This involved telephone interviews with over 40 community members as well as an online survey of 600+ respondents.
“The Crossref of 2040 could be an even more robust, inclusive, and innovative consortium to create and sustain core infrastructures for sharing, preserving, and evaluating research information.”
But only if Crossref is not:
“held back, and its remit circumscribed, by legacy priorities and forces within the industry that may perceive open data and infrastructure as a threat to their own evolving business interests.”
We welcome this public commentary and encourage others in the community to respond and report what value Crossref offers as community-owned infrastructure, and how they’d like to see the organization evolve.
More than ever, we need to have this discussion with a broad and representative group. So please, read the value research report and the annual report/fact file, and get ready to voice your opinions!