Contributing a Citation to DataCite - "IsCitedBy"


In my role as the Data Workflows Specialist at the University of Michigan Library, in addition to reviewing large datasets and code deposits, I support various aspects of our research data repository, Deep Blue Data (https://deepblue.lib.umich.edu/data), which is based on Samvera Hyrax. We have been working to improve connections between our system and other systems so that we can gather various metrics for our datasets.

In early 2021, I was trying to verify whether the DataCite Data Metrics badge (https://support.datacite.org/docs/displaying-usage-and-citations-in-your-repository), a tool for displaying usage and citation information, was working. However, I had no easy way of knowing whether any of our researchers had actually cited any of the datasets we host in Deep Blue Data in their published articles, let alone whether other researchers had. So I decided to begin adding citations to our datasets via the DataCite API, based on the information we have in our “Citations to related material” field. I followed the instructions at https://support.datacite.org/docs/contributing-data-citations.

 

What follows is my process and its results.

 

Beginning with this dataset from Deep Blue Data:

Arbic, B., Luecke, C. (2020). Repository of current meter archive data used in Luecke et al. 2020 Journal of Geophysical Research Oceans paper [Data set], University of Michigan - Deep Blue Data. https://doi.org/10.7302/dbfp-s644

From the work description metadata field “Citations to related material”, I pulled the DOI of the related publication, “10.1029/2019JC015306” (see Figure 1). The dataset was not actually listed in the “References” section of the paper.

Figure 1 Citations to related material field in Deep Blue Data

Armed with the DOI for the dataset and the DOI for the referencing publication, I created the JSON payload (see Figure 2) to be pushed to DataCite, following https://support.datacite.org/docs/contributing-data-citations. Note: do not include the “https://doi.org/” prefix in the identifiers. I made this mistake and found that DataCite does not process it well. Notice also that this JSON is more complete than what is displayed on the “Contributing Citations” page; it comes from “Updating Metadata with the REST API”, https://support.datacite.org/docs/updating-metadata-with-the-rest-api. Using the brief, incomplete JSON will give you “status: 400 Bad Request” and the error message “{"errors":[{"status":"400","title":"You need to provide a payload following the JSONAPI spec"}]}”.
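Since Figure 2 is an image, here is a minimal Python sketch of how a payload of this kind could be built. It assumes the JSON:API envelope (`data`, `type`, `attributes`) that the REST API documentation describes, and the payload in Figure 2 may well carry more fields than this. The helper names `strip_doi_prefix` and `build_citation_payload` are hypothetical and exist only for this illustration.

```python
import json

# Resolver prefixes that should not be sent as part of the identifier.
DOI_RESOLVER_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
)


def strip_doi_prefix(doi: str) -> str:
    """Return the bare DOI, dropping a resolver prefix if one is present."""
    doi = doi.strip()
    for prefix in DOI_RESOLVER_PREFIXES:
        if doi.lower().startswith(prefix):
            return doi[len(prefix):]
    return doi


def build_citation_payload(citing_doi: str, relation_type: str = "IsCitedBy") -> dict:
    """Build a JSON:API style body with one related identifier (assumed shape)."""
    return {
        "data": {
            "type": "dois",
            "attributes": {
                "relatedIdentifiers": [
                    {
                        "relatedIdentifier": strip_doi_prefix(citing_doi),
                        "relatedIdentifierType": "DOI",
                        "relationType": relation_type,
                    }
                ]
            },
        }
    }


if __name__ == "__main__":
    # Print the payload; redirect the output to doi_update.json to save it.
    payload = build_citation_payload("https://doi.org/10.1029/2019JC015306")
    print(json.dumps(payload, indent=2))
```

Stripping the prefix in code guards against the mistake described above, and keeping `relation_type` as a parameter makes it easy to try a different relation type later without rewriting the payload by hand.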

Figure 2 JSON payload indicating citing publication DOI.

Next, I entered the “curl” command at the terminal to push the information to DataCite via the DataCite API. Note that this needs to be run from the same folder as the aforementioned JSON payload.

 

Curl command (run from the Mac Terminal):

curl -v -X PUT -H "Content-Type: application/vnd.api+json" --user [redacted user]:[redacted passwd] -d @doi_update.json https://api.datacite.org/dois/10.7302/dbfp-s644
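For anyone who would rather script this step, the same request can be sketched with Python's standard library. This is an illustration rather than a tested replacement for the curl command: the function names `build_update_request` and `send` are hypothetical, and the credentials are placeholders you would supply yourself.

```python
import base64
import urllib.request

DATACITE_API = "https://api.datacite.org/dois/"


def build_update_request(doi: str, payload_bytes: bytes, user: str, password: str) -> urllib.request.Request:
    """Prepare (but do not send) a PUT request mirroring the curl command."""
    # HTTP Basic auth: base64 of "user:password", as curl's --user option sends.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=DATACITE_API + doi,
        data=payload_bytes,
        method="PUT",
        headers={
            "Content-Type": "application/vnd.api+json",
            "Authorization": f"Basic {token}",
        },
    )


def send(request: urllib.request.Request) -> tuple:
    """Send the request and return (status code, response body as text).

    urllib raises urllib.error.HTTPError for 4xx/5xx responses, so a
    400 Bad Request surfaces as an exception rather than a return value.
    """
    with urllib.request.urlopen(request) as response:
        return response.status, response.read().decode("utf-8")
```

To use it, read `doi_update.json` as bytes, pass them to `build_update_request` together with the dataset DOI (for example `10.7302/dbfp-s644`) and your repository credentials, then hand the result to `send`. Building the request separately from sending it lets you inspect the URL and headers before anything reaches DataCite.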

 

The results!

The DataCite API view now shows “IsCitedBy” as a relatedIdentifier (see Figure 3):

Figure 3 DataCite API view https://api.datacite.org/dois/10.7302/dbfp-s644

However, the new relatedIdentifier “10.1029/2019jc015306” is showing under “references”, NOT under “citations” (see Figure 4):

Figure 4 DataCite API view indicating one "reference" and zero "citations"
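The same check can be made without reading the API view by eye. The sketch below pulls the relation types and the two counts out of a DOI record that has already been fetched and parsed as JSON. It assumes the record exposes `relatedIdentifiers`, `referenceCount` and `citationCount` under `data.attributes`; treat those field names as assumptions and confirm them against a real response. The function name `summarize_links` is hypothetical, and `SAMPLE_RECORD` is a hand-written stand-in that mirrors what Figure 4 shows, not actual API output.

```python
def summarize_links(doi_record: dict) -> dict:
    """Report the relation types and reference/citation counts in a DOI record."""
    attributes = doi_record.get("data", {}).get("attributes", {})
    related = attributes.get("relatedIdentifiers") or []
    return {
        "relation_types": [item.get("relationType") for item in related],
        "reference_count": attributes.get("referenceCount"),
        "citation_count": attributes.get("citationCount"),
    }


# Hand-written sample (assumed field names), mirroring Figure 4:
# one "reference" and zero "citations".
SAMPLE_RECORD = {
    "data": {
        "id": "10.7302/dbfp-s644",
        "attributes": {
            "relatedIdentifiers": [
                {
                    "relatedIdentifier": "10.1029/2019jc015306",
                    "relatedIdentifierType": "DOI",
                    "relationType": "IsCitedBy",
                }
            ],
            "referenceCount": 1,
            "citationCount": 0,
        },
    }
}

if __name__ == "__main__":
    print(summarize_links(SAMPLE_RECORD))
```

A record that lists “IsCitedBy” among its relation types while its citation count stays at zero is the mismatch described above.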

Because this information does not appear in the “citations” section, DataCite Commons gives no indication that any publication cites this dataset (see Figure 5, and compare Figure 7 for an example where a citation does appear), https://commons.datacite.org/doi.org/10.7302/dbfp-s644.

*** Note: I’m not clear on how the citation is supposed to appear in Figure 4.

Figure 5 DataCite Commons search result for 10.7302/dbfp-s644

Nor is there any indication in DataCite Search (see Figure 6), https://search.datacite.org/works/10.7302/dbfp-s644:

Figure 6 DataCite Search results for 10.7302/dbfp-s644

Critically, DataCite Search is what should be feeding the DataCite Data Metrics badge. Because DataCite Search is not picking up citation information for our selected dataset, no citations are showing in our system, even though we know that this dataset has in fact been cited by the authors in their paper.

 

Interestingly, I was able to get DataCite’s tools to partially work with another dataset from Deep Blue Data: Almazroa, A. (2018). Retinal fundus images for glaucoma analysis: the RIGA dataset [Data set], University of Michigan - Deep Blue Data. https://doi.org/10.7302/Z23R0R29

Note that here the citation submitted using the API showed up in DataCite Commons (see Figure 7), https://commons.datacite.org/doi.org/10.7302/Z23R0R29. Unfortunately, I have no idea what the difference was between this submission and the previous example.

Figure 7 DataCite Commons search results for 10.7302/Z23R0R29 showing 1 reference and 2 citations

But, as with the previous example, no citations to this dataset are indicated in DataCite Search (https://search.datacite.org/works/10.7302/z23r0r29) or in the DataCite badge (see Figures 8, 9 and 10), despite DataCite Commons recognizing that a citation for this dataset exists:

Figure 8 DataCite Search results for 10.7302/Z23R0R29 indicating zero citations

HTML for the DataCite Badge (see Figure 9), https://support.datacite.org/docs/displaying-usage-and-citations-in-your-repository:

Figure 9 HTML for DataCite Data Metrics badge

DataCite Badge front-end (Figure 10) showing 0 citations.

Figure 10 DataCite Data Metrics badge front end indicating zero citations

A bit of an update: this is a DataCite bug, https://github.com/datacite/datacite/issues/1416. It seems that “IsCitedBy” is not working correctly but, interestingly, “Cites” is. I submitted a new citation for a different dataset, and now DataCite Search shows 1 citation (Figures 11 and 12)!

Figure 11 DataCite Search now shows 1 citation for DOI 10.7302/pa6y-fb55

 

Figure 12 DataCite Search API view for 10.7302/pa6y-fb55 (relationType: “Cites”)

And here it is in the DataCite Data Metrics badge (Figure 13):

Figure 13 DataCite Data Metrics badge showing 1 citation

 

I would be happy to discuss this with anyone who is interested in this process!

The next blog post will follow a dataset citation from the “References” section of an article through to DataCite. It’s very interesting, so stay tuned!

***Note: I have verified this with datasets from Dryad using “Cites” instead. Following the process through with https://doi.org/10.5061/dryad.3mg69k5, the DataCite view at https://search.datacite.org/works/10.5061/dryad.3mg69k5 indicates 2 citations, whereas Dryad lists 1. I believe this 2 vs. 1 discrepancy may also be a DataCite bug.

For more citation fun: https://apps.lib.umich.edu/blogs/bits-and-pieces/following-data-citation-through-publishing-process