
Comments (3)

bnewbold commented on May 24, 2024

Ah, I didn't notice that. The services are on different domains so I didn't realize they were the same project, but now I see the "User Guide" link.
I guess the next step would be to find alternative sources of retraction metadata with persistent identifiers (e.g., DOI or PubMed identifier). Some sources I can think of are:

  • PubMed/MEDLINE itself (we already have a parser for this, and could update the import pipeline to allow "updates" to existing entries when the publication_stage does not match or has changed to "retracted")
  • publisher-specific corpora, like SciGraph
  • heuristics, like finding publications with the title "Retraction of TITLE", then finding the prior publication from the same journal ("container") with the given title
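
The title heuristic in the last bullet could be sketched roughly as follows. This is only an illustration under stated assumptions: the record shape (plain dicts with `title`, `container`, and `doi` keys), the regular expression, and the helper names are all hypothetical and are not fatcat's schema or importer code.

```python
import re
from typing import Dict, List, Optional

# Hypothetical pattern: matches notice titles such as
# "Retraction of TITLE", "Retraction of: TITLE", or "Retraction: TITLE".
RETRACTION_TITLE = re.compile(
    r"^\s*retraction\b(?:\s+of\b)?\s*[:\-]?\s*(?P<title>.+?)\s*$",
    re.IGNORECASE,
)


def normalize(title: str) -> str:
    """Lowercase and collapse punctuation/whitespace so titles compare loosely."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def retracted_title(notice_title: str) -> Optional[str]:
    """Extract TITLE from a retraction-notice title, or None if it does not match."""
    match = RETRACTION_TITLE.match(notice_title)
    return match.group("title") if match else None


def find_retracted(notice: Dict, releases: List[Dict]) -> Optional[Dict]:
    """Find the earlier release in the same container whose title matches the notice."""
    target = retracted_title(notice["title"])
    if target is None:
        return None
    key = normalize(target)
    for release in releases:
        if release is notice:
            continue  # never match the notice against itself
        if release["container"] == notice["container"] and normalize(release["title"]) == key:
            return release
    return None
```

Exact matching on normalized titles is deliberately conservative; a real pipeline would likely need fuzzy matching and a check on publication dates, since notice titles vary a lot between publishers.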

from fatcat.

hs2361 commented on May 24, 2024

I would like to work on this. Could you provide some more details? What kind of mechanism can be used to fetch the data from their database? They have clearly mentioned that scraping the website is prohibited (https://retractionwatch.com/retraction-watch-database-user-guide/).

bnewbold commented on May 24, 2024

Here is an open corpus of ~100k retractions: http://openretractions.com/

From that site: "we only know about retractions and other updates that publishers have properly reported to CrossRef or PubMed. That's currently 114596 papers."

I see only a couple thousand retracted "releases" in fatcat today. We do import from crossref and pubmed, so in theory we should have comparable numbers, but we don't run updates automatically yet, so if most of these are from the past couple years we are probably missing them. Also there might be bugs in our crossref and pubmed importers. I don't think we have tests for that code path, so a good first contribution would be adding tests for both crossref and pubmed retractions.
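
As a rough idea of what such a test might look like, here is a minimal sketch. It assumes a Crossref-style record in which a retraction notice lists the works it updates under an `update-to` field with `type` set to `retraction`. The helper `retracted_dois`, the sample DOIs, and the hand-built record are hypothetical and do not reflect fatcat's actual importer API or fixtures.

```python
from typing import Dict, List


def retracted_dois(record: Dict) -> List[str]:
    """Return the DOIs that a (hypothetical) Crossref-style record marks as retracted."""
    return [
        update["DOI"].lower()
        for update in record.get("update-to", [])
        if "DOI" in update and update.get("type", "").lower() == "retraction"
    ]


def test_retraction_notice_is_detected() -> None:
    # Hand-built fixture; a real test would load a saved API response instead.
    notice = {
        "DOI": "10.1234/example.notice",
        "title": ["Retraction of: An Example Article"],
        "update-to": [
            {"DOI": "10.1234/EXAMPLE.original", "type": "retraction"},
            {"DOI": "10.1234/example.other", "type": "correction"},
        ],
    }
    assert retracted_dois(notice) == ["10.1234/example.original"]


def test_plain_article_has_no_retractions() -> None:
    assert retracted_dois({"DOI": "10.1234/example.plain"}) == []
```

The same shape of test could be repeated for PubMed records once the relevant fields in the existing parser are identified.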
