ni-presidents-wikipedia's Introduction

Note: This repo is largely a snapshot record of bringing Wikidata information in line with Wikipedia, rather than code specifically designed to be reused.

The code, queries, etc. here are unlikely to be updated as my process evolves. Later repos will likely have progressively different approaches and more elaborate tooling, as my habit is to try to improve at least one part of the process each time around.


Step 1: Check the Position Item

The Wikidata item https://www.wikidata.org/wiki/Q373548 already contains all the expected data, although Jeanine Áñez was not yet at preferred rank.

Step 2: Tracking page

PositionHolderHistory already exists; the current version is https://www.wikidata.org/w/index.php?title=Talk:Q852448&oldid=1238279644 with 21 dated memberships, 79 undated, and 104 warnings.

Step 3: Set up the metadata

The first step in the repo is always to edit the add_P39.js script to configure the Item ID and source URL.
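
For orientation, here is a minimal sketch of what such a template might look like. It is not the repo's actual file. The only things taken from this README are the two paths that later steps read with jq (`.claims.P39.value` and `.claims.P39.references.P4656`), the item ID quoted in Step 1, and the source URL from Step 5. The function name, its parameters, and the P580/P582 qualifiers are assumptions for illustration.

```javascript
// add_P39.js (hypothetical sketch of a wikibase-cli edit template)

// Item ID as quoted in Step 1; replace it if the position item differs.
const POSITION_ID = 'Q373548';
// Source page as given in Step 5.
const SOURCE_URL = 'https://en.wikipedia.org/wiki/President_of_Nicaragua';

// Assumed signature: the person's item ID plus optional start and end dates.
const buildEdit = (id, startDate, endDate) => {
  const qualifiers = {};
  if (startDate) qualifiers.P580 = startDate; // P580: start time
  if (endDate) qualifiers.P582 = endDate;     // P582: end time

  return {
    id,
    claims: {
      P39: {                                  // P39: position held
        value: POSITION_ID,
        qualifiers,
        references: { P4656: SOURCE_URL },    // P4656: Wikimedia import URL
      },
    },
  };
};

module.exports = buildEdit;
```

With a shape like this, a dry run with no arguments still emits the position ID and source URL, which is all the jq filters in the following steps need.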

Step 4: Get local copy of Wikidata information

wd ee --dry add_P39.js | jq -r '.claims.P39.value' |
  xargs wd sparql office-holders.js | tee wikidata.json

Step 5: Scrape

Comparison/source = https://en.wikipedia.org/wiki/President_of_Nicaragua

wb ee --dry add_P39.js | jq -r '.claims.P39.references.P4656' |
  xargs bundle exec ruby scraper.rb | tee wikipedia.csv

Small tweaks needed, but trivial to scrape.

I had to manually combine the three terms of Evo Morales, though, as Wikidata already had those as a single statement.

Step 6: Create missing P39s

bundle exec ruby new-P39s.rb wikipedia.csv wikidata.json |
  wd ee --batch --summary "Add missing P39s, from $(wb ee --dry add_P39.js | jq -r '.claims.P39.references.P4656')"

11 new additions as officeholders -> https://tools.wmflabs.org/editgroups/b/wikibase-cli/d228ffe137064
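
The comparison in this step is done by the repo's Ruby script new-P39s.rb, which is not reproduced here. The underlying idea is probably close to the sketch below, written in JavaScript purely for illustration: keep the scraped rows whose person does not yet appear among the existing office-holders. The row shapes (`id` on scraped rows, `item` on query rows) and the placeholder IDs are assumptions.

```javascript
// Hypothetical sketch of the "missing P39" comparison; not the repo's script.

// Returns scraped rows whose person has no existing statement in Wikidata.
// Rows without a resolved item ID are skipped, since they cannot be edited.
const missingHolders = (wikipediaRows, wikidataRows) => {
  const known = new Set(wikidataRows.map(row => row.item));
  return wikipediaRows.filter(row => row.id && !known.has(row.id));
};

// Placeholder data, for illustration only.
const scraped = [
  { id: 'Q1', start: '1990-01-01', end: '1995-01-01' },
  { id: 'Q2', start: '1995-01-01', end: '2000-01-01' },
  { id: '', start: '2000-01-01', end: '' },
];
const existing = [{ item: 'Q1' }];

console.log(missingHolders(scraped, existing).map(row => row.id));
```

Matching on the person alone treats someone who held the office more than once as already present, so repeat terms would need a stricter match on dates as well.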

Step 7: Add missing qualifiers

bundle exec ruby new-qualifiers.rb wikipedia.csv wikidata.json |
  wd aq --batch --summary "Add missing qualifiers, from $(wb ee --dry add_P39.js | jq -r '.claims.P39.references.P4656')"

52 additions made as https://tools.wmflabs.org/editgroups/b/wikibase-cli/6169791a8bd87/
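
The qualifier step can be pictured in the same way. The sketch below is not the repo's new-qualifiers.rb; it only illustrates the likely logic, in JavaScript: for each existing statement, add a start or end date that Wikipedia supplies and Wikidata lacks. It assumes one term per person, and the row shapes and statement identifier format are hypothetical. Each result is a `[statement, property, value]` triple, matching the argument order of the add-qualifier command.

```javascript
// Hypothetical sketch of the "missing qualifiers" comparison; not the repo's script.

// Maps row fields to the qualifier properties they correspond to.
const QUALIFIERS = { start: 'P580', end: 'P582' };

const missingQualifiers = (wikipediaRows, wikidataRows) => {
  // Assumption: each person appears at most once in the scraped rows.
  const scrapedById = new Map(wikipediaRows.map(row => [row.id, row]));
  const additions = [];

  for (const existing of wikidataRows) {
    const scraped = scrapedById.get(existing.item);
    if (!scraped) continue;

    for (const [field, property] of Object.entries(QUALIFIERS)) {
      // Only fill gaps: a date already present in Wikidata is left untouched.
      if (!existing[field] && scraped[field]) {
        additions.push([existing.statement, property, scraped[field]]);
      }
    }
  }
  return additions;
};

module.exports = missingQualifiers;
```

Filling gaps only, and never replacing an existing value, keeps the batch safe to run without reviewing each conflict by hand; disagreements between the two sources are left for the tracking page to surface.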

Step 8: Refresh the Tracking Page

New version at https://www.wikidata.org/w/index.php?title=Talk:Q852448&oldid=1238512968

I was deliberately only taking the low-hanging fruit of easily accessible start and end dates here, so there's still quite a bit of work to be done after this.

Contributors

tmtmtmtm
