flickr-foundation / flickypedia

A tool to copy CC-licensed images from Flickr to Wikimedia Commons

Home Page: https://www.flickr.org/tools/flickypedia/

License: Apache License 2.0

Python 80.53% HTML 8.88% CSS 2.24% JavaScript 4.35% SCSS 4.00%
flickr structured-data wikimedia-commons

Flickypedia, by Flickr.org

Flickypedia is a bridge between photos on Flickr and files on Wikimedia Commons. It includes:

  • A web app for copying openly-licensed photos from Flickr to Wikimedia Commons ("uploadr")
  • A bot for improving the structured data for Flickr photos which are already on Wikimedia Commons ("backfillr")
  • A tool for getting data and statistics about Flickr photos on Wikimedia Commons ("extractr")

Our goal is for it to produce higher-quality records on Wikimedia Commons, with better connected data and better descriptive information, and to make it easier for Flickr photographers to see how their photos are being used.

Flickypedia was built by the US 501(c)(3) Flickr Foundation in 2023 in partnership with the Culture and Heritage team at the Wikimedia Foundation.

Usage

If you want to copy some photos from Flickr to Wikimedia Commons, you can use the web app at https://www.flickr.org/tools/flickypedia/

If you want to use the backfillr or extractr tools, you need to clone the repo and install dependencies:

$ git clone https://github.com/Flickr-Foundation/flickypedia.git
$ cd flickypedia
$ python3 -m venv .venv
$ source .venv/bin/activate
$ pip install -e .

Then run the flickypedia CLI, which has help text that will explain what to do next:

$ flickypedia --help

Architecture

The Flickypedia code lives in the src folder, which is split into four main components:

  • External APIs (apis) – code for interacting with external services, including Flickr, Structured Data on Commons (SDC), and Wikimedia Commons. This code is reasonably generic and not too specific to Flickypedia. If you're looking for pieces to reuse, this is a good place to start.

  • Uploadr – a web app for copying openly-licensed photos from Flickr to Wikimedia Commons. This is a Flask app.

    The app is organised into a series of screens ("get photos", "select photos", "prepare info", and so on) – all the files associated with a screen are named to match the URL. e.g. the "prepare info" screen is at /prepare_info, so the associated files are prepare_info.py, prepare_info.html, prepare_info.scss, and so on.

  • Backfillr – a CLI for updating SDC for Flickr photos that have already been uploaded to Commons. You can either run on a single file or many files at once.

    It fetches the existing SDC for a photo, finds the Flickr ID, then calculates the "new" SDC for the photo (the SDC it would get if it were uploaded with Flickypedia today). It compares the new and existing SDC, and writes any missing statements back into Commons.

  • Extractr – a CLI for getting data about existing Flickr photos on Commons from SDC snapshots.
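
As a rough illustration of the Backfillr comparison step described above, here is a minimal sketch of finding the statements that are in the "new" SDC but missing from the existing SDC. It is not the project's actual code: the `Statements` type, the `find_missing_statements` function, and the `P_CREATOR`/`P_SOURCE` property names are all hypothetical, and real SDC statements are richer Wikibase JSON than the flat mapping assumed here.

```python
from typing import Dict, List

# Hypothetical, simplified model: property ID -> list of values.
# Real SDC statements are richer Wikibase JSON; this is only an illustration.
Statements = Dict[str, List[str]]


def find_missing_statements(existing: Statements, new: Statements) -> Statements:
    """Return the statements in `new` that are absent from `existing`."""
    missing: Statements = {}
    for property_id, new_values in new.items():
        existing_values = existing.get(property_id, [])
        absent = [value for value in new_values if value not in existing_values]
        if absent:
            missing[property_id] = absent
    return missing


# Placeholder property names and values, not real Wikidata property IDs.
existing_sdc = {"P_CREATOR": ["example photographer"]}
new_sdc = {
    "P_CREATOR": ["example photographer"],
    "P_SOURCE": ["https://www.flickr.com/photos/example/12345/"],
}

missing = find_missing_statements(existing_sdc, new_sdc)
print(missing)
```

In this sketch only the missing statements are returned, so nothing already present would be rewritten or removed; that matches the "writes any missing statements back" behaviour described above, but how the real tool handles conflicting values is not covered here.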

Development

You can set up a local development environment by cloning the repo and installing dependencies:

$ git clone https://github.com/Flickr-Foundation/flickypedia.git
$ cd flickypedia
$ python3 -m venv .venv
$ source .venv/bin/activate
$ pip install -e .

If you want to run tests, install the dev dependencies and run coverage:

$ source .venv/bin/activate
$ pip install -r dev_requirements.txt
$ coverage run -m pytest tests
$ coverage report

flickypedia's People

Contributors

alexwlchan, dependabot[bot], george08, edwardbetts, jessamynwest

flickypedia's Issues

Write some ADRs for what's happened so far

Possible topics:

  • Using Python and Flask
  • OAuth 2.0 and how tokens are stored
  • Looking up users from Wikidata
  • Looking up duplicates
  • Background tasks with Celery (vs Flask-Executor)
  • Flickr API response cache

Write the "proper" implementation of the auth code

I've got a rough auth implementation working, but it's a bit ropey – good enough for prototyping, but not much else. I should tidy it up and make it something that we can use in the real app.

Still to do:

  • Write proper tests for it
  • Add support for refresh tokens

Add support for "No Known Copyright Restrictions" (maybe)

Right now it looks like we're not going to support NKCR – but in case we reverse that decision later, this is a tracking ticket for all the places we'll need to make changes to support it.

  • Add support for the {{Flickr-no known copyright restrictions}} template when we render Wikitext
  • Add support in the structured data
  • Add it to the list of supported licenses in config
  • Tweak the copy

Add a <title> element to all the pages

Currently every page has the same <title>Flickypedia</title>, which is a bit annoying when you have lots of Flickypedia entries in your history and you don't know which is which!

Test cases for 'pick from a list of photos'

  • Single photo, license okay and not in WMC
  • Single photo, license okay and in WMC
  • Single photo, license not okay
  • Multiple photos, all license okay and none in WMC
  • Multiple photos, all license okay and some in WMC
  • Multiple photos, all license okay and all in WMC
  • Multiple photos, some license okay and none in WMC
  • Multiple photos, some license okay and some in WMC
  • Multiple photos, some license okay and all in WMC
  • Multiple photos, no license okay
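
The cases above vary two properties of each photo: whether its license is okay, and whether it is already in WMC. One way to exercise them might look like the sketch below. Everything in it is an assumption for illustration: the `Photo` class, its fields, the `categorise_photos` function, and the choice to check the license before checking for duplicates are hypothetical, not taken from the Flickypedia code.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Photo:
    # Hypothetical fields, chosen to mirror the two properties in the list above.
    photo_id: str
    license_ok: bool
    in_wmc: bool


def categorise_photos(photos: List[Photo]) -> Dict[str, List[str]]:
    """Sort photo IDs into three buckets.

    Assumption: a photo with a disallowed license is reported as such
    even if it is also already in WMC.
    """
    result: Dict[str, List[str]] = {
        "available": [],
        "duplicates": [],
        "disallowed_license": [],
    }
    for photo in photos:
        if not photo.license_ok:
            result["disallowed_license"].append(photo.photo_id)
        elif photo.in_wmc:
            result["duplicates"].append(photo.photo_id)
        else:
            result["available"].append(photo.photo_id)
    return result


# Three of the cases from the list, expressed as inputs.
single_ok_new = categorise_photos([Photo("1", license_ok=True, in_wmc=False)])
all_ok_some_in_wmc = categorise_photos(
    [Photo("1", license_ok=True, in_wmc=False), Photo("2", license_ok=True, in_wmc=True)]
)
none_ok = categorise_photos(
    [Photo("1", license_ok=False, in_wmc=False), Photo("2", license_ok=False, in_wmc=True)]
)
```

Each remaining case in the list could be added as another input in the same style, for example as parameters to a parametrised test.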

Drop support for Python 3.7

At least for the very first prototype, we may end up deploying this into a Glitch environment to match flinumeratr – but Glitch only supports Python 3.7 (even though it's EOL): https://support.glitch.com/t/upgrade-python-version-from-3-7-which-is-now-eol-to-something-more-recent/63011

Either Glitch gets its act together and bumps the Python version, or we look elsewhere – whatever the case, we'll be dropping the 3.7 support eventually. This is a ticket to track all the stuff we'll do when that happens.

  • Remove 3.7 from the testing matrix in GitHub Actions
  • Remove the flake8 --ignore=E231 lint in GitHub Actions
  • Relax some of the version constraints in requirements.in

Set up some credentials with the beta cluster for uploading images

There is a testing environment for Wikimedia Commons, as described in Commons:Guide to batch uploading:

If you want to test uploading and safely experiment with using tools or new templates in a safe environment, you can set up an account on the beta cluster. This is a mirror of Wikimedia Commons where if things go wrong you will not cause any disruption to the live environment. See http://commons.wikimedia.beta.wmflabs.org/ and this explanation.

This would be very useful for our purposes!
