Myndighetsdata

A wise owl that knows everything about government agencies

Myndighetsdata is an attempt to make data about Swedish government agencies (myndigheter) more accessible. By data, I mean names and basic information such as contact details and addresses. It downloads the data from various sources, converts it to structured JSON files with a consistent format, and even attempts to merge all these data points into one big list.

There are many government agencies in Sweden. They are known by various names, and several hundred agencies have disappeared over the past decades. This data will hopefully be of some help to those who study the public sector and build services on government data. It's not a finished product, and it's not 100% clean and exact, but feel free to reuse it and contribute to make it even better! 😊

Where is the data?

It's in the data folder:

merged.json is an attempt at merging all these files together by matching agencies by organisation number and by name (using fuzzy matching and some wild rules). It is not 100% correct, as the underlying data is too inconsistent. But it can be used to complete Wikidata and improve the quality of government sources so that future merge attempts are easier.
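To give a feel for what matching by organisation number and by name can look like, here is a minimal sketch. It is not the code in smart_merge.py: the field names `name` and `org_nr`, the 0.9 threshold and the use of Python's standard `difflib` are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def normalise(name):
    """Lower-case and collapse whitespace so trivial differences don't matter."""
    return " ".join(name.lower().split())


def same_agency(a, b, threshold=0.9):
    """Return True if two records appear to describe the same agency.

    `a` and `b` are dicts with a "name" key and an optional "org_nr" key
    (hypothetical field names, not necessarily those used in the data folder).
    """
    # If both records carry an organisation number, let it decide.
    if a.get("org_nr") and b.get("org_nr"):
        return a["org_nr"] == b["org_nr"]
    # Otherwise fall back to fuzzy similarity between the normalised names.
    ratio = SequenceMatcher(None, normalise(a["name"]), normalise(b["name"])).ratio()
    return ratio >= threshold
```

With this sketch, two records sharing an organisation number match even if their names differ, while records without a number are compared by name only. A real merge needs more rules than this, which is why the result is not 100% correct.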

How to run the code

You can use the code yourself to download the source files, extract the information from them and merge it.

For this, you need Python 3 and the dependencies, which you can install with:

pip install -r requirements.txt

Once that is done, you can run the following commands:

# Download the source files (if DOWNLOAD is set to True) and extract the information from them
python run.py
# Note: Arbetsgivarverket's data has to be downloaded manually

# Try to merge the lists into one
python smart_merge.py

# Rule-based cleaning to remove the biggest anomalies in the merged file
python manual_cleaning.py
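If you prefer to run the three steps in one go, a small wrapper along these lines could do it. This is only a sketch: the script names come from the commands above, but the `run_pipeline` helper and its `runner` parameter are hypothetical and not part of the repository.

```python
import subprocess
import sys

# Script names taken from the commands above; the order matters.
PIPELINE = ["run.py", "smart_merge.py", "manual_cleaning.py"]


def run_pipeline(scripts=PIPELINE, runner=subprocess.run):
    """Run each script with the current interpreter, stopping at the first failure."""
    for script in scripts:
        print(f"Running {script} ...")
        # check=True makes subprocess.run raise CalledProcessError
        # if the script exits with a non-zero status.
        runner([sys.executable, script], check=True)
```

Calling `run_pipeline()` from the repository root runs the scripts in order. Arbetsgivarverket's data still has to be downloaded manually beforehand.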

License

The code is licensed under AGPLv3, which means you can reuse it as long as you attribute it, and you can modify it as long as you publish your changes under the same license.

The data comes from a number of sources, but they are all licensed as CC0, either explicitly or through practice (allmänna handlingar, i.e. public documents, can usually be considered CC0). So feel free to reuse it as you please!

Contributors

pierremesure
