mastaal / parlhist

parlhist is a Python application intended to enable more empirical and statistical academic studies of (Dutch) parliamentary minutes and documents.

License: European Union Public License 1.2

Languages: Python 97.68%, Shell 2.32%
Topics: open-data, the-netherlands, staten-generaal, parliamentary-data, handelingen, kamerstukken

parlhist's Issues

Support crawling complete dossier with more than 1000 results

def get_kamerstukken_in_kamerstukdossier(dossiernummer: str) -> list[tuple[str, str]]:
    """Using the SRU api, get all kamerstukken in a kamerstukdossier

    Returns strings in the same formatting as denoted in other metadata, e.g. '35899;7', '35925-VII;31' or '35979;F'
    """
    # overheid SRU documentation:
    # https://data.overheid.nl/sites/default/files/dataset/d0cca537-44ea-48cf-9880-fa21e1a7058f/resources/Handleiding%2BSRU%2B2.0.pdf
    base_query = f"https://repository.overheid.nl/sru?query=(w.dossiernummer=={dossiernummer})&maximumRecords=1000&startRecord=1"
    try:
        query_response = get_url_or_error(base_query)
    except CrawlerException as exc:
        logger.error("Received exception %s when trying to query the kamerstukken in dossier %s", exc, dossiernummer)
        raise CrawlerException(f"Received exception when trying to query the kamerstukken in dossier {dossiernummer}") from exc

    xml: ET.Element = ET.fromstring(query_response.text)
    number_of_records = int(xml.find("{http://docs.oasis-open.org/ns/search-ws/sruResponse}numberOfRecords").text)
    if number_of_records > 1000:
        logger.critical("More than 1000 results in a kamerdossier!")
        # TODO, implement this case
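One possible way to handle the TODO is to page through the results by advancing `startRecord` in steps of `maximumRecords` until `numberOfRecords` has been covered. The sketch below is not the project's implementation. `build_sru_url`, `iter_sru_pages` and the `fetch_text` callable are hypothetical names introduced here, and the sketch assumes the SRU endpoint accepts `startRecord` values above 1000, which should be checked against the SRU manual linked in the snippet.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Iterator

# Namespace taken from the snippet in this issue.
SRU_NS = "{http://docs.oasis-open.org/ns/search-ws/sruResponse}"
PAGE_SIZE = 1000  # same maximumRecords value as the original query


def build_sru_url(dossiernummer: str, start_record: int, page_size: int = PAGE_SIZE) -> str:
    """Hypothetical helper: the query from the issue, with a variable startRecord."""
    return (
        "https://repository.overheid.nl/sru"
        f"?query=(w.dossiernummer=={dossiernummer})"
        f"&maximumRecords={page_size}&startRecord={start_record}"
    )


def iter_sru_pages(
    dossiernummer: str,
    fetch_text: Callable[[str], str],
    page_size: int = PAGE_SIZE,
) -> Iterator[ET.Element]:
    """Yield one parsed SRU response per page until all records are covered.

    fetch_text is an assumed callable that returns the response body for a URL,
    e.g. a thin wrapper around get_url_or_error(url).text.
    """
    start_record = 1  # the original query starts at record 1
    while True:
        xml = ET.fromstring(fetch_text(build_sru_url(dossiernummer, start_record, page_size)))
        yield xml
        total_el = xml.find(f"{SRU_NS}numberOfRecords")
        total = int(total_el.text) if total_el is not None and total_el.text else 0
        start_record += page_size
        if start_record > total:
            break
```

The caller would then extract the kamerstukken from each yielded page in the same way as for the single-page case. Injecting `fetch_text` keeps the paging logic testable without network access.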

Create experiment template

Based on this template, users (who can already program in Python) should be able to easily create new experiments.

  • The template should store its inputs and results in an appropriate fashion in the database (also see #8)
  • The template should make effective use of parallelization (also see #9)
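As a rough illustration of the two requirements above, a template could be a small base class that maps a per-item function over the inputs in parallel and then stores each input with its result. Everything in this sketch is an assumption: the `ExperimentTemplate` class, the `experiment_results` table and the use of `sqlite3` are hypothetical stand-ins, not parlhist's actual database layer or the design discussed in #8 and #9.

```python
import json
import sqlite3
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Iterable


class ExperimentTemplate:
    """Hypothetical base class: subclass it and override process_item()."""

    name = "unnamed-experiment"

    def process_item(self, item: Any) -> Any:
        raise NotImplementedError

    def run(self, items: Iterable[Any], conn: sqlite3.Connection, workers: int = 4) -> list[Any]:
        items = list(items)
        # Threads keep the sketch simple; CPU-bound experiments would more
        # likely use ProcessPoolExecutor instead.
        with ThreadPoolExecutor(max_workers=workers) as pool:
            results = list(pool.map(self.process_item, items))  # map preserves input order
        # Store every input together with its result (assumed schema).
        conn.execute(
            "CREATE TABLE IF NOT EXISTS experiment_results "
            "(experiment TEXT, input TEXT, result TEXT)"
        )
        conn.executemany(
            "INSERT INTO experiment_results VALUES (?, ?, ?)",
            [(self.name, json.dumps(i), json.dumps(r)) for i, r in zip(items, results)],
        )
        conn.commit()
        return results


class WordCount(ExperimentTemplate):
    """Example experiment: count the words in each text."""

    name = "word-count"

    def process_item(self, item: str) -> int:
        return len(item.split())
```

A user would then only write the subclass and call `WordCount().run(texts, conn)`. Database writes happen in the calling thread after the parallel step, so the worker function never touches the connection.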
