Validating, migrating, and converting to standard RCV tabulated formats

Home Page: https://rcvformats.readthedocs.io/en/latest/?badge=latest

License: MIT License

RCV Formats

RCV Formats helps programmers and researchers build tools that analyze the results of a Ranked-Choice Voting election without having to support the many file formats used to report RCV results.

RCV Formats converts data from several sources into a standardized format. It can be used as a Python library or from the command line.

Currently supported input formats are:

  1. The Universal RCV Tabulator JSON format
  2. The Opavote JSON format
  3. The ElectionBuddy CSV format
  4. The Dominion XLSX format
  5. The Dominion TXT format

It also supports the Dominion first-round-only XML format (used in Alaska), which contains the first rounds of several elections. Every other supported format contains the results of just one election per file.

The standardized output format is the Universal RCV Tabulator JSON. To understand this format, look at examples or the jsonschema.
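As a rough orientation, the sketch below parses a small hand-written summary and reports the leader of each round. The field names (config, results, round, tally, tallyResults, eliminated, elected, transfers) and all candidate data are assumptions made for illustration, not copied from the schema. The examples and the jsonschema remain the authority on the real layout.

```python
import json

# Hand-written, illustrative summary. Field names are assumptions
# about the general shape of the format, not taken from the schema.
summary_text = """
{
  "config": {"contest": "Example Contest", "threshold": "6"},
  "results": [
    {"round": 1,
     "tally": {"Alice": "5", "Bob": "4", "Carol": "2"},
     "tallyResults": [{"eliminated": "Carol",
                       "transfers": {"Alice": "1", "Bob": "1"}}]},
    {"round": 2,
     "tally": {"Alice": "6", "Bob": "5"},
     "tallyResults": [{"elected": "Alice", "transfers": {}}]}
  ]
}
"""


def round_leaders(summary):
    """Return (round number, leading candidate, votes) for each round."""
    leaders = []
    for rnd in summary["results"]:
        # Vote counts are stored as strings in this sketch, so convert first.
        tally = {name: float(votes) for name, votes in rnd["tally"].items()}
        leader = max(tally, key=tally.get)
        leaders.append((rnd["round"], leader, tally[leader]))
    return leaders


summary = json.loads(summary_text)
for number, leader, votes in round_leaders(summary):
    print(f"Round {number}: {leader} leads with {votes:g} votes")
```

Because every converter targets the same output, analysis code like round_leaders only needs to be written once.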

Demo

Command-line

rcvformats convert -i <input-filename> -o <output-filename>

Python

from rcvformats.conversions.automatic import AutomaticConverter

standardized_data = AutomaticConverter().convert_to_ut(input_filename)

Installation

Install the library via pip:

pip3 install rcvformats

Convert to Standardized Format

You can convert from any of the supported formats. Use this functionality to support a wide array of input data while only writing code to support a single format.

Command-line

rcvformats convert -i <input-filename> -o <output-filename>

The bash script always uses the automatic converter.

Python

from rcvformats.conversions import electionbuddy

# filename is the path to the ElectionBuddy CSV file you want to convert.
converter = electionbuddy.ElectionBuddyConverter()
try:
    converter.convert_to_ut(filename)
except Exception as e:
    print("Errors: ", e)

Valid converters are:

from rcvformats.conversions.automatic import AutomaticConverter
from rcvformats.conversions.dominion_txt import DominionTxtConverter
from rcvformats.conversions.dominion_xlsx import DominionXlsxConverter
from rcvformats.conversions.electionbuddy import ElectionBuddyConverter
from rcvformats.conversions.opavote import OpavoteConverter

The AutomaticConverter checks if the file matches any of the available schemas, and if it finds a matching schema, it runs the corresponding conversion (if a conversion is needed at all).
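The check-then-convert behaviour described here can be pictured as a small dispatch table. The sketch below is not the library's implementation: every function name, the structural checks, and the "other format" itself are hypothetical stand-ins, chosen only to show the pattern of trying each schema in turn and running the matching conversion.

```python
import json


def looks_like_universal_tabulator(data):
    """Hypothetical structural check: input is already in the target format."""
    return isinstance(data, dict) and "config" in data and "results" in data


def looks_like_other_format(data):
    """Hypothetical structural check for some other, made-up format."""
    return isinstance(data, dict) and "rounds" in data


def passthrough(data):
    # No conversion needed: the input is already standardized.
    return data


def convert_other_format(data):
    # Stand-in conversion; a real converter would map every field.
    return {"config": {"contest": data.get("title", "")},
            "results": data["rounds"]}


# Ordered (check, conversion) pairs; the first matching check wins.
DISPATCH_TABLE = [
    (looks_like_universal_tabulator, passthrough),
    (looks_like_other_format, convert_other_format),
]


def convert_automatically(data):
    """Run the conversion paired with the first schema check that matches."""
    for matches, convert in DISPATCH_TABLE:
        if matches(data):
            return convert(data)
    raise ValueError("Input does not match any known schema")


converted = convert_automatically({"title": "Mayor", "rounds": []})
print(json.dumps(converted))
```

Listing the pass-through case first mirrors the "if a conversion is needed at all" clause: an input that is already standardized is returned unchanged.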

Schema Validation

Validate that your file is supported by RCVFormats.

Validation is only on the structure of the data, not on its contents: it is possible for a validly-formatted file to still contain invalid data.
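To make the distinction concrete, here is a minimal sketch with two hand-rolled checks: one looks only at structure, the other at contents. Both functions and the record layout are hypothetical and far simpler than the real schemas; they only illustrate how a well-formed record can still carry nonsense values.

```python
def has_valid_structure(round_data):
    """Structural check only: required keys exist and have the right types."""
    return (
        isinstance(round_data, dict)
        and isinstance(round_data.get("round"), int)
        and isinstance(round_data.get("tally"), dict)
        and all(isinstance(v, str) for v in round_data["tally"].values())
    )


def has_plausible_contents(round_data):
    """Content check: every tally parses as a non-negative number."""
    try:
        return all(float(v) >= 0 for v in round_data["tally"].values())
    except ValueError:
        # At least one tally is not a number at all.
        return False


# Every key and type is as expected, yet the values make no sense.
well_formed_but_wrong = {"round": 1, "tally": {"Alice": "-3", "Bob": "many"}}

print(has_valid_structure(well_formed_but_wrong))
print(has_plausible_contents(well_formed_but_wrong))
```

A schema validator plays the role of the first function, so any checks on the contents have to be written separately.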

Command-line

rcvformats validate -i <input-filename> -s <schema-type>

Valid schema types on the command line are eb (ElectionBuddy files), ov10 (Opavote files pre-2022), ov11 (Opavote files post-2022), and ut (Universal RCV Tabulator files). Dominion files do not currently have schema validation.

Python

from rcvformats.schemas import universaltabulator

schema = universaltabulator.SchemaV0()
is_valid = schema.validate('/path/to/file.json')

if not is_valid:
  print(schema.last_error())

Valid schema validators for python are:

from rcvformats.schemas.electionbuddy import SchemaV0
from rcvformats.schemas.opavote import SchemaV1_0
from rcvformats.schemas.universaltabulator import SchemaV0

Fill in missing transfer data

Transfer data is useful to determine where votes went when a candidate was eliminated, or when a candidate was elected and had surplus votes (in STV).

If you have a file format that does not have transfer data, there are three options: you can leave it out entirely, you can assign transfers proportionally to each eliminated candidate, or you can assign only the transfers that are unambiguous. We recommend the last option, which prepares transfer data for any round that does not involve batch elimination. The second option results in fake data which cannot be relied upon for any results reporting or analyses.
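The reasoning behind the recommended option can be sketched in a few lines: when exactly one candidate is eliminated, every vote gained by the remaining candidates must have come from that candidate, so the transfers are recoverable from consecutive tallies. The function below is an illustration of that idea only, not the library's code. Its name, its arguments, and the "exhausted" label are assumptions, and it covers eliminations but not STV surplus transfers.

```python
def unambiguous_transfers(previous_tally, next_tally, eliminated):
    """Infer transfers between two consecutive rounds when possible.

    Returns {eliminated_name: {recipient: votes_gained}} if exactly one
    candidate was eliminated. Returns None for a batch elimination, because
    the split between several eliminated candidates cannot be recovered
    from the tallies alone.
    """
    if len(eliminated) != 1:
        return None

    source = eliminated[0]
    transfers = {}
    for name, votes in next_tally.items():
        if name == source:
            continue  # The eliminated candidate cannot receive votes.
        gained = votes - previous_tally.get(name, 0)
        if gained:
            transfers[name] = gained

    # Votes that reached no remaining candidate are counted as exhausted.
    exhausted = previous_tally[source] - sum(transfers.values())
    if exhausted:
        transfers["exhausted"] = exhausted
    return {source: transfers}


round_1 = {"Alice": 5, "Bob": 4, "Carol": 2}
round_2 = {"Alice": 6, "Bob": 4}

print(unambiguous_transfers(round_1, round_2, ["Carol"]))
print(unambiguous_transfers(round_1, round_2, ["Carol", "Dave"]))
```

In the batch case the function declines to answer rather than guessing. A proportional split would still produce numbers there, which is why that option yields data that cannot be relied upon.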

Command-line

rcvformats transfer -i <input-filename> -o <output-filename>

Python

from rcvformats.conversions.ut_without_transfers import UTWithoutTransfersConverter

# filename is the path to a Universal RCV Tabulator file without transfer data.
converter = UTWithoutTransfersConverter()
try:
    converter.convert_to_ut(filename)
except Exception as e:
    print("Errors: ", e)

Multi-converters

For a file that contains several elections, call DominionMultiConverter.explode_to_files(fileObject), which returns a dictionary mapping election names to NamedTemporaryFiles.

Upcoming plans

In addition to data normalization for RCV Summary formats, we would like similar functionality for cast vote records.

Running test suite

Install the test dependencies with pip3 install -r requirements-test.txt, then run pytest rcvformats/test from the root directory. To run the linter, use ./scripts/lint.sh.

rcvformats's Issues

Migrate to new RCTab format

The new RCTab format supports per-round thresholds and breakdowns of inactive votes. Use it as the new default format, and migrate all formats to the new format. This includes the previous RCTab format, meaning we'll need a converter from today's RCTab JSON to the new format.

Here is an example of the new format: https://github.com/BrightSpots/rcv/blob/develop/src/test/resources/network/brightspots/rcv/test_data/generic_csv_test/generic_csv_test_expected_summary.json

Support ES&S Data

ESS Output updated-summary-report-CD2.xls

Support conversion from ES&S format to the RCTab standardized format. Example file above.
Note: the standard format was previously called the Universal Tabulator format, but is now called the RCTab format. The codebase may not be fully up-to-date on the new name.
