
fecfile's People

Contributors

anthonydb, chadday, esonderegger, hodgesmr, jsfenfen, sblack4


fecfile's Issues

Mappings for F10 and F105

It looks like forms F10 and F105 appeared in version 5.0 and existed through version 6.3, before being discontinued in version 6.4.

These will need mappings and types.

Correctly handle form 99 text

Form 99 miscellaneous text takes the form of:

[BEGINTEXT]
Sample text
[ENDTEXT]

on lines 3-n of the filing. Currently this text gets ignored. It should be added to the text field in the top level of the returned dictionary.
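
A minimal sketch of how the text block could be pulled out, assuming the markers sit on their own lines as in the sample above. The function name `extract_f99_text` and the `result` dictionary are hypothetical and are not part of fecfile's API.

```python
def extract_f99_text(lines):
    """Return the text between [BEGINTEXT] and [ENDTEXT], or None if absent."""
    collected = []
    inside = False
    found = False
    for line in lines:
        marker = line.strip().upper()
        if marker == "[BEGINTEXT]":
            inside = True
            found = True
            continue
        if marker == "[ENDTEXT]":
            inside = False
            continue
        if inside:
            # Keep the line content but drop the trailing newline.
            collected.append(line.rstrip("\r\n"))
    return "\n".join(collected) if found else None


sample = ["[BEGINTEXT]\n", "Sample text\n", "[ENDTEXT]\n"]
# Hypothetical shape of the parsed result: text stored at the top level.
result = {"text": extract_f99_text(sample)}
```

Returning None when no block is present lets the caller tell "no text block" apart from "an empty text block".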

Where are you getting the mappings data?

Hi!

I'm working on a port of this to Rust. I'm trying to decide where to source the schema mappings. Possible options I've found are:

I found that FastFEC chose to use this repo as their upstream.

Could you explain what you see as the pros/cons of each of these?

So far, what I see are:

  • The CSVs don't seem to be complete, e.g. they don't contain a schema for F3N forms
  • The CSVs don't contain type info

Are there other considerations I'm missing?

I would love to make it so that there was one complete and accurate listing of schemas so that the wide range of parsers would not have to duplicate this effort. Any idea what would be required to make that happen?

CC @esonderegger @dwillis @freedmand

Documentation: encoding?

Hey @esonderegger I'm finally running some stuff with this library! Thanks for all your work.

One thing I noted is that some filings appear to not be UTF-8. This is external to your library, but causes it to crash and burn. Example: 1260488.fec fails with: 'utf-8' codec can't decode byte 0x92 in position 1062: invalid start byte.

This works fine if one just opens the file with the kwarg encoding="ISO-8859-1", but I don't have good metrics yet on which encoding works best across all filings. Can update with more complete stats in a bit.
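
One possible workaround, sketched under the assumption that the raw bytes of a filing are already in memory: try UTF-8 first and fall back to ISO-8859-1 only when decoding fails. The helper `decode_filing` is hypothetical and not part of fecfile. Because ISO-8859-1 maps every byte to a character, the fallback never raises, but it may not be the encoding the filer actually used.

```python
def decode_filing(raw, encodings=("utf-8", "ISO-8859-1")):
    """Decode raw filing bytes, returning (text, encoding_used)."""
    for encoding in encodings:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            # Try the next candidate encoding.
            continue
    raise ValueError("none of the candidate encodings could decode the data")


# 0x92 is not a valid UTF-8 start byte, so this falls back to ISO-8859-1.
text, used = decode_filing(b"Smith\x92s PAC")
```

Recording which encoding was used for each filing would also provide the statistics mentioned above.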

A list of forms that need mappings

This is generated by python tests.py AllFormsHaveMappings.test_request with that test uncommented. These are the missing mappings found from the first 5000 .fec files. Eventually the goal is to get through all filings without raising any FecParserMissingMappingError exceptions.

  • version 2.02 of form F6A
  • version 2.02 of form F3PA
  • version 1.02 of form H1
  • version 2.02 of form F6N
  • version 2.02 of form F3PN
  • version 2.02 of form SF25
  • version 2.02 of form H4
  • version 2.02 of form H3
  • version 2.02 of form H2
  • version 2.02 of form H1

And a list of newer missing mappings:

  • version P3.4 of form F3PT
  • version 8.3 of form F3A
  • version 8.3 of form F99
  • version 8.3 of form F3T
  • version 8.3 of form F1N
  • version P3.4 of form F6N
  • version 8.3 of form F5N
  • version P3.4 of form F1N
  • version 8.3 of form F3XA
  • version 8.3 of form F3N
  • version 8.3 of form F1MN
  • version P3.4 of form F1A
  • version 8.3 of form F3XN
  • version 8.3 of form F6N
  • version 8.3 of form F3XT
  • version 8.3 of form F1A
  • version P3.4 of form F6A
  • version P3.4 of form F3XN
  • version 8.3 of form F6A
  • version 8.3 of form F24N
  • version 8.3 of form F24A
  • version P3.4 of form F3XA
  • version 8.3 of form F3XN
  • version P3.4 of form F1M

More forms to add to the list (9/28/2018: 1250000-1259000)

  • version P3.4 of form F1M
  • version P3.4 of form F1S
  • version P3.4 of form F2A
  • version P3.4 of form F2N
  • version P3.4 of form F3LN
  • version P3.4 of form F3S
  • version P3.4 of form F5A
  • version P3.4 of form F5N
  • version P3.4 of form F65
  • version P3.4 of form F7N
  • version P3.4 of form F9N
  • version P3.4 of form SC1
  • version P3.4 of form SC2/9
  • version P3.4 of form SE
  • version P3.4 of form SF
  • version P3.4 of form H1
  • version P3.4 of form H2
  • version P3.4 of form H4

Follow redirects

if r.status_code == 404:

It looks like the code only handles the 404 case and has no condition for 3xx redirect responses.

Issue

I keep getting a FilingUnavailableError:

[ERROR] FilingUnavailableError: The requested FEC file number (FEC-1305714) is unavailable.
Traceback (most recent call last):
  File "/var/task/src/FECFileLoader.py", line 312, in lambdaHandler
    options={'filter_itemizations': [FILING_TYPE]}):
  File "/var/task/fecfile/__init__.py", line 134, in iter_http
    raise FilingUnavailableError({'file_number': file_number})

But when I curl that filing I get a 307 followed by a 200, so it seems the filing is available:

>  curl -IL http://docquery.fec.gov/dcdev/posted/1305714.fec
HTTP/1.1 307 Temporary Redirect
Cache-Control: no-cache
Content-length: 0
Location: https://docquery.fec.gov/dcdev/posted/1305714.fec

HTTP/1.1 200 OK
Server: Apache
Last-Modified: Sun, 20 Jan 2019 20:49:01 GMT
Cache-Control: max-age=21600
Expires: Wed, 02 Sep 2020 20:43:10 GMT
Content-Type: mime.types:application/fecprn
Content-Length: 705
Accept-Ranges: bytes
Date: Wed, 02 Sep 2020 14:43:10 GMT
X-Varnish: 111590184
Age: 0
X-Frame-Options: SAMEORIGIN
X-Content-Type-Options: nosniff
Strict-Transport-Security: max-age=63072000
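
A rough sketch of the redirect handling being asked for, written against an injected `fetch` callable so that it runs without network access. Everything here is an assumption for illustration: `fetch_following_redirects`, `fake_fetch`, and the `(status, headers, body)` return shape are hypothetical and do not reflect how fecfile actually issues requests.

```python
from urllib.parse import urljoin

REDIRECT_CODES = {301, 302, 303, 307, 308}


def fetch_following_redirects(fetch, url, max_redirects=5):
    """Call fetch(url) -> (status, headers, body), following 3xx responses.

    Returns (status, final_url, body) for the first non-redirect response.
    """
    for _ in range(max_redirects + 1):
        status, headers, body = fetch(url)
        if status in REDIRECT_CODES and "Location" in headers:
            # Location may be relative, so resolve it against the current URL.
            url = urljoin(url, headers["Location"])
            continue
        return status, url, body
    raise RuntimeError("too many redirects")


def fake_fetch(url):
    """Stand-in for a real HTTP call: redirects http:// to https://."""
    if url.startswith("http://"):
        return 307, {"Location": "https://" + url[len("http://"):]}, b""
    return 200, {}, b""


status, final_url, _ = fetch_following_redirects(
    fake_fetch, "http://docquery.fec.gov/dcdev/posted/1305714.fec"
)
```

With this shape, a filing would be treated as unavailable only when the final status, after redirects, is a 404.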

Missing mappings for FEC version 8.4

Describe the bug
fecfile is unable to parse some independent expenditure filings since the update for version 8.4 due to missing mappings.

To Reproduce
import fecfile
ie_itemizations = {'filter_itemizations': ['SE','F57'], 'as_strings': True}
filing = fecfile.from_http('1616360', options=ie_itemizations)
FecParserMissingMappingError: cannot parse version 8.4 of form F5N - no mapping found

Expected behavior
The above code should have loaded a dictionary.

Environment
Python 3

python setup.py install

Describe the bug
documentation bug

To Reproduce
Getting started with local dev. Clone, create and source venv, then comes the install...

>  python setup.py
usage: setup.py [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
   or: setup.py --help [cmd1 cmd2 ...]
   or: setup.py --help-commands
   or: setup.py cmd --help

Expected behavior
A clean install

Additional context
It should be python setup.py install

:)

parser should be able to ignore quotation marks

In some filings, fields are enclosed in quotation marks even though they don't need to be. That means the parser sees values like "4247.66" and says "that doesn't look like a number to me".

I think if a value that is supposed to be numeric begins and ends with " after we call strip() on it, then we should try again with value[1:-1]
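
The retry described above could look roughly like this. It is only a sketch: `parse_number` is a hypothetical helper, and float is used here without knowing which numeric type the parser really targets.

```python
def parse_number(value):
    """Parse a numeric field, retrying without surrounding double quotes."""
    cleaned = value.strip()
    try:
        return float(cleaned)
    except ValueError:
        # Retry only when the value both begins and ends with a double quote.
        if len(cleaned) >= 2 and cleaned[0] == '"' and cleaned[-1] == '"':
            return float(cleaned[1:-1])
        raise


amount = parse_number('"4247.66"')
```

Values that still fail after the quotes are removed raise ValueError as before, so values that really are non-numeric are not silently accepted.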

Feature: include HTTP status message in error response

I've been getting a lot of FilingUnavailableError exceptions that originate here:

raise FilingUnavailableError({'file_number': file_number})

This doesn't give me much information about the underlying problem. It's doubly problematic because when I curl some of these FEC files I get 200. I would really like to see more information like the HTTP status code in the error message.

[ERROR] FilingUnavailableError: The requested FEC file number (FEC-1305734) is unavailable.
Traceback (most recent call last):
  File "/var/task/src/FECFileLoader.py", line 171, in lambdaHandler
    options={'filter_itemizations': [FILING_TYPE]}):
  File "/var/task/fecfile/__init__.py", line 134, in iter_http
    raise FilingUnavailableError({'file_number': file_number})
[ERROR] FilingUnavailableEr

I'm pretty handy in Python if you're up for this PR.
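
To make the request concrete, here is a sketch of an exception that carries the HTTP status. The class `FilingUnavailableWithStatus` and its constructor arguments are hypothetical; the library's own FilingUnavailableError takes a dictionary, as the raise statement above shows, and its internals are not reproduced here.

```python
class FilingUnavailableWithStatus(Exception):
    """Hypothetical error that records why a filing could not be fetched."""

    def __init__(self, file_number, status_code=None, reason=None):
        self.file_number = file_number
        self.status_code = status_code
        self.reason = reason
        message = (
            f"The requested FEC file number (FEC-{file_number}) is unavailable."
        )
        if status_code is not None:
            message += f" HTTP status: {status_code}"
            if reason:
                message += f" {reason}"
        super().__init__(message)


err = FilingUnavailableWithStatus("1305734", 307, "Temporary Redirect")
```

Keeping the status code as an attribute, as well as in the message, would let callers branch on it without parsing the string.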
