bbd's People

Contributors

darksideofthemat, jjackson12, nprezant, pauldingus

bbd's Issues

Pytest PermissionError on Windows when unlinking during test_make_map_joins_properly

In tests/gis/test_map.py, the test_make_map_joins_properly test raises a PermissionError on Windows at Path(save_path).unlink().

Full pytest readout:
`====================================================== FAILURES =======================================================
____________________________________________ test_make_map_joins_properly _____________________________________________

def test_make_map_joins_properly():
    data = standard_data

    data_map = gis.make_map(
        shapefile_path,
        data,
        join_on="name",
    )

    geojson = data_map.data
    assert geojson["type"] == "FeatureCollection"

    features = geojson["features"]
    assert len(features) == 36

    feature0_props = features[0]["properties"]
    assert feature0_props["name"] == "Bay Colony, Port Royale"
    assert feature0_props["Demographic 1"] == "bay col dem1"
    assert feature0_props["Demographic 2"] == "bay col dem2"

    feature1_props = features[1]["properties"]
    assert feature1_props["name"] == "NE 48th"
    assert feature1_props["Demographic 1"] == "NE 48th dem1"
    assert feature1_props["Demographic 2"] == "NE 48th dem2"

    feature2_props = features[2]["properties"]
    assert feature2_props["name"] == "Linden Pointe Apartments"
    assert feature2_props["Demographic 1"] is None
    assert feature2_props["Demographic 2"] is None

    data_map.add_to(m)

    _, save_path = tempfile.mkstemp(suffix=".html")
    m.save(save_path)
  Path(save_path).unlink()

tests\gis\test_map.py:58:


self = WindowsPath('C:/Users/Matthew/AppData/Local/Temp/tmpa998mnwr.html'), missing_ok = False

def unlink(self, missing_ok=False):
    """
    Remove this file or link.
    If the path is a directory, use rmdir() instead.
    """
    if self._closed:
        self._raise_closed()
    try:
      self._accessor.unlink(self)

E PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\Users\Matthew\AppData\Local\Temp\tmpa998mnwr.html'`

This can probably be solved by replacing the tempfile.mkstemp path with pytest's tmp_path fixture, which has built-in cleanup. (Opened for posterity.)
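A likely cause, given the test body in the readout: tempfile.mkstemp returns an open OS-level file descriptor along with the path, and the test discards that descriptor (`_`) without closing it. On Windows a file with an open handle cannot be deleted, which matches WinError 32. The sketch below shows two possible fixes. The helper name `save_to_temp_and_remove` and the `save` callable are hypothetical stand-ins for the test's `m.save(save_path)` call; this is not the project's actual fix.

```python
import os
import tempfile
from pathlib import Path


def save_to_temp_and_remove(save) -> Path:
    """Call save(path) on a fresh temp file, then delete the file.

    `save` is a hypothetical stand-in for something like `m.save`.
    """
    fd, save_path = tempfile.mkstemp(suffix=".html")
    os.close(fd)  # release the handle that mkstemp leaves open
    save(save_path)
    Path(save_path).unlink()  # no open handle remains from mkstemp
    return Path(save_path)


def test_save_with_tmp_path(tmp_path):
    # pytest alternative: tmp_path is a per-test directory that pytest
    # creates and cleans up, so no manual unlink is needed.
    save_path = tmp_path / "map.html"
    save_path.write_text("<html></html>")  # stand-in for m.save(str(save_path))
    assert save_path.exists()
```

The first option keeps the existing structure of the test and only closes the descriptor; the second drops the manual cleanup entirely, which is what the issue suggests.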

Update Readme and Improve Documentation

  • Update the Readme.md to further explain the library contents (it is currently missing the geocoder module and some explanation of the gis module).

  • The project needs documentation that fully describes its contents (e.g. on readthedocs.io or a GitHub wiki).

VAN data extraction - PDF conversion

Motivation

To meet the need for extracting data that VAN exports in PDF format, we want to create a tool that takes that information and converts it into a more consumable file format (e.g. CSV, JSON) for analysis.

Workflow Example:

  • Connect to a downloaded VAN PDF that contains a data table.
  • Transform the data table in the PDF into a data frame.
  • Create a file in the user-preferred file format.

Proposal

Make a utility to extract data tables from PDFs into consumable formats (and potentially to build a database from them).
You could specify which table format to look for, which output format you want, and more.
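The output half of this proposal can be sketched with the standard library alone. The example below assumes the table has already been extracted from the PDF as a list of rows with the header first; the extraction step itself would need a third-party PDF library and is not shown. The function name `write_table` and its parameters are hypothetical, not part of bbd.

```python
import csv
import json
from pathlib import Path


def write_table(rows, out_path, fmt="csv"):
    """Write a table (header row first) to a CSV or JSON file.

    `rows` is assumed to be a list of lists already extracted from a PDF.
    """
    header, *body = rows
    out_path = Path(out_path)
    if fmt == "csv":
        with out_path.open("w", newline="", encoding="utf-8") as f:
            csv.writer(f).writerows([header, *body])
    elif fmt == "json":
        # One JSON object per data row, keyed by the header cells.
        records = [dict(zip(header, row)) for row in body]
        out_path.write_text(json.dumps(records, indent=2), encoding="utf-8")
    else:
        raise ValueError(f"unsupported format: {fmt!r}")
    return out_path
```

Keeping extraction and output separate would let the user choose the output format without touching the PDF-parsing code.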

Remove all but a subset of the shapes in a multipolygon shapefile

Motivation

A lot of shapefiles come in much bigger than you really want (e.g. for the whole country or state, but you are only interested in a specific district).

This is particularly relevant for shapefiles retrieved from the census FTP site, but it seems applicable any time you're working with shapefiles. Almost every time I want to make a map, I end up having to open the shapefile in QGIS and delete all the extra shapes that aren't relevant.

Workflow Example:

  • Get data from census API (e.g. block groups in Harris County, Texas)
  • Get shapefile from census FTP (the only shapefile available is for ALL block groups in Texas, which is way more info than we want)
  • Use this new feature to trim the shapefile to be the same "size" as the data you have, based on the GEOIDs that you got from the API call.
  • Make a map with the data and the trimmed shapefile.

Proposal

Make a utility to remove all the unwanted shapes from the shapefile.
You could specify which shapefile property to look at and a list of values to keep; every shape whose value for that property is not in the list is removed.

>>> # Example of how it might be used
>>> include_these_GEOIDs = ["1234", "5678", ...]
>>> bbd.trim_shapefile(
        original_path="...",
        join_on="GEOID",
        include=include_these_GEOIDs,
        new_path=None, # perhaps would be original_path + "_trimmed" by default
    )
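The filtering step behind such a utility can be sketched on an in-memory GeoJSON FeatureCollection, the same structure the make_map test in the first issue inspects (`"features"` with per-feature `"properties"`). Reading and writing an actual .shp file would need a GIS library and is left out. The function name `trim_features` is hypothetical and is not the proposed `bbd.trim_shapefile`.

```python
def trim_features(geojson, join_on, include):
    """Return a copy of a FeatureCollection keeping only matching features.

    A feature is kept when its `join_on` property is one of `include`.
    """
    keep = set(include)
    features = [
        feature
        for feature in geojson["features"]
        # "properties" may be missing or null in GeoJSON, so fall back to {}.
        if (feature.get("properties") or {}).get(join_on) in keep
    ]
    # Shallow copy so the original collection is left untouched.
    return {**geojson, "features": features}
```

Using a set for the included values keeps the lookup cheap even when the source file holds every block group in a state.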

GeocodeLocations fails to geocode locations after a geopy TimeOutError

RateLimiter caught an error, retrying (0/2 tries). Called with (*({'street': '521 W Voorhis Ave ', 'city': 'Deland', 'state': 'FL', 'postalcode': 32720, 'country': 'United States'},), **{}).
Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\site-packages\urllib3\connection.py", line 159, in _new_conn
    conn = connection.create_connection(
  File "C:\ProgramData\Anaconda3\lib\site-packages\urllib3\util\connection.py", line 84, in create_connection
    raise err
  File "C:\ProgramData\Anaconda3\lib\site-packages\urllib3\util\connection.py", line 74, in create_connection
    sock.connect(sa)
socket.timeout: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\site-packages\urllib3\connectionpool.py", line 670, in urlopen
    httplib_response = self._make_request(
  File "C:\ProgramData\Anaconda3\lib\site-packages\urllib3\connectionpool.py", line 381, in _make_request
    self._validate_conn(conn)
  File "C:\ProgramData\Anaconda3\lib\site-packages\urllib3\connectionpool.py", line 976, in _validate_conn
    conn.connect()
  File "C:\ProgramData\Anaconda3\lib\site-packages\urllib3\connection.py", line 308, in connect
    conn = self._new_conn()
  File "C:\ProgramData\Anaconda3\lib\site-packages\urllib3\connection.py", line 164, in _new_conn
    raise ConnectTimeoutError(
urllib3.exceptions.ConnectTimeoutError: (<urllib3.connection.HTTPSConnection object at 0x0000025C63B2CC10>, 'Connection to nominatim.openstreetmap.org timed out. (connect timeout=1)')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\site-packages\requests\adapters.py", line 439, in send
    resp = conn.urlopen(
  File "C:\ProgramData\Anaconda3\lib\site-packages\urllib3\connectionpool.py", line 752, in urlopen
    return self.urlopen(
  File "C:\ProgramData\Anaconda3\lib\site-packages\urllib3\connectionpool.py", line 752, in urlopen
    return self.urlopen(
  File "C:\ProgramData\Anaconda3\lib\site-packages\urllib3\connectionpool.py", line 724, in urlopen
    retries = retries.increment(
  File "C:\ProgramData\Anaconda3\lib\site-packages\urllib3\util\retry.py", line 439, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='nominatim.openstreetmap.org', port=443): Max retries exceeded with url: /search?street=521+W+Voorhis+Ave+&city=Deland&state=FL&postalcode=32720&country=United+States&format=json&limit=1 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x0000025C63B2CC10>, 'Connection to nominatim.openstreetmap.org timed out. (connect timeout=1)'))

We need to catch these errors so that the request is retried until it no longer times out. Otherwise rows of information are lost at random, and the run has to be restarted to fill them back in.
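One way to approach this is a small retry wrapper around the geocoding call. The sketch below is library-agnostic: `call_with_retries` and its parameters are hypothetical, and the exception types to retry on are passed in by the caller (for geopy that would be its timeout and service-unavailable exceptions). The log above shows `connect timeout=1` and `0/2 tries`, so raising the `timeout` of the geopy geocoder and the `max_retries` of its RateLimiter may also be worth checking before adding custom code.

```python
import time


def call_with_retries(func, *args, retry_on=(TimeoutError,), max_tries=5,
                      wait_seconds=1.0, **kwargs):
    """Call func(*args, **kwargs) until it succeeds or max_tries is reached."""
    for attempt in range(1, max_tries + 1):
        try:
            return func(*args, **kwargs)
        except retry_on:
            if attempt == max_tries:
                # Give up loudly instead of silently dropping the row.
                raise
            # Linear backoff: wait a little longer after each failure.
            time.sleep(wait_seconds * attempt)
```

Re-raising after the last attempt means a persistent outage stops the run with a clear error rather than leaving unexplained gaps in the output.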

Geocoding

Motivation

It is often desirable to be able to transform addresses in VAN/Census data into latitude and longitude coordinates, thereby allowing a user to map voter information on top of geographical info.

Workflow Example:

  • Input a dataset with address information as a column
  • Transform the address column into latitude and longitude coordinates
  • Overlay the latitude and longitude coordinates on top of a map (with the ability to label the coordinates according to other desired information, such as democrat/republican info)

Proposal

Create functionality that geocodes a specified address column into latitude and longitude coordinates, which can then be overlaid onto a map to visualize voters in a given district/area.
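The transform step of this workflow can be sketched independently of any particular geocoding service. In the example below, `add_coordinates` is a hypothetical name, rows are plain dictionaries, and `geocode` is any callable that takes an address string and returns a `(latitude, longitude)` pair or `None`; wiring in a real geocoder is left to the caller.

```python
def add_coordinates(rows, address_key, geocode):
    """Return copies of rows with 'latitude' and 'longitude' keys added.

    `geocode` is assumed to return (lat, lon) or None for an address string.
    """
    geocoded = []
    for row in rows:
        coords = geocode(row[address_key])
        # Keep rows that could not be geocoded, but mark them with None
        # so they are not silently dropped.
        lat, lon = coords if coords is not None else (None, None)
        geocoded.append({**row, "latitude": lat, "longitude": lon})
    return geocoded
```

Keeping the other columns on each row is what allows the later map overlay to label points by any of them.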

Making calls to OpenFEC API

Motivation

The FEC tracks a ton of data about contributions to campaigns and political committees. While it's illegal to use this data for certain purposes (e.g., solicitation of funds), it can be useful for other purposes. The simplest way that I can find to access FEC data is through the OpenFEC API.

Workflow Example

  • Find all contributions made by a certain individual in a given election cycle
  • Determine whether that individual donates to both parties
  • Determine the highest amount that that individual has given to a senate campaign
  • Etc.

Proposal

This will be pretty similar to Noah's census data tool, but adapted to the various functionalities of OpenFEC.
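As a rough sketch of the first and third workflow items, the example below builds a request URL for an individual's contributions and picks the largest amount from a list of result records. The base URL, the `schedules/schedule_a` endpoint, the query parameter names, and the `contribution_receipt_amount` field are assumptions here and should be checked against the OpenFEC documentation; both function names are hypothetical. No request is sent.

```python
from urllib.parse import urlencode

# Assumed base URL; verify against the OpenFEC documentation.
BASE_URL = "https://api.open.fec.gov/v1"


def build_contributions_url(api_key, contributor_name, cycle):
    """Build a URL for one contributor's itemized receipts in a cycle."""
    params = {
        "api_key": api_key,
        "contributor_name": contributor_name,    # assumed parameter name
        "two_year_transaction_period": cycle,    # assumed parameter name
    }
    return f"{BASE_URL}/schedules/schedule_a/?{urlencode(params)}"


def highest_contribution(records, amount_key="contribution_receipt_amount"):
    """Return the largest amount among result records, or None if empty."""
    amounts = [r[amount_key] for r in records if r.get(amount_key) is not None]
    return max(amounts) if amounts else None
```

Separating URL construction from the analysis helpers keeps the latter testable without an API key or network access.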
