
gtfs-collector's People

Contributors

znmeb

gtfs-collector's Issues

Limit 1.0.0 scope to real-time data collector

The only piece of the puzzle that needs to be operating continuously on a server in the cloud is collecting the real-time data as it's published. Everything else can be done on the desktop or another server. This also reduces the space requirement dramatically.

gtfsrdb mysterious Python crashes

Describe the bug
Executing gtfsrdb sometimes works and sometimes doesn't. It looks like an environment issue.

To Reproduce
Steps to reproduce the behavior:
Clone the repo and execute it. The error has something to do with the order in which it creates tables. If it tries to create a table with a foreign key before the table that key references has been created, it crashes.

Expected behavior
Collecting data begins.

Additional context
It usually works with the native Python on Arch. But it needs to run in containers.
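If the crash really does come from table creation order, one way to make that order deterministic is to sort the tables by their foreign-key dependencies before creating them. The following is a minimal sketch of that idea using only the Python standard library; the table names and the dependency map are hypothetical, not taken from gtfsrdb's actual schema.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical schema: each table maps to the set of tables that its
# foreign keys reference. These names are illustrative assumptions only.
FK_DEPENDENCIES = {
    "trip_updates": set(),
    "stop_time_updates": {"trip_updates"},
    "alerts": set(),
    "entity_selectors": {"alerts"},
}


def creation_order(dependencies):
    """Return table names ordered so that referenced tables come first."""
    return list(TopologicalSorter(dependencies).static_order())


order = creation_order(FK_DEPENDENCIES)
print(order)
```

Creating tables in this order means a referenced table always exists before any table that points at it, regardless of the environment the script runs in.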

Explore Docker hosting options

Ideally, there would be something that could work in a "forever free" mode. The biggest hurdle is likely to be disk space; most free-tier services offer only small SSDs.

Tools for managing disk space for the real-time data collection

The intended use case is that the user starts the container collecting data, then accesses it via a PostgreSQL client library from R, Python, Julia, or some other data science language. Since collection is supposed to be continuous (we wouldn't need a server otherwise), there needs to be a way to periodically back up the database and truncate it, or some kind of round-robin scheme with roll-ups, perhaps daily or weekly. Monthly seems too long - weekly feels right to me.
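A weekly roll-up could be sketched roughly as follows: label each batch of collected data with its ISO week, then pick out the weeks that fall outside a retention window so they can be backed up and truncated. The function names, the label format, and the `keep_weeks` parameter are assumptions made for illustration; nothing here touches the database itself.

```python
from datetime import date, timedelta


def week_label(day):
    """Return an ISO year-week label such as '2020-W07' for a date."""
    iso_year, iso_week, _ = day.isocalendar()
    return f"{iso_year}-W{iso_week:02d}"


def weeks_to_archive(labels, today, keep_weeks=1):
    """Return the weekly labels that are older than the retention window.

    The current week and the previous `keep_weeks` weeks are kept.
    Zero-padded labels sort correctly as plain strings.
    """
    cutoff = week_label(today - timedelta(weeks=keep_weeks))
    return sorted(label for label in labels if label < cutoff)


existing = ["2020-W07", "2020-W06", "2020-W05", "2020-W04"]
stale = weeks_to_archive(existing, date(2020, 2, 12))
print(stale)
```

Each label returned would be a candidate for a backup followed by a truncate or drop of that week's data.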

Rebuild using R!!

There's a lot of good transit code out there in R, and it would relieve me of having to troubleshoot other people's Python code. And it would make a cool workshop for R and TriMet nerds in Portland. ;-)

Tools to configure and start the real-time data collection

The intended use case is that a user deploys this container, tells it the URLs for the GTFS real-time feed and any authentication secrets required to connect. Then the container starts collecting data into the PostgreSQL database.
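One common way to pass a feed URL and secrets into a container is through environment variables. The sketch below shows what reading that configuration might look like; the variable names (`GTFS_RT_FEED_URL`, `GTFS_RT_API_KEY`, `GTFS_RT_POLL_SECONDS`) and the `CollectorConfig` class are hypothetical and not part of this project.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CollectorConfig:
    """Hypothetical settings the collector needs before it can start."""
    feed_url: str
    api_key: Optional[str]
    poll_seconds: int


def load_config(env=None):
    """Build a CollectorConfig from environment variables.

    `env` defaults to os.environ; passing a plain dict makes testing easy.
    """
    env = os.environ if env is None else env
    feed_url = env.get("GTFS_RT_FEED_URL", "").strip()
    if not feed_url:
        raise ValueError("GTFS_RT_FEED_URL must be set")
    return CollectorConfig(
        feed_url=feed_url,
        api_key=env.get("GTFS_RT_API_KEY") or None,  # empty string counts as unset
        poll_seconds=int(env.get("GTFS_RT_POLL_SECONDS", "30")),
    )


config = load_config({"GTFS_RT_FEED_URL": "https://example.com/feed"})
print(config)
```

Failing fast when the feed URL is missing keeps a misconfigured container from running silently without collecting anything.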

Create a Windows 10 host-side version

Not sure where to prioritize this, but there needs to be a bare metal version at least to create the database populated by gtfsdb. I'm building the Linux version with Conda so it will be Windows-ready.

Memory killer??

On a DigitalOcean mini-droplet with 1 GB of RAM, the gtfsdb (non-real-time) collector crashes with "Killed". It runs on the workstation, where there appears to be a steady-state requirement of 2.5 GB and a spike over 5 GB! So the plan for deployment is to just collect real-time data.
