Comments (6)
Hi @mpadge. Quick question: why did you choose to use fst instead of parquet?
from m4ra.
Hiya Rafa! I chose it just because it's more lightweight, and arrow is overkill for this workflow. Everything here is internally indexed by city name; arrow would enable city to be used as an index, but then you'd lose oversight of what the individual files were. Does that make sense?
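The one-file-per-city layout described above could look roughly like the following. This is only a sketch in base R: the `city_cache_path()` helper, the `m4ra-` prefix and the `.Rds` extension are assumptions made for illustration, not m4ra's actual naming scheme, and `saveRDS` is used only so the example runs without extra packages.

```r
# Hypothetical helper: build a cache file path from a city name.
# The "m4ra-" prefix and ".Rds" extension are illustrative assumptions.
city_cache_path <- function(city, what, cache_dir = tempdir()) {
    city <- gsub("[^a-z0-9]+", "-", tolower(city))
    file.path(cache_dir, paste0("m4ra-", city, "-", what, ".Rds"))
}

cache_dir <- file.path(tempdir(), "m4ra-cache-demo")
dir.create(cache_dir, showWarnings = FALSE)

# Each city gets its own file, so a directory listing shows what is cached.
for (city in c("Berlin", "Sao Paulo")) {
    saveRDS(data.frame(x = 1:3), city_cache_path(city, "times", cache_dir))
}
cached <- list.files(cache_dir, pattern = "^m4ra-.*\\.Rds$")
print(cached)
```

With separate files, a plain directory listing is enough to see which cities have been cached, which is the kind of oversight a single city-indexed dataset would hide.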
Yes, it makes perfect sense! Thanks for clarifying! I'm following your work on m4ra with great interest!
Have to re-open because fst still produces unreliable values. The package is clearly going to have to be dumped here, because the unreliability breaks all analyses. It seems that the row-wise ordering in different columns is randomly rearranged, so that the original rows are not recovered; instead, rows contain data mixed in from other rows.
Can't use arrow, because even with the current dev version, it errors with:
Error in rawToChar(out): long vectors not supported yet: raw.c:68
That's on a data frame with 6.5 million rows and 21 columns. I guess arrow still primarily envisions numerical data ... so it's back to plain old saveRDS. That will slow these routines down somewhat, but at least everything will once again be reliable.
Confirmed that full accuracy has been regained, and load times, even for the data.frame above with 6.5M rows, are still well under 1 minute, which is okay.
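A check of this kind can be sketched in base R as a write/read round trip compared with `identical()`, timed with `system.time()`. The `check_round_trip()` helper and the small test data frame are hypothetical stand-ins for illustration, not the code actually used in m4ra.

```r
# Hypothetical helper: write a data frame with saveRDS, read it back, and
# check that the result is identical to the original (values and row order).
check_round_trip <- function(x, path = tempfile(fileext = ".Rds")) {
    on.exit(unlink(path))
    write_time <- system.time(saveRDS(x, path))[["elapsed"]]
    read_time <- system.time(y <- readRDS(path))[["elapsed"]]
    list(
        identical = identical(x, y),
        write_seconds = write_time,
        read_seconds = read_time
    )
}

# Small mixed-type data frame standing in for the real 6.5M-row table.
set.seed(1)
n <- 1e4
dat <- data.frame(
    id = seq_len(n),
    city = sample(c("a", "b", "c"), n, replace = TRUE),
    value = runif(n),
    stringsAsFactors = FALSE
)
res <- check_round_trip(dat)
print(res$identical)
```

Comparing the whole object, rather than summary statistics per column, is what would catch the kind of row scrambling reported earlier in this thread, since column-wise totals can survive a reordering of rows.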