Comments (8)
Streaming parsers are hard enough on their own - imagine trying to write streaming readers + writers for 5 file types.
:^)
You can find the full file for e.g. Baden-Württemberg here. They offer data in the only two capable data formats this world has come up with, semicolon-separated comma-separated values and XLSX. You can follow the "csv Datei" link (or click here directly).
The query is in my last comment. It just takes the JSON array, filters the entries down to those in a specific municipality, and then trims each JSON object to the relevant fields.
from dasel.
Hey @I-Al-Istannen,
Just for your information, I got some time to dig into this. This is possible without hitting the panic you experienced:
./dasel -r csv -f <(iconv -f ISO-8859-1 -t UTF-8 ~/Downloads/file.csv) --csv-comma ';' -w json 'all().filterOr(equal(Gemeinde,Stuttgart),equal(Gemeinde,Mahlstetten)).mapOf(Gemeinde,Gemeinde,Haltestelle,Haltestelle,Haltestelle_lang,Haltestelle_lang,globaleID,globaleID)'
6.79s user 0.10s system 164% cpu 4.185 total
This shows just the fields:
- Gemeinde
- Haltestelle
- Haltestelle_lang
- globaleID
For any stations in:
- Stuttgart
- Mahlstetten
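For readers who do not have dasel at hand, the same filter-then-project step can be sketched in plain Python. This is only an illustration of what the query computes, not how dasel implements it. The sample rows and their IDs are made up, and the function name `select_stops` is hypothetical. The real export would be opened with `encoding="iso-8859-1"` instead of being read from a string.

```python
import csv
import io
import json

# Hypothetical sample rows; the real export has many more columns and rows.
SAMPLE = """Gemeinde;Haltestelle;Haltestelle_lang;globaleID;Landkreis
Stuttgart;Hauptbahnhof;Stuttgart Hauptbahnhof;example:1;Stuttgart
Mahlstetten;Rathaus;Mahlstetten Rathaus;example:2;Tuttlingen
Ulm;Hauptbahnhof;Ulm Hauptbahnhof;example:3;Ulm
"""

FIELDS = ("Gemeinde", "Haltestelle", "Haltestelle_lang", "globaleID")
MUNICIPALITIES = {"Stuttgart", "Mahlstetten"}


def select_stops(source, municipalities=MUNICIPALITIES, fields=FIELDS):
    """Keep rows whose Gemeinde matches, reduced to the listed fields."""
    reader = csv.DictReader(source, delimiter=";")
    return [
        {name: row[name] for name in fields}
        for row in reader
        if row["Gemeinde"] in municipalities
    ]


stops = select_stops(io.StringIO(SAMPLE))
print(json.dumps(stops, ensure_ascii=False, indent=2))
```

The set membership test plays the role of `filterOr(equal(...),equal(...))`, and the dict comprehension plays the role of `mapOf(...)`.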
I have also merged the panic fix to master.
from dasel.
Hey @I-Al-Istannen,
Thanks for raising this. I'll do a bit of digging.
Out of interest, what were you actually trying to get from the CSV?
from dasel.
It seems this panic could occur when appending to a list of objects. I have fixed it in #393; however, the fix requires deleting some code that I still need to verify is OK to remove.
from dasel.
Thanks for raising this. I'll do a bit of digging.
Thanks :)
Out of interest, what were you actually trying to get from the CSV?
I was parsing stop data for the German train network, which of course only offers data in a large CSV file :P The full command for finding stops I was interested in looked something like this:
dasel -r csv -f <(iconv -f ISO-8859-1 -t UTF-8 haltestellen.csv) --csv-comma ';' -w json | jq 'map(select(.Gemeinde | contains("Name"))) | map({Gemeinde, Haltestelle, Haltestelle_lang, globaleID})'
I was trying to do the select with dasel first, but it was sufficiently different from jq that I didn't manage to cook something up quickly, and then I ran into the crash. dasel+jq worked just fine. The above command takes around 7 seconds, with dasel taking up 5 of them, but I don't really care about the performance :D
from dasel.
Yeah, dasel unfortunately loads the entire file into memory because of the change in file type. Streaming parsers are hard enough on their own - imagine trying to write streaming readers + writers for 5 file types.
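To make the trade-off concrete, here is a minimal sketch of what streaming looks like for just one pair of formats, CSV in and a JSON array out. It is not taken from dasel and says nothing about its internals. The function name `stream_csv_to_json` and the two-row input are hypothetical.

```python
import csv
import io
import json


def stream_csv_to_json(source, sink, delimiter=";"):
    """Write CSV rows to sink as a JSON array, one row at a time.

    Only the current row is held in memory. Returns the number of rows written.
    """
    reader = csv.DictReader(source, delimiter=delimiter)
    sink.write("[")
    count = 0
    for row in reader:
        if count:
            sink.write(",")
        sink.write("\n  " + json.dumps(row, ensure_ascii=False))
        count += 1
    sink.write("\n]\n" if count else "]\n")
    return count


# Hypothetical two-row input standing in for the large export.
source = io.StringIO("Gemeinde;Haltestelle\nStuttgart;Hauptbahnhof\nUlm;Rathaus\n")
sink = io.StringIO()
rows_written = stream_csv_to_json(source, sink)
print(sink.getvalue())
```

This only works so simply because each CSV row maps to exactly one array element. A general tool would need a reader and a writer like this for every supported format, plus selectors that can operate on a partial document, which is the effort described above.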
Performance aside, I'd be interested in a subset of your file that I can create some tests against to:
A: be sure no panic will occur
B: Potentially help you with your query without the need for jq (I'm always looking for ways to increase the general usefulness of dasel)
from dasel.
Ah, I see. You also need to duplicate all the fields in the map, as it is key,value,key,value,…. I was apparently trying a more jq-like approach for the filter, 'equal([].Landkreis, "Sigmaringen")', but that already maps the array values, so you do not need to further deal with the outer array here.
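Until a neater form exists, the duplicated key,value list can be generated rather than typed by hand. The helper below is a hypothetical sketch that only assembles the selector string in the shape used by the command earlier in this thread. It does not call dasel, and it assumes field and municipality names need no quoting or escaping.

```python
def map_of(fields):
    """Build a mapOf(...) call that repeats each field as key and value."""
    pairs = [part for name in fields for part in (name, name)]
    return "mapOf(" + ",".join(pairs) + ")"


def stops_selector(municipalities, fields):
    """Assemble the selector shape used in the command earlier in this thread."""
    conditions = ",".join(f"equal(Gemeinde,{name})" for name in municipalities)
    return f"all().filterOr({conditions}).{map_of(fields)}"


selector = stops_selector(
    ["Stuttgart", "Mahlstetten"],
    ["Gemeinde", "Haltestelle", "Haltestelle_lang", "globaleID"],
)
print(selector)
```

The resulting string can then be passed to dasel as the last argument in place of the hand-written selector.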
from dasel.
That's right - I want a neater version of that map function for exactly this purpose. It works for now, though.
from dasel.