Comments (7)
Are you using the command-line utility, or serd as a library?
For code, it should be relatively easy to hook up serd to whatever compression library using custom read and write functions. Since serd is a lightweight library with no external dependencies, I don't think it's appropriate to add dependencies for this (and don't want to step on the feature creep treadmill of whatever archive format somebody wants this week). If the API makes this too difficult (there's a ton of archive libraries I have never tried), the reasons why should be addressed. I have already revised that heavily in the upcoming major version (serd1 branch) and imagine it should hook up nicely to more or less anything, but it'd be good to double-check various popular libraries before committing to the API.
That said, I'd be more open to adding it to the command-line utilities since that should be easy to make optional and doesn't add a dependency to the library itself. That code would also serve as an example to steal for other programs/libraries that want to do it. On the other hand, there you could just set up a pipeline...
from serd.
Just using serdi at the command line to convert to N-Triples right now, so I can examine the output of serd. Agreed, I was thinking a compile-time option to add support to serdi might be reasonable. I understand the goal of lightweight with no external dependencies (I like that too).
My next step is to try using the library (and I could add decompression support, understood). Agreed, examples are very helpful; it looks like the code for serdi itself might be the best example to start with? I hope to read any one of the supported formats (but compressed), do some minimal processing/filtering of the triples as they fly by, and then store subsets in a few separate files. Simple use, but it needs to be performant and streaming, hence my interest in this library.
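For illustration, the filter-and-split step described above could look roughly like the following Python sketch. It does not use serd at all: it assumes the input has already been converted to N-Triples (for example with serdi) and gzip-compressed, with whitespace between terms, so each line is one triple and no real parser is needed. The function name `split_ntriples` and the `routes` mapping are hypothetical names made up for this example.

```python
import gzip
from contextlib import ExitStack


def split_ntriples(gz_path, routes):
    """Stream a gzip-compressed N-Triples file into per-predicate subset files.

    routes maps an output file path to a predicate IRI (without angle brackets).
    Returns a dict mapping each output path to the number of lines written.
    """
    counts = {path: 0 for path in routes}
    with ExitStack() as stack:
        source = stack.enter_context(gzip.open(gz_path, "rt", encoding="utf-8"))
        outputs = {
            path: stack.enter_context(open(path, "w", encoding="utf-8"))
            for path in routes
        }
        for line in source:
            stripped = line.strip()
            if not stripped or stripped.startswith("#"):
                continue  # Skip blank lines and comments.
            # Assumption: terms are whitespace-separated, so the first two
            # fields are the subject and predicate (neither contains spaces).
            fields = stripped.split(None, 2)
            if len(fields) < 3:
                continue  # Not a complete triple; ignore it in this sketch.
            predicate = fields[1]
            for path, iri in routes.items():
                if predicate == f"<{iri}>":
                    outputs[path].write(line)
                    counts[path] += 1
    return counts
```

Only one line is held in memory at a time, so this streams regardless of file size. Reading Turtle or TriG directly would still need a real parser such as serd in front of it.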
from serd.
I see. For things like that, if you want to dig into the code, you might want to start with the aforementioned serd1 branch, even though it's not out yet. There is a lot more there around processing streams (including a utility specifically for filtering) and the API is quite a bit friendlier and more polished. You can do it with the current stable branch too though, there's always been facilities for custom functions. Unfortunately there's not much example-based documentation (niche within a niche here, never seemed worth my time), but you should be able to figure it out from serdi or just poking through serd.h.
If you're a Python fan, I'm working on Python bindings in the serd1 branch as well. They're not quite done yet though (I think the current tip doesn't even build, bit of a mess right now). Earlier WIP of the documentation here, for example: https://drobilla.net/files/pyserd_docs/ . I hope to finish this stuff up shortly, but have a lot of balls in the air right now... if you're interested in this I can ping this issue when they're ready(ish), feedback would be helpful.
As for the issue at hand, we can ponder whether built-in support is worth it for convenience, but you can always just throw some UNIX at the problem, e.g.
zcat mydbdump.ttl.gz | serdi -
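If the downstream processing lives in a script rather than a shell, the same pipeline can be driven from Python using only the standard library. This is a minimal sketch, not part of serd: the helper name `stream_gzip_through` is made up here, and it assumes the child command reads its input from stdin and writes line-oriented text to stdout (as `serdi -` does when emitting N-Triples).

```python
import gzip
import shutil
import subprocess
import threading


def stream_gzip_through(command, gz_path):
    """Decompress gz_path into command's stdin and yield its output lines."""
    proc = subprocess.Popen(command, stdin=subprocess.PIPE, stdout=subprocess.PIPE)

    def feed():
        # Runs in a separate thread so that reading stdout below cannot
        # deadlock against a full stdin pipe.
        try:
            with gzip.open(gz_path, "rb") as source:
                shutil.copyfileobj(source, proc.stdin)
        except BrokenPipeError:
            pass  # The child exited early, e.g. after a syntax error.
        finally:
            try:
                proc.stdin.close()  # Signals end of input to the child.
            except BrokenPipeError:
                pass

    feeder = threading.Thread(target=feed)
    feeder.start()
    try:
        for line in proc.stdout:
            yield line.decode("utf-8")
    finally:
        proc.stdout.close()
        feeder.join()
        status = proc.wait()
    if status != 0:
        raise RuntimeError(f"{command[0]} exited with status {status}")
```

With serdi installed, iterating over `stream_gzip_through(["serdi", "-"], "mydbdump.ttl.gz")` should yield the same lines as the shell pipeline above, one at a time.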
from serd.
I'm fine with C (likely faster), but Python is fine if it's easier to use and nearly as fast. I use any language as needed; if I had to pick a language I'd identify as a K&R C fan. Yes, I am interested to hear when serd1 is done. One question (maybe this is a can of worms?): why do you use a dash for stdin, instead of just reading stdin when no file is given like common file commands do, e.g. cat, sort, uniq, cut, etc.? If I'm not mistaken, using a dash is a niche behavior used only by a select few apps. It feels unnatural to need to add a dash parameter when piping data to serdi... I keep forgetting serdi needs it...
from serd.
Okay, I was just guessing from the Python libraries comment.
The - thing is a pretty universal convention for tools that are usually used with file inputs (which are friendlier in this case because then a base URI and syntax can be determined), but I suppose it could perhaps work without. In any case, please open separate tickets for unrelated issues to keep the tracker on point.
from serd.
You can close this ticket. Thank you for the notes; agreed, this is not core, and there are more important things to work on.
from serd.
Okay. I will keep it around for now as a reminder, since I would like to make sure that at least, for example, it's easy to wire up libarchive to the read/write APIs.
I probably won't add support to the tools themselves for initial release (I'm really struggling to finally get this out, so non-API-affecting feature creep in general is out), but it should be easy enough to add as a feature in a minor release.
from serd.