Comments (8)
Understanding the GZ format specification (https://tools.ietf.org/html/rfc1952#page-6) and turning it into Julia for the registry has been a very painful process.
https://gist.github.com/hpoit/007ace3657b94fffd68530464a7d5ad9
If anyone knows of existing code that already does this, please let me know! Otherwise I'll keep working on my gist.
About the magic number, I'm not sure whether this is true:
"There is a magic number at the beginning of the file. Just read the first two bytes and check if they are equal to 0x1f8b."
From the spec:
"ID1 (IDentification 1)
ID2 (IDentification 2)
These have the fixed values ID1 = 31 (0x1f, \037), ID2 = 139
(0x8b, \213), to identify the file as being in gzip format."
I believe this is what I have to look for in io to detect GZ.
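To check the two statements above against each other, here is a minimal sketch. It assumes a recent Julia (1.x) and uses a hypothetical in-memory buffer in place of a real file; the third and fourth bytes are placeholder header bytes, not something taken from this thread. It suggests that the "0x1f8b" shortcut only holds if the two bytes are interpreted in big-endian order, which is why comparing the bytes one at a time is the safer reading of the spec.

```julia
# Fixed values quoted from RFC 1952: ID1 = 31 (0x1f, \037), ID2 = 139 (0x8b, \213)
const ID1 = 0x1f
const ID2 = 0x8b

# The decimal, hexadecimal and octal spellings in the spec agree
@assert ID1 == 31 == 0o37
@assert ID2 == 139 == 0o213

# Hypothetical stand-in for the start of a .gz file (bytes 3 and 4 are placeholders)
header = IOBuffer(UInt8[ID1, ID2, 0x08, 0x00])

# read(io, UInt16) uses the host byte order; ntoh reinterprets the value as
# big-endian ("network order"), so the result is the same on any machine
word = ntoh(read(header, UInt16))
@assert word == 0x1f8b
```

On a little-endian machine, reading the UInt16 without `ntoh` would give 0x8b1f instead, so "equal to 0x1f8b" is only true under that byte-order assumption.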
from fileio.jl.
Wait, what?
julia> using DataFrames
julia> airline_df = readtable("/Users/randyzwitch/airline/1987.csv.gz");
julia> size(airline_df)
(1311826,29)
julia> typeof(airline_df)
DataFrame (use methods(DataFrame) to see constructors)
I think I just have to see how readtable identifies it, if it does.
# (2) Path is GZip file
elseif endswith(pathname, ".gz")
    io = gzopen(pathname, "r")
    nbytes = 2 * filesize(pathname)
Hi @SimonDanisch, would this pass for the registry as a way of identifying a file as GZIP?
@SimonDanisch don't worry about it, I actually tried
julia> df = readtable("/file.csv.gz")
and it takes forever. Back to binaries.
Okay, I think I've got the right question now: how do I read the first two bytes of a file to check whether they are equal to 0x1f and 0x8b? That's what I'm trying to answer.
If the first two bytes are equal to those, then the file is in GZIP format.
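One way to answer that question is sketched below. It is only an illustration, not FileIO's own detection code: the names `GZIP_MAGIC` and `is_gzip` are made up for this example, and it assumes Julia 1.x, where `read(io, 2)` returns up to two bytes as a `Vector{UInt8}`.

```julia
# Hypothetical helper names; the byte values are the ID1/ID2 pair from RFC 1952
const GZIP_MAGIC = UInt8[0x1f, 0x8b]

# Check whether a stream starts with the two gzip magic bytes,
# then put the stream back where it was.
function is_gzip(io::IO)
    mark(io)                      # remember the current position
    try
        header = read(io, 2)      # at most 2 bytes; shorter if the stream ends early
        return header == GZIP_MAGIC
    finally
        reset(io)                 # rewind to the marked position
    end
end

# Convenience method for a file path: open the file, run the check, close it
is_gzip(path::AbstractString) = open(is_gzip, path)

# Usage with in-memory data standing in for real files
@assert is_gzip(IOBuffer(UInt8[0x1f, 0x8b, 0x08]))
@assert !is_gzip(IOBuffer(UInt8[0x50, 0x4b, 0x03]))
```

Comparing the bytes individually, rather than as one 16-bit number, avoids any dependence on the machine's byte order.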
Hi @SimonDanisch, can I PR this? Is it correct?
add_format(format"GZIP", [0x1f, 0x8b], ".gz", [:Libz])
Let me test it.
Hi @SimonDanisch, here is the test run. I'm not sure whether I passed it:
julia> Pkg.test("FileIO")
INFO: Testing FileIO
these tests will print warnings:
Library "NotInstalled" is not installed but is recommended as a library to load format: ".not_installed"
Should we install "NotInstalled" for you? (y/n):
Library "NotInstalled" is not installed but is recommended as a library to load format: ".not_installed"
Should we install "NotInstalled" for you? (y/n):
invalid is not a valid choice. Try typing y or n
Should we install "NotInstalled" for you? (y/n):
this test will print warnings:
Errors encountered while loading "test.multierr".
All errors:
ErrorException("1")
ErrorException("2")
Fatal error:
Test Summary: | Pass  Total
FileIO        |  108    108
INFO: FileIO tests passed
julia>
Should I just PR for you to see what I've done?
A PR would be great!