Comments (4)
Just below they suggest namespacing the value passed there: https://arrow.apache.org/docs/format/Columnar.html#extension-types. That bit reads to me like it is not a placeholder; rather, the customization is in the value (not the key).
from arrow-julia.
But the metadata is a dict, so the namespacing they suggest would be pointless if only applied to values since they will be overwritten.
from arrow-julia.
Maybe we can check against another implementation by seeing the metadata produced by an extension type there, e.g. following https://arrow.apache.org/docs/python/generated/pyarrow.ExtensionType.html#pyarrow.ExtensionType.extension_name.
It doesn't seem clear whether a column can have more than one extension type, though. It could be that there's only one key on purpose, so that different implementations can share that key to define an extension to the arrow spec overall (e.g. if we all agree what a `foo` is, we define an extension name for that, serialize that metadata, and then read it in as a `foo` when possible). That might mean your suggestion is that arrow-julia shouldn't be using "extension types" for metadata that is only used by that implementation, and should use other keys for that.
Would appreciate any feedback from someone who understands the spec better.
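To make the shared-key reading concrete, here is a minimal sketch in plain Python (no Arrow library involved). It assumes the single key is `ARROW:extension:name`, as given in the columnar format document linked above. The `Foo` type, the `myorg.foo` and `otherorg.bar` names, the `REGISTRY` dict and the `read_column` helper are all hypothetical, made up for illustration only.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

# Metadata key the columnar format reserves for the extension type name.
EXT_NAME_KEY = "ARROW:extension:name"


@dataclass
class Foo:
    """Hypothetical type that several implementations agree on."""
    value: int


# Hypothetical registry: namespaced extension name -> converter from storage values.
REGISTRY: Dict[str, Callable[[Any], Any]] = {"myorg.foo": Foo}


def read_column(storage: list, metadata: dict) -> list:
    """Convert storage values if the extension name is known, else return them as-is."""
    convert = REGISTRY.get(metadata.get(EXT_NAME_KEY))
    if convert is None:
        # Unknown or missing extension: fall back to the plain storage values.
        return list(storage)
    return [convert(x) for x in storage]


known = read_column([1, 2], {EXT_NAME_KEY: "myorg.foo"})
unknown = read_column([1, 2], {EXT_NAME_KEY: "otherorg.bar"})
```

Under this reading the key stays fixed and the namespaced value decides how a reader interprets the column. A reader that does not recognize the name still gets the storage values.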
from arrow-julia.
After searching through the arrow repo for `arrow:extension`, it seems like you might be right. Here it is defined as a const in the cpp code, for example, and I could not find any trace of it being manipulated or changed anywhere (which would be quite a strange thing to do as well).
The c code seems to accept the metadata as a vector of pairs though, so it would in theory allow for multiple identical keys, but Python, Java and Julia use dicts, so there is no way to express that through any of them.
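A small sketch of that difference in plain Python, with no Arrow library involved. The extension names and the `note` key are hypothetical, and `group_pairs` is just an illustrative helper, not something any implementation provides.

```python
from collections import defaultdict

# A vector of pairs can carry the same key twice.
pairs = [
    ("ARROW:extension:name", "myorg.foo"),      # hypothetical extension name
    ("ARROW:extension:name", "otherorg.bar"),   # hypothetical extension name
    ("note", "some other metadata"),            # hypothetical extra key
]

# Converting to a dict keeps only the last value seen for each key.
as_dict = dict(pairs)


def group_pairs(items):
    """Collect every value per key, preserving duplicates in order."""
    grouped = defaultdict(list)
    for key, value in items:
        grouped[key].append(value)
    return dict(grouped)


grouped = group_pairs(pairs)
```

So even if a file did contain two entries under the same key, a dict-based reader would only ever see one of them.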
It is a bit unclear to me what the ExtensionType stuff does in arrow and what one gets for buying into it. However, most things point to it being a mechanism for your `foo` example.
I'm closing the issue now since my initial understanding of the extension metadata was incorrect and I don't think any action is needed here. Just reopen or open another issue if there is a point about late conversion stuff (e.g. String <-> Symbol) not fitting into the definition of extensions.
from arrow-julia.