invenia / axissets.jl
Consistent operations over a collection of KeyedArrays
License: MIT License
Currently, our container types include the following:
OrderedSets
LittleDicts
Tuples
Patterns
KeyedArrays / NamedDimsArrays
This can make it hard to follow which operations return which data types, so maybe we can simplify this in some way.
At the very least, we should probably try to minimize how much of this is exposed through the documentation and avoid too many examples of how they interact with other packages like Tables.jl, DataFrames.jl, etc.
Some design decisions seem to be a bit unintuitive for folks coming from DataFrames.jl. This issue should serve as a list of things to include:
Design:
KeyedArrays?
Tuples?
_ and __?
Gotchas:
KeyAlignmentErrors?
This issue is used to trigger TagBot; feel free to unsubscribe.
If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.
If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!
We almost always want to run validate when constructing a new dataset, so maybe we should have an inner constructor for that. We'll probably still want a validate=true keyword for specific cases where we don't need to check.
Should be able to apply arbitrary feature transforms to some subselection of a dataset.
Unless we're performing flatten with a KeyedDataset then maybe it should live elsewhere?
Currently, all "paths" are just tuples of symbols. This is pretty efficient because we can do === in the pattern matching code, but we may have a use case for custom types that folks may want to dispatch on. In that case, loosening the constraint may be worth it.
Replace Tuple{Vararg{Symbol}} with Tuple
Update the in code in patterns.jl to use ==
The bright side is that this makes me happy we chose Tuple over Vector :)
This came up in an internal implementation.
When we introduce a new element to a KeyedDataset, the setindex! method assigns a constraint if one isn't already associated with its dimpath.
Lines 122 to 130 in eb33c8c
However, this isn't necessarily desirable, as the new element might share a fieldname with an existing element and so introduce a conflicting constraint.
MWE
julia> using AxisSets, AxisKeys
julia> using AxisSets: Pattern
# assume the following
julia> ds = KeyedDataset(
(:train, :input) => KeyedArray(ones(5, 5); time=1:5, id='a':'e'),
(:predict, :input) => KeyedArray(ones(5); id='a':'e'),
(:train, :output) => KeyedArray(ones(5); time=1:5);
constraints=Pattern[
(:train, :_, :time),
(:predict, :_, :time),
(:__, :input, :id), # offending constraint
]
)
# assign a new element with `:id` dimname - introduces `(:__, :id)` constraint
julia> ds[(:train, :weights)] = KeyedArray(ones(5), id='a':'e')
julia> ds.constraints
OrderedCollections.OrderedSet{Pattern} with 4 elements:
Pattern((:train, :_, :time))
Pattern((:predict, :_, :time))
Pattern((:train, :input, :id))
Pattern((:__, :id))
julia> ds
Error showing value of type KeyedDataset:
ERROR: ArgumentError: Collection has multiple elements, must contain exactly 1 element
Stacktrace:
[1] only
@ ./iterators.jl:1327 [inlined]
[2] _only(x::Vector{Int64})
@ AxisSets ~/.julia/packages/AxisSets/27klG/src/AxisSets.jl:27
[3] (::AxisSets.var"#20#22"{Vector{Pattern}, Tuple{Symbol, Symbol}})(dimname::Symbol)
@ AxisSets ~/.julia/packages/AxisSets/27klG/src/dataset.jl:97
[4] map(f::AxisSets.var"#20#22"{Vector{Pattern}, Tuple{Symbol, Symbol}}, t::Tuple{Symbol, Symbol})
@ Base ./tuple.jl:214
[5] show(io::IOContext{Base.TTY}, ds::KeyedDataset)
@ AxisSets ~/.julia/packages/AxisSets/27klG/src/dataset.jl:95
...
In this instance one would be better off redefining the offending constraint as (:__, :id), but this might not always be possible/desirable.
Bit of an edge case, but seems valid.
If using map on a KeyedDataset, and the block returns nothing, it errors, because of this line trying to assign nothing as a KeyedArray.
MWE:
julia> ds = KeyedDataset(:input => KeyedArray([1, 2]; a=[1, 2]));
julia> map(ds) do A
println(typeof(A))
end
KeyedArray{Int64, 1, NamedDimsArray{(:a,), Int64, 1, Vector{Int64}}, Base.RefValue{Vector{Int64}}}
ERROR: MethodError: Cannot `convert` an object of type
Nothing to an object of type
KeyedArray
Closest candidates are:
convert(::Type{T}, ::LinearAlgebra.Factorization) where T<:AbstractArray at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/factorization.jl:58
convert(::Type{T}, ::T) where T<:AbstractArray at abstractarray.jl:14
convert(::Type{T}, ::T) where T at essentials.jl:205
Stacktrace:
[1] setindex!(dd::OrderedCollections.LittleDict{Tuple, KeyedArray, Vector{Tuple}, Vector{KeyedArray}}, value::Nothing, key::Tuple{Symbol})
@ OrderedCollections ~/.julia/packages/OrderedCollections/cP9uu/src/little_dict.jl:219
[2] map!(f::var"#9#10", dest::KeyedDataset, src::KeyedDataset)
@ AxisSets ~/.julia/packages/AxisSets/IddYk/src/functions.jl:49
[3] map(::Function, ::KeyedDataset)
@ AxisSets ~/.julia/packages/AxisSets/IddYk/src/functions.jl:41
[4] top-level scope
@ REPL[16]:1
If the point of map(::KeyedDataset) is only to modify the dataset, that may be fair, but I think it should either have a fallback, or a clearer error.
We currently have a big jump between the docstring examples in the API docs and the one very large example that assumes you're already familiar with the API. This documentation structure results in a lot of cognitive load when learning the package, particularly since it pulls in syntax from half a dozen other packages (e.g., Tables, DataFrames, AxisKeys).
Currently, KeyedArrays support the tables interface and wrapdims can be used to create a KeyedArray from an existing table. I'm not entirely sure if we have a use-case for it yet, but it might be nice if we could ingest and produce tables. Some tricky parts of this include:
In some datasets, we may want to define a custom alignment function other than simply ==. One use case is when we have both hourly and daily data which aren't the same size (obviously), but we want to perform the same operations over both (i.e. filtering out a day should also remove the corresponding hours). If we restrict ourselves to Interval queries, then we should be able to define a looser alignment function which simply checks that the days in the hours axis correspond to the other daily data.
NOTE: I'm not sure how much of a priority this is, as in most cases we could probably just convert the daily data to hourly... either through interpolation or a sparse matrix.
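A minimal sketch of what such a looser alignment check could look like, using only the Dates standard library. The name days_align is hypothetical and is not part of AxisSets; the sketch assumes the hourly axis holds DateTime keys and the daily axis holds Date keys.

```julia
using Dates

# Hypothetical helper, not an AxisSets function.
# Hourly keys "align" with daily keys when they cover exactly the same days, in the same order.
days_align(hourly, daily) = unique(Date.(hourly)) == collect(daily)

hourly = DateTime(2021, 1, 1):Hour(1):DateTime(2021, 1, 2, 23)
daily = Date(2021, 1, 1):Day(1):Date(2021, 1, 2)

aligned = days_align(hourly, daily)

# Dropping a day from the daily axis without dropping its hours breaks the alignment.
misaligned = days_align(hourly, daily[1:1])
```

A check like this could stand in for == when validating a constraint, so that filtering out a day on one component is only valid if the matching hours are removed from the other.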
There are a few benefits to having a KeyedDataset subtype a Dictionary:
KeyedArrays should be easier with only(values(ds(pattern...).data)) becoming only(ds(pattern...)).
keys, pairs would be better than directly accessing the data field.
I thought this would throw an error.
MWE:
julia> ds = KeyedDataset(
:train => KeyedArray(zeros(5, 2); target=1:5, id=[:a, :b]),
:predict => KeyedArray(zeros(3, 2); target=6:8, id=[:a, :b]),
constraints=Pattern[(:_, :id)]
)
KeyedDataset with:
2 components
(:train,) => 5x2 KeyedArray{Float64} with dimension target, id[1]
(:predict,) => 3x2 KeyedArray{Float64} with dimension target, id[1]
1 constraints
[1] (:_, :id) ∈ 2-element Vector{Symbol}
julia> map(A -> A(id=:a), ds, (:predict, :_))
KeyedDataset with:
2 components
(:train,) => 5x2 KeyedArray{Float64} with dimension target, id[1]
(:predict,) => 3 KeyedArray{Float64} with dimension target
1 constraints
[1] (:_, :id) ∈ 2-element Vector{Symbol}
Whereas it throws an error if the dim is preserved:
julia> map(A -> A(id=AxisKeys.Interval(:a, :a)), ds, (:predict, :_))
ERROR: KeyAlignmentError: Misaligned dimension keys on constraint Pattern((:_, :id))
Tuple[(:predict, :id)] ∈ 1-element view(::Vector{Symbol}, [1]) with eltype Symbol
Tuple[(:train, :id)] ∈ 2-element Vector{Symbol}
Stacktrace:
[1] validate(ds::KeyedDataset, constraint::Pattern{Tuple{Symbol, Symbol}}, paths::Set{Tuple})
@ AxisSets ~/.julia/packages/AxisSets/YTn0q/src/dataset.jl:293
[2] validate(ds::KeyedDataset)
@ AxisSets ~/.julia/packages/AxisSets/YTn0q/src/dataset.jl:267
[3] map!(f::var"#5#6", dest::KeyedDataset, src::KeyedDataset)
@ AxisSets ~/.julia/packages/AxisSets/YTn0q/src/functions.jl:52
[4] map(f::Function, ds::KeyedDataset, keys::Tuple{Symbol, Symbol})
@ AxisSets ~/.julia/packages/AxisSets/YTn0q/src/functions.jl:41
[5] top-level scope
@ REPL[9]:1
In AxisKeys.jl, NamedDimsArray{KeyedArray} and KeyedArray{NamedDimsArray} are equivalent. From the README:
A nested pair of wrappers can be constructed with keywords for names, and everything should work the same way in either order
This is not the case for AxisSets. MWE:
julia> A = NamedDimsArray(rand(2, 3), row=[:a, :b], col=1:3);
julia> KeyedDataset(:x => A)
ERROR: MethodError: Cannot `convert` an object of type
NamedDimsArray{(:row, :col), Float64, 2, KeyedArray{Float64, 2, Matrix{Float64}, Tuple{Vector{Symbol}, UnitRange{Int64}}}} to an object of type
KeyedArray
Closest candidates are:
convert(::Type{T}, ::Intervals.AnchoredInterval{P, T, L, R} where {L<:Intervals.Bounded, R<:Intervals.Bounded}) where {P, T} at /Users/bencottier/.julia/packages/Intervals/ua9cq/src/anchoredinterval.jl:181
convert(::Type{T}, ::Intervals.Interval{T, L, R} where {L<:Intervals.Bound, R<:Intervals.Bound}) where T at /Users/bencottier/.julia/packages/Intervals/ua9cq/src/interval.jl:253
convert(::Type{T}, ::LinearAlgebra.Factorization) where T<:AbstractArray at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/factorization.jl:58
...
Stacktrace:
[1] push!(a::Vector{KeyedArray}, item::NamedDimsArray{(:row, :col), Float64, 2, KeyedArray{Float64, 2, Matrix{Float64}, Tuple{Vector{Symbol}, UnitRange{Int64}}}})
@ Base ./array.jl:928
[2] (OrderedCollections.LittleDict{Tuple, KeyedArray, KS, VS} where {KS<:(Union{var"#s4", var"#s3"} where {var"#s4"<:Tuple, var"#s3"<:(Vector{T} where T)}), VS<:(Union{var"#s4", var"#s3"} where {var"#s4"<:Tuple, var"#s3"<:(Vector{T} where T)})})(itr::OrderedCollections.LittleDict{Tuple{Symbol}, NamedDimsArray{(:row, :col), Float64, 2, KeyedArray{Float64, 2, Matrix{Float64}, Tuple{Vector{Symbol}, UnitRange{Int64}}}}, Vector{Tuple{Symbol}}, Vector{NamedDimsArray{(:row, :col), Float64, 2, KeyedArray{Float64, 2, Matrix{Float64}, Tuple{Vector{Symbol}, UnitRange{Int64}}}}}})
@ OrderedCollections ~/.julia/packages/OrderedCollections/PRayh/src/little_dict.jl:73
[3] convert
@ ./abstractdict.jl:523 [inlined]
[4] KeyedDataset (repeats 2 times)
@ ~/.julia/packages/AxisSets/ullT8/src/dataset.jl:49 [inlined]
[5] KeyedDataset(pairs::Pair{Symbol, NamedDimsArray{(:row, :col), Float64, 2, KeyedArray{Float64, 2, Matrix{Float64}, Tuple{Vector{Symbol}, UnitRange{Int64}}}}}; constraints::Vector{AxisSets.Pattern})
@ AxisSets ~/.julia/packages/AxisSets/ullT8/src/dataset.jl:72
[6] KeyedDataset(pairs::Pair{Symbol, NamedDimsArray{(:row, :col), Float64, 2, KeyedArray{Float64, 2, Matrix{Float64}, Tuple{Vector{Symbol}, UnitRange{Int64}}}}})
@ AxisSets ~/.julia/packages/AxisSets/ullT8/src/dataset.jl:57
[7] top-level scope
@ REPL[63]:1
We should work a couple of the more complicated transforms into the example docs. It might be helpful for folks to see a few of the different combinations in practice.
Originally posted by @rofinn in #50 (comment)
Currently, you can create a KeyedDataset in which a component dimension can match multiple constraint patterns. However, if dimension x must align with all axes in pattern a and b, then all axes in a and b must also align with each other and can be described by a single, more general, Pattern.
Per recent discussions we want to have READMEs as well as docs.
For example, we should be able to define patterns like:
Pattern(:train, :input, (:foo, :bar))
which would match (:train, :input, :foo) or (:train, :input, :bar), but not (:train, :input, :baz)
Similarly, I could see an argument for something like:
Pattern(:train, :input, r"foo.*")
which would match (:train, :input, "foo.1") and (:train, :input, "foo.2"), but not (:train, :input, "bar.1")
NOTE: These should both be fallbacks, so that (:train, :input, (:foo, :bar)) or (:train, :input, r"foo.*") would take priority if they matched first.
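One way the proposed segment types could be handled is by dispatching on the type of each pattern segment. The sketch below uses only Base Julia; segmatch and pathmatch are hypothetical names and do not exist in AxisSets. It only covers fixed-length patterns and does not model the fallback priority described in the note above.

```julia
# Hypothetical segment matchers, chosen by dispatch on the pattern segment type.
segmatch(p::Symbol, s) = p === :_ || p == s      # :_ is the single-segment wildcard
segmatch(p::Tuple, s) = s in p                   # any one of several alternatives
segmatch(p::Regex, s) = occursin(p, string(s))   # unanchored regex search
segmatch(p, s) = p == s                          # fallback: plain equality

# Fixed-length matching only; the :__ wildcard is left out to keep the sketch short.
pathmatch(pattern::Tuple, path::Tuple) =
    length(pattern) == length(path) && all(map(segmatch, pattern, path))

alt = (:train, :input, (:foo, :bar))
re = (:train, :input, r"foo.*")

pathmatch(alt, (:train, :input, :foo))
pathmatch(re, (:train, :input, "foo.1"))
```

Because occursin searches anywhere in the string, a regex segment would need explicit anchors (for example r"^foo") if only prefixes should match.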
Should be able to apply filters, validation and imputation to subsets of a dataset.
Since it's an error we should probably include more details about what the alignment error actually is.
A filtered dataset had inconsistent constrained values across tables. The problem turned out to be that the constraint ignored one of the tables because it was using a multi-value wildcard. Switching to a single-value wildcard fixed it.
julia> using AxisSets: Pattern
julia> (:x, :a, :a) in Pattern(:x, :a, :a)
true
julia> (:x, :a, :a) in Pattern(:x, :_, :a)
true
julia> (:x, :a, :a) in Pattern(:x, :__, :a)
false
julia> (:x, :a, :a) in Pattern(:__, :a)
false
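For comparison, a backtracking glob-style matcher would accept both of the last two paths above. The sketch below is hypothetical and is not how AxisSets implements Pattern; it assumes :_ matches exactly one segment and :__ matches one or more segments, and globmatch is a made-up name.

```julia
# Hypothetical backtracking matcher over tuples; not the AxisSets implementation.
# Assumptions: :_ matches exactly one segment, :__ matches one or more segments.
function globmatch(pattern::Tuple, path::Tuple)
    isempty(pattern) && return isempty(path)
    isempty(path) && return false
    head, rest = first(pattern), Base.tail(pattern)
    if head === :__
        # Let :__ consume the first n segments, trying every possible n >= 1.
        return any(n -> globmatch(rest, path[(n + 1):end]), 1:length(path))
    elseif head === :_ || head == first(path)
        return globmatch(rest, Base.tail(path))
    end
    return false
end

globmatch((:x, :__, :a), (:x, :a, :a))
globmatch((:__, :a), (:x, :a, :a))
```

The trade-off is cost: trying every split is slower than a single left-to-right pass, which may matter if pattern matching is on a hot path.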
Include an example in the docs which demonstrates how to use the Dataset type to perform batched / constrained operations over a collection of KeyedArrays. This example should include:
I don't think this is necessary for an initial release, but it'd be nice if we could support batched concatenation of components.
merge works fine with 2 datasets, but breaks with 3. The docstring claims merge should work on multiple datasets.
julia> ds1 = KeyedDataset(
:a => KeyedArray(zeros(3); time=1:3),
:b => KeyedArray(ones(3, 2); time=1:3, loc=[:x, :y]),
);
julia> ds2 = KeyedDataset(
:c => KeyedArray(ones(3); time=1:3),
:d => KeyedArray(zeros(3, 2); time=1:3, loc=[:x, :y]),
);
julia> ds3 = KeyedDataset(
:e => KeyedArray(ones(3); time=1:3),
:f => KeyedArray(zeros(3, 2); time=1:3, loc=[:x, :y]),
);
julia> merge(ds1, ds2)
KeyedDataset with:
4 components
(:a,) => 3 KeyedArray{Float64} with dimension time[1]
(:b,) => 3x2 KeyedArray{Float64} with dimension time[1], loc[2]
(:c,) => 3 KeyedArray{Float64} with dimension time[1]
(:d,) => 3x2 KeyedArray{Float64} with dimension time[1], loc[2]
2 constraints
[1] (:__, :time) ∈ 3-element UnitRange{Int64}
[2] (:__, :loc) ∈ 2-element Vector{Symbol}
julia> merge(ds1, ds2, ds3)
ERROR: MethodError: no method matching KeyedDataset(::OrderedCollections.OrderedSet{AxisSets.Pattern}, ::Dict{Tuple, KeyedArray})
Closest candidates are:
KeyedDataset(::OrderedCollections.OrderedSet{AxisSets.Pattern}, ::OrderedCollections.LittleDict) at /Users/sam/.julia/packages/AxisSets/ullT8/src/dataset.jl:44
KeyedDataset(::OrderedCollections.OrderedSet{AxisSets.Pattern}, ::OrderedCollections.LittleDict, ::Any) at /Users/sam/.julia/packages/AxisSets/ullT8/src/dataset.jl:44
Stacktrace:
[1] merge(::KeyedDataset, ::KeyedDataset, ::KeyedDataset)
@ AxisSets ~/.julia/packages/AxisSets/ullT8/src/functions.jl:116
[2] top-level scope
@ REPL[22]:1
KeyedDataset with a set of constraints.
julia> expected = KeyedDataset(
constraints=Pattern[
(:_, :input, :id),
(:_, :output, :id),
(:train, :_, :target),
(:predict, :_, :target),
],
)
ERROR: StackOverflowError:
Stacktrace:
[1] KeyedDataset(; constraints::Array{Pattern,1}, kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at /Users/rory/.julia/packages/AxisSets/3RRAg/src/dataset.jl:58 (repeats 13779 times)
[2] top-level scope at REPL[6]:1
KeyedDataset with variable key lengths.
julia> expected = KeyedDataset(
(:a,) => KeyedArray(reshape(1:8, (4, 2)); target=1:4, id=[:a, :b]),
(:a, :b) => KeyedArray(reshape(1:8, (4, 2)); target=1:4, id=[:a, :b]),
)
ERROR: MethodError: no method matching KeyedDataset(::Pair{Tuple{Symbol},KeyedArray{Int64,2,NamedDimsArray{(:target, :id),Int64,2,Base.ReshapedArray{Int64,2,UnitRange{Int64},Tuple{}}},Tuple{UnitRange{Int64},Array{Symbol,1}}}}, ::Pair{Tuple{Symbol,Symbol},KeyedArray{Int64,2,NamedDimsArray{(:target, :id),Int64,2,Base.ReshapedArray{Int64,2,UnitRange{Int64},Tuple{}}},Tuple{UnitRange{Int64},Array{Symbol,1}}}})
Closest candidates are:
KeyedDataset(::Pair{T,B} where B...; constraints) where T<:Tuple at /Users/rory/.julia/packages/AxisSets/3RRAg/src/dataset.jl:29
Stacktrace:
[1] top-level scope at REPL[9]:1
This issue is to track any utilities / functions we don't seem to use that could potentially be deprecated at some point.
Since we're aiming for explicit delimiters we should probably have a section explaining some of the decisions.
⁻: Common identifier for flattening arbitrary nested structures
ˣ: Flattening dimensions of an array (new dimensions and keys are the product of the flattened dimensions)
⁺: Arrays have been concatenated along that dimension
Following on from #50, which just implemented the minimum required interface, it might be useful to also support apply! and apply_append.
Somewhat related to #48.
I have data containing training sets for several features. Some have a single KeyedArray and others have multiple sub-components. It would be nice to be able to get all :train data using a wildcard instead of merging (:_, :train, :__) and (:_, :train).
julia> data=KeyedDataset(
(:f1, :train)=>KeyedArray([1], a=[1]),
(:f2, :train, :x)=>KeyedArray([1], b=[1]),
(:f2, :train, :y)=>KeyedArray([1], c=[1]),
)
KeyedDataset with:
3 components
(:f1, :train) => 1 KeyedArray{Int64} with dimension a[1]
(:f2, :train, :x) => 1 KeyedArray{Int64} with dimension b[2]
(:f2, :train, :y) => 1 KeyedArray{Int64} with dimension c[3]
3 constraints
[1] (:__, :a) ∈ 1-element Vector{Int64}
[2] (:__, :b) ∈ 1-element Vector{Int64}
[3] (:__, :c) ∈ 1-element Vector{Int64}
julia> data(:_, :train, :__)
KeyedDataset with:
2 components
(:f2, :train, :x) => 1 KeyedArray{Int64} with dimension b[1]
(:f2, :train, :y) => 1 KeyedArray{Int64} with dimension c[2]
2 constraints
[1] (:__, :b) ∈ 1-element Vector{Int64}
[2] (:__, :c) ∈ 1-element Vector{Int64}
julia> data(:_, :train)
KeyedDataset with:
1 components
(:f1, :train) => 1 KeyedArray{Int64} with dimension a[1]
1 constraints
[1] (:__, :a) ∈ 1-element Vector{Int64}
Related to #25, it's unclear what common Julia operations are applicable to a KeyedDataset (e.g., first(ds), values(ds), etc). Maybe it's worth making this type an AbstractDict with some extra functions?
The current comments in example.md are a bit confusing and should probably be made clearer. For example, better clarification of what "matching" means in terms of dimension paths.
We currently use unicode separators in a few places where tuples are possible. This may contribute to the cognitive load of the API and documentation.
While aligning with glob (*/**) and NamedDims.jl (:_ wildcard dim) is nice, :_ and :__ are a bit hard to distinguish depending on the font setting in an editor or terminal. We might want to consider using a different symbol or maybe our keys should just be path strings reusing the glob syntax?
I'm not yet sure if this should be handled by AxisSets.jl directly, but it's a workflow we should look into more.
https://www.tensorflow.org/api_docs/python/tf/data/Dataset
http://shashi.biz/FileTrees.jl/
For many operations we'll want to both mutate the data components and rename the keys.
For example, maybe we want to one-hot-encode a time feature for both train/predict inputs.
for (k, v) in pairs(ds(:_, :input, :temp))
# Rename our component from temp to hod
_k = replace(collect(k), :temp => :hod)
# Insert our one-hot-encode hour-of-day feature from the temperature times.
ds[k] = ohe(hod(v.time))
end
Unfortunately, this has two issues:
map.
validate each time we insert into the dataset.
A new error type was introduced in abstractarray.jl in Julia 1.8. CanonicalIndexError is a new error type thrown when getting the index of an array in these tests. This error type didn't exist in Julia 1.6.
This error type is also stand-alone, and doesn't have any association with any error structs. I'm wondering if Base.abstractarray.jl should be changed such that CanonicalIndexError is a subtype of ErrorException so that these tests pass, or if the tests themselves should change to reflect the error "type".
It would be nice to have a convenient syntax for creating a new dataset with a subset of the variables. Currently this can be done with the following:
julia> using AxisKeys, AxisSets
julia> ds = KeyedDataset(
:val1 => KeyedArray(zeros(3, 2); time=1:3, foo=[:a, :b]),
:val2 => KeyedArray(ones(3, 2); time=1:3, bar=[:x, :y]),
:val3 => KeyedArray(ones(3, 2); time=1:3, baz=[:z, :w]),
);
julia> ds(in([(:val1,), (:val2,)]))
KeyedDataset with:
2 components
(:val1,) => 3x2 KeyedArray{Float64} with dimension time[1], foo[2]
(:val2,) => 3x2 KeyedArray{Float64} with dimension time[1], bar[3]
3 constraints
[1] (:__, :time) ∈ 3-element UnitRange{Int64}
[2] (:__, :foo) ∈ 2-element Vector{Symbol}
[3] (:__, :bar) ∈ 2-element Vector{Symbol}
I propose a simple overload like the following:
julia> (ds::KeyedDataset)(i::AbstractVector{Symbol}) = ds(in([(s,) for s in i]))
julia> ds([:val1, :val2])
KeyedDataset with:
2 components
(:val1,) => 3x2 KeyedArray{Float64} with dimension time[1], foo[2]
(:val2,) => 3x2 KeyedArray{Float64} with dimension time[1], bar[3]
3 constraints
[1] (:__, :time) ∈ 3-element UnitRange{Int64}
[2] (:__, :foo) ∈ 2-element Vector{Symbol}
[3] (:__, :bar) ∈ 2-element Vector{Symbol}
We're slowly adding support for more external packages like Impute.jl and FeatureTransforms.jl. It might be a good idea to preemptively start using Requires.jl to avoid exposing ourselves to too many dependencies. Currently, both packages are pretty minimal, but that might not always be the case.
We currently reuse the show method for each component KeyedArray. This looks nice for small datasets, but if you have more than a few components this can be very verbose to output by default. Also, if you want to inspect a specific component you can always access it with getindex.
Things we usually want to know:
eltypes for each component and corresponding axiskeys?
Sample:
KeyedDataset with:
7 constraints:
(:train, :input, :_, :time) ∈ 145-element Vector{Dates.DateTime}
(:train, :output, :_, :time) ∈ 145-element Vector{Dates.DateTime}
(:predict, :input, :_, :time) ∈ 25-element Vector{Dates.DateTime}
(:predict, :output, :_, :time) ∈ 25-element Vector{Dates.DateTime}
(:__, :prices, :id) ∈ 4-element Vector{Symbol}
(:__, :temp, :id) ∈ 4-element Vector{Symbol}
(:__, :load, :id) ∈ 2-element Vector{Symbol}
8 components:
(:train, :input, :prices) => 145x4x4 KeyedArray{Union{Missing, Float64}, 3} with dimension names: (:time, :id, :lag)
...
(:predict, :output, :prices) => 25x4 KeyedArray(Union{Missing, Float64}, 2) with dimension names: (:time, :id)
In theory, I guess we could also generate random colors and use those to visually align the component dimension names to the corresponding constraint?
Currently the only way I can find to retrieve the names of the components of ds::KeyedDataset is with keys(ds.data). As noted on slack, it's not safe to access ds.data directly, as its implementation might change. Is there another way to get the components list with an API function? Alternatively, an API function like components(ds::KeyedDataset) = collect(keys(ds.data)) would be useful.
Similarly, am I correct that given a component name k::Symbol, getproperty(ds::KeyedDataset, k) is the intended way to access the corresponding KeyedArray?