ctuavastlab / mill.jl
Prototype flexible hierarchical multi-instance learning models.
Home Page: https://ctuavastlab.github.io/Mill.jl/stable/
License: MIT License
I am trying to speed up the training process by using a GPU:
# implements the multiple-instance learning model using Neural Networks, as described in
# https://arxiv.org/abs/1609.07257
# Using Neural Network Formalism to Solve Multiple-Instance Problems, Tomas Pevny, Petr Somol
using FileIO, JLD2, Statistics, Mill, Flux
using Flux: throttle, @epochs
using Mill: reflectinmodel
using Base.Iterators: repeated
using CUDAapi, CUDAdrv, CUDAnative
gpu_id = 0
if has_cuda_gpu() && gpu_id >= 0
    device!(gpu_id)
    device = Flux.gpu
    @info "Training on GPU-$(gpu_id)"
else
    device = Flux.cpu
    @info "Training on CPU"
end
# load the musk dataset
fMat = load("example/musk.jld2", "fMat") # matrix with instances, each column is one sample
bagids = load("example/musk.jld2", "bagids") # ties instances to bags
x = BagNode(ArrayNode(fMat), bagids) # create BagDataset
y = load("example/musk.jld2", "y") # load labels
y = map(i -> maximum(y[i]) + 1, x.bags) # create labels on bags
y_oh = Flux.onehotbatch(y, 1:2) # one-hot encoding
# create the model
model = BagModel(
    ArrayModel(Dense(166, 10, Flux.tanh)),                      # model on the level of Flows
    SegmentedMeanMax(10),                                       # aggregation
    ArrayModel(Chain(Dense(20, 10, Flux.tanh), Dense(10, 2)))) |> device # model on the level of bags
# define loss function
loss(x, y_oh) = Flux.logitcrossentropy(model(x |> device).data, y_oh |> device)
# the usual way of training
evalcb = throttle(() -> @show(loss(x |> device, y_oh |> device)), 1)
opt = Flux.ADAM()
@epochs 10 Flux.train!(loss, params(model), repeated((x, y_oh), 1000), opt, cb=evalcb)
# calculate the error on the training set (no testing set right now)
mean(mapslices(argmax, model(x |> device).data, dims=1)' .!= y)
But an error is raised:
ArgumentError: cannot take the CPU address of a CuArrays.CuArray{Float32,2,Nothing}
Stacktrace:
[1] unsafe_convert(::Type{Ptr{Float32}}, ::CuArrays.CuArray{Float32,2,Nothing}) at /home/zhangzhi/.juliapro/JuliaPro_v1.4.1-1/packages/CuArrays/YFdj7/src/array.jl:226
[2] gemm!(::Char, ::Char, ::Float32, ::CuArrays.CuArray{Float32,2,Nothing}, ::Array{Float32,2}, ::Float32, ::Array{Float32,2}) at /home/buildbot/build-worker/worker/juliapro-release-centos7-0_6/build/tmp_julia/share/julia/stdlib/v1.4/LinearAlgebra/src/blas.jl:1167
[3] gemm_wrapper!(::Array{Float32,2}, ::Char, ::Char, ::CuArrays.CuArray{Float32,2,Nothing}, ::Array{Float32,2}, ::LinearAlgebra.MulAddMul{true,true,Bool,Bool}) at /home/buildbot/build-worker/worker/juliapro-release-centos7-0_6/build/tmp_julia/share/julia/stdlib/v1.4/LinearAlgebra/src/matmul.jl:597
[4] mul! at /home/buildbot/build-worker/worker/juliapro-release-centos7-0_6/build/tmp_julia/share/julia/stdlib/v1.4/LinearAlgebra/src/matmul.jl:169 [inlined]
[5] mul! at /home/buildbot/build-worker/worker/juliapro-release-centos7-0_6/build/tmp_julia/share/julia/stdlib/v1.4/LinearAlgebra/src/matmul.jl:208 [inlined]
[6] * at /home/buildbot/build-worker/worker/juliapro-release-centos7-0_6/build/tmp_julia/share/julia/stdlib/v1.4/LinearAlgebra/src/matmul.jl:160 [inlined]
[7] adjoint at /home/zhangzhi/.juliapro/JuliaPro_v1.4.1-1/packages/Zygote/1GXzF/src/lib/array.jl:310 [inlined]
[8] _pullback at /home/zhangzhi/.juliapro/JuliaPro_v1.4.1-1/packages/ZygoteRules/6nssF/src/adjoint.jl:47 [inlined]
[9] Dense at /home/zhangzhi/.juliapro/JuliaPro_v1.4.1-1/packages/Flux/Fj3bt/src/layers/basic.jl:122 [inlined]
[10] Dense at /home/zhangzhi/.juliapro/JuliaPro_v1.4.1-1/packages/Flux/Fj3bt/src/layers/basic.jl:133 [inlined]
[11] applychain at /home/zhangzhi/.juliapro/JuliaPro_v1.4.1-1/packages/Flux/Fj3bt/src/layers/basic.jl:36 [inlined]
[12] Chain at /home/zhangzhi/.juliapro/JuliaPro_v1.4.1-1/packages/Flux/Fj3bt/src/layers/basic.jl:38 [inlined]
[13] #159 at /data/zhangzhi/nn-rebuttal/Mill.jl/src/modelnodes/arraymodel.jl:14 [inlined]
[14] mapdata at /data/zhangzhi/nn-rebuttal/Mill.jl/src/datanodes/datanode.jl:57 [inlined]
[15] mapdata at /data/zhangzhi/nn-rebuttal/Mill.jl/src/datanodes/arraynode.jl:18 [inlined]
[16] ArrayModel at /data/zhangzhi/nn-rebuttal/Mill.jl/src/modelnodes/arraymodel.jl:14 [inlined]
[17] _pullback(::Zygote.Context, ::ArrayModel{…}, ::ArrayNode{…}) at /home/zhangzhi/.juliapro/JuliaPro_v1.4.1-1/packages/Zygote/1GXzF/src/compiler/interface2.jl:0
[18] BagModel at /data/zhangzhi/nn-rebuttal/Mill.jl/src/modelnodes/bagmodel.jl:28 [inlined]
[19] _pullback(::Zygote.Context, ::BagModel{…}, ::BagNode{…}) at /home/zhangzhi/.juliapro/JuliaPro_v1.4.1-1/packages/Zygote/1GXzF/src/compiler/interface2.jl:0
[20] loss at ./In[12]:36 [inlined]
[21] _pullback(::Zygote.Context, ::typeof(loss), ::BagNode{…}, ::Flux.OneHotMatrix{Array{Flux.OneHotVector,1}}) at /home/zhangzhi/.juliapro/JuliaPro_v1.4.1-1/packages/Zygote/1GXzF/src/compiler/interface2.jl:0
[22] adjoint at /home/zhangzhi/.juliapro/JuliaPro_v1.4.1-1/packages/Zygote/1GXzF/src/lib/lib.jl:179 [inlined]
[23] _pullback at /home/zhangzhi/.juliapro/JuliaPro_v1.4.1-1/packages/ZygoteRules/6nssF/src/adjoint.jl:47 [inlined]
[24] #17 at /home/zhangzhi/.juliapro/JuliaPro_v1.4.1-1/packages/Flux/Fj3bt/src/optimise/train.jl:89 [inlined]
[25] _pullback(::Zygote.Context, ::Flux.Optimise.var"#17#25"{typeof(loss),Tuple{BagNode{…},Flux.OneHotMatrix{Array{Flux.OneHotVector,1}}}}) at /home/zhangzhi/.juliapro/JuliaPro_v1.4.1-1/packages/Zygote/1GXzF/src/compiler/interface2.jl:0
[26] pullback(::Function, ::Zygote.Params) at /home/zhangzhi/.juliapro/JuliaPro_v1.4.1-1/packages/Zygote/1GXzF/src/compiler/interface.jl:172
[27] gradient(::Function, ::Zygote.Params) at /home/zhangzhi/.juliapro/JuliaPro_v1.4.1-1/packages/Zygote/1GXzF/src/compiler/interface.jl:53
[28] macro expansion at /home/zhangzhi/.juliapro/JuliaPro_v1.4.1-1/packages/Flux/Fj3bt/src/optimise/train.jl:88 [inlined]
[29] macro expansion at /home/zhangzhi/.juliapro/JuliaPro_v1.4.1-1/packages/Juno/f8hj2/src/progress.jl:134 [inlined]
[30] train!(::typeof(loss), ::Zygote.Params, ::Base.Iterators.Take{Base.Iterators.Repeated{Tuple{BagNode{…},Flux.OneHotMatrix{Array{Flux.OneHotVector,1}}}}}, ::ADAM; cb::Flux.var"#throttled#20"{Flux.var"#throttled#16#21"{Bool,Bool,var"#24#25",Int64}}) at /home/zhangzhi/.juliapro/JuliaPro_v1.4.1-1/packages/Flux/Fj3bt/src/optimise/train.jl:81
[31] top-level scope at /home/zhangzhi/.juliapro/JuliaPro_v1.4.1-1/packages/Flux/Fj3bt/src/optimise/train.jl:122
[32] top-level scope at /home/zhangzhi/.juliapro/JuliaPro_v1.4.1-1/packages/Juno/f8hj2/src/progress.jl:134
[33] top-level scope at In[12]:41
By checking the code, I found that the encapsulation of datanodes, arraymodel and arraynode affects the operation of |> device. I need to hack the source code of Mill.jl to manually migrate the data to the GPU. In your architecture, do you have any other suggestions for implementing CUDA operations more simply?
Thanks!
Currently, we have LearnBase = "0.4, 0.5" in compat. But only LearnBase=0.4.1 is used in tests (https://github.com/CTUAvastLab/Mill.jl/runs/3668153165), and LearnBase=0.5 cannot be used because MLDataPattern.jl supports LearnBase=0.4 but not LearnBase=0.5. To prevent problems once MLDataPattern.jl actually starts supporting it, we should remove 0.5 from compat until then. A similar mistake has already caused several headaches for our team, when we added it to compat regardless of JuliaML/MLDataPattern.jl#45.
The code has never run on LearnBase=0.5 and thus it should not be in compat.
cc @simonmandlik
julia> ds = BagNode(ArrayNode(rand(1, 1)), [1])
BagNode with 1 bag(s)
└── ArrayNode(1, 1)
julia> reflectinmodel(x, d -> Dense(d, 1, relu), d -> SegmentedMax(d))
ERROR: MethodError: no method matching typemin(::Type{Tracker.TrackedReal{Float32}})
Closest candidates are:
typemin(::Type{Bool}) at bool.jl:6
typemin(::Type{Int8}) at int.jl:665
typemin(::Type{UInt8}) at int.jl:667
...
Stacktrace:
[1] segmented_max_forw(::TrackedArray{…,Array{Float32,2}}, ::Array{Float32,1}, ::AlignedBags) at /home/cisco/.julia/packages/Mill/PcHi7/src/aggregations/segmented_max.jl:20
[2] (::SegmentedMax{Array{Float32,1}})(::TrackedArray{…,Array{Float32,2}}, ::AlignedBags, ::Nothing) at /home/cisco/.julia/packages/Mill/PcHi7/src/aggregations/segmented_max.jl:13 (repeats 2 times)
[3] (::getfield(Mill, Symbol("##95#96")){SegmentedMax{Array{Float32,1}},Tuple{AlignedBags}})(::TrackedArray{…,Array{Float32,2}}) at /home/cisco/.julia/packages/Mill/PcHi7/src/aggregations/segmented_max.jl:12
[4] mapdata(::getfield(Mill, Symbol("##95#96")){SegmentedMax{Array{Float32,1}},Tuple{AlignedBags}}, ::ArrayNode{TrackedArray{…,Array{Float32,2}},Nothing}) at /home/cisco/.julia/packages/Mill/PcHi7/src/datanodes/arrays.jl:16
[5] (::SegmentedMax{Array{Float32,1}})(::ArrayNode{TrackedArray{…,Array{Float32,2}},Nothing}, ::AlignedBags) at /home/cisco/.julia/packages/Mill/PcHi7/src/aggregations/segmented_max.jl:12
[6] (::BagModel{ArrayModel{Dense{typeof(relu),TrackedArray{…,Array{Float32,2}},TrackedArray{…,Array{Float32,1}}}},SegmentedMax{Array{Float32,1}},ArrayModel{typeof(identity)}})(::BagNode{ArrayNode{Array{Float64,2},Nothing},AlignedBags,Nothing}) at /home/cisco/.julia/packages/Mill/PcHi7/src/modelnodes/bagmodel.jl:29
[7] _reflectinmodel(::BagNode{ArrayNode{Array{Float64,2},Nothing},AlignedBags,Nothing}, ::Function, ::getfield(Main, Symbol("##28#30")), ::Dict{Any,Any}, ::Dict{Any,Any}, ::String) at /home/cisco/.julia/packages/Mill/PcHi7/src/modelnodes/modelnode.jl:18
[8] #reflectinmodel#133(::Dict{Any,Any}, ::Dict{Any,Any}, ::typeof(reflectinmodel), ::BagNode{ArrayNode{Array{Float64,2},Nothing},AlignedBags,Nothing}, ::Function, ::Function) at /home/cisco/.julia/packages/Mill/PcHi7/src/modelnodes/modelnode.jl:12
[9] reflectinmodel(::BagNode{ArrayNode{Array{Float64,2},Nothing},AlignedBags,Nothing}, ::Function, ::Function) at /home/cisco/.julia/packages/Mill/PcHi7/src/modelnodes/modelnode.jl
Mill version: v1.0.0
We use Flux.onecold as the inverse of one-hot encoding.
This works for OneHotMatrix, but not for MaybeHotMatrix. See
julia> t = Flux.onehotbatch(1:3, 1:10)
10×3 OneHotMatrix(::Vector{UInt32}) with eltype Bool:
1 ⋅ ⋅
⋅ 1 ⋅
⋅ ⋅ 1
⋅ ⋅ ⋅
⋅ ⋅ ⋅
⋅ ⋅ ⋅
⋅ ⋅ ⋅
⋅ ⋅ ⋅
⋅ ⋅ ⋅
⋅ ⋅ ⋅
julia> t2 = maybehotbatch(1:3, 1:10)
10×3 MaybeHotMatrix{UInt32, Int64, Bool}:
1 0 0
0 1 0
0 0 1
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
julia> Flux.onecold(t)
3-element Vector{Int64}:
1
2
3
julia> Flux.onecold(t2)
ERROR: LoadError: MethodError: no method matching _getindex(::MaybeHotMatrix{UInt32, Int64, Bool}, ::Int64, ::CartesianIndex{1})
Closest candidates are:
_getindex(::MaybeHotMatrix, ::Union{Integer, AbstractVector{T} where T}, ::Integer) at C:\Users\racinsky\.julia\packages\Mill\f48u2\src\special_arrays\maybe_hot_matrix.jl:32
_getindex(::MaybeHotMatrix, ::Integer, ::Colon) at C:\Users\racinsky\.julia\packages\Mill\f48u2\src\special_arrays\maybe_hot_matrix.jl:33
_getindex(::MaybeHotMatrix, ::CartesianIndex{2}) at C:\Users\racinsky\.julia\packages\Mill\f48u2\src\special_arrays\maybe_hot_matrix.jl:34
...
Stacktrace:
[1] getindex(::MaybeHotMatrix{UInt32, Int64, Bool}, ::Int64, ::CartesianIndex{1})
@ Mill C:\Users\racinsky\.julia\packages\Mill\f48u2\src\special_arrays\maybe_hot_matrix.jl:31
[2] findminmax!(f::typeof(Base.isgreater), Rval::Matrix{Bool}, Rind::Matrix{CartesianIndex{2}}, A::MaybeHotMatrix{UInt32, Int64, Bool})
@ Base .\reducedim.jl:928
[3] _findmax(A::MaybeHotMatrix{UInt32, Int64, Bool}, region::Int64)
@ Base .\reducedim.jl:1048
[4] #findmax#726
@ .\reducedim.jl:1038 [inlined]
[5] #argmax#728
@ .\reducedim.jl:1103 [inlined]
[6] _fast_argmax
@ C:\Users\racinsky\.julia\packages\Flux\ZnXxS\src\onehot.jl:211 [inlined]
[7] onecold(y::MaybeHotMatrix{UInt32, Int64, Bool}, labels::UnitRange{Int64}) (repeats 2 times)
@ Flux C:\Users\racinsky\.julia\packages\Flux\ZnXxS\src\onehot.jl:205
[8] top-level scope
@ c:\Projects\others\JsonGrinder.jl\examples\recipes.jl:70
in expression starting at c:\Projects\others\JsonGrinder.jl\examples\recipes.jl:70
It works as
julia> Flux.onecold(Flux.onehotbatch(t2))
3-element Vector{Int64}:
1
2
3
but that feels cumbersome.
Noting down some areas where significant speedups may be achieved:
vcat in ProductNodes leads to a lot of copying
NGramMatrix multiplication)
BagNodes in a similar fashion
If we want to calculate gradients with respect to input, we should start by adding
Flux.@functor ArrayNode
Flux.@functor BagNode
Flux.@functor TreeNode
This allows Flux.params to return data nodes, so we can calculate gradients with respect to them.
The following code
using Mill, JsonGrinder
e1 = ExtractCategorical(["Olda", "Tonda", "Milda"])
node11 = e1("Olda")
n1 = BagNode(missing, AlignedBags([0:-1]))
n2 = WeightedBagNode(node11, [1:nobs(node11)], ones(4))
reduce(catobs, [n1, n2])
produces
ERROR: UndefVarError: B not defined
Stacktrace:
[1] reduce(::typeof(catobs), ::Array{AbstractBagNode,1}) at C:\Users\racinsky\.julia\packages\Mill\EkuQf\src\datanodes\datanode.jl:38
[2] top-level scope at none:0
This issue is used to trigger TagBot; feel free to unsubscribe.
If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.
If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!
We need to document the design of the library, particularly why we have MillModel, MillFunction, and Aggregation.
We should also describe how one can add a custom data type and how to extend reflection in a model.
The question is whether we really need MillFunction & friends.
Allow the model to accept AbstractVector{<:AbstractMillNode} so that the user wouldn't have to call model(reduce(catobs, x)) and could just call model(x) instead.
Add
using Zygote
(m::AbstractMillModel)(x::AbstractVector{<:AbstractMillNode}) = m(Zygote.@ignore(reduce(catobs, x)))
so that the reduction can be used in gradient computation.
Currently it raises an error. Since we don't need to differentiate through the reduction, we can stop the gradient here, which also simplifies usage.
We have the problem that models cannot be edited, since they are all static. The following replaceinmodel adds functionality to replace parts of the model and update the upstream parts. The current limitation is that we cannot change dimensionality.
# anything that is not a Mill model is returned unchanged
replaceinmodel(x, oldnode, newnode) = x
# a matching node is swapped for newnode (the original returned oldnode here)
replaceinmodel(x::Mill.ArrayModel, oldnode, newnode) = x == oldnode ? newnode : Mill.ArrayModel(replaceinmodel(x.m, oldnode, newnode))
function replaceinmodel(x::Mill.BagModel, oldnode, newnode)
    if x == oldnode
        return newnode
    else
        return BagModel(replaceinmodel(x.im, oldnode, newnode),
                        replaceinmodel(x.a, oldnode, newnode),
                        replaceinmodel(x.bm, oldnode, newnode))
    end
end
function replaceinmodel(x::Mill.ProductModel, oldnode, newnode)
    if x == oldnode
        return newnode
    else
        return ProductModel(tuple([replaceinmodel(m, oldnode, newnode) for m in x.ms]...),
                            replaceinmodel(x.m, oldnode, newnode))
    end
end
While writing the readme, I have (again) run into extremely slow creation of a model.
using Mill, Flux
julia> ds = BagNode(
           TreeNode(
               (BagNode(ArrayNode(randn(4,10)),[1:2,3:4,5:5,6:7,8:10]),
                ArrayNode(randn(3,5)),
                BagNode(
                    BagNode(ArrayNode(randn(2,30)),[i:i+1 for i in 1:2:30]),
                    [1:3,4:6,7:9,10:12,13:15]),
                ArrayNode(randn(2,5)))),
           [1:1,2:3,4:5])
m, k = reflectinmodel(ds, d -> Dense(d, 3, relu), d -> SegmentedMeanMax())
It seems to be a problem with inference, since the second time this is invoked, it is superfast.
So the problem is in printing. If we do
m, k = reflectinmodel(ds, d -> Dense(d, 3, relu), d -> SegmentedMeanMax());
then it is fast but then just printing the model
m
is slow.
When I have a vector of bags [bag with treenodes, bag with treenodes, missing bag], the reduction breaks, causing
┌ Error: cannot reduce Any
└ @ Mill ...\Mill\aKR6u\src\datanodes\datanode.jl:36
Any reason Flux.jl compat version should not be updated to latest 0.11.6?
We should add segmented_sum for completeness and for the use-case where the number of instances in a sample matters.
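A minimal sketch of what such an aggregation might compute, written in plain Julia rather than against Mill's aggregation interface. The name segmented_sum comes from the request above; the signature (a feature matrix with one instance per column, plus one column range per bag) is an assumption made only for illustration.

```julia
# Hypothetical sketch, not Mill's implementation: sum the instance columns of each bag.
# `x` holds one instance per column; `bags` holds one column range per bag.
function segmented_sum(x::AbstractMatrix, bags)
    # An empty bag (e.g. 1:0) yields a zero column, because the sum runs over no columns.
    reduce(hcat, [vec(sum(x[:, b]; dims=2)) for b in bags])
end

x = [1.0 2.0 3.0;
     4.0 5.0 6.0]
bags = [1:2, 3:3]

segmented_sum(x, bags)   # 2×2 matrix [3.0 3.0; 9.0 6.0]
```

Unlike a mean, the output grows with the number of instances in a bag, which is the use-case mentioned above.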
to AbstractMillModel so that it is clear that it is an abstract type.
An interesting question: what should be the output of this?
julia> reduce(catobs, [Matrix{Union{Missing, Float64}}(undef,1,0),[2.3 1.0]])
1×2 Array{Union{Missing, Float64},2}:
2.3 1.0
Should it be a Matrix of Unions or that of Float64?
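For comparison, here is how Base's hcat treats the same inputs. This only shows Base's behavior and one way to narrow the element type afterwards; it is not a statement about what catobs should do.

```julia
a = Matrix{Union{Missing, Float64}}(undef, 1, 0)   # empty, but allows missing
b = [2.3 1.0]                                      # plain Matrix{Float64}

# Base promotes to the wider element type, even though no missing is present.
c = hcat(a, b)
eltype(c)    # Union{Missing, Float64}

# Narrowing is possible after the fact, but only when no missing is stored.
d = convert(Matrix{Float64}, c)
eltype(d)    # Float64
```

Keeping the Union makes the result type depend only on the input types, while narrowing makes it depend on the values.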
This is part of the tests from JsonGrinder
using JSON, JsonGrinder, Mill
j2 = JSON.parse("""{"c": { "a": {"a":[2,3],"b":[5,6]}}}""")
j3 = JSON.parse("""{"b": {"a":[1,2,3],"b": 1}}""")
j4 = JSON.parse("""{"b": {}}""")
j5 = JSON.parse("""{"b": {}}""")
j6 = JSON.parse("""{}""")
sch = JsonGrinder.schema([j2,j3])
extractor = suggestextractor(sch)
dss = map(extractor, [j2,j3,j4,j5,j6])
dss = map(s -> s[:c], dss)
dss = map(s -> s[:a], dss)
dss = map(s -> s[:a], dss)
dss = map(s -> s.data, dss)
ds = reduce(catobs, dss)
which crashes Julia. I think the problem is a bad promotion to Any in catobs when we are handling missings. I hope we have not opened a can of worms with that.
It somehow crashes on reduce(catobs, ...) of this
5-element Array{Array,1}:
Float32[2.0; 3.0]
[missing, missing]
[missing, missing]
[missing, missing]
[missing, missing]
but when I create the above type manually, it does not crash (but it produces a wrong output, concatenating the vectors into a single vector).
When terseprint is false, methods are broken.
function experiment(ds::LazyNode{T}) where {T<:Symbol}
    @show ds
    @show T
end
julia> Mill.terseprint(true)
true
julia> methods(experiment)
# 1 method for generic function "experiment":
[1] experiment(ds::LazyNode{…}) where T<:Symbol in Main at C:\Projects\others\Mill.jl\test\lazynode.jl:30
julia> Mill.terseprint(false)
false
julia> methods(experiment)
# 1 method for generic function "experiment":
[1] Error showing value of type Base.MethodList:
ERROR: type DataType has no field var
Stacktrace:
[1] getproperty at .\Base.jl:28 [inlined]
[2] show(::IOContext{Base.GenericIOBuffer{Array{UInt8,1}}}, ::Type{
SYSTEM (REPL): showing an error caused an error
ERROR: type DataType has no field var
Stacktrace:
[1] getproperty at .\Base.jl:28 [inlined]
[2] show(::IOContext{REPL.Terminals.TTYTerminal}, ::Type{
SYSTEM (REPL): caught exception of type ErrorException while trying to handle a nested exception; giving up
Consistently, it works well with terse printing and breaks terribly without it.
Such as "julia", "julia-package", "multi-instance learning", etc..
Currently, Unicode input is not covered by the NGramMatrix tests. It would be good to have this covered. Look e.g. at the tests in https://github.com/JuliaLogging/LoggingFormats.jl/blob/master/test/runtests.jl#L8-L12
We should create a special version of reduce for Bags to prevent creating a large number of specialized functions for different numbers of parameters.
For the following model, accessing nodes is sometimes broken.
The following code, using the file stored here,
using Mill, JLD2, FileIO
@load "broken_model.jld2" model
show_traversal(model)
model["zE"]
causes
ERROR: BoundsError: attempt to access String at index [1:2]
Stacktrace:
[1] checkbounds at .\strings\basic.jl:193 [inlined]
[2] getindex at .\strings\string.jl:247 [inlined]
At the moment, Mill can handle missing values only in bags, but not in ArrayNodes, i.e. in terminal values.
The question is whether we want to (and should) add support for missing values in Strings, Categorical Arrays, and Dense Arrays. Pevnak suggests:
Missing in dense matrices will be stored in x = Matrix{Union{Missing, T}} where {T<:Number}. Before the multiplication, we substitute missing with some (trainable) values, by which x will be converted to Matrix{T} and can be handled as the usual multiplication. I propose to handle substitution and multiplication in ImputingMatrix, which will encapsulate a regular matrix and the vector of values substituted for missing. The substituting parameters will be made trainable, so the network will have the freedom to insert whatever it wishes.
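The substitute-then-multiply idea for dense matrices can be sketched in plain Julia. The helper name impute and the choice of one substitute value per input row are assumptions made for illustration; this is not the proposed ImputingMatrix type, and nothing here is trainable.

```julia
# Hypothetical helper: replace a missing entry in row i with ψ[i].
# Broadcasting the column vector ψ against the matrix x pairs x[i, j] with ψ[i].
impute(x::AbstractMatrix, ψ::AbstractVector) = coalesce.(x, ψ)

W = [1.0 2.0;
     3.0 4.0]                     # ordinary weight matrix
ψ = [10.0, 20.0]                  # substitutes, one per input row (would be trainable)
x = Union{Missing, Float64}[1.0 missing;
                            missing 2.0]

xi = impute(x, ψ)                 # [1.0 10.0; 20.0 2.0], no missing left
y = W * xi                        # usual multiplication: [41.0 14.0; 83.0 38.0]
```

Because the substitution happens before the product, the missing entries are replaced on the input side.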
Missing in categorical matrices with k categories (factors in R) will be treated as another, (k+1)-th value, which means that if a categorical matrix x of size k, n is multiplied from the left by a weight matrix w of size o, k, we effectively add one more dimension, i.e. w would be of size o, k+1 and x would be of size k+1, n. The question is whether this "lifting" should be handled externally, e.g. in JsonGrinder, or in Mill. For the sake of consistency, I would recommend Mill, which means that we would have to create our own OneHotVector, since the default one cannot store missing, and the conversion. Overloading the constructor also does not make much sense, because of the following ambiguity. Assume that OneHotVector(missing, 10) is converted to OneHotVector(11, 11). What should I do with OneHotVector(1, 10)? Was the original or the overloaded variant desired?
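The k+1 "lifting" can be illustrated with dense one-hot columns in plain Julia. The helpers encode and onehot are hypothetical and stand in for whatever one-hot type would eventually be used; they are not Flux's or Mill's API.

```julia
k = 3                                   # number of real categories
W = [1.0 2.0 3.0 -1.0]                  # size o × (k+1); the last column serves missing

# Hypothetical helpers: map missing to the extra index k+1, then one-hot encode.
encode(v) = ismissing(v) ? k + 1 : v
onehot(i, n) = [j == i ? 1.0 : 0.0 for j in 1:n]

values = [2, missing, 1]
X = reduce(hcat, [onehot(encode(v), k + 1) for v in values])   # size (k+1) × n

W * X                                   # 1×3 matrix [2.0 -1.0 1.0]
```

Each column of X selects one column of W, so a missing value simply selects the extra column.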
Missing strings in NGramMatrix would be handled exactly the same as in categorical matrices. We will extend the weight matrix by one more column, which will be used to signal missing.
Notice that handling missing in Strings and Categorical matrices differs from that in dense matrices: in the former cases we are substituting outputs, whereas in the latter we are substituting inputs.
ArrayModel(::T) where T<:Union{Function, Chain, Dense} doesn't allow custom functors, for example. We should replace it with Base.Callable or with Any.
I think we could remove https://github.com/CTUAvastLab/Mill.jl/blob/master/src/util.jl#L30-L41 since FluxML/Flux.jl#1756 has been merged and the speedup compared to Flux 0.12.8 may not be negligible for our datasets.
AlignedBags to ContigousBags
Maybe we should get rid of wrapping outputs of models into ArrayNode and return them as plain Arrays.
Currently, the @inferred test fails for ProductNodes.
and also in docs
Verify that all layers can correctly handle computation in Float32
using Mill
x = ArrayNode([1f0 2f0; missing missing])
reflectinmodel(x, d -> Chain(Dense(d,10, selu), Dense(10, 10)))
crashes
but
using Mill
x = ArrayNode([1f0 2f0; missing missing])
reflectinmodel(x, d ->Dense(d, 10))
works, which suggests the problem is in make_imputing
I will abuse the system of issues and post some questions for @pevnak about his attention implementation.
Why do we need a segmented sum? Why can't we just use a normal one?
https://github.com/pevnak/Mill.jl/blob/2cec5076a6350aa6a292edf59edd8be3e9e7f5b5/example/attention.jl#L11
Does the 4 in Dense(d, 4, selu) have some relation to the 4 in SegmentedSum(4)?
https://github.com/pevnak/Mill.jl/blob/2cec5076a6350aa6a292edf59edd8be3e9e7f5b5/example/attention.jl#L34
reduce(catobs, nodes) fails for the following structure and data, where TreeNode contains a vector instead of a tuple.
using Mill, JsonGrinder
e1 = ExtractCategorical(["Olda", "Tonda", "Milda"])
e2 = ExtractCategorical(collect(1:10))
node11 = e1("Olda")
node12 = e2([1, 2, 5])
node21 = e1("Tonda")
node22 = e2(4)
t1 = TreeNode([node11, node12])
t2 = TreeNode([node21, node22])
reduce(catobs, [t1, t2])
produces
ERROR: MethodError: no method matching _cattuples(::Array{Array{ArrayNode{SparseMatrixCSC{Float32,Int64},Nothing},1},1})
In Mill 2.4.1 the model created by default cannot handle missing values unless they are in the sample on which we are creating the model. This is very confusing to almost everyone. It can be trivially fixed by adding these
Mill._make_imputing(x::MaybeHotVector, t::Dense) = Mill.postimputing_dense(t)
Mill._make_imputing(x::MaybeHotMatrix, t::Dense) = Mill.postimputing_dense(t)
Mill._make_imputing(x::NGramMatrix, t::Dense) = Mill.postimputing_dense(t)
Therefore I vote for adding them ASAP and adding possible control later. Almost everyone is caught by this nuance, which requires a very deep understanding of Julia that seems to be generally missing.
After commenting out our magic around Base.show, methods(Base.show) is still broken
methods(Base.show)
works, but
using Mill
methods(Base.show)
is broken
ArrayNode([0.0f0 missing 0.0f0 0.0f0 1.0f0]) == ArrayNode([0.0f0 missing 0.0f0 0.0f0 1.0f0])
causes
ERROR: TypeError: non-boolean (Missing) used in boolean context
Stacktrace:
[1] ==(::ArrayNode{Array{Union{Missing, Float32},2},Nothing}, ::ArrayNode{Array{Union{Missing, Float32},2},Nothing}) at C:\Projects\others\Mill.jl\src\datanodes\arraynode.jl:45
[2] top-level scope at none:1
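The error follows from Base's three-valued logic: == on arrays that contain missing returns missing rather than a Bool. A small sketch with a hypothetical stand-in type Node (not Mill's ArrayNode) shows the behavior and one possible way around it, assuming that treating missing as equal to missing is what is wanted.

```julia
# Hypothetical stand-in for a data node; not Mill's ArrayNode.
struct Node{T}
    data::T
end

a = Node([0.0f0 missing 1.0f0])
b = Node([0.0f0 missing 1.0f0])

a.data == b.data          # missing, so using it in `if` or `&&` throws a TypeError
isequal(a.data, b.data)   # true, isequal always returns a Bool

# One option: define equality of nodes through isequal.
Base.:(==)(x::Node, y::Node) = isequal(x.data, y.data)

a == b                    # true
```

Note that isequal also treats NaN as equal to NaN and distinguishes 0.0 from -0.0, so this changes the semantics slightly for floating-point data.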
function Base.vcat(as::Mill.ArrayNode...)
    data = vcat([a.data for a in as]...)
    metadata = Zygote.@ignore reduce(vcat, [a.metadata for a in as])
    Mill.ArrayNode(data, metadata)
end
For example
julia> using Mill
[ Info: Precompiling Mill [1d0525e4-8992-11e8-313c-e310e1f6ddea]
[ Info: CUDAdrv.jl failed to initialize, GPU functionality unavailable (set JULIA_CUDA_SILENT or JULIA_CUDA_VERBOSE to silence or expand this message)
julia> methods(ArrayNode)
# 2 methods for type constructor:
[1] Error showing value of type Base.MethodList:
ERROR: type UnionAll has no field name
Stacktrace:
[1] getproperty at ./Base.jl:15 [inlined]
[2] show(::IOContext{Base.GenericIOBuffer{Array{UInt8,1}}}, ::Type{
SYSTEM (REPL): showing an error caused an error
ERROR: type UnionAll has no field name
Stacktrace:
[1] getproperty at ./Base.jl:15 [inlined]
[2] show(::IOContext{REPL.Terminals.TTYTerminal}, ::Type{
SYSTEM (REPL): caught exception of type ErrorException while trying to handle a nested exception; giving up
julia> Mill.terseprint(false)
false
julia> methods(ArrayNode)
# 2 methods for type constructor:
[1] Error showing value of type Base.MethodList:
ERROR: MethodError: no method matching show_datatype(::IOContext{Base.GenericIOBuffer{Array{UInt8,1}}}, ::Type{
Stacktrace:
[1] show(::IOContext{Base.GenericIOBuffer{Array{UInt8,1}}}, ::Type{
SYSTEM (REPL): showing an error caused an error
ERROR: MethodError: no method matching show_datatype(::IOContext{REPL.Terminals.TTYTerminal}, ::Type{
Stacktrace:
[1] show(::IOContext{REPL.Terminals.TTYTerminal}, ::Type{
SYSTEM (REPL): caught exception of type MethodError while trying to handle a nested exception; giving up
We should include some basic performance checks
And especially how training works, and how missing values are then used during inference.
Convolution needs to implement the gradient with respect to input.
The gradient computation crashes on the following code, using the following data:
https://ufile.io/8wl0eit1
using JLD2, FileIO, Flux, Mill
@load "weird_node.jld2" x1 y
model = reflectinmodel(x1, d -> Chain(Dense(d, settings.k, relu),),
    d -> SegmentedMeanMax(d),
    b = Dict("" => d -> Chain(Dense(d, 2),)))
ps = Flux.params(model)
loss = (model, x, y) -> Flux.logitcrossentropy(model(x).data,y)
loss(model, x1,y)
Flux.logitcrossentropy(model(x1).data,y)
gradient(() -> loss(model, x1,y), ps)
with error
ERROR: MethodError: no method matching zero(::Type{Any})
Closest candidates are:
zero(::Type{Union{Missing, T}}) where T at missing.jl:105
zero(::Type{Missing}) at missing.jl:103
zero(::Type{LibGit2.GitHash}) at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.3\LibGit2\src\oid.jl:220
Use functionality made by @racinmat, also in ExplainMill and MillExtensions?