lcsb-biocore / gigasom.jl
Huge-scale, high-performance flow cytometry clustering in Julia
Home Page: http://git.io/GigaSOM.jl
License: Apache License 2.0
@JuliaRegistrator register
These deprecation warnings are currently being thrown:
┌ Warning: `DataFrame(columns::AbstractMatrix)` is deprecated, use `DataFrame(columns, :auto)` instead.
│ caller = ip:0x0
└ @ Core :-1
Let's remove them 👍
The following statement appears in the "High-Level Overview" section of the docs:
I tried it with the following function:
function test()
addprocs(4)
d = [1 2 1 4 5;3 2 1 6 5;3 1 1 7 4]
som = initGigaSOM(d, 3, 3)
som = trainGigaSOM(som, d)
mapToGigaSOM(som, d)
e = embedGigaSOM(som,d)
end
This led to a problem on "worker 2":
ERROR: On worker 2:
KeyError: key GigaSOM [a03a9c34-069e-5582-a11c-5c984cab887c] not found
getindex at .\dict.jl:467 [inlined]
root_module at .\loading.jl:968 [inlined]
deserialize_module at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\Serialization\src\Serialization.jl:953
handle_deserialize at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\Serialization\src\Serialization.jl:855
deserialize at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\Serialization\src\Serialization.jl:773
deserialize_datatype at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\Serialization\src\Serialization.jl:1251
handle_deserialize at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\Serialization\src\Serialization.jl:826
deserialize at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\Serialization\src\Serialization.jl:773
handle_deserialize at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\Serialization\src\Serialization.jl:833
deserialize at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\Serialization\src\Serialization.jl:773 [inlined]
deserialize_msg at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\Distributed\src\messages.jl:99
#invokelatest#1 at .\essentials.jl:710 [inlined]
invokelatest at .\essentials.jl:709 [inlined]
message_handler_loop at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\Distributed\src\process_messages.jl:185
process_tcp_streams at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\Distributed\src\process_messages.jl:142
#99 at .\task.jl:356
Stacktrace:
[1] #remotecall_fetch#143 at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\Distributed\src\remotecall.jl:394 [inlined]
[2] remotecall_fetch(::Function, ::Distributed.Worker, ::Distributed.RRID) at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\Distributed\src\remotecall.jl:386
[3] #remotecall_fetch#146 at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\Distributed\src\remotecall.jl:421 [inlined]
[4] remotecall_fetch at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\Distributed\src\remotecall.jl:421 [inlined]
[5] call_on_owner at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\Distributed\src\remotecall.jl:494 [inlined]
[6] fetch(::Future) at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\Distributed\src\remotecall.jl:533
[7] distribute_array(::Symbol, ::Array{Float64,2}, ::Array{Int64,1}; dim::Int64) at C:\Users\zli3\.julia\packages\GigaSOM\QfzjJ\src\base\distributed.jl:88
[8] distribute_array at C:\Users\zli3\.julia\packages\GigaSOM\QfzjJ\src\base\distributed.jl:78 [inlined]
[9] trainGigaSOM(::Som, ::Array{Int64,2}; kernelFun::Function, metric::Distances.Euclidean, somDistFun::Function, knnTreeFun::Type{T} where T, rStart::Float64, rFinal::Float64, radiusFun::Function, epochs::Int64) at C:\Users\zli3\.julia\packages\GigaSOM\QfzjJ\src\analysis\core.jl:159
[10] trainGigaSOM at C:\Users\zli3\.julia\packages\GigaSOM\QfzjJ\src\analysis\core.jl:156 [inlined]
[11] test() at D:\...\runSOM.jl:27
[12] top-level scope at none:1
Any thoughts?
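A likely cause, judging from the `KeyError: key GigaSOM [...] not found` raised while worker 2 deserializes a message, is that the package was loaded only on the main process and not on the workers created by `addprocs`. The sketch below shows the usual remedy with the standard `Distributed` tools: start the workers at top level first, then load the package everywhere. It assumes GigaSOM is installed in the environment the workers start in; the function body is the one from the report, unchanged apart from the moved `addprocs` call.

```julia
using Distributed

addprocs(4)                 # start the workers before loading the package on them
@everywhere using GigaSOM   # load GigaSOM on the main process and on every worker

function test()
    d = [1 2 1 4 5; 3 2 1 6 5; 3 1 1 7 4]
    som = initGigaSOM(d, 3, 3)
    som = trainGigaSOM(som, d)
    return embedGigaSOM(som, d)
end
```

Calling `test()` afterwards should no longer hit the deserialization error, since every worker can now resolve the `GigaSOM` module. Whether such a small matrix is enough to train a 3x3 map is a separate question that this sketch does not address.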
Especially the section on large files
RRID:SCR_019020
A PkgEval run for a Julia pull request which changes the generated numbers for rand(a:b)
indicates that the tests of this package might fail in Julia 1.5 (and on Julia current master branch).
Also, you might be interested in using the new StableRNGs.jl registered package, which provides guaranteed stable streams of random numbers across Julia releases.
Apologies if this is a false positive. Cf. https://github.com/JuliaCI/NanosoldierReports/blob/ab6676206b210325500b4f4619fa711f2d7429d2/pkgeval/by_hash/52c2272_vs_47c55db/logs/GigaSOM/1.5.0-DEV-87d2a04de3.log
Add a sample_id column to the split files, or keep a separate vector on each worker, so that the relation between the training data and the original samples can be reconstructed after splitting.
The REQUIRE file could not be found.
cc: @laurentheirendt
Adds a user-friendly feature for extracting sub-populations and re-clustering them.
Is your feature request related to a problem? Please describe.
I trained my SOM for 2000 epochs, and would like to store intermediate results (each 500 epochs), something like:
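using GigaSOM, DelimitedFiles, Serialization
# `files` and `seed` are assumed to be defined earlier
datainfo = loadCSVSet(:test, files, header=false)
som = initGigaSOM(datainfo, 20, 20, seed=seed)
radius_list = [10, 8.9, 7.8, 6.7, 5.6, 4.5, 3.4, 2.3, 1.2, 0.5, 0.1]
for i in 1:10
    # train one chunk of 200 epochs, shrinking the radius between two neighbouring list entries
    som = trainGigaSOM(som, datainfo, rStart=radius_list[i], rFinal=radius_list[i+1], epochs=200, radiusFun=linearRadius)
    e = embedGigaSOM(som, datainfo)
    e2 = distributed_collect(e)
    # 200 * i is the number of epochs trained so far
    writedlm(string("GigaSOM_iker_1400k_embed_seed", seed, "_epochs", 200 * i, ".tsv"), e2, '\t')
    # store the partly trained SOM after each chunk
    open(f -> serialize(f, som), "partly_trained_$i.jls", "w")
end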
I want to assess whether I did enough training. The problem is that with this strategy I can only use linearRadius, or resort to a very ugly hack of passing in a specially crafted radius function.
Describe the solution you'd like
Perhaps one could pass a starting/ending epoch (iteration) to the training function (here: GigaSOM.jl/src/analysis/core.jl, line 130 in f4e712b).
Describe alternatives you've considered
Allow the user to "do something" (call a function) every X epochs, for example to serialize the SOM object or to save the coordinates.
Additional context
none
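One way to soften the limitation described in this request, without any change to the package, is to derive the chunk boundary radii from the desired non-linear schedule and keep training each chunk linearly between neighbouring boundaries. The sketch below only computes such a list of radii. The names `exp_schedule` and `chunk_radii` are hypothetical helpers and not part of GigaSOM, and the result is a piecewise-linear approximation of the schedule, not the schedule itself.

```julia
# Exponential decay between r_start and r_final; t is the training progress in [0, 1].
exp_schedule(r_start, r_final, t) = r_start * (r_final / r_start)^t

# Radii at the boundaries of n_chunks equally long training chunks (n_chunks + 1 values).
chunk_radii(schedule, r_start, r_final, n_chunks) =
    [schedule(r_start, r_final, k / n_chunks) for k in 0:n_chunks]

# Eleven boundary values for ten chunks, from radius 10 down to 0.1.
radius_list = chunk_radii(exp_schedule, 10.0, 0.1, 10)
```

The resulting `radius_list` could stand in for the hand-written list in the loop above, with `radius_list[i]` and `radius_list[i+1]` as the start and end radius of chunk `i`.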
Although GigaSOM was developed with its main focus on mass cytometry data, it could be used for any kind of multi-dimensional data clustering.
Supporting a common file format such as CSV would allow other kinds of data to be imported.
Summary
Fix the seed for the random initialization of the SOM
Expected behavior
Reproducibility of the SOM clustering
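A minimal sketch of what a fixed seed buys, using only the standard `Random` library: the initial codebook is drawn from the data with a private, explicitly seeded generator, so two runs with the same seed start from the same codebook. `sample_codebook` is a hypothetical helper for illustration and does not reflect how GigaSOM initializes its SOM internally.

```julia
using Random

# Hypothetical helper: pick n distinct rows of `data` as initial codebook vectors.
function sample_codebook(data::AbstractMatrix, n::Integer; seed::Integer)
    rng = MersenneTwister(seed)                 # private, explicitly seeded generator
    rows = randperm(rng, size(data, 1))[1:n]    # n distinct row indices
    return data[rows, :]
end

data = reshape(collect(1.0:20.0), 10, 2)        # 10 toy data points with 2 dimensions
a = sample_codebook(data, 4; seed = 42)
b = sample_codebook(data, 4; seed = 42)         # same seed, same rows as `a`
```

Reproducibility here holds within one Julia version; the numbers produced for a given seed may differ between Julia releases.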
Provide a Julia implementation of consensus clustering, to avoid relying on an RCall wrapper around the ConsensusClusterPlus package.
It'd be nice if there were better support for different distance metrics when training the SOM and mapping winners. It looks like this could be implemented easily by passing a metric argument straight through to NearestNeighbors.BruteTree.
I would like to control the dynamic range of the expressionPalette more precisely. For example, my expressions range from -2 to 7, but the interesting region lies between -1 and 1. I want to adjust my colour range so that any expression values outside the limits [-1,1] are clipped to the boundary colours.
It is possible to do this by pre-processing the input data, but that would re-calculate the inputs each time I change the colour range, which doesn't seem necessary. It makes more sense to have the colour representation change, not the underlying data.
As you can imagine, this will eventually be used in an interactive figure :)
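The clipping described here can be sketched independently of any plotting API: values are clamped to the display limits only when they are turned into colour indices, so the underlying data stay untouched. `clip_scale` and `palette_index` are hypothetical helpers, not GigaSOM functions, and the five-entry palette is an arbitrary choice for the example.

```julia
# Map a value to [0, 1] after clipping it to the display limits [lo, hi].
clip_scale(x, lo, hi) = (clamp(x, lo, hi) - lo) / (hi - lo)

# Pick one of n palette entries for a value (1-based index).
palette_index(x, lo, hi, n) = 1 + round(Int, clip_scale(x, lo, hi) * (n - 1))

expr = [-2.0, -1.0, 0.0, 0.5, 7.0]          # expression values, some outside [-1, 1]
idx = palette_index.(expr, -1.0, 1.0, 5)    # → [1, 1, 3, 4, 5]
```

Changing the limits only changes `idx`; `expr` is never modified, which fits the interactive use case mentioned above.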
Versions after 0.6.8 have the majority of functions erased from dataops.jl
Any
Steps to reproduce
1. Install latest GigaSOM
2. Import distributed dataset
3. Run dselect
Expected behavior
dselect works
Actual behavior
dselect won't be found
Additional information
This issue is used to trigger TagBot; feel free to unsubscribe.
If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.
If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!