BIDData / BIDMach
CPU and GPU-accelerated Machine Learning Library
License: BSD 3-Clause "New" or "Revised" License
At the end of https://github.com/BIDData/BIDMach/wiki/Quickstart there is an accuracy formula:
val p = ctest * cx + (1 - ctest) * (1 - cx)
It confuses me. What does this formula mean? And since ctest and cx have the same dimensions, how can they be multiplied?
Thanks!
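For intuition, here is a hedged plain-Scala sketch (no BIDMach; the values are made up). It assumes the multiply in the formula is elementwise, which also answers the dimension question: each entry of p is then the probability the model assigned to the label that actually occurred, so the mean of p estimates accuracy.

```scala
// Sketch with hypothetical values: ctest holds 0/1 true labels, cx holds
// predicted probabilities P(label = 1). Elementwise,
//   p(i) = ctest(i)*cx(i) + (1 - ctest(i))*(1 - cx(i))
// is the probability assigned to the label that actually occurred.
val ctest = Array(1.0, 0.0, 1.0, 0.0)
val cx    = Array(0.9, 0.2, 0.6, 0.7)
val p = ctest.zip(cx).map { case (t, x) => t * x + (1 - t) * (1 - x) }
val accuracy = p.sum / p.length   // approximately 0.65 here
```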
When trying to create a DNN with DNN.learner, the ADAGrad initialization step in Learner.init fails with the error
java.lang.NullPointerException
at BIDMach.updaters.ADAGrad$$anonfun$init$1.apply$mcVI$sp(ADAGrad.scala:34)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
at BIDMach.updaters.ADAGrad.init(ADAGrad.scala:33)
at BIDMach.Learner.init(Learner.scala:45)
This seems to be due to DNN waiting to initialize the modelmats until the first call to forward.
It looks like the correct way to make a DNN use ADAGrad is to set aopts rather than adding an updater to the learner. Is this correct?
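For reference, a minimal configuration sketch of the aopts route (hedged: it mirrors the training scripts quoted elsewhere in these issues, and trainData/trainLabels are hypothetical placeholders, not verified against the current API):

```scala
// Hedged sketch: enable ADAGrad on a DNN via aopts rather than attaching a
// separate updater. trainData and trainLabels are hypothetical placeholders.
val (nn, opts) = DNN.learnerX(trainData, trainLabels)
opts.aopts = opts       // route ADAGrad options through the model's own opts
opts.lrate = 0.2f       // learning rate
opts.texp  = 0.4f       // time exponent
nn.train
```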
Hi,
I'm trying to write unit tests with the scalatest toolkit for my ML modules. When I switch CUDA on, I get the following errors:
[info] - error. *** FAILED ***
[info] java.lang.RuntimeException: CUDA alloc failed global function call is not configured
[info] at BIDMat.GMat$.apply(GMat.scala:1094)
[info] at BIDMat.GMat$.zeros(GMat.scala:1064)
...
The error happens when I have more than one test class; with only a single test class there is no error. Does it have something to do with multiple test classes accessing CUDA through BIDMat simultaneously? I did not use BIDMat's caching. Or does the error happen when data is moving into and out of GPU memory at the same time? If so, is there a parameter I can set so that no exception occurs in that situation? What should I do if I want multiple threads to use BIDMat-based modules? Data copy operations can certainly happen at any time in such a concurrent environment.
Lizhen
The benchmarks of this project are amazing! I would like to use it.
After cd scripts and ./getdata.sh, the datasets download successfully, but they cannot be converted. These are the errors:
/xxxx/BIDMach/scripts/../bin/tparse.exe: No such file or directory
/tmp/scalacmd9053744229355367880.scala:2: error: not found: value BIDMat
Before downloading and converting the datasets, I ran ./sbt package and ./bidmach, and both looked fine.
Thank you!
Hi all,
I was using this AMI: http://tleyden.github.io/blog/2014/10/25/cuda-6-dot-5-on-aws-gpu-instance-running-ubuntu-14-dot-04/ for the CUDA environment on an AWS g2.2xlarge instance.
After downloading this BIDMach bundle: http://bid2.berkeley.edu/bid-data-project/BIDMach_1.0.0-linux-x86_64.tar.gz, I downloaded the data with scripts/getdata.sh.
I get stuck at the nn.train method while training on sample data from movielens10M; see the following:
Welcome to Scala version 2.11.2 (OpenJDK 64-Bit Server VM, Java 1.7.0_79).
Type in expressions to have them evaluated.
Type :help for more information.
scala> val a = loadSMat("data/movielens/train1.smat.lz4")
a: BIDMat.SMat =
( 172, 14574) 4
( 187, 14574) 4
( 195, 14574) 4
( 207, 14574) 4
( 215, 14574) 4
( 222, 14574) 3
( 224, 14574) 3
( 226, 14574) 3
... ... ...
scala> val (nn, opts) = NMF.learner(a)
nn: BIDMach.Learner = Learner(BIDMach.datasources.MatDS@36895c35,BIDMach.models.NMF@7404b78b,null,BIDMach.updaters.IncNorm@61ae422e,BIDMach.models.NMF$xopts$4@777b0c1b)
opts: BIDMach.Learner.Options with BIDMach.models.NMF.Opts with BIDMach.datasources.MatDS.Opts with BIDMach.updaters.IncNorm.Opts = BIDMach.models.NMF$xopts$4@777b0c1b
scala> nn.train
corpus perplexity=65134.014613
pass= 0
device is 0
java.lang.RuntimeException: Cuda error in GSMat() too many resources requested for launch
at BIDMat.GSMat$.apply(GSMat.scala:325)
at BIDMat.GSMat$.newOrCheckGSMat(GSMat.scala:480)
at BIDMat.GSMat$.newOrCheckGSMat(GSMat.scala:528)
at BIDMat.GSMat$.fromSMat(GSMat.scala:409)
at BIDMat.GSMat$.apply(GSMat.scala:330)
at BIDMach.models.Model$$anonfun$copyMats$1.apply$mcVI$sp(Model.scala:106)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
at BIDMach.models.Model.copyMats(Model.scala:94)
at BIDMach.models.Model.doblockg(Model.scala:73)
at BIDMach.Learner.retrain(Learner.scala:83)
at BIDMach.Learner.train(Learner.scala:49)
... 33 elided
When the configure script in jni/src is run, it creates 'Makefile.incl'. On Ubuntu 14.04 it has the following issues:
val (nn,opts)=DNN.learnerX(loadSMat(indir+"trainData000.smat.lz4"),FMat(loadSMat(indir+"trainLabels000.smat.lz4")));
opts.aopts = opts;
opts.featType = 1; // (1) feature type, 0=binary, 1=linear
opts.addConstFeat = false; // add a constant feature (effectively adds a $\beta_0$ term to $X\beta$)
opts.batchSize=500;
opts.reg1weight = 0.0001;
opts.lrate = 0.2f;
opts.texp = 0.4f;
opts.npasses = 5;
opts.links = iones(132,1);
DNN.dlayers(3,100,0.25f,132,opts,2);
nn.train;
runs just fine, but if I don't cast the targets as an FMat, i.e.,
val (nn,opts)=DNN.learnerX(loadSMat(indir+"trainData000.smat.lz4"),loadSMat(indir+"trainLabels000.smat.lz4"));
opts.aopts = opts;
opts.featType = 1; // (1) feature type, 0=binary, 1=linear
opts.addConstFeat = false; // add a constant feature (effectively adds a $\beta_0$ term to $X\beta$)
opts.batchSize=500;
opts.reg1weight = 0.0001;
opts.lrate = 0.2f;
opts.texp = 0.4f;
opts.npasses = 5;
opts.links = iones(132,1);
DNN.dlayers(3,100,0.25f,132,opts,2);
nn.train;
gives
scala> nn.train;
pass= 0
scala.MatchError: ( 1 0.59643 1 0.54870 0.96072 1 1 0.97798...
1 0.00054444 4.6897e-12 0.93011 0.99481 0.98594 4.0819e-37 0.99795...
0.99987 1.1189e-14 0.0054939 0.96049 0.96663 0.00032491 0 0.010586...
1.2195e-06 0.00018742 3.1474e-09 0.28522 0.54711 5.9957e-10 0 0.92651...
0.00010903 1 1 0.31402 0.10718 0.016666 1 0.68628...
1 9.1819e-18 0.99999 0.032690 0.99763 1.0000 1 0.94152...
.. .. .. .. .. .. .. ..
,( 14, 0) 1
( 6, 1) 1
( 40, 2) 1
( 59, 3) 1
( 46, 4) 1
( 1, 5) 1
( 94, 6) 1
( 60, 7) 1
... ... ...
,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1) (of class scala.Tuple3)
at BIDMach.models.GLM$.derivs(GLM.scala:475)
at BIDMach.networks.DNN$GLMLayer.backward(DNN.scala:347)
at BIDMach.networks.DNN$Layer.backward(DNN.scala:230)
at BIDMach.networks.DNN.dobatch(DNN.scala:161)
at BIDMach.models.Model.dobatchg(Model.scala:101)
at BIDMach.Learner.retrain(Learner.scala:87)
at BIDMach.Learner.train(Learner.scala:53)
... 33 elided
Running the getdata script for the tutorials doesn't work if there is a space in the file path. For example, if the BIDMach distribution is in a folder called /Users/myname/Desktop/Stress\ Project/bidmach, then running getdata.sh gives the following errors:
getdata.sh: line 17: /Users/myname/Desktop/Stress: is a directory
getdata.sh: line 19: /Users/myname/Desktop/Stress: is a directory
getdata.sh: line 21: /Users/myname/Desktop/Stress: is a directory
getdata.sh: line 25: /Users/myname/Desktop/Stress: is a directory
getdata.sh: line 27: /Users/myname/Desktop/Stress: is a directory
Is there a plan to publish an AWS image for 1.0.3? We have tried creating one but ran into a lot of problems.
One of the main things I was looking for is using the predictor for KMeans. The predictor method is missing in 1.0, so I could use 1.0 to train the models and 1.0.3 for prediction. Is there a way to save a model created with 1.0 and use it for prediction in 1.0.3? I was able to save the model in 1.0, but I couldn't find any example of how to load an existing model (for KMeans).
Regards
When I try to run any of the get*.sh scripts, I get a "bad substitution" error:
~/software/BIDMach_0.9.0-linux-x86_64/scripts$ ./getdigits.sh
./getdigits.sh: 3: ./getdigits.sh: Bad substitution
Changing
BIDMACH_SCRIPTS="${BASH_SOURCE[0]}"
to
BIDMACH_SCRIPTS="${BASH_SOURCE}"
seems to help, although later I get the errors:
(...)
Scanning lyrl2004_tokens_test_pt0.dat.gz
1490963 lines
Scanning lyrl2004_tokens_test_pt1.dat.gz
1501165 lines
Scanning lyrl2004_tokens_test_pt2.dat.gz
1489662 lines
Scanning lyrl2004_tokens_test_pt3.dat.gz
1390993 lines
Scanning lyrl2004_tokens_train.dat.gz
171542 lines
Writing Dictionary
2606875 lines processed
./getrcv1.sh: 66: ./getrcv1.sh: bidmach: not found
Loading nips data
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0curl: (77) error setting certificate verify locations:
CAfile: /etc/pki/tls/certs/ca-bundle.crt
CApath: none
Uncompressing docword.nips.txt.gz
gzip: docword.nips.txt.gz: No such file or directory
0 lines processed
./getuci.sh: 31: ./getuci.sh: bidmach: not found
clearing up
(...)
It seems like building a DNN learner using the learnerX method with addConstFeat=true doesn't actually add a constant feature.
val (nn,opts)= DNN.learnerX(indir+"trainData%03d.smat.lz4",indir+"trainLabels%03d.fmat.lz4");
opts.aopts = opts;
opts.featType = 1;
opts.addConstFeat = true;
opts.batchSize=5000;
opts.reg1weight = 0.0001;
opts.lrate = 0.5f;
opts.texp = 0.4f;
opts.npasses = 1;
opts.links = iones(nrValidMoods,1);
DNN.dlayers(5,200,0.5f,132,opts,2);
After training with nn.train, I run
size(nn.modelmats(0))
res17: (Int, Int) = (200,75000)
The input data has 75000 features, so with the constant feature this should be (200,75001). Note that predict behaves as expected and adds the constant feature, so with this option nn.predict gives a dimension-mismatch error.
This is not really an issue, but more of an unintuitive behavior.
When creating a (GLM) learner and predictor in the same function, we get options.autoReset = false by default (for both the learner opts and the predictor opts).
On the other hand, when creating a GLM learner and a GLM predictor independently, we get options.autoReset = true by default for both of them. This means the GLM predictor will reset the model weights (modelmat) to zero after running "predict" once. Should a predictor be able to reset the model weights produced by a learner?
It should be possible to just copy the lib directory from the tarball, but this needs to be documented.
I heard that you have integrated with Caffe; however, Caffe supports GPUs as well.
Could you please list the reasons why BIDMach is needed alongside Caffe?
Also, is Spark integration supported? How does one integrate with Spark?
Please start a https://gitter.im/ chat room for BIDMach / BIDMat. Documentation for the projects is sparse, and adoption will remain limited unless the user community has an opportunity to interact with the project authors and other community members.
This approach is used very successfully by a large number of GitHub-hosted projects.
For compute capability 3.0 or above, the maximum grid dimensions are (x, y, z) = (2147483647, 65535, 65535).
In the code at jni/src/GLM.cu, lines 78-79:
gridp->y = 1 + (nblocks-1)/65536;
gridp->x = 1 + (nblocks-1)/gridp->y;
If nblocks is very large, gridp->y can exceed the 65535 limit.
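A plain-Scala arithmetic sketch of the overflow (the actual file is CUDA C++; the block count here is hypothetical), plus one possible split that keeps y within bounds by letting the much larger x limit absorb blocks first:

```scala
val yMax = 65535L
val xMax = 2147483647L                    // grid x limit for compute capability >= 3.0
val nblocks = 4294967296L                 // hypothetical: 2^32 blocks requested
val yBad = 1 + (nblocks - 1) / 65536      // = 65536, already over yMax
// Possible fix (a sketch, not the project's code): divide by xMax instead,
// so y stays small and x carries the bulk of the blocks.
val y = 1 + (nblocks - 1) / xMax          // = 3
val x = 1 + (nblocks - 1) / y             // = 1431655766
assert(x <= xMax && y <= yMax && x * y >= nblocks)
```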
I went through the Quickstart example on Windows 7. When I try to call mm.train a second time, I get the following error. I need to exit bidmach and run it anew to be able to train again.
scala> mm.train
corpus perplexity=5582,125391
pass= 0
2,00%, ll=-0,693, gf=0,116, secs=6,7, GB=0,02, MB/s= 2,86, GPUmem=0,03
16,00%, ll=-0,134, gf=0,630, secs=15,0, GB=0,12, MB/s= 8,10, GPUmem=0,03
30,00%, ll=-0,123, gf=0,825, secs=21,9, GB=0,22, MB/s=10,16, GPUmem=0,02
44,00%, ll=-0,102, gf=0,930, secs=28,7, GB=0,33, MB/s=11,31, GPUmem=0,02
58,00%, ll=-0,094, gf=0,995, secs=35,6, GB=0,43, MB/s=12,04, GPUmem=0,02
72,00%, ll=-0,074, gf=1,040, secs=42,4, GB=0,53, MB/s=12,49, GPUmem=0,02
87,00%, ll=-0,085, gf=1,075, secs=49,1, GB=0,63, MB/s=12,89, GPUmem=0,02
100,00%, ll=-0,069, gf=1,097, secs=55,8, GB=0,73, MB/s=13,02, GPUmem=0,02
Time=55,8000 secs, gflops=1,10
scala> mm.train
corpus perplexity=5582,125391
java.lang.RuntimeException: CUDA alloc failed initialization error
at BIDMat.GMat$.apply(GMat.scala:1094)
at BIDMat.GMat$.newOrCheckGMat(GMat.scala:1780)
at BIDMat.GMat$.newOrCheckGMat(GMat.scala:1814)
at BIDMat.GMat$.apply(GMat.scala:1100)
at BIDMach.models.RegressionModel.init(Regression.scala:29)
at BIDMach.models.GLM.init(GLM.scala:25)
at BIDMach.Learner.init(Learner.scala:37)
at BIDMach.Learner.train(Learner.scala:45)
at .(:26)
at .()
at .(:7)
at .()
at $print()
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:734)
at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:983)
at scala.tools.nsc.interpreter.IMain.loadAndRunReq$1(IMain.scala:573)
at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:604)
at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:568)
at scala.tools.nsc.interpreter.ILoop.reallyInterpret$1(ILoop.scala:760)
at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:8
at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:717)
at scala.tools.nsc.interpreter.ILoop.processLine$1(ILoop.scala:581)
at scala.tools.nsc.interpreter.ILoop.innerLoop$1(ILoop.scala:588)
at scala.tools.nsc.interpreter.ILoop.loop(ILoop.scala:591)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply$mcZ$sp(ILoop.scala:882)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:837)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:837)
at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
at scala.tools.nsc.interpreter.ILoop.process(ILoop.scala:837)
at scala.tools.nsc.MainGenericRunner.runTarget$1(MainGenericRunner.scala:83)
at scala.tools.nsc.MainGenericRunner.process(MainGenericRunner.scala:96)
at scala.tools.nsc.MainGenericRunner$.main(MainGenericRunner.scala:105)
at scala.tools.nsc.MainGenericRunner.main(MainGenericRunner.scala)
I just downloaded BIDMach and am trying to run the Quickstart steps. I'm not clear on whether it's sufficient to install the executable bundle download and have a JRE, or whether I need to install IPython/Scala as well.
I get this error when running ./bidmach:
Error: Could not find or load main class scala.tools.nsc.MainGenericRunner
I'm getting this in Cygwin when running getdata.sh:
Loading RCV1 v2 data
/cygdrive/e/BidMach/BIDMach_1.0.0-win-x86_64/data/rcv1
e:/BidMach/BIDMach_1.0.0-win-x86_64/scripts/getrcv1.sh: line 29: wget: command not found
e:/BidMach/BIDMach_1.0.0-win-x86_64/scripts/getrcv1.sh: line 29: wget: command not found
e:/BidMach/BIDMach_1.0.0-win-x86_64/scripts/getrcv1.sh: line 29: wget: command not found
e:/BidMach/BIDMach_1.0.0-win-x86_64/scripts/getrcv1.sh: line 29: wget: command not found
e:/BidMach/BIDMach_1.0.0-win-x86_64/scripts/getrcv1.sh: line 38: wget: command not found
e:/BidMach/BIDMach_1.0.0-win-x86_64/scripts/getrcv1.sh: line 44: wget: command not found
gzip: lyrl2004_tokens_test_pt0.dat.gz: No such file or directory
gzip: lyrl2004_tokens_test_pt1.dat.gz: No such file or directory
gzip: lyrl2004_tokens_test_pt2.dat.gz: No such file or directory
gzip: lyrl2004_tokens_test_pt3.dat.gz: No such file or directory
gzip: lyrl2004_tokens_train.dat.gz: No such file or directory
Error: Could not find or load main class scala.tools.nsc.MainGenericRunner
Loading nips data
e:/BidMach/BIDMach_1.0.0-win-x86_64/scripts/getuci.sh: line 26: wget: command not found
mv: cannot stat ‘docword.nips.txt.gz’: No such file or directory
e:/BidMach/BIDMach_1.0.0-win-x86_64/scripts/getuci.sh: line 30: wget: command not found
mv: cannot stat ‘vocab.nips.txt’: No such file or directory
Uncompressing docword.nips.txt.gz
gzip: docword.txt.gz: No such file or directory
Error: Could not find or load main class scala.tools.nsc.MainGenericRunner
mv: cannot stat ‘smat.lz4’: No such file or directory
mv: cannot stat ‘term.sbmat.gz’: No such file or directory
mv: cannot stat ‘term.imat.gz’: No such file or directory
clearing up
Loading nytimes data
e:/BidMach/BIDMach_1.0.0-win-x86_64/scripts/getuci.sh: line 26: wget: command not found
mv: cannot stat ‘docword.nytimes.txt.gz’: No such file or directory
e:/BidMach/BIDMach_1.0.0-win-x86_64/scripts/getuci.sh: line 30: wget: command not found
mv: cannot stat ‘vocab.nytimes.txt’: No such file or directory
Uncompressing docword.nytimes.txt.gz
gzip: docword.txt.gz: No such file or directory
Error: Could not find or load main class scala.tools.nsc.MainGenericRunner
mv: cannot stat ‘smat.lz4’: No such file or directory
mv: cannot stat ‘term.sbmat.gz’: No such file or directory
mv: cannot stat ‘term.imat.gz’: No such file or directory
clearing up
Loading arabic digits data
e:/BidMach/BIDMach_1.0.0-win-x86_64/scripts/getdigits.sh: line 25: wget: command not found
sed: can't read Train_Arabic_Digit.txt: No such file or directory
Error: Could not find or load main class scala.tools.nsc.MainGenericRunner
Loading movielens 10M data
e:/BidMach/BIDMach_1.0.0-win-x86_64/scripts/getmovies.sh: line 26: wget: command not found
e:/BidMach/BIDMach_1.0.0-win-x86_64/scripts/getmovies.sh: line 29: unzip: command not found
e:/BidMach/BIDMach_1.0.0-win-x86_64/scripts/getmovies.sh: line 30: cd: ml-10M100K: No such file or directory
e:/BidMach/BIDMach_1.0.0-win-x86_64/scripts/getmovies.sh: line 31: ./split_ratings.sh: No such file or directory
mv: cannot stat ‘r1.train’: No such file or directory
mv: cannot stat ‘r1.test’: No such file or directory
mv: cannot stat ‘r2.train’: No such file or directory
mv: cannot stat ‘r2.test’: No such file or directory
mv: cannot stat ‘r3.train’: No such file or directory
mv: cannot stat ‘r3.test’: No such file or directory
mv: cannot stat ‘r4.train’: No such file or directory
mv: cannot stat ‘r4.test’: No such file or directory
mv: cannot stat ‘r5.train’: No such file or directory
mv: cannot stat ‘r5.test’: No such file or directory
mv: cannot stat ‘ra.train’: No such file or directory
mv: cannot stat ‘ra.test’: No such file or directory
mv: cannot stat ‘rb.train’: No such file or directory
mv: cannot stat ‘rb.test’: No such file or directory
Error: Could not find or load main class scala.tools.nsc.MainGenericRunner
Hi,
I would like to process some large libSVM files which do not fit into memory. I am trying to use a stacked data source for that, but the program gets stuck on accessing the first chunk of my StackedDS object. Accessing the data sources unstacked seems to work fine.
Here is a script demonstrating the problem. It creates a small libSVM file for input
https://gist.github.com/hcbraun/334d3e9c8da7959d0f37#file-svmstack-scala
This is what I am trying to do:
Any suggestions would be appreciated
Best,
Christian
This error originates from the script getuci.sh.
I verified that the URLs used in the wget requests were valid. The problem seems to occur when running the tparse script on this line.
It might be a good idea to have tparse output some error messages so we know what's going on.
I downloaded BIDMach 1.0.0 for 64-bit Mac OS X.
I'm using OS X Yosemite 10.10.2 with IPython 2.4.1.
After expanding the archive, I run
./bidmach notebook
Shortly after opening any of the IScala tutorial notebooks I get an error message:
"The kernel appears to have died. It will restart automatically."
and then eventually
"The kernel has died, and the automatic restart has failed. It is possible the kernel cannot be restarted. If you are not able to restart the kernel, you will still be able to save the notebook, but running code will no longer work until the notebook is reopened."
The command line output is
2015-02-20 14:01:07.815 [NotebookApp] Using existing profile dir: u'/Users/coryschillaci/.ipython/profile_scala'
2015-02-20 14:01:07.823 [NotebookApp] Using MathJax from CDN: https://cdn.mathjax.org/mathjax/latest/MathJax.js
2015-02-20 14:01:07.847 [NotebookApp] Serving notebooks from local directory: /Users/coryschillaci/Desktop/Stress Project/BIDMach_1.0.0-osx-x86_64
2015-02-20 14:01:07.847 [NotebookApp] 0 active kernels
2015-02-20 14:01:07.848 [NotebookApp] The IPython Notebook is running at: http://localhost:8888/
2015-02-20 14:01:07.848 [NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
2015-02-20 14:01:15.166 [NotebookApp] Kernel started: ef8b0c9b-34f5-4397-b931-67d0c5684209
Error: Could not find or load main class org.refptr.iscala.IScala
2015-02-20 14:01:18.169 [NotebookApp] KernelRestarter: restarting kernel (1/5)
WARNING:root:kernel ef8b0c9b-34f5-4397-b931-67d0c5684209 restarted
Error: Could not find or load main class org.refptr.iscala.IScala
2015-02-20 14:01:21.176 [NotebookApp] KernelRestarter: restarting kernel (2/5)
WARNING:root:kernel ef8b0c9b-34f5-4397-b931-67d0c5684209 restarted
Error: Could not find or load main class org.refptr.iscala.IScala
2015-02-20 14:01:24.185 [NotebookApp] KernelRestarter: restarting kernel (3/5)
WARNING:root:kernel ef8b0c9b-34f5-4397-b931-67d0c5684209 restarted
Error: Could not find or load main class org.refptr.iscala.IScala
2015-02-20 14:01:27.194 [NotebookApp] KernelRestarter: restarting kernel (4/5)
WARNING:root:kernel ef8b0c9b-34f5-4397-b931-67d0c5684209 restarted
Error: Could not find or load main class org.refptr.iscala.IScala
2015-02-20 14:01:30.207 [NotebookApp] WARNING | KernelRestarter: restart failed
2015-02-20 14:01:30.207 [NotebookApp] WARNING | Kernel ef8b0c9b-34f5-4397-b931-67d0c5684209 died, removing from map.
ERROR:root:kernel ef8b0c9b-34f5-4397-b931-67d0c5684209 restarted failed!
I wrote a src/main/scala/exp/lrexp.scala file:
////////////////////////////////////
package exp
import BIDMat.{CMat,CSMat,DMat,Dict,IDict,Image,FMat,FND,GMat,GIMat,GSMat,HMat,IMat,Mat,SMat,SBMat,SDMat}
import BIDMat.MatFunctions._
import BIDMat.SciFunctions._
import BIDMat.Solvers._
import BIDMat.Plotting._
import BIDMach.Learner
import BIDMach.models.{FM,GLM,KMeans,KMeansw,LDA,LDAgibbs,Model,NMF,SFA}
import BIDMach.datasources.{DataSource,MatDS,FilesDS,SFilesDS}
import BIDMach.mixins.{CosineSim,Perplexity,Top,L1Regularizer,L2Regularizer}
import BIDMach.updaters.{ADAGrad,Batch,BatchNorm,IncMult,IncNorm,Telescoping}
import BIDMach.causal.{IPTW}
object LRExp {
  def main(args: Array[String]) = {
    println("LRExp")
    val a = grand(20,30)
    println(a)
  }
}
///////////////////////////////////////
I run ./sbt package to generate a jar, and then type
./sbt "runMain exp.LRExp"
The println("LRExp") executes successfully, but then these errors appear:
[error] java.lang.UnsatisfiedLinkError: Could not load the native library.
[error] Error while loading native library "bidmatmkl-linux-x86_64" with base name "bidmatmkl"
[error] Operating system name: Linux
[error] Architecture : amd64
[error] Architecture bit size: 64
[error] Stack trace from the attempt to load the library as a resource:
[error] java.lang.NullPointerException: No resource found with name '/lib/libbidmatmkl-linux-x86_64.so'
[error] at jcuda.LibUtils.loadLibraryResource(LibUtils.java:149)
However, in the interactive shell, Scala scripts run successfully.
How can I run the program from the command line rather than interactively?
Thanks!
Running ./bidmach notebook from the latest distribution, the terminal output is:
Error: Could not find or load main class getnativepath
2015-02-25 15:06:18.968 [NotebookApp] Using existing profile dir: u'/Users/coryschillaci/.ipython/profile_scala'
2015-02-25 15:06:18.978 [NotebookApp] Using MathJax from CDN: https://cdn.mathjax.org/mathjax/latest/MathJax.js
2015-02-25 15:06:19.008 [NotebookApp] Serving notebooks from local directory: /Users/coryschillaci/Desktop/StressProject/BIDMach_1.0.0-osx-x86_64
2015-02-25 15:06:19.008 [NotebookApp] 0 active kernels
2015-02-25 15:06:19.008 [NotebookApp] The IPython Notebook is running at: http://localhost:8888/
2015-02-25 15:06:19.008 [NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
^C2015-02-25 15:06:42.219 [NotebookApp] interrupted
Serving notebooks from local directory: /Users/coryschillaci/Desktop/StressProject/BIDMach_1.0.0-osx-x86_64
0 active kernels
The IPython Notebook is running at: http://localhost:8888/
When I run the BIDMach_intro.ipynb tutorial, the first cell output is
Cant find native CPU libraries
Cant find native HDF5 library
Couldnt load CUDA runtime
Out[1]:
()
The first error occurs when I run the section called "Transposed Multiplies"; the first input gives
java.lang.UnsatisfiedLinkError: Could not load the native library.
Error while loading native library "bidmatmkl-apple-x86_64" with base name "bidmatmkl"
Operating system name: Mac OS X
Architecture : x86_64
Architecture bit size: 64
Stack trace from the attempt to load the library as a resource:
java.lang.NullPointerException: No resource found with name '/lib/libbidmatmkl-apple-x86_64.jnilib'
at jcuda.LibUtils.loadLibraryResource(LibUtils.java:149)
at jcuda.LibUtils.loadLibrary(LibUtils.java:83)
at edu.berkeley.bid.CBLAS.(CBLAS.java:8)
at BIDMat.FMat.Tmult(FMat.scala:565)
at BIDMat.FPair.Tx(FMat.scala:1113)
at BIDMat.Mop_TTimes$.op(Operators.scala:370)
at BIDMat.Mop$class.op(Operators.scala:37)
at BIDMat.Mop_TTimes$.op(Operators.scala:368)
at BIDMat.FMat.$up$times(FMat.scala:888)
at .(:48)
at .()
at .$result$lzycompute(:5)
at .$result(:5)
at $result()
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:739)
at org.refptr.iscala.Interpreter$$anonfun$9.apply(Interpreter.scala:206)
at org.refptr.iscala.Interpreter.withException(Interpreter.scala:101)
at org.refptr.iscala.Interpreter.loadAndRunReq(Interpreter.scala:206)
at org.refptr.iscala.Interpreter$$anonfun$interpret$1.apply(Interpreter.scala:245)
at org.refptr.iscala.Interpreter$$anonfun$interpret$1.apply(Interpreter.scala:245)
at org.refptr.iscala.Runner$Execution$$anonfun$1.apply$mcV$sp(Runner.scala:28)
at org.refptr.iscala.IOUtil$$anon$2.run(Util.scala:21)
at java.lang.Thread.run(Thread.java:745)
Stack trace from the attempt to load the library as a file:
java.lang.UnsatisfiedLinkError: no bidmatmkl-apple-x86_64 in java.library.path
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1886)
at java.lang.Runtime.loadLibrary0(Runtime.java:849)
at java.lang.System.loadLibrary(System.java:1088)
at jcuda.LibUtils.loadLibrary(LibUtils.java:94)
at edu.berkeley.bid.CBLAS.(CBLAS.java:8)
at BIDMat.FMat.Tmult(FMat.scala:565)
at BIDMat.FPair.Tx(FMat.scala:1113)
at BIDMat.Mop_TTimes$.op(Operators.scala:370)
at BIDMat.Mop$class.op(Operators.scala:37)
at BIDMat.Mop_TTimes$.op(Operators.scala:368)
at BIDMat.FMat.$up$times(FMat.scala:888)
at .(:48)
at .()
at .$result$lzycompute(:5)
at .$result(:5)
at $result()
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:739)
at org.refptr.iscala.Interpreter$$anonfun$9.apply(Interpreter.scala:206)
at org.refptr.iscala.Interpreter.withException(Interpreter.scala:101)
at org.refptr.iscala.Interpreter.loadAndRunReq(Interpreter.scala:206)
at org.refptr.iscala.Interpreter$$anonfun$interpret$1.apply(Interpreter.scala:245)
at org.refptr.iscala.Interpreter$$anonfun$interpret$1.apply(Interpreter.scala:245)
at org.refptr.iscala.Runner$Execution$$anonfun$1.apply$mcV$sp(Runner.scala:28)
at org.refptr.iscala.IOUtil$$anon$2.run(Util.scala:21)
at java.lang.Thread.run(Thread.java:745)
jcuda.LibUtils.loadLibrary(LibUtils.java:126)
edu.berkeley.bid.CBLAS.(CBLAS.java:8)
BIDMat.FMat.Tmult(FMat.scala:565)
BIDMat.FPair.Tx(FMat.scala:1113)
BIDMat.Mop_TTimes$.op(Operators.scala:370)
BIDMat.Mop$class.op(Operators.scala:37)
BIDMat.Mop_TTimes$.op(Operators.scala:368)
BIDMat.FMat.$up$times(FMat.scala:888)
Hi,
Are there any plans to build word2vec implementation?
Thanks
It seems that the scripts/getdata.sh script has resurfaced this old issue with a bad path. I think this is an easy fix?
As the Quickstart says, BIDMach can train 103 models all at once if the training data has 103 targets.
If I want to train a binary LR model, should the label matrix look like the following?
0 1 1 0 ....
1 0 0 1 ....
(rows = 2, cols = number of training instances)
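A hedged plain-Scala sketch of building that 2 x n layout from a flat 0/1 label vector (in BIDMach the result would be an FMat; the row order shown here is just a convention):

```scala
// Hypothetical labels, one per training instance.
val labels = Array(0, 1, 1, 0)
val row0 = labels.map(l => 1 - l)   // indicator of class 0: 1 0 0 1
val row1 = labels                   // indicator of class 1: 0 1 1 0
// targets(k)(i) == 1 iff instance i belongs to class k; rows = 2, cols = n.
val targets = Array(row0, row1)
```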
There is another problem:
I use loadLibSVM to load libsvm training files, whose format is:
label index1:value1 index2:value2 ...
val(a,c,_)=loadLibSVM(train_file, feat_num)
val (nn, opts) =GLM.learner(a, c, 1)
nn.train
It then crashes with a "scala.MatchError" message.
Looking at HMat.scala, loadLibSVM just puts all the labels of the training instances into an array, not a matrix.
Is there a function that loads libsvm training files into the expected label matrix?
Thanks!
When I run the first cell of BIDMach_intro I get the following result even though I have CUDA 6.5 installed:
Cant find native CPU libraries
Cant find native HDF5 library
Couldnt load CUDA runtime
Am I missing environment variables?
I'm attempting to save a model in the GoogleW2V format. I've pulled the latest code (from June 2015) and compiled it. However, when attempting to use saveGoogleW2V I get a "not a member" error.
In the code I compiled, I can see the method in the Word2Vec companion object. I'm not that familiar with Scala, but I thought I should be able to call the method in a similar manner to a Java static:
Word2Vec.saveGoogleW2V(...)
Am I misunderstanding how companion object methods work? Or is this a bug of some sort?
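That call form is right for companion objects in general: methods on a companion object are invoked on the object itself, much like Java statics. A self-contained sketch of the mechanism (unrelated to BIDMach, so it says nothing about why this particular build fails):

```scala
// A class and its companion object: the object's methods behave like statics.
class Counter(val n: Int)
object Counter {
  def make(n: Int): Counter = new Counter(n)  // "static-style" factory method
}
val c = Counter.make(3)   // called on the object, no instance required
// c.n == 3
```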
Hi,
I'm new here. The benchmarks look amazing. I'm interested in trying PageRank and matrix factorization on the GPU. Do you have any documentation about running these two applications? Could you also point me to the CUDA code for them?
Thanks,
Cui
I am trying to get the precompiled BIDMach_1.0.3-linux-x86_64/bidmach to work with RF binary classification on a sample dataset, but I'm getting an error:
scala> val trg = loadIMat("../../RF_project/gisette/gisette_train.data.txt")
trg: BIDMat.IMat =
550 0 495 0 0 0 0 976 0 0 0 0 983 0 995 0 983 0 0 983 0 0 0 0...
0 0 0 0 0 0 0 976 0 0 0 0 0 0 584 0 0 0 0 0 0 0 0 0...
0 0 0 0 0 0 0 0 0 0 0 0 983 0 995 983 976 0 0 0 0 0 0 0...
0 0 742 0 0 0 0 684 0 956 0 0 983 0 991 816 983 0 0 0 0 0 0 0...
0 0 0 0 0 0 0 608 0 979 0 0 0 0 972 0 0 0 0 0 0 0 0 480...
.. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..
scala> val (nn,mopts) =RandomForest.learner(trg,yg)
nn: BIDMach.Learner = Learner(BIDMach.datasources.MatDS@61d91a71,BIDMach.models.RandomForest@745722e6,null,BIDMach.updaters.Batch@4b465b6,BIDMach.models.RandomForest$RFSopts@5f819223)
mopts: BIDMach.models.RandomForest.RFSopts = BIDMach.models.RandomForest$RFSopts@5f819223
scala> nn.train
java.lang.RuntimeException: colslice index out of range 5000 1
at BIDMat.DenseMat$mcI$sp.gcolslice$mcI$sp(DenseMat.scala:469)
at BIDMat.IMat.colslice(IMat.scala:99)
at BIDMat.IMat.colslice(IMat.scala:7)
at BIDMach.datasources.MatDS$$anonfun$next$1.apply$mcVI$sp(MatDS.scala:43)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
at BIDMach.datasources.MatDS.next(MatDS.scala:41)
at BIDMach.models.Model.bind(Model.scala:81)
at BIDMach.Learner.init(Learner.scala:42)
at BIDMach.Learner.train(Learner.scala:52)
... 33 elided
The training dataset is space-separated integer data from UCI (the gisette dataset). I can't believe that column slicing doesn't work for RF in BIDMach, so I must be doing something obviously wrong. Notably, RandomForest.learner does not accept the same data in floating point, even though the BIDMach API doc says that discrete (integer) values are only needed for regression testing.
Any advice?
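One hedged guess to check (an assumption, not a confirmed diagnosis): BIDMach data sources generally lay data out with one column per instance, and "colslice index out of range 5000 1" suggests one of the inputs has instances along the wrong axis. A sketch of the transposed call, reusing the variable names above:

```scala
// Assumption: the file loaded as instances x features, so transpose both
// inputs to get features x instances and a 1 x n row of labels.
val trgT = trg.t
val ygT  = yg.t
val (nn, mopts) = RandomForest.learner(trgT, ygT)
nn.train
```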
Hi. Is it possible to do parallel graph processing using BIDMach? Specifically I'm looking to implement belief propagation on factor graphs.
What should I do if I want to load somewhat larger data, like the Yahoo Music dataset, and convert it to a sparse matrix? Must I first split the data into pieces? If I convert the data directly, as follows, I get a Java heap problem.
Thanks a lot!
BIDMach_1.0.0-linux-x86_64/bidmach getdata.ssc
Loading /home/ubuntu/BIDMach_1.0.0-linux-x86_64/lib/bidmach_init.scala...
import BIDMat.{CMat, CSMat, DMat, Dict, FMat, FND, GMat, GDMat, GIMat, GLMat, GSMat, GSDMat, HMat, IDict, Image, IMat, LMat, Mat, SMat, SBMat, SDMat}
import BIDMat.MatFunctions._
import BIDMat.SciFunctions._
import BIDMat.Solvers._
import BIDMat.Plotting._
import BIDMach.Learner
import BIDMach.models.{DNN, FM, GLM, KMeans, KMeansw, LDA, LDAgibbs, Model, NMF, SFA, RandomForest}
import BIDMach.datasources.{DataSource, MatDS, FilesDS, SFilesDS}
import BIDMach.mixins.{CosineSim, Perplexity, Top, L1Regularizer, L2Regularizer}
import BIDMach.updaters.{ADAGrad, Batch, BatchNorm, IncMult, IncNorm, Telescoping}
import BIDMach.causal.IPTW
4 CUDA devices found, CUDA version 6.5
Loading getdata.ssc...
nusers: Int = 1000990
nmovies: Int = 624961
nu: Int = 480189
nm: Int = 17770
ubuntu@ip-10-97-178-101:~$ vim getdata.ssc
ubuntu@ip-10-97-178-101:~$ BIDMach_1.0.0-linux-x86_64/bidmach getdata.ssc
Loading /home/ubuntu/BIDMach_1.0.0-linux-x86_64/lib/bidmach_init.scala...
import BIDMat.{CMat, CSMat, DMat, Dict, FMat, FND, GMat, GDMat, GIMat, GLMat, GSMat, GSDMat, HMat, IDict, Image, IMat, LMat, Mat, SMat, SBMat, SDMat}
import BIDMat.MatFunctions._
import BIDMat.SciFunctions._
import BIDMat.Solvers._
import BIDMat.Plotting._
import BIDMach.Learner
import BIDMach.models.{DNN, FM, GLM, KMeans, KMeansw, LDA, LDAgibbs, Model, NMF, SFA, RandomForest}
import BIDMach.datasources.{DataSource, MatDS, FilesDS, SFilesDS}
import BIDMach.mixins.{CosineSim, Perplexity, Top, L1Regularizer, L2Regularizer}
import BIDMach.updaters.{ADAGrad, Batch, BatchNorm, IncMult, IncNorm, Telescoping}
import BIDMach.causal.IPTW
4 CUDA devices found, CUDA version 6.5
Loading getdata.ssc...
nusers: Int = 1000990
nmovies: Int = 624961
a: BIDMat.DMat =
1 507697 5.5000 1
1 137916 5.5000 1
1 22758 5.5000 1
1 120329 5.5000 1
.. .. .. ..
java.lang.OutOfMemoryError: Java heap space
at BIDMat.SparseMat$.sparseImpl$mFc$sp(SparseMat.scala:822)
at BIDMat.MatFunctions$.sparse(MatFunctions.scala:1238)
... 30 elided
:27: error: not found: value sa
sa.check
^
:27: error: not found: value sa
saveSMat("yahoo_train.smat.lz4", sa);
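One way to avoid the heap blow-up is to never materialize the full triple list at once: read a bounded number of entries, convert each chunk, then save or concatenate the pieces (in BIDMach terms, a sparse(...) plus saveSMat per chunk), or simply give the JVM a larger heap via -Xmx. The helper below is a plain-Scala sketch of the chunking idea; the names are illustrative, not BIDMach API:

```scala
// Illustrative chunking helper: process a large stream of records in
// fixed-size groups so only one chunk is resident in memory at a time.
def processInChunks[A, B](items: Iterator[A], chunkSize: Int)(convert: Seq[A] => B): List[B] =
  items.grouped(chunkSize).map(group => convert(group.toSeq)).toList
```

Each converted chunk can be written to disk immediately, so peak memory is bounded by chunkSize rather than by the dataset size.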
dav@mercury:~⟫ /opt/BIDMach_1.0.0-full-linux-x86_64/bin/xmltweet.exe -i /var/local/destress/lj-annex/data/events/aa/aanniieed.xml -o ./aanniieed
Scanning /var/local/destress/lj-annex/data/events/aa/aanniieed.xml
00002 lines
Couldnt open output file ./aanniieed/var/local/destress/lj-annex/data/events/aa/aanniieed.xml.imat
terminate called without an active exception
Aborted (core dumped)
Is a predictor function planned for LDA?
Could you please show the support list for distributed Deep Learning networks?
How do you do data partitioning / model partitioning in BIDMach?
Steps:
Note: While that jar is included in the release tar, it isn't in the repo so building from the repo doesn't work.
Steps:
Note: At least on Ubuntu 14.04 /bin/sh is not bash and the script assumes that it is.
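Until the script is POSIX-clean, a workaround is to run it under bash explicitly (e.g. bash scripts/getdata.sh) instead of relying on /bin/sh. The snippet below just demonstrates a typical bashism, the `[[ ]]` test, that dash rejects but bash accepts:

```shell
# dash (Ubuntu's /bin/sh) has no [[ ]]; bash does. Invoking bash
# explicitly sidesteps the sh/bash mismatch:
bash -c '[[ "abc" == a* ]] && echo bash-ok'
```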
Hi,
I have three commodity computers with GPUs, and I want to deploy BIDMach on a cluster of these three machines. Is there a way to do this? I went through the project home page and found no tutorial about it. Does BIDMach work on a cluster? Thank you.
I am just getting started. When I try to run BIDMach (root/bidmach), I get this error.
Error: Could not find or load main class scala.tools.nsc.MainGenericRunner
I have scala 2.10.5 installed and am on Ubuntu 14.04. Is there a flag I need to add to fix this error?
I was able to train a FM model, but how do I predict with it?
The GLM model has a learner() method that creates two learners, one for training and one for predicting. I can use that to train and then test with out-of-sample data.
But the FM model does not have the same method and it is not clear from the source code how to proceed.
Any suggestions would be appreciated.
Best,
-Doug
In the wiki's Data Wrangling section, the function cat2sparse() is mentioned, but it cannot be found. Where is it?
A minor annoyance: when opening the tutorials (via ./bidmach notebook), browsing to the tutorials, and clicking any one of them, we get this message from IPython.
I looked at this, and it might be a false alarm from IPython; there is a suggested solution here.
Installed latest cuda, have java 7.
Installed the tarball per instructions from the download site.
If I run the scripts/getdata.sh
dougs-mbp:BIDMach_0.9.5-osx-x86_64 dloyer$ ./scripts/getdata.sh
./scripts/getdata.sh: line 20: /Users/dloyer/Downloads/BIDMach_0.9.5-osx-x86_64/getrcv1.sh: No such file or directory
./scripts/getdata.sh: line 22: /Users/dloyer/Downloads/BIDMach_0.9.5-osx-x86_64/getuci.sh: No such file or directory
./scripts/getdata.sh: line 24: /Users/dloyer/Downloads/BIDMach_0.9.5-osx-x86_64/getuci.sh: No such file or directory
./scripts/getdata.sh: line 28: /Users/dloyer/Downloads/BIDMach_0.9.5-osx-x86_64/getdigits.sh: No such file or directory
If I cd to scripts and run getrcv1.sh, I get further, but get a different error message...
....
Scanning lyrl2004_tokens_train.dat.gz
171542 lines
Writing Dictionary
2606875 lines processed
/var/folders/3l/s60hgztj5_zc_chmj8hvy4gm0000gn/T/scalacmd2525326315882625671.scala:1: error: not found: value BIDMat
import BIDMat.{CMat,CSMat,DMat,Dict,IDict,Image,FMat,FND,GMat,GIMat,GSMat,HMat,IMat,Mat,SMat,SBMat,SDMat}
^
/var/folders/3l/s60hgztj5_zc_chmj8hvy4gm0000gn/T/scalacmd2525326315882625671.scala:2: error: not found: value BIDMat
import BIDMat.MatFunctions._
^
....
I get the following error while running getdata.sh. Execution continues afterwards, and then a similar error happens while processing nytimes.
Are these errors related to the "Couldnt load JCuda" message, or unrelated to it?
Loading nips data
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 2234k 100 2234k 0 0 155k 0 0:00:14 0:00:14 --:--:-- 176k
Couldnt load JCuda
Processing nips.
java.lang.ArrayIndexOutOfBoundsException: 0
at BIDMat.DenseMat$mcI$sp.ggReduceOp$mcI$sp(DenseMat.scala:907)
at BIDMat.IMat.iiReduceOp(IMat.scala:120)
at BIDMat.SciFunctions$.maxi(SciFunctions.scala:520)
at BIDMat.MatFunctions$.cols2sparse(MatFunctions.scala:1338)
at BIDMach.NYTIMES$.preprocess(Experiments.scala:30)
at Main$$anon$1.(scalacmd7501012979133857457.scala:16)
at Main$.main(scalacmd7501012979133857457.scala:1)
at Main.main(scalacmd7501012979133857457.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at scala.tools.nsc.util.ScalaClassLoader$$anonfun$run$1.apply(ScalaClassLoader.scala:71)
at scala.tools.nsc.util.ScalaClassLoader$class.asContext(ScalaClassLoader.scala:31)
at scala.tools.nsc.util.ScalaClassLoader$URLClassLoader.asContext(ScalaClassLoader.scala:139)
at scala.tools.nsc.util.ScalaClassLoader$class.run(ScalaClassLoader.scala:71)
at scala.tools.nsc.util.ScalaClassLoader$URLClassLoader.run(ScalaClassLoader.scala:139)
at scala.tools.nsc.CommonRunner$class.run(ObjectRunner.scala:28)
at scala.tools.nsc.ObjectRunner$.run(ObjectRunner.scala:45)
at scala.tools.nsc.CommonRunner$class.runAndCatch(ObjectRunner.scala:35)
at scala.tools.nsc.ObjectRunner$.runAndCatch(ObjectRunner.scala:45)
at scala.tools.nsc.ScriptRunner.scala$tools$nsc$ScriptRunner$$runCompiled(ScriptRunner.scala:171)
at scala.tools.nsc.ScriptRunner$$anonfun$runCommand$1.apply(ScriptRunner.scala:218)
at scala.tools.nsc.ScriptRunner$$anonfun$runCommand$1.apply(ScriptRunner.scala:218)
at scala.tools.nsc.ScriptRunner$$anonfun$withCompiledScript$1.apply$mcZ$sp(ScriptRunner.scala:157)
at scala.tools.nsc.ScriptRunner$$anonfun$withCompiledScript$1.apply(ScriptRunner.scala:131)
at scala.tools.nsc.ScriptRunner$$anonfun$withCompiledScript$1.apply(ScriptRunner.scala:131)
at scala.tools.nsc.util.package$.trackingThreads(package.scala:51)
at scala.tools.nsc.util.package$.waitingForThreads(package.scala:35)
at scala.tools.nsc.ScriptRunner.withCompiledScript(ScriptRunner.scala:130)
at scala.tools.nsc.ScriptRunner.runCommand(ScriptRunner.scala:218)
at scala.tools.nsc.MainGenericRunner.process(MainGenericRunner.scala:94)
at scala.tools.nsc.MainGenericRunner$.main(MainGenericRunner.scala:105)
at scala.tools.nsc.MainGenericRunner.main(MainGenericRunner.scala)
clearing up
Loading nytimes data
...
Couldnt load JCuda
Processing nytimes.
java.lang.ArrayIndexOutOfBoundsException: 0
at BIDMat.DenseMat$mcI$sp.ggReduceOp$mcI$sp(DenseMat.scala:907)
at BIDMat.IMat.iiReduceOp(IMat.scala:120)
at BIDMat.SciFunctions$.maxi(SciFunctions.scala:520)
at BIDMat.MatFunctions$.cols2sparse(MatFunctions.scala:1338)
at BIDMach.NYTIMES$.preprocess(Experiments.scala:30)
at Main$$anon$1.(scalacmd7752450377490059554.scala:16)
at Main$.main(scalacmd7752450377490059554.scala:1)
at Main.main(scalacmd7752450377490059554.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at scala.tools.nsc.util.ScalaClassLoader$$anonfun$run$1.apply(ScalaClassLoader.scala:71)
at scala.tools.nsc.util.ScalaClassLoader$class.asContext(ScalaClassLoader.scala:31)
at scala.tools.nsc.util.ScalaClassLoader$URLClassLoader.asContext(ScalaClassLoader.scala:139)
at scala.tools.nsc.util.ScalaClassLoader$class.run(ScalaClassLoader.scala:71)
at scala.tools.nsc.util.ScalaClassLoader$URLClassLoader.run(ScalaClassLoader.scala:139)
at scala.tools.nsc.CommonRunner$class.run(ObjectRunner.scala:28)
at scala.tools.nsc.ObjectRunner$.run(ObjectRunner.scala:45)
at scala.tools.nsc.CommonRunner$class.runAndCatch(ObjectRunner.scala:35)
at scala.tools.nsc.ObjectRunner$.runAndCatch(ObjectRunner.scala:45)
at scala.tools.nsc.ScriptRunner.scala$tools$nsc$ScriptRunner$$runCompiled(ScriptRunner.scala:171)
at scala.tools.nsc.ScriptRunner$$anonfun$runCommand$1.apply(ScriptRunner.scala:218)
at scala.tools.nsc.ScriptRunner$$anonfun$runCommand$1.apply(ScriptRunner.scala:218)
at scala.tools.nsc.ScriptRunner$$anonfun$withCompiledScript$1.apply$mcZ$sp(ScriptRunner.scala:157)
at scala.tools.nsc.ScriptRunner$$anonfun$withCompiledScript$1.apply(ScriptRunner.scala:131)
at scala.tools.nsc.ScriptRunner$$anonfun$withCompiledScript$1.apply(ScriptRunner.scala:131)
at scala.tools.nsc.util.package$.trackingThreads(package.scala:51)
at scala.tools.nsc.util.package$.waitingForThreads(package.scala:35)
at scala.tools.nsc.ScriptRunner.withCompiledScript(ScriptRunner.scala:130)
at scala.tools.nsc.ScriptRunner.runCommand(ScriptRunner.scala:218)
at scala.tools.nsc.MainGenericRunner.process(MainGenericRunner.scala:94)
at scala.tools.nsc.MainGenericRunner$.main(MainGenericRunner.scala:105)
at scala.tools.nsc.MainGenericRunner.main(MainGenericRunner.scala)
clearing up
Loading arabic digits data
$
I'm using Mac OS X 10.9.5 and have run into the same problems I outlined in BIDData/BIDMat#18. Here's an example of a problem for quick reference:
scala> val diag = BIDMat.GMat(1 on 2 on 3)
diag: BIDMat.GMat =
1
2
3
scala> mkdiag(diag)
java.lang.RuntimeException: mkdiag requires a vector argument, but dims= 3 1
at BIDMat.GMat.mkdiag(GMat.scala:643)
at BIDMat.MatFunctions$.mkdiag(MatFunctions.scala:1442)
... 33 elided
scala>
The issue is that while I've added some fixes to BIDMat that resolve the above problem (plus others), they are not reflected in the latest BIDMat.jar that ships with BIDMach. The problem above happens when I download the fresh 1.0.0 Mac bundle, type ./bidmach, and then enter the commands above. Thus, the solution is simply to recompile BIDMat to get a fresh BIDMat.jar and copy it into BIDMach's "lib" directory in the 1.0.0 bundle; that should fix things.
Also, the BIDMat API docs need to be updated, since they do not include the recent fixes and updated documentation.
Hi All,
I'm getting the following error when trying to run clustering with KMeans on the GPU.
The problem occurs in the following cases:
1) input 1M records / 10 attributes / k=100 / 10 iterations
2) input 30M records / 10 attributes / k=10 / 10 iterations
but it works for:
1) input 10M records / 10 attributes / k=10 / 10 iterations
You reported running KMeans on a 100M dataset with a GTX 680 GPU, which according to the specifications has 2 GB of RAM, so I think it should also work in my case (GTX 860M, 2 GB), or am I missing something?
Also, do you know why my card is reported as:
1 CUDA device found, CUDA version 5.5
while I'm running CUDA 6.0?
Regards,
Marek
java.lang.RuntimeException: CUDA alloc failed initialization error
at BIDMat.GMat$.apply(GMat.scala:1094)
at BIDMat.GMat$.newOrCheckGMat(GMat.scala:1780)
at BIDMat.GMat$.newOrCheckGMat(GMat.scala:1814)
at BIDMat.GMat$.apply(GMat.scala:1100)
at BIDMach.models.ClusteringModel.init(Clustering.scala:21)
at BIDMach.models.KMeans.init(KMeans.scala:34)
at BIDMach.Learner.init(Learner.scala:37)
at BIDMach.Learner.train(Learner.scala:45)
at .(:26)
at .()
at .(:7)
at .()
at $print()
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:734)
at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:983)
at scala.tools.nsc.interpreter.IMain.loadAndRunReq$1(IMain.scala:573)
at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:604)
at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:568)
at scala.tools.nsc.interpreter.ILoop.reallyInterpret$1(ILoop.scala:760)
at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:805)
at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:717)
at scala.tools.nsc.interpreter.ILoop.processLine$1(ILoop.scala:581)
at scala.tools.nsc.interpreter.ILoop.innerLoop$1(ILoop.scala:588)
at scala.tools.nsc.interpreter.ILoop.loop(ILoop.scala:591)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply$mcZ$sp(ILoop.scala:882)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:837)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:837)
at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
at scala.tools.nsc.interpreter.ILoop.process(ILoop.scala:837)
at scala.tools.nsc.MainGenericRunner.runTarget$1(MainGenericRunner.scala:83)
at scala.tools.nsc.MainGenericRunner.process(MainGenericRunner.scala:96)
at scala.tools.nsc.MainGenericRunner$.main(MainGenericRunner.scala:105)
at scala.tools.nsc.MainGenericRunner.main(MainGenericRunner.scala)
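For what it's worth, the raw input sizes alone don't obviously explain the failure pattern above (assuming dense single-precision storage): only the 30M-record case even approaches the 2 GB card, while the failing 1M / k=100 case is tiny, so the "initialization error" may be a CUDA setup issue rather than memory exhaustion. A quick back-of-envelope check:

```scala
// Raw data size in GiB for a dense Float matrix of records x attributes
// (4 bytes per Float).
def rawGiB(records: Long, attrs: Long): Double =
  records * attrs * 4.0 / (1L << 30)

// 30M x 10 ~ 1.12 GiB, 10M x 10 ~ 0.37 GiB, 1M x 10 ~ 0.04 GiB
```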
xmltweet indexes tokens from 1, and also stores these tokens in a dictionary file. However, when Scala reads in the dictionary, it indexes from 0. So, if you look up the index of a token in the dictionary, it will be one less than the value used in the parsed imat file.
This is easy enough to address by adding or subtracting 1, but we'd like to do this in a way that minimizes error. We talked about perhaps inserting a junk entry as the 0th element when you read the dictionary into Scala. But, we wanted to check with you to see if you had a different idea.
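One concrete version of the junk-entry idea, as a plain-Scala sketch (the sentinel value and helper name are my own): prepend a placeholder at index 0 when loading the dictionary, so the 1-based ids in the imat file index the padded dictionary directly, with no +1/-1 arithmetic scattered through the code.

```scala
// Prepend a sentinel so 1-based token ids from the imat file can be used
// directly as indices into the loaded dictionary.
def padDictionary(words: IndexedSeq[String]): IndexedSeq[String] =
  "<null>" +: words
```

With this, a token stored as id 1 by xmltweet looks up padDictionary(words)(1), which is the first real word.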
John,
It looks like I didn't catch a few test cases when we were checking multinomial2. I found some odd behavior this morning. Just to recap, the definition of multinomial2 is:
int multinomial2(int nrows, int ncols, float *A, int *B, int nvals)
where:
- nrows is the number of rows of "matrix" A
- ncols is the number of columns of "matrix" A
- A is a pointer to a data matrix whose columns correspond to un-normalized probability distributions
- B is a GIMat of the same dimensions as A that holds the sampling results
- nvals is the number of samples we want to draw, where each sample makes a k-way decision

Unfortunately, from what I can tell, the multinomial2 sampling is ignoring a row and putting the results in a different row. To demonstrate:
scala> import edu.berkeley.bid.CUMACH._
import edu.berkeley.bid.CUMACH._
scala> val test4 = grand(2,1000)
test4: BIDMat.GMat =
0.40568 0.11102 0.97801 0.96959 0.83026 0.40141 0.41202 0.61166 0.20048 0.22633 0.12287 0.18022 0.32572 0.40397 0.86652 0.31198 0.17791 0.56108 0.59852...
0.59785 0.32611 0.46033 0.17353 0.79053 0.91425 0.43692 0.55500 0.49750 0.96141 0.81040 0.62431 0.81548 0.96281 0.011548 0.73854 0.26540 0.56230 0.50416...
scala> val out4 = gizeros(2,1000)
out4: BIDMat.GIMat =
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0...
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0...
scala> multinomial2(2,1000,test4.data, out4.data,1000)
res12: Int = 0
scala> out4
res13: BIDMat.GIMat =
1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000...
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0...
You can see that out4 allocates all 1000 samples to the first value (the first row), whereas, since the input was randomized, the two rows should receive roughly equal counts when we sum across them.
As another example with a larger matrix, which shows a more interesting result, we get:
scala> val test6 = grand(100,1000)
test6: BIDMat.GMat =
0.64936 0.32890 0.80571 0.060403 0.46713 0.11794 0.35396 0.51864 0.55048 0.31779 0.11267 0.84305 0.063050 0.16121 0.13459 0.15608 0.23206 0.75661 0.76700...
0.30496 0.52751 0.28042 0.15433 0.15183 0.54877 0.97555 0.73333 0.86240 0.63230 0.41277 0.32537 0.57536 0.73076 0.44918 0.69297 0.33638 0.043051 0.22322...
0.37311 0.080803 0.53221 0.45509 0.65039 0.92046 0.16711 0.18513 0.28504 0.79746 0.015594 0.57462 0.45046 0.76908 0.48837 0.58818 0.99301 0.95240 0.72313...
.. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..
scala> val out6 = gizeros(100,1000)
out6: BIDMat.GIMat =
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0...
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0...
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0...
.. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .....
scala> multinomial2(100,1000,test6.data,out6.data,10000)
res16: Int = 0
scala> out6
res17: BIDMat.GIMat =
173 174 177 44 143 153 264 231 258 221 101 235 129 177 108 162 98 169 222 97 238 131 323 219 305 211 250 254 241 179 276 199 262 105 34 274 248 243 133...
87 14 87 111 128 172 31 38 38 155 4 99 96 147 89 123 194 171 133 220 47 110 116 80 120 155 85 186 80 115 163 120 64 195 8 8 59 49 157...
135 135 49 4 28 159 14 152 190 117 20 212 134 18 15 179 96 185 71 26 19 15 175 204 95 131 129 138 10 113 72 36 28 193 31 105 162 135 154...
.. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ...
scala> val sums = sum(IMat(out6),2)
sums: BIDMat.IMat =
202950
100015
99611
..
scala> sums.t
res18: BIDMat.IMat = 202950,100015,99611,100062,98564,98280,101494,99784,101114,99836,
100234,100544,97378,98849,99934,104187,101520,100814,100156,100427,100484,101568,99160,
98448,102099,101257,100595,101254,99316,101159,101845,99685,99266,99291,100698,98455,
98149,97856,103206,98644,99495,101699,102006,98426,100417,99976,99024,102430,99809,
99613,100951,102319,98455,101879,103332,102684,100135,97815,101072,101160,98194,100579,
100051,98683,97189,102728,100597,98409,98188,97562,98516,99912,98672,97695,100228,
101165,95438,101559,98860,99721,100042,99101,100166,101560,100659,99598,97924,100269,
103065,98137,99056,99702,100359,98133,98068,102242,98344,98157,100587,0
What happens here is that it looks like the first row "took" the samples from the last row. The first row sums up to be about 200k, and the other rows (except the last one) are about 100k. Also, even if you disregard the last row, some of the output doesn't make sense. The fourth column, for instance, has 111 samples in the second row and 4 samples in the third. But when we look at the test6 matrix, we see un-normalized probabilities of 0.15 and 0.45, respectively, so how come the ratio is 111-to-4 despite the probability ratio being 0.15-to-0.45?
I strongly suspect that there is an edge case problem that's causing one row to lose its samples, and maybe fixing that will resolve the ratio difference I just observed.
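To pin down the edge case, it may help to compare the kernel against a trivially correct CPU reference. The sketch below (plain Scala, not BIDMach API) draws nvals samples from one column's un-normalized distribution by an inverse-CDF walk and tallies counts per row; running it over each column of a small test matrix gives expected counts to diff against the GPU output:

```scala
import scala.util.Random

// Reference sampler: draw `nvals` row indices from the un-normalized
// distribution `col`, returning the per-row sample counts.
def multinomialRef(col: Array[Float], nvals: Int, rng: Random): Array[Int] = {
  val total  = col.sum
  val counts = new Array[Int](col.length)
  var i = 0
  while (i < nvals) {
    var u = rng.nextFloat() * total // uniform draw over the total mass
    var r = 0
    while (r < col.length - 1 && u > col(r)) { u -= col(r); r += 1 }
    counts(r) += 1
    i += 1
  }
  counts
}
```

For a column like Array(0.15f, 0.45f), the counts should come out near a 1:3 ratio, which makes the 111-to-4 result above clearly wrong.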
Hi,
The BID project is great! Could you publish it to a Maven repository, so that people could include it easily by adding the dependencies to their pom.xml files?
Lizhen
When I pulled the latest version and then tried to compile, sbt gives the following errors:
~/BIDMach$ sbt package
[info] Set current project to BIDMach (in build file:/home/schillaci/BIDMach/)
[warn] Credentials file /home/schillaci/.ivy2/.credentials does not exist
[info] Compiling 16 Scala sources and 2 Java sources to /home/schillaci/BIDMach/target/scala-2.11/classes...
[error] /home/schillaci/BIDMach/src/main/scala/BIDMach/models/DNN.scala:568: value blockGemm is not a member of BIDMat.Mat
[error] inputs(0).data.blockGemm(1, 0, nr, nc, reps,
[error] ^
[error] /home/schillaci/BIDMach/src/main/scala/BIDMach/models/DNN.scala:580: value blockGemm is not a member of BIDMat.Mat
[error] inputs(1).data.blockGemm(0, 1, nrows, nc, reps,
[error] ^
[error] /home/schillaci/BIDMach/src/main/scala/BIDMach/models/DNN.scala:585: value blockGemm is not a member of BIDMat.Mat
[error] inputs(0).data.blockGemm(0, 0, nrows, nr, reps,
[error] ^
[error] three errors found
[error] (compile:compile) Compilation failed
[error] Total time: 12 s, completed Apr 9, 2015 12:20:39 PM
I tried copying the lib files from the latest executable bundles, but it didn't help. Any idea what's up?