cmu-phil / tetrad
Repository for the Tetrad Project, www.phil.cmu.edu/tetrad.
License: GNU General Public License v2.0
Dear Experts,
I’m trying to load 12 text files into the data box, but after I press “load all”, the loading log remains blank. After pressing save, I see all 12 tabs, but they all contain the same data (subject 1’s data set rather than each subject’s own data). I am using version 5.2.1-3 on a Mac. Has anyone else experienced this? I would greatly appreciate any advice.
Thanks,
Eleni
PerformanceTests is a collection of tests for a variety of algorithms, but the code is a bit of a jumble right now and hard to read, let alone use. Needs to be cleaned up.
Setting minimize to true for shade resulted in problems with loading the configuration in the Tetrad GUI. However, setting it to false increases the size of the ejar from 16G to 30+G. Need to find a way to split the difference. Maybe leave out specific jars from the build.
Attempting to build the project fails while running the tests, with the following error:
Tests in error:
test8(edu.cmu.tetrad.test.TestStandradizedSem): non symmetric matrix: the difference between entries at (1,2) and (2,1) is larger than 0
This results in a build failure.
Hi,
I am using "tetradcmd-5.1.0-10.jar" on Windows, with a batch file that runs it over multiple input and output files. I am searching for causal variables in my input data using the FCI algorithm.
How can I measure the program's runtime? I would like to measure the runtime of each search on each input data set, because some input data sets have very complex models.
Thank you,
Sanghoon
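One way to measure the runtime of each search is to wrap the command in a small Java launcher that records wall-clock time around the child process. This is only a sketch under stated assumptions, not a Tetrad feature: `RunTimer`, `timeMillis`, and `runCommand` are hypothetical names, and the command to time is whatever is passed as arguments.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Tetrad: times one external command.
final class RunTimer {

    /** Runs the task once and returns the elapsed wall-clock time in milliseconds. */
    static long timeMillis(Callable<Integer> task) {
        long start = System.nanoTime();  // monotonic clock, suited to measuring intervals
        try {
            task.call();
        } catch (Exception e) {
            throw new IllegalStateException("Timed task failed", e);
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    /** Starts an external command, forwards its output, and waits for it to exit. */
    static int runCommand(List<String> command) throws Exception {
        Process process = new ProcessBuilder(command).inheritIO().start();
        return process.waitFor();
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.err.println("Usage: java RunTimer <command> [arguments]");
            return;
        }
        // The whole command line to time is passed as arguments.
        final List<String> command = Arrays.asList(args);
        long millis = timeMillis(() -> runCommand(command));
        System.out.println("Elapsed: " + millis + " ms for " + command);
    }
}
```

The batch file would then call `java RunTimer java -jar tetradcmd-5.1.0-10.jar` followed by the usual arguments, once per input file. The measured time includes the startup of the child JVM, so very short searches will look slower than they are.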
See the TODO in the file.
Currently, the original GES algorithm and the newer FGS algorithm are both in the interface. One feature of FGS is that the user can assume that if X and Y are uncorrelated then X is not adjacent to Y in the graph. Assuming this kind of faithfulness speeds up the search considerably but is not always helpful. So the user should be able to choose whether to assume it or not. We need a switch to let the user decide.
FOFC isn't scaling with sample size.
For any kind of pure measurement model, FOFC is slow with large N, e.g., 10,000.
If imports starting with no.uib.cipr.matrix are commented out, the problem code is put in red in IntelliJ. Classes affected are IndTestHsic, KernelUtils, Ling, Lingam.
The compiled classes seem to run OK, but the relevant tests break when run in Maven.
I think we should support 64-bit Windows. If so, we need to translate this matrix algebra to a different library.
Joe
I'm a new user of Tetrad V. The program is very impressive, but its usability would benefit greatly from some attention to the manual (new_manual.pdf). To begin with, separate the material into sections and add a table of contents, all with hyperlinks.
Thanks very much.
Hi,
I was running TetradCmd on Windows with the PC algorithm, but I got an error message. Please see below.
java -jar tetradcmd-5.1.0-10.jar -data input.txt -datatype discrete -algorithm pc -depth -1 -significance 0.01
Exception in thread "main" java.lang.IllegalStateException: No algorithm was specified.
at edu.cmu.tetradapp.TetradCmd.runAlgorithmTetradCmd.java:508
at edu.cmu.tetradapp.TetradCmd.TetradCmd.java:80
at edu.cmu.tetradapp.TetradCmd.mainTetrad.java:945
When I ran FCI or other algorithms, I didn't get any error message; only the PC algorithm produces one. Even with the error message, I still got output from PC. When I compared the PC output with the other algorithms' outputs, they were different, so I think the PC algorithm is working and the error message doesn't seem critical. But I am not sure whether the PC algorithm is working correctly. Could you explain what the error message means and whether I can fix it?
Thank you,
Sanghoon
Doesn't seem to find all and only Y structures.
The idea here is that when you create several data sets in a DataWrapper and go to save them, you have to save them one at a time. It would be more useful if they could all be saved with one menu command. This would allow simulation facilities in Tetrad to produce data sets useful for other programs.
There was a task in Ant to do this; it's not part of the default Maven deploy. Can it be added?
When I generate 1000 data sets in the Data box, can I be sure that all 1000 are generated from different seed numbers? Is there any possibility that two data sets are generated from the same seed? This question assumes that I have many causal variables in my graph and a sample size large enough that more than 1000 distinct data sets are possible.
I tried generating 10 data sets with just 1 causal variable and 1 target, and a sample size of 2. Then, as one would anticipate, many of the data sets (7~8) were identical. So I was curious whether TETRAD is programmed to assign a different seed number to each data set when generating 1000 of them.
Sorry for asking so many questions these days, and thank you,
Sanghoon
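The small experiment described above can be reproduced outside Tetrad. The sketch below does not show how Tetrad seeds its simulations. It only illustrates that, with a distinct seed per data set, tiny data sets still collide: two binary variables and two rows allow at most 2^4 = 16 different data sets. The class name, the binary variables, and the probabilities are all assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

// Hypothetical illustration, not Tetrad code.
final class TinyDataSets {

    /** Simulates `rows` cases of a binary cause X and a binary target Y, encoded as text. */
    static String simulate(long seed, int rows) {
        Random random = new Random(seed);
        StringBuilder encoded = new StringBuilder();
        for (int i = 0; i < rows; i++) {
            int x = random.nextDouble() < 0.5 ? 1 : 0;   // P(X=1) = 0.5 (assumed)
            double pY = (x == 1) ? 0.8 : 0.2;            // P(Y=1 | X) (assumed)
            int y = random.nextDouble() < pY ? 1 : 0;
            encoded.append(x).append(y).append(';');
        }
        return encoded.toString();
    }

    /** Generates `dataSets` data sets, each from its own seed, and counts the distinct ones. */
    static int countDistinct(long baseSeed, int dataSets, int rows) {
        Random seedSource = new Random(baseSeed);        // supplies one seed per data set
        Set<String> seen = new HashSet<>();
        for (int i = 0; i < dataSets; i++) {
            seen.add(simulate(seedSource.nextLong(), rows));
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println("Distinct among 10 data sets of 2 rows: "
                + countDistinct(12345L, 10, 2));
        System.out.println("Distinct among 1000 data sets of 2 rows: "
                + countDistinct(12345L, 1000, 2));
    }
}
```

With more variables and a larger sample size the number of possible data sets grows quickly, so identical data sets become very unlikely as long as each one is drawn from a different point in the random stream.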
The unit tests in tetradapp aren't actually being run. They should go in tetrad/src/test.
Hi,
I just tested "lib-tetrad-5.3.0-20151113.150857-1-tetradcmd.jar" on Windows (I am using Windows Server 2008 R2 Enterprise), and it works there, too. The Wiki says that it should be run on a Unix-type machine, so I am confused. Am I misunderstanding something? Is it okay to use it on Windows, and may I expect the same performance?
Thank you,
Sanghoon
I'm not sure if it can be salvaged yet; there are now multiple paths to classes instead of just one.
The instructions at http://www.phil.cmu.edu/tetrad/current.html are out of date. It's not clear how ejars will be launched in the future; this needs to be worked out. The Linux instructions need to be updated, since Tetrad will run under Open JDK now.
Hi,
This question was asked in the Google group, but it was not answered.
In TETRAD, I used the template "Simulate data from IM" and instantiated 100 data sets in the IM box. I know that I can save the data sets to .txt files in the Data box, but I don't want to save 100 data set files manually.
Is there a command-line method to generate/instantiate data sets as I did in the IM box, and then save them? I expect that simulating and instantiating data sets from the command line would be very complicated, because it would be difficult to set the 'Graph type' (which variable is direct and which is the target) and the 'Parametric model' (the probabilities for each variable under every condition). Therefore, I was curious whether there is at least a command-line method to save the data sets automatically after I have instantiated 100 or 1000 of them in the TETRAD workspace. (I also suspect there may be no way to extract the instantiated data sets from TETRAD so that a command-line method could save them.)
Thank you,
Sanghoon
Most methods are self-documenting because of their names and signatures. Leaving those aside, each class should have a class-level doc, even if it's simple.
The edu.cmu.tetrad paths in the published version have two "tetrad"s in them. This needs to be adjusted in several places.
Make sure the algorithms have their proper names before they get too ensconced.
The old gadget for inserting copyright notices doesn't seem viable any longer with the conversion of the code to submodules. Try the Maven gadget instead.
http://codeoftheday.blogspot.com/2013/10/apache-maven-tips-addappend-copyright.html
I ran TETRAD cmd and got information on interacting nodes and the edge directions between them. I found this definition of edge direction in the manual of TETRAD v.4
But I couldn't find a good definition of this in the manual of TETRAD v.5. Also, personally, I don't understand the difference between undirected (---) and unoriented (o-o); they sound the same to me. And I think the symbol for 'bidirected' should be <->, rather than o->. Could you tell me where I can find a good and clear definition of the edge directions? I need to define them for my lab's manuscript.
Thank you,
Sanghoon
Hi,
I followed the introduction for command line tetrad here,
https://github.com/cmu-phil/tetrad/wiki/Command-Line-Tetrad
But it doesn't seem to work for me. I am using a Linux server, and I think my Java version is fine. Please see below for the Java version check, the 'tetrad.jar' run, and the error message. I attached my sample input file. Could you help me?
[user167@login0 8_TETRAD]$ java -version
java version "1.6.0_24"
OpenJDK Runtime Environment (IcedTea6 1.11.1) (rhel-1.45.1.11.1.el6-x86_64)
OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode)
[user167@login0 8_TETRAD]$ java -jar lib-tetrad-5.3.0-20151113.150857-1-tetracmd.jar -data Simulated_example.txt -datatype discrete -algorithm pc -depth -1 significance 0.01
Exception in thread "main" java.lang.UnsupportedClassVersionError: edu/cmu/tetrad/cmd/TetradCmd : Unsupported major.minor version 51.0
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:634)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:277)
at java.net.URLClassLoader.access$000(URLClassLoader.java:73)
at java.net.URLClassLoader$1.run(URLClassLoader.java:212)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
Could not find the main class: edu.cmu.tetrad.cmd.TetradCmd. Program will exit.
[user167@login0 8_TETRAD]$
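The message "Unsupported major.minor version 51.0" means the class was compiled for Java 7 (class-file major version 51), while the Java 6 runtime shown above accepts class files only up to major version 50. The target version of any class can be checked from the first eight bytes of the class file, as in the sketch below. `ClassVersion` and its methods are hypothetical names, and the class file is assumed to have been extracted from the jar first.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of Tetrad: inspects a class-file header.
final class ClassVersion {

    /** Returns the major version stored in the first 8 bytes of a class file. */
    static int majorVersion(byte[] header) {
        ByteBuffer buffer = ByteBuffer.wrap(header);  // big-endian by default, like class files
        if (buffer.getInt() != 0xCAFEBABE) {          // magic number of every class file
            throw new IllegalArgumentException("Not a class file");
        }
        buffer.getShort();                            // minor version, not needed here
        return buffer.getShort() & 0xFFFF;            // major version as an unsigned value
    }

    /** Maps a major version to the Java release that introduced it, e.g. 51 -> 7. */
    static int javaRelease(int major) {
        return major - 44;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: java ClassVersion <file.class>");
            return;
        }
        byte[] header = new byte[8];
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            int filled = 0;
            while (filled < header.length) {          // read() may return fewer bytes than asked
                int read = in.read(header, filled, header.length - filled);
                if (read < 0) break;
                filled += read;
            }
        }
        int major = majorVersion(header);
        System.out.println("major version " + major + " needs Java " + javaRelease(major) + " or later");
    }
}
```

In this case the practical fix is to run the jar with a Java 7 or later runtime rather than the Java 1.6.0_24 installation reported by `java -version`.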
This is a wish-list item, maybe doable. Currently we are using the following matrix libraries:
<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-math3</artifactId>
    <version>3.5</version>
</dependency>
<dependency>
    <groupId>colt</groupId>
    <artifactId>colt</artifactId>
    <version>1.2.0</version>
</dependency>
<dependency>
    <groupId>gov.nist.math</groupId>
    <artifactId>jama</artifactId>
    <version>1.0.2</version>
</dependency>
<dependency>
    <groupId>com.googlecode.matrix-toolkits-java</groupId>
    <artifactId>mtj</artifactId>
    <version>1.0.1</version>
</dependency>
Much of this is overlapping functionality. We have a class, TetradMatrix, that wraps the Apache matrix library. Can we remove some of the other matrix libraries and use TetradMatrix instead?
Here’s what I will add today. I will make this into a multi-module Maven project so that the tetradlib part of the project is a module that can be reused. (A description of this is here: https://books.sonatype.com/mvnex-book/reference/multimodule.html)
native_smooth_clean_master_graph.txt
The problem is the cycle checker. It checks for each node, depth first, whether there is a path from that node to itself. The question is whether there's a better way. Perhaps breadth first?
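One breadth-first style alternative is Kahn's algorithm: repeatedly remove nodes that have no incoming directed edges, and report a cycle if some nodes can never be removed. It visits each node and edge once instead of starting a separate path search from every node. The sketch below is not the current Tetrad cycle checker. It assumes the graph is available as a plain map from each node name to its children, and it looks only at directed edges. `CycleCheck` and `hasDirectedCycle` are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not Tetrad code: cycle detection by Kahn's algorithm.
final class CycleCheck {

    /** Returns true if the directed graph (node -> list of children) contains a directed cycle. */
    static boolean hasDirectedCycle(Map<String, List<String>> children) {
        // Count incoming edges for every node that appears as a parent or a child.
        Map<String, Integer> inDegree = new HashMap<>();
        for (Map.Entry<String, List<String>> entry : children.entrySet()) {
            inDegree.putIfAbsent(entry.getKey(), 0);
            for (String child : entry.getValue()) {
                inDegree.merge(child, 1, Integer::sum);
            }
        }

        // Start from the nodes that have no parents.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : inDegree.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey());
            }
        }

        // Remove ready nodes one by one; a child becomes ready once all its parents are gone.
        int removed = 0;
        while (!ready.isEmpty()) {
            String node = ready.remove();
            removed++;
            for (String child : children.getOrDefault(node, Collections.<String>emptyList())) {
                if (inDegree.merge(child, -1, Integer::sum) == 0) {
                    ready.add(child);
                }
            }
        }

        // Nodes that were never removed all lie on or downstream of a cycle.
        return removed < inDegree.size();
    }
}
```

Because the loop is iterative, it also avoids deep recursion on large graphs. Whether it helps in practice depends on how often the check is called during a search, since a check after every single edge addition may call for an incremental method instead.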
CCD does not pass its tests. The tests are good. It should pass all of them, and the tests should be commented back in.
It's not clear whether CCDGES should work or not. Probably it should be moved to a child repository unless proven correct.
See Issue #60.
Hi,
I was using "tetradcmd-5.1.0-10.jar" in a Windows batch file. I would like to check whether the results of the pc or fci search algorithm differ between "tetradcmd-5.1.0-10" and "lib-tetrad-5.3.0-20151113.150857-1-tetradcmd.jar" in terms of graph edges. When I ran both versions of tetrad cmd, I got 32 edges from 'tetradcmd-5.1.0-10.jar', but just 27 edges from 'lib-tetrad-5.3.0-20151113.150857-1-tetradcmd.jar'. Also, I found that some edge directions and interacting nodes differ between the two results. For example,
SNP_A-2127756_3 --- SNP_A-1839049_2 vs. SNP_A-2127756_3 <->SNP_A-1999524_1.
Do you recommend using the latest version of TETRAD cmd for accurate(?) search results?
Thank you,
Sanghoon
Several tests in TestGeneralizedSem (and maybe some other classes) depend on a random seed and sometimes fail. Need to fix a random seed for which they do not fail.
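The general idea of pinning a seed can be sketched with plain `java.util.Random`. This is an illustration only: it does not use Tetrad's own random-number utilities, and `SeededDraws`, `gaussianDraws`, and the seed value are hypothetical. A generator constructed with a fixed seed produces the same sequence on every run, so a test built on it either always passes or always fails.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical illustration, not Tetrad code.
final class SeededDraws {

    /** Returns n standard-normal draws from a generator started at the given seed. */
    static double[] gaussianDraws(long seed, int n) {
        Random random = new Random(seed);   // fixed seed => reproducible sequence
        double[] draws = new double[n];
        for (int i = 0; i < n; i++) {
            draws[i] = random.nextGaussian();
        }
        return draws;
    }

    public static void main(String[] args) {
        long seed = 20151113L;              // arbitrary value chosen for this example
        double[] first = gaussianDraws(seed, 5);
        double[] second = gaussianDraws(seed, 5);
        System.out.println("Repeatable: " + Arrays.equals(first, second));
    }
}
```

Pinning a seed makes the tests stable, but a test that passes only for some seeds may still point to a tolerance that is too tight, which is worth checking separately.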
There were several versions of each; only one is needed.
I have a data set of 20,000 records. My Bayes parametric model contains latent variables, which is why I used the EM Bayes Estimator to estimate the parameters of the model. The problem is that the running time is very long: I waited a few hours before I stopped the learning process. I found other software (GeNIe, https://dslpitt.org/genie/) that can estimate the parameters of a model from given data, and with it I was able to estimate the parameters of my model in a shorter time. I manually entered the parameter values in the "Instantiated model" component; however, I could not find any functionality to estimate the model fit (P-value) so that I can tell how good my model is. Could you please tell me whether this type of functionality exists in Tetrad?
Got rid of all but one gradient search for PAL.
The docs directory contains, for instance, several papers for algorithms. We should add known papers for algorithms that aren't already in the docs directory.