
raphael-group / hotnet2


HotNet2 is an algorithm for finding significantly altered subnetworks in a large gene interaction network

License: Other

Python 62.78% C 0.27% Fortran 2.86% HTML 32.01% Shell 2.07%

hotnet2's Issues

makeHeatFile.py scores -hf data/heats/pan12.gene2mutsig.txt or mutsig flag?

Hi @matthewreyna, I was running a test today to make sure that as many functionalities of HotNet2 as possible are exposed. Quick question: the filename here indicates MutSig scores, yet the scores subcommand of makeHeatFile.py is used (see the lines below).

python $hotnet2/makeHeatFile.py \
scores \
-hf data/heats/pan12.gene2mutsig.txt \

Can you verify whether this is correct?

Referring to the usage info of makeHeatFile.py, each file should be run with the appropriate arguments, and I wanted to make sure everything is correct in this bit.

makeHeatFile.py \
mutsig --mutsig_score_file \
MUTSIG_SCORE_FILE
makeHeatFile.py \
mutation --snv_file \
SNV_FILE
makeHeatFile.py \
scores --heat_file \
HEAT_FILE
makeHeatFile.py \
music --music_score_file \
MUSIC_SCORE_FILE

Request - Technical Documentation

I'd like to use the HotNet2 algorithm/pipeline inside a Jupyter Notebook and be able to embed it in other Python scripts.

I have a couple questions which could be addressed with a bit of technical documentation:

  • Is it possible for me to bring my own NetworkX (di)graph?
  • If so, what schema would its node and edge data dictionaries need to follow to run the algorithm?
  • What functions run the algorithm?
  • Where do the results go?

If you have answers, I'd be happy to write it in Sphinx format and submit a PR :) I enjoyed your paper and thanks for making this code open source.
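
Until such documentation exists, one possible workaround is to export the graph to the file-based inputs that makeNetworkFiles.py takes through its -e (edge list) and -i (gene index) options. The sketch below is only an illustration: the function name graph_to_hotnet2_files is hypothetical, and the line layouts ("<index> <gene>" and "<index_a> <index_b>") are assumptions that should be checked against the files shipped under data/networks.

```python
def graph_to_hotnet2_files(edges):
    """Return (index_lines, edge_lines) for an iterable of (gene_a, gene_b) pairs.

    Assumed layout (not confirmed by the HotNet2 documentation):
      gene index file: "<index> <gene>" per line, indices starting at 1
      edge list file:  "<index_a> <index_b>" per line, one undirected edge each
    """
    # Drop self-loops, which carry no interaction information.
    edges = [(a, b) for a, b in edges if a != b]
    genes = sorted({gene for edge in edges for gene in edge})
    index = {gene: i for i, gene in enumerate(genes, start=1)}
    index_lines = ["%d %s" % (index[gene], gene) for gene in genes]
    # Treat edges as undirected and remove duplicates such as (a, b) and (b, a).
    pairs = sorted({tuple(sorted((index[a], index[b]))) for a, b in edges})
    edge_lines = ["%d %d" % pair for pair in pairs]
    return index_lines, edge_lines


# With NetworkX, the call would be graph_to_hotnet2_files(G.edges()).
index_lines, edge_lines = graph_to_hotnet2_files(
    [("TP53", "MDM2"), ("MDM2", "TP53"), ("TP53", "ATM")]
)
```

The two returned lists can be written to disk with one line per entry. Node and edge attribute dictionaries are ignored here because nothing in this thread says the pipeline reads them.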

Gene & Module Selection

A few more questions if you don't mind:

  1. Do you have any recommendations for gene choice? Should I include all genes or a subset that are significant by some threshold?
  2. How many genes per sample? Does it matter?
  3. How many samples? Is there a lower or upper bound?
  4. How is the heat map constructed? What does each cell represent and how is the clustering done?
  5. What is the origin of the included modules?
  6. What should custom modules look like? (# of modules, number of genes covered, size of modules, connectivity etc.)

Happy if you refer me to relevant material. I may have missed these items. I've got lots of relevant input and am trying to follow best practices.

Daryl

HINT & MULTINET influence matrices

Thanks for providing the permutations for these additional protein interaction datasets mentioned in the Nature Genetics paper. However, in each downloaded dataset I don't see the "dataset"_edge_list file representing the true (not permuted) protein interactions within the network, nor do I see it in the main package. Are these files available for download somewhere else?

Any comments on a similar method - MashUp?

Hi Raphael group,

Just found out about your method from a talk at CSHL, and it reminded me of another method (published 1 year after yours). Their paper actually didn't seem to mention yours at all, despite the similarities.

Just wondering, if you guys have heard about this, and if so, any comments about differences between HotNet2 and MashUp?

Can be anything: focus, concepts, technicalities, etc.

updating release tags

The latest tagged release is from 2015. It'd be nice to keep the release tags up to date so that we can avoid working with intermediate development states and know when to update our installations. I've just taken a user request to install this on our cluster, and having up-to-date tags would help me out.

Thanks for your consideration!

Consensus visualization only shows interactions from single network

The consensus visualization only shows interactions from a single network, which is neither clear to the user nor the intended behavior for the visualization.

From HotNet2.py and hotnet/viz.py, it looks like we need either a generate_viz_json function in hotnet/viz.py that handles inputs from multiple networks, or another function that merges multiple outputs from generate_viz_json.

Error executing HotNet2

Good morning,
I tried executing HotNet2 on a Mac with Python 3.6.5 (command: python -V). As recommended on the README.txt file (present in the project’s root), I used virtualenv (v. 16.2.0, installed with the command: pip install virtualenv).

I used the “paper_commands.sh” file to see which commands to run in the terminal. I started from this one:
> python ../makeNetworkFiles.py -e  data/networks/hint+hi2012/hint+hi2012_edge_list -i  data/networks/hint+hi2012/hint+hi2012_index_gene -nn hint+hi2012 -p  hint+hi2012 -b  0.4 -o  data/networks/hint+hi2012 -np 100 -c  1

Executing it, it gives me this error:
File "makeNetworkFiles.py", line 60
if not args.only_permutations:
^
SyntaxError: invalid syntax

The error seems to be related to the -op (only permutations) parameter, which is not set in the command. So I ran some tests adding the file “data/heats/pan12.gene2freq.txt”, which is included in the project, but it still does not work (the command becomes python ../makeNetworkFiles.py -e  data/networks/hint+hi2012/hint+hi2012_edge_list -i  data/networks/hint+hi2012/hint+hi2012_index_gene -nn hint+hi2012 -p  hint+hi2012 -b  0.4 -o  data/networks/hint+hi2012 -np 100 -c  1 -op data/heats/pan12.gene2freq.txt).

Later, I moved on to the "example" folder. Following the README.txt file in that folder, I tried to run the command:
> python makeRequiredPPRFiles.py @example/configs/influence_matrix.config

When I ran it, Python could not find the file “makeRequiredPPRFiles.py” and gave me this error:

python: can't open file 'makeRequiredPPRFiles.py': [Errno 2] No such file or directory

Do you have any suggestions?

Best regards

Alessandro LUMACA

Preparing the permutation files for hotnet2

Hi,
I am having some difficulties preparing the permutation files.
First, I don't want to generate the permutation files myself; I would rather use the 1000 files you provided. I generated the influence matrix with createPPRMat.py from the old version of hotnet2. However, I noticed that the -pnp argument of hotnet2.py expects an h5 file rather than a directory containing 1000 files. I do not know how to create this file if I only want to use the permutation files you provided on this website.
Second, I tried to create all of the permutation files myself with makeNetworkFiles.py, following the example in paper/paper_commands.sh, but an error occurred:
Traceback (most recent call last):
  File "/home/ruibinxi_pkuhpc/lustre1/software/hotnet2-master/makeNetworkFiles.py", line 92, in <module>
    run(get_parser().parse_args(sys.argv[1:]))
  File "/home/ruibinxi_pkuhpc/lustre1/software/hotnet2-master/makeNetworkFiles.py", line 65, in run
    save_diffusion_to_file( HOTNET2, args.beta, args.gene_index_file, args.edgelist_file, pprfile, params=params)
  File "/lustre1/ruibinxi_pkuhpc/software/hotnet2-master/hotnet2/network.py", line 81, in save_diffusion_to_file
    hnio.save_hdf5(output_file, output)
  File "/lustre1/ruibinxi_pkuhpc/software/hotnet2-master/hotnet2/hnio.py", line 420, in save_hdf5
    f = h5py.File(file_path, 'a')
  File "/lustre1/ruibinxi_pkuhpc/jzj/anaconda2/lib/python2.7/site-packages/h5py/_hl/files.py", line 271, in __init__
    fid = make_fid(name, mode, userblock_size, fapl, swmr=swmr)
  File "/lustre1/ruibinxi_pkuhpc/jzj/anaconda2/lib/python2.7/site-packages/h5py/_hl/files.py", line 115, in make_fid
    fid = h5f.create(name, h5f.ACC_EXCL, fapl=fapl, fcpl=fcpl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/h5py_1490028130695/work/h5py/_objects.c:2846)
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/h5py_1490028130695/work/h5py/_objects.c:2804)
  File "h5py/h5f.pyx", line 98, in h5py.h5f.create (/home/ilan/minonda/conda-bld/h5py_1490028130695/work/h5py/h5f.c:2290)
IOError: Unable to create file (Unable to open file: name = '/home/ruibinxi_pkuhpc/lustre1/software/hotnet2-master/paper/data/networks/hint+hi2012/hint+hi2012_ppr_0.4.h5', errno = 17, error message = 'file exists', flags = 15, o_flags = c2)
However, there was no hint+hi2012_ppr_0.4.h5 in the output directory at the beginning; this file is created by makeNetworkFiles.py itself. So I am confused: the script stopped because a file that it had created itself already existed. Could you give me some advice? (Ideally my first problem can be solved so that I can skip the permutation generation step.)

Thanks a lot!
Yang

Visualization color scale

Hi,
The visualization figures can be hard to interpret since they're colored according to the local max and min gene score in each figure. Can the user input max and min scores so that all the figures are colored with relation to the same scale?
Thanks,
Priya
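
One way to picture the request is a single (min, max) pair computed over all figures and then used to place every gene score on the same 0 to 1 color axis. The sketch below is a hypothetical illustration, not part of the HotNet2 visualization code; shared_color_scale and normalize are made-up names, and each figure is assumed to be a plain gene-to-score dictionary.

```python
def shared_color_scale(figures):
    """Return (lo, hi) over every gene score in every figure."""
    scores = [score for figure in figures for score in figure.values()]
    return min(scores), max(scores)


def normalize(score, lo, hi):
    """Map a score onto [0, 1] using the shared range."""
    if hi == lo:
        return 0.5  # degenerate range: put everything mid-scale
    return (score - lo) / float(hi - lo)


# Two hypothetical figures with different local ranges.
figures = [{"A": 2.0, "B": 4.0}, {"C": 10.0, "D": 6.0}]
lo, hi = shared_color_scale(figures)
positions = [
    {gene: normalize(score, lo, hi) for gene, score in figure.items()}
    for figure in figures
]
```

A user-supplied min and max would simply replace the computed lo and hi, so that gene B in the first figure and gene D in the second are colored relative to the same endpoints.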

--alpha Arguments written in hprd.config and irefindex.config

Since the last update (June 12), the --alpha argument required by makeRequiredPPRFiles.py appears to have been changed to --beta,
but hprd.config and irefindex.config still use --alpha.

hprd.config
--edgelist_file influence_matrices/hprd/hprd_edge_list
--gene_index_file influence_matrices/hprd/hprd_index_genes
--prefix hprd
--alpha 0.60
--output_dir influence_matrices/hprd
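
As a stopgap, the flag could be renamed in a local copy of each config. The helper below is a hypothetical sketch (rename_alpha_to_beta is not part of HotNet2) and it only renames the flag. Whether the numeric value carries over unchanged or needs to be converted is not stated in this issue and would have to be confirmed by the maintainers.

```python
def rename_alpha_to_beta(config_text):
    """Rename a leading --alpha flag to --beta on each config line.

    Only the flag name changes; the value is copied as-is, which is an
    assumption that the two parameters share the same meaning.
    """
    lines = []
    for line in config_text.splitlines():
        parts = line.split(None, 1)
        if parts and parts[0] == "--alpha":
            parts[0] = "--beta"
            line = " ".join(parts)
        lines.append(line)
    return "\n".join(lines)


hprd_config = "\n".join([
    "--edgelist_file influence_matrices/hprd/hprd_edge_list",
    "--gene_index_file influence_matrices/hprd/hprd_index_genes",
    "--prefix hprd",
    "--alpha 0.60",
    "--output_dir influence_matrices/hprd",
])
updated = rename_alpha_to_beta(hprd_config)
```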

burden test

Nice paper.

Is it possible to fudge the analysis so that it runs on generic gene-level p/q-values? I've got lots of germline burden testing results that I'd like to test.

Daryl
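
One possible route is to turn the p/q-values into per-gene scores and feed them in through the scores mode of makeHeatFile.py mentioned in an earlier issue. The sketch below uses a negative log10 transform with an optional significance cutoff; that transform, the function name qvalues_to_heat, and the "gene score" line layout are all assumptions made for illustration, not a confirmed HotNet2 recipe.

```python
import math


def qvalues_to_heat(qvalues, threshold=1.0, floor=1e-300):
    """Convert gene-level q-values (or p-values) into non-negative scores.

    Genes above the threshold are dropped; the floor avoids log10(0).
    """
    heat = {}
    for gene, q in qvalues.items():
        if q <= threshold:
            # Adding 0.0 turns a possible -0.0 (when q == 1) into 0.0.
            heat[gene] = -math.log10(max(q, floor)) + 0.0
    return heat


# Hypothetical burden-test results.
heat = qvalues_to_heat({"BRCA2": 0.001, "TTN": 0.5, "ATM": 0.01}, threshold=0.1)
lines = ["%s %g" % (gene, score) for gene, score in sorted(heat.items())]
```

The resulting lines could be written to a text file and passed with -hf, provided the heat file format matches that assumption.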

Tests Failing

I ran the test script checkHotNet2Consensus.sh and found that the computed results do not match the reference. How do I diagnose this issue?

This is the output from the test script:

3,7c3,6
< 0     [ay, cg, ct, fl, fw, go, hh, hs, iw, kk, kp, kr, v]
< 1     [ar, bp, by, cq, cw, dv, ep, gn, hz, io, jx, kl, s] cv, hq, hy
< 2     [cl, dl, dy, ek, et, fx, fy, gi] dx
< 3     [av, ee, is, kn] gh, iz
< 4     [cm, eg, ii, jm]
\ No newline at end of file
---
> 0     [ay, cg, cm, ct, eg, fl, fw, go, hh, hs, ii, iw, jm, kk, kp, kr, v] gy
> 1     [cl, dl, dy, ek, et, fx, fy, gi] dx
> 2     [av, ee, is, kn] gh, iz
> 3     [ar, bp, by, cq, cw, dv, ep, gn, hz, io, jx, kl, s] cv, hq, hy
\ No newline at end of file

My only guess is that this may be caused by differing versions of Python packages. Is that possible? I tried to replicate the versions listed in the README, but conda reported compatibility issues between h5py 2.4.0 and NumPy 1.6.2/SciPy 0.10.1, so it wouldn't let me install the exact versions listed (NetworkX 1.7 may have also conflicted with these versions of NumPy/SciPy; I don't remember exactly). This is my current Python environment:

Python 2.7.13
NumPy 1.9.3
SciPy 0.17.0
NetworkX 1.7
h5py 2.4.0

Thanks,
Liron Ganel
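
The diff above mixes two effects: most components are identical but listed in a different order, while components 0 and 4 on one side appear merged into a single component (plus gy) on the other. Comparing the two files as unordered sets of gene sets separates these cases. The sketch below is a hypothetical helper, assuming each result line starts with an integer index followed by gene names, and that brackets and commas are only punctuation.

```python
import re


def parse_components(text):
    """Parse component lines into a set of frozensets of gene names."""
    components = set()
    for line in text.splitlines():
        tokens = re.findall(r"[A-Za-z0-9_.-]+", line)
        # Keep only lines of the form "<index> gene gene ...".
        if len(tokens) > 1 and tokens[0].isdigit():
            components.add(frozenset(tokens[1:]))
    return components


# Hypothetical file contents (not diff output): same components, different order.
computed = "0\t[ay, cg] gy\n1\t[cm, eg]\n"
reference = "0\t[cm, eg]\n1\t[ay, cg] gy\n"
same = parse_components(computed) == parse_components(reference)

# A merged component shows up as a genuine set difference.
merged = "0\t[ay, cg, cm, eg] gy\n"
only_in_computed = parse_components(computed) - parse_components(merged)
```

If the sets match, the mismatch is only an ordering difference; anything left in the set difference points at components that really changed.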

Permuted networks were not put into subdirectories

  1. The configs in examples are outdated; they may have been written for v1.1.

  2. After running makeNetworkFiles.py, the permuted networks were not put into subdirectories named 1, 2, 3, ... As a result, the following run of HotNet2.py fails.

Running HotNet2 over a custom network

I would like to report an issue (or maybe my lack of understanding) with this line:

min_score = min([score for score in heat.values() if score > 0])

This line is where my execution of HotNet2 halts. A further check revealed that none of the values in "heat.values()" is larger than 0. As a result, the list is empty and the "min" function raises an exception.

If all values are zero (which is the case for me), it means that none of the genes in that specific subnetwork have heat. So I wonder how this subnetwork became a significant hit. That's where my confusion starts.

Would you please help figure out what I am missing?

Thanks!
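
For reference, the failure can be avoided by checking for an empty list before calling min. The sketch below is a hypothetical guard, not the maintainers' fix; min_positive_heat and its default argument are made-up names, and it does not explain why a subnetwork with no heat was reported in the first place.

```python
def min_positive_heat(heat, default=None):
    """Return the smallest positive score in heat, or default if there is none."""
    positive = [score for score in heat.values() if score > 0]
    return min(positive) if positive else default


all_zero = min_positive_heat({"geneA": 0, "geneB": 0})
some_heat = min_positive_heat({"geneA": 0, "geneB": 2.5, "geneC": 1.0})
```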

Incorrect Edges Lead to Disconnected Components

Hello,

I'm trying to run HotNet2 on the BioPlex network (located at http://bioplex.hms.harvard.edu/data/BioPlex_interactionList_v4a.tsv). After running the program I noticed a component containing a pair of nodes that was disconnected from the others. Even with the minimum component size threshold set higher than two, the pair was present in the data.

I've been trying to debug the program and I've traced the issue as best as I could to narrow it down. Here's what I've found:

  1. The two nodes are not connected to any of the other nodes that are output as being in the same component in the original network. This appears to be true in the network as read in from the hdf5 file as well as the original text file.
  2. The two nodes are connected to one other node in the network internally. This appears to happen after the weighted graph is constructed in the run_helper function.
  3. When the visualization is output, the program looks at the original network (read through the hdf5 file) and not the weighted graph constructed by the similarity matrix, so that pair of nodes is disconnected.

I have a hard time understanding how the similarity_matrix() and weighted_graph() functions are supposed to work, and my Python skills aren't great, so I've hit a wall in debugging this. Overall though, it looks like there are some errors in how the network is represented internally in the program.

Please let me know if I can be of more help here. I know I haven't provided many specifics, but I haven't figured out how to narrow the test case down to a small example that's trivially reproduced.
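
A small check like the one below may help narrow the test case: given the genes of one reported component and the original edge list, it splits the component into the pieces that are actually connected in the original network. It is a self-contained sketch with hypothetical names (induced_components) and toy data, and it does not depend on any HotNet2 internals.

```python
from collections import deque


def induced_components(nodes, edges):
    """Connected components of the subgraph induced by nodes, via BFS."""
    nodes = set(nodes)
    adjacency = {node: set() for node in nodes}
    for a, b in edges:
        # Ignore edges that leave the reported component.
        if a in nodes and b in nodes:
            adjacency[a].add(b)
            adjacency[b].add(a)
    seen, components = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), {start}
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node] - seen:
                seen.add(neighbor)
                component.add(neighbor)
                queue.append(neighbor)
        components.append(component)
    return components


# Toy data: D and E are reported with A, B, C but only link to X outside.
reported = ["A", "B", "C", "D", "E"]
original_edges = [("A", "B"), ("B", "C"), ("D", "E"), ("D", "X")]
pieces = induced_components(reported, original_edges)
```

More than one piece means the reported component is not connected in the original network, which matches the symptom described in point 1.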

Error in findThreshold mutations

I get a complaint about a local variable referenced before assignment when I run findThreshold mutations with the following input:

findThreshold.py mutations -r MAF0.1.CADD0.mutations.iref -if /home/storage/jpriest_backup/zoran/priest_apps/hotnet2-master/influence_matrices/irefindex/index_genes_uniform -hf ./MAF0.1.CADD0.heat -n 5 -c 16 -o ./MAF0.1.CADD0.iref.deltas.mutation -mf /home/storage/jpriest_backup/zoran/priest_apps/hotnet2-master/influence_matrices/irefindex/iref_ppr_0.55.mat -glf refflat.glf -gof refflat.gof -b 0.003 --bmr_file corr.bmr.MAF0.1.CADD0.combined.bmr

  • Performing permuted mutation data delta selection...
    Traceback (most recent call last):
    File "/home/storage/jpriest_backup/zoran/priest_apps/hotnet2-master/bin/findThreshold.py", line 161, in
    run(get_parser().parse_args(sys.argv[1:]))
    File "/home/storage/jpriest_backup/zoran/priest_apps/hotnet2-master/bin/findThreshold.py", line 102, in run
    deltas = get_deltas_for_mutations(args, infmat, infmat_index, heat_params)
    File "/home/storage/jpriest_backup/zoran/priest_apps/hotnet2-master/bin/findThreshold.py", line 147, in get_deltas_for_mutations
    heat_params["min_freq"], args.num_permutations, args.num_cores)
    File "/home/storage/jpriest_backup/zoran/priest_apps/hotnet2-master/hotnet2/permutations.py", line 117, in generate_mutation_permutation_heat
    genes = set([snv.gene for snv in snvs] + [cna.gene for cna in cnas])
    UnboundLocalError: local variable 'snvs' referenced before assignment
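
An UnboundLocalError of this kind usually means a local variable is assigned only inside a branch that did not run. The actual code in permutations.py is not shown in this issue, so the sketch below is purely hypothetical: collect_genes and its (gene, sample) record tuples are invented to illustrate the general pattern of initializing both lists before they are combined.

```python
def collect_genes(snv_records=None, cna_records=None):
    """Return the set of genes seen in either record list.

    Both lists get a default, so a missing input cannot leave a name unbound.
    Records are hypothetical (gene, sample) tuples.
    """
    snvs = list(snv_records) if snv_records is not None else []
    cnas = list(cna_records) if cna_records is not None else []
    return set([gene for gene, _ in snvs] + [gene for gene, _ in cnas])


# Only CNA records supplied: no error, and the SNV side is simply empty.
genes = collect_genes(cna_records=[("TP53", "sample1"), ("ATM", "sample2")])
```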
