
gistic2's Introduction

GISTIC2

This repository contains the Matlab source code for GISTIC (Genomic Identification of Significant Targets In Cancer), version 2.0.

Check out the GISTIC GitHub Page for links to GISTIC downloads, user documentation, and publications.

Cloning the gistic2 project

This repository contains a submodule of Matlab functions for processing copy number data, named 'snputil'. To ensure that the snputil subdirectory is populated with files, clone this repository using the --recursive option:

git clone --recursive https://github.com/broadinstitute/gistic2.git
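If the repository was cloned without --recursive, git leaves the submodule directory empty until `git submodule update --init --recursive` is run. A minimal sketch for checking whether snputil has been populated follows; the function name `submodule_populated` is hypothetical and not part of GISTIC.

```python
from pathlib import Path


def submodule_populated(repo_root, name="snputil"):
    """Return True if the submodule directory exists and has at least one entry.

    Hypothetical helper: it only checks for a non-empty directory and does
    not verify that the checked-out commit matches the one recorded by git.
    """
    path = Path(repo_root) / name
    return path.is_dir() and any(path.iterdir())


if __name__ == "__main__":
    # Run from the directory that contains the gistic2 clone.
    print(submodule_populated("gistic2"))
```

A result of False suggests the submodule still needs to be initialized before the Matlab code can find its utility functions.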

Repository directory structure

docs - HTML source for links and user documentation published in the GISTIC GitHub page.

refgenes - source code for creating reference genome input files (offered on an 'as is' basis).

Gencode.v22.170324 - download EMBL files and build a Gencode reference genome.

hg38.UCSC.add_mir.160920 - download UCSC files and build an hg38 reference genome.

snputil - git submodule of utility Matlab functions for analyzing copy number data.

@SegArray - defines the SegArray class, a data compression scheme for segmented copy number.

source - Matlab source code for GISTIC.

support - additional files needed to build a GISTIC tarball.

user_docs - HTML source for standalone documentation in the current GISTIC tarball (v2.0.23).


gistic2's Issues

Index exceeds matrix dimensions.

Hello, I encountered an error while using online analysis. May I know how to resolve it?
Index exceeds matrix dimensions.

Error in normalize_by_arm_length (line 85)

Error in make_sample_B (line 50)

Error in perform_deconstruction (line 68)

Error in perform_ziggurat_deconstruction (line 126)

Error in run_focal_gistic (line 151)

Error in run_gistic20 (line 124)

Error in run_gistic2_from_seg (line 249)

Error in gp_gistic2_from_seg (line 97)

MATLAB:badsubscript

How to create customized refgene mat file

Could you provide the code used to create the refgene.mat file?
/xchip/gistic/variables/20160920_UCSC_dump/hg38/make_hg38_rg_160920.m
So I can create a gene.mat file for other organisms.
Thank you

BLAS loading error

I have installed the latest version of MATLAB, but encountered the following error during runtime.

BLAS loading error:
refblas.so: cannot open shared object file: No such file or directory

Error in SegArray/nnz (line 13)

Error in SegArray/subsasgn (line 72)

Error in smooth_cbs (line 35)

Error in clean_gistic_input (line 77)

Error in run_gistic2_from_seg (line 238)

Error in gp_gistic2_from_seg (line 97)

MATLAB:binder:loadFailure

How to get the table_amp.conf_*.txt file?

I found that in GDC, all the GISTIC2 outputs include table_amp.conf_99.txt and table_del.conf_99.txt. Which parameter do I need to set to get these files? Also, how can I get the significant peak within one region when a wide region contains two peaks?

question about calling CNV with tumor and normal-pair sample

Dear developer

I want to call CNVs from my tumor sample (array data) together with the paired normal sample I have.

When I ran the MoChA pipeline, some hg19/hg39-related files were needed.

Should I use files derived from the normal sample as input instead of those hg19/hg39-related files? Could you tell me which files I should change?

Thank you!

No usage statement in the documentation

Hello,

I am looking at the official documentation for the tool. Although there is text describing the expected inputs and outputs, there is no usage statement. In general, there is a lack of information and guidance on how users are meant to execute the tool.

Error with the MCR environment

hi, there

I downloaded the latest version of the source code (tar.gz) and uncompressed it.

Then I ran the run_gistic_example file and got the following error messages.

--- creating output directory ---
--- running GISTIC ---
Setting Matlab MCR root to /data_6t/lizhan/02.software/gistic_test/gistic2-2.0.23/support/MATLAB_Compiler_Runtime
Error:Could not find version 7.14 of the MCR.
Attempting to load libmwmclmcrrt.so.7.14.
Please install the correct version of the MCR.
Contact your vendor if you do not have an installer for the MCR.
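One way to narrow this down is to check whether the library named in the message exists anywhere under the MCR root that the script reports. The sketch below is a generic file search, not part of GISTIC or the MCR; the function name `find_library` is hypothetical, and the directory to search is whatever path your own log prints after "Setting Matlab MCR root to".

```python
import os


def find_library(filename, search_dirs):
    """Return the full path of every file named `filename` under `search_dirs`.

    Hypothetical helper: it walks each directory recursively and does not
    consult LD_LIBRARY_PATH or the dynamic loader's cache.
    """
    hits = []
    for root_dir in search_dirs:
        for dirpath, _dirnames, filenames in os.walk(root_dir):
            if filename in filenames:
                hits.append(os.path.join(dirpath, filename))
    return hits


if __name__ == "__main__":
    # Replace the directory with the MCR root shown in your own log.
    print(find_library("libmwmclmcrrt.so.7.14", ["support/MATLAB_Compiler_Runtime"]))
```

An empty list would suggest that the directory does not contain the MCR version the executable is asking for.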

gistic- overlapp issue

Dear GISTIC users.
I'm a beginner with GISTIC, and I have run into an overlap issue during CNV analysis using GISTIC online. My file is a CNV file for colon cancer from TCGA.

These are a few lines of the error:
Warning: Shortened 60 segments in '/opt/gpcloud/gp_home/users/Hanico88/uploads/tmp/run4785456974432728835.tmp/seg.file/1/111.txt' that overlap by one marker.
> In make_D_from_segseq_data at 122
In run_gistic2_from_seg at 193
In gp_gistic2_from_seg at 97

Would you please help me solve this issue?
My segmentation file is attached below.
seg file.zip
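The message above is reported as a warning and says that GISTIC shortened the overlapping segments itself. To see which segments are affected, a segmentation file can be scanned for neighbouring segments that share a position. The sketch below is not GISTIC code; it assumes a tab-delimited file with one header row and the columns sample, chromosome, start, end, number of markers and segment mean, in that order. The function name `find_overlaps` is hypothetical.

```python
import csv
import io
from collections import defaultdict


def find_overlaps(seg_text):
    """Return (sample, chrom, prev_end, next_start) for adjacent segments that overlap.

    Assumes tab-delimited text with a header row and the columns
    sample, chromosome, start, end, ... in that order.
    """
    reader = csv.reader(io.StringIO(seg_text), delimiter="\t")
    next(reader)  # skip the header row (assumed to be present)
    groups = defaultdict(list)
    for row in reader:
        if not row:
            continue
        sample, chrom = row[0], row[1]
        start, end = int(float(row[2])), int(float(row[3]))
        groups[(sample, chrom)].append((start, end))

    overlaps = []
    for (sample, chrom), segs in groups.items():
        segs.sort()
        # Compare each segment with the next one on the same sample/chromosome.
        for (_s1, e1), (s2, _e2) in zip(segs, segs[1:]):
            if s2 <= e1:
                overlaps.append((sample, chrom, e1, s2))
    return overlaps


if __name__ == "__main__":
    rows = [
        ["Sample", "Chromosome", "Start", "End", "Num_Probes", "Segment_Mean"],
        ["S1", "1", "100", "200", "10", "0.1"],
        ["S1", "1", "200", "300", "10", "-0.2"],  # starts where the previous one ends
        ["S1", "2", "100", "200", "5", "0.0"],
        ["S1", "2", "201", "300", "5", "0.3"],
    ]
    text = "\n".join("\t".join(r) for r in rows)
    print(find_overlaps(text))  # [('S1', '1', 200, 200)]
```

Only neighbouring segments are compared after sorting by start position, so a segment that spans several later ones is reported once.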

about Segmentation fault (core dumped) bug

Hi,

Recently, I wanted to run GISTIC on TCGA data, but the program was interrupted with the error below in my log:

$ tail /home/yzpeng/matlab_crash_dump.2314-1
[117] 0x00007f6a6e07d609          /usr/lib/x86_64-linux-gnu/libpthread.so.0+00038409
[118] 0x00007f6a6dfa4293                /usr/lib/x86_64-linux-gnu/libc.so.6+01188499 clone+00000067
If this problem is reproducible, please submit a Service Request via:
    http://www.mathworks.com/support/contact_us/
A technical support engineer might contact you with further information.

And the output directory contains only:

$ ls ~/1.pipeline/01_wes/15-cnv-diff/gistics_output/2ed4.0/
D.cap1.5.mat       gistic_inputs.mat      scores.0.5.mat
focal_dat.0.5.mat  sample_seg_counts.txt

The only difference with this run is the file size: this segment file is much bigger (a combination of TCGA data).

I saw that this error might be a bug caused by a Matlab version problem.

But I am still puzzled, because the problem never happened when I handled other data in the past.

My os:
20.04.1-Ubuntu

My command:

nohup ./gistic2 -b ~/1.pipeline/01_wes/15-cnv-diff/gistics_output/1st3.0 -seg ~/1.pipeline/01_wes/15-cnv-diff/tumor.low.seg.txt -refgene refgenefiles/hg19.UCSC.add_miR.140312.refgene.mat  -genegistic 1 -smallmem 1 -broad 1 -brlen 0.5 -conf 0.95 -armpeel 1 -savegene 1 -twosize 1 -maxseg 10000 -gcm extreme 1>2ed.log 2>&1 &

Thanks.

x chromosome CNV analysis

Hi. I am trying to format my seg file to run CNV analysis on chromosome X. I've tried leaving the "chromosome" column notation as "X" and converting "X" to 23, but both attempts were unsuccessful, with different kinds of errors. When I assign 23 to chromosome X, the run works, but it includes output for all chromosomes except X. The reference genome I am using is hg19. Any thoughts as to what may be wrong? Thanks!

genes in wide peak

Hi,
Thanks for this wonderful tool. I used GISTIC2 to analyze my WES data and got several files, including one named "amp_genes.conf_95.txt" and one named "all_data_by_genes.txt". I found that the genes in the wide peak in del_genes.conf_95.txt did not match the genes in the wide peak in all_data_by_genes.txt. Details below:
[screenshot: amp_genes]
[screenshot: all_data_by_genes]
Many genes were omitted from the file amp_genes.conf_95.txt.

Additionally, the gene EGFR is located at 7p11.2, but in the file amp_genes.conf_95.txt, EGFR is located at 7q36.1. Similarly, BRAF is located at 7q34, but in amp_genes.conf_95.txt, BRAF is located at 7q31.2.
I hope to hear from you. Much appreciated.

Compiling GISTIC2 for ARM64

Hello,

I am trying to compile GISTIC2 to work on Apple Silicon and am running into the problem that some functions are undefined. The functions verbose, verbosedisp and set_verbose_level were absent. I found a workaround by creating my own versions of these functions, since they are likely printing functions. However, I now see that impose_default_value is also missing, and it will be harder to recreate.

./gp_gistic2_from_seg: error while loading shared libraries: libncurses.so.5

Hello, my name is Carlos Carretero. I usually run GISTIC2 on my WS, but for my latest analysis I don't have enough memory. I tried to use it on a cluster, but I get this error:

"./gp_gistic2_from_seg: error loading shared libraries: libncurses.so.5: can't open shared objects file: No such file or directory."

I understand that the problem may be that Matlab is not installed properly. I don't know whether there is another way to install GISTIC2 on a cluster, or whether you are familiar with this kind of problem.

Another solution, though I don't know if it is recommended, is to run on the WS with a smaller segment file containing fewer samples and then concatenate the outputs. But I don't know whether the outputs of the different runs are comparable.

Thanks in advance

Gistic2 significant genes don't locate in the corresponding cytoband

Hi,
Significant genes don't match the cytoband in gistic2 result file "del_genes.conf_99". For example, AKT2 gene is located in cytoband 19q13.2 in hg19 genome. However, AKT2 matched to 3 cytobands in file "del_genes.conf_99", none of them was 19q13.2. How to explain this?

[screenshot: 2020-11-20, 2:28:08 PM]

Thank you.

mouse data

Good afternoon,
I am trying to use GISTIC for mouse data, but it failed.
Do you have a version for mouse data?

THANKS!

Symbol error

I have been trying to get this to run on our HPC, which uses environment modules for software installations. I have MCR 8.3 installed and the latest gistic2; however, when I go to run it I receive a symbol error:

[root@cc-dclrilog61 support]# ./run_gistic_example
--- creating output directory ---
--- running GISTIC ---
Setting Matlab MCR root to /cm/shared/apps/MCR/8.3
./gp_gistic2_from_seg: symbol lookup error: ./gp_gistic2_from_seg: undefined symbol: getInitOptionsFromCtfFile_proxy

I cannot tell where the error is being generated from and how to solve it. Any thoughts?

Michael

Unreliable GISTIC2 results

Hi

Initially I ran gistic2 on the hiplot platform using 31 samples as input, and I got some amplified and deleted peaks. Later, I added one more sample to the segment file and used the 32-sample segment file as input. It generated all the result files. However, a peak that was initially reported as deleted is now called as amplified by GISTIC2.

Can you please explain this?

Annotate peaks for desired genes

Hello, I have been using Gistic2 for a few weeks and usually run it using GenePattern. However, I would like to know how to label the peaks in the amp/del plot with specific genes (for now I only get chromosome locations). I have seen presentations and publications with such images. I am sorry for such a lame question.

interpretation of gistic2 output

Hi, I'm new to gistic2. Thanks for this excellent tool. I used gistic2 to analyze my WES data, and the output included an "all_data_by_genes.xls" file, an "all_lesions.conf_95.xls" file and an "all_thresholded.by_genes.xls" file. In the all_data_by_genes.xls file, the EGFR gene did not show amplification, as follows:
[screenshot: EGFR]
However, in all_thresholded.by_genes.xls, EGFR showed amplification:
[screenshot: EGFR2]

The amplification plot created by gistic2 did not show amplification at 7p11.2, where the EGFR gene is located.
[screenshot: amp_qplot]

I was confused by the output. What value threshold for a gene in the all_data_by_genes.xls file indicates amplification or deletion? >1 for amplification, <-1 for deletion?
[screenshot: Snipaste_2022-06-19_17-04-30]

I hope to hear from you. Much appreciated!
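To make the threshold question concrete, here is a minimal sketch of a low-level call on a single gene value. It is not GISTIC's own code. It assumes that the values in all_data_by_genes are log2 copy-number ratios and that the amplification and deletion thresholds are both 0.1, which is believed to be the documented default but should be checked against the parameters of the actual run. The function name and its parameters are hypothetical, and high-level calls are not attempted.

```python
def low_level_call(log2_ratio, t_amp=0.1, t_del=0.1):
    """Classify one gene-level value as 1 (gain), -1 (loss) or 0 (no call).

    t_amp and t_del are assumed thresholds; replace them with the values
    used in your own run.
    """
    if log2_ratio > t_amp:
        return 1
    if log2_ratio < -t_del:
        return -1
    return 0


if __name__ == "__main__":
    for value in (0.35, 0.05, -0.5):
        print(value, low_level_call(value))
```

Under these assumptions a value well below 1 can still be called as a gain, which may be why a gene looks unamplified in one file and amplified in the other.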

crosses chromosomes

I'm using GISTIC2 and I'm getting this very strange error.

I tried to get the GISTIC peaks for a set of samples, and the result depends on how they are grouped. If I use the first half of the samples the program works, and if I use the other half the program also works, but if I take only the even-numbered samples I get this "crosses chromosomes" error. It happens in the genomic_location.m file; for some reason this function is receiving a peak that starts on one chromosome and ends on another. How? Do you have any idea what the problem could be? Any tips?

Thanks,
Diogo

I run the following command:
./gistic2 -b $basedirM -seg $segfileM -refgene $refgenefile -genegistic 1 -smallmem 0 -broad 1 -brlen 0.5 -conf 0.90 -armpeel 1 -savegene 1 -gcm extreme -fname ${j}_M -saveseg 0 -savedata 0

This is a link to my input file

And this is the output I get:
Setting Matlab MCR root to /.mounts/labs/reimandlab/private/projects/PPCG_CNA_pipeline/GISTIC2/MATLAB_Compiler_Runtime
GISTIC version 2.0.23
Warning: Excessive score truncated for 2 samples. #<-- (This warning also appears in the cases where everything runs fine)
[...] #<-- bunch of text I'm removing because I suppose they are not relevant
Deletion wide peak at chr17:42251808-42469287(233962:233977)
Deletion wide peak at chr18:51062961-77546473(242086:244793)
Deletion wide peak at chr21:39677666-43096885(256635:256994)
Writing all_lesions file to: /.mounts/labs/test/results/1_M.all_lesions.conf_90.txt
Error using genomic_location (line 63)
crosses chromosomes

Error in add_D_peakcalls (line 90)

Error in make_all_lesions_file (line 105)

Error in write_gistic_outfiles (line 70)

Error in run_focal_gistic (line 319)

Error in run_gistic20 (line 124)

Error in run_gistic2_from_seg (line 249)

Error in gp_gistic2_from_seg (line 97)
