CTAT-VirusIntegrationFinder
Visit the wiki for documentation.
License: BSD 3-Clause "New" or "Revised" License
Hi, thanks for your useful software. We ran into an interesting issue and hope you can give us some advice. We ran ctat_vif to detect virus-host integration sites in both DNA and RNA sequencing data, but the detected sites differ greatly between the two: far more integration sites are found in the RNA data than in the DNA data. How can we exclude false positives caused by RNA alternative splicing when ctat_vif is applied to RNA data?
Looking forward to your reply.
Best wishes
Hi,
I am running VIF via Docker, with additional arguments for number of threads and output directory name - however those aren't being recognized. STAR still seems to run using 4 threads (the default), and output is created in the directory VIF_starChim_init. Is there a way to fix this? Any help would be greatly appreciated. Thanks!
Hi, thanks for your excellent software.
I have one question that you may be able to help with. For PE reads, two types of reads can be used to find integration sites: "split" and "span". How does the software determine the exact integration site from span reads?
For span reads, one mate aligns only to the virus (or human) and the other mate aligns only to human (or virus), so the exact integration site is uncertain, and many studies discard this kind of read. Could you tell us how ctat_vif handles this?
Hi @brianjohnhaas,
We had a great experience using this tool last year for an intern project. I'm trying to replicate/scale up some of the work they did and would love to wrap this method into a nextflow/nf-core pipeline along with another approach we tried.
We have a couple of other folks who are interested in collaborating on this as well. I wanted to reach out to ask you:
Thanks for your thoughts! BTW, we are tracking our work on this here: https://github.com/nf-osi/viralintegration/tree/dev
Hi there,
For those of us provisioning cloud instances to run the CTAT containers, it would be helpful if there was a note about the RAM requirements of CTAT/VIF, so that we can account for this when we are setting up the compute environment. I was unable to run VIF on a smaller cloud instance I provisioned today; error noted that 32GB of RAM was required.
EXITING: fatal error trying to allocate genome arrays, exception thrown: std::bad_alloc
Possible cause 1: not enough RAM. Check if you have enough RAM 31259734944 bytes
Possible cause 2: not enough virtual memory allowed with ulimit. SOLUTION: run ulimit -v 31259734944
(alternatively, if this number is dynamic based on the data, it would be nice to know how to estimate it).
I'd file a PR myself with a suggested change, but I don't think it's possible to do that on wiki pages. :)
Thanks for considering it!
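For anyone sizing an instance from a log like the one above, the byte count in STAR's hint can be converted directly. This is a minimal sketch, not part of CTAT-VIF: the function names are hypothetical, and it only reads the number that STAR printed, so it says nothing about how that number varies with the genome index.

```python
import os
import re


def required_bytes_from_star_log(log_text):
    """Return the byte count from STAR's 'enough RAM N bytes' hint, or None."""
    match = re.search(r"enough RAM (\d+) bytes", log_text)
    return int(match.group(1)) if match else None


def bytes_to_gib(n_bytes):
    """Convert bytes to GiB (2**30 bytes)."""
    return n_bytes / 2**30


def physical_ram_bytes():
    """Total physical RAM via sysconf (assumes a Linux-like system)."""
    return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")


# Log lines copied from the report above.
log = (
    "EXITING: fatal error trying to allocate genome arrays, "
    "exception thrown: std::bad_alloc\n"
    "Possible cause 1: not enough RAM. Check if you have enough RAM "
    "31259734944 bytes\n"
)

needed = required_bytes_from_star_log(log)
print(f"STAR asked for {needed} bytes (~{bytes_to_gib(needed):.1f} GiB)")
```

For the log in this post the request works out to roughly 29.1 GiB for the genome arrays alone, which is consistent with the 32GB figure once the operating system and other processes are counted.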
Hi, thank you for the wonderful software for analyzing virus integration.
It works well on PE100 data.
But in some cases, such as single-cell sequencing, read 1 only contains the barcode, so only read 2 is useful.
Can CTAT-VIF be applied to single-end reads?
If so, how should it be run?
Best wishes
Hi, thank you for the wonderful software for analyzing virus integration. I have two questions.
First, when integration sites are near each other, they are merged into one site. What is the minimum distance needed to distinguish separate integration sites? Around 500 bp?
Second, what fraction of chimeric reads does the software need in order to detect an integration? We made some attempts to find the threshold, and the detectable chimeric rate seems to be between 15% and 20%.
Looking forward to your reply.
Best wishes
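To make the first question concrete, here is a small sketch of what merging nearby sites within a window looks like. It is only an illustration: the function name is hypothetical, the 500 bp default is the guess from the post rather than a documented CTAT-VIF setting, and this is not claimed to be how CTAT-VIF groups sites.

```python
def cluster_sites(positions, window=500):
    """Group positions on one chromosome into clusters.

    A position joins the current cluster if it lies within `window` bp of
    the previous position (single-linkage, so clusters can chain).
    """
    clusters = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][-1] <= window:
            clusters[-1].append(pos)
        else:
            clusters.append([pos])
    return clusters


print(cluster_sites([1000, 1200, 1650, 5000]))
# [[1000, 1200, 1650], [5000]]
```

Note that with this kind of chaining, 1000 and 1650 end up in the same cluster even though they are more than 500 bp apart, which is one reason the effective resolution can differ from the nominal window.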
Hi,
I tried to run CTAT-VirusIntegrationFinder(ctat_vif.v0.1.0.simg) using singularity. The command and the error message were as shown below.
/path1/singularity exec --bind /path2 /path3/ctat_vif.v0.1.0.simg /usr/local/bin/ctat-VIF.py --CPU 4 --genome_lib_dir /path4/ctat_genome_lib_build_dir --viral_db_fasta /path4/viral.123.1.genomic.fna --left_fq /path5/RHN22_clean_1.fq.gz --right_fq /path5/RHN22_clean_2.fq.gz -O /path2/RHN22/single_vif/ --out_prefix RHN22.vif
BUG: next index is smaller than previous, EXITING
Dec 01 12:38:04 ...... FATAL ERROR, exiting
It seems that the error above is related to the reference file. The viral reference files we use were downloaded from NCBI (listed below) and merged into a single viral reference file, which was used as the input viral reference. However, no error appeared when I replaced the merged file with any one of the three individual files. I can't figure out how to solve this problem. Could you help me with that?
ftp://ftp.ncbi.nlm.nih.gov/refseq/release/viral/viral.1.1.genomic.fna.gz
ftp://ftp.ncbi.nlm.nih.gov/refseq/release/viral/viral.2.1.genomic.fna.gz
ftp://ftp.ncbi.nlm.nih.gov/refseq/release/viral/viral.3.1.genomic.fna.gz
Best,
Jianbiao Li
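Since each file works alone and only the merged file fails, inspecting the merged FASTA is a reasonable first step. The sketch below lists sequence IDs that occur more than once across a set of FASTA files. It rests on an assumption that is not confirmed anywhere in this thread, namely that duplicated records are worth ruling out; the function names are hypothetical.

```python
import gzip
from collections import Counter


def open_maybe_gzip(path):
    """Open a plain or gzip-compressed text file for reading."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def fasta_ids(path):
    """Yield the ID (first word of each '>' header line) of every record."""
    with open_maybe_gzip(path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                yield fields[0] if fields else ""


def duplicate_ids(paths):
    """Return the sorted IDs that appear more than once across all files."""
    counts = Counter(seq_id for p in paths for seq_id in fasta_ids(p))
    return sorted(seq_id for seq_id, n in counts.items() if n > 1)
```

Calling `duplicate_ids` with the three downloaded files, or with the merged file alone, returns an empty list when every ID is unique.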
Sorry, when I run CTAT-VIF, this error message is displayed:
EXITING because of FATAL ERROR: could not open genome file ~/wangjiaxuan/biosoft/CTAT-vif/genome_lib/VIF_index_human-plus-hpv16/ref_genome.fa.star.idx//genomeParameters.txt
SOLUTION: check that the path to genome files, specified in --genomeDir is correct and the files are present, and have user read permsissions
Jan 10 15:58:35 ...... FATAL ERROR, exiting
It seems the human-only STAR index is missing from the genome dir. But the file I downloaded from https://github.com/STAR-Fusion/STAR-Fusion/wiki/STAR-Fusion-release-and-CTAT-Genome-Lib-Compatibility-Matrix
does not contain the ref_genome.fa.star.idx named in the error. Where can I find it?
thanks
#---------------------------------------------
my command is
${soft_dir}/ctat-vif \
--left $read1 \
--right ${read2} \
--genome_lib_dir $genomedir \
--sample_id yesimola \
-O ./ \
--cpu 2
and the genome_lib_dir is :
genome_lib/VIF_index_human-plus-hpv16/
├── ref_annot.gtf
├── ref_genome.fa
└── VIF
    ├── hg_plus_viraldb.fasta
    ├── hg_plus_viraldb.fasta.fai
    ├── hg_plus_viraldb.fasta.star.idx
    │   ├── chrLength.txt
    │   ├── chrNameLength.txt
    │   ├── chrName.txt
    │   ├── chrStart.txt
    │   ├── exonGeTrInfo.tab
    │   ├── exonInfo.tab
    │   ├── geneInfo.tab
    │   ├── Genome
    │   ├── genomeParameters.txt
    │   ├── Log.out
    │   ├── SA
    │   ├── SAindex
    │   ├── sjdbInfo.txt
    │   ├── sjdbList.fromGTF.out.tab
    │   ├── sjdbList.out.tab
    │   └── transcriptInfo.tab
    ├── virus_db.fasta
    └── virus_db.fasta.fai
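Two things stand out when the error is compared with the listing: the path in the error begins with a literal `~`, and the directory it names, ref_genome.fa.star.idx, is absent while VIF/hg_plus_viraldb.fasta.star.idx is present. A minimal sketch for checking both, assuming only the directory and file names that appear in this post (the function name is hypothetical):

```python
import os


def star_index_status(genome_lib_dir):
    """Report which expected STAR index directories hold genomeParameters.txt.

    The path is expanded first, so a leading '~' is resolved to the home
    directory instead of being passed along literally.
    """
    root = os.path.abspath(os.path.expanduser(genome_lib_dir))
    candidates = [
        os.path.join(root, "ref_genome.fa.star.idx"),
        os.path.join(root, "VIF", "hg_plus_viraldb.fasta.star.idx"),
    ]
    return {
        d: os.path.isfile(os.path.join(d, "genomeParameters.txt"))
        for d in candidates
    }


if __name__ == "__main__":
    # Path taken from the error message above.
    lib = "~/wangjiaxuan/biosoft/CTAT-vif/genome_lib/VIF_index_human-plus-hpv16"
    for index_dir, ok in star_index_status(lib).items():
        print("OK     " if ok else "MISSING", index_dir)
```

With the directory layout shown above, the first entry would be reported as missing and the second as present.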
Hi,
It is looking for bamsifter within the folder "/util/bamsifter/", but it does not exist. Should it?
This is the command I ran:
ctat-VIF.py
--genome_lib_dir ctat_genome_lib_build_dir/
--left_fq ${1}_R1.fastq.gz
--right_fq ${1}_R2.fastq.gz
--viral_db_fasta ebv.fa
--viral_db_gtf ebv.custom.gtf
-O ${1}_VIF
This is the partial error:
bash -c 'set -eou pipefail && samtools view -h VIF_starChim_init/Aligned.sortedByCoord.out.bam EBV | samtools depth - > EBV.depth.tsv '
/bin/sh: /data/MoCha/paulyr/CTAT-VirusIntegrationFinder/util/bamsifter/bamsifter: No such file or directory
Traceback (most recent call last):
File "/data/MoCha/paulyr/CTAT-VirusIntegrationFinder/ctat-VIF.py", line 678, in <module>
main()
File "/data/MoCha/paulyr/CTAT-VirusIntegrationFinder/ctat-VIF.py", line 301, in main
pipeliner.run()
File "/gpfs/gsfs6/users/MoCha/paulyr/CTAT-VirusIntegrationFinder/PyLib/Pipeliner.py", line 75, in run
cmd.run(checkpoint_dir)
File "/gpfs/gsfs6/users/MoCha/paulyr/CTAT-VirusIntegrationFinder/PyLib/Pipeliner.py", line 136, in run
raise RuntimeError(errmsg
Thanks!
nf-core/viralintegration#28 (comment)
Just cross posting for visibility.
I installed CTAT-VIF and got some results with it in my workflow. Recently I noticed it has been upgraded, and it is becoming more powerful and comprehensive.
So I wanted to install the new version and add it to my analysis workflow. I downloaded the zip and built it with make successfully, but when I then ran it, this error message was displayed:
Error: Invalid or corrupt jarfile ~/wangjiaxuan/biosoft/CTAT-vif/WDL/cromwell-58.jar
I tried downloading it from GitHub with a web browser, asked a co-worker to download it, and even tried cromwell-59, but all of these failed.
This is discouraging. I also noticed that the previous version did not have cromwell in the WDL directory. How can I solve this problem?
thanks
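A jar file is a zip archive, so one quick way to see what was actually downloaded is to test whether the file is a zip at all and, if not, look at its first bytes. This is a minimal sketch with a hypothetical function name. The Git LFS and HTML checks are assumptions about common download failures, not something confirmed for this repository.

```python
import os
import zipfile


def describe_jar(path):
    """Classify a file that is supposed to be a jar (zip) archive."""
    path = os.path.expanduser(path)
    with open(path, "rb") as handle:
        head = handle.read(200)
    if zipfile.is_zipfile(path):
        return "valid zip/jar container"
    if head.startswith(b"version https://git-lfs"):
        return "Git LFS pointer file, not the real jar"
    if head.lstrip().lower().startswith((b"<!doctype", b"<html")):
        return "HTML page saved instead of the jar"
    return "not a zip archive (truncated or corrupt download?)"
```

Running `describe_jar("~/wangjiaxuan/biosoft/CTAT-vif/WDL/cromwell-58.jar")` on the file from the error message would show which of these cases applies.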