
lace's People

Contributors

brianjohnhaas, damayanthiherath, dpryan79, hdashnow, lix1993, nadiadavidson, quarkins

lace's Issues

Failed to install Lace

After installing Lace with:

conda env create -f environment.yml 
conda activate lace
pip install .

running Lace returns:

Lace/Lace_run.py  -h
Traceback (most recent call last):
  File "/share/software/Lace/latest/Lace/Lace_run.py", line 14, in <module>
    from Lace.BuildSuperTranscript import SuperTran
  File "/home/rna/software/miniconda3/envs/lace/lib/python3.9/site-packages/Lace/BuildSuperTranscript.py", line 11, in <module>
    import networkx as nx
  File "/home/rna/software/miniconda3/envs/lace/lib/python3.9/site-packages/networkx/__init__.py", line 114, in <module>
    import networkx.generators
  File "/home/rna/software/miniconda3/envs/lace/lib/python3.9/site-packages/networkx/generators/__init__.py", line 14, in <module>
    from networkx.generators.intersection import *
  File "/home/rna/software/miniconda3/envs/lace/lib/python3.9/site-packages/networkx/generators/intersection.py", line 13, in <module>
    from networkx.algorithms import bipartite
  File "/home/rna/software/miniconda3/envs/lace/lib/python3.9/site-packages/networkx/algorithms/__init__.py", line 16, in <module>
    from networkx.algorithms.dag import *
  File "/home/rna/software/miniconda3/envs/lace/lib/python3.9/site-packages/networkx/algorithms/dag.py", line 23, in <module>
    from fractions import gcd
ImportError: cannot import name 'gcd' from 'fractions' (/home/rna/software/miniconda3/envs/lace/lib/python3.9/fractions.py)

Failed to construct

Searched 68096 bases in 10 sequences
add_edge() takes exactly 3 arguments (4 given)
FAILED to construct
add_edge() takes exactly 3 arguments (4 given)
FAILED to construct
add_edge() takes exactly 3 arguments (4 given)
FAILED to construct
add_edge() takes exactly 3 arguments (4 given)
FAILED to construct

These lines are reported and then the process stops.

Speed

Bypass the I/O by writing an aligner or incorporating blat using Cython?

Not reverse complementing the strand when it should

I think lace still isn't handling the strand properly. e.g. see RECK in /mnt/storage/nadiad/work_area/20160203_ALL/simulation/SIM/lace on our server
input files are all.fasta and all.groupings
in /mnt/storage/nadiad/work_area/20160203_ALL/simulation/SIM

lace was installed using conda instructions on the wiki.

Asking for help with this problem.

[chej2tc@mu01 Example]$ /GS01/software/biosoft/python/python3.5/bin/python /GS01/software/biosoft/Lace-1.00/Lace.py -o test2 Example_Genome.fasta clusters.txt


[Lace ASCII-art banner]
Lace Version: 0.82
Last Editted: 30/01/17
Creating output directory
Creating dictionary of transcripts in clusters...
Creating a fasta file per gene...
Now Building SuperTranscript for each gene...
sh: blat: command not found
FAILED to construct
sh: blat: command not found
FAILED to construct
BUILT SUPERTRANSCRIPTS ---- 0.11878800392150879 seconds ----

Dear Oshlack,
I ran Lace.py with the command above, but two FAILED messages were reported. The output file was created, but I don't know why construction failed. Is there a package I have not installed correctly, or is this the right way to run Lace?
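
The log above prints "sh: blat: command not found" before each FAILED line, which suggests the BLAT executable was not on the PATH when Lace ran. A minimal pre-flight check, assuming that is the cause; the helper name missing_tools is hypothetical and not part of Lace:

```python
import shutil


def missing_tools(tools):
    """Return the names in `tools` that cannot be found on the current PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    # "blat" is the executable name taken from the error message in the log.
    missing = missing_tools(["blat"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required tools found.")
```

If blat is reported as missing, installing it or adding its directory to PATH before rerunning would be the first thing to try.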

Another question:
I don't understand how to use the "ClusterFile". For the species I study I only have a Trinity assembly, and I don't know how to produce the ClusterFile. Why do we need a ClusterFile?

Does a cluster mean a unigene or a transcript?

Does each cluster sequence in the "SuperDuper.fasta" file represent a unigene? If so, can I run the regular non-model species transcriptome pipeline on it, that is, annotate these cluster genes (unigenes) and do downstream differential expression analysis?
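
On the ClusterFile question, one way to build a transcript-to-cluster table from Trinity output alone can be sketched as below. It assumes the standard Trinity naming scheme, in which an isoform ID such as TRINITY_DN1000_c115_g5_i1 belongs to gene TRINITY_DN1000_c115_g5, and it assumes Lace accepts a two-column, tab-separated file of transcript ID and cluster ID. Both assumptions should be checked against the Lace documentation; the function names are hypothetical.

```python
import re

# Trinity isoform IDs end in "_i<number>"; removing that suffix leaves the gene ID.
_ISOFORM_SUFFIX = re.compile(r"_i\d+$")


def trinity_clusters(fasta_lines):
    """Yield (transcript_id, gene_id) pairs from the header lines of a Trinity FASTA."""
    for line in fasta_lines:
        if not line.startswith(">"):
            continue
        fields = line[1:].split()
        if not fields:
            continue
        transcript_id = fields[0]
        yield transcript_id, _ISOFORM_SUFFIX.sub("", transcript_id)


def format_cluster_file(pairs):
    """Render the pairs as tab-separated lines (assumed layout: transcript, cluster)."""
    return "".join("{}\t{}\n".format(t, g) for t, g in pairs)


if __name__ == "__main__":
    example = [">TRINITY_DN1000_c115_g5_i1 len=247", "ACGT"]
    print(format_cluster_file(trinity_clusters(example)), end="")
```

Grouping by Trinity gene ID is only one choice; a clustering tool such as Corset, mentioned in other issues on this page, is an alternative source of clusters.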

analysing blat is failing

I made an assembly of Arabidopsis using Trinity and am trying to run Lace on it, but I am getting errors. I have figured out where it goes wrong, but not why.

In the function BuildGraph, between "#Copy graph before simplifying" and "####### Whirl Elimination ######################", I get a runtime error: dictionary changed size during iteration.

In the first blat run, I sometimes also get the error:

add_edge() takes 3 positional arguments but 4 were given

I am working with Python 3.6 and used your method to install Lace. Any tips?
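
The first error is general Python 3 behaviour rather than something specific to Lace: removing keys from a dictionary while looping over it raises RuntimeError. The sketch below reproduces it on a plain dict standing in for a graph's adjacency structure and shows the usual workaround of iterating over a snapshot of the keys. The function names and toy data are hypothetical, and this is not Lace's BuildGraph code. The add_edge() message is consistent with a networkx 2.x install, where add_edge no longer accepts a positional attribute dictionary.

```python
def prune_unsafe(adjacency):
    """Delete nodes with no neighbours while iterating over the live dict."""
    for node in adjacency:
        if not adjacency[node]:
            del adjacency[node]  # changes the dict size mid-iteration
    return adjacency


def prune_safe(adjacency):
    """Same pruning, but iterate over a copy of the keys taken up front."""
    for node in list(adjacency):
        if not adjacency[node]:
            del adjacency[node]
    return adjacency


if __name__ == "__main__":
    try:
        prune_unsafe({"a": [], "b": ["a"], "c": []})
    except RuntimeError as err:
        print("unsafe version failed:", err)
    print("safe version:", prune_safe({"a": [], "b": ["a"], "c": []}))
```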

Question: supertranscripts (conceptual question)

I have a more conceptual question. I have used Lace to create a reference SuperTranscript from a de novo assembly in order to call variants between 2 individuals reared under 2 conditions (4 samples/4 libraries/4 vcf files). The reads used for calling SNPs were the same ones used for the de novo assembly, which was performed with Trinity.
Why should only heterozygous SNPs, defined as those with at least one read supporting the reference allele, be analysed further? I thought homozygous SNPs would be more informative for me, as I want to detect any differences between the 2 strains.
Are these heterozygous SNPs the ones represented by GT:0/1 in the vcf files?

Thanks! Sofia

Excessive memory usage on large dataset

On a large dataset (made from 30 mouse samples, of different tissues, 100M RNAseq reads per sample) Lace consistently stalls without error. I traced this to excessive memory usage (>200GB of RAM), which exceeds our capacity to run the program on the whole dataset.

The de novo assembly was conducted in Trinity and the clustering was done using the necklace protocol (https://github.com/Oshlack/necklace).

Possibly related to issue #29 and/or #31.

Question: use SuperTranscripts for paralogs

Hi there,

Lace and SuperTranscripts sound excellent for non-model organisms without a reference genome, and I'd love to try them. My application would be slightly different, though. I think it may still work, but I would like your opinion.
I work with single-celled eukaryotic algae. While they don't usually seem to splice their transcripts, they are riddled with paralogs, which they transcribe. I'd like to use this method to compare paralogs by treating them the same as splice variants. Do you see any problems with this?

Cheers!

Lace processed without any files being produced.

Hello!
Could you please help me resolve an issue I encountered while using Lace? Every time I run it on my dataset, the job finishes without any warnings but produces nothing. Lace works successfully on the test data. My clusters were produced by Corset.
I have no idea how to resolve the issue. I have been reinstalling and reconfiguring Lace and trying different SLURM parameters for weeks.

Best regards
Asan
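
One thing that may be worth ruling out when a run finishes silently with no output is a mismatch between the transcript IDs in the cluster file and the FASTA headers. This is only a guess at a possible cause, not a confirmed diagnosis. The sketch assumes a whitespace-separated cluster file with the transcript ID in the first column; the function names are hypothetical.

```python
def fasta_ids(lines):
    """Collect sequence IDs (the first token after '>') from FASTA lines."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])
    return ids


def cluster_transcript_ids(lines):
    """Collect the first column of each non-empty line of a cluster file."""
    ids = set()
    for line in lines:
        fields = line.split()
        if fields:
            ids.add(fields[0])
    return ids


def unmatched_ids(fasta_lines, cluster_lines):
    """Return transcript IDs listed in the cluster file but absent from the FASTA."""
    return sorted(cluster_transcript_ids(cluster_lines) - fasta_ids(fasta_lines))


if __name__ == "__main__":
    fasta = [">t1 len=5", "ACGTA", ">t2", "ACGT"]
    clusters = ["t1\tCluster-0.0", "t3\tCluster-0.1"]
    print(unmatched_ids(fasta, clusters))
```

With real data, open file handles can be passed in place of the lists. A long list of unmatched IDs would point to a naming problem rather than an installation problem.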

Problem with memory exceeding the limit.

Problem reported by user over email.
Appears to be a single cluster that uses all the memory.
User sent data and the problem was reproduced. We need to investigate Cluster-16676.1839

Raise error when networkx version is too high

While using Lace I noticed that, when networkx v2 was included in my conda environment, Lace ran without error but created an incorrect superTranscriptome in which all "whirl" counts were set to 0 and no case change occurred in the sequences.

It would be helpful to raise this error at runtime so that new users can identify this and adjust their environment setup accordingly (especially since many other programs require networkx v2).
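
A start-up check along these lines is one way the request could be met. It is a sketch only: the function name is hypothetical, and the assumption that every 1.x release is acceptable comes from this report rather than from the Lace code.

```python
def require_networkx_1(version_string):
    """Raise RuntimeError if the given networkx version is 2.0 or newer."""
    major = int(version_string.split(".")[0])
    if major >= 2:
        raise RuntimeError(
            "networkx {} detected; this tool expects networkx 1.x. "
            "Please install an older networkx in this environment.".format(version_string)
        )
    return major


# Intended use at program start (requires networkx to be installed):
#     import networkx
#     require_networkx_1(networkx.__version__)
```

Failing early with a clear message avoids the silent, incorrect output described above.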

Lace stalls without error ...

Hello,
Lace always stalls without stopping or giving any error.
The individual fasta files are generated and then the supertranscripts are built (some using large amounts of RAM, up to 256+128 SWAP), but after a while the program stalls for a long time. When I stop it with Ctrl-C, the following output is given. Any ideas what is going wrong?

^CProcess ForkPoolWorker-5:
Traceback (most recent call last):
File "/data/analysis/Dietmar/SW/Supertranscript/Lace-master/Lace.py", line 192, in
Split(args.GenomeFile,args.ClusterFile,args.cores,args.maxTran,args.outputDir)
File "/data/analysis/Dietmar/SW/Supertranscript/Lace-master/Lace.py", line 136, in Split
pool.join()
File "/data/analysis/Dietmar/SW/Anaconda/Anaconda3/envs/lace/lib/python3.6/multiprocessing/pool.py", line 510, in join
self._worker_handler.join()
File "/data/analysis/Dietmar/SW/Anaconda/Anaconda3/envs/lace/lib/python3.6/threading.py", line 1056, in join
self._wait_for_tstate_lock()
File "/data/analysis/Dietmar/SW/Anaconda/Anaconda3/envs/lace/lib/python3.6/threading.py", line 1072, in _wait_for_tstate_lock
elif lock.acquire(block, timeout):
Traceback (most recent call last):
KeyboardInterrupt
File "/data/analysis/Dietmar/SW/Anaconda/Anaconda3/envs/lace/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
self.run()
File "/data/analysis/Dietmar/SW/Anaconda/Anaconda3/envs/lace/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/data/analysis/Dietmar/SW/Anaconda/Anaconda3/envs/lace/lib/python3.6/multiprocessing/pool.py", line 108, in worker
task = get()
File "/data/analysis/Dietmar/SW/Anaconda/Anaconda3/envs/lace/lib/python3.6/multiprocessing/queues.py", line 343, in get
res = self._reader.recv_bytes()
File "/data/analysis/Dietmar/SW/Anaconda/Anaconda3/envs/lace/lib/python3.6/multiprocessing/connection.py", line 216, in recv_bytes
buf = self._recv_bytes(maxlength)
File "/data/analysis/Dietmar/SW/Anaconda/Anaconda3/envs/lace/lib/python3.6/multiprocessing/connection.py", line 407, in _recv_bytes
buf = self._recv(4)
File "/data/analysis/Dietmar/SW/Anaconda/Anaconda3/envs/lace/lib/python3.6/multiprocessing/connection.py", line 379, in _recv
chunk = read(handle, remaining)
KeyboardInterrupt

Python version should be downgraded in environment.yml

Hello!
Here is an issue I have recently encountered: the python version was incompatible with the networkx package.

Traceback (most recent call last):
  File "/project/_app/Lace/Lace/Lace_run.py", line 14, in <module>
    from Lace.BuildSuperTranscript import SuperTran
  File "/home/user/miniconda3/envs/lace/lib/python3.11/site-packages/Lace/BuildSuperTranscript.py", line 11, in <module>
    import networkx as nx
  File "/home/user/miniconda3/envs/lace/lib/python3.11/site-packages/networkx/__init__.py", line 114, in <module>
    import networkx.generators
  File "/home/user/miniconda3/envs/lace/lib/python3.11/site-packages/networkx/generators/__init__.py", line 14, in <module>
    from networkx.generators.intersection import *
  File "/home/user/miniconda3/envs/lace/lib/python3.11/site-packages/networkx/generators/intersection.py", line 13, in <module>
    from networkx.algorithms import bipartite
  File "/home/user/miniconda3/envs/lace/lib/python3.11/site-packages/networkx/algorithms/__init__.py", line 16, in <module>
    from networkx.algorithms.dag import *
  File "/home/user/miniconda3/envs/lace/lib/python3.11/site-packages/networkx/algorithms/dag.py", line 23, in <module>
    from fractions import gcd
ImportError: cannot import name 'gcd' from 'fractions' (/home/user/miniconda3/envs/lace/lib/python3.11/fractions.py)

See this compatibility table. fractions.gcd(a, b) was deprecated in Python 3.5 in favour of math.gcd(a, b) and removed in Python 3.9. Either a more recent networkx version should be used or the Python version should be downgraded.
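
For reference, the usual compatibility pattern for this import looks like the sketch below. It illustrates the general technique and is not the actual networkx patch.

```python
try:
    from math import gcd       # available since Python 3.5
except ImportError:
    from fractions import gcd  # very old interpreters only; removed in Python 3.9

print(gcd(12, 18))  # prints 6
```

The installed networkx cannot easily be patched this way by users, so pinning the Python version or changing the networkx version, as suggested above, remains the practical fix.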

Best regards
Asan

Additional functionality

Add functionality to:

  • Remove intermediate files (.fasta and .psl files created by SuperTranscript).
  • Supply the number of cores you wish to use.
  • Provide cross-checks and alternate annotation file or not.
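
A rough sketch of how these options might be exposed on the command line with argparse. The names GenomeFile, ClusterFile and cores mirror the attribute names visible in the traceback elsewhere on this page (args.GenomeFile, args.ClusterFile, args.cores); the --tidy and --alternate flag names are hypothetical placeholders.

```python
import argparse


def build_parser():
    """Build a parser with the three requested options (flag names are assumptions)."""
    parser = argparse.ArgumentParser(description="Sketch of a Lace-like command line.")
    parser.add_argument("GenomeFile", help="FASTA file of transcripts")
    parser.add_argument("ClusterFile", help="table assigning transcripts to clusters")
    parser.add_argument("--cores", type=int, default=1,
                        help="number of cores to use (default: 1)")
    parser.add_argument("--tidy", action="store_true",
                        help="remove intermediate .fasta and .psl files (hypothetical flag)")
    parser.add_argument("--alternate", action="store_true",
                        help="also write an alternate annotation file (hypothetical flag)")
    return parser


# Parse an explicit argument list so the example runs without a real command line.
args = build_parser().parse_args(["all.fasta", "all.groupings", "--cores", "4", "--tidy"])
print(args.cores, args.tidy, args.alternate)
```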

(installation) In environment.yml, fix Python Version

Sadly, networkx 2.3 clashes with Python 3.9, because fractions.gcd() has been removed in favour of math.gcd(). As a result, running Lace after installation causes an error:

ImportError: cannot import name 'gcd' from 'fractions'

Fixing the version of Python in Lace-1.14.1/environment.yml fixes this problem. I tried Python=3.5, and now it works.

DTE and DTU without biological replicates ?

In the example "Differential Transcript Usage on a non model organism", the script for the DTE analysis requires biological replicates. Is it possible to do the analysis without biological replicates? The DTE script from the example (voom_diff.R) doesn't run without them.

I modified the script and it now runs without biological replicates, but I don't know if the analysis is correct.

My script is this:

# library
library('edgeR')

## Read in data
counts <- read.table("Counts/counts.txt",header=TRUE,sep="\t")

##Define groups
treatment = c('T1','T2')

## Make exon id
eid = paste0(counts$Chr,"-S",counts$Start,"-E",counts$End)

## Define design matrix
design <- model.matrix(~treatment)

## Make DGElist and normalise
dx <- DGEList(counts[,c(7:8)])
dx <-calcNormFactors(dx,group= treatment )

## glmFit
gfit <- glmFit(dx, design, dispersion = 0.1)
ds <- diffSpliceDGE(gfit, geneid = counts$Chr, exonid = eid)

## Results
topSpliceDGE(ds, number = 20, test = "Simes")
plotSpliceDGE(ds)

SuperDuperTrans.gff has wrong entries for special cases

I found a few odd entries in the chicken SuperDuperTrans.gff for the genes AKAP2 and FAM188B. Blocks are annotated beyond the length of the super transcript. Each of these genes has another gene whose name includes its name (PALM2-AKAP2 and INMT-FAM188B), and I think the annotation of these is getting confused. Some output from a command that ran into the issue is pasted below.

Feature (AKAP2:4374-5634) beyond the length of AKAP2 size (3007 bp). Skipping.
Feature (AKAP2:5637-7198) beyond the length of AKAP2 size (3007 bp). Skipping.
Feature (FAM188B:2849-3179) beyond the length of FAM188B size (753 bp). Skipping.
Feature (FAM188B:3179-3409) beyond the length of FAM188B size (753 bp). Skipping.
Feature (FAM188B:2849-3179) beyond the length of FAM188B size (753 bp). Skipping.
Feature (FAM188B:3179-3237) beyond the length of FAM188B size (753 bp). Skipping.
Feature (FAM188B:3238-3282) beyond the length of FAM188B size (753 bp). Skipping.
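
To find every affected entry, a check similar to the one that produced the messages above can be sketched as follows. It assumes a tab-separated GFF with the sequence name in column 1 and start and end in columns 4 and 5, plus a dict of SuperTranscript lengths built separately; the function name is hypothetical.

```python
def features_beyond_length(gff_lines, lengths):
    """Return (seqname, start, end, length) for features that end past their sequence."""
    problems = []
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a complete feature line
        seqname, start, end = fields[0], int(fields[3]), int(fields[4])
        length = lengths.get(seqname)
        if length is not None and end > length:
            problems.append((seqname, start, end, length))
    return problems


if __name__ == "__main__":
    gff = ["AKAP2\tSuperTranscript\texon\t4374\t5634\t.\t.\t0\t."]
    print(features_beyond_length(gff, {"AKAP2": 3007}))
```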

'DiGraph' object issue on networkx 2.3

Hello!
I configured a virtual environment using your environment.yml file and changed the Python version to 3.9. While running Lace on the test data, the following error occurred:
'DiGraph' object has no attribute 'node'

The suggestion on StackExchange is to change networkx version to 1.1 or modify files used by DiGraph.
What is your preferred way to solve the issue?

Best regards
Asan

mv: cannot stat ....

Error reported at the end of a run (to do with clean up).
Doesn't seem to affect the results.

Issues from Luke's review

  • Changed it so the default is 1 core
  • Have an option where you can decide where to put the output
  • Potentially have just one script Ribbon (which does both Checker and STViewer) [Optional]

Assessing completeness of SuperTranscript transcriptome assembly/BUSCO

Hello team of the Oshlack lab,

Do you have experience with BUSCO analysis on SuperTranscript data?
I have used Corset and Lace to cluster and stitch plant transcriptome assemblies. Afterwards, BUSCO did not find many of the expected BUSCOs. However, when I used OrthoFinder (which uses BLAST/Diamond) to find orthologs in additional species, the assemblies looked more complete.

Do you think SuperTranscripts are in principle compatible with BUSCO?

Can you think of an alternative way to check the completeness of the SuperTranscript assemblies?

Thank you,
Maria

gff table needs another column

At least featureCounts expects it. One line example:

PYGL SuperTranscript exon 0 301 . . 0

vs.

PYGL SuperTranscript exon 0 301 . . 0 .
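
Until the output is changed, existing files could be patched with a small script like this sketch. It assumes the columns are tab-separated in the real file (they appear space-separated above) and that a "." placeholder in the ninth column is acceptable; the function name is hypothetical.

```python
def pad_gff_line(line, n_columns=9, filler="."):
    """Pad a tab-separated GFF line with filler fields up to n_columns columns."""
    fields = line.rstrip("\n").split("\t")
    # A negative count yields an empty list, so complete lines are left unchanged.
    fields += [filler] * (n_columns - len(fields))
    return "\t".join(fields)


if __name__ == "__main__":
    print(pad_gff_line("PYGL\tSuperTranscript\texon\t0\t301\t.\t.\t0"))
```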

Test

This is a test

Hello World
