smith-chem-wisc / spritz
Software for RNA-Seq analysis to create sample-specific proteoform databases from RNA-Seq data
Home Page: https://smith-chem-wisc.github.io/Spritz/
License: MIT License
See MetaMorpheus
Implement UCSCDownloadsWrapper similar to EnsemblDownloadsWrapper
There's more functionality we can implement for command-line output: https://github.com/commandlineparser/commandline/wiki.
I especially like the notes on error messages regarding the command-line input.
We can't use AppVeyor because the tests need to use Windows Subsystem for Linux
This lazy initialization will also help with downloading different versions only when needed (see #60).
In other words, note the dependencies in each workflow, and have ManageToolsFlow download those each time if needed.
SplitNCigarReads can't use multiple threads, so run it in parallel for several BAM files to speed things up.
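The idea above can be sketched in bash as follows. This is only an illustration, not how Spritz does it: it assumes the GATK4 gatk launcher is on the PATH, and the function name split_bams_in_parallel, the GATK override variable, and the .split.bam output suffix are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: assumes the GATK4 "gatk" launcher is on the PATH.
# Set GATK=echo for a dry run that prints each command instead of running it.

# Hypothetical helper: split_bams_in_parallel REFERENCE MAX_JOBS BAM...
split_bams_in_parallel() {
    local reference="$1" max_jobs="$2"
    shift 2
    # One BAM path per line; xargs keeps up to MAX_JOBS single-threaded
    # SplitNCigarReads processes running at once, one per input file.
    printf '%s\n' "$@" |
        xargs -P "$max_jobs" -I{} \
            "${GATK:-gatk}" SplitNCigarReads -R "$reference" -I {} -O {}.split.bam
}
```

For example, GATK=echo split_bams_in_parallel genome.fa 4 a.bam b.bam prints the two commands without running GATK. The job count would still need to be chosen with available memory in mind, since each process holds its own working set.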
Based on memory and organism being used
That would make Genome.KaryotypicOrder() and Genome.IsKaryotypic() much easier!
Might need to catch this error somehow. I can't change the ulimit -n maximum.
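One way to catch this up front is to compare the soft and hard limits before starting the work. The sketch below assumes the error comes from the open-file limit; the helper name ensure_open_file_limit and the required count passed to it are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: ensure_open_file_limit NEEDED
# Succeeds if the current shell may open at least NEEDED files,
# raising the soft limit when the hard limit allows it.
ensure_open_file_limit() {
    local needed="$1" soft hard
    soft="$(ulimit -Sn)"
    hard="$(ulimit -Hn)"
    # The soft limit is already high enough.
    if [ "$soft" = "unlimited" ] || [ "$soft" -ge "$needed" ]; then
        return 0
    fi
    # The soft limit can be raised without root, but only up to the hard limit.
    if [ "$hard" = "unlimited" ] || [ "$hard" -ge "$needed" ]; then
        ulimit -Sn "$needed" && return 0
    fi
    echo "Need $needed open files, but the hard limit is $hard" >&2
    return 1
}
```

A caller could run ensure_open_file_limit with its expected file count and, on failure, report a clear message or fall back to settings that open fewer files instead of crashing partway through.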
The /CMD/bin/Debug/STAR file is empty after first-time setup (on both Orion and LEI).
Deleting the STAR directory and running setup again installs STAR successfully.
Related to STARWrapper.cs Line 184.
A bug only on LEI: a /CMD/bin/Debug/bin directory containing some of the contents of /CMD/bin/Debug/bedops is somehow not removed.
Already partially implemented in Proteogenomics.ProteinAnnotation
The flow generates multiple types of files (e.g., STAR2PassAlignFLow.AlignFastqs has many string paths, including the index, SAM file, BAM file...). I would prefer to set those string paths as constants.
to the transcript ID for GRCh38.86 and not for GRCh37.75
It can be used for more than one analysis.
Important for the GUI. We can use the DotNetBio framework for this:
Make Interval, Marker, and IntervalAndSubInterval<T> classes like SnpEff, which have nice methods for applying variants. Interval can use SequenceRange from DotNetBio, since it's already comparable. Markers, a collection of Marker objects, can be SequenceRangeGrouping, a collection of SequenceRange objects that allows intersecting and such.
The key goal here is applying SNPs, MNPs, and indels in a tidy way.
Using DotNetBio SAMFlags implementation instead of RSeQC
Defaults are probably okay, but there are some advanced options:
--fragment-length-min <int>
    Minimum read/insert length allowed. This is also the value for the
    Bowtie/Bowtie2 -I option. (Default: 1)
--fragment-length-max <int>
    Maximum read/insert length allowed. This is also the value for the
    Bowtie/Bowtie 2 -X option. (Default: 1000)
--fragment-length-mean <double>
    (single-end data only) The mean of the fragment length distribution,
    which is assumed to be a Gaussian. (Default: -1, which disables use of
    the fragment length distribution)
--fragment-length-sd <double>
    (single-end data only) The standard deviation of the fragment length
    distribution, which is assumed to be a Gaussian. (Default: 0, which
    assumes that all fragments are of the same length, given by the
    rounded value of --fragment-length-mean)
The TopHat website recommends using HISAT2 instead: https://ccb.jhu.edu/software/tophat/index.shtml
These setups need root permissions. 1) Generate a script for each, 2) Run sudo bash script.bash & for each so that they run in parallel with root permissions.
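A minimal bash sketch of step 2, assuming the scripts from step 1 already exist. The function name run_setup_scripts_in_parallel and the RUNNER override are hypothetical. Because background jobs cannot prompt for a password cleanly, it may help to refresh credentials with sudo -v before calling it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run_setup_scripts_in_parallel SCRIPT...
# Starts every script in the background, then waits for all of them.
# RUNNER defaults to "sudo bash"; set RUNNER=bash to try it without root.
run_setup_scripts_in_parallel() {
    local runner="${RUNNER:-sudo bash}"
    local failures=0 pids="" pid script
    for script in "$@"; do
        # $runner is intentionally unquoted so "sudo bash" splits into two words.
        $runner "$script" &
        pids="$pids $!"
    done
    # Wait on each job separately so one failed setup is not hidden by the others.
    for pid in $pids; do
        wait "$pid" || failures=$((failures + 1))
    done
    # The exit status is the number of scripts that failed (0 means all succeeded).
    return "$failures"
}
```

Waiting on each PID, rather than issuing a bare wait, lets the caller tell whether any individual setup script failed.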
Currently, this is:
// Install the dependency only if its command is not already on the PATH.
commands.Add(
    "if command -v " + dependency + " > /dev/null 2>&1 ; then\n" +
    " echo found\n" +
    "else\n" +
    " sudo apt-get -y install " + dependency + "\n" +
    "fi"
);
It should be something that checks for the apt dependency, not the command...
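A package-level check could look something like the bash sketch below, which asks dpkg whether the package itself is installed rather than looking for a command of the same name. It assumes a Debian or Ubuntu system where dpkg-query is available; the helper names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if the named apt package is fully installed.
is_apt_package_installed() {
    local status
    # dpkg-query exits non-zero for packages it does not know about.
    status="$(dpkg-query -W -f='${Status}' "$1" 2>/dev/null)" || return 1
    # Removed packages with leftover config report "deinstall ok config-files".
    [ "$status" = "install ok installed" ]
}

# Hypothetical helper: prints each listed package that still needs installing.
missing_apt_packages() {
    local pkg
    for pkg in "$@"; do
        is_apt_package_installed "$pkg" || echo "$pkg"
    done
}

# Possible use (not run here): install only what is missing.
#   missing="$(missing_apt_packages samtools bedtools)"
#   [ -z "$missing" ] || sudo apt-get -y install $missing
```

This also covers packages such as libraries that install no command at all, which the command -v test cannot detect.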
Limited it to 5 variants per transcript for now.
We should have a way to clean up all of the installed packages and return the bash commandline to the way it was before setting up Spritz.
using slncky and other tools
This is a good bug to note in a tutorial vid