statonlab / automated_annotation
Generate annotation reports for organisms
If we can find a way to tell when an organism was last annotated, we can let the admin specify how often FASTA files should be generated for those organisms.
Currently it takes almost 30 seconds per organism to find the list of available vocabularies. Is it possible to make it faster?
It could be useful to check whether there are publications, analyses, etc. attached to the organism.
We recently got an emailed report from this module stating that organisms that don't have AGL or DB:swissprot:display are a problem. I think this is happening because we haven't cleared the cache since we fixed this issue: statonlab/hardwoods_site#505
I'll clear the cache and try again.
Make sure that everything we need for an organism exists
The emailed annotations don't include KEGG. We should probably check for that as well.
The generated fasta files can go to the Staton server and we can remotely execute IPS/diamond ssh commands
We currently have a problem where the output FASTA file contains far too many features. Annotating these features takes a very long time. So far, for our HWG site, it has taken over a month!
The agreed-upon move is to allow the admin to limit the number of organisms that the fasta files contain. Any method to pick the top N is fine (i.e. ordering by organism_id should work).
We are supposed to get an email report every month detailing if there are organisms with missing annotations. Although the settings form has been configured correctly, I am still not getting an email.
My cron entry:
0 2 1 * * cd /var/www/html & drush annotations-check;
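One thing that may be worth checking in the entry above: in sh, a single `&` puts `cd` in the background, so the command after it does not run in /var/www/html. Whether that is what stops the email here is only a guess. The sketch below shows the difference using a temporary directory (an assumption for illustration, not the real site path):

```shell
#!/bin/sh
# Compare 'cd dir & cmd' with 'cd dir && cmd'.
# The temporary directory stands in for /var/www/html (illustrative only).
target=$(mktemp -d)
start=$(pwd)

# With '&', cd runs in a background subshell, so pwd still reports $start.
with_amp=$(sh -c "cd '$target' & pwd")

# With '&&', pwd runs only after cd has succeeded, so it reports $target.
with_and=$(sh -c "cd '$target' && pwd")

echo "using &  : $with_amp"
echo "using && : $with_and"

rmdir "$target"
```

If that turns out to be the issue, the entry would presumably read `0 2 1 * * cd /var/www/html && drush annotations-check`. Cron also tends to run with a minimal PATH, so it may be necessary to call drush by its full path.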
In #7 it came up that this module recycles the same analysis over and over for annotations.
It does this for practical reasons: it's a challenge to "clean up" after the old annotation set and to ensure we don't end up with multiple annotations.
That said, I posit that it's more correct to create multiple analyses. Each analysis refers to a specific run of a program: that's why the date run column is a primary column, and why in #7 we can't tell the last time the feature was annotated.
Creating multiple analyses would mean archiving the old analysis and deleting the corresponding annotations it loaded, and doing it in a way that doesn't mean we've lost the annotations for end users, which is a real challenge when we're talking about feature_cvterms. It's pretty straightforward, I think, for feature_analysis annotations.