benkaehler / q2-clawback



This is a QIIME 2 plugin. For details on QIIME 2, see https://qiime2.org.

License: BSD 3-Clause "New" or "Revised" License

Python 98.71% TeX 0.74% Makefile 0.24% HTML 0.31%

q2-clawback's People

Contributors

Stargazers

Watchers

Forkers

nbokulich khemlalnirmalkar

q2-clawback's Issues

Cross-assembling weights for different reference databases causes issues

In some instances it is useful to assemble weights using amplicons from one set of primers, then use them to train a classifier for amplicons from a different set of primers.

If we do that at the moment, it causes issues downstream because the sets of taxa are slightly different for the trimmed reference data sets.

So, it would be good to have a utility that takes a training reference database and a target reference database and makes sure that the training reference database cannot generate taxa that are not in the target reference database.

The cross-reference database so assembled could then be used to train the uniform classifier that is responsible for assigning taxa prior to assembling weights.
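A minimal sketch of what such a utility might do, assuming each reference database can be reduced to a plain mapping from sequence ID to taxonomy string. The function name `restrict_to_target` and the toy taxonomies are hypothetical and are not part of q2-clawback:

```python
def restrict_to_target(training_taxonomy, target_taxonomy):
    """Drop training sequences whose taxon is absent from the target reference.

    Both arguments are assumed to be dicts of sequence ID -> taxonomy string.
    """
    target_taxa = set(target_taxonomy.values())
    return {
        seq_id: taxon
        for seq_id, taxon in training_taxonomy.items()
        if taxon in target_taxa
    }


# Toy data: the training reference contains one taxon the target lacks.
training = {
    "seq1": "k__Bacteria; p__Firmicutes",
    "seq2": "k__Bacteria; p__Proteobacteria",
    "seq3": "k__Archaea; p__Euryarchaeota",
}
target = {
    "seqA": "k__Bacteria; p__Firmicutes",
    "seqB": "k__Bacteria; p__Proteobacteria",
}

filtered = restrict_to_target(training, target)
```

With this filter applied, a classifier trained on `filtered` can only emit taxa that also exist in the target reference, which is the property the issue asks for. Matching on exact taxonomy strings is an assumption; a real utility might need to compare at a chosen taxonomic level instead.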

assemble-weights-from-Qiita: make more efficient

Evidently 400 GB of RAM is inadequate for assemble-weights-from-Qiita with the SILVA reference db + various EMPO3 types.

Either this command should be made more efficient, or it should expose options to control memory usage. E.g., I suspect the bespoke classifier training step is where we are running out of memory, so perhaps just expose reads-per-batch?
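To illustrate why a reads-per-batch option would bound memory, here is a generic sketch, not q2-clawback's actual code path. It assumes a hypothetical `classify` callable that takes a list of reads and returns one taxon label per read; only one batch of reads is held in memory at a time, and the running tallies are the only state that grows:

```python
from collections import Counter
from itertools import islice


def batched(reads, reads_per_batch):
    """Yield successive lists of at most reads_per_batch items from an iterable."""
    iterator = iter(reads)
    while True:
        batch = list(islice(iterator, reads_per_batch))
        if not batch:
            return
        yield batch


def count_taxa_in_batches(reads, classify, reads_per_batch=1000):
    """Tally taxon labels while keeping only one batch of reads in memory."""
    counts = Counter()
    for batch in batched(reads, reads_per_batch):
        counts.update(classify(batch))
    return counts


# Hypothetical stand-in classifier: labels each read by its first base.
def toy_classify(batch):
    return ["taxon_" + read[0] for read in batch]


toy_reads = (read for read in ["ACGT", "AGGT", "CCGT", "GATT", "ATTA"])
taxon_counts = count_taxa_in_batches(toy_reads, toy_classify, reads_per_batch=2)
```

A smaller `reads_per_batch` lowers peak memory at the cost of more calls to the classifier. Whether this would help depends on where the memory is actually going, which the issue leaves open.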

ENH: devise method for intelligent assignment of `unobserved_weight`

I am seeing some cases where the default `unobserved_weight` setting (1e-6) is actually higher than some observed weights (of very rare species, obviously).

It would be useful to automatically set this based on some intelligent approach. Some ideas:

- set to the lowest observed weight (or some value below that?)
- set to 1 / N, where N = the total number of sequences observed?

Maybe this does not really substantially impact performance, but I wonder if it could impact classification of rare SVs.
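The two ideas above can be compared with a small sketch. It assumes observed weights are simple relative frequencies derived from per-taxon sequence counts; the function name, the `scale` parameter, and the toy counts are hypothetical and not taken from q2-clawback:

```python
def candidate_unobserved_weights(observed_counts, scale=0.5):
    """Compute candidate values for unobserved_weight from per-taxon counts.

    observed_counts: dict of taxon -> number of sequences observed (assumed > 0
    for at least one taxon). scale is a hypothetical factor for picking a value
    below the lowest observed weight.
    """
    total = sum(observed_counts.values())
    weights = {
        taxon: count / total
        for taxon, count in observed_counts.items()
        if count > 0
    }
    lowest = min(weights.values())
    return {
        "lowest_observed": lowest,
        "below_lowest": lowest * scale,
        "one_over_n": 1.0 / total,
    }


# Toy counts in which the rarest taxon was seen once in four million sequences.
toy_counts = {"taxon_A": 3_000_000, "taxon_B": 999_999, "taxon_C": 1}
candidates = candidate_unobserved_weights(toy_counts)
default_unobserved_weight = 1e-6
default_exceeds_observed = default_unobserved_weight > candidates["lowest_observed"]
```

In this toy case the rarest taxon has a weight below the 1e-6 default, reproducing the situation described above. Note that when the rarest taxon was observed exactly once, the lowest observed weight and 1 / N coincide, so the two proposals only differ when every taxon has been seen more than once.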

