smdabdoub / phylotoast

Tools for phylogenetic data analysis including visualization and cluster-computing support.

License: MIT License

Python 100.00%

python phylogenetics qiime visualization 16s microbiome microbiome-analysis microbial-genomics bioinformatics computational-biology

phylotoast's People

akshayparopkari vvaraljay bhawan1 achalneupane

phylotoast's Issues

How to remove/move legend in PCoA plots

Hello,

Is there an option to move or remove the legend when making PCoA plots? My legend box sits inside the plot and covers some of the plotted points.

Add error handling to otu_to_tax_name.py.

When --reverse_lookup is used, OTUs not found in the taxa file must raise an error and report a helpful suggestion to the user.
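One possible shape for this check is sketched below. It assumes the taxa file has already been loaded into a dict mapping names to OTU IDs; the function name `reverse_lookup`, the exception class `OTUNotFoundError`, and the sample data are all hypothetical and not taken from otu_to_tax_name.py.

```python
class OTUNotFoundError(LookupError):
    """Raised when requested entries are missing from the taxa file."""


def reverse_lookup(taxa_to_otu, names):
    """Map names back to OTU IDs, failing loudly on unknown names.

    `taxa_to_otu` is a plain dict standing in for the parsed taxa file
    (an assumption made for this sketch).
    """
    missing = [name for name in names if name not in taxa_to_otu]
    if missing:
        raise OTUNotFoundError(
            "{} entry(ies) not found in the taxa file: {}. Check that the "
            "names match the taxa file exactly and that the correct file "
            "was supplied.".format(len(missing), ", ".join(missing))
        )
    return [taxa_to_otu[name] for name in names]


# Hypothetical data for illustration only.
taxa_to_otu = {"Streptococcus_mitis": "OTU_12", "Veillonella_parvula": "OTU_7"}
found = reverse_lookup(taxa_to_otu, ["Veillonella_parvula"])
```

Collecting every missing name before raising lets the user fix all of them in one pass instead of rerunning the script once per bad entry.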

AttributeError: 'module' object has no attribute 'rstyle'

When attempting to run the PCoA_bubble.py script, I run into the below error:

Traceback (most recent call last):
  File "/usr/local/bin/PCoA_bubble.py", line 254, in <module>
    main()
  File "/usr/local/bin/PCoA_bubble.py", line 251, in main
    category_colors, xr, yr, args.output_dir)
  File "/usr/local/bin/PCoA_bubble.py", line 109, in plot_PCoA
    graph_util.rstyle(ax)
AttributeError: 'module' object has no attribute 'rstyle'

From my googling of the issue I'm assuming this has to do with matplotlib, but beyond that I'm helpless. As an aside, there are pretty major differences between the input of this script as described in the online documentation and the script as installed, but that could be the difference between the pip-installed version and the GitHub-installed version, correct?

My Environment

Version used: pip installed (So I assume up-to-date)
Operating System and version: Ubuntu 16.04.1 LTS

iTol.py: No warning given if mapping file sample IDs are not in the BIOM table

When calculating raw abundance (for example), if there is no overlap in sample IDs between the mapping file and the BIOM table, the output will be zero for each OTU.

While the mapping file is not strictly necessary for calculating raw abundance, it is for the other calculations. As such it is a required argument to the program. Additionally, for greater code simplicity the abundance calculation methods in the biom_calc module take a list of sample IDs. In the case of iTol.py calculating total OTU abundance, it just passes all the sample IDs in the mapping file to the abundance calculation methods. If none of those sample IDs are found in the BIOM table, nothing will be calculated and all zeroes will be output.

For these reasons, changing the code just for the raw abundance use case would add unnecessary complexity. Instead, a warning should be output if no overlap in sample IDs is found between the mapping file and the BIOM table.
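The proposed warning could look roughly like the sketch below. The helper name `check_sample_overlap` is hypothetical, and both inputs are assumed to be plain lists of sample ID strings rather than the actual mapping-file and BIOM objects.

```python
import warnings


def check_sample_overlap(map_sample_ids, biom_sample_ids):
    """Return the sample IDs shared by both inputs, warning if there are none."""
    shared = set(map_sample_ids) & set(biom_sample_ids)
    if not shared:
        warnings.warn(
            "No sample IDs in the mapping file were found in the BIOM table; "
            "all calculated abundances will be zero."
        )
    return shared


# Hypothetical sample IDs for illustration only.
shared = check_sample_overlap(["S1", "S2"], ["S2", "S3"])
```

Calling this once before the abundance calculation keeps the calculation methods themselves unchanged.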

Editing raw_abundance() in biom_calc.py

Changes to biom_calc.py file

- Split the raw_abundance() function into two functions: one to calculate total abundance, and another that takes the total abundance output and transforms it using a function.
  - By default, the transformation function would be the base-10 logarithm.
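A minimal sketch of the proposed split follows. It uses a plain dict of the form {otu_id: {sample_id: count}} in place of a real BIOM table, and the second function's name, `transform_raw_abundance`, is an assumption. Leaving zero totals untransformed is also a choice made here, not something the issue specifies.

```python
import math


def raw_abundance(table, sample_ids):
    """Sum each OTU's counts over the requested samples.

    `table` is a plain dict {otu_id: {sample_id: count}} standing in for
    a BIOM table (an assumption made for this sketch).
    """
    return {
        otu: sum(counts.get(sid, 0) for sid in sample_ids)
        for otu, counts in table.items()
    }


def transform_raw_abundance(table, sample_ids, fn=math.log10):
    """Apply `fn` (base-10 log by default) to each OTU's total abundance."""
    totals = raw_abundance(table, sample_ids)
    # log10(0) is undefined, so zero totals are reported as 0.0 instead
    # of being passed to `fn` (a design choice of this sketch).
    return {otu: fn(total) if total > 0 else 0.0 for otu, total in totals.items()}


# Hypothetical counts for illustration only.
table = {"OTU_1": {"S1": 90, "S2": 10}, "OTU_2": {"S1": 0, "S2": 0}}
totals = raw_abundance(table, ["S1", "S2"])
logged = transform_raw_abundance(table, ["S1", "S2"])
```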

Modify parse_map_file()

Either use a csv.DictReader or have the function return the header line along with the existing list of entries.

List of parse_map_file occurrences:
bin/PCoA_bubble.py: imap = util.parse_map_file(args.mapping)
bin/barcode_filter.py: barcodes = util.parse_map_file(mapFN, 1).keys()
bin/iTol.py: imap = util.parse_map_file(args.mapping)
bin/transpose_biom.py: mapping = util.parse_map_file(args.mapping)
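The csv.DictReader option might look something like this sketch. It takes an open file handle instead of a path and is named `parse_map_file_with_header` to make clear it is a hypothetical variant, not the existing util.parse_map_file. The column names in the example are also invented.

```python
import csv
import io


def parse_map_file_with_header(handle):
    """Return (header, entries) from a tab-delimited mapping file.

    `entries` maps each sample ID (the first column) to a dict of that
    row's fields keyed by column name.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames  # reads the first line on first access
    entries = {row[header[0]]: row for row in reader}
    return header, entries


# Hypothetical mapping file contents for illustration only.
example = io.StringIO(
    "#SampleID\tBarcodeSequence\tTreatment\n"
    "S1\tACGT\tcontrol\n"
    "S2\tTGCA\ttreated\n"
)
header, entries = parse_map_file_with_header(example)
```

Returning the header alongside the entries covers both suggestions in the issue at once, though every call site listed above would need to unpack the new return value.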

LDA/PCoA update

LDA
- Remove --input_data_type parameter.
  - Utilize argparse checks for relevant input files for bubble plots.
- Consolidate figure sizes for LDA and LDA bubble plots.
- Add optional 3D plotting functionality.
PCoA
- Allow users to input distance matrices for PCoA plotting purposes.
- Transfer PCoA computation to the PhyloToAST backend.
  - Allow users to calculate a variety of beta diversity metrics for downstream analysis.
  - Add a parameter that lists all the beta diversity metrics available to users.
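As an illustration of the PCoA items, the sketch below keeps beta diversity metrics in a small registry, so the list of available metrics and the distance-matrix calculation share one source. The registry name `BETA_METRICS`, the choice of Bray-Curtis and Jaccard, and the input layout {sample_id: counts} are assumptions, not PhyloToAST's actual backend.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two count vectors."""
    total = sum(u) + sum(v)
    if total == 0:
        return 0.0
    return sum(abs(a - b) for a, b in zip(u, v)) / total


def jaccard(u, v):
    """Jaccard distance computed on presence/absence."""
    present_u = {i for i, a in enumerate(u) if a > 0}
    present_v = {i for i, b in enumerate(v) if b > 0}
    union = present_u | present_v
    if not union:
        return 0.0
    return 1.0 - len(present_u & present_v) / len(union)


# Hypothetical registry: adding a metric here also makes it listable.
BETA_METRICS = {"bray_curtis": bray_curtis, "jaccard": jaccard}


def distance_matrix(samples, metric="bray_curtis"):
    """Return (sample_ids, matrix) of pairwise distances between samples."""
    if metric not in BETA_METRICS:
        raise ValueError(
            "Unknown metric '{}'. Available: {}".format(
                metric, ", ".join(sorted(BETA_METRICS))
            )
        )
    fn = BETA_METRICS[metric]
    ids = sorted(samples)
    return ids, [[fn(samples[a], samples[b]) for b in ids] for a in ids]


# Hypothetical OTU counts per sample for illustration only.
samples = {"S1": [10, 0, 5], "S2": [5, 5, 5]}
ids, dm = distance_matrix(samples)
```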

Integrating network analysis into PhyloToAST

Incremental contributions
- ability to create network graph objects and generate .gexf files for usage in Gephi, Cytoscape, etc.
- possibly generate network plots in Python, in addition to using external visualization tools mentioned above.
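A first increment toward .gexf output could be as small as the sketch below, which writes a minimal GEXF 1.2 document using only the standard library. The function name `build_gexf`, the node labels, and the edge weight are hypothetical. A graph library with a GEXF writer may be a better fit in practice.

```python
import xml.etree.ElementTree as ET

GEXF_NS = "http://www.gexf.net/1.2draft"


def build_gexf(nodes, edges):
    """Build a minimal GEXF 1.2 document as a string.

    nodes: iterable of node labels (e.g. OTU names).
    edges: iterable of (source_label, target_label, weight) tuples.
    """
    root = ET.Element("gexf", {"xmlns": GEXF_NS, "version": "1.2"})
    graph = ET.SubElement(
        root, "graph", {"mode": "static", "defaultedgetype": "undirected"}
    )
    nodes_el = ET.SubElement(graph, "nodes")
    edges_el = ET.SubElement(graph, "edges")
    ids = {}
    for i, label in enumerate(nodes):
        ids[label] = str(i)
        ET.SubElement(nodes_el, "node", {"id": str(i), "label": label})
    for i, (src, dst, weight) in enumerate(edges):
        ET.SubElement(
            edges_el,
            "edge",
            {"id": str(i), "source": ids[src], "target": ids[dst],
             "weight": str(weight)},
        )
    return ET.tostring(root, encoding="unicode")


# Hypothetical two-node network for illustration only.
gexf_text = build_gexf(["OTU_1", "OTU_2"], [("OTU_1", "OTU_2", 0.8)])
```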

Switch to Python3

Planned changes:
- Changing Python2 code syntax to align with Python3 syntax.
  - Tackling backward compatibility issues.
  - Continued usage of PEP8 and PEP257 guidelines.
- Checking and updating package dependencies.
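The kinds of syntax changes involved can be illustrated with a short sketch. The function `summarize` and its data are invented for this example. The comments note the Python 2 forms that would need updating.

```python
def summarize(counts):
    """Return (sorted sample IDs, mean count, per-sample fractions)."""
    # Python 2: counts.keys() returned a list. In Python 3 it is a view,
    # so sort or copy it explicitly when a list is needed.
    sample_ids = sorted(counts.keys())
    total = sum(counts.values())
    # Python 2: "/" between ints floors. In Python 3 "/" is true
    # division and "//" is floor division.
    mean = total / len(counts)
    # Python 2: counts.iteritems(). Python 3: counts.items().
    fractions = {sid: n / total for sid, n in counts.items()}
    return sample_ids, mean, fractions


# Hypothetical counts for illustration only.
sample_ids, mean, fractions = summarize({"S1": 3, "S2": 1})
# Python 2: print "mean:", mean. In Python 3, print is a function.
print("mean:", mean)
```

The .keys() change is relevant to call sites such as the one in bin/barcode_filter.py, where code that indexes or mutates the result would need an explicit list().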

Alpha diversity

diversity.py
- add a parameter which lists all the available alpha diversity metrics which users can use.
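One way such a parameter could work is sketched below with argparse. The flag names `--metric` and `--list_metrics`, the registry `ALPHA_METRICS`, and the two metrics shown are assumptions made for this example, not the current diversity.py interface.

```python
import argparse
import math


def observed_otus(counts):
    """Number of OTUs with a non-zero count."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon index using the natural logarithm."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical registry: the CLI choices and the listing both derive from it.
ALPHA_METRICS = {"observed_otus": observed_otus, "shannon": shannon}


def build_parser():
    parser = argparse.ArgumentParser(description="Alpha diversity sketch")
    parser.add_argument(
        "--metric", choices=sorted(ALPHA_METRICS), default="shannon",
        help="Alpha diversity metric to calculate.",
    )
    parser.add_argument(
        "--list_metrics", action="store_true",
        help="Print the available alpha diversity metrics and exit.",
    )
    return parser


# Parse an explicit argument list so the sketch runs without a command line.
args = build_parser().parse_args(["--list_metrics"])
if args.list_metrics:
    print("\n".join(sorted(ALPHA_METRICS)))
```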

Changes to biom_summary() function

Move biom_summary() function from otu_calc.py to biom_calc.py.

Update PCoA_bubble

The PCoA_bubble script was written very early on, and never updated along with the rest of the visualization scripts. The code needs to be cleaned up in general, but a few specific things need to be addressed:

- Remove dependence on the custom file format for specifying group names (mapping file column), colors (colors column in the mapping file), and OTU names (which can now be automatically generated by the otu_calc module)
- Use the relative abundance method now available in the biom_calc module
- Replace link_samples_to_categories() with the gather_categories method in the util module
- Make applying the ggplot2 style optional

PCoA.py and PCoA_bubble.py do not support the new UniFrac file format in QIIME 1.9

Removing relative_abundance() from otu_calc.py

Changes to otu_calc.py

- Delete the relative_abundance() function.
- Rewrite the assign_otu_membership() function to call relative_abundance() from biom_calc.py.