pereiramemo / big-mex

BiG-MEx implementation as Docker images and R packages

License: GNU General Public License v3.0

Shell 6.79% HTML 93.04% R 0.17%

big-mex's Issues

Phylo placement repository

Create a repository dedicated to the phylogenetic placement of domain sequences onto reference trees.

Make running scripts more robust

Scripts should check the number of arguments they receive.
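A minimal sketch of such a check, assuming a bash wrapper that takes an input file and an output directory before its options. The function name `check_num_args` and the usage text are hypothetical and are not taken from the BiG-MEx scripts.

```shell
# Hypothetical helper (not part of BiG-MEx): verify that at least MIN
# arguments were passed before doing any work.
# Usage inside a script: check_num_args 2 "$@" || exit 1
check_num_args() {
  local min="$1"   # minimum number of required arguments
  shift            # the remaining parameters are the script's own arguments
  if [ "$#" -lt "${min}" ]; then
    echo "ERROR: expected at least ${min} arguments, got $#" >&2
    echo "Usage: <script> <input> <output_dir> [options]" >&2
    return 1
  fi
  return 0
}
```

Returning a status instead of calling `exit` inside the function leaves the decision to abort with the calling script.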

run_bgc_dom_annot.bash doesn't have a verbose option

In the wiki:

for i in $( echo "${SAMPLES}" ); do

  sudo "${BIN}/run_bgc_dom_annot.bash" \
  "${OUTPUT_DIR}/OSD${i}_ME_n_SE_shotgun_workable_merged.fastq.gz" \
  "${OUTPUT_DIR}/out_dom_annot_osd${i}" \
  --intype dna \
  --nslots 2 \
  --sample "osd${i}" \
  --verbose t

  echo "sample osd${i}"
  
done

and it returns:

WARN: Unknown option (ignored): --verbose
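One way the option could be accepted is sketched below. This is an assumption about how the parser might look, not the actual code of run_bgc_dom_annot.bash: the function name `parse_opts` and the variables `INTYPE`, `NSLOTS`, `SAMPLE` and `VERBOSE` are hypothetical, and the function is meant to be called with the arguments that remain after the two positional ones (input file and output directory).

```shell
# Hypothetical option parser (not the BiG-MEx implementation).
# Every recognised option takes exactly one value; anything else is
# reported with the same warning text as in the report above.
parse_opts() {
  INTYPE=""; NSLOTS=""; SAMPLE=""; VERBOSE="f"
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --intype|--nslots|--sample|--verbose)
        if [ "$#" -lt 2 ]; then
          echo "ERROR: option $1 requires a value" >&2
          return 1
        fi
        case "$1" in
          --intype)  INTYPE="$2" ;;
          --nslots)  NSLOTS="$2" ;;
          --sample)  SAMPLE="$2" ;;
          --verbose) VERBOSE="$2" ;;
        esac
        shift 2
        ;;
      *)
        echo "WARN: Unknown option (ignored): $1" >&2
        shift
        ;;
    esac
  done
}
```

With this sketch, `parse_opts --intype dna --nslots 2 --sample osd1 --verbose t` sets `VERBOSE` to `t` without emitting a warning.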

Behaviour when the output folder exists

If the output folder already exists, give the option to overwrite it via a parameter (useful when processing many files). At the moment the script just exits without a clear message.
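The requested behaviour could look roughly like the sketch below. The function name `prepare_outdir` and the `t`/`f` convention for the overwrite flag are assumptions made for illustration; the BiG-MEx scripts may handle this differently.

```shell
# Hypothetical helper (not part of BiG-MEx).
# prepare_outdir <dir> <overwrite: t|f>
# Creates <dir>; if it already exists, replaces it only when overwrite is "t",
# otherwise fails with an explicit message.
prepare_outdir() {
  local outdir="$1"
  local overwrite="${2:-f}"
  # Guard against an empty path or the filesystem root before any rm -rf.
  if [ -z "${outdir}" ] || [ "${outdir}" = "/" ]; then
    echo "ERROR: refusing to use '${outdir}' as output directory" >&2
    return 1
  fi
  if [ -d "${outdir}" ]; then
    if [ "${overwrite}" = "t" ]; then
      rm -rf -- "${outdir}"
    else
      echo "ERROR: output directory ${outdir} already exists; enable overwrite to replace it" >&2
      return 1
    fi
  fi
  mkdir -p -- "${outdir}"
}
```

Failing with a message by default keeps existing results safe, while the flag allows unattended reruns over many samples.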

Question Pertaining to FragGeneScanPlus

Hi,

I recently read your bioRxiv manuscript on BiG-MEx. I really liked the methods you developed and found the subsequent analyses super interesting as well!

I am attempting to follow the workflows detailed in the wiki and perform ORF calling on my metagenome samples; however, I am unsure which FragGeneScan code base to use.

The official FragGeneScanPlus appears to have an unresolved issue with the translation of reverse frames, as described in this ticket: hallamlab/FragGeneScanPlus#19. A more up-to-date fork of the repo, FragGeneScanPlusPlus [https://github.com/unipept/FragGeneScanPlusPlus], maintained by a different group, appears to resolve the issue and matches the output of the original FragGeneScan (Rho et al. 2010) for a test case (the one from the aforementioned ticket).

Can you please confirm if FragGeneScanPlusPlus [https://github.com/unipept/FragGeneScanPlusPlus] is the program you recommend to users for metagenomic ORF calling?

Kind regards,
Rauf

Fix epereira/ufbgctoolbox:bgc_dom_merge_div

Unable to find image 'epereira/ufbgctoolbox:bgc_dom_merge_div' locally
docker: Error response from daemon: manifest for epereira/ufbgctoolbox:bgc_dom_merge_div not found.
See 'docker run --help'.

it should be

epereira/ufbgctoolbox:bgc_dom_merged_div

instead of

epereira/ufbgctoolbox:bgc_dom_merge_div

Check warning in BGC profile prediction

Check this warning:
"In if (class(model_c) == "xgb.Booster") { ... :
the condition has length > 1 and only the first element will be used"

error when using run_bgc_dom_div.bash

Hello Emiliano.

run_bgc_dom_div.bash meta fails with the following error messages.
Is there any solution?

java -ea -Xmx198617m -cp /bioinfo/software/bbmap/current/ driver.FilterReadsByName in=/input/sample1-1.fq.gz in2=/input/sample1-2.fq.gz out=/scratch//out_dom_meta_div/redu_r1.fasta out2=/scratch//out_dom_meta_div/redu_r2.fasta names=/scratch//out_dom_meta_div/all_headers.list include=t overwrite=t
Executing driver.FilterReadsByName [in=/input/sample1-1.fq.gz, in2=/input/sample1-2.fq.gz, out=/scratch//out_dom_meta_div/redu_r1.fasta, out2=/scratch//out_dom_meta_div/redu_r2.fasta, names=/scratch//out_dom_meta_div/all_headers.list, include=t, overwrite=t]

Set INTERLEAVED to false
Exception in thread "main" java.lang.RuntimeException: Can't find file /input/sample1-1.fq.gz
at fileIO.ReadWrite.getRawInputStream(ReadWrite.java:880)
at fileIO.ReadWrite.getGZipInputStream(ReadWrite.java:973)
at fileIO.ReadWrite.getInputStream(ReadWrite.java:813)
at fileIO.FileFormat.getFirstOctet(FileFormat.java:407)
at stream.FASTQ.isInterleaved(FASTQ.java:126)
at stream.FastqReadInputStream.<init>(FastqReadInputStream.java:60)
at stream.ConcurrentReadInputStream.getReadInputStream(ConcurrentReadInputStream.java:119)
at stream.ConcurrentReadInputStream.getReadInputStream(ConcurrentReadInputStream.java:55)
at driver.FilterReadsByName.process(FilterReadsByName.java:256)
at driver.FilterReadsByName.main(FilterReadsByName.java:41)
filterbyname R1 and R2 failed
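The trace shows only that /input/sample1-1.fq.gz could not be found; it does not show why. A pre-flight check like the sketch below, run before the pipeline is launched, would report missing or unreadable inputs earlier and with a clearer message. The function name `check_inputs` is hypothetical and is not part of run_bgc_dom_div.bash.

```shell
# Hypothetical pre-flight check (not part of BiG-MEx).
# check_inputs <file>...
# Reports every argument that is not a readable regular file and
# returns 1 if at least one is missing.
check_inputs() {
  local f
  local status=0
  for f in "$@"; do
    if [ ! -f "${f}" ] || [ ! -r "${f}" ]; then
      echo "ERROR: input file not found or not readable: ${f}" >&2
      status=1
    fi
  done
  return "${status}"
}
```

A wrapper could call, for example, `check_inputs "${R1}" "${R2}" || exit 1`, where `R1` and `R2` are assumed variable names holding the paths of the paired read files.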

Update README with newest Docker container information

Hello,

We are trying to deploy BiG-MEx on our HPC cluster via Singularity, but some of the Docker images mentioned in the README and wiki are not available. The documentation also does not make clear which containers are actually relevant, because the container names in the wrapper scripts and in the README do not always match.

Not available (or not public?) images mentioned in README/Wiki

epereira/bgc_dom_div

Not available (or not public?) images mentioned in the corresponding scripts

Can you update the README with the newest information needed to deploy BiG-MEx?

Thanks in advance and best regards,
Felix

Amplicon data analysis: skip targeted assembly

Add an option to process amplicon data, without generating a targeted assembly.

Conda version

Hi,

Thanks for a nice tool. Is it possible to make it available via conda?

-Susheel

Wiki: OSD*_orfs.faa missing in metagenomic_samples.tar.gz

Hello,

I am trying to go through the Getting Started section of your wiki, but some files seem to be missing from metagenomic_samples.tar.gz.

The Wiki says:

However, for this tutorial, we included the predicted ORFs (i.e., OSD*_orfs.faa) in metagenomic_samples.tar.gz.

But there are no OSD*_orfs.faa files in the archive downloaded from: https://owncloud.mpi-bremen.de/index.php/s/fibkaNcmGifhLv9/download.

Best Regards,
Felix