
traitar's People

Contributors

abremges, kmooren

traitar's Issues

error while running traitar phenotype

I get the following error when trying to use traitar. Not sure what's up. Please advise.

NM! fixed it!

error:

Traceback (most recent call last):
  File "/export/scratch/tward/miniconda3/envs/traitar/bin/traitar", line 492, in <module>
    args.func(args)
  File "/export/scratch/tward/miniconda3/envs/traitar/bin/traitar", line 32, in phenolyze
    p = Traitar(args.input_dir, args.output_dir, args.sample2file, args.cpus, args.rearrange_heatmap, args.heatmap_format, args.no_heatmap_phenotype_clustering, args.no_heatmap_sample_clustering, args.gene_gff_type, args.primary_models, args.secondary_models)
  File "/export/scratch/tward/miniconda3/envs/traitar/bin/traitar", line 71, in __init__
    self.s2f = self.parse_sample_f()
  File "/export/scratch/tward/miniconda3/envs/traitar/bin/traitar", line 129, in parse_sample_f
    if not os.path.exists(os.path.join(self.input_dir,i)):
  File "/export/scratch/tward/miniconda3/envs/traitar/lib/python2.7/posixpath.py", line 68, in join
    if b.startswith('/'):
AttributeError: 'float' object has no attribute 'startswith'

command:

traitar phenotype /home/data/shotgun_fastas /home/traitar_test/traitar_file.txt from_nucleotides /home/trait_test/traitar_output

The files in the /home/data/shotgun_fastas dir are:

ls /home/data/shotgun_fastas
5425.137.R1.fna  5636.141.R1.fna  5636.205.R1.fna  5636.268.R1.fna  5636.344.R1.fna  5636.384.R1.fna  5636.441.R1.fna  5636.484.R1.fna  5636.547.R1.fna  5636.59.R1.fna   5636.77.R1.fna
5425.249.R1.fna  5636.143.R1.fna  5636.224.R1.fna  5636.269.R1.fna  5636.347.R1.fna  5636.389.R1.fna  5636.446.R1.fna  5636.487.R1.fna  5636.54.R1.fna   5636.600.R1.fna  5636.80.R1.fna
5425.286.R1.fna  5636.144.R1.fna  5636.225.R1.fna  5636.278.R1.fna  5636.349.R1.fna  5636.393.R1.fna  5636.451.R1.fna  5636.488.R1.fna  5636.551.R1.fna  5636.602.R1.fna  5636.88.R1.fna
5425.297.R1.fna  5636.146.R1.fna  5636.227.R1.fna  5636.288.R1.fna  5636.350.R1.fna  5636.395.R1.fna  5636.456.R1.fna  5636.495.R1.fna  5636.552.R1.fna  5636.604.R1.fna  5636.92.R1.fna
5425.342.R1.fna  5636.153.R1.fna  5636.229.R1.fna  5636.290.R1.fna  5636.353.R1.fna  5636.397.R1.fna  5636.457.R1.fna  5636.497.R1.fna  5636.558.R1.fna  5636.60.R1.fna   combined_seqs.fna
5425.470.R1.fna  5636.155.R1.fna  5636.231.R1.fna  5636.299.R1.fna  5636.354.R1.fna  5636.398.R1.fna  5636.458.R1.fna  5636.517.R1.fna  5636.561.R1.fna  5636.614.R1.fna
5425.475.R1.fna  5636.158.R1.fna  5636.235.R1.fna  5636.304.R1.fna  5636.356.R1.fna  5636.420.R1.fna  5636.462.R1.fna  5636.51.R1.fna   5636.563.R1.fna  5636.617.R1.fna
5425.485.R1.fna  5636.186.R1.fna  5636.239.R1.fna  5636.305.R1.fna  5636.365.R1.fna  5636.421.R1.fna  5636.463.R1.fna  5636.524.R1.fna  5636.577.R1.fna  5636.618.R1.fna
5425.550.R1.fna  5636.187.R1.fna  5636.243.R1.fna  5636.312.R1.fna  5636.367.R1.fna  5636.422.R1.fna  5636.468.R1.fna  5636.526.R1.fna  5636.584.R1.fna  5636.628.R1.fna
5636.118.R1.fna  5636.193.R1.fna  5636.244.R1.fna  5636.317.R1.fna  5636.370.R1.fna  5636.424.R1.fna  5636.472.R1.fna  5636.529.R1.fna  5636.587.R1.fna  5636.638.R1.fna
5636.121.R1.fna  5636.194.R1.fna  5636.248.R1.fna  5636.327.R1.fna  5636.372.R1.fna  5636.432.R1.fna  5636.476.R1.fna  5636.530.R1.fna  5636.589.R1.fna  5636.640.R1.fna
5636.126.R1.fna  5636.195.R1.fna  5636.258.R1.fna  5636.330.R1.fna  5636.378.R1.fna  5636.434.R1.fna  5636.478.R1.fna  5636.536.R1.fna  5636.58.R1.fna   5636.643.R1.fna
5636.127.R1.fna  5636.198.R1.fna  5636.263.R1.fna  5636.332.R1.fna  5636.380.R1.fna  5636.436.R1.fna  5636.479.R1.fna  5636.53.R1.fna   5636.591.R1.fna  5636.646.R1.fna
5636.131.R1.fna  5636.200.R1.fna  5636.264.R1.fna  5636.334.R1.fna  5636.381.R1.fna  5636.439.R1.fna  5636.483.R1.fna  5636.543.R1.fna  5636.598.R1.fna  5636.655.R1.fna

Attached is the traitar_file.txt
traitar_file.txt

Print version

To be included in pipelines, or to reproduce results, a (semantic) version number would be great.
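
One way this could look, as a minimal sketch: argparse ships a built-in "version" action that prints a string and exits. The `__version__` value and the `-v` short flag below are placeholders for illustration, not something taken from the traitar code.

```python
import argparse

# Hypothetical version string; the real value would come from the package.
__version__ = "0.0.0"


def build_parser():
    parser = argparse.ArgumentParser(prog="traitar")
    # The built-in "version" action prints the string and exits with code 0.
    parser.add_argument(
        "-v", "--version",
        action="version",
        version="%(prog)s " + __version__,
    )
    return parser


if __name__ == "__main__":
    build_parser().parse_args()
```

A pipeline could then record the output of `traitar --version` next to its results.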

urllib2 stream bug

File "/home/aweimann/.local/lib/python2.7/site-packages/traitar/get_external_data.py", line 15, in download
response = urllib2.urlopen("ftp://ftp.ebi.ac.uk/pub/databases/Pfam/releases/Pfam27.0/Pfam-A.hmm.gz", timeout = 5, stream = True)
TypeError: urlopen() got an unexpected keyword argument 'stream'
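
For context: `stream=True` is a parameter of the third-party requests library, not of `urlopen()`, which already returns a file-like object that can be read in chunks. A minimal sketch of a chunked download follows. It is written for Python 3 (`urllib.request`), whereas the traceback above comes from Python 2.7 (`urllib2`), and the function signature is an assumption, not the actual `get_external_data.py` code.

```python
import shutil
import urllib.request


def download(url, dest_path, timeout=5):
    """Copy the resource at `url` to `dest_path` without loading it into memory."""
    # urlopen() accepts `timeout` but no `stream` argument; the response
    # object is file-like, so copyfileobj() reads it chunk by chunk.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        with open(dest_path, "wb") as out:
            shutil.copyfileobj(response, out)
    return dest_path
```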

feature track generation

Although traitar is being run on the target directory for the first time, the following prompt appears:

output dir traitar_out/phenotype_prediction/phypat+PGL/feat_gffs already exists; press 1 to continue with data from a previous run; press 2 to remove this directory; press 3 to abort

Also, GNU parallel is run suspiciously often.
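
A possible guard, sketched under the assumption that the prompt should only appear when the directory really holds output from an earlier run. The helper name is made up for illustration.

```python
import os


def has_previous_results(out_dir):
    """Return True only if out_dir exists and contains at least one entry."""
    return os.path.isdir(out_dir) and len(os.listdir(out_dir)) > 0
```

With a check like this, a directory that was just created (and is still empty) would not trigger the continue/remove/abort prompt.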

legend placement

Reconsider the placement of the legends in the heatmap plot, i.e. swap the sample color key and the phenotype color key.

requirements

@kmooren reported that dependencies are installed although already met by system packages.

67 Traits?

Probably doesn't belong here, but a list of all traits (plus their pfam yes/no combinations) would be awesome.

improve heatmap

  • combined heatmap for phypat and phypat+GGL with 4 different colors.
  • use discrete colors instead of color gradient.

ERROR: reduce the number of sample categories to less than 15

Hi,

I have installed v1.1.2 on my local cluster and it works with the sample data. When I run it on my own data, however, I get the above message even though I have only 14 samples and categories.

I have noticed the following in the traitar code though at line 140:

if len(uq) > 12:
    sys.exit("reduce the number of sample categories to less than 15")

Is this a typo?

I am currently attempting to run with 9 samples and categories which seems to have got further.

Thanks,

Matt
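
Whichever limit is intended, the check and the message can be kept in sync by deriving both from one constant. This is only a sketch: the value 12 is copied from the quoted `> 12` check and may not be the intended limit, and the function name is hypothetical.

```python
# Assumed limit, copied from the `> 12` check quoted above.
MAX_SAMPLE_CATEGORIES = 12


def check_category_count(categories):
    """Raise if there are more distinct categories than the limit allows."""
    unique = set(categories)
    if len(unique) > MAX_SAMPLE_CATEGORIES:
        raise ValueError(
            "found %d sample categories; reduce the number to at most %d"
            % (len(unique), MAX_SAMPLE_CATEGORIES)
        )
    return len(unique)
```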

Pfam track generation

In addition to the Pfam feature track generation relevant for specific phenotypes, generate one track for the entire Pfam annotation.

Error during execution

The prodigal step finished fine but soon after I got the message about the Pfam annotation, I got a very long error:
running Pfam annotation with hmmer. This step can take a while. A rough estimate for sequential Pfam annotation of genome samples of ~3 Mbs is 10 min per genome.
Traceback (most recent call last):
File "/usr/local/bin/hmmer2filtered_best.py", line 64, in <module>
filtered_df = apply_thresholds(args.infile_f, args.eval_thresh, args.bit_score_thresh, args.out_filt_f, args.out_excl_f)
File "/usr/local/bin/hmmer2filtered_best.py", line 24, in apply_thresholds
m_eval = m.loc[(m.iloc[:,12] <= eval_threshold) & (m.iloc[:, 13] >= bit_score_thresh), :]
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1026, in __getitem__
return self._getitem_tuple(key)
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 617, in _getitem_tuple
retval = getattr(retval, self.name)._getitem_axis(key, axis=i)
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1143, in _getitem_axis
return self._getbool_axis(key, axis=axis)
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1040, in _getbool_axis
raise self._exception(detail)
KeyError: AssertionError("Cannot create BlockManager._ref_locs because block [FloatBlock: [E-value, score, bias, c-Evalue, i-Evalue, score, bias, acc], 8 x 3162, dtype: float64] with duplicate items [Index([u'target name', u'accession', u'tlen', u'query name', u'accession', u'qlen', u'E-value', u'score', u'bias', u'#', u'of', u'c-Evalue', u'i-Evalue', u'score', u'bias', u'from', u'to', u'from', u'to', u'from', u'to', u'acc', u'description of target'], dtype='object')] does not have _ref_locs set",)
Traceback (most recent call last):
File "/usr/local/bin/hmmer2filtered_best.py", line 64, in <module>
filtered_df = apply_thresholds(args.infile_f, args.eval_thresh, args.bit_score_thresh, args.out_filt_f, args.out_excl_f)
File "/usr/local/bin/hmmer2filtered_best.py", line 24, in apply_thresholds
m_eval = m.loc[(m.iloc[:,12] <= eval_threshold) & (m.iloc[:, 13] >= bit_score_thresh), :]
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1026, in __getitem__
return self._getitem_tuple(key)
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 617, in _getitem_tuple
retval = getattr(retval, self.name)._getitem_axis(key, axis=i)
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1143, in _getitem_axis
return self._getbool_axis(key, axis=axis)
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1040, in _getbool_axis
raise self._exception(detail)
KeyError: AssertionError("Cannot create BlockManager._ref_locs because block [FloatBlock: [E-value, score, bias, c-Evalue, i-Evalue, score, bias, acc], 8 x 3446, dtype: float64] with duplicate items [Index([u'target name', u'accession', u'tlen', u'query name', u'accession', u'qlen', u'E-value', u'score', u'bias', u'#', u'of', u'c-Evalue', u'i-Evalue', u'score', u'bias', u'from', u'to', u'from', u'to', u'from', u'to', u'acc', u'description of target'], dtype='object')] does not have _ref_locs set",)
Traceback (most recent call last):
File "/usr/local/bin/hmmer2filtered_best.py", line 64, in <module>
filtered_df = apply_thresholds(args.infile_f, args.eval_thresh, args.bit_score_thresh, args.out_filt_f, args.out_excl_f)
File "/usr/local/bin/hmmer2filtered_best.py", line 24, in apply_thresholds
m_eval = m.loc[(m.iloc[:,12] <= eval_threshold) & (m.iloc[:, 13] >= bit_score_thresh), :]
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1026, in __getitem__
return self._getitem_tuple(key)
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 617, in _getitem_tuple
retval = getattr(retval, self.name)._getitem_axis(key, axis=i)
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1143, in _getitem_axis
return self._getbool_axis(key, axis=axis)
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1040, in _getbool_axis
raise self._exception(detail)
KeyError: AssertionError("Cannot create BlockManager._ref_locs because block [FloatBlock: [E-value, score, bias, c-Evalue, i-Evalue, score, bias, acc], 8 x 2808, dtype: float64] with duplicate items [Index([u'target name', u'accession', u'tlen', u'query name', u'accession', u'qlen', u'E-value', u'score', u'bias', u'#', u'of', u'c-Evalue', u'i-Evalue', u'score', u'bias', u'from', u'to', u'from', u'to', u'from', u'to', u'acc', u'description of target'], dtype='object')] does not have _ref_locs set",)
Traceback (most recent call last):
File "/usr/local/bin/hmmer2filtered_best.py", line 64, in <module>
filtered_df = apply_thresholds(args.infile_f, args.eval_thresh, args.bit_score_thresh, args.out_filt_f, args.out_excl_f)
File "/usr/local/bin/hmmer2filtered_best.py", line 24, in apply_thresholds
m_eval = m.loc[(m.iloc[:,12] <= eval_threshold) & (m.iloc[:, 13] >= bit_score_thresh), :]
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1026, in __getitem__
return self._getitem_tuple(key)
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 617, in _getitem_tuple
retval = getattr(retval, self.name)._getitem_axis(key, axis=i)
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1143, in _getitem_axis
return self._getbool_axis(key, axis=axis)
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1040, in _getbool_axis
raise self._exception(detail)
KeyError: AssertionError("Cannot create BlockManager._ref_locs because block [FloatBlock: [E-value, score, bias, c-Evalue, i-Evalue, score, bias, acc], 8 x 2219, dtype: float64] with duplicate items [Index([u'target name', u'accession', u'tlen', u'query name', u'accession', u'qlen', u'E-value', u'score', u'bias', u'#', u'of', u'c-Evalue', u'i-Evalue', u'score', u'bias', u'from', u'to', u'from', u'to', u'from', u'to', u'acc', u'description of target'], dtype='object')] does not have _ref_locs set",)
Traceback (most recent call last):
File "/usr/local/bin/hmmer2filtered_best.py", line 64, in <module>
filtered_df = apply_thresholds(args.infile_f, args.eval_thresh, args.bit_score_thresh, args.out_filt_f, args.out_excl_f)
File "/usr/local/bin/hmmer2filtered_best.py", line 24, in apply_thresholds
m_eval = m.loc[(m.iloc[:,12] <= eval_threshold) & (m.iloc[:, 13] >= bit_score_thresh), :]
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1026, in __getitem__
return self._getitem_tuple(key)
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 617, in _getitem_tuple
retval = getattr(retval, self.name)._getitem_axis(key, axis=i)
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1143, in _getitem_axis
return self._getbool_axis(key, axis=axis)
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1040, in _getbool_axis
raise self._exception(detail)
KeyError: AssertionError("Cannot create BlockManager._ref_locs because block [FloatBlock: [E-value, score, bias, c-Evalue, i-Evalue, score, bias, acc], 8 x 2917, dtype: float64] with duplicate items [Index([u'target name', u'accession', u'tlen', u'query name', u'accession', u'qlen', u'E-value', u'score', u'bias', u'#', u'of', u'c-Evalue', u'i-Evalue', u'score', u'bias', u'from', u'to', u'from', u'to', u'from', u'to', u'acc', u'description of target'], dtype='object')] does not have _ref_locs set",)
Traceback (most recent call last):
File "/usr/local/bin/hmmer2filtered_best.py", line 64, in <module>
filtered_df = apply_thresholds(args.infile_f, args.eval_thresh, args.bit_score_thresh, args.out_filt_f, args.out_excl_f)
File "/usr/local/bin/hmmer2filtered_best.py", line 24, in apply_thresholds
m_eval = m.loc[(m.iloc[:,12] <= eval_threshold) & (m.iloc[:, 13] >= bit_score_thresh), :]
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1026, in __getitem__
return self._getitem_tuple(key)
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 617, in _getitem_tuple
retval = getattr(retval, self.name)._getitem_axis(key, axis=i)
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1143, in _getitem_axis
return self._getbool_axis(key, axis=axis)
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1040, in _getbool_axis
raise self._exception(detail)
KeyError: AssertionError("Cannot create BlockManager._ref_locs because block [FloatBlock: [E-value, score, bias, c-Evalue, i-Evalue, score, bias, acc], 8 x 2156, dtype: float64] with duplicate items [Index([u'target name', u'accession', u'tlen', u'query name', u'accession', u'qlen', u'E-value', u'score', u'bias', u'#', u'of', u'c-Evalue', u'i-Evalue', u'score', u'bias', u'from', u'to', u'from', u'to', u'from', u'to', u'acc', u'description of target'], dtype='object')] does not have _ref_locs set",)
Traceback (most recent call last):
File "/usr/local/bin/hmmer2filtered_best.py", line 64, in <module>
filtered_df = apply_thresholds(args.infile_f, args.eval_thresh, args.bit_score_thresh, args.out_filt_f, args.out_excl_f)
File "/usr/local/bin/hmmer2filtered_best.py", line 24, in apply_thresholds
m_eval = m.loc[(m.iloc[:,12] <= eval_threshold) & (m.iloc[:, 13] >= bit_score_thresh), :]
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1026, in __getitem__
return self._getitem_tuple(key)
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 617, in _getitem_tuple
retval = getattr(retval, self.name)._getitem_axis(key, axis=i)
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1143, in _getitem_axis
return self._getbool_axis(key, axis=axis)
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1040, in _getbool_axis
raise self._exception(detail)
KeyError: AssertionError("Cannot create BlockManager._ref_locs because block [FloatBlock: [E-value, score, bias, c-Evalue, i-Evalue, score, bias, acc], 8 x 1826, dtype: float64] with duplicate items [Index([u'target name', u'accession', u'tlen', u'query name', u'accession', u'qlen', u'E-value', u'score', u'bias', u'#', u'of', u'c-Evalue', u'i-Evalue', u'score', u'bias', u'from', u'to', u'from', u'to', u'from', u'to', u'acc', u'description of target'], dtype='object')] does not have _ref_locs set",)
Traceback (most recent call last):
File "/usr/local/bin/hmmer2filtered_best.py", line 64, in <module>
filtered_df = apply_thresholds(args.infile_f, args.eval_thresh, args.bit_score_thresh, args.out_filt_f, args.out_excl_f)
File "/usr/local/bin/hmmer2filtered_best.py", line 24, in apply_thresholds
m_eval = m.loc[(m.iloc[:,12] <= eval_threshold) & (m.iloc[:, 13] >= bit_score_thresh), :]
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1026, in __getitem__
return self._getitem_tuple(key)
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 617, in _getitem_tuple
retval = getattr(retval, self.name)._getitem_axis(key, axis=i)
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1143, in _getitem_axis
return self._getbool_axis(key, axis=axis)
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1040, in _getbool_axis
raise self._exception(detail)
KeyError: AssertionError("Cannot create BlockManager._ref_locs because block [FloatBlock: [E-value, score, bias, c-Evalue, i-Evalue, score, bias, acc], 8 x 2299, dtype: float64] with duplicate items [Index([u'target name', u'accession', u'tlen', u'query name', u'accession', u'qlen', u'E-value', u'score', u'bias', u'#', u'of', u'c-Evalue', u'i-Evalue', u'score', u'bias', u'from', u'to', u'from', u'to', u'from', u'to', u'acc', u'description of target'], dtype='object')] does not have _ref_locs set",)
ls: cannot access /home/maria/Desktop/Traitar_test_20160425/pfam_annotation/*_filtered_best.dat: No such file or directory
running phenotype prediction
Traceback (most recent call last):
File "/usr/local/bin/predict.py", line 110, in <module>
annotate_and_predict((pt1, pt2), tarfile.open(args.model_tar, mode = "r:gz"), args.annotation_matrix,args.pfam_pts_mapping_f, args.out_dir, args.voters)
File "/usr/local/bin/predict.py", line 88, in annotate_and_predict
aggr_dfs = aggregate(pred_df, k)
File "/usr/local/bin/predict.py", line 31, in aggregate
maj_pred_dfs[0].iloc[:,i / k] = pred_df.iloc[:, i: i + k].apply(filter_pred, axis = 1, is_majority = True, k = k)
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 98, in __setitem__
self._setitem_with_indexer(indexer, value)
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 416, in _setitem_with_indexer
value = self._align_frame(indexer, value)
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 587, in _align_frame
raise ValueError('Incompatible indexer with DataFrame')
ValueError: Incompatible indexer with DataFrame
Traceback (most recent call last):
File "/usr/local/bin/predict.py", line 110, in <module>
annotate_and_predict((pt1, pt2), tarfile.open(args.model_tar, mode = "r:gz"), args.annotation_matrix,args.pfam_pts_mapping_f, args.out_dir, args.voters)
File "/usr/local/bin/predict.py", line 88, in annotate_and_predict
aggr_dfs = aggregate(pred_df, k)
File "/usr/local/bin/predict.py", line 31, in aggregate
maj_pred_dfs[0].iloc[:,i / k] = pred_df.iloc[:, i: i + k].apply(filter_pred, axis = 1, is_majority = True, k = k)
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 98, in __setitem__
self._setitem_with_indexer(indexer, value)
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 416, in _setitem_with_indexer
value = self._align_frame(indexer, value)
File "/usr/lib/python2.7/dist-packages/pandas/core/indexing.py", line 587, in _align_frame
raise ValueError('Incompatible indexer with DataFrame')
ValueError: Incompatible indexer with DataFrame
Traceback (most recent call last):
File "/usr/local/bin/merge_preds.py", line 79, in <module>
comb_preds(args.phypat_dir, args.phypat_GGL_dir, args.out_dir, args.voters)
File "/usr/local/bin/merge_preds.py", line 19, in comb_preds
m1_scores = ps.read_csv("%s/predictions_majority-vote_mean-score.txt"%phypat_dir, index_col = 0, sep = "\t")
File "/usr/lib/python2.7/dist-packages/pandas/io/parsers.py", line 420, in parser_f
return _read(filepath_or_buffer, kwds)
File "/usr/lib/python2.7/dist-packages/pandas/io/parsers.py", line 218, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
File "/usr/lib/python2.7/dist-packages/pandas/io/parsers.py", line 502, in __init__
self._make_engine(self.engine)
File "/usr/lib/python2.7/dist-packages/pandas/io/parsers.py", line 610, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/usr/lib/python2.7/dist-packages/pandas/io/parsers.py", line 972, in __init__
self._reader = _parser.TextReader(src, **kwds)
File "parser.pyx", line 330, in pandas.parser.TextReader.__cinit__ (pandas/parser.c:3200)
File "parser.pyx", line 557, in pandas.parser.TextReader._setup_parser_source (pandas/parser.c:5559)
IOError: File /home/maria/Desktop/Traitar_test_20160425/phenotype_prediction/phypat/predictions_majority-vote_mean-score.txt does not exist
running feature track generation
Traceback (most recent call last):
File "/usr/local/bin/traitar", line 329, in <module>
args.func(args)
File "/usr/local/bin/traitar", line 19, in phenolyze
p.run(args.mode)
File "/usr/local/bin/traitar", line 164, in run
self.run_feature_track_generation(self.s2f.loc[:,"sample_name"], mode)
File "/usr/local/bin/traitar", line 249, in run_feature_track_generation
phypat_preds = ps.read_csv(os.path.join(self.phypat_dir, "predictions_majority-vote.txt"), index_col = 0, sep = "\t")
File "/usr/lib/python2.7/dist-packages/pandas/io/parsers.py", line 420, in parser_f
return _read(filepath_or_buffer, kwds)
File "/usr/lib/python2.7/dist-packages/pandas/io/parsers.py", line 218, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
File "/usr/lib/python2.7/dist-packages/pandas/io/parsers.py", line 502, in __init__
self._make_engine(self.engine)
File "/usr/lib/python2.7/dist-packages/pandas/io/parsers.py", line 610, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/usr/lib/python2.7/dist-packages/pandas/io/parsers.py", line 972, in __init__
self._reader = _parser.TextReader(src, **kwds)
File "parser.pyx", line 330, in pandas.parser.TextReader.__cinit__ (pandas/parser.c:3200)
File "parser.pyx", line 557, in pandas.parser.TextReader._setup_parser_source (pandas/parser.c:5559)
IOError: File /home/maria/Desktop/Traitar_test_20160425/phenotype_prediction/phypat/predictions_majority-vote.txt does not exist
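
The first error message points at duplicate column labels ('accession', 'score', 'bias', 'from', 'to' each appear more than once in the index). Whether that is the actual root cause here is a guess, but one way to sidestep it is to give every column a unique name before filtering. The sketch below does this with the standard library only. The column names, the assumption that lines starting with '#' are comments, and the thresholds in the usage note are illustrative and not taken from `hmmer2filtered_best.py`. The filtered columns correspond to positions 12 and 13 used in the traceback.

```python
# Unique names for the 23 whitespace-separated columns listed in the error
# message above; the names themselves are invented for this sketch.
COLUMNS = [
    "target_name", "target_accession", "tlen",
    "query_name", "query_accession", "qlen",
    "full_evalue", "full_score", "full_bias",
    "dom_number", "dom_total", "c_evalue", "i_evalue",
    "dom_score", "dom_bias",
    "hmm_from", "hmm_to", "ali_from", "ali_to", "env_from", "env_to",
    "acc", "description",
]


def parse_hits(lines):
    """Turn table lines into dicts keyed by the unique column names."""
    hits = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        # Split on whitespace but keep the free-text description in one piece.
        fields = line.split(None, len(COLUMNS) - 1)
        hits.append(dict(zip(COLUMNS, (f.strip() for f in fields))))
    return hits


def apply_thresholds(hits, eval_threshold, bit_score_threshold):
    """Keep hits whose column 12 (i_evalue) and column 13 (dom_score) pass."""
    return [
        hit for hit in hits
        if float(hit["i_evalue"]) <= eval_threshold
        and float(hit["dom_score"]) >= bit_score_threshold
    ]
```

Usage would be along the lines of `apply_thresholds(parse_hits(open(path)), 1e-5, 25.0)`, with `path` and both numbers chosen by the caller.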

Heatmap

Introduce heatmap with dendrogram as a primary visualization of the output.

example data

Change the example data to strains that are in GIDEON but weren't used for the training. This way we have the actual labels in case somebody wants to check the predictions.

Print smth. before the hmmsearch step

As this step takes quite a while, write something to stderr before you start it.
prodigal finishes rapidly, while hmmsearch takes forever (at least w/o parallel).

Running hmmsearch to search against Pfam HMMs, this might take a while!

Undocumented sample file

Document what the sample file should look like in the repo's README (and ideally in the program's help, too). From what I can see, it seems to be just a TSV: filename\tsome_name\n
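
Until the format is documented, a small validator can catch the most common mistakes early, for example an empty cell in the file-name column. This is a sketch under assumptions: `sample_name` appears in a traceback elsewhere on this page, but the `sample_file_name` header and the function itself are hypothetical.

```python
import csv
import os


def read_sample_file(handle, input_dir=None):
    """Read a tab-separated sample table and fail early on empty cells.

    Assumes a header row with the columns sample_file_name and sample_name.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    samples = []
    for row in reader:
        # Short rows yield None for missing cells, hence the `or ""`.
        file_name = (row.get("sample_file_name") or "").strip()
        sample_name = (row.get("sample_name") or "").strip()
        if not file_name or not sample_name:
            raise ValueError(
                "line %d: sample_file_name and sample_name must both be set"
                % reader.line_num
            )
        if input_dir is not None:
            path = os.path.join(input_dir, file_name)
            if not os.path.exists(path):
                raise ValueError("line %d: %s not found" % (reader.line_num, path))
        samples.append((file_name, sample_name))
    return samples
```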

Optimize config

Optimize Pfam HMM db download in config mode and check config if the program is started in phenotype mode.

traitAR stdout

  • Make sure only useful output gets to the user e.g. discard Prodigal stdout, heatmap logs etc.

Review paths

Review how external scripts, data models etc. are referenced to make sure that the program can be executed from any location.
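
A common pattern for this is to resolve bundled files relative to the module that needs them rather than to the current working directory. A minimal sketch follows; the helper name and the example file names in the usage note are made up.

```python
import os


def resource_path(module_file, *parts):
    """Build an absolute path relative to the directory containing module_file."""
    base_dir = os.path.dirname(os.path.abspath(module_file))
    return os.path.join(base_dir, *parts)
```

Inside a package module this would be called as `resource_path(__file__, "data", "models.tar.gz")`, which gives the same result no matter where the program is started from.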

feature track generation

Breaks if the continue option is selected when the "directory already exists" prompt appears but there are no previous results.

Heatmap generation fails

Hi,

I'm using the v.1.11 release, installed in a virtual environment alongside the Python dependencies, using Python 2.7.3.

It works until heatmap generation, I get the following error message:

[...]
running heatmap generation
traitar/local/lib/python2.7/site-packages/matplotlib/colorbar.py:789: RuntimeWarning: divide by zero encountered in double_scalars
  automin = (y[2] - y[1]) / clen
traitar/local/lib/python2.7/site-packages/matplotlib/colorbar.py:790: RuntimeWarning: divide by zero encountered in double_scalars
  automax = (y[-2] - y[-3]) / clen
Traceback (most recent call last):
  File "traitar/bin/heatmap.py", line 482, in <module>
    heatmap(matrix, row_header, column_header, primary_pt_models, args.color_f, args.row_method, args.column_method, args.row_metric, args.column_metric,   args.out_f, args.sample_f, secondary_pt_models)
  File "traitar/bin/heatmap.py", line 288, in heatmap
    cb = mpl.colorbar.ColorbarBase(axsl, cmap=cmap_p, norm=norm, spacing='proportional', ticks=bounds, boundaries=bounds)
  File "traitar/local/lib/python2.7/site-packages/matplotlib/colorbar.py", line 323, in __init__
    self.draw_all()
  File "traitar/local/lib/python2.7/site-packages/matplotlib/colorbar.py", line 346, in draw_all
    X, Y = self._mesh()
  File "traitar/local/lib/python2.7/site-packages/matplotlib/colorbar.py", line 814, in _mesh
    y = self._proportional_y()
  File "traitar/local/lib/python2.7/site-packages/matplotlib/colorbar.py", line 789, in _proportional_y
    automin = (y[2] - y[1]) / clen
IndexError: index 2 is out of bounds for axis 0 with size 2
traitar/local/lib/python2.7/site-packages/matplotlib/colorbar.py:789: RuntimeWarning: divide by zero encountered in double_scalars
  automin = (y[2] - y[1]) / clen
traitar/local/lib/python2.7/site-packages/matplotlib/colorbar.py:790: RuntimeWarning: divide by zero encountered in double_scalars
  automax = (y[-2] - y[-3]) / clen
Traceback (most recent call last):
  File "traitar/bin/heatmap.py", line 482, in <module>
    heatmap(matrix, row_header, column_header, primary_pt_models, args.color_f, args.row_method, args.column_method, args.row_metric, args.column_metric,   args.out_f, args.sample_f, secondary_pt_models)
  File "traitar/bin/heatmap.py", line 288, in heatmap
    cb = mpl.colorbar.ColorbarBase(axsl, cmap=cmap_p, norm=norm, spacing='proportional', ticks=bounds, boundaries=bounds)
  File "traitar/local/lib/python2.7/site-packages/matplotlib/colorbar.py", line 323, in __init__
    self.draw_all()
  File "traitar/local/lib/python2.7/site-packages/matplotlib/colorbar.py", line 346, in draw_all
    X, Y = self._mesh()
  File "traitar/local/lib/python2.7/site-packages/matplotlib/colorbar.py", line 814, in _mesh
    y = self._proportional_y()
  File "traitar/local/lib/python2.7/site-packages/matplotlib/colorbar.py", line 789, in _proportional_y
    automin = (y[2] - y[1]) / clen
IndexError: index 2 is out of bounds for axis 0 with size 2
Traceback (most recent call last):
  File "traitar/bin/heatmap.py", line 482, in <module>
    heatmap(matrix, row_header, column_header, primary_pt_models, args.color_f, args.row_method, args.column_method, args.row_metric, args.column_metric,   args.out_f, args.sample_f, secondary_pt_models)
  File "traitar/bin/heatmap.py", line 288, in heatmap
    cb = mpl.colorbar.ColorbarBase(axsl, cmap=cmap_p, norm=norm, spacing='proportional', ticks=bounds, boundaries=bounds)
  File "traitar/local/lib/python2.7/site-packages/matplotlib/colorbar.py", line 323, in __init__
    self.draw_all()
  File "traitar/local/lib/python2.7/site-packages/matplotlib/colorbar.py", line 346, in draw_all
    X, Y = self._mesh()
  File "traitar/local/lib/python2.7/site-packages/matplotlib/colorbar.py", line 814, in _mesh
    y = self._proportional_y()
  File "traitar/local/lib/python2.7/site-packages/matplotlib/colorbar.py", line 789, in _proportional_y
    automin = (y[2] - y[1]) / clen
IndexError: index 2 is out of bounds for axis 0 with size 2

I am using the following command:

traitar phenotype traitar_lao traitar_lao_sub/samples.txt from_genes traitar_lao_sub_OUT/ -c 1

For the test data set heatmap generation works.

Attached is a subset of the dataset I'm using (the first 4 samples instead of 33), as otherwise the file couldn't be uploaded.
traitar_lao_sub.zip

Is this maybe related to the fact that I only have 1 category in the samples.txt file?

Improve naming of result files

Some of the result files generated by traitar are not self-explanatory:

  • improve the naming
  • add description in the README

config problem

abort traitar phenotype if traitar pfam has not been run

IOError: [Errno 2] No such file or directory: '/home/aaron/traitar/traitar/config.json'
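
A sketch of such a check, assuming the config is a JSON file at a known path. The function name and the wording of the message are illustrative.

```python
import json
import os


def load_config(config_path):
    """Load the JSON config, or abort with a hint if it does not exist."""
    if not os.path.isfile(config_path):
        # SystemExit with a string prints the message and exits with status 1.
        raise SystemExit(
            "config file %s not found; run 'traitar pfam' first" % config_path
        )
    with open(config_path) as handle:
        return json.load(handle)
```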

Improve heatmap details

  • Rethink color bars e.g. make optional according to user input
  • Introduce a limit for the number of samples visualized in the heatmap or, as a more advanced option, scale everything according to the number of samples

Error message

Hello. I have installed traitar on a Mac and it runs OK, except that I get this error message during heatmap generation:

traitar phenotype /work/01_SAG/05_traitar/01_input /work/01_SAG/05_traitar/Traitar_samples.txt from_nucleotides /work/01_SAG/05_traitar/02_output -c 4

/usr/local/sources/bioinfo/lib/python2.7/site-packages/matplotlib/axes/_base.py:3045: UserWarning: Attempting to set identical bottom==top results
in singular transformations; automatically expanding.
bottom=0, top=0.0
'bottom=%s, top=%s') % (bottom, top))
Traceback (most recent call last):
File "/usr/local/sources/bioinfo/bin/heatmap.py", line 482, in <module>
heatmap(matrix, row_header, column_header, primary_pt_models, args.color_f, args.row_method, args.column_method, args.row_metric, args.column_metric, args.out_f, args.sample_f, secondary_pt_models)
File "/usr/local/sources/bioinfo/bin/heatmap.py", line 278, in heatmap
dr = dr[idx1]
UnboundLocalError: local variable 'idx1' referenced before assignment
/usr/local/sources/bioinfo/lib/python2.7/site-packages/matplotlib/axes/_base.py:3045: UserWarning: Attempting to set identical bottom==top results
in singular transformations; automatically expanding.
bottom=0, top=0.0
'bottom=%s, top=%s') % (bottom, top))
Traceback (most recent call last):
File "/usr/local/sources/bioinfo/bin/heatmap.py", line 482, in <module>
heatmap(matrix, row_header, column_header, primary_pt_models, args.color_f, args.row_method, args.column_method, args.row_metric, args.column_metric, args.out_f, args.sample_f, secondary_pt_models)
File "/usr/local/sources/bioinfo/bin/heatmap.py", line 278, in heatmap
dr = dr[idx1]
UnboundLocalError: local variable 'idx1' referenced before assignment
/usr/local/sources/bioinfo/lib/python2.7/site-packages/matplotlib/axes/_base.py:3045: UserWarning: Attempting to set identical bottom==top results
in singular transformations; automatically expanding.
bottom=0, top=0.0
'bottom=%s, top=%s') % (bottom, top))
Traceback (most recent call last):
File "/usr/local/sources/bioinfo/bin/heatmap.py", line 482, in <module>
heatmap(matrix, row_header, column_header, primary_pt_models, args.color_f, args.row_method, args.column_method, args.row_metric, args.column_metric, args.out_f, args.sample_f, secondary_pt_models)
File "/usr/local/sources/bioinfo/bin/heatmap.py", line 278, in heatmap
dr = dr[idx1]
UnboundLocalError: local variable 'idx1' referenced before assignment

Missing parentheses in call to 'print'

Hi there,

Just read about traitar and wanted to give it a shot! Installation went fine, however it looks like I'm not able to make it run. After calling traitar I get the following error:

  File "/usr/local/bin/traitar", line 173
    print self.user_message % out_dir
    ^
SyntaxError: Missing parentheses in call to 'print'

I've played around with this error a bit by commenting out this print line, but then another line pops up. If I'm correct, this looks like an error caused by using the Python 2 print statement instead of the Python 3 print function.

traitar was installed as a Python 3.4 module using pip. Should I install it using pip2 instead?

Thanks, looking forward to using traitar!

Sander
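
For reference, the tracebacks elsewhere on this page all show python2.7 paths, which fits the Python 2 vs. 3 explanation. A single-argument print written with parentheses is valid on both interpreters, as in the sketch below. The message template is a made-up stand-in for `self.user_message`.

```python
# Hypothetical stand-in for self.user_message.
USER_MESSAGE = "output written to %s"


def report(out_dir):
    # Parenthesised single-argument form: a print statement on Python 2,
    # a function call on Python 3, with the same output on both.
    print(USER_MESSAGE % out_dir)
```

This only addresses the print syntax; other Python 2 constructs in the code base would still need porting.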

check if external programs are available

  • Include checks to see if Prodigal, hmmer and parallel are available.
  • If parallel execution is desired and parallel is not installed, fall back to sequential execution
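
The two checks above could be sketched roughly as follows. This is a minimal illustration, not Traitar's actual code: the executable names (`prodigal`, `hmmsearch`, `parallel`) and both function names are assumptions, and it uses `shutil.which`, which requires Python 3.3 or later (on Python 2.7, `distutils.spawn.find_executable` plays a similar role).

```python
import shutil

# Executable names as they are commonly installed; which binaries Traitar
# actually calls is an assumption made for this sketch.
REQUIRED = ("prodigal", "hmmsearch")
OPTIONAL_PARALLEL = "parallel"


def check_dependencies(required=REQUIRED, which=shutil.which):
    """Return the names of required executables that are not found on PATH."""
    return [name for name in required if which(name) is None]


def choose_execution_mode(cpus, which=shutil.which):
    """Return 'parallel' only if several CPUs were requested and parallel exists."""
    if cpus > 1 and which(OPTIONAL_PARALLEL) is not None:
        return "parallel"
    # Fall back to sequential execution instead of failing.
    return "sequential"
```

A caller would abort with a readable message if `check_dependencies()` returns a non-empty list, and pass the requested CPU count to `choose_execution_mode()` before launching the annotation step.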

Packaging of config file

Taken from #48:

Unfortunately my own config file was packaged into the Traitar source distribution, which is why it's looking for the aaron folder.

Is this something you can easily change in future releases?
We should ship a vanilla version of the Traitar software.

From my understanding, the config file gets created during/after the initial installation?
Then it shouldn't be a problem for new users, but maybe for users who update Traitar...

-c leads to "Unknown option" message

Hi,

I'm not sure "-c" for using multiple processors is working as expected. There are two "Unknown option" messages appearing:

bach@serendipity:~/traitar/test$ traitar phenotype indir samples.txt from_genes outdir -c 12
running_ annotation with hmmer. This step can take a while. A rough estimate for sequential Pfam annotation of genome samples of ~3 Mbs is 10 min per genome.
Unknown option: will-cite
Unknown option: will-cite

Best,
B.

PS: (this was on the "traitar-master" ZIP downloaded from github)

traitAR name

I can't help but think of traitAntibioticResistance when reading the name, especially given the current coolness of anything AR-related. What is AR in traitAR?

Doesn't work for me at the first step

lala@kw1322:~/projects/traitar_results$ traitar phenotype ../IMG_annotation/ samples.txt from_genes .
Traceback (most recent call last):
  File "/usr/local/bin/traitar", line 492, in <module>
    args.func(args)
  File "/usr/local/bin/traitar", line 32, in phenolyze
    p = Traitar(args.input_dir, args.output_dir, args.sample2file, args.cpus, args.rearrange_heatmap, args.heatmap_format, args.no_heatmap_phenotype_clustering, args.no_heatmap_sample_clustering, args.gene_gff_type, args.primary_models, args.secondary_models)
  File "/usr/local/bin/traitar", line 71, in __init__
    self.s2f = self.parse_sample_f()
  File "/usr/local/bin/traitar", line 142, in parse_sample_f
    if len(i) > 30:
TypeError: object of type 'float' has no len()
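
The traceback shows that a value read from the sample file reached `parse_sample_f` as a float instead of a string. One possible cause, which is an assumption and not confirmed anywhere in this thread, is a blank cell in the tab-separated sample file being parsed as a floating-point NaN. A minimal Python 3 sketch for spotting blank cells before running traitar might look like this; the function name and the column names in the usage note are hypothetical.

```python
import csv


def find_empty_cells(handle, required_columns):
    """Return a list of (line_number, column) for blank or missing required cells."""
    reader = csv.DictReader(handle, delimiter="\t")
    problems = []
    for row in reader:
        for column in required_columns:
            value = row.get(column)
            # Missing trailing fields come back as None, blank fields as "".
            if value is None or not value.strip():
                problems.append((reader.line_num, column))
    return problems
```

For example, calling `find_empty_cells(open("samples.txt"), ["sample_file_name", "sample_name"])` would list each line whose file name or sample name is empty, assuming those are the header names used in the file.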

Optionally disable genome and/or phenotype clustering

In some cases, it might be desirable to disable the clustering and instead sort the genome rows by input order and/or the phenotype columns by some intrinsic order that makes sense (e.g. pre-clustered by type, or simply alphabetically?).
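
As a rough sketch of what the non-clustered ordering could look like, the hypothetical helper below keeps genome rows in input order and sorts phenotype columns alphabetically. It works on plain lists, so it makes no claim about how the heatmap script stores its matrix; all names here are illustrative.

```python
def order_without_clustering(matrix, row_names, col_names, input_order):
    """Reorder rows to follow input_order and columns alphabetically."""
    row_index = {name: i for i, name in enumerate(row_names)}
    # Keep only samples that are present in the matrix, in input order.
    rows = [row_index[name] for name in input_order if name in row_index]
    # Case-insensitive alphabetical order of the phenotype columns.
    cols = sorted(range(len(col_names)), key=lambda j: col_names[j].lower())
    new_matrix = [[matrix[i][j] for j in cols] for i in rows]
    return new_matrix, [row_names[i] for i in rows], [col_names[j] for j in cols]
```

Ordering columns by phenotype type instead would only require a different sort key for `cols`.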

sample_data

Might be of interest to also include the sample_output, plus a README explaining each output file.
