
Comments (13)

etal commented on August 28, 2024

Does this mean a single row/bin/probe in the .cnr or .cnn file has a gene name like "FOO,BAR"?

Normally the scatter command will take "-g FOO,BAR" and look for a gene named FOO and another gene named BAR on the same chromosome. So it will not find the bins labeled "FOO,BAR" -- is that the bug? If so, I'm not sure what the proper fix is -- maybe if the search for "FOO" and "BAR" fails, try again with "FOO,BAR". Changing the delimiter used for "-g" isn't right, because any separator could be used to join the names of overlapping genes in the BED file.
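The fallback idea above can be sketched roughly as follows. This is only an illustration, not CNVkit's actual code: `find_gene_rows` and the `(chrom, start, end, gene)` tuple layout are hypothetical names chosen for the example.

```python
def find_gene_rows(rows, query):
    """Look up rows for a "-g"-style query (hypothetical helper).

    rows  -- list of (chrom, start, end, gene) tuples
    query -- string such as "FOO,BAR"
    """
    names = query.split(",")
    # First pass: treat the query as a comma-separated list of gene names.
    hits = [row for row in rows if row[3] in names]
    found = {row[3] for row in hits}
    if found == set(names):
        return hits
    # Fallback: treat the whole query as one label, e.g. a bin named "FOO,BAR".
    whole = [row for row in rows if row[3] == query]
    if whole:
        return whole
    missing = sorted(set(names) - found)
    raise ValueError("No targeted gene named %r found" % missing[0])


rows = [("chr1", 100, 200, "FOO,BAR"), ("chr1", 300, 400, "BAZ")]
print(find_gene_rows(rows, "FOO,BAR"))  # matched only via the fallback
print(find_gene_rows(rows, "BAZ"))      # matched on the first pass
```

The weakness of this approach is the one noted above: it only helps when the query happens to use the same separator and name order as the bin label.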

Workaround: Replace the gene name separator in your BED/.cnn/.cnr/.cns files with something other than "," -- like ";" instead:

sed -i "s/,/;/g" my_sample.cnr

(Or have bedtools use a different separator, if that's how the original BED file of genes was created.)
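The sed one-liner rewrites every comma in the file. A more conservative variant, sketched here under the assumption that the file is tab-separated with the gene label in the fourth column (as in the rows quoted in this thread), touches only that column. The function name and its parameters are hypothetical.

```python
def replace_gene_separator(line, old=",", new=";", gene_col=3):
    """Replace the separator in the gene column only (hypothetical helper).

    Assumes tab-separated fields with the gene label at index gene_col.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) > gene_col:
        fields[gene_col] = fields[gene_col].replace(old, new)
    return "\t".join(fields)


example = "chr1\t564297\t565169\tMTND1P23,MTND2P28\t0.565725\t1.0\n"
print(replace_gene_separator(example))
```

Applied line by line, this leaves coordinates and numeric columns untouched even if they were to contain a comma.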

from cnvkit.

mjafin commented on August 28, 2024

That's it, thanks, will give it a try!


chapmanb commented on August 28, 2024

Eric and Miika;
Thanks a lot for the discussion and workaround. I swapped this over in bcbio to use ';' which will hopefully give us compatibility with CNVkit automatically. Miika, let us know if this doesn't fix it for you.


mjafin commented on August 28, 2024

Thanks Eric and Brad, I'll test all of this and will reopen if I come across any issues.


mjafin commented on August 28, 2024

@etal So we replaced commas with semicolons, like this:
chr14 23440198 23441327 AJUBA;RP11-298I3.5 -0.19098 1.0
However I'm still getting this error:

AURA_142
Traceback (most recent call last):
  File "/group/ngs/src/bcbio-nextgen/0.8.8/rhel6-x64/anaconda/bin/cnvkit.py", line 4, in <module>
    __import__('pkg_resources').run_script('CNVkit==0.3.5.dev0', 'cnvkit.py')
  File "/group/ngs/src/bcbio-nextgen/0.8.8/rhel6-x64/anaconda/lib/python2.7/site-packages/setuptools-15.0-py2.7.egg/pkg_resources/__init__.py", line 723, in run_script
  File "/group/ngs/src/bcbio-nextgen/0.8.8/rhel6-x64/anaconda/lib/python2.7/site-packages/setuptools-15.0-py2.7.egg/pkg_resources/__init__.py", line 1643, in run_script
  File "/group/ngs/src/bcbio-nextgen/0.8.8/rhel6-x64/anaconda/lib/python2.7/site-packages/CNVkit-0.3.5.dev0-py2.7.egg/EGG-INFO/scripts/cnvkit.py", line 9, in <module>

  File "build/bdist.linux-x86_64/egg/cnvlib/commands.py", line 719, in _cmd_scatter

  File "build/bdist.linux-x86_64/egg/cnvlib/commands.py", line 758, in do_scatter

  File "build/bdist.linux-x86_64/egg/cnvlib/plots.py", line 367, in gene_coords_by_name
ValueError: No targeted gene named 'AJUBA' found

Any ideas?


etal commented on August 28, 2024

I think your BED file has no rows labeled "AJUBA" by itself, only "AJUBA;RP11-298I3.5" and perhaps other joined names. So, while I look into a general fix, you should be able to do:

 cnvkit.py scatter Sample.cnr -g "AJUBA;RP11-298I3.5"

Or even:

 cnvkit.py scatter Sample.cnr -g "AURA_12,AJUBA;RP11-298I3.5"

Pre-processing the target BED file beforehand with "--short-names" would fix this problem for you, at the expense of losing some gene names and having to figure out separately where they are.

The general fix (not yet implemented) will probably involve creating an index of gene names split by "," on the fly, and searching that separate data structure with your query. So ";" would be used to hide from CNVkit the fact that these rows really represent 2 or more genes, and "," (after the fix) would be used for this index.
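A minimal sketch of such an index, assuming bins are available as `(chrom, start, end, gene)` tuples. The names `build_gene_index` and `gene_coords` are hypothetical, and this is not the code that was later committed to CNVkit.

```python
from collections import defaultdict


def build_gene_index(rows, sep=","):
    """Map each individual gene name to the rows whose label contains it."""
    index = defaultdict(list)
    for row in rows:
        for name in row[3].split(sep):
            index[name].append(row)
    return index


def gene_coords(index, name):
    """Return (chrom, start, end) spanning every bin labeled with `name`.

    Assumes all bins for one gene lie on the same chromosome.
    """
    hits = index.get(name)
    if not hits:
        raise ValueError("No targeted gene named %r found" % name)
    chrom = hits[0][0]
    return chrom, min(r[1] for r in hits), max(r[2] for r in hits)


rows = [
    ("chr1", 564297, 565169, "MTND1P23,MTND2P28,RP5-857K21.4,RP5-857K21.5"),
    ("chr14", 23440198, 23441327, "AJUBA;RP11-298I3.5"),
]
index = build_gene_index(rows)
print(gene_coords(index, "MTND2P28"))            # found via the "," split
print(gene_coords(index, "AJUBA;RP11-298I3.5"))  # ";" label stays one key
```

With this layout, a label joined by ";" remains a single opaque key, while a label joined by "," becomes searchable under each of its component names.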


chapmanb commented on August 28, 2024

Thanks Eric, the split gene name approach you mention is exactly what we'd be looking for. The general idea is we'd like to be able to label with multiple genes if a region overlaps (or is equidistant from) more than one. That way we don't have to come up with a prioritization approach for cases where we have multiple. We can swap back to using "," for splitting genes later if that's the more correct way to separate what we want to be different genes. Thanks again for looking at this.


mjafin commented on August 28, 2024

Thanks @etal and @chapmanb, I think I was jumping the gun a bit here!


etal commented on August 28, 2024

I think the last commit provides the feature you're looking for. Could you try it and let me know?


mjafin commented on August 28, 2024

@etal many thanks, sounds great! I'll give it a go first thing tomorrow.


mjafin commented on August 28, 2024

Not sure if I'm doing something wrong, but tried what I think is the latest installation and am getting this:

Traceback (most recent call last):
  File "/group/ngs/src/bcbio-nextgen/0.8.8/rhel6-x64/anaconda/bin/cnvkit.py", line 9, in <module>
    args.func(args)
  File "build/bdist.linux-x86_64/egg/cnvlib/commands.py", line 732, in _cmd_scatter

  File "build/bdist.linux-x86_64/egg/cnvlib/commands.py", line 770, in do_scatter

  File "build/bdist.linux-x86_64/egg/cnvlib/plots.py", line 367, in gene_coords_by_name
ValueError: No targeted gene named 'MTND2P28' found

(is line 367 where this error should be coming from in the latest version?)

Here's a .cnr file line with MTND2P28 on it:

chr1    564297  565169  MTND1P23,MTND2P28,RP5-857K21.4,RP5-857K21.5     0.565725        1.0


etal commented on August 28, 2024

The error should be coming from line 369 now, not 367. Did you pull from GitHub? This code wasn't in the recent 0.4.0 release.

(Sent by email in reply to Miika Ahdesmaki's comment of Fri, Apr 17, 2015, 7:52 AM, shown above.)


mjafin commented on August 28, 2024

OK, code is now up to date and things work great. Many thanks again for cnvkit, it's a great tool!

