Raw or Norm counts cool (chromosight issue, closed)

cgirardot avatar cgirardot commented on July 17, 2024
Raw or Norm counts cool

from chromosight.

Comments (4)

cmdoret avatar cmdoret commented on July 17, 2024

Hi @cgirardot,

The matrix needs to be in cool format. You are correct: by default, chromosight balances (i.e. normalizes) the matrix if needed. If balancing weights are already stored in the cool file, they are reused, so it does not matter whether the cool file is balanced or not.

We generally use 10kb resolution for mammalian maps. For yeast and other micro-organisms, usually 2kb. These resolutions seem to generally work well for loop detection.

If your coverage is too low, you may get very few detections. In that case, you may want to use a lower resolution (larger bins) and possibly increase the --perc-zero option to keep windows with a higher proportion of empty pixels.

I would say you can expect coverage to become an issue if you have <200M reads on mouse / human, or <10-15M on yeast, but the best approach is generally to visualise the map and check whether what you're looking for (e.g. loops) is visible.

cgirardot avatar cgirardot commented on July 17, 2024

Hi @cmdoret
Thank you for the quick answer. I am not sure which reads are included in your number. I work with fly (~140 Mb genome) and have ~40M valid unique reads (after all filtering) involved in intrachromosomal contacts per replicate (2 replicates per condition). I am planning to compare:

  • replicates with each other, so 40M vs 40M reads
  • conditions, so 80M vs 80M reads

Given your input, it seems like I could go for 5 kb bins. Would you agree? Thanks for your input.

cmdoret avatar cmdoret commented on July 17, 2024

I was talking about contacts in the raw matrix as well, yes.
I do not have much experience with Drosophila, but 40M reads should be OK for that genome size.
Yes, I would try with 5 kb and go down in resolution (larger bins) if you miss many longer-range loops, where contacts are scarcer.
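
To put the read counts in this thread on a common scale, here is a rough back-of-the-envelope sketch in Python. It only computes the average number of contacts per bin, which is a crude proxy: contacts are actually spread over pairs of bins and become scarcer with genomic distance. The function name is made up for illustration, and the human (~3.1 Gb) and yeast (~12 Mb) genome sizes are approximate values assumed here, not figures from the thread.

```python
def mean_contacts_per_bin(n_contacts: float, genome_size_bp: float, bin_size_bp: int) -> float:
    """Average number of contacts per genomic bin (a crude coverage proxy)."""
    n_bins = genome_size_bp / bin_size_bp
    return n_contacts / n_bins


# (label, contacts, genome size in bp, bin size in bp)
# Only the fly genome size comes from the thread; the human and yeast
# sizes are approximate assumptions used for comparison.
scenarios = [
    ("fly, 40M contacts, 5 kb bins", 40e6, 140e6, 5_000),
    ("fly, 80M contacts, 5 kb bins", 80e6, 140e6, 5_000),
    ("human, 200M contacts, 10 kb bins", 200e6, 3.1e9, 10_000),
    ("yeast, 10M contacts, 2 kb bins", 10e6, 12e6, 2_000),
]

for label, n_contacts, genome_size, bin_size in scenarios:
    per_bin = mean_contacts_per_bin(n_contacts, genome_size, bin_size)
    print(f"{label}: ~{per_bin:.0f} contacts per bin")
```

By this measure, 40M contacts on fly at 5 kb gives more contacts per bin than the human and yeast thresholds mentioned earlier at their usual resolutions. That is consistent with trying 5 kb first, though looking at the map itself remains the better check.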

Note: What I said about balancing holds true if you have a standard cool file, that is, one where raw contacts are stored in the pixels table and balancing weights, if any, are in the bins table. If for some reason normalized contacts are stored directly as floats in the pixels table, you need to run chromosight with --norm=raw to prevent it from normalizing your already normalized contacts. You can check this with cooler dump -t pixels myfile.cool. If the last column contains integers, everything is fine; if it contains floats, you need to prevent normalization.
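
The check described in the note can be scripted. Below is a minimal Python sketch that looks at lines of text as printed by cooler dump -t pixels and reports whether the last column parses as integers. Both function names are hypothetical helpers, not part of chromosight or cooler, and the sketch assumes whitespace-separated columns without a header line.

```python
from typing import Iterable


def counts_are_integers(dump_lines: Iterable[str]) -> bool:
    """Return True if the last column of every non-empty line parses as an integer."""
    for line in dump_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        try:
            int(fields[-1])
        except ValueError:
            # e.g. "0.8731": counts are stored as floats
            return False
    return True


def suggested_norm_flag(dump_lines: Iterable[str]) -> str:
    """Suggest the extra chromosight option, following the rule in the note above."""
    return "" if counts_are_integers(dump_lines) else "--norm=raw"


# Made-up sample lines in the layout: bin1_id, bin2_id, count
raw_like = ["0\t0\t12", "0\t1\t3", "1\t1\t27"]
normalized_like = ["0\t0\t0.8731", "0\t1\t0.0412"]

print(repr(suggested_norm_flag(raw_like)))
print(repr(suggested_norm_flag(normalized_like)))
```

In practice, the first few hundred lines of the dump should be enough to tell the two cases apart, so there is no need to read the whole pixels table.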

cgirardot avatar cgirardot commented on July 17, 2024

great, thx a lot.
