Comments (4)
Hi @cgirardot,
The matrix needs to be in cool format. You are correct: by default, chromosight will balance (i.e. normalize) the matrix if needed. If the balancing weights are already in the cool file, they will be reused, so it does not matter whether the cool file is already balanced or not.
We generally use 10kb resolution for mammalian maps. For yeast and other micro-organisms, usually 2kb. These resolutions seem to generally work well for loop detection.
If your coverage is too low, you may get very few detections. In that case you may want to use a lower resolution (i.e. larger bins) and possibly increase the --perc-zero option to keep windows with a higher proportion of empty pixels.
I would say you can expect coverage to become an issue if you have <200M reads on mouse / human, or <10-15M on yeast, but the best is generally to visualise the map and see if what you're looking for is visible (e.g. loops).
from chromosight.
Hi @cmdoret
Thank you for the quick answer. I am not sure which reads are included in your numbers. I work with fly (~140 Mb genome) and I have ~40M valid unique reads (after all filtering) involved in intrachromosomal contacts per replicate (2 replicates per condition). I am planning to compare:
- replicates with each other so 40M vs 40M reads
- conditions so 80M vs 80M reads
Given your input, it seems like I could go for 5 kb bins. Would you agree? Thanks for your input.
I was talking about contacts in the raw matrix as well, yes.
I do not have much experience with Drosophila, but 40M reads should be OK for that genome size.
Yes, I would try with 5 kb and go down in resolution (i.e. larger bins) if you miss many longer-range loops, where contacts are scarcer.
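To put the numbers in this thread side by side, here is a rough back-of-the-envelope sketch that computes the average number of contacts per bin. This is only a crude proxy and not a criterion used by chromosight: it ignores the decay of contacts with genomic distance, which is exactly why long-range loops suffer first. The function name `contacts_per_bin` is hypothetical, and the human (~3.1 Gb) and yeast (~12 Mb) genome sizes are approximate values assumed for illustration; the fly genome size and read counts are those quoted above.

```python
def contacts_per_bin(n_contacts: float, genome_size_bp: float, bin_size_bp: int) -> float:
    """Average number of contacts per genomic bin (a crude coverage proxy)."""
    n_bins = genome_size_bp / bin_size_bp
    return n_contacts / n_bins


# (contacts, approximate genome size in bp, bin size in bp).
# Human and yeast genome sizes are assumptions made for this illustration.
examples = {
    "human, 200M contacts, 10 kb bins": (200e6, 3.1e9, 10_000),
    "yeast, 10M contacts, 2 kb bins": (10e6, 12e6, 2_000),
    "fly, 40M contacts, 5 kb bins": (40e6, 140e6, 5_000),
    "fly, 80M contacts, 5 kb bins": (80e6, 140e6, 5_000),
}

for label, args in examples.items():
    print(f"{label}: ~{contacts_per_bin(*args):.0f} contacts per bin")
```

By this measure, 40M contacts at 5 kb on a ~140 Mb genome sits in the same range as the yeast threshold at 2 kb, and above the mammalian threshold at 10 kb. Looking at the map itself remains the better test.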
Note: What I said about balancing holds true if you have a standard cool file. That is, your raw contacts are stored in the pixels table, and balancing weights would be in the bins table. If for some reason normalized contacts are hard-coded as floats in the pixels table, you would need to run chromosight with --norm=raw to prevent it from normalizing your already normalized contacts. You can check that using cooler dump -t pixels myfile.cool. If the last column contains integers, everything's good, but if they're floats you need to prevent normalization.
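The integer-versus-float check can also be scripted. The sketch below is a minimal illustration, not part of chromosight or cooler. It assumes the text output of cooler dump -t pixels has been read into a list of lines, with whitespace-separated columns and the count in the last column. The function name `pixels_look_normalized` is hypothetical.

```python
def pixels_look_normalized(dump_lines):
    """Return True if any last-column value is a float rather than an integer."""
    for line in dump_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        value = fields[-1]
        try:
            int(value)  # raw contact counts parse as integers
        except ValueError:
            try:
                float(value)
            except ValueError:
                continue  # non-numeric line, e.g. a column header
            return True  # numeric but not an integer literal
    return False


# Made-up example lines in the assumed "bin1_id bin2_id count" layout.
raw_lines = ["0\t0\t12", "0\t1\t3"]
normalized_lines = ["0\t0\t0.0153", "0\t1\t0.0021"]

print(pixels_look_normalized(raw_lines))         # False
print(pixels_look_normalized(normalized_lines))  # True
```

If the function returns True for your file, that is the case where --norm=raw is needed. Checking the first few thousand lines of the dump is usually enough, since a pixels table is normally all integers or all floats.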
great, thx a lot.