Comments (13)
Hi @chenggang108,
I am not sure what the problem here is, so we will have to sleuth a bit.
First of all, could you please re-install chess to update to the latest release (0.3.5)?
I think it would suffice to run
pip install chess-hic --upgrade
pip install fanc --upgrade
Then, to speed things up a bit, it might make sense to convert the input files to FAN-C format, so you won't need to wait 1.5 hours every time you run chess sim on these data. You can do that with
fanc from-cooler
Could you then re-run the analysis on the converted data and paste the logs here?
Best,
Nick
from chess.
Hi Nick,
Thank you for your swift reply. I will try it and let you know.
Thanks
Gang
Dear @nickmachnik
I have the same problem after converting the cool data to FAN-C format using the following command:
fanc from-cooler case.cool case.hic
The logs are as follows:
2020-11-22 22:02:43,563 INFO Running 'chess sim -p 6 case.hic treat.hic chr2_1mb_win_100kb_step.bed chr2.result'
2020-11-22 22:03:31,765 INFO CHESS version: 0.3.5
2020-11-22 22:03:31,780 INFO FAN-C version: 0.9.7
2020-11-22 22:03:31,806 INFO Loading reference contact data
2020-11-22 22:10:50,411 INFO Loading query contact data
2020-11-22 22:16:16,401 INFO Loading region pairs
2020-11-22 22:16:16,493 INFO Launching workers
2020-11-22 22:16:18,605 INFO Submitting pairs for comparison
2020-11-22 22:16:19,616 INFO Could not compute similarity for 1812 region pairs.This can be due to faulty coordinates, too smallregion sizes or too many unmappable bins
2020-11-22 22:16:34,566 INFO Finished 'chess sim -p 6 case.hic treat.hic chr2_1mb_win_100kb_step.bed chr2.result'
Closing remaining open files:case.hic...donetreat.hic...done
Hope for your help.
Best wishes,
Zheng Zhuqing
Hi @biozzq, I think I will have to try to reproduce this to understand what is going on. Do you have a suggestion for a small example dataset in cooler format that I could use for this (not necessarily yours; I understand if you don't want to share that data)?
@chenggang108 , does the error persist for you after the upgrade?
Dear @nickmachnik
I would like to share my data with you. You can download the cool files and the genome size file from the following link:
https://drive.google.com/drive/folders/1dm66NJD8LgZ-N8HTNNo4FKkYIWI65zCc?usp=sharing
Hope these files can help you.
Best wishes,
Zheng zhuqing
Dear @nickmachnik
The commands I used are as follows:
fanc from-cooler 0h_for_tad_pairs_no_YM_40kb.cool 0h_for_tad_pairs_no_YM_40kb.hic
fanc from-cooler 60h_for_tad_pairs_no_YM_40kb.cool 60h_for_tad_pairs_no_YM_40kb.hic
chess pairs --file-input new_mm10_gsize --chromosome chr2 4000000 2000000 chr2_4mb_win_2mb_step.bed
chess sim -p 6 0h_for_tad_pairs_no_YM_40kb.hic 60h_for_tad_pairs_no_YM_40kb.hic chr2_4mb_win_2mb_step.bed chr2.result
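For readers unfamiliar with the two numeric arguments above (window size 4000000, step 2000000), here is a minimal sketch of how sliding windows along one chromosome can be enumerated. The function name `sliding_windows` is hypothetical, and the choice to emit only full-size windows is an assumption made for illustration; it is not taken from the CHESS source.

```python
def sliding_windows(chrom_size, window, step):
    """Return (start, end) tuples of full-size windows along a chromosome.

    Assumption for this sketch: a trailing window shorter than `window`
    is dropped rather than truncated.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [(start, start + window)
            for start in range(0, chrom_size - window + 1, step)]


# Toy example: a made-up 10 Mb chromosome, 4 Mb windows, 2 Mb step.
windows = sliding_windows(10_000_000, 4_000_000, 2_000_000)
for start, end in windows:
    print(start, end)
```

With a 2 Mb step and a 4 Mb window, consecutive windows overlap by half, so each locus away from the chromosome ends is covered by two windows.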
Best wishes,
Zheng zhuqing
Hi @nickmachnik
I did all the updates and then tried three things:
First, I ran chess sim with the .cool files; this did not work.
Second, I converted the .cool files to .fanc files with fanc from-cooler; this did not work either.
Finally, I prepared .hic files from the allValidPairs files generated by HiC-Pro; these .hic files work, but I needed to remove 'chr' from the bed file prepared by chess pairs.
I did not generate .fanc files from the allValidPairs files only because I am not familiar with FAN-C.
It looks like there is something wrong with the cool files.
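The 'chr' removal mentioned above can be scripted. The following is a minimal sketch, assuming a tab-separated pairs file with chromosome names in columns 1 and 4 (a BEDPE-like layout); the column positions, the function names, and the file names in the usage note are assumptions, so check them against your own file before use.

```python
def strip_chr_prefix(line, chrom_columns=(0, 3), prefix="chr"):
    """Remove `prefix` from the chromosome fields of one tab-separated line.

    `chrom_columns` are 0-based column indices (assumed layout, see above).
    """
    fields = line.rstrip("\n").split("\t")
    for i in chrom_columns:
        if i < len(fields) and fields[i].startswith(prefix):
            fields[i] = fields[i][len(prefix):]
    return "\t".join(fields)


def convert_file(src_path, dst_path):
    """Write a copy of `src_path` with 'chr' prefixes stripped."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(strip_chr_prefix(line) + "\n")
```

For example, `convert_file("chr2_4mb_win_2mb_step.bed", "chr2_nochr.bed")` would write the renamed regions to a new file (the output name is made up) and leave the original untouched.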
@biozzq I can reproduce the all-NaN output with your data, but I don't know what is wrong yet. I will try to find out.
@chenggang108 do you know what was wrong with the cool files?
@biozzq, Following up from chenggang108's comment, are you completely sure that the data / the cool files are ok?
Dear @nickmachnik
Thank you very much. I will try to use the .hic files generated from the HiC-Pro output. Also, which normalization should be done before running chess? According to your publication (quoted below), you used KR normalization rather than ICE. However, I use ICE via HiC-Pro most of the time. From the quoted passage, I also understand that normalization should be done after masking the bins as zero. Is this right?
"Finally, bins with less than 25% (human) or 10% (mouse) of the median number of fragments per bin were masked and the matrix was normalized using Knight–Ruiz (KR) matrix balancing on each chromosome independently."
Best wishes,
Zheng zhuqing
Hi @biozzq ,
We used KR balancing, but I believe ICE should be fine too. You have to mask bins before balancing, as stated in the paper. Balancing should give you equal sums in all rows, so you won't be able to identify poorly mappable bins afterwards.
I am not sure what you mean by 'masking the bins as zero' though, could you elaborate?
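To illustrate the order of operations described above (mask first, then balance), here is a minimal sketch in plain numpy. It uses a simple ICE-style iterative correction, not KR and not the FAN-C implementation. As a stand-in for the paper's fragments-per-bin criterion, it filters on row sums (marginal counts); that substitution, the function name `mask_then_balance`, and the toy matrix are all assumptions made for illustration.

```python
import numpy as np


def mask_then_balance(matrix, min_fraction=0.1, n_iter=200):
    """Mask low-coverage bins, then balance the remaining bins.

    Coverage is measured by row sums here (an assumption for this sketch).
    Assumes every kept bin has a non-zero row sum.
    """
    counts = np.asarray(matrix, dtype=float)
    coverage = counts.sum(axis=1)
    keep = coverage >= min_fraction * np.median(coverage)

    # Balance only the sub-matrix of kept bins (ICE-style iteration).
    sub = counts[np.ix_(keep, keep)]
    for _ in range(n_iter):
        sums = sub.sum(axis=1)
        sums = sums / sums.mean()
        sub = sub / np.outer(sums, sums)

    # Put the result back into a full-size masked array.
    full = np.zeros(counts.shape)
    full[np.ix_(keep, keep)] = sub
    balanced = np.ma.masked_array(full, mask=~np.outer(keep, keep))
    return balanced, keep


# Toy symmetric matrix; bin 2 has almost no contacts.
toy = [[10, 4, 1, 2],
       [4, 12, 0, 3],
       [1, 0, 0, 0],
       [2, 3, 0, 8]]
balanced, keep = mask_then_balance(toy)
print(keep)
print(balanced.sum(axis=1))
```

After balancing, all unmasked rows have the same sum, which is why a low-coverage bin can no longer be recognized from the balanced matrix alone.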
Hi, @kaukrise found a potential fix for this issue, see #23 (comment)
Dear @nickmachnik
Sorry, I was not clear. Masking bins can be done in different ways: for example, by treating the interaction frequencies involving these bins as zero, or by removing these bins from the contact maps. I would therefore like to confirm which approach you used in your study. Thank you.
Best wishes,
Zheng zhuqing
FAN-C uses numpy masked arrays for masking bins. Masked bins are simply ignored in downstream analyses, which should be different from setting them to 0. You can read more about the FAN-C pipeline here.
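The difference between masking and zero-filling can be seen in a small example. This is a generic numpy illustration with a made-up 4x4 matrix and does not reproduce FAN-C internals: with zeros, the bad bin still contributes entries to a statistic such as the mean; with a mask, those entries are left out entirely.

```python
import numpy as np

# Toy symmetric contact matrix; bin 2 is assumed to be unmappable.
matrix = np.array([
    [10.0, 4.0, 0.0, 1.0],
    [4.0, 12.0, 0.0, 3.0],
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 3.0, 0.0, 8.0],
])
bad_bin = 2

# Option A: the bad bin stays in the matrix as zeros and counts as data.
mean_with_zeros = matrix.mean()

# Option B: the bad bin's row and column are masked and ignored.
mask = np.zeros(matrix.shape, dtype=bool)
mask[bad_bin, :] = True
mask[:, bad_bin] = True
masked = np.ma.masked_array(matrix, mask=mask)
mean_masked = masked.mean()

print(mean_with_zeros)  # averaged over all 16 entries
print(mean_masked)      # averaged over the 9 unmasked entries
```

Here the zero-filled mean is 46/16 while the masked mean is 46/9, so any statistic computed on the region shifts depending on which convention is used.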