kimmo1019 / hicgan
Hi-C data super-resolution with generative adversarial networks
License: MIT License
Hey there,
I am currently trying to reproduce your work with the Rao2014 GM12878 data.
However, at the stage where the data is preprocessed, normalized, and split using data_split.py, the process gets killed because it needs too much memory.
I work on a workstation with 64 GB of RAM.
What memory specs did you use to run this analysis?
The article states which GPU cards were used but not the required memory.
Thanks in advance!
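For a rough sense of scale, here is a back-of-the-envelope estimate of what one dense intrachromosomal matrix costs. It assumes 10 kb bins and 8-byte values; whether data_split.py actually holds dense matrices in memory is an assumption, and `dense_matrix_gib` is a hypothetical helper, not part of hicGAN.

```python
import math

# hg19 length of chr1 in base pairs (the longest chromosome).
CHR1_HG19_BP = 249_250_621


def dense_matrix_gib(chrom_len_bp, resolution_bp=10_000, bytes_per_value=8):
    """Return (n_bins, size in GiB) of a dense n_bins x n_bins contact matrix.

    Assumption: the matrix is stored densely, one value per bin pair.
    """
    n_bins = math.ceil(chrom_len_bp / resolution_bp)
    size_gib = n_bins * n_bins * bytes_per_value / 2**30
    return n_bins, size_gib


n_bins, gib = dense_matrix_gib(CHR1_HG19_BP)
print(f"chr1: {n_bins} bins, about {gib:.1f} GiB as float64")
```

Under these assumptions chr1 alone comes to roughly 4.6 GiB, so keeping several copies per chromosome (raw, normalized, downsampled) for many chromosomes at once could plausibly exceed 64 GB.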
Hi kimmo1019,
I am wondering how you recover the matrix.
We used some data to train your model, and the shape of the tensor is (768, 40, 40, 1).
However, the shape of the corresponding index file is (1407, 2).
These are not consistent (768 != 1407). Could you tell me where your program creates the index file? Thank you very much in advance for your time!
Hello there,
I am using your model. Even with the pre-trained model parameters, I obtain pretty bad predictions.
I want to combine the subregions into a per-chromosome matrix and then compare whole matrices instead of doing a one-to-one subregion comparison. Below is the reverse of your code, which combines the subregions into the whole matrix:
import numpy as np
import hickle as hkl
import pandas as pd
import math

# We need the chromosome sizes and the submatrix indices from the original test_data.
df = pd.read_csv("../../../chromosome.txt", sep="\t", header=None)
chrsizes = df.values[0:25, 1]  # sex and mitochondrial chrs included, hg19
re_mat = hkl.load("../test_data.hkl")
dists = re_mat[2]
distc = []
for i in range(0, len(dists)):
    distc.append(dists[i][1])
predchr = pd.unique(distc)
sub_inds = np.load("test_allchr_subregion_inds.npy")
thred = 200
size = 40
c = 1
for cname in predchr:
    pr_mat = np.load('%s/sr_mats_pre.npy' % cname)
    remat_ind = sum(sub_inds[:c])
    c += 1
    rematCond = re_mat[2][remat_ind:sum(sub_inds[:c])]
    pp = 0
    cnum = int(cname.split("chr")[1])
    n_bins = int(math.ceil(chrsizes[cnum - 1] / 10000.0))  # ceil returns a float in Python 2
    row, col = n_bins, n_bins
    sr_mat = -1 * np.ones((row, col))
    # Recombine the predicted matrix into its original dimensions.
    for idx1 in range(0, row - size, size):
        for idx2 in range(0, col - size, size):
            my_cond = rematCond[pp][:] == [idx1 - idx2, cname]
            if (abs(idx1 - idx2) < thred) and my_cond:
                sr_mat[idx1:idx1 + size, idx2:idx2 + size] = pr_mat[pp].reshape(40, 40)
                pp += 1
            if pp == pr_mat.shape[0]:
                break
        if pp == pr_mat.shape[0]:
            break
    np.save("./pred_%s_hicGAN.npy" % cname, sr_mat)
I have been working with your model for a while now and could not find my mistake, if there is one.
Do you think a problem might have been introduced when you updated the model?
Thank you,
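One way to check a recombination script like this independently of the model is a round trip on a toy matrix: split it into tiles, merge them back, and compare. The sketch below records the (row, col) origin of every tile explicitly instead of replaying the loop with a counter. `split_tiles` and `merge_tiles` are hypothetical helpers written for illustration, not hicGAN functions, and the tiling rule (non-overlapping tiles whose origins lie closer than `max_offset` bins to the diagonal) is an assumption modelled on the script above.

```python
import numpy as np


def split_tiles(mat, size=40, max_offset=200):
    """Cut a square matrix into size x size tiles near the diagonal.

    Returns the stacked tiles and the (row, col) origin of each tile.
    """
    n = mat.shape[0]
    tiles, index = [], []
    for i in range(0, n - size + 1, size):
        for j in range(0, n - size + 1, size):
            if abs(i - j) < max_offset:
                tiles.append(mat[i:i + size, j:j + size].copy())
                index.append((i, j))
    return np.array(tiles), index


def merge_tiles(tiles, index, n, size=40, fill=-1.0):
    """Place each tile back at its recorded origin; untouched cells keep `fill`."""
    out = np.full((n, n), fill)
    for tile, (i, j) in zip(tiles, index):
        out[i:i + size, j:j + size] = np.asarray(tile).reshape(size, size)
    return out


# Round trip on a toy 120 x 120 matrix.
toy = np.arange(120 * 120, dtype=float).reshape(120, 120)
tiles, index = split_tiles(toy, size=40, max_offset=80)
merged = merge_tiles(tiles, index, n=120, size=40)
covered = merged != -1.0
print(tiles.shape, np.array_equal(merged[covered], toy[covered]))
```

If the round trip reproduces the covered region exactly, the merge logic is sound and a bad result more likely comes from the tile order or the index file. Keeping one origin per tile also makes a count mismatch between tiles and indices visible right away.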
Hi,
Thanks for the model.
I'm trying to use hicGAN to predict Hi-C matrices (data from GM12878).
I trained the model and stopped it after 7 days of training. Since I'm not familiar with TensorFlow V1, are there any hints for training? It trains quite slowly.
Prediction runs fine when loading "g_hicgan_best.npz", but the predictions were bad.
So I want to use the pre-trained model. However, it shows an error when loading "g_hicgan_GM12878_weights.npz":
    tl.files.load_and_assign_npz(sess=sess, name=model_name, network=net_g)
  File "/rhome/yhu/bigdata/.conda/envs/env_hicgan/lib/python3.6/site-packages/tensorlayer/files/utils.py", line 1712, in load_and_assign_npz
    params = load_npz(name=name)
  File "/rhome/yhu/bigdata/.conda/envs/env_hicgan/lib/python3.6/site-packages/tensorlayer/files/utils.py", line 1645, in load_npz
    return d['params']
  File "/rhome/yhu/bigdata/.conda/envs/env_hicgan/lib/python3.6/site-packages/numpy/lib/npyio.py", line 251, in __getitem__
    pickle_kwargs=self.pickle_kwargs)
  File "/rhome/yhu/bigdata/.conda/envs/env_hicgan/lib/python3.6/site-packages/numpy/lib/format.py", line 663, in read_array
    "to numpy.load" % (err,))
Any suggestions?
Best,
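The traceback is cut off before the final error line, so the following is a guess. The last frame ends in `"to numpy.load" % (err,)`, which looks like the NumPy message raised when a pickled object array written under Python 2 is read under Python 3 without an `encoding=` argument. If that is the cause, reading the file directly with `np.load(..., encoding="latin1")` may work. `load_params_py2_npz` is a hypothetical helper, and the `params` key is taken from the `return d['params']` line in the traceback.

```python
import numpy as np


def load_params_py2_npz(path, encoding="latin1"):
    """Read the 'params' entry of an .npz that may have been pickled under Python 2.

    Assumption: the weights file stores a pickled object array under 'params'.
    allow_pickle=True runs the unpickler, so only use it on files you trust.
    """
    with np.load(path, allow_pickle=True, encoding=encoding) as data:
        return list(data["params"])
```

If this returns a list of arrays, the remaining step is to assign them to `net_g` in the same order. TensorLayer 1.x has, as far as I know, a `tl.files.assign_params` function for that, but please check it against the installed version.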
Sorry, I could not find the data file (train_data_raw_count.hkl). Could you share it? Thank you.