kimmo1019 / hicgan


Hi-C data super-resolution with generative adversarial networks

License: MIT License

Shell 3.53% Python 27.99% Jupyter Notebook 68.48%


hicgan's Issues

Required memory for data_split.py on Rao2014 data

Hey there,

Currently, I am trying to reproduce your work with Rao2014 GM12878 data.
However, at the stage where the data is preprocessed, normalized, and split with data_split.py, the process gets killed because it requires too much memory.
I am working on a workstation with 64 GB of RAM.
What memory specifications did you use to run this analysis?

The article states the used GPU cards but not the required memory.

Thanks in advance!
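
As a rough way to reason about the memory question above, the sketch below estimates the size of one dense, square contact matrix for a single chromosome. It assumes a 10 kb bin size and 8-byte values, and it assumes that the preprocessing holds dense per-chromosome matrices in memory; none of this is confirmed by the repository. The function name `dense_matrix_bytes` is hypothetical.

```python
import math


def dense_matrix_bytes(chrom_length_bp, resolution_bp=10_000, bytes_per_value=8):
    """Bytes needed for one dense n_bins x n_bins matrix of a chromosome."""
    n_bins = math.ceil(chrom_length_bp / resolution_bp)
    return n_bins * n_bins * bytes_per_value


# hg19 chr1 length in base pairs; resolution and dtype size are assumptions.
chr1_length = 249_250_621
size_gib = dense_matrix_bytes(chr1_length) / 2**30
print(f"chr1, 10 kb bins, 8-byte values: {size_gib:.2f} GiB")
```

If several such matrices (for example raw, normalized, and down-sampled copies of multiple chromosomes) are alive at the same time, the total can grow well beyond the size of a single matrix.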

The shape of predicted tensor is not correct

Hi kimmo1019,
How do you recover the full matrix from the predictions?
We trained your model on our own data, and the predicted tensor has shape (768, 40, 40, 1).
However, the corresponding index file has shape (1407, 2).
The two are inconsistent (768 != 1407). Could you tell me where your program creates the index file? Thank you very much in advance for your time!
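
One way to narrow down a mismatch like this is to count how many subregions a chromosome should produce and compare that with both the tensor and the index file. The sketch below assumes the tiling rule used in the reassembly script posted in the "Bad Prediction" issue (non-overlapping 40-bin tiles, kept only when the row and column starts are fewer than 200 bins apart). Whether hicGAN's own preprocessing uses exactly this rule is not confirmed, and `expected_tile_count` is a hypothetical helper.

```python
def expected_tile_count(n_bins, size=40, max_dist=200):
    """Count tiles whose (row, col) starts are closer than max_dist bins."""
    count = 0
    for row_start in range(0, n_bins - size, size):
        for col_start in range(0, n_bins - size, size):
            if abs(row_start - col_start) < max_dist:
                count += 1
    return count


# Example with an illustrative chromosome of 400 bins and a 80-bin distance cutoff.
print(expected_tile_count(400, size=40, max_dist=80))
```

Summing this count over the chromosomes in the test set and comparing the result with the first dimension of the predicted tensor and of the index file shows which of the two was built from a different set of chromosomes or a different cutoff.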

Bad Prediction

Hello there,

I am using your model. Even with pre-trained model parameters, I obtain pretty bad predictions.

I want to combine the subregions into one matrix per chromosome and compare those, instead of comparing subregions one-to-one. Below is my code, which reverses your splitting step to recombine the subregions into the whole matrix:

import numpy as np
import hickle as hkl
import pandas as pd
import math

# we'll need chromosome sizes and indices of submatrices from the original test_data

df = pd.read_csv("../../../chromosome.txt", sep="\t", header=None)
chrsizes = df.values[0:25,1] # sex and mitochondrial chrs included hg19
re_mat = hkl.load("../test_data.hkl")
dists = re_mat[2]
distc = []
for i in range(0,len(dists)):
	distc.append(dists[i][1])
predchr = pd.unique(distc)
sub_inds = np.load("test_allchr_subregion_inds.npy")
thred = 200
size = 40
c = 1
for cname in predchr:
	pr_mat = np.load('%s/sr_mats_pre.npy'%cname)
	remat_ind = sum(sub_inds[:c])
	c +=1
	rematCond = re_mat[2][remat_ind:sum(sub_inds[:c])]
	pp = 0
	cnum = int(cname.split("chr")[1])
	bin = int(math.ceil(chrsizes[cnum-1]/10000.0)) # ceil returns float ! 
	row,col = bin,bin
	sr_mat = -1*np.ones((row,col))

	# recombine the predicted matrix into original dimensions

	for idx1 in range(0,row-size,size):
		for idx2 in range (0,col-size,size):
			my_cond = rematCond[pp][:]==[idx1-idx2,cname]
			if (abs(idx1-idx2)<thred) & (my_cond):
				sr_mat[idx1:idx1+size,idx2:idx2+size] = pr_mat[pp].reshape(40,40)
				pp+=1			
			if pp==pr_mat.shape[0]:
				break;		
		if pp==pr_mat.shape[0]:
			break;

	np.save("./pred_%s_hicGAN.npy"%cname,sr_mat)

I have been working with your model for a while now and could not find a mistake in my code, if there is one.

Do you think a problem may have been introduced when you updated the model?

Thank you,
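
The recombination described in this issue can also be sketched with explicit tile positions, which avoids relying on a running counter such as `pp` staying in step with the nested loops. This is a minimal sketch, not the repository's method: it assumes each predicted tile comes with a known (row_start, col_start) pair in bin units, and the layout of hicGAN's index data is not confirmed here. `assemble_tiles` is a hypothetical helper.

```python
import numpy as np


def assemble_tiles(tiles, starts, n_bins, fill=-1.0):
    """Place square tiles into an n_bins x n_bins matrix at the given starts."""
    tiles = np.asarray(tiles)
    size = tiles.shape[1]
    # Cells never covered by a tile keep the fill value, as in the posted script.
    full = np.full((n_bins, n_bins), fill, dtype=float)
    for tile, (row_start, col_start) in zip(tiles, starts):
        full[row_start:row_start + size, col_start:col_start + size] = tile.reshape(size, size)
    return full


# Toy example: two 2x2 tiles (with a trailing channel axis) on a 4-bin chromosome.
tiles = np.arange(8, dtype=float).reshape(2, 2, 2, 1)
starts = [(0, 0), (2, 2)]
full = assemble_tiles(tiles, starts, n_bins=4)
print(full)
```

With explicit positions, a tile that fails a condition check is simply skipped rather than blocking every tile after it, which makes it easier to tell a reassembly problem from a genuinely poor prediction.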

Load pretrain model error

Hi,
Thanks for the model.

I'm trying to use hicGAN to predict Hi-C matrix (data on GM12878).
I trained the model and stopped it after 7 days. Training is quite slow; since I'm not familiar with TensorFlow v1, do you have any hints for training?
Prediction runs fine when I load "g_hicgan_best.npz", but the predictions were bad.

So I want to use the pre-trained model. However, an error is raised when loading "g_hicgan_GM12878_weights.npz":

tl.files.load_and_assign_npz(sess=sess, name=model_name, network=net_g)
File "/rhome/yhu/bigdata/.conda/envs/env_hicgan/lib/python3.6/site-packages/tensorlayer/files/utils.py", line 1712, in load_and_assign_npz
params = load_npz(name=name)
File "/rhome/yhu/bigdata/.conda/envs/env_hicgan/lib/python3.6/site-packages/tensorlayer/files/utils.py", line 1645, in load_npz
return d['params']
File "/rhome/yhu/bigdata/.conda/envs/env_hicgan/lib/python3.6/site-packages/numpy/lib/npyio.py", line 251, in __getitem__
pickle_kwargs=self.pickle_kwargs)
File "/rhome/yhu/bigdata/.conda/envs/env_hicgan/lib/python3.6/site-packages/numpy/lib/format.py", line 663, in read_array
 "to numpy.load" % (err,))

Any suggestion?

Bests.
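
The last visible line of the traceback above matches NumPy's hint about passing the `encoding=` option to `numpy.load`, which usually appears when a file pickled under Python 2 is read under Python 3. Assuming that is the cause here (the full error message is not shown, so this is a guess), the weights could be read directly with NumPy as sketched below. `load_params` is a hypothetical helper; `encoding` and `allow_pickle` are real `numpy.load` parameters.

```python
import numpy as np


def load_params(path):
    """Load the 'params' entry of an .npz file that may contain Python 2 pickles."""
    # latin1 lets Python 3 unpickle byte strings written by Python 2;
    # allow_pickle=True is required for object arrays. Only use it on trusted files.
    with np.load(path, encoding="latin1", allow_pickle=True) as data:
        return data["params"]


# Hypothetical usage, assuming TensorLayer 1.x and an existing session and network:
# params = load_params("g_hicgan_GM12878_weights.npz")
# tl.files.assign_params(sess, params, net_g)
```

Loading the array manually bypasses `tl.files.load_and_assign_npz`, whose internal `load_npz` call does not expose the `encoding` argument in the version shown in the traceback.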

Can you provide the data (train_data_raw_count.hkl) in the hicplus Dpath?

I'm sorry, but I could not find the data file (train_data_raw_count.hkl). Could you provide it? Thank you.
