huwenboshi / hess

Estimate local SNP heritability and genetic covariance from GWAS summary association statistics.

Home Page: http://bogdan.bioinformatics.ucla.edu/software/hess/

License: GNU General Public License v3.0

Python 36.09% HTML 50.71% CSS 3.95% JavaScript 9.25%
summary-statistics snps reference-panel

hess's People

Contributors

huwenboshi, shihuwenbo

hess's Issues

I cannot load the module refpanel

I ran into a problem installing this package:
Traceback (most recent call last):
File "hess.py", line 8, in <module>
from src.estimation import *
File "/home/yxy1234/hess_env/src/estimation.py", line 6, in <module>
from refpanel import *
ModuleNotFoundError: No module named 'refpanel'
Could you tell me where I can install the package refpanel?
Thank you.
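
One possible reading of the traceback above: `ModuleNotFoundError` only exists in Python 3.6 and later, so the script was run under Python 3, while `from refpanel import *` inside `src/estimation.py` is an implicit relative import, which only Python 2 resolves to a sibling file. Other tracebacks on this page show `src/refpanel.py`, so under this reading `refpanel` is a file shipped with HESS rather than a package to install. A minimal sketch of a pre-flight check; the function name is hypothetical and not part of HESS:

```python
import sys


def supports_implicit_relative_imports(version_info=None):
    """Return True on Python 2, where a module inside the src/ package can
    write `from refpanel import *` and pick up src/refpanel.py."""
    if version_info is None:
        version_info = sys.version_info
    return version_info[0] == 2
```

Calling `supports_implicit_relative_imports()` with no argument checks the running interpreter; the logs elsewhere on this page come from Python 2.7 environments.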

ERROR Missing files for the reference panel

I tried step 1 (compute eigenvalues, squared projections, and product of projections) on my computer and got this error:

[ERROR] Missing files for the reference panel /home/ubuntu/hess-0.5.3-beta/bfile/1kg_eur_1pct/1kg_eur_1pct_chr8.bed
I have tried other commands to view this file and they all worked, so there is no problem with the path I specified. I wonder whether I specified the wrong file (not the bed file?) or whether the program does not have sufficient permissions to access this file.
I attach the command I tried to execute below:
for chrom in $(seq 22)
do
python /home/ubuntu/hess-0.5.3-beta/hess.py \
--local-rhog /home/ubuntu/mdd_dd/dd/dd.tsv /home/ubuntu/mdd_dd/mdd/mdd.tsv \
--chrom $chrom \
--partition /home/ubuntu/hess-0.5.3-beta/partition/nygcresearch-ldetect-data-ac125e47bf7f/EUR/fourier_ls-chr$chrom.bed \
--bfile /home/ubuntu/hess-0.5.3-beta/bfile/1kg_eur_1pct/1kg_eur_1pct_chr$chrom.bed \
--out /home/ubuntu/mdd_dd/hess_out/step1
done
Full output (chr22 as an example):
[INFO] @----------------------------------------------------------@
| HESS | v0.5 | 9/October/2017 |
|----------------------------------------------------------|
| (C) 2017 Huwenbo Shi, GNU General Public License, v3 |
|----------------------------------------------------------|
| For documentation, citation & bug-report instructions: |
| http://bogdan.bioinformatics.ucla.edu/software/hess/ |
@----------------------------------------------------------@
[INFO] Command started at: Sun, 22 Jan 2023 18:36:08
[INFO] Command issued:
/home/ubuntu/hess-0.5.3-beta/hess.py
--bfile /home/ubuntu/hess-0.5.3-beta/bfile/1kg_eur_1pct/1kg_eur_1pct_chr22.bed
--local-rhog /home/ubuntu/mdd_dd/dd/dd.tsv /home/ubuntu/mdd_dd/mdd/mdd.tsv
--partition /home/ubuntu/hess-0.5.3-beta/partition/nygcresearch-ldetect-data-ac125e47bf7f/EUR/fourier_ls-chr22.bed
--out /home/ubuntu/mdd_dd/hess_out/step1
--chrom 22
/home/ubuntu/hess-0.5.3-beta/src/refpanel.py:13: FutureWarning: read_table is deprecated, use read_csv instead, passing sep='\t'.
partition = pd.read_table(filename, delim_whitespace=True)
[INFO] Loaded 24 partitions on chromosome 22
[INFO] Average window size is 1466370
[ERROR] Missing files for the reference panel /home/ubuntu/hess-0.5.3-beta/bfile/1kg_eur_1pct/1kg_eur_1pct_chr22.bed

Environment info:
# packages in environment at /home/ubuntu/miniconda3/envs/hess:
#
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 2_gnu conda-forge
backports 1.0 pyhd8ed1ab_3 conda-forge
backports.functools_lru_cache 1.6.1 py_0 conda-forge
backports_abc 0.5 py_1 conda-forge
ca-certificates 2022.12.7 ha878542_0 conda-forge
certifi 2019.11.28 py27h8c360ce_1 conda-forge
cycler 0.10.0 py_2 conda-forge
dbus 1.13.6 hfdff14a_1 conda-forge
expat 2.5.0 h27087fc_0 conda-forge
fontconfig 2.14.1 hc2a2eb6_0 conda-forge
freetype 2.12.1 hca18f0e_1 conda-forge
functools32 3.2.3.2 py_3 conda-forge
futures 3.3.0 py27h8c360ce_1 conda-forge
gettext 0.21.1 h27087fc_0 conda-forge
glib 2.66.3 h58526e2_0 conda-forge
gst-plugins-base 1.14.5 h0935bb2_2 conda-forge
gstreamer 1.14.5 h36ae1b5_2 conda-forge
icu 64.2 he1b5a44_1 conda-forge
jpeg 9e h166bdaf_2 conda-forge
kiwisolver 1.1.0 py27h9e3301b_1 conda-forge
ld_impl_linux-64 2.39 hcc3a1bd_1 conda-forge
libblas 3.9.0 8_openblas conda-forge
libcblas 3.9.0 8_openblas conda-forge
libffi 3.2.1 he1b5a44_1007 conda-forge
libgcc-ng 12.2.0 h65d4601_19 conda-forge
libgfortran-ng 7.5.0 h14aa051_20 conda-forge
libgfortran4 7.5.0 h14aa051_20 conda-forge
libglib 2.66.3 hbe7bbb4_0 conda-forge
libgomp 12.2.0 h65d4601_19 conda-forge
libiconv 1.17 h166bdaf_0 conda-forge
liblapack 3.9.0 8_openblas conda-forge
libopenblas 0.3.12 pthreads_hb3c22a3_1 conda-forge
libpng 1.6.39 h753d276_0 conda-forge
libsqlite 3.40.0 h753d276_0 conda-forge
libstdcxx-ng 12.2.0 h46fd767_19 conda-forge
libuuid 2.32.1 h7f98852_1000 conda-forge
libxcb 1.13 h7f98852_1004 conda-forge
libxml2 2.9.10 hee79883_0 conda-forge
libzlib 1.2.13 h166bdaf_4 conda-forge
matplotlib 2.2.5 ha770c72_3 conda-forge
matplotlib-base 2.2.5 py27h250f245_1 conda-forge
ncurses 6.3 h27087fc_1 conda-forge
numpy 1.16.5 py27h95a1406_0 conda-forge
openssl 1.1.1s h0b41bf4_1 conda-forge
pandas 0.24.2 py27hb3f55d8_0 conda-forge
pcre 8.45 h9c3ff4c_0 conda-forge
pip 20.1.1 pyh9f0ad1d_0 conda-forge
pthread-stubs 0.4 h36c2ea0_1001 conda-forge
pyparsing 2.4.7 pyh9f0ad1d_0 conda-forge
pyqt 5.9.2 py27hcca6a23_4 conda-forge
pysnptools 0.3.13 py27h95a95ce_6 bioconda
python 2.7.15 h5a48372_1011_cpython conda-forge
python-dateutil 2.8.1 py_0 conda-forge
python_abi 2.7 1_cp27mu conda-forge
pytz 2020.1 pyh9f0ad1d_0 conda-forge
qt 5.9.7 h0c104cb_3 conda-forge
readline 8.1.2 h0f457ee_0 conda-forge
scipy 1.2.1 py27h921218d_2 conda-forge
setuptools 44.0.0 py27_0 conda-forge
singledispatch 3.6.1 pyh44b312d_0 conda-forge
sip 4.19.8 py27hf484d3e_1000 conda-forge
six 1.16.0 pyh6c4a22f_0 conda-forge
sqlite 3.40.0 h4ff8645_0 conda-forge
subprocess32 3.5.4 py27h516909a_0 conda-forge
tk 8.6.12 h27826a3_0 conda-forge
tornado 5.1.1 py27h14c3975_1000 conda-forge
wheel 0.37.1 pyhd8ed1ab_0 conda-forge
xorg-libxau 1.0.9 h7f98852_0 conda-forge
xorg-libxdmcp 1.1.3 h7f98852_0 conda-forge
xz 5.2.6 h166bdaf_0 conda-forge
zlib 1.2.13 h166bdaf_4 conda-forge
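
A possible cause of the `Missing files for the reference panel` error above: the command passes `--bfile` with the `.bed` extension, whereas the working logs elsewhere on this page pass a prefix (for example `1kg_eur_1pct_chr2`) and then report loading the `.fam`, `.bim`, and `.bed` files built from it. A minimal sketch, assuming HESS appends those three extensions to the prefix; both helper names are hypothetical:

```python
import os

PLINK_EXTENSIONS = (".bed", ".bim", ".fam")


def to_bfile_prefix(path):
    """Strip a trailing .bed/.bim/.fam so the value can be used as a prefix."""
    root, ext = os.path.splitext(path)
    return root if ext in PLINK_EXTENSIONS else path


def missing_plink_files(prefix):
    """Return the PLINK files that do not exist for the given prefix."""
    return [prefix + ext for ext in PLINK_EXTENSIONS
            if not os.path.isfile(prefix + ext)]
```

If `missing_plink_files(to_bfile_prefix(path))` returns an empty list, all three files are present next to each other and the prefix form is the value to try with `--bfile`.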

Step 2 local rg routine: AttributeError: 'float' object has no attribute 'sqrt'

(myenv27) C:\Users\danty\Dropbox\hess\h>python hess.py --prefix step1h_trait1 --out step2h_trait1
[INFO] Command started at: Fri, 09 Feb 2018 16:47:45
[INFO] Command issued:
hess.py
--prefix step1h_trait1
--out step2h_trait1
[INFO] Loaded results for 1568 loci from step 1
[INFO] Using 4133481 SNPs with average sample size 77096.0
[INFO] Re-inflate the summary statistics with lambda_gc: 1
[INFO] Total SNP-heritability estimate: 0.606 (0.00698)
Traceback (most recent call last):
File "hess.py", line 217, in <module>
main()
File "hess.py", line 40, in main
argmap['gwse-thres'], argmap['tot-hsqg'], argmap['out'])
File "C:\Users\danty\Dropbox\hess\h\src\estimation.py", line 174, in local_hsqg_step2
min_eigval, reinflate, gwse_thres)
File "C:\Users\danty\Dropbox\hess\h\src\estimation.py", line 285, in local_hsqg_step2_helper
se = np.sqrt(local_hsqg_est_var)
AttributeError: 'float' object has no attribute 'sqrt'

Step 1 Error

Hi,

I'm receiving an error when looping through chromosomes and I wondered if you could help me figure it out?

[INFO] Loaded 133 partitions on chromosome 1
[INFO] Average window size is 1873901
[INFO] 39728178 SNPs read from reference panel
[INFO] Loaded 589535 SNPs with rs IDs and single-letter alleles on chromosome 1 from the GWAS summary data file
[INFO] 501348 SNPs left after filtering
[INFO] Loaded 183464 SNPs with rs IDs and single-letter alleles on chromosome 1 from the GWAS summary data file
Traceback (most recent call last):
File "hess.py", line 217, in <module>
main()
File "hess.py", line 45, in main
argmap['out'])
File "/Users/benjaminperry/HESS/src/estimation.py", line 392, in local_rhog_step1
sumstats2.filter_sumstats(refpanel.get_map())
File "/Users/benjaminperry/HESS/src/sumstats.py", line 168, in filter_sumstats
elif a1a0 in reverse[a1a2]: flip.append(i)
KeyError: 'ag'

The same error is appearing on each chromosome as it loops through.

I would really appreciate the advice!

Thanks
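
The key in `KeyError: 'ag'` is a lowercase allele pair, which suggests (as an assumption, not confirmed by the source) that the lookup table in `filter_sumstats` is keyed by uppercase pairs such as `AG`. A minimal sketch that uppercases the allele columns of a whitespace-delimited summary statistics file before running step 1; the column names `A1` and `A2` are assumptions about the file header, and the function name is hypothetical:

```python
def uppercase_alleles(lines, allele_columns=("A1", "A2")):
    """Yield the rows of a whitespace-delimited table, tab-joined, with the
    allele columns uppercased. The first line is treated as the header."""
    lines = iter(lines)
    header = next(lines).split()
    idx = [header.index(name) for name in allele_columns]
    yield "\t".join(header)
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        for i in idx:
            fields[i] = fields[i].upper()
        yield "\t".join(fields)
```

The generator accepts any iterable of lines, so an open file handle can be passed directly and the result written to a new file.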

Error in HESS step 1

I ran into trouble in step 1; see below:

[INFO] @----------------------------------------------------------@
| HESS | v0.5 | 9/October/2017 |
|----------------------------------------------------------|
| (C) 2017 Huwenbo Shi, GNU General Public License, v3 |
|----------------------------------------------------------|
| For documentation, citation & bug-report instructions: |
| http://bogdan.bioinformatics.ucla.edu/software/hess/ |
@----------------------------------------------------------@
[INFO] Command started at: Sat, 02 Dec 2023 20:38:54
[INFO] Command issued:
/gpfs/share/home/0016173048/software/hess/hess.py
--bfile /gpfs/share/home/0016173048/software/hess/1kg_eur_1pct/1kg_eur_1pct_chr1
--local-rhog /gpfs/share/home/0016173048/ghd/openGWAS/GWAS_result/GWAS_summary_clean/mi_hess.txt /gpfs/share/home/0016173048/ghd/openGWAS/GWAS_result/GWAS_summary_clean/neutrophil_hess.txt
--partition /gpfs/share/home/0016173048/software/hess/genome_partition_file/EUR/fourier_ls-all.bed
--out /gpfs/share/home/0016173048/ghd/openGWAS/hess_out/step1/step1
--chrom 1
[INFO] Loaded 133 partitions on chromosome 1
[INFO] Average window size is 1873901
[INFO] 670954 SNPs read from reference panel
[INFO] Loaded 614326 SNPs with rs IDs and single-letter alleles on chromosome 1 from the GWAS summary data file
[INFO] 504260 SNPs left after filtering
[INFO] Loaded 2350152 SNPs with rs IDs and single-letter alleles on chromosome 1 from the GWAS summary data file
Traceback (most recent call last):
File "/gpfs/share/home/0016173048/software/hess/hess.py", line 217, in <module>
main()
File "/gpfs/share/home/0016173048/software/hess/hess.py", line 45, in main
argmap['out'])
File "/gpfs/share/home/0016173048/software/hess/src/estimation.py", line 399, in local_rhog_step1
sumstats2.filter_sumstats(refpanel.get_map())
File "/gpfs/share/home/0016173048/software/hess/src/sumstats.py", line 168, in filter_sumstats
elif a1a0 in reverse[a1a2]: flip.append(i)
KeyError: 'DI'
[INFO] @----------------------------------------------------------@
| HESS | v0.5 | 9/October/2017 |
|----------------------------------------------------------|
| (C) 2017 Huwenbo Shi, GNU General Public License, v3 |
|----------------------------------------------------------|
| For documentation, citation & bug-report instructions: |
| http://bogdan.bioinformatics.ucla.edu/software/hess/ |
@----------------------------------------------------------@
[INFO] Command started at: Sat, 02 Dec 2023 20:40:21
[INFO] Command issued:
/gpfs/share/home/0016173048/software/hess/hess.py
--bfile /gpfs/share/home/0016173048/software/hess/1kg_eur_1pct/1kg_eur_1pct_chr2
--local-rhog /gpfs/share/home/0016173048/ghd/openGWAS/GWAS_result/GWAS_summary_clean/mi_hess.txt /gpfs/share/home/0016173048/ghd/openGWAS/GWAS_result/GWAS_summary_clean/neutrophil_hess.txt
--partition /gpfs/share/home/0016173048/software/hess/genome_partition_file/EUR/fourier_ls-all.bed
--out /gpfs/share/home/0016173048/ghd/openGWAS/hess_out/step1/step1
--chrom 2
[INFO] Loaded 144 partitions on chromosome 2
[INFO] Average window size is 1688741
[INFO] 724748 SNPs read from reference panel
[INFO] Loaded 643392 SNPs with rs IDs and single-letter alleles on chromosome 2 from the GWAS summary data file
[INFO] 528339 SNPs left after filtering
[INFO] Loaded 2571117 SNPs with rs IDs and single-letter alleles on chromosome 2 from the GWAS summary data file
[INFO] 575703 SNPs left after filtering
[INFO] Loading fam file /gpfs/share/home/0016173048/software/hess/1kg_eur_1pct/1kg_eur_1pct_chr2.fam
[INFO] Loading bim file /gpfs/share/home/0016173048/software/hess/1kg_eur_1pct/1kg_eur_1pct_chr2.bim
[INFO] bed file is open /gpfs/share/home/0016173048/software/hess/1kg_eur_1pct/1kg_eur_1pct_chr2.bed
Traceback (most recent call last):
File "/gpfs/share/home/0016173048/software/hess/hess.py", line 217, in <module>
main()
File "/gpfs/share/home/0016173048/software/hess/hess.py", line 45, in main
argmap['out'])
File "/gpfs/share/home/0016173048/software/hess/src/estimation.py", line 407, in local_rhog_step1
snpmap_locus, snpdata_locus = refpanel.get_locus(start, stop, min_maf)
File "/gpfs/share/home/0016173048/software/hess/src/refpanel.py", line 86, in get_locus
snpdata_locus = self.snpdata[:, start_idx:stop_idx+1].read().val.T
File "/gpfs/share/home/0016173048/miniconda3/envs/py2.7/lib/python2.7/site-packages/pysnptools/snpreader/snpreader.py", line 455, in read
val = self._read(None, None, order, dtype, force_python_only, view_ok)
File "/gpfs/share/home/0016173048/miniconda3/envs/py2.7/lib/python2.7/site-packages/pysnptools/pstreader/_subset.py", line 71, in _read
val = self._internal._read(composed_row_index_or_none, composed_col_index_or_none, order, dtype, force_python_only, view_ok)
File "/gpfs/share/home/0016173048/miniconda3/envs/py2.7/lib/python2.7/site-packages/pysnptools/snpreader/bed.py", line 238, in _read
from pysnptools.snpreader import wrap_plink_parser
File "__init__.pxd", line 861, in init pysnptools.snpreader.wrap_plink_parser (pysnptools/snpreader/wrap_plink_parser.cpp:8227)
ValueError: numpy.ufunc has the wrong size, try recompiling

and here is my code:

# step 1
for chrom in $(seq 22)
do
python /gpfs/share/home/0016173048/software/hess/hess.py \
--local-rhog /gpfs/share/home/0016173048/ghd/openGWAS/GWAS_result/GWAS_summary_clean/mi_hess.txt /gpfs/share/home/0016173048/ghd/openGWAS/GWAS_result/GWAS_summary_clean/neutrophil_hess.txt \
--chrom $chrom \
--bfile /gpfs/share/home/0016173048/software/hess/1kg_eur_1pct/1kg_eur_1pct_chr$chrom \
--partition /gpfs/share/home/0016173048/software/hess/genome_partition_file/EUR/fourier_ls-all.bed \
--out /gpfs/share/home/0016173048/ghd/openGWAS/hess_out/step1/step1
done

I do not know what went wrong. Could you help me?
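
The key in `KeyError: 'DI'` looks like an insertion/deletion coded as `D`/`I`. Both are single letters, so such variants would pass a "single-letter alleles" check and still be unknown to a lookup keyed by nucleotide pairs. A minimal sketch, under that assumption, of a predicate for keeping only SNPs whose two alleles are distinct bases from A, C, G, T; the function name is hypothetical. The separate `numpy.ufunc has the wrong size` error on chromosome 2 is not addressed by this sketch.

```python
VALID_BASES = frozenset("ACGT")


def is_biallelic_snp(a1, a2):
    """Return True if both alleles are single, distinct bases from ACGT."""
    a1, a2 = a1.upper(), a2.upper()
    return a1 in VALID_BASES and a2 in VALID_BASES and a1 != a2
```

Rows of the summary statistics file for which the predicate is false would be dropped before step 1.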

Updated release for python 3?

I was wondering if there is a release for use in a Python 3 environment, as some packages used by hess require updated versions that are not available for Python 2.

NaN in putative causality analysis

Hello,
I've run the causality analysis and the output for the first trait was NaN.

trait1 local_gcor1 local_gcor1_se trait2 local_gcor2 local_gcor2_se
Trait1 nan 0 Trait2 0.36 0.1

I have found this error: RuntimeWarning: invalid value encountered in double_scalars.
gcor_all_data = all_gcov/(np.sqrt(all_hsq1)*np.sqrt(all_hsq2))

The same error is discussed here: https://stackoverflow.com/questions/27784528/numpy-division-with-runtimewarning-invalid-value-encountered-in-double-scalars

"You can't solve it. Simply answer1.sum()==0, and you can't perform a division by zero.

This happens because answer1 is the exponential of 2 very large, negative numbers, so that the result is rounded to zero.

nan is returned in this case because of the division by zero.

Now to solve your problem you could:

go for a library for high-precision mathematics, like mpmath. But that's less fun.
as an alternative to a bigger weapon, do some math manipulation, as detailed below.
go for a tailored scipy/numpy function that does exactly what you want! Check out @warren Weckesser answer."

So, is there any solution? Any advice in that case?

Thank you in advance,
Alvaro
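
The expression in the warning divides by `np.sqrt(all_hsq1)*np.sqrt(all_hsq2)`, so a `nan` is expected whenever either heritability sum is negative or zero: NumPy returns `nan` for the square root of a negative number, and 0/0 is also `nan`. Whether that is what happened for Trait1 is an assumption. A minimal pure-Python sketch of the same formula with an explicit guard; the function name is hypothetical:

```python
import math


def genetic_correlation(gcov, hsq1, hsq2):
    """Return gcov / (sqrt(hsq1) * sqrt(hsq2)), or nan when either
    heritability estimate is not strictly positive."""
    if hsq1 <= 0 or hsq2 <= 0:
        return float("nan")
    return gcov / (math.sqrt(hsq1) * math.sqrt(hsq2))
```

Checking the two heritability sums first makes it clear whether the `nan` comes from the data (a non-positive estimate) rather than from floating-point underflow as in the quoted answer.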

Issue with chromosome 8 reference panel?

Hello,
I was trying to run HESS with the 1KG3 reference panel provided (https://ucla.box.com/shared/static/l8cjbl5jsnghhicn0gdej026x017aj9u.gz), and it runs successfully on all chromosomes but chromosome 8.
Whenever I try to load the bed file 1kg_eur_1pct_chr8.bed I get an error message from the 'refpanel.py' script, specifically in 'get_locus':

hess-0.5.4-beta/src/refpanel.py", line 84, in get_locus
start_idx = snpmap_locus.index.values[0]
IndexError: index 0 is out of bounds for axis 0 with size 0

Any thoughts on what this issue might be?

[INFO] 1 SNPs in locus chr12:8377536-9031395 IndexError: tuple index out of range

Hi,
For some chromosomes I am getting the error below for local h2 and rf. Please help me solve this problem. Thank you so much!
Here is my code:
hess/hess.py \
--bfile 1kg_eur_1pct/1kg_eur_1pct_chr12 \
--partition EUR/fourier_ls-chr12.bed \
--out step1 \
--chrom 12 \
--local-hsqg AD_hamo.txt
Error:
[INFO] bed file is open 1kg_eur_1pct/1kg_eur_1pct_chr12.bed
[INFO] 1892 SNPs in locus chr12:61107-1080331
[INFO] 2707 SNPs in locus chr12:1080331-2544786
[INFO] 2461 SNPs in locus chr12:2544786-3677037
[INFO] 1606 SNPs in locus chr12:3677037-4417679
[INFO] 1788 SNPs in locus chr12:4417679-5321472
[INFO] 2417 SNPs in locus chr12:5321472-6419753
[INFO] 863 SNPs in locus chr12:6419753-8377536
[INFO] 1 SNPs in locus chr12:8377536-9031395
Traceback (most recent call last):
File "hess/hess.py", line 217, in <module>
main()
File "hess/hess.py", line 35, in main
argmap['out'])
File "/home/ygl/hess/src/estimation.py", line 68, in local_hsqg_step1
ld_locus, sumstats_locus)
File "/home/ygl/hess/src/estimation.py", line 79, in local_hsqg_step1_helper
nsnp = ld.shape[0]
IndexError: tuple index out of range
I have already manually modified the bed files to remove the loci that cause errors, but I still get this problem.
Here is part of my modified bed file:
chr12 55665837 57548860
chr12 57548860 59308666
chr12 59308666 61123729
chr12 61123729 64032461
chr12 64032461 65559695
chr12 65559695 67181144
chr12 67181144 67909729
chr12 67909729 69826542
chr12 69826542 70957987
chr12 70957987 72645075
chr12 72645075 73818454
chr12 73818454 76511314
chr12 76511314 78570570
chr12 99305987 101447641
chr12 101447641 101862690
chr12 101862690 102964986
chr12 102964986 104848696
chr12 104848696 106436213
chr12 106436213 106958748
chr12 106958748 109025901
chr12 109025901 110336719
chr12 110336719 113263518
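
The traceback above is raised right after a locus with a single SNP is reported, so one way to find problem loci before running step 1 is to count SNPs per partition window. A minimal sketch, assuming half-open windows `[start, stop)` and a sorted list of base-pair positions for one chromosome; the function names and the default threshold are hypothetical:

```python
from bisect import bisect_left


def snps_per_locus(loci, positions):
    """Count SNPs in each (start, stop) window, treated as [start, stop).

    `positions` must be sorted in increasing order."""
    return [bisect_left(positions, stop) - bisect_left(positions, start)
            for start, stop in loci]


def sparse_loci(loci, positions, min_snps=2):
    """Return the windows that contain fewer than `min_snps` SNPs."""
    counts = snps_per_locus(loci, positions)
    return [locus for locus, n in zip(loci, counts) if n < min_snps]
```

The positions would need to be those SNPs that survive filtering against the summary statistics, since a window can be well covered in the reference panel and still end up with one SNP after filtering.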

keyerror:local_hsqg_manhattan.py

"KeyError: 'chr'"
[code]
python /home/ygl/HESS0.5.3/misc/2 \

--local-hsqg-est step2_trait1.txt
--out CD_local_hsqg.pdf
--trait-name CD
[error]
Traceback (most recent call last):
File "/home/ygl/HESS0.5.3/misc/2", line 100, in <module>
main()
File "/home/ygl/HESS0.5.3/misc/2", line 34, in main
even_chr_idx = index[np.where(hsq['chr']%2 == 0)]
File "/home/ygl/anaconda3/envs/hess/lib/python2.7/site-packages/pandas/core/frame.py", line 2927, in __getitem__
indexer = self.columns.get_loc(key)
File "/home/ygl/anaconda3/envs/hess/lib/python2.7/site-packages/pandas/core/indexes/base.py", line 2659, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 1601, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 1608, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'chr'
I have no idea why this error occurs or how to solve it. I hope someone can help me. Thanks!
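
The failing line is `hsq['chr']`, so the plotting script expects a column named `chr` in the file passed to `--local-hsqg-est`. A minimal sketch that reports which expected columns are absent from the header of a whitespace-delimited file; only `chr` comes from the traceback, and the function name is hypothetical:

```python
def missing_columns(path, required=("chr",)):
    """Return the required column names absent from the file's header row."""
    with open(path) as handle:
        header = handle.readline().split()
    return [name for name in required if name not in header]
```

A non-empty result for `step2_trait1.txt` would point to a header mismatch (for example a different column name or a missing header line) rather than a problem in the plotting code itself.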

refpanel in step 1

Hi there,

While running step 1, I came across an error saying the module "refpanel" was not found when executing estimation.py. I can't seem to find any module called refpanel.

Please could you help me resolve this?

How to remove loci with no SNPs?

Could you clarify how to properly and efficiently remove loci with no SNPs? This was not obvious to me and I don't want to make a mistake.

My best guess is to go back to the partition file (e.g., fourier_ls-chr1.bed) and remove the rows indicated by the .log file. However, because I'm assessing many correlations, this would mean generating a new set of partition files for every correlation. It seems it could be beneficial to add an optional --drop-missing-loci flag to Step 1 and Step 2 of the local genetic correlation routine. [See the FAQ below for context]

[ERROR] Rank of A less than the number of loci. There might be loci with no SNP. The second step of HESS and ρ-HESS involves inverting a matrix. If the matrix is not invertible, HESS and ρ-HESS will not attempt to estimate SNP-heritability or genetic covariance. Usually, this error is caused by an empty locus (a locus with no SNP in it). If there are empty loci in the data, it is recommended to remove these loci or combine them with nearby loci.
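
The FAQ text above suggests removing empty loci or combining them with nearby loci. A minimal sketch of the second option, assuming contiguous windows on one chromosome and a per-window SNP count that has already been computed; the function name is hypothetical and this is not the `--drop-missing-loci` flag proposed above:

```python
def merge_sparse_loci(loci, counts, min_snps=1):
    """Merge each window with fewer than `min_snps` SNPs into a neighbour.

    `loci` is a list of (start, stop) tuples sorted by position and
    `counts` the number of SNPs in each. A sparse window is folded into
    the window before it, or into the next one if nothing precedes it."""
    merged = []   # entries are [start, stop, count]
    carry = None  # leading sparse window waiting for a right-hand neighbour
    for (start, stop), n in zip(loci, counts):
        if carry is not None:
            start, n = carry[0], n + carry[1]
            carry = None
        if n >= min_snps:
            merged.append([start, stop, n])
        elif merged:
            merged[-1][1] = stop
            merged[-1][2] += n
        else:
            carry = (start, n)
    if carry is not None:
        # every window was sparse; keep one window spanning all of them
        merged.append([carry[0], loci[-1][1], carry[1]])
    return [(s, e) for s, e, _ in merged], [c for _, _, c in merged]
```

Because the merge depends on the SNP counts, a partition merged for one pair of traits is only reusable for another pair if the same windows are sparse there too, which is the inconvenience described in the question.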

Floating point exception

I ran into trouble in step 1; see below:
[INFO] Loaded 294114 SNPs with rs IDs and single-letter alleles on chromosome 17 from the GWAS summary data file
[INFO] 183946 SNPs left after filtering
[INFO] Loaded 500929 SNPs with rs IDs and single-letter alleles on chromosome 17 from the GWAS summary data file
[INFO] 182424 SNPs left after filtering
[INFO] Loading fam file /home/ygl/1kg_eur_1pct/1kg_eur_1pct_chr17.fam
[INFO] Loading bim file /home/ygl/1kg_eur_1pct/1kg_eur_1pct_chr17.bim
/home/ygl/anaconda3/envs/hess/lib/python2.7/site-packages/pysnptools/snpreader/snpreader.py:676: FutureWarning: Method .as_matrix will be removed in a future version. Use .values instead.
pos = fields.as_matrix([0,2,3])
/home/ygl/anaconda3/envs/hess/lib/python2.7/site-packages/pysnptools/snpreader/snpreader.py:625: FutureWarning: Conversion of the second argument of issubdtype from str to str is deprecated. In future, it will be treated as np.string_ == np.dtype(str).type.
assert np.issubdtype(self._row.dtype, str) and len(self._row.shape)==2 and self._row.shape[1]==2, "iid should be dtype str, have two dimensions, and the second dimension should be size 2"
/home/ygl/anaconda3/envs/hess/lib/python2.7/site-packages/pysnptools/snpreader/snpreader.py:626: FutureWarning: Conversion of the second argument of issubdtype from str to str is deprecated. In future, it will be treated as np.string_ == np.dtype(str).type.
assert np.issubdtype(self._col.dtype, str) and len(self._col.shape)==1, "sid should be of dtype of str and one dimensional"
[INFO] bed file is open /home/ygl/1kg_eur_1pct/1kg_eur_1pct_chr17.bed
[INFO] 2635 SNPs in locus chr17:56-1172399
Floating point exception

phenotypic correlation for step 3

Hi,

Thank you for developing this tool.
I am using the local genetic correlation section of the tool (rho-HESS) and have a question regarding accounting for phenotypic correlation which is required to be fed in to step 3.
In the manual (https://huwenboshi.github.io/hess-0.5/local_rhog/), it states that the genetic covariance intercept term from cross-trait LDSC provides an approximation of phenotypic correlation. As I am using publicly available GWAS data from different consortia which consist of multiple source studies, possibly including UK Biobank, I am unsure about the number of overlapping samples between the 2 input GWAS datasets.
Should I not use this LDSC intercept value instead of calculating the exact phenotypic correlation (for which I do not have exact numbers to run), or should I use phenotypic correlation = 0 (as used in the rho-HESS publication) purely because they aren't identical consortia?

Many thanks in advance!

Step 1 error

[INFO] @----------------------------------------------------------@
| HESS | v0.5 | 9/October/2017 |
|----------------------------------------------------------|
| (C) 2017 Huwenbo Shi, GNU General Public License, v3 |
|----------------------------------------------------------|
| For documentation, citation & bug-report instructions: |
| http://bogdan.bioinformatics.ucla.edu/software/hess/ |
@----------------------------------------------------------@
[INFO] Command started at: Tue, 29 May 2018 12:25:40
[INFO] Command issued:
/Users/sonia.shah/Documents/Software/hess-0.5.4-beta/hess.py
--bfile ukbEUR_imp_chr1_v2_imp_QC_HRC_random10Ksubset
--local-rhog trait1 trait2
--partition fourier_ls-chr1.bed
--out step1
--chrom 1
[INFO] Loaded 133 partitions on chromosome 1
[INFO] Average window size is 1873901
[INFO] 2256485 SNPs read from reference panel
[INFO] Loaded 636448 SNPs with rs IDs and single-letter alleles on chromosome 1 from the GWAS summary data file
Traceback (most recent call last):
File "/Users/sonia.shah/Documents/Software/hess-0.5.4-beta/hess.py", line 217, in <module>
main()
File "/Users/sonia.shah/Documents/Software/hess-0.5.4-beta/hess.py", line 45, in main
argmap['out'])
File "/Users/sonia.shah/Documents/Software/hess-0.5.4-beta/src/estimation.py", line 390, in local_rhog_step1
sumstats1.filter_sumstats(refpanel.get_map())
File "/Users/sonia.shah/Documents/Software/hess-0.5.4-beta/src/sumstats.py", line 168, in filter_sumstats
elif a1a0 in reverse[a1a2]: flip.append(i)
KeyError: 'tc'

HESS

I cannot install HESS on my Linux system, and there is no file similar to setup.py in the compressed package

Step 2 error

Hello,

I receive the following error when running step2 and specifying the h2 and se. Any assistance would be appreciated.

[INFO] Command started at: Wed, 13 Mar 2019 11:43:37
[INFO] Command issued:
hess.py
--tot-hsqg 0.087 0.004
--prefix step1
--out step2_total_h2

/HESS/hess-0.5.3-beta/src/estimation.py:126: FutureWarning: read_table is deprecated, use read_csv instead, passing sep='\t'.
compression='gzip', names=['start', 'stop', 'nsnp', 'rank', 'N'])
[INFO] Loaded results for 1703 loci from step 1
Traceback (most recent call last):
File "hess.py", line 217, in <module>
main()
File "hess.py", line 40, in main
argmap['gwse-thres'], argmap['tot-hsqg'], argmap['out'])
File "~/HESS/hess-0.5.3-beta/src/estimation.py", line 180, in local_hsqg_step2
max_num_eig, min_eigval, reinflate, gwse_thres, tot_hsq)
File "~/HESS/hess-0.5.3-beta/src/estimation.py", line 307, in local_hsqg_step2_helper_tot_hsq
if len(reinflate) > 1:
TypeError: object of type 'float' has no len()
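
The failing check is `len(reinflate) > 1`, which assumes a sequence while a float was passed. A minimal sketch of one possible local workaround, not the maintainers' fix: normalise the value to a list before the length check. The helper name is hypothetical:

```python
def as_list(value):
    """Return a list: lists and tuples are copied, any other value is
    wrapped in a one-element list."""
    if isinstance(value, (list, tuple)):
        return list(value)
    return [value]
```

With this helper, `len(as_list(reinflate)) > 1` evaluates without error for both a single float and a list of floats.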

step1 KeyError: 'ID'

[INFO] @----------------------------------------------------------@
| HESS | v0.5 | 9/October/2017 |
|----------------------------------------------------------|
| (C) 2017 Huwenbo Shi, GNU General Public License, v3 |
|----------------------------------------------------------|
| For documentation, citation & bug-report instructions: |
| http://bogdan.bioinformatics.ucla.edu/software/hess/ |
@----------------------------------------------------------@
[INFO] Command started at: Thu, 07 Dec 2023 09:05:39
[INFO] Command issued:
hess.py
--bfile /home/zhanya/hess-hess-0.5/1kg_eur_1pct/1kg_eur_1pct_chr22
--local-rhog /external/zhanya/hess/mdd3.txt /external/zhanya/hess/whr4.txt
--partition /home/zhanya/hess-hess-0.5/fourier_ls-all.bed
--out /external/zhanya/hess/step1
--chrom 22
/home/zhanya/hess-hess-0.5/src/refpanel.py:13: FutureWarning: read_table is deprecated, use read_csv instead, passing sep='\t'.
partition = pd.read_table(filename, delim_whitespace=True)
[INFO] Loaded 24 partitions on chromosome 22
[INFO] Average window size is 1466370
/home/zhanya/hess-hess-0.5/src/refpanel.py:61: FutureWarning: read_table is deprecated, use read_csv instead, passing sep='\t'.
names=['SNP', 'BP', 'A0', 'A1'])
[INFO] 123295 SNPs read from reference panel
[INFO] Loaded 132836 SNPs with rs IDs and single-letter alleles on chromosome 22 from the GWAS summary data file
Traceback (most recent call last):
File "hess.py", line 217, in <module>
main()
File "hess.py", line 45, in main
argmap['out'])
File "/home/zhanya/hess-hess-0.5/src/estimation.py", line 390, in local_rhog_step1
sumstats1.filter_sumstats(refpanel.get_map())
File "/home/zhanya/hess-hess-0.5/src/sumstats.py", line 168, in filter_sumstats
elif a1a0 in reverse[a1a2]: flip.append(i)
KeyError: 'ID'
This code only ran into a problem in the 22nd iteration of the loop, reporting KeyError: 'ID'. I used the same code to run the other 21 chromosomes, and I don't know where the problem occurred.
Here is my code:
for chrom in $(seq 22)
do
python hess.py \
--local-rhog /external/zhanya/hess/mdd3.txt /external/zhanya/hess/whr4.txt \
--chrom $chrom \
--bfile /home/zhanya/hess-hess-0.5/1kg_eur_1pct/1kg_eur_1pct_chr$chrom \
--partition /home/zhanya/hess-hess-0.5/fourier_ls-all.bed \
--out /external/zhanya/hess/step1
done

Step 2 different total SNP heritability estimates

Hello, sorry if the answer is quite obvious but I have a question.

  1. Why is the total SNP-heritability estimate that appears in the log file different from the one I provided?

Here is my command, where the total SNP h^2 = 0.2354, SE = 0.0153

python ~/Softwares/hess-0.5.3-beta/hess.py --prefix step1 --tot-hsqg 0.2354 0.0153 --out step2

And here is the log file

[INFO] @----------------------------------------------------------@
| HESS | v0.5 | 9/October/2017 |
|----------------------------------------------------------|
| (C) 2017 Huwenbo Shi, GNU General Public License, v3 |
|----------------------------------------------------------|
| For documentation, citation & bug-report instructions: |
| http://bogdan.bioinformatics.ucla.edu/software/hess/ |
@----------------------------------------------------------@
[INFO] Command started at: Thu, 01 Nov 2018 16:51:56
[INFO] Command issued:
/home/marianar/Softwares/hess-0.5.3-beta/hess.py
--tot-hsqg 0.2354 0.0153
--prefix step1
--out step2
[INFO] Loaded results for 1703 loci from step 1
[INFO] Using 4661012 SNPs with average sample size 53293.0
[INFO] Re-inflate the summary statistics with lambda_gc: 1
[INFO] Total SNP-heritability estimate: 0.247 (0.0251)
[INFO] Command finished at: Thu, 01 Nov 2018 16:51:59

But the log file says Total SNP-heritability estimate: 0.247 (0.0251).
I know it is not such a big difference, but I have done the same for other traits and some have changed much more.

Thanks

Dealing with loci with no SNP

Hi Huwenbo-

@shihuwenbo @huwenboshi I am trying to work this out for step 2, but I am getting an error:

[INFO] Loaded results for 1703 loci from step 1
[INFO] Using 1066727 SNPs with average sample size 8965.0
[INFO] Re-inflate the summary statistics with lambda_gc: 1
[ERROR] Rank of A less than the number of loci. There might be loci with no SNP.

Following the FAQ, I made sure the sumstats file doesn't have any NA in the SNP column. What can I do to make this run? Which SNPs should be removed, and from where? Is the --drop-missing-loci option implemented yet?

Thanks Huwenbo

Interpretation of results

Hi! I used ρHESS with GWAS summary statistics for two diseases that have a positive genome-wide correlation. However, ρHESS found negative local genetic correlations at the two most significant loci. I would appreciate your comments on how to interpret this kind of mixed result (positive genome-wide correlation but negative genetic correlation at the significant loci).

Thanks!

SNPs

Hi,

Thanks a lot for the very nice software. I have a stupid question. We want to calculate genetic correlations using GWAS data from FinnGen and UK Biobank. The FinnGen study includes a very high density of SNPs, so the analysis takes a very long time. Is it possible to keep only the FinnGen SNPs that are also present in the UK Biobank data during the analysis, and would that affect the results? Thanks a lot.

Lee
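
Restricting two summary statistics files to the SNPs they share is a set intersection on SNP identifiers. A minimal sketch on in-memory rows, assuming each row is a dict with an `SNP` key holding the rs ID (the key name is an assumption, and the function name is hypothetical); whether the restriction changes the estimates is not something this sketch can answer:

```python
def restrict_to_shared_snps(rows1, rows2, key="SNP"):
    """Keep only rows whose SNP identifier appears in both lists,
    preserving the original order of each list."""
    shared = {row[key] for row in rows1} & {row[key] for row in rows2}
    return ([row for row in rows1 if row[key] in shared],
            [row for row in rows2 if row[key] in shared])
```

Both inputs must be lists (or other re-iterable sequences), because each is traversed twice.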

#Step1-IndexError: tuple index out of range

Hello,
I'm receiving an error when running step1 for estimating local genetic covariance and correlation and I wondered if you could help me figure it out?

[INFO] Command issued:
hess.py
--bfile 1kg_eur_1pct/1kg_eur_1pct_chr15
--local-rhog 1.txt 2.txt
--partition EUR/fourier_ls-chr15.bed
--out LC/step1
--chrom 15
[INFO] Loaded 50 partitions on chromosome 15
[INFO] Average window size is 1650395
[INFO] 248416 SNPs read from reference panel
[INFO] Loaded 65093 SNPs with rs IDs and single-letter alleles on chromosome 15 from the GWAS summary data file
[INFO] 54792 SNPs left after filtering
[INFO] Loaded 217307 SNPs with rs IDs and single-letter alleles on chromosome 15 from the GWAS summary data file
[INFO] 175560 SNPs left after filtering
[INFO] Loading fam file 1kg_eur_1pct/1kg_eur_1pct_chr15.fam
[INFO] Loading bim file 1kg_eur_1pct/1kg_eur_1pct_chr15.bim
/Users/wds/opt/anaconda3/envs/hess/lib/python2.7/site-packages/pysnptools/snpreader/snpreader.py:625: FutureWarning: Conversion of the second argument of issubdtype from str to str is deprecated. In future, it will be treated as np.string_ == np.dtype(str).type.
assert np.issubdtype(self._row.dtype, str) and len(self._row.shape)==2 and self._row.shape[1]==2, "iid should be dtype str, have two dimensions, and the second dimension should be size 2"
/Users/wds/opt/anaconda3/envs/hess/lib/python2.7/site-packages/pysnptools/snpreader/snpreader.py:626: FutureWarning: Conversion of the second argument of issubdtype from str to str is deprecated. In future, it will be treated as np.string_ == np.dtype(str).type.
assert np.issubdtype(self._col.dtype, str) and len(self._col.shape)==1, "sid should be of dtype of str and one dimensional"
[INFO] bed file is open 1kg_eur_1pct/1kg_eur_1pct_chr15.bed
[INFO] 1 SNPs in locus chr15:20001200-21131604
Traceback (most recent call last):
File "hess.py", line 217, in
main()
File "hess.py", line 45, in main
argmap['out'])
File "/Users/wds/Desktop/hess/src/estimation.py", line 427, in local_rhog_step1
ld_locus, sumstats1_locus)
File "/Users/wds/Desktop/hess/src/estimation.py", line 79, in local_hsqg_step1_helper
nsnp = ld.shape[0]
IndexError: tuple index out of range
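
The message `tuple index out of range` at `nsnp = ld.shape[0]` means that `ld.shape` was an empty tuple, i.e. `ld` was a scalar rather than a matrix. The preceding log line reports a locus with only 1 SNP, so one plausible but unconfirmed reading is that the LD matrix for a single SNP collapsed to a scalar. The sketch below reproduces the failure with a plain tuple standing in for `ld.shape`; `nsnp_from_shape` and `index_error_message` are hypothetical helpers, not HESS code.

```python
def nsnp_from_shape(shape):
    """Number of SNPs implied by an LD array shape.

    A 0-d (scalar) LD value has shape == (), which is treated here as a
    single SNP instead of raising IndexError.
    """
    return shape[0] if len(shape) > 0 else 1

def index_error_message(shape):
    """Return the error text produced by shape[0], or None if it succeeds."""
    try:
        shape[0]
    except IndexError as err:
        return str(err)
    return None
```

If that reading is right, a possible workaround is to make sure every window keeps at least two SNPs after filtering, for example by inspecting the window `chr15:20001200-21131604` in the partition file.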

error in step1

Hello,
this is my code. Please help me check where the problem is and how to solve it. Thank you very much.
(python2.7_env) [zhanya@mu01 hess-hess-0.5]$ for chrom in $(seq 22)

do
python hess.py
--local-hsqg /external/zhanya/hess/mdd3.txt --local-hsqg /external/zhanya/hess/whr4.txt
--chrom $chrom
--bfile /home/zhanya/hess-hess-0.5/1kg_eur_1pct/ 1kg_eur_1pct_chr$chrom
--partition /home/zhanya/hess-hess-0.5/partition.txt
--out /external/zhanya/hess/step1
done
usage: hess.py [-h] [--bfile BFILE] [--chrom CHROM] [--partition PARTITION]
[--local-hsqg LOCAL-HSQG] [--local-rhog LOCAL-RHOG LOCAL-RHOG]
[--tot-hsqg TOT-HSQG TOT-HSQG] [--num-shared NUM-SHARED]
[--pheno-cor PHENO-COR]
[--local-hsqg-est LOCAL-HSQG-EST LOCAL-HSQG-EST]
[--prefix PREFIX] [--max-num-eig MAX-NUM-EIG]
[--min-eigval MIN-EIGVAL] [--min-maf MIN-MAF]
[--reinflate-lambda-gc [REINFLATE-LAMBDA-GC [REINFLATE-LAMBDA-GC ...]]]
[--gwse-thres GWSE-THRES] --out OUT
hess.py: error: unrecognized arguments: 1kg_eur_1pct_chr1
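
The `unrecognized arguments: 1kg_eur_1pct_chr1` message is consistent with the space between `1kg_eur_1pct/` and `1kg_eur_1pct_chr$chrom` in the `--bfile` value: the shell splits on that space, so the file prefix arrives as a separate argument. The sketch below only illustrates that tokenization using Python's standard `shlex` and `posixpath` modules; `tokenize` and `bfile_prefix` are hypothetical helpers.

```python
import posixpath
import shlex

def tokenize(command_fragment):
    """Split a command-line fragment the way a POSIX shell would."""
    return shlex.split(command_fragment)

def bfile_prefix(directory, chrom):
    """Build a PLINK file prefix for one chromosome without stray spaces."""
    return posixpath.join(directory, "1kg_eur_1pct_chr{}".format(chrom))
```

With the space present, `tokenize` returns three tokens for the `--bfile` fragment; with the prefix built by `bfile_prefix` it returns two. The usage text also lists `--local-hsqg` as taking a single file, so passing it twice in one call may not do what was intended.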

install HESS

Can you provide me with detailed instructions on how to install HESS? I'm having trouble installing it. Thank you very much!

estimate_lambdagc.py output

Hello!
I'm using the script for estimating the genomic control factor (estimate_lambdagc.py) after running step 1, but when I run the script, it doesn't produce any output.
I've checked the standard error file, but it was completely empty, and the standard output file says that it ran without an issue.
Is it supposed to create a new file? Where can I find the inferred genomic control factors?
(It also runs within seconds.)

1kg_eur_1pct_chr

Hi! I encountered a problem in the HESS analysis. I don't know how to download the reference files (1kg_eur_1pct_chr). Could someone please provide a link to download them?

"KeyError:1614" was reported when using [contrast_polygenicity.py] to make the plot among five traits.

"KeyError:1614" was reported when using [contrast_polygenicity.py] to make the plot among five traits.
【code】
python ./hess/misc/contrast_polygenicity.py
--local-hsqg-est trait1.txt trait2.txt trait3.txt trait4.txt trait5.txt
--show-se --no-negative --trait-names trait1 trait2 trait3 trait4 trait5
--out .five_traits_contrast.pdf
【Error】
Traceback (most recent call last):
File "./hess0.5/misc/contrast_polygenicity.py", line 146, in
main()
File "./hess0.5/misc/contrast_polygenicity.py", line 77, in main
total_hsq_jk = all_total_hsq[i]-all_local_hsq_sorted[i][k]
File "/lustre/home/acct-clsjj/clsjj/.local/lib/python2.7/site-packages/pandas/core/series.py", line 868, in __getitem__
result = self.index.get_value(self, key)
File "/home/../python2.7/site-packages/pandas/core/indexes/base.py", line 4375, in get_value
tz=getattr(series.dtype, 'tz', None))
File "pandas/_libs/index.pyx", line 81, in pandas._libs.index.IndexEngine.get_value
File "pandas/_libs/index.pyx", line 89, in pandas._libs.index.IndexEngine.get_value
File "pandas/_libs/index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 987, in pandas._libs.hashtable.Int64HashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 993, in pandas._libs.hashtable.Int64HashTable.get_item
KeyError: 1614
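
The traceback shows a pandas Series being indexed with the integer `k`, which pandas resolves here as an index label, so `KeyError: 1614` means that no row labelled 1614 exists in that Series. One possible cause (an assumption, not confirmed by the source) is that the five `--local-hsqg-est` files do not all contain the same number of loci. A small standard-library check, with `count_loci` and `mismatched_files` as hypothetical helper names:

```python
def count_loci(lines):
    """Count non-empty data rows, excluding the header line."""
    return sum(1 for ln in lines[1:] if ln.strip())

def mismatched_files(tables):
    """Given {name: lines}, return names whose row count differs
    from the most common row count."""
    counts = {name: count_loci(lines) for name, lines in tables.items()}
    values = list(counts.values())
    expected = max(set(values), key=values.count)
    return sorted(name for name, n in counts.items() if n != expected)
```

Each value in `tables` is the list of lines from one estimate file, for example `open("trait1.txt").read().splitlines()`. An empty result means the row counts agree and the cause lies elsewhere.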

GC lambda

“We provide a simple script (misc/estimate_lambdagc.py) to infer the genomic control factor to reinflate the summary statistics if it is unknown. The following script should be executed after step 1 completes.” I wonder if this estimate of GC lambda is the same as the GC lambda from ldsc.py?
For example:
Call:
./ldsc.py
--no-check-alleles
--h2 ./Trait5.sumstats.gz
--ref-ld-chr ./eur_w_ld_chr/
--out ./Trait5_h2
--w-ld-chr ./eur_w_ld_chr/
Total Observed scale h2: 0.002 (0.0013)
Lambda GC: 1.0046
Mean Chi^2: 1.0109
Intercept: 0.9948 (0.0071)
Ratio < 0 (usually indicates GC correction).
Analysis finished at Mon Mar 4 15:11:53 2024
Total time elapsed: 30.09s
I'm asking this because the GC lambda estimate from estimate_lambdagc.py is different from the GC lambda reported by ldsc.py.
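
For reference, the genomic control factor is conventionally defined as the median of the observed chi-square statistics divided by the median of a chi-square distribution with one degree of freedom (about 0.4549). The sketch below computes that conventional quantity from Z-scores. It is not taken from `estimate_lambdagc.py`, and the source does not say whether that script uses the same definition, so the two numbers need not agree. `lambda_gc_from_z` is a hypothetical name, and the code is written for Python 3 because it uses the `statistics` module.

```python
from statistics import median

CHI2_1DF_MEDIAN = 0.4549364  # median of a chi-square distribution with 1 d.f.

def lambda_gc_from_z(z_scores):
    """Conventional genomic control factor: median(Z^2) / median(chi2_1)."""
    chi2 = [z * z for z in z_scores]
    return median(chi2) / CHI2_1DF_MEDIAN
```

Running this on the Z column of the same summary statistics given to ldsc.py gives a number to compare against its reported `Lambda GC`, provided both use the same set of SNPs.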

IndexError: list index out of range for local rg

Hi,
For some chromosomes I am getting the error below for local rg. Please provide your input.

[INFO] 204309 SNPs read from reference panel
[INFO] Loaded 286860 SNPs with rs IDs and single-letter alleles on chromosome 19 from the GWAS summary data file
[INFO] 160989 SNPs left after filtering
Traceback (most recent call last):
File "/data2/nenduru/software/hess-0.5.4-beta/hess-0.5.4-beta/hess.py", line 217, in
main()
File "/data2/nenduru/software/hess-0.5.4-beta/hess-0.5.4-beta/hess.py", line 45, in main
argmap['out'])
File "/data2/nenduru/software/hess-0.5.4-beta/hess-0.5.4-beta/src/estimation.py", line 391, in local_rhog_step1
sumstats2 = SumStats(sumstats_fnm[1], chrom)
File "/data2/nenduru/software/hess-0.5.4-beta/hess-0.5.4-beta/src/sumstats.py", line 113, in __init__
idx = idx_map[name]; val = cols[idx]
IndexError: list index out of range

thanks,
Nitesh
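
`idx = idx_map[name]; val = cols[idx]` failing with `list index out of range` suggests that some line in the second summary-statistics file has fewer fields than the header promises, for example a truncated row. This is an inference from the traceback, not a confirmed diagnosis. The sketch below lists such lines, assuming whitespace-delimited input with a header line; `short_rows` is a hypothetical helper.

```python
def short_rows(lines):
    """Return (line_number, n_fields) for data rows with fewer fields
    than the header.

    Line numbers are 1-based and count the header as line 1.
    """
    n_expected = len(lines[0].split())
    bad = []
    for lineno, line in enumerate(lines[1:], start=2):
        n_fields = len(line.split())
        if n_fields < n_expected:
            bad.append((lineno, n_fields))
    return bad
```

Calling `short_rows(open(path).read().splitlines())` on the second file reports the offending line numbers, which can then be checked by hand. Blank lines are reported with a field count of 0.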
