
hicool's Introduction

HiCool

The HiCool R/Bioconductor package provides an end-to-end interface to process and normalize Hi-C paired-end fastq reads into .(m)cool files.

  1. The heavy lifting (fastq mapping, pairs parsing and pairs filtering) is performed by the underlying lightweight hicstuff python library (https://github.com/koszullab/hicstuff).
  2. Pairs filtering is done using the approach described in Cournac et al., 2012 and implemented in hicstuff.
  3. The cooler library (https://github.com/open2c/cooler) is used to parse pairs into a multi-resolution, balanced .mcool file. .(m)cool is a compact, indexed HDF5 file format specifically tailored for efficiently storing Hi-C data. The .(m)cool file format was developed by Abdennur and Mirny and published in 2019.
  4. Internally, all these external dependencies are automatically installed and managed in R by a basilisk environment.

Processing .fastq paired-end files into a .mcool Hi-C contact matrix

The main processing function offered in this package is HiCool(). One simply needs to specify:

  • The path to each fastq file;
  • The genome reference, as a .fasta sequence, a pre-computed bowtie2 index or a supported ID (hg38, mm10, dm6, R64-1-1, WBcel235, GRCz10, Galgal4);
  • The restriction enzyme(s) used for Hi-C.

library(HiCool)
x <- HiCool(
    r1 = '<PATH-TO-R1.fq.gz>', 
    r2 = '<PATH-TO-R2.fq.gz>', 
    restriction = 'DpnII,HinfI', 
    genome = 'R64-1-1'
)
## HiCool :: Recovering bowtie2 genome index from AWS iGenomes...
## HiCool :: Initiating processing of fastq files [tmp folder: /tmp/RtmpARIRQo/DZ28I8]...
## HiCool :: Mapping fastq files...
## HiCool :: Best-suited minimum resolution automatically inferred: 1000
## HiCool :: Remove unwanted chromosomes...
## HiCool :: Generating multi-resolution .mcool file...
## HiCool :: Balancing .mcool file...
## HiCool :: Tidying up everything for you...
## HiCool :: .fastq to .mcool processing done!
## HiCool :: Check /home/rsg/repos/HiCool/HiCool folder to find the generated files
## HiCool :: Generating HiCool report. This might take a while.
## HiCool :: Report generated and available @ sample^mapped-R64-1-1^DZ28I8.html
## HiCool :: All processing successfully achieved. Congrats!
x
## CoolFile object
##   .mcool file: sample^mapped-R64-1-1^55IONQ.mcool
##   resolution: 1000
##   pairs file: sample^55IONQ.pairs
##   metadata(3): log args stats
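
Since a run can take a while, it may help to sanity-check the three inputs before calling HiCool(). The snippet below is a minimal sketch in base R, not part of HiCool: `check_hicool_inputs` and `supported_ids` are hypothetical names introduced here, and the list of genome IDs is copied from the list above. A genome that is neither a supported ID nor an existing file is only flagged, since it may still be a valid bowtie2 index prefix.

```r
# Genome IDs listed as supported in this README
supported_ids <- c("hg38", "mm10", "dm6", "R64-1-1", "WBcel235", "GRCz10", "Galgal4")

# Hypothetical helper (not a HiCool function): returns a character vector
# describing every problem found; an empty vector means nothing was flagged.
check_hicool_inputs <- function(r1, r2, genome, restriction) {
    problems <- character(0)
    # Both fastq files must exist on disk
    for (f in c(r1, r2)) {
        if (!file.exists(f)) {
            problems <- c(problems, paste("fastq file not found:", f))
        }
    }
    # Genome: a supported ID or an existing file (e.g. a .fasta)
    if (!(genome %in% supported_ids) && !file.exists(genome)) {
        problems <- c(problems, paste(
            "genome is neither a supported ID nor an existing file",
            "(it may still be a bowtie2 index prefix):", genome
        ))
    }
    # Restriction enzymes: comma-separated, no empty entries
    enzymes <- strsplit(restriction, ",", fixed = TRUE)[[1]]
    if (length(enzymes) == 0 || any(!nzchar(trimws(enzymes)))) {
        problems <- c(problems, "restriction must be a comma-separated list of enzyme names")
    }
    problems
}

# Usage: inspect the returned messages before launching HiCool()
check_hicool_inputs(
    r1 = "<PATH-TO-R1.fq.gz>",
    r2 = "<PATH-TO-R2.fq.gz>",
    genome = "R64-1-1",
    restriction = "DpnII,HinfI"
)
```

With the placeholder paths shown here, the helper reports both fastq files as missing; with real paths it should return an empty vector.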

Output files

## HiCool/
## |-- sample^mapped-R64-1-1^55IONQ.html
## |-- logs
## |   |-- sample^mapped-R64-1-1^55IONQ.log
## |-- matrices
## |   |-- sample^mapped-R64-1-1^55IONQ.mcool
## |-- pairs
## |   |-- sample^mapped-R64-1-1^55IONQ.pairs
## `-- plots
##     |-- sample^mapped-R64-1-1^55IONQ_event_distance.pdf
##     |-- sample^mapped-R64-1-1^55IONQ_event_distribution.pdf
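
Every file in the tree above carries the run hash after a `^` separator, so the files belonging to one run can be collected with base R alone. This is a minimal sketch under that naming assumption; `find_hicool_files` is a hypothetical helper, not a HiCool function.

```r
# Hypothetical helper (not part of HiCool): list the files written for one run,
# grouped by the subfolder they live in ("." is the top of the output folder).
find_hicool_files <- function(output = "HiCool", hash) {
    files <- list.files(output, recursive = TRUE, full.names = FALSE)
    # Match the literal string "^<hash>" in the file name (fixed = TRUE, so
    # "^" is not treated as a regex anchor)
    hits <- files[grepl(paste0("^", hash), basename(files), fixed = TRUE)]
    split(hits, dirname(hits))
}

# Usage, with the folder and hash shown in this README
find_hicool_files("HiCool", hash = "55IONQ")
```

If the folder does not exist or holds no file for that hash, the helper returns an empty list rather than raising an error.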

Reporting

On top of processing fastq reads, HiCool provides convenient reports for single/multiple sample(s).

x <- importHiCoolFolder(output = 'HiCool/', hash = '55IONQ')
HiCReport(x)

Installation

As an R/Bioconductor package, HiCool should be very easy to install. The only dependency is R (>= 4.2). In R, one can run:

if (!require("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("HiCool")
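
Before installing, one may want to confirm that the running session meets the R (>= 4.2) requirement stated above. The check below is a small sketch in base R; `r_ok` is just an illustrative variable name.

```r
# Compare the running R version against the minimum stated in this README
r_ok <- getRversion() >= "4.2"
if (!r_ok) {
    warning(
        "HiCool requires R >= 4.2; this session runs R ",
        as.character(getRversion())
    )
}
```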

The first time HiCool() is executed, a basilisk environment is set up automatically. A few dependencies are installed in this environment:

  • python (pinned 3.9.1)
  • numpy (pinned 1.23.4)
  • bowtie2 (pinned 2.4.5)
  • samtools (pinned 1.7)
  • hicstuff (pinned 3.1.5)
  • cooler (pinned 0.8.11)

HiCExperiment ecosystem

HiCool is integrated within the HiCExperiment ecosystem in Bioconductor. Read more about the HiCExperiment class and handling Hi-C data in R here.

  • HiCExperiment: Parsing Hi-C files in R
  • HiCool: End-to-end integrated workflow to process fastq files into .cool and .pairs files
  • HiContacts: Investigating Hi-C results in R
  • HiContactsData: Data companion package
  • fourDNData: Gateway package to 4DN-hosted Hi-C experiments

hicool's People

Contributors

js2264, jwokaty

hicool's Issues

Error in Generating multi-resolution .mcool file

Hi @js2264,

When running the HiCool function like this:
x <- HiCool(
    r1 = './MKL30_Merged_R1.fastq.gz',
    r2 = './MKL30_Merged_R2.fastq.gz',
    restriction = 'HpaII',
    resolutions = c(500, 1000, 2000, 5000, 10000),
    genome = './genome_SL1344.fasta',
    output = './HiCool_MKL30_Merged/'
)

I get this error:
HiCool :: Generating multi-resolution .mcool file...
Error in py_call_impl(callable, call_args$unnamed, call_args$named) :
TypeError: 'int' object is not iterable
Run reticulate::py_last_error() for details.
Calls: HiCool ... -> force -> -> py_call_impl
Execution halted

I suspect the problem lies in the fastq files, because it works on some fastq files and not on others.
Do you have any clue?
Thanks in advance

Anakim

Error executing getLoops()

Dr. Serizay,

When I execute getLoops() on objects imported by HiCExperiment from a .mcool file, I get the following error.

Ctrl <- import("Ctrl.mcool", format = 'mcool', resolution = 160000)
refocus(Ctrl, "2:11002807-12093193") %>% zoom(10000) %>% HiCool::getLoops()

PackagesNotFoundError: The following packages are not available from current channels:

  • samtools=1.16.1
  • bowtie2=2.5.0

Current channels:

To search for alternate channels that may provide the conda package you're
looking for, navigate to

https://anaconda.org

and use the search bar at the top of the page.

Error: one or more Python packages failed to install [error code 1]

How can I solve this? My platform is Windows.

Thank you.

Compatible with Windows?

Hello,

Thanks for building this package! Is it currently compatible with R on Windows? I've tried to install HiCool and get the following error:

* installing *source* package 'HiCool' ...
** using non-staged installation via StagedInstall field
'sh' is not recognized as an internal or external command,
operable program or batch file.
Warning in system2(file.path(R.home("bin"), "Rcmd.exe"), c("config", x),  :
  running command '"C:/PROGRA~1/R/R-44~1.0/bin/x64/Rcmd.exe" config CC' had status 1
'sh' is not recognized as an internal or external command,
operable program or batch file.
Warning in system2(file.path(R.home("bin"), "Rcmd.exe"), c("config", x),  :
  running command '"C:/PROGRA~1/R/R-44~1.0/bin/x64/Rcmd.exe" config CFLAGS' had status 1
'sh' is not recognized as an internal or external command,
operable program or batch file.
Warning in system2(file.path(R.home("bin"), "Rcmd.exe"), c("config", x),  :
  running command '"C:/PROGRA~1/R/R-44~1.0/bin/x64/Rcmd.exe" config CXX' had status 1
'sh' is not recognized as an internal or external command,
operable program or batch file.
Warning in system2(file.path(R.home("bin"), "Rcmd.exe"), c("config", x),  :
  running command '"C:/PROGRA~1/R/R-44~1.0/bin/x64/Rcmd.exe" config CXXFLAGS' had status 1
'sh' is not recognized as an internal or external command,
operable program or batch file.
Warning in system2(file.path(R.home("bin"), "Rcmd.exe"), c("config", x),  :
  running command '"C:/PROGRA~1/R/R-44~1.0/bin/x64/Rcmd.exe" config CPPFLAGS' had status 1
'sh' is not recognized as an internal or external command,
operable program or batch file.
Warning in system2(file.path(R.home("bin"), "Rcmd.exe"), c("config", x),  :
  running command '"C:/PROGRA~1/R/R-44~1.0/bin/x64/Rcmd.exe" config LDFLAGS' had status 1
'sh' is not recognized as an internal or external command,
operable program or batch file.
Warning in system2(file.path(R.home("bin"), "Rcmd.exe"), c("config", x),  :
  running command '"C:/PROGRA~1/R/R-44~1.0/bin/x64/Rcmd.exe" config FC' had status 1
'sh' is not recognized as an internal or external command,
operable program or batch file.
Warning in system2(file.path(R.home("bin"), "Rcmd.exe"), c("config", x),  :
  running command '"C:/PROGRA~1/R/R-44~1.0/bin/x64/Rcmd.exe" config FCFLAGS' had status 1
Error in (function (...)  : 'names' and 'val' are of different lengths
* removing 'C:/Users/rhackley/AppData/Local/R/win-library/4.4/HiCool'
The downloaded binary packages are in C:\Users\rhackley\AppData\Local\Temp\RtmpaiKLq0\downloaded_packages
Installation paths not writeable, unable to update packages 
path: C:/Program Files/R/R-4.4.0/library
Warning message:
In install.packages(...) :
  installation of package ‘HiCool’ had non-zero exit status

I have WSL2 activated on this machine, and have tried setting the default terminal in R to bash with python integration enabled, but the error persists.

>sessionInfo()
R version 4.4.0 (2024-04-24 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 22621)

Matrix products: default


locale:
[1] LC_COLLATE=English_United States.utf8  LC_CTYPE=English_United States.utf8   
[3] LC_MONETARY=English_United States.utf8 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.utf8    

time zone: America/New_York
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] lubridate_1.9.3 forcats_1.0.0   stringr_1.5.1   dplyr_1.1.4     purrr_1.0.2     readr_2.1.5     tidyr_1.3.1    
 [8] tibble_3.2.1    ggplot2_3.5.1   tidyverse_2.0.0

loaded via a namespace (and not attached):
 [1] gtable_0.3.5      compiler_4.4.0    tidyselect_1.2.1  scales_1.3.0      yaml_2.3.8        fastmap_1.2.0    
 [7] R6_2.5.1          generics_0.1.3    knitr_1.46        munsell_0.5.1     pillar_1.9.0      tzdb_0.4.0       
[13] rlang_1.1.3       utf8_1.2.4        stringi_1.8.4     xfun_0.44         timechange_0.3.0  cli_3.6.2        
[19] withr_3.0.0       magrittr_2.0.3    digest_0.6.35     grid_4.4.0        rstudioapi_0.16.0 hms_1.1.3        
[25] lifecycle_1.0.4   vctrs_0.6.5       evaluate_0.23     glue_1.7.0        fansi_1.0.6       colorspace_2.1-0 
[31] rmarkdown_2.27    tools_4.4.0       pkgconfig_2.0.3   htmltools_0.5.8.1
