andreaskapou / melissa

Bayesian Clustering and Imputation of Single Cell Methylomes

License: GNU General Public License v3.0

R 100.00%
imputation clustering methylation variational-inference bayesian-inference

melissa's Introduction

Melissa: Bayesian clustering and imputation of single cell methylomes

New technologies enabling the measurement of DNA methylation at the single cell level are promising to revolutionise our understanding of epigenetic control of gene expression. Yet, intrinsic limitations of the technology result in very sparse coverage of CpG sites (around 5% to 20% coverage), effectively limiting the analysis repertoire to a semi-quantitative level.

Melissa (MEthyLation Inference for Single cell Analysis) is a Bayesian hierarchical method to quantify spatially-varying methylation profiles across genomic regions from single-cell bisulfite sequencing data (scBS-seq). Melissa clusters individual cells based on local methylation patterns, enabling the discovery of epigenetic differences and commonalities among individual cells. The clustering also acts as an effective regularisation method for imputing methylation at unassayed CpG sites, enabling transfer of information between individual cells.

The probabilistic graphical representation of the Melissa model is shown below:

Installation

To get the latest development version from GitHub:

# install.packages("devtools")
devtools::install_github("andreaskapou/Melissa", build_vignettes = TRUE)

Melissa dependence

Melissa depends heavily on the BPRMeth package, which is available on Bioconductor (http://bioconductor.org/packages/BPRMeth/) and GitHub (https://github.com/andreaskapou/BPRMeth).
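Before installing Melissa it can help to check which of the required packages are already present. The snippet below is a minimal sketch: missing_packages is a hypothetical helper (not part of Melissa or BPRMeth), and the commented install call assumes the BiocManager package is available.

```r
# Hypothetical helper: return the names in `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(c("BPRMeth", "Melissa"))

# If BPRMeth is reported missing, one option is to install it from Bioconductor:
# if ("BPRMeth" %in% to_install) BiocManager::install("BPRMeth")
print(to_install)
```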

Archive repository

There is also an archived version of the Melissa model (GitHub repo https://github.com/andreaskapou/Melissa-archive) for reproducing the results presented in the Kapourani and Sanguinetti (2018) bioRxiv paper cited below.

Citation

Kapourani, C.-A. and Sanguinetti, G. (2018). Melissa: Bayesian clustering and imputation of single cell methylomes, bioRxiv.

melissa's People

Contributors

andreaskapou, nturaga

melissa's Issues

Question about synthetic data and data with known clusters

Hi, thanks for the great tool. I have two questions:

  1. Are synthetic data necessary for running Melissa? Do they perhaps serve as a training set? I read the vignette "2: Cluster and impute scBS-seq data using Melissa", section "3.2 Loading synthetic data", and got confused.
  2. Can I impute cells that have already been clustered into two groups? I have 60 cells, clustered into 2 groups based on scRNA data, and I want to investigate the scBS-seq profiles of some specific genes (and gene regions) between the two groups. Will it work? Could you kindly give me some suggestions?

Thank you!

Issues with binarise_files()

I don't use R regularly, so I might be missing something crucial here.

When I use binarise_files, the .process_bismark_file function can't open the file that is passed to it. I guess that's because it only gets the file name, not the absolute path to it. Setting my working directory to the folder containing the Bismark files lets it find the files, but then it skips all of them as already processed. That is plausible given the code of .process_bismark_file:

cell <- sub(".gz","", filename)
outfile <- sprintf("%s", cell)
if (file.exists(paste0(outfile, ".gz"))) {
  cat(sprintf("Sample %s already processed, skipping...\n", cell))

Removing the .gz ending and then adding it again just checks for the input file's existence.

What am I missing in order to use the binarise_files function?
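The round trip described above can be reproduced on a made-up file name. The first part of this sketch follows the excerpt; the second part, binarised_path, is a hypothetical alternative (an assumption, not the package's behaviour) that writes to a separate output directory so the check no longer points at the input file.

```r
# Reproduce the existence check from the excerpt on a hypothetical file name.
filename <- "cell_01.cov.gz"
cell     <- sub(".gz", "", filename)   # strips the ending: "cell_01.cov"
outfile  <- sprintf("%s", cell)
checked  <- paste0(outfile, ".gz")     # the path passed to file.exists()

# The path being tested is the input path itself.
same_as_input <- identical(checked, filename)

# Hypothetical alternative: build the output path in a separate directory and
# anchor the pattern so only a literal trailing ".gz" is removed.
binarised_path <- function(infile, outdir) {
  cell <- sub("\\.gz$", "", basename(infile))
  file.path(outdir, paste0(cell, ".gz"))
}
```

Note that the unescaped "." in the original pattern matches any character, so ".gz" would also match, for example, "xgz" earlier in a name.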

Parallel processing

Hello,

I've been having some issues running create_melissa_obj with the no_cores parameter. With no_cores greater than 1 it just displays "Reading file annotation_file.bed", prints nothing after that, and seems to run forever. Eventually RStudio on my workstation hangs and I have to restart it. Can you help me out with this issue?

Thanks
Ayush

Error running impute_met_files

Hello,

I'm having some problems running impute_met_files. I put the methylation file to impute in impute_met_dir, but the function only prints "Creating methylation regions..." and then fails with "Error in if (!is.na(x)) { : the condition has length > 1". The error is raised in .impute_files_internal, at "met_region <- lapply(X = met_region, FUN = function (x) { if (!is.na(x)){". My file format is "chr1 13667 1\nchr1 13693 -1 ...". Can you help me out with this issue?

thanks
Ayush
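"The condition has length > 1" is what R reports when if() receives a logical vector with more than one element; since R 4.2.0 this is an error rather than a warning. is.na() returns one value per element, so calling it on a matrix or vector inside if() triggers the message. A possible guard is sketched below, assuming each region entry is either a single NA or a matrix of observations (an assumption about the internal data, not confirmed by the package source). is_missing and the example list are hypothetical.

```r
# Hypothetical stand-in for the per-region list: one empty region (NA) and
# one region holding two observations (position, methylation state).
met_region <- list(
  NA,
  matrix(c(13667, 13693, 1, 0), ncol = 2)
)

# Treat an entry as missing only if it is a single NA value.
is_missing <- function(x) length(x) == 1L && is.na(x)

# Keep the regions that actually contain data.
kept <- Filter(Negate(is_missing), met_region)
```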

Coverage file format

The vignette states that the Bismark coverage file has the following format:
Line 59: <met_prcg> <met_reads> <unmet_reads>

But the .process_bismark_file function expects a 5 column input:
Line 80: colnames(data) <- c("chr","pos", "met_prcg", "met_reads","unnmet_reads")
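A mismatch between the number of columns in a file and the number of names assigned to them can be caught before renaming. The sketch below uses two made-up rows and the five column names quoted above, spelled as in the source; it is an illustration, not the package's reader.

```r
# Column names as quoted from .process_bismark_file (spelling kept as-is).
expected_cols <- c("chr", "pos", "met_prcg", "met_reads", "unnmet_reads")

# Two made-up tab-separated rows standing in for a coverage file.
lines <- c("chr1\t13667\t100\t1\t0",
           "chr1\t13693\t0\t0\t2")
data <- read.table(text = lines, sep = "\t", stringsAsFactors = FALSE)

# Fail early with a clear message if the layout is not the expected one.
if (ncol(data) != length(expected_cols)) {
  stop(sprintf("Expected %d columns but found %d",
               length(expected_cols), ncol(data)))
}
colnames(data) <- expected_cols
```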

Error in FUN(X[[i]], ...) : subscript out of bounds

Hi @andreaskapou ,

I have my own data in the same format as the Melissa synthetic data, and when I call the melissa function I get:

Error in FUN(X[[i]], ...) : subscript out of bounds

My input data looks like:

[image]

and

[image]

and every cell contains the specific genomic region:

[image]

The output matrix for every region is:

[image]

Do you have any idea why I get this error? I tried to debug it, unsuccessfully:

[image]

Thank you in advance!

Best,
Igor

which R version does Melissa support?

Hi,
I am trying to install Melissa. However, I get an error when installing 'clues'.

✓  checking for file ‘/private/var/folders/k1/r_2p07k149vcpj8jzy87_z5c8718_r/T/RtmpewT8v9/remotes6b5e7a810992/andreaskapou-Melissa-4a869b5/DESCRIPTION’ ...
─  preparing ‘Melissa’:
✓  checking DESCRIPTION meta-information ...
─  installing the package to build vignettes
         -----------------------------------
   ERROR: dependency ‘clues’ is not available for package ‘Melissa’
─  removing ‘/private/var/folders/k1/r_2p07k149vcpj8jzy87_z5c8718_r/T/RtmpEJ0BYr/Rinst6bef5dcd1ad6/Melissa’
         -----------------------------------
   ERROR: package installation failed
Error: Failed to install 'Melissa' from GitHub:
  System command 'R' failed, exit status: 1, stdout + stderr:
E> * checking for file ‘/private/var/folders/k1/r_2p07k149vcpj8jzy87_z5c8718_r/T/RtmpewT8v9/remotes6b5e7a810992/andreaskapou-Melissa-4a869b5/DESCRIPTION’ ... OK
E> * preparing ‘Melissa’:
E> * checking DESCRIPTION meta-information ... OK
E> * installing the package to build vignettes
E>       -----------------------------------
E> ERROR: dependency ‘clues’ is not available for package ‘Melissa’
E> * removing ‘/private/var/folders/k1/r_2p07k149vcpj8jzy87_z5c8718_r/T/RtmpEJ0BYr/Rinst6bef5dcd1ad6/Melissa’
E>       -----------------------------------
E> ERROR: package installation failed

I also cannot install 'clues' from source:

install.packages("https://cran.r-project.org/src/contrib/Archive/clues/clues_0.6.2.2.tar.gz", type = "source", repos = NULL)
trying URL 'https://cran.r-project.org/src/contrib/Archive/clues/clues_0.6.2.2.tar.gz'
Content type 'application/x-gzip' length 572111 bytes (558 KB)
==================================================
downloaded 558 KB

* installing *source* package ‘clues’ ...
** package ‘clues’ successfully unpacked and MD5 sums checked
** using staged installation
** libs
gfortran  -fPIC  -Wall -g -O2  -c  CH.f95 -o CH.o
make: gfortran: No such file or directory
make: *** [CH.o] Error 1
ERROR: compilation failed for package ‘clues’
* removing ‘/Library/Frameworks/R.framework/Versions/3.6/Resources/library/clues’
Warning in install.packages :
  installation of package ‘/var/folders/k1/r_2p07k149vcpj8jzy87_z5c8718_r/T//RtmpewT8v9/downloaded_packages/clues_0.6.2.2.tar.gz’ had non-zero exit status

So I think this might be an R version problem or some other issue. Could you tell me how to solve this, or which R version would work?
Thank you!
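The second log stops at "make: gfortran: No such file or directory", which points to a missing Fortran compiler rather than to the R version itself. A minimal sketch for checking this before retrying the source install follows; has_tool is a hypothetical helper, not part of any package.

```r
# Hypothetical helper: TRUE if `tool` is found on the PATH seen by R.
has_tool <- function(tool) {
  unname(nzchar(Sys.which(tool)))
}

if (!has_tool("gfortran")) {
  message("gfortran was not found; packages with Fortran sources cannot be compiled.")
}
```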

Could not find function "impute_met_state"

Hi there,

I am working through the vignette 'Cluster and impute scBS-seq...', and when I run the step that evaluates imputation performance:

imputation_obj <- impute_met_state(obj = melissa_obj, test = dt_obj$met_test)

I get the error:

Error in impute_met_state(obj = melissa_obj, test = dt_obj$met_test) : could not find function "impute_met_state"

Of course, I imported both Melissa and BPRMeth packages.

Have you deleted or moved this function? Thanks in advance!

Best,
Igor

My sessionInfo():

R version 4.0.1 (2020-06-06)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Linux Mint 19.3

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=de_DE.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=de_DE.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] Melissa_1.5.3 BPRMeth_1.15.3 GenomicRanges_1.40.0 GenomeInfoDb_1.24.2 IRanges_2.22.2 S4Vectors_0.26.1
[7] BiocGenerics_0.34.0 RMySQL_0.10.20 DBI_1.1.0
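One way to see where a function lives is to look it up among the exported names of the installed packages. The helper below is a hypothetical sketch (find_function is not part of Melissa or BPRMeth); it only reports which of the given packages export a name and says nothing about whether impute_met_state was renamed or moved.

```r
# Hypothetical helper: which of `pkgs` export an object called `name`?
find_function <- function(name, pkgs) {
  exports_it <- vapply(pkgs, function(p) {
    requireNamespace(p, quietly = TRUE) && name %in% getNamespaceExports(p)
  }, logical(1))
  pkgs[exports_it]
}

# Example with base packages: median() is exported by stats.
find_function("median", c("stats", "utils"))

# For the case above, the call would be:
# find_function("impute_met_state", c("Melissa", "BPRMeth"))
```

An empty result means none of the listed packages exports that name in the installed versions.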
