cmdoret commented on August 16, 2024

Hi @GeorgyChistov

The convolution operation (implemented in the xcorr2 function) is used multiple times, and its results are combined at the end to compute the map of Pearson coefficients.

The normxcorr2 function is where the Pearson correlation is computed. The code is hard to read because 1) it was written to work on sparse matrices and 2) we avoided temporary variables as much as possible to limit memory usage.

tl;dr: We compute the map of Pearson correlations at the very end by plugging the results of the correlation products into the different terms of the formula described in the paper's methods. In effect, we vectorize the Pearson formula over the whole matrix.

More details below:

Note: This is a slightly simplified version of what we do, because in practice we also account for missing bins (NaNs) by adjusting the denominator.

The basic concept is the following: A Pearson correlation coefficient is computed between each position of an image $IMG$ (the Hi-C map) and a template $TMP$ (the kernel). The result is an image of correlation coefficients $CORR$:

Assuming $TMP$ has $M_{TMP}$ rows and $N_{TMP}$ columns:

$CORR[i, j] = Corr(IMG[i: i+M_{TMP}, j: j+N_{TMP}], TMP)$

Where $Corr(\cdot, \cdot)$ between images $X$ and $Y$ is defined as:

$Corr(X, Y) = \frac{ cov(X, Y) }{ \sigma(X) \cdot \sigma(Y) }$

$=\frac{ \overline{ (X - \overline{X} ) \cdot (Y - \overline{Y} ) } }{ \sqrt{ \overline{ (X - \overline{X} )^2 } } \cdot \sqrt{ \overline{ (Y - \overline{Y} )^2 } } }$
$=\frac{E[(X - E[X]) \cdot (Y - E[Y])]}{\sqrt{E[(X - E[X])^2]} \cdot \sqrt{E[(Y - E[Y])^2]}}$
$=\frac{E[XY] - E[X] \cdot E[Y]}{\sqrt{E[X^2] - E[X]^2} \cdot \sqrt{E[Y^2] - E[Y]^2}}$
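
The expansion of the covariance and variances into raw moments can be checked numerically. The snippet below is only a sanity check on random vectors (the names `x` and `y` are placeholders, not chromosight variables); it compares the raw-moment form against NumPy's own `np.corrcoef`.

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.random(25)  # stands in for a flattened image window X
y = rng.random(25)  # stands in for the flattened template Y

# Covariance: centred form vs. raw-moment form E[XY] - E[X]E[Y]
cov_direct = np.mean((x - x.mean()) * (y - y.mean()))
cov_expanded = np.mean(x * y) - x.mean() * y.mean()

# Variance: centred form vs. raw-moment form E[X^2] - E[X]^2
var_x_direct = np.mean((x - x.mean()) ** 2)
var_x_expanded = np.mean(x ** 2) - x.mean() ** 2
var_y_expanded = np.mean(y ** 2) - y.mean() ** 2

# Pearson coefficient built only from raw moments
pearson = cov_expanded / (np.sqrt(var_x_expanded) * np.sqrt(var_y_expanded))
reference = np.corrcoef(x, y)[0, 1]
```

Both forms agree up to floating-point error, which is what allows each term to be computed separately over the whole map.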

Given that $X$ represents the image around a pixel of the Hi-C matrix and $Y$ represents the template, $E[Y]$ and $\sqrt{E[Y^2] - E[Y]^2}$ are the kernel's mean and standard deviation (float values), and are constant across $IMG$.

The other values to compute are:

  • $E[XY]$: The convolution of the image and the kernel.
  • $E[X]$: The convolution of the image by the uniform (mean) kernel. Each pixel $(i, j)$ of the resulting map gives the mean of the window $(i: i+M_{TMP}, j: j+N_{TMP})$. Values in this matrix can simply be squared to obtain $E[X]^2$.
  • $E[X^2]$: The convolution of the squared signal by the uniform (mean) kernel.

This means three convolution products are needed to obtain a map of Pearson coefficients.
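
To make the three products concrete, here is a minimal dense NumPy sketch. It is not chromosight's actual `normxcorr2`: it ignores sparsity and missing bins, and `correlate_valid` and `pearson_map` are hypothetical helper names introduced for illustration. The sum over each window is divided by the kernel size so that it becomes the mean $E[XY]$.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def correlate_valid(img, ker):
    """Sum of (window * ker) at every position where ker fits inside img."""
    windows = sliding_window_view(img, ker.shape)  # (H-m+1, W-n+1, m, n)
    return np.einsum("ijkl,kl->ij", windows, ker)


def pearson_map(img, ker):
    """Pearson coefficient between ker and every window of img (dense sketch)."""
    n = ker.size
    uniform = np.full(ker.shape, 1.0 / n)       # uniform (mean) kernel
    e_xy = correlate_valid(img, ker) / n        # product 1: E[XY]
    e_x = correlate_valid(img, uniform)         # product 2: E[X]
    e_x2 = correlate_valid(img ** 2, uniform)   # product 3: E[X^2]
    e_y, sd_y = ker.mean(), ker.std()           # constants across the image
    var_x = np.clip(e_x2 - e_x ** 2, 0.0, None)  # guard against tiny negatives
    # Flat windows have zero variance and yield NaN or inf here.
    with np.errstate(divide="ignore", invalid="ignore"):
        return (e_xy - e_x * e_y) / (np.sqrt(var_x) * sd_y)


rng = np.random.default_rng(0)
img = rng.random((12, 12))
ker = rng.random((3, 3))
corr = pearson_map(img, ker)
```

Each entry `corr[i, j]` should match `np.corrcoef(img[i:i+3, j:j+3].ravel(), ker.ravel())[0, 1]`, but the whole map is obtained from three passes over the image instead of one Pearson computation per window.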


cmdoret commented on August 16, 2024

As for the kernel of a TAD corner, in principle you'd like to use a simple kernel. It should have a decent correlation with most TAD corners (i.e. 1 quarter dark, 3 quarters light) in your dataset.

The problem might be the size: If the kernel is too large, you will miss small TADs (as the corner only fills a tiny portion of the kernel and will show poor correlation). If the kernel is too small, you will get many false positives due to noise being picked up as corners.
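
The effect of an oversized kernel can be illustrated on a toy example. Everything here is an assumption made for the sketch: `corner_kernel` is a hypothetical binary kernel (one quadrant set to 1, the rest 0, with the lower-left quadrant chosen arbitrarily as the dark one), and the "map" is a single 4x4 block on an empty background with no noise.

```python
import numpy as np


def corner_kernel(size):
    """Hypothetical corner kernel: lower-left quadrant 1 (dark), rest 0 (light)."""
    half = size // 2
    ker = np.zeros((size, size))
    ker[half:, :half] = 1.0
    return ker


def corr_at(img, ker, row, col):
    """Pearson coefficient between ker and the window of img centred on (row, col)."""
    half = ker.shape[0] // 2
    win = img[row - half:row + half, col - half:col + half]
    return np.corrcoef(win.ravel(), ker.ravel())[0, 1]


# Toy map: one 4x4 "TAD" of ones on an empty background.
img = np.zeros((40, 40))
img[20:24, 16:20] = 1.0
corner = (20, 20)  # position of the block's top-right corner

corr_small = corr_at(img, corner_kernel(8), *corner)   # dark quadrant matches the block
corr_large = corr_at(img, corner_kernel(20), *corner)  # block fills a small part of it
```

With the 8x8 kernel the dark quadrant coincides with the block and the correlation is perfect; with the 20x20 kernel the block covers only 16 of the 100 dark pixels and the coefficient drops substantially. The opposite failure mode (small kernels picking up noise) is not shown here.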
