daesc's Issues
Pre-processing the data from 10x Chromium
Hello,
I am struggling to generate the base quality score table for the BAM file produced by the cellranger pipeline. Because cellranger mapped the sequencing reads to the GRCh38 genome, the VCF file from the GATK resource bundle is not compatible. The VCF files from the 1000 Genomes Project would work, but they are split by chromosome and each file takes up a fair amount of storage. Is there a way around this problem, or a way to merge the base quality score tables generated for each chromosome into one?
Identifying imprinted genes from reciprocal cross and scRNA-seq data
I am impressed by your tool and would like to leverage its capabilities in my research. Using a reciprocal mouse strain cross, I have generated stem cell lines from two distinct rat strains with characterized SNPs. I have performed scRNA-seq on these cell lines from both reciprocal crosses. I am specifically interested in identifying imprinted genes within this dataset using your DAESC algorithm. Can you please advise on the best approach to integrate my SNP and scRNA-seq data into your tool for this purpose?
Thank you
Implicit phasing
Hi
I'm using this tool and I noticed that it supports implicit phasing. Could you please clarify how implicit phasing differs from the standard (explicit) phasing? Additionally, is it possible for me to obtain the phased values using this tool, and if so, how?
Thank you!
Skewed p-value distribution
Hello DAESC team,
I have a question about interpreting the results of the DAESC-MIX model. I tested ~10,000 genes for allelic imbalance among three conditions and then plotted a histogram of the p-values.
The p-value distribution is heavily skewed toward small values, so almost all genes are identified as significant.
Can you share your insights on this?
The dataset used here is 10X Visium.
Thank you!
Specifying the design matrix for differential ASE
Hello DAESC dev team!
First of all, thank you for developing such a flexible framework to test for allelic imbalance.
I am trying to incorporate DAESC into my analysis pipeline that tests for differential ASE across multiple individuals accounting for the disease status (condition), spatial location (cortical layers) and cell type.
To begin with, I applied DAESC to a toy dataset with one gene, two conditions, and two individuals per condition. Here, I fit the baseline model (DAESC-BB), specifying the design matrix x as a binary numeric array denoting the condition.
Now, I want to extend this by also taking into account the spatial information and the interaction terms. Since both condition and spatial location (cortical layers) are categorical variables, I tried using the model.matrix function to encode the information in a one-hot matrix. Here is the structure of the data frame I am working with and how I am invoking the daesc_bb function.
str(cur.df)
'data.frame': 3046 obs. of 8 variables:
$ gene : chr "HES6" "HES6" "HES6" "HES6" ...
$ barcode : chr "CTCTCTAACTGCCTAG" "TCGGCGTACTGCACAA" "GGGCAGGATTTCTGTG" "CGGTTCCGGCTTCTTG" ...
$ allele1_count: num 1 1 1 1 1 1 0 3 0 1 ...
$ allele2_count: num 1 0 0 0 1 1 1 0 1 0 ...
$ total_count : num 2 1 1 1 2 2 1 3 1 1 ...
$ condition : Factor w/ 2 levels "HC","PD": 1 1 1 1 1 1 1 1 1 1 ...
$ sample_id : chr "BN0339" "BN0339" "BN0339" "BN0339" ...
$ layer : Factor w/ 7 levels "Layer 1","Layer 2",..: 7 4 7 4 5 3 6 6 5 6 ...
myformula = ~ cur.df$condition + cur.df$layer + cur.df$condition:cur.df$layer + 0
one_hot = model.matrix(myformula)
str(one_hot)
num [1:3046, 1:14] 1 1 1 1 1 1 1 1 1 1 ...
- attr(*, "dimnames")=List of 2
..$ : chr [1:3046] "1" "2" "3" "4" ...
..$ : chr [1:14] "cur.df$conditionHC" "cur.df$conditionPD" "cur.df$layerLayer 2" "cur.df$layerLayer 3" ...
- attr(*, "assign")= int [1:14] 1 1 2 2 2 2 2 2 3 3 ...
- attr(*, "contrasts")=List of 2
..$ cur.df$condition: chr "contr.treatment"
..$ cur.df$layer : chr "contr.treatment"
res = daesc_bb(y=cur.df$allele1_count, n=cur.df$total_count, subj=cur.df$sample_id, x=one_hot)
When I run this as is, I get the following error:
fixed-effect model matrix is rank deficient so dropping 1 column / coefficient
cur.df.conditionPD
NA
Error in aod::betabin(cbind(y, n - y) ~ ., random = ~1, data = data.frame(y, :
Initial values for the fixed effects contain at least one missing value.
I think the reason is in the design matrix, where the first two columns carry essentially the same information. After dropping the first column of one_hot, it runs without error, but I want to double-check whether this is the intended use of this variable when supplying multiple categorical variables.
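The rank deficiency described above can be reproduced with base R alone. The sketch below is only an illustration on hypothetical toy data (the `toy` data frame stands in for cur.df). It assumes, based on the `cbind(y, n - y) ~ .` formula visible in the error message, that the downstream model adds its own intercept. Under that assumption, keeping an indicator column for every condition level (the effect of `+ 0`) makes the two condition columns sum to the intercept, and building the matrix with an intercept and then dropping that column avoids the redundancy.

```r
# Hypothetical toy data standing in for cur.df: 2 conditions x 7 layers.
toy <- data.frame(
  condition = factor(rep(c("HC", "PD"), each = 70), levels = c("HC", "PD")),
  layer     = factor(rep(paste("Layer", 1:7), times = 20))
)

# Encoding from the post: "+ 0" keeps one indicator per condition level.
full <- model.matrix(~ condition + layer + condition:layer + 0, data = toy)

# Prepending an intercept column (as a formula like `y ~ .` would) makes the
# conditionHC and conditionPD indicators redundant: they sum to the intercept.
rank_full <- qr(cbind(1, full))$rank   # lower than ncol(full) + 1

# Alternative: encode with an intercept, then drop that column so the
# downstream model can supply its own.
x <- model.matrix(~ condition * layer, data = toy)[, -1, drop = FALSE]
rank_x <- qr(cbind(1, x))$rank         # equals ncol(x) + 1, i.e. full rank
```

A matrix like `x` could then be supplied in place of one_hot in the daesc_bb call shown earlier. Whether this matches the intended use of the x argument is something only the package authors can confirm.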
Finally, I also want to add cell type information as another independent variable. In my case, cell type is not a categorical variable, since the data were generated using Visium, so for each cell type the input will be its estimated abundance. If I were to include all of (1) condition, (2) layer, and (3) cell type in x, how should I structure it?
Thank you!
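For the cell type question above, one possible way to structure such a matrix in base R is to add the abundance columns to the same formula, since numeric columns enter model.matrix unchanged while factors are dummy-coded. This is a minimal sketch on hypothetical data, not a confirmed recommendation for DAESC: the column names `astro`, `neuron`, and `oligo` are invented for illustration, and the abundances are assumed to be proportions that sum to one per spot. If that assumption holds, one cell type has to be left out, because the full set of proportions is collinear with an intercept.

```r
n <- 140
# Hypothetical toy data standing in for cur.df.
toy <- data.frame(
  condition = factor(rep(c("HC", "PD"), each = n / 2)),
  layer     = factor(rep(paste("Layer", 1:7), times = n / 7))
)

# Hypothetical abundances for three cell types, normalised to sum to 1 per spot.
set.seed(42)
raw <- matrix(runif(n * 3), ncol = 3,
              dimnames = list(NULL, c("astro", "neuron", "oligo")))
toy <- cbind(toy, raw / rowSums(raw))

# Factors are dummy-coded, numeric abundances are passed through as-is.
# "oligo" is omitted because the three proportions sum to 1.
x <- model.matrix(~ condition * layer + astro + neuron,
                  data = toy)[, -1, drop = FALSE]

# Including every cell type adds a column but no additional rank.
x_all <- model.matrix(~ condition * layer + astro + neuron + oligo,
                      data = toy)[, -1, drop = FALSE]
rank_ok  <- qr(cbind(1, x))$rank      # ncol(x) + 1: full rank with intercept
rank_all <- qr(cbind(1, x_all))$rank  # same rank despite the extra column
```

If the abundances are not normalised to sum to one, this particular collinearity does not arise and all cell types could in principle be included.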