
ceres's Introduction

CERES

Computational correction of copy-number effect in CRISPR-Cas9 essentiality screens

Installation instructions

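You will need several packages available on Bioconductor before installing CERES. On R 3.5 or later, install them with BiocManager:

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install(c("Biostrings", "Rsamtools",
                       "GenomeInfoDb", "BSgenome",
                       "BSgenome.Hsapiens.UCSC.hg19", "GenomicRanges"), type="source")

On older R versions, use the legacy biocLite script instead:

source("https://bioconductor.org/biocLite.R")
biocLite(c("Biostrings", "Rsamtools", 
            "GenomeInfoDb", "BSgenome", 
            "BSgenome.Hsapiens.UCSC.hg19", "GenomicRanges"), type="source")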

If the devtools package is not already installed, install from the R console:

install.packages("devtools")

To install CERES, either run:

devtools::install_github("cancerdatasci/ceres")

or clone the ceres repository, navigate to the parent directory of the local copy, and run from the R console:

devtools::install("ceres")

Note that if C++11 support is not already enabled, you may need to run

Sys.setenv("PKG_CXXFLAGS"="-std=c++11")

prior to running the install command.

Preparing CERES inputs also requires the bowtie and samtools command line tools. For OSX users with homebrew installed on their machine, these can be installed from the command line:

brew tap brewsci/science
brew install bowtie
brew install samtools

Run CERES on example data

Download the zipped example files from depmap.org/ceres and extract them into a directory (e.g. ./data/download). If you haven't already fetched or built them yourself, also download the necessary bowtie indices separately and place the unzipped files in the bowtie_indexes directory of the example folder.

The data in the example files are from screens of 33 cancer cell lines published in Aguirre et al. 2016 and 14 AML lines published in Wang et al. 2017.

Run the example script below, ensuring that the data_dir variable points to the directory with the data download.

library(ceres)

### Setup

# Edit this line to point to data directory
data_dir <- "./data/download"

cn_seg_file <- file.path(data_dir, "CCLE_copynumber_2013-12-03.seg.txt")
gene_annot_file <- file.path(data_dir, "CCDS.current.txt")

# Set bowtie index directory. Not needed if the $BOWTIE_INDEXES environment variable is set and includes the hg19 index.
Sys.setenv(BOWTIE_INDEXES = file.path(data_dir, "bowtie_indexes"))


gecko_dep_file <- file.path(data_dir, "Gecko.gct")
gecko_rep_map <- file.path(data_dir, "Gecko_replicate_map.tsv")

wang_dep_file <- file.path(data_dir, "Wang2017.gct")
wang_rep_map <- file.path(data_dir, "Wang2017_replicate_map.tsv")



### Run CERES on Gecko data

gecko_inputs_dir <- file.path("./data/gecko_ceres_inputs", Sys.Date())

prepare_ceres_inputs(inputs_dir=gecko_inputs_dir,
                     dep_file=gecko_dep_file,
                     cn_seg_file=cn_seg_file,
                     gene_annot_file=gene_annot_file,
                     rep_map_file=gecko_rep_map,
                     chromosomes=paste0("chr", 1:22),
                     dep_normalize="zmad")

gecko_ceres <-
    wrap_ceres(sg_path=file.path(gecko_inputs_dir, "guide_sample_dep.Rds"),
               cn_path=file.path(gecko_inputs_dir, "locus_sample_cn.Rds"),
               guide_locus_path=file.path(gecko_inputs_dir, "guide_locus.Rds"),
               locus_gene_path=file.path(gecko_inputs_dir, "locus_gene.Rds"),
               replicate_map_path=file.path(gecko_inputs_dir, "replicate_map.Rds"),
               run_id="Gecko",
               params=list(lambda_g=0.68129207))

gecko_ceres_scaled <-
    scale_to_essentials(gecko_ceres$gene_essentiality_results$ge_fit)


### Run CERES on Wang2017 data

wang_inputs_dir <- file.path("./data/wang_ceres_inputs", Sys.Date())

prepare_ceres_inputs(inputs_dir=wang_inputs_dir,
                     dep_file=wang_dep_file,
                     cn_seg_file=cn_seg_file,
                     gene_annot_file=gene_annot_file,
                     rep_map_file=wang_rep_map,
                     chromosomes=paste0("chr", 1:22),
                     dep_normalize="zmad")

wang_ceres <-
    wrap_ceres(sg_path=file.path(wang_inputs_dir, "guide_sample_dep.Rds"),
               cn_path=file.path(wang_inputs_dir, "locus_sample_cn.Rds"),
               guide_locus_path=file.path(wang_inputs_dir, "guide_locus.Rds"),
               locus_gene_path=file.path(wang_inputs_dir, "locus_gene.Rds"),
               replicate_map_path=file.path(wang_inputs_dir, "replicate_map.Rds"),
               run_id="Wang2017",
               params=list(lambda_g=0.68129207))

wang_ceres_scaled <-
    scale_to_essentials(wang_ceres$gene_essentiality_results$ge_fit)

ceres's People

Contributors

j-g-b, pgm, remimarenco, robinmeyers


ceres's Issues

bowtie2-align exited with value 1 with example data

I am trying to run the example data:

data_dir <- "/home/niek/Downloads/example_data"

cn_seg_file <- file.path(data_dir, "/CCLE_copynumber_2013-12-03.seg.txt")
gene_annot_file <- file.path(data_dir, "/CCDS.current.txt")

# Set bowtie index directory. Not needed if $BOWTIE_INDEXES environmental variable is set and includes hg19 index.
Sys.setenv(BOWTIE_INDEXES = file.path(data_dir, "bowtie_indexes")) #bowtie indexes were downloaded to this folder


gecko_dep_file <- file.path(data_dir, "Gecko.gct")
gecko_rep_map <- file.path(data_dir, "Gecko_replicate_map.tsv")

wang_dep_file <- file.path(data_dir, "Wang2017.gct")
wang_rep_map <- file.path(data_dir, "Wang2017_replicate_map.tsv")



### Run CERES on Gecko data

gecko_inputs_dir <- file.path("./data/gecko_ceres_inputs", Sys.Date())

prepare_ceres_inputs(inputs_dir=gecko_inputs_dir,
                     dep_file=gecko_dep_file,
                     cn_seg_file=cn_seg_file,
                     gene_annot_file=gene_annot_file,
                     rep_map_file=gecko_rep_map,
                     chromosomes=paste0("chr", 1:22),
                     dep_normalize="zmad",
                     bowtie_exe = "/home/niek/Downloads/example_data/bowtie2-2.4.2-linux-x86_64/bowtie2",
                     samtools_exe = "/home/niek/Downloads/example_data/samtools-1.12/bin/samtools")

gecko_ceres <-
  wrap_ceres(sg_path=file.path(gecko_inputs_dir, "guide_sample_dep.Rds"),
             cn_path=file.path(gecko_inputs_dir, "locus_sample_cn.Rds"),
             guide_locus_path=file.path(gecko_inputs_dir, "guide_locus.Rds"),
             locus_gene_path=file.path(gecko_inputs_dir, "locus_gene.Rds"),
             replicate_map_path=file.path(gecko_inputs_dir, "replicate_map.Rds"),
             run_id="Gecko",
             params=list(lambda_g=0.68129207))

gecko_ceres_scaled <-
  scale_to_essentials(gecko_ceres$gene_essentiality_results$ge_fit)

But I get the following error when trying to run prepare_ceres_inputs():

Error: Encountered internal Bowtie 2 exception (#1)
Command: /home/niek/Downloads/example_data/bowtie2-2.4.2-linux-x86_64/bowtie2-align-s --wrapper basic-0 -t -p 4 -a -v 0 -f -S hg19 /tmp/RtmpU3W3In/guides.fa /tmp/RtmpU3W3In/guides.sam 
(ERR): bowtie2-align exited with value 1
[E::hts_open_format] Failed to open file "/tmp/RtmpU3W3In/guides.sam" : No such file or directory
samtools view: failed to open "/tmp/RtmpU3W3In/guides.sam" for reading: No such file or directory
Error in value[[3L]](cond) : 
  failed to open BamFile: file(s) do not exist:
  ‘/tmp/RtmpU3W3In/guides.bam’

I have seen this problem in another issue but there was no response. How can I solve this issue?

levels in 'seqnames' with no entries in 'seqinfo' were dropped

I am hitting this problem with the example data:

Error in makeGRangesFromDataFrame(., seqinfo = genomeinfo, keep.extra.columns = T) :                                      
  The "start" and/or "end" columns contain NAs. Use 'na.rm=TRUE' to ignore the
  rows with NAs.
In addition: Warning message:
In .normarg_seqnames2(seqnames, seqinfo) :
  levels in 'seqnames' with no entries in 'seqinfo' were dropped

Error "Failed connect to ftp.ncbi.nlm.nih.gov:21"

I keep getting the following error when running prepare_ceres_inputs with the example data:

Error in function (type, msg, asError = TRUE) :
Failed connect to ftp.ncbi.nlm.nih.gov:21; Connection timed out

Any suggestions?

Error message in example data run: Could not locate a Bowtie index

Hello,
I ran into this error message

 prepare_ceres_inputs(inputs_dir=gecko_inputs_dir,
+                      dep_file=gecko_dep_file,
+                      cn_seg_file=cn_seg_file,
+                      gene_annot_file=gene_annot_file,
+                      rep_map_file=gecko_rep_map,
+                      chromosomes=paste0("chr", 1:22),
+                      dep_normalize="zmad")
loading dependency data...

Parsed with column specification:
cols(
  Replicate = col_character(),
  CellLine = col_character()
)
loading copy number data...

|======================================================| 100%   43 MB
mapping sgRNAs to the genome...

Could not locate a Bowtie index corresponding to basename "hg19"
Overall time: 00:00:00
Command: /usr/local/Cellar/bowtie/1.2.2_p1/bin/bowtie-align-s --wrapper basic-0 -t -p 4 -a -v 0 -f -S hg19 /var/folders/1t/gz_zw2bx635fyxqplq1ttb6c0000gp/T//RtmpYoJMLF/guides.fa /var/folders/1t/gz_zw2bx635fyxqplq1ttb6c0000gp/T//RtmpYoJMLF/guides.sam 
[E::hts_open_format] Failed to open file /var/folders/1t/gz_zw2bx635fyxqplq1ttb6c0000gp/T//RtmpYoJMLF/guides.sam
samtools view: failed to open "/var/folders/1t/gz_zw2bx635fyxqplq1ttb6c0000gp/T//RtmpYoJMLF/guides.sam" for reading: No such file or directory
Error in value[[3L]](cond) : 
  failed to open BamFile: file(s) do not exist:
  '/var/folders/1t/gz_zw2bx635fyxqplq1ttb6c0000gp/T//RtmpYoJMLF/guides.bam'

I have downloaded hg19_bowtie.tar and placed it unzipped in the bowtie_indexes folder of example_data.

Thanks!
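
The "Could not locate a Bowtie index" message means bowtie found no index files with the basename hg19 in the locations it searched. A minimal sketch for listing what is actually in the index directory before calling prepare_ceres_inputs() follows. The helper name find_bowtie_index is hypothetical (it is not part of ceres), and the sketch assumes a Bowtie 1 index, whose files end in .ebwt (or .ebwtl for large indexes). The demonstration uses empty stand-in files in a temporary directory.

```r
# Hypothetical helper (not part of ceres): list the Bowtie 1 index files
# for a given index name inside a directory.
find_bowtie_index <- function(index_dir, index_name = "hg19") {
  # Bowtie 1 indexes consist of files such as hg19.1.ebwt and hg19.rev.1.ebwt
  # (large indexes use the .ebwtl suffix instead).
  pattern <- paste0("^", index_name, "\\.(rev\\.)?[0-9]+\\.ebwtl?$")
  list.files(index_dir, pattern = pattern)
}

# Self-contained demonstration on a temporary directory with empty stand-in files.
demo_dir <- file.path(tempdir(), "bowtie_indexes_demo")
dir.create(demo_dir, showWarnings = FALSE)
index_names <- c("hg19.1.ebwt", "hg19.2.ebwt", "hg19.3.ebwt", "hg19.4.ebwt",
                 "hg19.rev.1.ebwt", "hg19.rev.2.ebwt")
invisible(file.create(file.path(demo_dir, index_names)))

found <- find_bowtie_index(demo_dir)
if (length(found) == 0) {
  message("No hg19 Bowtie index files found in ", demo_dir)
} else {
  message("Found ", length(found), " index files: ", paste(found, collapse = ", "))
}
```

With the example data, the directory to check would be file.path(data_dir, "bowtie_indexes"). An empty result may mean the archive unpacked into a subfolder, or that the files belong to a different index format.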

bowtie2-align exited with value 1

My run is giving this error:

Error: Encountered internal Bowtie 2 exception (#1)
Command: /Users/sudeeris/Downloads/bowtie2-2.5.1-macos-arm64/bowtie2-align-s --wrapper basic-0 -t -p 4 -a -v 0 -f -S hg19 /var/folders/hr/0x5lrx7s2fg_0hgrw94pdt8c0000gn/T//RtmppY1ven/guides.fa /var/folders/hr/0x5lrx7s2fg_0hgrw94pdt8c0000gn/T//RtmppY1ven/guides.sam
(ERR): bowtie2-align exited with value 1
sh: samtools: command not found
Error in value[[3L]](cond) :
  failed to open BamFile: file(s) do not exist:
  ‘/var/folders/hr/0x5lrx7s2fg_0hgrw94pdt8c0000gn/T//RtmppY1ven/guides.bam’

I think there is a problem with the bowtie command, but I couldn't solve it.
Edit: I am aware of the problem with the samtools command; I have already solved it.

Thanks in advance
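
Messages such as "sh: samtools: command not found" mean the shell started by R could not find the tool on its PATH. A minimal sketch for checking this from the R console follows. The helper check_tools is hypothetical (not part of ceres) and relies only on base R's Sys.which().

```r
# Hypothetical helper (not part of ceres): report which command line tools
# can be found on the PATH that R passes to system().
check_tools <- function(tools = c("bowtie", "samtools")) {
  paths <- Sys.which(tools)   # returns "" for tools that are not found
  data.frame(tool = tools,
             path = unname(paths),
             found = unname(nzchar(paths)),
             stringsAsFactors = FALSE)
}

print(check_tools())
```

If a tool is installed but not on the PATH, its full path can be passed through the bowtie_exe and samtools_exe arguments of prepare_ceres_inputs(), as in the first issue above.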

How do I convert a raw count file to a .gct file?

Dear Contributors,

We have a CRISPR screen file in a format like this:

An example format of read-count.txt
[sgRNA tag] \t [GENE] \t [sample 1] \t [sample 2] ...
ATAGATGTCCTGTGGCCCCG-P53 [tab] TP53 [tab] 403 [tab] 362 ....
ACTCACTTCCTGTGGCCCCG-MDM2 [tab] MDM2 [tab] 45 [tab] 64 ....

Could you please tell me how to convert the raw count file to a .gct file like the one below, which has negative values?

Thanks,

#1.2
119461	128
Name	Description	A375-311cas9 Rep A p9	A375-311cas9 Rep B p9	A375-311cas9 Rep C p9	A375-311cas9 Rep D p9	A-673-311Cas9 Rep A p9	A-673-311Cas9 Rep B p9	A-673-311Cas9 Rep C p9	A-673-311Cas9 Rep D p9	BxPC3-311Cas9 Rep A p7	BxPC3-311Cas9 Rep B p7	BxPC3-311Cas9 Rep C p7	BxPC3-311Cas9 Rep D p7	CADO-ES-1-311Cas9 Rep A p4	CADO-ES-1-311Cas9 Rep B p4	CADO-ES-1-311Cas9 Rep C p4	CADO-ES-1-311Cas9 Rep D p4	CAL-120-311Cas9 Rep A p6	CAL-120-311Cas9 Rep B p6	CAL-120-311Cas9 Rep D p6	COLO-741-311Cas9 Rep B p6	COLO-741-311Cas9 Rep C p6	COLO-741-311Cas9 Rep D p6	COR-L105-311cas9 Rep A p8	COR-L105-311cas9 Rep B p8	COR-L105-311cas9 Rep C p8	COR-L105-311cas9 Rep D p8	EW8-311cas9 Rep A P6	EW8-311cas9 Rep B P6	EW8-311cas9 Rep C P6	EW8-311cas9 Rep D P6	EWS502-311cas9 Rep A P6	EWS502-311cas9 Rep B P6	EWS502-311cas9 Rep C P6	EWS502-311cas9 Rep D P6	G-402-311cas9 Rep A p6	G-402-311cas9 Rep B p6	G-402-311cas9 Rep C p6	G-402-311cas9 Rep D p6	HCC44-311cas9 Rep A p6	HCC44-311cas9 Rep B p6	HCC44-311cas9 Rep C p6	HCC44-311cas9 Rep D p6	Hs294T-311cas9 Rep 1 p7	Hs294T-311cas9 Rep 2 p7	Hs294T-311cas9 Rep 3 p7	Hs294T-311cas9 Rep 4 p7	HT-29-311cas9 Rep A p6	HT-29-311cas9 Rep B p6	HT-29-311cas9 Rep C p6	HT-29-311cas9 Rep D p6	K562-311cas9 Rep A p5	K562-311cas9 Rep B p5	K562-311cas9 Rep C p5	K562-311cas9 Rep D p5	L3.3-311Cas9 Rep A p7	L3.3-311Cas9 Rep B p7	L3.3-311Cas9 Rep C p7	L3.3-311Cas9 Rep D p7	LNCaP Clone FGC-311Cas9 Rep A p6	LNCaP Clone FGC-311Cas9 Rep B p6	LNCaP Clone FGC-311Cas9 Rep C p6	LNCaP Clone FGC-311Cas9 Rep D p6	MeWo-311cas9 Rep A p6	MeWo-311cas9 Rep B p6	MeWo-311cas9 Rep C p6	MHH-ES-1-311Cas9 Rep A p6	MHH-ES-1-311Cas9 Rep B p6	MHH-ES-1-311Cas9 Rep C p6	MHH-ES-1-311Cas9 Rep D p6	NCI-H1373-311Cas9 Rep A p8	NCI-H1373-311Cas9 Rep B p8	NCI-H1373-311Cas9 Rep C p8	NCI-H1373-311Cas9 Rep D p8	NCI-H2009-311Cas9 Rep A p7	NCI-H2009-311Cas9 Rep B p7	NCI-H2009-311Cas9 Rep C p7	Panc.03.27-311cas9 Rep A P9	Panc.03.27-311cas9 Rep B P9	Panc.03.27-311cas9 Rep C P9	Panc.03.27-311cas9 Rep D P9	Panc 08.13-311Cas9 Rep A p8	Panc 08.13-311Cas9 Rep B p8	Panc 08.13-311Cas9 Rep C p8	Panc 08.13-311Cas9 Rep D p8	Panc1-311Cas9 Rep A p8	Panc1-311Cas9 Rep B p8	Panc1-311Cas9 Rep C p8	Panc1-311Cas9 Rep D p8	PA-TU-8902-311cas9 Rep A p8	PA-TU-8902-311cas9 Rep B p8	PA-TU-8902-311cas9 Rep C p8	PA-TU-8902-311cas9 Rep D p8	PA-TU-8988T-311cas9 Rep A p8	PA-TU-8988T-311cas9 Rep B p8	PA-TU-8988T-311cas9 Rep C p8	PA-TU-8988T-311cas9 Rep D p8	PC-3-311Cas9 Rep A p6	PC-3-311Cas9 Rep B p6	PC-3-311Cas9 Rep C p6	PC-3-311Cas9 Rep D p6	RD-ES-311Cas9 Rep A p5	RD-ES-311Cas9 Rep B p5	RD-ES-311Cas9 Rep C p5	RD-ES-311Cas9 Rep D p5	SK-ES-1-311Cas9 Rep A p5	SK-ES-1-311Cas9 Rep B p5	SK-ES-1-311Cas9 Rep C p5	SK-ES-1-311Cas9 Rep D p5	SU.86.86-311cas9 Rep A p7	SU.86.86-311cas9 Rep B p7	SU.86.86-311cas9 Rep C P7	SU.86.86-311cas9 Rep D P7	T-47D-311Cas9 Rep A p6	T-47D-311Cas9 Rep B p6	T-47D-311Cas9 Rep C p6	T-47D-311Cas9 Rep D p6	TC32-311Cas9 Rep A p6	TC32-311Cas9 Rep B p6	TC32-311Cas9 Rep C p6	TC32-311Cas9 Rep D p6	TC-71-311Cas9 Rep A p6	TC-71-311Cas9 Rep B p6	TC-71-311Cas9 Rep C p6	TC-71-311Cas9 Rep D p6	TOV112D-311Cas9 Rep A p6	TOV112D-311Cas9 Rep B p6	TOV112D-311Cas9 Rep C p6	TOV112D-311Cas9 Rep D p6
GTCGCTGAGCTCCGATTCGA	GTCGCTGAGCTCCGATTCGA	0.0623	-0.24881	0.292296	-0.224205	0.325802	0.306743	-0.154492	0.290709	0.236717	-0.029312	0.108429	0.002837	-0.088702	0.463192	0.255931	-0.159384	-0.22571	0.268574	0.198586	0.668924	0.252747	0.386452	0.651673	0.522511	0.118764	0.439978	0.268871	-0.19845	0.220029	0.028026	0.149452	0.442862	0.010879	0.065172	0.038645	-1.211578	-0.110853	0.056169	0.345662	0.841075	-0.005683	0.58177	-0.626985	-0.67095	-0.669859	-0.532884	0.347379	-0.190249	0.325732	0.497403	0.680722	0.729096	0.002355	-0.323609	0.383396	0.217298	0.440748	0.328533	-0.09274	0.062231	0.127001	0.062985	0.351674	0.527628	0.446774	-0.016449	0.461915	0.933975	-0.45761	-0.617223	-0.243123	0.370821	-0.529677	0.214101	-0.132138	0.7023	0.322142	-0.565776	0.463337	0.044973	0.576458	1.020644	0.377618	0.066749	0.29711	0.163964	0.312047	-0.023299	-0.216543	-0.461724	-0.185127	0.482342	0.716575	-0.526126	0.134906	-0.076902	0.065532	0.775962	-0.199227	0.29537	0.272886	0.228704	0.680328	0.220316	0.449653	0.561631	0.006004	0.298158	-0.321754	0.178818	0.232724	-0.336839	0.498237	0.388898	0.297567	0.251397	0.300136	0.797395	0.10322	0.319408	0.605233	0.33745	0.099813	0.394441	0.395975	0.26122	-0.394427	0.170345
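
One plausible route from a read-count table to a file in this layout is sketched below, assuming the .gct values are log2 fold changes of each sample relative to a reference sample. That interpretation, the reference column name plasmid, the reads-per-million normalization, the pseudocount of 1, and both helper functions are assumptions made for illustration; this is not the authors' processing pipeline. The count values are made up.

```r
# Toy read-count table in the layout from the question (values are made up).
counts <- data.frame(
  sgRNA   = c("ATAGATGTCCTGTGGCCCCG-P53", "ACTCACTTCCTGTGGCCCCG-MDM2"),
  gene    = c("TP53", "MDM2"),
  plasmid = c(400, 50),   # assumed reference sample (e.g. plasmid DNA or early time point)
  sample1 = c(403, 45),
  sample2 = c(362, 64),
  stringsAsFactors = FALSE
)

# Hypothetical helper: log2 fold change of each sample versus the reference column.
counts_to_lfc <- function(count_mat, reference, pseudocount = 1) {
  rpm <- sweep(count_mat, 2, colSums(count_mat), "/") * 1e6   # reads per million
  log_rpm <- log2(rpm + pseudocount)
  samples <- setdiff(colnames(count_mat), reference)
  # Subtracting a vector from a matrix recycles it down each column,
  # so every sample column has the reference subtracted row by row.
  log_rpm[, samples, drop = FALSE] - log_rpm[, reference]
}

# Hypothetical helper: write a numeric matrix as a GCT v1.2 file.
write_gct <- function(mat, path) {
  con <- file(path, "w")
  on.exit(close(con))
  writeLines("#1.2", con)
  writeLines(paste(nrow(mat), ncol(mat), sep = "\t"), con)
  writeLines(paste(c("Name", "Description", colnames(mat)), collapse = "\t"), con)
  body <- cbind(rownames(mat), rownames(mat), format(round(mat, 6), trim = TRUE))
  writeLines(apply(body, 1, paste, collapse = "\t"), con)
}

count_mat <- as.matrix(counts[, c("plasmid", "sample1", "sample2")])
rownames(count_mat) <- sub("-.*$", "", counts$sgRNA)   # keep only the guide sequence

lfc <- counts_to_lfc(count_mat, reference = "plasmid")
gct_path <- file.path(tempdir(), "example.gct")
write_gct(lfc, gct_path)
cat(readLines(gct_path), sep = "\n")
```

The Name and Description columns both carry the guide sequence here, mirroring the excerpt above.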

fit_ceres Rcpp function crashes with "Killed: 9" message after a while

The wrapper ceres::wrap_ceres() always crashes after a while with the message "Killed: 9". After some investigation, I found that the crash happens in the Rcpp function fit_ceres, after the message "Instantiated model matrix..." (see the attached screenshot).

I am running R version 4.2.1 (2022-06-23) on macOS 10.13.6.

Screen Shot 2023-01-10 at 17 58 25

Using run_ceres directly

I can't get prepare_ceres_inputs to work consistently with my proxy settings and, on the rare occasion I do, I can't get it to work with my bowtie and samtools:

> prepare_ceres_inputs(inputs_dir="ceres_inputs",
+                      dep_file="ceres_inputs/ceres_LFC_input.gct",
+                      cn_seg_file="ceres_inputs/ceres_CN_input.tsv",
+                      gene_annot_file="example_data/CCDS.current.txt",
+                      rep_map_file="ceres_inputs/ceres_rep_input.tsv",
+                      genome_id="hg19",
+                      chromosomes=paste0("chr", 1:22),
+                      dep_normalize="zmad")
loading dependency data...

Parsed with column specification:
cols(
  Replicate = col_character(),
  CellLine = col_character()
)
loading copy number data...

mapping sgRNAs to the genome...

sh: bowtie: command not found
sh: samtools: command not found
Error in value[[3L]](cond) : 
  failed to open BamFile: file(s) do not exist:
  '/tmp/RtmpfQL4OU/guides.bam'
In addition: Warning messages:
1: In system(bowtie_cmd) : error in running command
2: In system(samtools_cmd) : error in running command

As a result, I've tried to put together the correct data and supply it directly to run_ceres. It fails with error:

> run_ceres(sg_data=sg_data, cn_data=cn_data, 
+           guide_locus=guide_locus, locus_gene=locus_gene, replicate_map=repmap)
Error in dimnames(x) <- dn : 
  length of 'dimnames' [2] not equal to array extent
In addition: There were 50 or more warnings (use warnings() to see the first 50)

and the warnings are:

Warning messages:
1: In mean.default(x, na.rm = T) :
  argument is not numeric or logical: returning NA

Obviously I'm using real data, but I thought dummy data would help you to spot what I'm doing wrong:


# log fold change calc from plasmid of each gRNA in each sample
dum_sg_lfc <- as.matrix(sapply(1:6, function(x) rnorm(4)))
rownames(dum_sg_lfc) <- c("ATCGA", "ATCGT", "ATCGC", "ATCGG")
colnames(dum_sg_lfc) <- c("A1", "A2", "B1", "B2", "C1", "C2")

# log2ratio copy number at each gRNA cut site in each cell line given as chr:pos
dum_cn_lr <- as.matrix(sapply(1:3, function(x) rnorm(4)))
rownames(dum_cn_lr) <- c("1:100", "1:200", "1:300", "1:400")
colnames(dum_cn_lr) <- c("A", "B", "C")

# dummy data using chr:pos as locus, entrez gene id as gene and sample to cell line names
dum_gl <- data.frame(Guide=rownames(dum_sg_lfc), Locus=rownames(dum_cn_lr))
dum_lg <- data.frame(Locus=rownames(dum_cn_lr), Gene=paste0("eg", 1:nrow(dum_cn_lr)))
dum_rep <- data.frame(Replicate=colnames(dum_sg_lfc), CellLine=gsub("[[:digit:]]*", "", colnames(dum_sg_lfc)))

run_ceres(sg_data=dum_sg_lfc, cn_data=dum_cn_lr, 
          guide_locus=dum_gl, locus_gene=dum_lg, replicate_map=dum_rep)
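
One thing worth ruling out with errors like the dimnames mismatch is identifiers that do not line up between the five inputs, or a data matrix that is not numeric. The sketch below cross-checks them in base R. The helper check_ceres_inputs is hypothetical, and the checks encode assumptions about what run_ceres expects (based on the column names used in the dummy data), not documented requirements.

```r
# Hypothetical helper (not part of ceres): cross-check identifiers between inputs.
check_ceres_inputs <- function(sg_data, cn_data, guide_locus, locus_gene, replicate_map) {
  problems <- character(0)
  note <- function(ok, msg) if (!ok) problems <<- c(problems, msg)

  note(is.numeric(sg_data), "sg_data is not a numeric matrix")
  note(is.numeric(cn_data), "cn_data is not a numeric matrix")
  note(all(rownames(sg_data) %in% guide_locus$Guide),
       "some sg_data row names are missing from guide_locus$Guide")
  note(all(guide_locus$Locus %in% rownames(cn_data)),
       "some guide_locus$Locus values are missing from cn_data row names")
  note(all(locus_gene$Locus %in% rownames(cn_data)),
       "some locus_gene$Locus values are missing from cn_data row names")
  note(all(colnames(sg_data) %in% replicate_map$Replicate),
       "some sg_data column names are missing from replicate_map$Replicate")
  note(all(replicate_map$CellLine %in% colnames(cn_data)),
       "some replicate_map$CellLine values are missing from cn_data column names")
  problems
}

# Dummy inputs shaped like the ones in the report.
set.seed(1)
sg <- matrix(rnorm(24), nrow = 4,
             dimnames = list(c("ATCGA", "ATCGT", "ATCGC", "ATCGG"),
                             c("A1", "A2", "B1", "B2", "C1", "C2")))
cn <- matrix(rnorm(12), nrow = 4,
             dimnames = list(c("1:100", "1:200", "1:300", "1:400"), c("A", "B", "C")))
gl <- data.frame(Guide = rownames(sg), Locus = rownames(cn))
lg <- data.frame(Locus = rownames(cn), Gene = paste0("eg", 1:4))
rep_map <- data.frame(Replicate = colnames(sg),
                      CellLine = sub("[0-9]+$", "", colnames(sg)))

print(check_ceres_inputs(sg, cn, gl, lg, rep_map))   # character(0) when everything lines up
```

An empty result only shows that the names agree; it does not guarantee that run_ceres accepts the inputs.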


Hello, I'm having trouble with the RPPanalyzer tutorial

Hello, I'm a beginner with GitHub and with R, and my English is not good.

I am following the RPPanalyzer tutorial for practice:

RPPanalyzer (Version 1.0.3)
Analyze reverse phase protein array data
User's Guide
Heiko Mannsperger and Stephan Gade
German Cancer Research Center
Heidelberg, Germany
October 1, 2012

Below is what I ran in R:

BiocManager::install("RPPanalyzer")
library(RPPanalyzer)

## define path to example files
dataDir <- system.file("extdata", package = "RPPanalyzer")

## change working directory
setwd(dataDir)
## store example sample description in a variable
sampledescription <- read.delim("sampledescription.txt")
## show sample description header
head(sampledescription)

dataDir <- system.file("extdata", package = "RPPanalyzer")
setwd(dataDir)

## store example sample description in a variable
slidedescription <- read.delim("slidedescription.txt")

## show sample description header
head(slidedescription)

And here is where I'm having trouble:

rawdata <- read.Data(blocksperarray = 4, spotter = "arrayjet", printFlags = FALSE)

When I run this, I get the following error: Error in dimnames(x) <- dn : length of 'dimnames' [2] not equal to array extent.

How can I handle this?

Accepting sgRNAs with length != 20

Currently, CERES assumes that sgRNAs have 20 nts before the PAM in the function guideAlignments. However, some libraries use 19 nt guides. Ideally this could be solved by taking the alignment end rather than the beginning for positive strand alignments. Simple hack that worked for me instead: add an argument guide_length to the guideAlignments call signature. guide_length can be determined in map_guide_to_locus by checking the number of characters in the first entry in guides (assumes all guides are the same length).
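
The idea of using the alignment end on the positive strand can be sketched as plain coordinate arithmetic. The helper pam_range below is hypothetical and does not reproduce guideAlignments; it assumes 1-based inclusive coordinates and a 3-nt PAM immediately 3' of the protospacer.

```r
# Hypothetical helper: genomic coordinates of the 3-nt PAM next to an alignment.
# start/width describe the aligned protospacer (1-based, inclusive);
# strand is "+" or "-".
pam_range <- function(start, width, strand) {
  strand <- rep_len(strand, length(start))   # allow a single strand for all rows
  end <- start + width - 1
  # "+" strand: the PAM follows the alignment end.
  # "-" strand: the PAM sits just below the alignment start.
  pam_start <- ifelse(strand == "+", end + 1, start - 3)
  data.frame(pam_start = pam_start, pam_end = pam_start + 2)
}

# A 20-nt and a 19-nt guide ending at the same base share the same PAM on "+".
plus <- pam_range(start = c(100, 101), width = c(20, 19), strand = "+")
print(plus)

# On "-" the PAM position does not depend on the guide length.
minus <- pam_range(start = c(200, 200), width = c(20, 19), strand = "-")
print(minus)
```

Because the result depends only on the alignment end for "+" and the alignment start for "-", no guide_length argument is needed in this formulation.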

PAM Filter Bug

Hey all,

Looks like there's a minor bug when it goes to filter out alignments without PAMs in guideAlignments. The PAM regular expression in the code is currently "[ACGTN]GG|GG[ACGTN]", which allows "GGN" PAMs to pass through the filter. "[ACGTN]GG" should be sufficient, since getSeq fetches the PAM 5'->3' relative to the query, regardless of the aligned strand. As it stands, the current code incorrectly allows ~1-2% of alignments through the filter when using the Aguirre GeCKO data.

Best,
Scott
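
The difference between the two patterns can be seen on a handful of 3-nt strings. This is a small illustration in base R, not code from the package; the ^ and $ anchors in the second pattern are added here to make the match explicit on 3-character inputs.

```r
pams <- c("AGG", "TGG", "GGA", "GGT", "GGG", "ACT")

current  <- grepl("[ACGTN]GG|GG[ACGTN]", pams)   # pattern quoted in the report
proposed <- grepl("^[ACGTN]GG$", pams)           # NGG only

pam_table <- data.frame(pam = pams, current = current, proposed = proposed)
print(pam_table)
```

"GGA" and "GGT" pass the current pattern but not the proposed one, which is the extra set of alignments described in the report.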

Calculating dependency probabilities

Hello,

I'm interested in calculating the dependency probabilities that are available on the DepMap portal. After following your tutorial, I noticed that the last step is scale_to_essentials(), which I am guessing creates the gene effect scores.

Your Meyers et al. paper mentions using the EM algorithm to generate the dependency probabilities, but I was wondering how that is done in practice. Do you have sample code you can share or add to the tutorial in your README.md?

Thanks!
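
For readers unfamiliar with EM, here is a generic sketch of fitting a two-component Gaussian mixture to a vector of scores and reading off the posterior probability of the lower component. It is an illustration of the general technique on simulated data only. The choice of Gaussian components, the starting values, and the function name are assumptions, and this should not be taken as the procedure used for the DepMap dependency probabilities.

```r
# Generic two-component Gaussian mixture fitted by EM (illustration only).
fit_two_gaussians <- function(x, n_iter = 200, tol = 1e-8) {
  # Start component 1 in the lower tail and component 2 near the bulk.
  mu <- as.numeric(quantile(x, c(0.05, 0.6)))
  sigma <- rep(sd(x), 2)
  w <- c(0.1, 0.9)
  loglik_old <- -Inf
  for (i in seq_len(n_iter)) {
    # E-step: posterior probability that each score belongs to component 1.
    d1 <- w[1] * dnorm(x, mu[1], sigma[1])
    d2 <- w[2] * dnorm(x, mu[2], sigma[2])
    p1 <- d1 / (d1 + d2)
    # M-step: update weights, means and standard deviations.
    w <- c(mean(p1), 1 - mean(p1))
    mu <- c(sum(p1 * x) / sum(p1), sum((1 - p1) * x) / sum(1 - p1))
    sigma <- c(sqrt(sum(p1 * (x - mu[1])^2) / sum(p1)),
               sqrt(sum((1 - p1) * (x - mu[2])^2) / sum(1 - p1)))
    # Stop when the log-likelihood of the previous parameters stops changing.
    loglik <- sum(log(d1 + d2))
    if (abs(loglik - loglik_old) < tol) break
    loglik_old <- loglik
  }
  list(weights = w, means = mu, sds = sigma, prob_component1 = p1)
}

set.seed(42)
scores <- c(rnorm(100, mean = -1, sd = 0.3),   # simulated dependent genes
            rnorm(900, mean = 0, sd = 0.2))    # simulated non-dependent genes
fit <- fit_two_gaussians(scores)
print(round(fit$means, 2))
print(head(round(fit$prob_component1, 3)))
```

In this toy setup, prob_component1 plays the role of a per-gene probability of belonging to the low-scoring group.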

Scale results to non-targeting sgRNAs

Hello @joshdempster,

In the last step of your examples, scale_to_essentials() is used to scale the results to essential and non-essential genes, but I have a screen that contains essential genes and non-targeting sgRNAs instead of nonessential genes. Do you have any suggestions for how to scale to these controls?

Best,
Yuka
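
A generic way to rescale scores to two control sets is a linear transform that places the median of the negative controls at 0 and the median of the positive controls at -1. The sketch below shows that transform in base R. The helper scale_to_controls is hypothetical, the 0 and -1 convention is an assumption about the intended scale rather than a description of scale_to_essentials(), and it presumes the controls have scores on the same scale as the genes. The numbers are made up.

```r
# Hypothetical helper: shift and scale scores so that the median of the
# negative controls is 0 and the median of the positive controls is -1.
scale_to_controls <- function(scores, positive, negative) {
  neg_med <- median(scores[negative], na.rm = TRUE)
  pos_med <- median(scores[positive], na.rm = TRUE)
  (scores - neg_med) / (neg_med - pos_med)
}

scores <- c(ess1 = -2.1, ess2 = -1.9, ess3 = -2.0,
            ctrl1 = 0.1, ctrl2 = -0.1, ctrl3 = 0.0, geneX = -1.0)
scaled <- scale_to_controls(scores,
                            positive = c("ess1", "ess2", "ess3"),
                            negative = c("ctrl1", "ctrl2", "ctrl3"))
print(round(scaled, 2))
```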

Additional resources

Hello cancerdatasci team,

I read the CERES publication by Meyers et al. in Nat Genet, and I am excited to try CERES to correct for copy-number effect in our CRISPR KO screen (which didn't use the Gecko or Wang library). I'd like to learn more about this tool so that I can adapt it for use in our lab. Are there any tutorials besides what is currently in the README.md file? Running through the examples in README.md left me with a few questions, for example:

  • What are the input file format requirements?
  • How were the example data generated (for example, Gecko.gct)?
  • What is the "zmad" argument given to dep_normalize in prepare_ceres_inputs()?
  • How was list(lambda_g=0.68129207) decided as an argument for params in wrap_ceres()?

Thanks for your time!
-Yuka
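
On the "zmad" question: judging by the name alone, it may refer to a robust z-score that centers each sample on its median and scales by the median absolute deviation (MAD). That reading is a guess, not something confirmed by the README. The sketch below shows what such a normalization looks like in base R on simulated values; zmad_columns is a hypothetical helper, not the package's implementation.

```r
# Hypothetical helper: robust z-score per column, using the median as the
# center and the median absolute deviation (MAD) as the scale.
zmad_columns <- function(mat) {
  centers <- apply(mat, 2, median, na.rm = TRUE)
  scales  <- apply(mat, 2, mad, na.rm = TRUE)   # stats::mad() includes the 1.4826 constant
  sweep(sweep(mat, 2, centers, "-"), 2, scales, "/")
}

set.seed(1)
lfc <- matrix(rnorm(40, mean = -0.2, sd = 0.5), nrow = 10,
              dimnames = list(paste0("sg", 1:10), paste0("rep", 1:4)))
z <- zmad_columns(lfc)
print(round(apply(z, 2, median), 6))   # column medians after centering
print(round(apply(z, 2, mad), 6))      # column MADs after scaling
```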

Cannot install ceres

Hi,

I am trying to install ceres in R, but it keeps giving me errors. The error is as follows:
Error: Failed to install 'ceres' from GitHub:
(converted from warning) installation of package ‘C:/Users/User1/AppData/Local/Temp/RtmpuK9ftj/file60a846bd22d/ceres_0.0.0.9000.tar.gz’ had non-zero exit status

Downloading GitHub repo cancerdatasci/ceres@master
Skipping 5 packages not available: Biostrings, Rsamtools, GenomeInfoDb, BSgenome, GenomicRanges

Also, I think you need to update the Bioconductor download script for R versions higher than 3.6.
The way it is written in the manuscript does not allow installation of the 5 packages from Bioconductor.

I looked at the website and downloaded the packages as suggested. But, as you can see above, it keeps giving me errors.

Can you please help me? I am not familiar with R, and it is quite hard to figure out such problems when they come up.

Thanks,
