rnabioco / djvdj
An R package to analyze single-cell V(D)J data
Home Page: https://rnabioco.github.io/djvdj
License: Other
I am trying to add VDJ data to a Seurat gene expression object and I run into this error.
library(djvdj)
female1 <- import_vdj(female,vdj_dir=c("tcr"="data/raw/project/count/PBMC_1_3_2/outs/per_sample_outs/PBMC_1_3_2/vdj_t/"),cell_prefix="-1",filter_contigs=TRUE)
Error in .merge_meta(input, res) :
Cell barcodes do not match those in the object, are you using the correct cell barcode prefixes?
If I check manually, the cell labels in the Seurat object and in the VDJ data (airr_rearrangement.tsv, for example) have the same format:
> head(rownames(female[[]]))
[1] "AAACCTGTCTACTATC-1" "AAAGATGCAACTGCGC-1" "AAAGCAACAGCCAGAA-1"
[4] "AAAGCAATCCTCGCAT-1" "AAAGTAGGTCACCTAA-1" "AAAGTAGGTCGTCTTC-1"
head(read.table("data/raw/2021_084_Maria/count/PBMC_1_3_2/outs/per_sample_outs/PBMC_1_3_2/vdj_t/airr_rearrangement.tsv",sep="\t",header=TRUE)$cell_id)
[1] "AAACCTGTCTACTATC-1" "AAACCTGTCTACTATC-1" "AAAGCAACAGCCAGAA-1"
[4] "AACACGTGTGGTGTAG-1" "AACCATGTCGCAAACT-1" "AACCATGTCGCAAACT-1"
In fact, I have quantified the overlap between gene-expression and vdj.
a1  a2     a1_cells  a2_cells  shared_cells  percent_shared
ge  vdj-b  523       12        12            2.3
ge  vdj-t  523       241       241           46
VDJ-B is not good, but VDJ-T has 46% overlap. So I would expect the overlapping cells to have the metadata added and others to have NAs. Also, is it possible to add only VDJ-T without VDJ-B?
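For anyone wanting to reproduce this kind of overlap check, here is a minimal base R sketch. The barcodes are toy values, `barcode_overlap()` is a hypothetical helper (not part of djvdj), and the last step assumes that a prefix is simply pasted in front of each V(D)J barcode before matching, which is an assumption about how `cell_prefix` is applied rather than something confirmed in this thread.

```r
# Toy barcodes standing in for the Seurat metadata rownames and the V(D)J cell_id column
ge_cells  <- c("AAACCTGTCTACTATC-1", "AAAGATGCAACTGCGC-1",
               "AAAGCAACAGCCAGAA-1", "AAAGCAATCCTCGCAT-1")
vdj_cells <- c("AAACCTGTCTACTATC-1", "AAACCTGTCTACTATC-1",
               "AAAGCAACAGCCAGAA-1", "AACACGTGTGGTGTAG-1")

# Hypothetical helper: count unique barcodes shared between two sets,
# reporting the percentage relative to the first set
barcode_overlap <- function(a, b) {
  a <- unique(a)
  b <- unique(b)
  shared <- intersect(a, b)
  data.frame(
    a_cells        = length(a),
    b_cells        = length(b),
    shared_cells   = length(shared),
    percent_shared = round(100 * length(shared) / length(a), 1)
  )
}

overlap <- barcode_overlap(ge_cells, vdj_cells)
print(overlap)

# Assumption: a prefix is pasted in front of each V(D)J barcode before matching.
# With barcodes that already match, any extra prefix removes the overlap entirely.
prefixed_cells   <- paste0("-1", vdj_cells)
overlap_prefixed <- barcode_overlap(ge_cells, prefixed_cells)
print(overlap_prefixed)
```

If the barcodes already match as shown above, it may be worth trying the import without any prefix.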
R version 4.1.0 (2021-05-18)
djvdj: "0.0.0.9000"
hello,
What output from cellranger can be used as input for djvdj to look at gene usage?
Also, can this be customised, e.g. showing specific barcodes rather than all those from cellranger?
thanks
ibseq
mutate_vdj cannot overwrite existing columns
This does not work:
x %>%
mutate_vdj(chains = str_c(chains, collapse = "_"))
Really great tool! Maybe I missed it, but was hoping to cite this tool in a paper we are publishing and can't see if there is something to cite. Can always just cite the github directly, but was wondering if there is a paper, etc, I should cite instead? Thanks!
It would also be interesting if you could integrate a function for TCR antigen prediction that compares the TCR sequences with published antigen-specific TCR datasets, something similar to TCRmatch.
Other databases and tools that might be of interest:
VDJdb, TCRex, immuneML, TCRGP, NetTCR, ERGO, DeepTCR, ImRex
targets <- antigen_predection(seuratobj, cellnames ="listofcellsname")
head(targets)
cellnames CDR3b CDR3a antigen score
Hi,
I ran the following to add BCR data but got the "both TCR and BCR chains are present" error, even though the input is only a BCR file.
Will you please let me know how to solve this?
Thanks
bcr_added <- import_vdj(
  input = seurat_obj,
  vdj_dir = "BCR_data/sample_data/"
)
Error in .f(.x[[i]], ...): Malformed input data, both TCR and BCR chains are present.
Jotting these down for reference to add later
filter_vdj()
#' @examples
#' # filter for cells with specific numbers of chains
#' filter_vdj(tiny_vdj, length(chains) == 3)
Hi, thank you for the program, it is very useful. We are looking to compare TCR/BCR sequences between samples to see which CDR3 sequences are shared. Is there a way to generate a list of CDR3 sequences conserved across multiple samples in a series, and perhaps the percentage that are shared as well? Using calc_similarity with abdiv-jaccard on two samples that, by eye, appeared to have many overlapping TCR CDR3 sequences, I got a Jaccard value of 0.9712919 (which indicates mostly dissimilar?). Thank you.
Andrew
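As a rough illustration of what is being asked for, the sketch below lists CDR3 sequences shared across samples and computes a pairwise Jaccard index in base R. The sequences are made up, the sample names and `jaccard_index()` are hypothetical, and nothing here uses djvdj itself. Note that if a reported value is a Jaccard distance (1 minus the index), a number near 1 would mean little overlap; the abdiv documentation is the place to confirm which one is returned.

```r
# Toy CDR3 amino acid sequences per sample (made up for illustration)
cdr3 <- list(
  s1 = c("CASSLGQGYEQYF", "CASSPGTGELFF", "CASSQETQYF", "CASSLGQGYEQYF"),
  s2 = c("CASSLGQGYEQYF", "CASSPGTGELFF", "CASRDRGNTIYF"),
  s3 = c("CASSLGQGYEQYF", "CASSYSGNTEAFF")
)

# Work with unique sequences so repeated clonotypes are counted once
cdr3 <- lapply(cdr3, unique)

# Sequences present in every sample
conserved <- Reduce(intersect, cdr3)

# Hypothetical helper: Jaccard index = size of intersection / size of union.
# The Jaccard distance is 1 minus this value.
jaccard_index <- function(a, b) {
  length(intersect(a, b)) / length(union(a, b))
}

# Compare every pair of samples
pairs <- combn(names(cdr3), 2, simplify = FALSE)
similarity <- data.frame(
  sample_1 = vapply(pairs, function(p) p[1], character(1)),
  sample_2 = vapply(pairs, function(p) p[2], character(1)),
  jaccard  = vapply(pairs, function(p) jaccard_index(cdr3[[p[1]]], cdr3[[p[2]]]),
                    numeric(1))
)
similarity$percent_shared <- round(100 * similarity$jaccard, 1)

print(conserved)
print(similarity)
```

This treats each unique CDR3 sequence as present or absent and ignores clonotype abundance, which may differ from how a diversity package weights the comparison.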
Get complete TCR seq for cloning
seq <- fetch_complet_seq(object = seuratobj, cell_names = c("list_of_cells_names"))
head(seq)
cellname  cdr3a_nt  cdr3b_nt  cd3a  cdrb  tcra_chain     tcrb_chain    complete_tcra_seq  complete_tcrb_seq
ATGTAGAG  ATG       ATG       KLV   KLV   Trav15-Traj12  Trab6-trbj12  xxxxxxxxx          xxxxxxx
Originally posted by @Ahmedalaraby20 in #95 (comment)
Thoughts on ROC analysis of protein-DNA tags as classifiers.
The question is how well a given reagent performs as a classifier relative to gene expression classifications (i.e., assuming these are the "gold standards"). AUC values could provide information about reagent quality and can be compared across reagents, batches, etc.
For a function roc_analysis(), input data would be so or sce with:
For a comparison, assume two possible states (e.g., B vs T cell, or B cell vs all other cells). Then step through the range of recovered protein-DNA tag signal and calculate:
True positive rate (TP / (TP + FN)). TP = number of B cells scoring positive, FN = number of B cells scoring negative.
False positive rate (FP / (FP + TN)). FP = number of T cells scoring positive, TN = number of T cells scoring negative.
plot_roc() would plot TPR vs FPR for each of the ranked detection values, and roc_auc() would provide the AUC value from the data.
Would this be of some interest?
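To make the proposal concrete, here is a minimal base R sketch of the calculation described above. The signal values and labels are toy data, `roc_points()` is a hypothetical helper, and `roc_auc()` only borrows the name suggested in this post; neither exists in djvdj. The AUC is computed with the trapezoid rule.

```r
# Toy data: tag signal per cell and the gene-expression-based label
# (assumed here to be the gold standard, as in the proposal)
signal <- c(9.1, 8.4, 7.7, 6.0, 5.2, 4.8, 3.3, 2.9, 1.5, 0.7)
is_b   <- c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, TRUE, FALSE, FALSE)

# Hypothetical helper: TPR and FPR at every observed signal threshold.
# A cell scores positive when its signal is at or above the threshold.
roc_points <- function(score, truth) {
  thresholds <- sort(unique(score), decreasing = TRUE)
  tpr <- vapply(thresholds,
                function(thr) sum(score >= thr & truth) / sum(truth),
                numeric(1))
  fpr <- vapply(thresholds,
                function(thr) sum(score >= thr & !truth) / sum(!truth),
                numeric(1))
  # Start the curve at (0, 0), where no cell scores positive
  data.frame(threshold = c(Inf, thresholds), tpr = c(0, tpr), fpr = c(0, fpr))
}

# Hypothetical helper: area under the ROC curve by the trapezoid rule
roc_auc <- function(roc) {
  sum(diff(roc$fpr) * (head(roc$tpr, -1) + tail(roc$tpr, -1)) / 2)
}

roc <- roc_points(signal, is_b)
auc <- roc_auc(roc)

print(roc)
print(auc)
```

Plotting `roc$fpr` against `roc$tpr` would give the curve that plot_roc() is meant to draw.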
.KEEP column is not removed from meta.data when clonotype_col = NULL and filter_cells = TRUE
What does this refer to?
Line 49 in 8909616
Hey guys,
I have been trying to install djvdj but unfortunately I always end up with this error.
devtools::install_github("rnabioco/djvdj")
Error: Failed to install 'unknown package' from GitHub:
Timeout was reached: [api.github.com] Connection timed out after 10000 milliseconds
What am I doing wrong?
raw data here https://www.ncbi.nlm.nih.gov/sra/?term=PRJNA578389
Custom code available upon request...
Have been thinking lately about doing a deep mutational scan on the MD4 BCR. Would be different way to see if changes in affinity can be accurately measured by AVID-seq.
What is the best way to test for a blank ggplot?
library(testthat)
library(tidyverse)
dat <- tibble(
  x = seq(1, 10),
  y = x ^ 2
)
this throws an error:
dat %>%
  ggplot(aes(x, A)) +
  geom_point()
but the error is not caught:
expect_error(
  dat %>%
    ggplot(aes(x, A)) +
    geom_point(),
  NA
)
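One likely explanation, offered as a sketch rather than a settled answer: ggplot2 evaluates aesthetics lazily, so creating the plot object does not touch column A at all. The error only appears when the plot is built or printed, which is what happens automatically at the console. Also note that expect_error(..., NA) asserts that no error occurs. Forcing the build with ggplot_build() inside the expectation should surface the problem. The data frame below is a stand-in for the tibble above.

```r
library(testthat)
library(ggplot2)

# Stand-in for the tibble in the question; column A does not exist
dat <- data.frame(x = 1:10, y = (1:10)^2)

# Creating the plot object does not evaluate the aesthetics, so no error yet
p <- ggplot(dat, aes(x, A)) +
  geom_point()

# ggplot_build() evaluates the aesthetics, so the missing column fails here
expect_error(ggplot_build(p))

# A plot with valid aesthetics builds without error
p_ok <- ggplot(dat, aes(x, y)) +
  geom_point()
expect_error(ggplot_build(p_ok), NA)
```

The same idea could be used to check for an unexpectedly empty plot, for example by inspecting the data returned by ggplot_build() for each layer.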
Hi guys,
thanks for the very useful selection of functions you've created for interacting with VDJ data! I've started to edit the functions so that they'll work with SingleCellExperiment objects, too: https://github.com/friedue/SCEdjvdj
Feel free to link to it if someone comes asking; forks and contributions very welcome, of course.
Cheers,
Friederike
Hi, when I run the import_vdj function, I got the error message: "Malformed input data, NAs are present, check input files". Do you have any pointers on where I should troubleshoot? I checked the filtered_contig_annotations.csv file from the cellranger outs folder, and I do not find anything weird (except that the d_gene column has some NA values).
Hey guys,
Would be really nice if you could include more visualization methods such as Circos
Thanks a lot
Hi,
Thanks for your great work on djvdj.
I had a problem with the import function. When the clonotype ID is NA, the function doesn't concatenate the row, so it can't use the barcode as metadata and breaks.
thanks a lot