
romanhaa / cerebro


Visualization of scRNA-seq data.

License: MIT License

Shell 0.33% HTML 74.61% CSS 1.20% C++ 10.12% C 1.40% R 7.62% Perl 0.01% JavaScript 1.70% Assembly 0.04% TeX 0.49% Makefile 0.05% q 0.02% Scheme 0.01% MATLAB 0.01% Roff 0.05% Tcl 2.30% Lua 0.02% Dockerfile 0.01% TypeScript 0.01% Python 0.04%

cerebro's Introduction

License: MIT, Lifecycle: retired

⚠️ Discontinuation notice: Sadly, Cerebro and cerebroApp are no longer in active development. See here for more info.

Cerebro

Screenshot Cerebro: overview panel

This is the standalone version of Cerebro (cell report browser), currently available for macOS and Windows, which allows users to interactively visualize various parts of single cell transcriptomics data without requiring bioinformatic expertise.

The core of Cerebro is the cerebroApp Shiny application which is bottled into a standalone app using Electron. Therefore, it can also be run on web servers and Linux machines, requiring only R and a set of dependencies.

Input data needs to be prepared using the cerebroApp R package which was built specifically for this purpose. It offers functionality to export a Seurat object (both v2 and v3 are supported) to the correct format in a single step. The file should be saved either with the .crb or .rds extension, indicating that internally it is an RDS object. Furthermore, the cerebroApp package also provides functions to perform a set of (optional) analyses, e.g. gene set enrichment analysis, pathway enrichment analysis based on marker gene lists of groups of cells, and more.

The exported .crb file is then loaded into Cerebro and shows all available information.

Key features:

  • Interactive 2D and 3D dimensional reductions.
  • Sample and cluster overview panels.
  • Tables of most expressed genes and marker genes for samples and clusters.
  • Tables of enriched pathways for samples and clusters.
  • Query gene(s) and gene sets from MSigDB and show their expression in dimensional reductions.
  • NEW Visualize trajectories calculated with Monocle v2.
  • All plots can be exported to PNG. In addition, 2D dimensional reductions can be exported to PDF.
  • Tables can be downloaded in CSV or Excel format.

Basic examples for Seurat v2 and v3 and scanpy workflows and subsequent exporting can be found in the examples folder. There you can also find the raw data and the output file that can be loaded into Cerebro.

Further screenshots can be found in the screenshots folder.

Introduction to the Cerebro interface

Below you find a brief description of what each panel of the Cerebro interface shows.

For a more detailed description, written for biologists without computational expertise, head over here.

Load data

Select input file (.rds or .crb). Shows number of cells, samples, clusters, as well as experiment name and organism.
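Because the panel only accepts the two extensions named above, a quick check on the file name can catch a wrong file before it is loaded. The following is a minimal sketch; the function name `is_cerebro_input` and the example file name are hypothetical and not part of Cerebro.

```shell
# Hypothetical pre-flight check (not part of Cerebro): succeed only if the
# file name ends in one of the two extensions the Load data panel accepts.
is_cerebro_input() {
  case "$1" in
    *.crb|*.rds) return 0 ;;
    *)           return 1 ;;
  esac
}

# Example with a made-up file name
if is_cerebro_input "my_experiment.crb"; then
  echo "looks like a Cerebro input file"
fi
```

The check is case-sensitive and looks at the name only; it does not verify that the file actually contains a valid RDS object.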

Overview

Shows 2D and 3D dimensional reductions. Cells can be colored by meta data variables, automatically coloring the cells using a categorical or continuous scale. Cells can be randomly down-sampled to improve performance.

Samples

Shows sample-centric perspective of data.

  • Composition of samples by cluster as table and plot.
  • Distribution of number of transcripts and expressed genes by sample.
  • Distribution of mitochondrial and ribosomal gene expression by sample (if it was computed with cerebroApp).
  • Cell cycle by sample, either determined by the Seurat function or using Cyclone (if it was computed and assigned during exporting).

Clusters

Shows cluster-centric perspective of data. See info about Samples panel above for more details.

Most expressed genes

If computed in cerebroApp, provides tables of most expressed genes by sample and cluster.

Marker genes

If computed in cerebroApp, provides tables of marker genes by sample and cluster.

Enriched pathways

If computed in cerebroApp, provides tables of enriched pathways in marker gene lists of samples and clusters.

Gene expression

Shows the expression of specified genes in the data set (the average per cell if multiple genes are given). Calculation is triggered by pressing SPACE or ENTER. Multiple genes must be entered on separate lines or separated by a space, comma, or semicolon. The panel reports which genes are available and which are missing (or misspelled) in the data set. Expression levels are shown in dimensional reductions and as violin plots for every sample and cluster. The average expression across all cells of the 50 most expressed genes (among those specified by the user) is shown as well, to quickly spot which genes drive the color scale.

Gene set expression

Works like the gene expression panel, except that gene sets are selected from MSigDB (requires an internet connection). Only available for human and mouse data.

Trajectory

This tab gives access to trajectory information, if data is available. Currently, we support trajectories generated by Monocle v2, which can be extracted through cerebroApp::extractMonocleTrajectory(). Multiple trajectories can be added to a single Seurat object, so the user needs to choose which of the available ones to visualize. Several interactive plots will be shown, including dimensional reduction, distribution of categorical variables along pseudotime, composition of transcriptional states by sample and cluster, as well as distribution of transcript counts and number of expressed genes by state.

Gene ID conversion

Provides a table for converting gene IDs and names. Includes GENCODE identifier, ENSEMBL identifier, HAVANA identifier, gene symbol and gene type. Only available for mouse and human. Based on GENCODE annotation version M16 (mouse) and version 27 (human).

Analysis info

Overview of parameters that were used during the analysis, as long as they were provided. Also shows list of mitochondrial and ribosomal genes present in the data set if computed with cerebroApp.

Motivation

Single cell RNA-sequencing data is rich and complex. Allowing experimental biologists to explore the results is beneficial for the iterative scientific process of performing analysis and deriving conclusions. Cerebro provides an easy way to access the data without any bioinformatic expertise.

Installation

For people without any experience in using the command line, getting access to Cerebro is probably easiest by downloading Cerebro for your OS from here, then unpacking and launching it. Currently, Cerebro is available only for macOS and Windows.

More experienced users of all platforms can alternatively launch the app through the dedicated cerebroApp R package, which is the core of Cerebro, or the romanhaa/cerebro Docker container.

Please check the image and table below for an overview of the supported operating systems and requirements of each way to start Cerebro.

Options to launch Cerebro.

|                | Standalone desktop application                  | cerebroApp R package           | Docker container                       |
|----------------|-------------------------------------------------|--------------------------------|----------------------------------------|
| Link           | Releases                                        | GitHub                         | Docker Hub                             |
| Supported OS   | macOS, Windows                                  | macOS, Windows, Linux          | macOS, Windows, Linux (not all tested) |
| Requirements   | -                                               | R (3.5.1 or higher)            | Docker client                          |
| Installation   | Download current release from GitHub repository | Through BiocManager::install() | Pull container from Docker Hub         |
| Launch Cerebro | Double-click executable                         | Inside R                       | Start container                        |

Details: cerebroApp R package

Requirements: R (version 3.5.1 or higher)

A convenient IDE would be RStudio but it can be done from any R session. Make sure to install cerebroApp using BiocManager::install() to get the most recent version of dependencies on Bioconductor.

BiocManager::install("romanhaa/cerebroApp")
cerebroApp::launchCerebro()

Details: romanhaa/cerebro Docker container

Requirements: Docker client

docker pull romanhaa/cerebro:latest
docker run -p 8080:8080 -v <export_folder>:/plots romanhaa/cerebro
# for example
docker run -p 8080:8080 -v ~/Desktop:/plots romanhaa/cerebro

Then, in your browser you navigate to the address printed in the terminal, e.g. 127.0.0.1:8080.

Note 1: Binding a local directory with -v <export_folder>:/plots is only necessary if you want to export dimensional reductions from Cerebro.

Note 2: If you need to change the port, you can do that like this:

docker run -p <port_of_choice>:8080 -v <export_folder>:/plots romanhaa/cerebro
# OR
docker run -p <port_of_choice>:<port_of_choice> -v <export_folder>:/plots romanhaa/cerebro Rscript -e 'shiny::runApp(cerebroApp::launchCerebro(), port=<port_of_choice>, host="0.0.0.0", launch.browser=FALSE)'
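The first variant above changes only the host side of the port mapping, while the container keeps listening on 8080. As a minimal sketch, the helper below prints that command for a chosen host port and export folder without running it; the function name `cerebro_docker_cmd` and the port 3838 are hypothetical choices for illustration.

```shell
# Hypothetical helper (not part of Cerebro): print the `docker run` command
# from Note 2 for a given host port and export folder, without executing it.
cerebro_docker_cmd() {
  local host_port="${1:-8080}"   # port on the host machine
  local export_dir="${2:-$PWD}"  # local folder bound to /plots
  # The container side stays on 8080, as in the first variant above.
  printf 'docker run -p %s:8080 -v %s:/plots romanhaa/cerebro\n' \
    "$host_port" "$export_dir"
}

# Example: print the command for host port 3838 and the Desktop folder
cerebro_docker_cmd 3838 "$HOME/Desktop"
```

With this mapping, the browser address uses the host port (here 127.0.0.1:3838). Folder names containing spaces would need to be quoted before the printed command is pasted into a terminal.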

Example data sets

We provide documentation and commands for the following example data sets:

  • pbmc_10k_v3: single sample of human peripheral blood mononuclear cells
  • GSE108041: 4 samples of A549 cells before and after infection with influenza virus
  • GSE129845: 3 samples of human bladder cells (from 3 patients)

Conversion of other single cell data formats

Currently, the cerebroApp R package only provides functions to export a Seurat (v2 or v3) object to the Cerebro input file. However, there are a few other important single cell data storage formats, e.g. AnnData (used by scanpy), SingleCellExperiment (used by scran and scater), and CellDataSet (used by Monocle).

We believe using the existing network of conversion/exporting functions is more efficient than creating a dedicated export function for scanpy data. To highlight how data processed with scanpy (stored in AnnData format) can be prepared for loading into Cerebro, we have prepared a scanpy-based workflow for the pbmc_10k_v3 example data set.

In the figure below, we highlight how you can generate the Cerebro input file from any of the four major formats.

Single cell data formats

Technical notes

Building from source

On macOS

To package Cerebro you need Git and Node.js (which comes with npm) installed on your computer. Then, from the command line, run:

# clone this repository
git clone https://gitlab.com/romanhaa/Cerebro.git
# install Electron packager
npm install electron-packager --global
# go into the repository
cd Cerebro
# install dependencies
npm install
# run the app
npm start
# build the app
npm run package-mac
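
The build commands above assume that Git, Node.js, and npm are already on the PATH. A minimal sketch of a check that can be run first; the function name `missing_tools` is hypothetical and not part of the repository.

```shell
# Hypothetical helper: print each of the given command names that is not on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# The build steps above need these three tools.
missing="$(missing_tools git node npm)"
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line.
  echo "Please install before building:" $missing
fi
```

If nothing is printed, all three tools were found and the build steps can be run as listed.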

To build the Windows version under macOS it is necessary to install Wine. I experienced problems with missing libraries in the stable version (4.0), so I recommend installing the development version (4.4) with Homebrew:

brew tap caskroom/versions
brew update
brew install caskroom/versions/wine-devel
npm run package-win

On Windows

If you're using Linux Bash for Windows, see this guide or use node from the command prompt.

Troubleshooting

  • If the app shows a blank/white window, press CMD+R (macOS) or CTRL+R (Windows) to refresh the page. Especially on slower machines, the interface can load before the Shiny application has launched.

Credits

Contribute

To report any bugs, submit patches, or request new features, please log an issue through the issue tracker. For direct inquiries, please send an email to [email protected].

Citation

If you used Cerebro for your research, please cite the following publication:

Roman Hillje, Pier Giuseppe Pelicci, Lucilla Luzi, Cerebro: interactive visualization of scRNA-seq data, Bioinformatics, btz877, https://doi.org/10.1093/bioinformatics/btz877

License

Copyright (c) 2019 Roman Hillje

The MIT License (MIT)

cerebro's People

Contributors

kant, romanhaa

cerebro's Issues

New panel with useful links about methods.

Gene set expression with module score?

Hi and thanks for the very useful tool!

Currently gene set expression is calculated as an arithmetic mean, which has pitfalls like sensitivity to outliers. Could you add a feature where expression for gene sets is calculated like the module score in Seurat AddModuleScore? This would provide a sort of per-cell "enrichment" of a gene set, and would be more useful than just the mean.

Best regards,
Daniel

Hide elements that are missing?

Does it make sense to hide elements that have nothing to show (and would therefore only show a text that info is missing)?

Info text is useful because it shows user that there could be more info, it's just not generated.

Example: Cell cycle boxes for samples and clusters. Especially the Cyclone info will probably often be missing.

Cerebro under proxy server

This is a screen capture of running Cerebro under a proxy server:

cerebro-under-proxy

Under this technical condition:

  • MacOSX High Sierra v. 10.13.6
  • no firewall: no Little Snitch, no Hands Off!, etc.
  • wifi under a proxy server

Input of Monocle trajectory analysis

Hi,
My name is Phoebe.
Thanks for the nice work, with such a clear demonstration!
I wonder what's the input in your trajectory step in this README.md? To my knowledge, the "seurat@assays$RNA@data" stores the normalized UMI count matrix, which might not be recommended to import in the Monocle object as I saw here:

image

I supposed "seurat@assays$RNA@counts" would be the suggested one as mentioned above?

Any suggestions would be appreciated. Not sure whether I get it wrong or not.
Thank you in advance!

Phoebe

White screen not going away

Even when reloading, the white screen remains, and I can't access the app past it. My hardware isn't particularly slow, so I don't know why.

Input files size/cell number limit

Hi and thanks again for this super useful tool!

I wonder if there is any pre-defined limit (number of cells or file size) in the .crb files that can be uploaded to Cerebro? For some data I'm able to upload and explore data with 20,001 cells but the upload prompts an error when uploading a file with 20,002 cells. Is there a way to increase this limit?

Many thanks!

Maximum Upload Size Exceeded in Standalone Version

Hey @romanhaa

First of all: THANK YOU for this amazing tool! I use Cerebro on an everyday basis and my non-bioinformatician colleagues love it for exploring results I share. I'm also sharing Cerebro objects for our new article submissions :)

I'm opening this issue because of a minor issue when using Cerebro on a Windows machine. Similarly to Cerebro implementation in R, an error message appears when trying to upload large files to Cerebro, stating 'Maximum Upload Size Exceeded'.

When I use Cerebro within R, I can easily bypass this by setting MaxFileSize to a larger value. However, I couldn't figure out how to do this when using a Windows standalone version (which I requested to be installed on my lab study room computer). Is there any way to set this value when using Cerebro in the standalone version?

Ability to create custom cell clusters

Hello!

First words: impressive work! I really like Cerebro 🙂


Also, I like some features of Loupe Cell Browser. Namely it is the ability to:

  1. Using mouse to select cells of interest and create a custom cluster.

Mouse selection of cells

  2. Using gene expression to create a custom cluster matching expression criteria (e.g. log2(counts) > 1).

Custom clusters based on gene expression

  3. Same as 2., but more advanced filters can be specified.

More advanced filter based on gene expression

  4. Using custom clusters to do the differential gene expression analysis.
    This is possible on both global (my cluster vs. all other cells not in my cluster) and local (my cluster 1 vs. my cluster 2) scale.

Differential expression analysis of custom clusters

  5. Not really a thing Loupe can do, but would it be possible to calculate an enrichment of a custom gene set (possibly using custom clusters)?

Would it be possible to implement some of these features? We think analysis of scRNA-seq data generally involves a lot of manual work, so we want to provide biologists with a tool that can not only visualize data but also perform some useful analyses.

Consider the case where a biologist identifies an interesting cell cluster and wants to see its differential expression relative to all other cells, as well as enriched pathways. The biologist could give me the information I need to create the cell cluster of interest, but then I have to manually run DEA and GSEA and share the results. That's very time-consuming, and we think such an analysis can easily be done in a proper tool (Cerebro 🙂).

Thanks in advance! I think I could contribute to Cerebro, but I am not a Shiny expert ☹️

Can't open file using Cerebro Shiny app

Good day,

I can't load a .crb file using the Shiny app. After launching cerebroApp::launchCerebro(maxFileSize =100000), it does not load the file. The Cerebro file was created successfully. I do not know what the issue might be.

crb file:

[14:05:40] Start collecting data...

[14:05:40] Overview of Cerebro object:


class: Cerebro_v1.3
cerebroApp version: 1.3.0
experiment name: ascites_prettx
organism: hg
date of analysis: 2020-12-21
date of export: 2020-12-21
number of cells: 19,325
number of genes: 20,762
grouping variables (2): orig.ident, cell_type
cell cycle variables (1): Phase
projections (3): mnn, umap, UMAP_3D
trees (1): cell_type
most expressed genes: orig.ident, cell_type
marker genes:
  - cerebro_seurat (2): orig.ident, cell_type
enriched pathways:
  - cerebro_seurat_enrichr (2): orig.ident, cell_type, 
  - cerebro_ssGSEA_go (2): orig.ident, cell_type
trajectories:
extra material:


[14:05:40] Saving Cerebro object to: cerebro_ascites_prettx_2020-12-21.crb

[14:06:18] Done!

Console output:

...
Warning in writeBin(bytes, req$.bodyData) :
  problem writing to connection
Warning in writeBin(bytes, req$.bodyData) :
  problem writing to connection
Warning in file(filename, open = "wb") :
  cannot open file '/tmp/RtmpSAaHxk/a88d7f66f4c38577f3b227ea/0.crb': Disk quota exceeded
Warning: Error in file: cannot open the connection
  [No stack trace available]

App screenshot:

Screenshot 2020-12-21 at 14 59 07

Thanks in advance for the help

Cholmod error in performGeneSetEnrichmentAnalysis(); large dataset

Hello,
Thank you for your amazing job! Our biologists greatly appreciate your application!

I am currently working on a large dataset (around 60,000 cells for 40,000 genes), and I have an error during the GSEA.

sobj <- cerebroApp::performGeneSetEnrichmentAnalysis(object = sobj, assay = "RNA", GMT_file = gmt.file, parallel.sz = 4)
[16:03:25] Loading gene sets...
[16:03:25] Loaded 50 gene sets from GMT file.
[16:03:25] Extracting transcript counts...
Error in asMethod(object) :
  Cholmod error 'problem too large' at file ../Core/cholmod_dense.c, line 105

I don't have this problem with a small dataset.
I tried with more memory (I work on a computing cluster), but the problem persists (I allocated 150 GB, but it uses only 70 GB).
Do you have a solution?

Error: object 'seurat' not found when running 'exportFromSeurat' function

Demo
Here is a minimal reproducible example which uses the official Seurat's object and exports it to cerebro file via 'exportFromSeurat'.

library(Seurat)
library(cerebroApp)
cerebroApp::exportFromSeurat(object=pbmc_small, file='./crb.rds', organism='hg', column_cluster='res.0.8', column_sample='orig.ident', experiment_name='pbmc')

Problem
The error is:

Error in cerebroApp::exportFromSeurat(object = pbmc_small, file = "./Downloads/crb.rds", :
object 'seurat' not found

Possible Solution
I suspect the variable name 'seurat' is hard-coded in the function. See these lines in the source code: https://github.com/romanhaa/cerebroApp/blob/e830e9b7191db75214a3fca838e95b9373ba75ed/R/exportFromSeurat.R#L363-L373

It might be the variable named 'export' in the function.

Windows application stuck on white screen.

When starting the Cerebro v1.1 windows app, it loads up to just a white screen. The log screen shows the following:

[2020-03-21 15:50:46.039] [info] stderr:
 Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]) : 
  namespace 'dplyr' 0.8.0.1 is being loaded, but >= 0.8.3 is required
Calls: <Anonymous> ... tryCatch -> tryCatchList -> tryCatchOne -> <Anonymous>

[2020-03-21 15:50:46.049] [info] stderr:
 Execution halted

[2020-03-21 15:50:49.125] [info] mainWindow loaded
[2020-03-21 15:53:08.103] [info] window-all-closed

Loading in multiple seurat object

Hi,

Thanks for the great app. I find it very useful to use your app. However, currently I am facing some issues with loading in files.

Based on your example, you load in 4 different H5 files which were produced by 10X Cell Ranger.

For me, I have a hashtag data. I proceeded to demultiplex it and currently I have one big seurat object with all the identifiers. I proceeded to break it down into 5 individual seurat objects (I have 5 groups) and I am lost here. I don't know how to load it into CerebroApp.

Could you offer some advice?

Thank you very much.

regards,
Fong

Cannot find 'print' in this Seurat object

I successfully loaded the seurat object using
cerebroApp::launchCerebro(maxFileSize =8000)
But I get an error when I click on any analysis tab,
Warning: Error in : Cannot find 'print' in this Seurat object
[No stack trace available]
Warning: Error in : Cannot find 'print' in this Seurat object
74:
Warning: Error in : Cannot find 'print' in this Seurat object
108:
Warning: Error in : Cannot find 'print' in this Seurat object
74:
Warning: Error in : Cannot find 'print' in this Seurat object
104:
Warning: Error in : Cannot find 'print' in this Seurat object
104:
Warning: Error in : Cannot find 'print' in this Seurat object
104:

What does this mean?
Thanks,
Rini

Error: cannot add bindings to a locked environment

Good day,

Thanks for developing such a great tool. I have created my .crb file in our HPC and downloaded it to my laptop where I download the Cerebro app. However, when loading the dataset, I get an error. What can be the cause of this problem? I have the impression that this is a standalone app that can run independently (I do not have R installed on my local computer).

Thanks in advance for your help!

Screenshot 2020-12-16 at 17 44 04
