SingleAnalyst

Introduction

SingleAnalyst is an integrated platform for single-cell RNA-seq data analysis, with a focus on the cell type assignment problem.

SingleAnalyst implements various quality control, normalization, and feature selection methods for data preprocessing, and features a k-nearest neighbors (kNN) based method for cell type annotation and assignment.

Requirements

  • Python >= 3.6
  • Linux / WSL

Installation

  1. Install the dependencies with conda (pip did not work properly for these packages):
    conda install numpy bitarray
    conda install faiss-cpu -c pytorch
  2. Install the package from the repository root:
    pip install .

Usage

Data preprocessing

Read data

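Load the expression matrix and create a singleCellData object.

import os
from scipy.io import mmread
from SingleAnalyst.basic import indexedList, infoTable, singleCellData

# gene_list, cell_list, cell_type_list and path (the directory that
# holds expr_m.mtx) are assumed to be defined beforehand
gene_info = indexedList(gene_list)
cell_info = infoTable(
    ['cell_list', 'cell_type'],
    [cell_list, cell_type_list])
# read the sparse Matrix Market file and convert it to a dense matrix
ex_m = mmread(os.path.join(path, 'expr_m.mtx')).todense()
dataset = singleCellData(ex_m, gene_info, cell_info)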

Alternatively, read from previously saved data:

import SingleAnalyst
datapath = 'output/xin/'
dataset = SingleAnalyst.dataIO.read_data_mj(datapath)

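Quality control

Filter out low-quality data.

# `scr` is not defined in this README; it is presumably an alias for the
# SingleAnalyst package, e.g. `import SingleAnalyst as scr` (assumption)
f1 = scr.filter.minGeneCellfilter()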
f2 = scr.filter.minCellGenefilter()

dataset = dataset.apply_proc(f1)
dataset = dataset.apply_proc(f2)
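
To make the idea behind the two filters concrete, here is a minimal, dependency-free sketch of count-based quality control. It is not SingleAnalyst's implementation: the function name `filter_cells_and_genes`, the thresholds, and the order (cells first, then genes) are all assumptions chosen for illustration.

```python
def filter_cells_and_genes(matrix, min_genes_per_cell=2, min_cells_per_gene=2):
    """Drop low-quality cells, then rarely detected genes.

    matrix is a list of rows (cells), each a list of per-gene counts.
    Returns (filtered_matrix, kept_cell_indices, kept_gene_indices).
    All names and thresholds here are hypothetical.
    """
    # Keep cells in which at least min_genes_per_cell genes have a nonzero count.
    kept_cells = [
        i for i, row in enumerate(matrix)
        if sum(1 for v in row if v > 0) >= min_genes_per_cell
    ]
    n_genes = len(matrix[0]) if matrix else 0
    # Among the kept cells, keep genes detected in at least min_cells_per_gene cells.
    kept_genes = [
        j for j in range(n_genes)
        if sum(1 for i in kept_cells if matrix[i][j] > 0) >= min_cells_per_gene
    ]
    filtered = [[matrix[i][j] for j in kept_genes] for i in kept_cells]
    return filtered, kept_cells, kept_genes


# Toy count matrix: 4 cells (rows) x 4 genes (columns).
counts = [
    [5, 0, 3, 0],  # 2 genes detected
    [0, 0, 1, 0],  # 1 gene detected, removed
    [2, 1, 4, 0],  # 3 genes detected
    [1, 0, 0, 0],  # 1 gene detected, removed
]
filtered, kept_cells, kept_genes = filter_cells_and_genes(counts)
print(kept_cells, kept_genes, filtered)
```

Filtering cells before genes means a gene's detection count is computed only over cells that passed quality control; the reverse order can give a different result.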

Normalization

Normalize the data.

norm = scr.normalization.logNormlization()
dataset.apply_proc(norm)
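
As an illustration of what a log normalization step commonly does, the sketch below scales each cell to a fixed total and applies log(1 + x). The scale factor of 10,000 and the function name `log_normalize` are assumptions; the exact transform used by SingleAnalyst's normalizer may differ.

```python
import math


def log_normalize(matrix, scale=10_000):
    """Scale each cell (row) to `scale` total counts, then apply log1p.

    A common convention in scRNA-seq analysis; the scale value is an
    assumption and is not taken from SingleAnalyst.
    """
    normalized = []
    for row in matrix:
        total = sum(row)
        if total == 0:
            # An empty cell stays all-zero instead of dividing by zero.
            normalized.append([0.0] * len(row))
            continue
        normalized.append([math.log1p(v / total * scale) for v in row])
    return normalized


counts = [[1, 3], [0, 0], [10, 30]]
norm = log_normalize(counts)
# Rows 0 and 2 have the same relative composition, so they normalize identically.
print(norm[0] == norm[2])
```

Dividing by the per-cell total removes differences in sequencing depth, which is why the first and third cells end up with the same normalized profile.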

Feature selection

Select informative features.

s1 = scr.selection.dropOutSelecter(num_features=500)
s2 = scr.selection.highlyVarSelecter(num_features=500)
s3 = scr.selection.randomSelecter(num_features=500)

dataset.apply_proc(s1)
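
The sketch below shows the general shape of a feature selection step: score every gene, then keep the `num_features` best. It ranks genes by plain variance, which is a simplified stand-in for a highly variable gene selector (real methods usually correct for the mean-variance relationship). The function name `top_variance_genes` is hypothetical and not part of SingleAnalyst.

```python
from statistics import pvariance


def top_variance_genes(matrix, num_features):
    """Return the column indices of the num_features most variable genes.

    matrix is a list of rows (cells) of normalized expression values.
    Plain variance is used as the score (a simplification).
    """
    n_genes = len(matrix[0])
    variances = [pvariance([row[j] for row in matrix]) for j in range(n_genes)]
    # Rank genes from most to least variable, then keep the top ones.
    ranked = sorted(range(n_genes), key=lambda j: variances[j], reverse=True)
    # Return indices in their original column order.
    return sorted(ranked[:num_features])


expr = [
    [1.0, 0.0, 5.0],
    [1.0, 2.0, 0.0],
    [1.0, 4.0, 5.0],
]
selected = top_variance_genes(expr, num_features=2)
# Restrict the matrix to the selected genes.
reduced = [[row[j] for j in selected] for row in expr]
print(selected, reduced)
```

Gene 0 is constant across cells, so it carries no information for telling cell types apart and is the one dropped.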

Index building and similarity search

Split the data into training and test sets.

train_d, test_d = scr.process.tt_split(dataset)
refdata = scr.RefData.queryData(train_d)
q_xdata = scr.RefData.queryData(test_d)
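
For readers unfamiliar with the idea, a train/test split simply partitions the cell indices at random. The sketch below is a generic illustration; the function name, the default test fraction, and the seed are assumptions and say nothing about how `tt_split` is implemented.

```python
import random


def train_test_split_indices(n_cells, test_fraction=0.2, seed=0):
    """Randomly partition range(n_cells) into train and test index lists."""
    indices = list(range(n_cells))
    # A dedicated, seeded generator keeps the split reproducible.
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_cells * test_fraction))
    return sorted(indices[n_test:]), sorted(indices[:n_test])


train_idx, test_idx = train_test_split_indices(10, test_fraction=0.3)
print(len(train_idx), len(test_idx))
```

The training cells play the role of the annotated reference, and the held-out cells act as queries whose known labels can be compared against the predictions.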

Build an index for the reference data.

nn_indexer = scr.index.faiss_baseline_nn()

index = scr.index.indexRef(refdata, nn=nn_indexer)

kNN search and cell type annotation

qxm = q_xdata.get_qxm(gene_list=index.gene_ref.get_list())

res = index.get_predict(qxm=qxm)

# visually inspect the kNN result for one query cell
i_qx = qxm[19,:]
nnf = index.get_knn_vis(i_qx)
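
Conceptually, kNN-based annotation finds the reference cells closest to a query cell and assigns the most common label among them. The sketch below uses an exact brute-force search with Euclidean distance in pure Python in place of a Faiss index; the function name `knn_predict`, the toy vectors, and the labels are all hypothetical.

```python
import math
from collections import Counter


def knn_predict(ref_vectors, ref_labels, query, k=3):
    """Label a query cell by majority vote among its k nearest reference cells.

    Returns (predicted_label, neighbors), where neighbors is a list of
    (distance, label) pairs sorted by increasing distance.
    """
    distances = sorted(
        (math.sqrt(sum((a - b) ** 2 for a, b in zip(vec, query))), label)
        for vec, label in zip(ref_vectors, ref_labels)
    )
    neighbors = distances[:k]
    votes = Counter(label for _, label in neighbors)
    predicted = votes.most_common(1)[0][0]
    return predicted, neighbors


# Toy reference: expression vectors with known cell types.
ref_vectors = [[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9], [0.2, 0.1]]
ref_labels = ["alpha", "alpha", "beta", "beta", "alpha"]

pred_a, neighbors_a = knn_predict(ref_vectors, ref_labels, [0.1, 0.1], k=3)
pred_b, neighbors_b = knn_predict(ref_vectors, ref_labels, [5.1, 5.0], k=3)
print(pred_a, pred_b)
```

The second query has only two "beta" cells nearby, so its third neighbor is an "alpha" cell, yet the majority vote still returns "beta". Brute-force search scales linearly with the reference size, which is the cost a dedicated nearest-neighbor index is meant to reduce.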

Contacts

[email protected] or [email protected]
