hbasets's Introduction

HBASets

HBASets is a project for creating molecular maps of the brain starting with a query of a gene set of interest.

What is this tool?

Brain diseases are often due to variations in multiple genes (polygenic disorders) and their interactions with environmental factors.

There is a growing amount of open data describing gene expression in the brain. For example, the Allen Brain Atlas provides a number of highly comprehensive brain atlases in the human, mouse and monkey. However, these valuable resources are underused for the analysis of polygenic brain disorders because the data is not easily accessible beyond the level of a single gene.

This tool aims to make the substantial open data on gene expression in the brain accessible for rapid, custom data mining (across time, anatomy, species and cell types).

Who is this for?

Many scientific studies produce lists of genes that are differentially expressed in a brain disease or where genetic variants are associated with a mental illness. Often, these gene sets are derived from unbiased genome-wide studies and may not have been previously characterized.

This tool supports follow-up analyses of such gene sets and helps answer questions such as:

  • Are these genes most expressed in childhood?
  • Are they expressed in a specific brain area?
  • Are they expressed in neurons or glia?
  • Do the above answers agree in mouse, monkey and human brain?

Data

To start with, you can download the normalized microarray datasets of gene expression from 6 adult human brains that were released by the Allen Brain Atlas. Place the raw data in /HBAsets/data/raw/ and run the make_dataset.py preprocessing script to generate the processed expression matrix used for analysis.

Alternatively, if you would like to go straight to the analysis steps, download the preprocessed data from GDrive here and place the files directly in /HBAsets/data/processed
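To give a rough idea of what a preprocessing step like the one above can involve, here is a minimal sketch that collapses probe-level microarray rows into one row per gene by averaging. This is an assumption for illustration only and is not taken from make_dataset.py; the function name collapse_probes_to_genes, the input layout, and the toy values are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def collapse_probes_to_genes(probe_expression, probe_to_gene):
    """Average probe-level rows into one row per gene symbol.

    probe_expression: dict of probe ID -> list of values (one per sample)
    probe_to_gene:    dict of probe ID -> gene symbol
    Both structures are hypothetical stand-ins for the raw files.
    """
    rows_by_gene = defaultdict(list)
    for probe_id, values in probe_expression.items():
        gene = probe_to_gene.get(probe_id)
        if gene is None:
            continue  # skip probes that have no gene annotation
        rows_by_gene[gene].append(values)

    # zip(*rows) walks the samples column by column, so each gene
    # ends up with one averaged value per sample.
    return {
        gene: [mean(column) for column in zip(*rows)]
        for gene, rows in rows_by_gene.items()
    }


# Toy input (made-up numbers, not real Allen Brain Atlas data).
probe_expression = {
    "p1": [1.0, 3.0],
    "p2": [3.0, 5.0],
    "p3": [2.0, 2.0],
    "p4": [9.0, 9.0],  # unannotated probe, dropped below
}
probe_to_gene = {"p1": "GENE_A", "p2": "GENE_A", "p3": "GENE_B"}

gene_matrix = collapse_probes_to_genes(probe_expression, probe_to_gene)
print(gene_matrix)
```

Averaging is only one of several ways to summarize multiple probes per gene; the actual script may select or weight probes differently.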

How can I get involved?

HBAsets is openly developed and welcomes contributors. Check out the Roadmap and the contributing guidelines to get an idea of how to start. Your thoughts, questions, bug reports and suggestions are valuable to us, so feel free to join the discussion in the issues.

hbasets's People

Contributors

abbycabs, derekhoward

hbasets's Issues

Incorporate an expression heatmap in the webapp

In the notebook, I show a heatmap of the expression of the gene set in the top 10 brain structure hits using the genometools library, which builds on top of plotly.

It would be great to incorporate a heatmap based on the displayed table output.

Incorporate interactive results table

The Dash app currently produces a static HTML table with some statistics for the top 10 brain structure results.

I've looked around the Dash forum and found this interactive table component, which seems like a great way to display and explore the results.

Down the line, this would help users select the brain structures of interest to display in the expression heatmap and other figures.

I'm also wondering if there are other suggestions for how to present this data.

Provide processed fetal human brain data

Example analyses can be shown in the Jupyter Notebook. This will also provide more context for future steps in the Dash application, which should incorporate a selection button so that the analysis is run on a specific dataset.

Refactor data processing to pull data directly from Allen Brain API

It would help from a reproducibility perspective if the data processing could run by pulling the data directly from the Allen Brain API.

There is currently a download_HBA_adult_human_data() function in make_dataset.py, along with the essential Well_Known_File IDs. This hasn't been incorporated yet because the donor information wasn't tracked in the same way as when you download the data directly from the Allen website.
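One possible way to keep donor information attached to each download is to carry the donor label alongside the file ID from the start. The sketch below only builds the download URLs and performs no network request. It assumes the Allen Brain API's well_known_file_download endpoint; the helper name build_download_plan, the donor labels, and the file IDs are placeholders, not the real values used in make_dataset.py.

```python
# Assumed endpoint pattern for downloading a file by its Well_Known_File ID.
ALLEN_DOWNLOAD_ENDPOINT = (
    "http://api.brain-map.org/api/v2/well_known_file_download/{file_id}"
)


def build_download_plan(file_ids_by_donor):
    """Return one record per donor so the donor label travels with its URL.

    file_ids_by_donor: dict of donor label -> Well_Known_File ID
    (hypothetical structure for illustration).
    """
    return [
        {
            "donor": donor,
            "url": ALLEN_DOWNLOAD_ENDPOINT.format(file_id=file_id),
        }
        for donor, file_id in sorted(file_ids_by_donor.items())
    ]


# Placeholder labels and IDs; substitute the real Well_Known_File IDs.
plan = build_download_plan({"donor_b": 2, "donor_a": 1})
for entry in plan:
    print(entry["donor"], entry["url"])
```

Each record could then be passed to whatever download routine is used, with the donor label written into the output path or a metadata column.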

Suggestions for incorporating tests

Multiple types of tests should probably be used.

I think the most essential would be testing to make sure the user input is in the expected format (each gene separated by white space) -- what happens if a user puts in a comma separated list of genes?
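As a starting point for discussion, here is a minimal sketch of an input parser that tolerates both whitespace and comma separated gene lists, with a few assertions that could become unit tests. The function name parse_gene_list is hypothetical and is not part of the current app; accepting commas and semicolons is an assumption about the desired behaviour.

```python
import re


def parse_gene_list(raw_text):
    """Split user input into gene symbols, preserving order and case.

    Accepts whitespace, commas or semicolons as separators and drops
    empty tokens and repeated symbols.
    """
    tokens = re.split(r"[\s,;]+", raw_text.strip())
    seen = set()
    genes = []
    for token in tokens:
        if token and token not in seen:
            seen.add(token)
            genes.append(token)
    return genes


# Candidate test cases for the input format.
assert parse_gene_list("BDNF GRIN2B\nDRD2") == ["BDNF", "GRIN2B", "DRD2"]
assert parse_gene_list("BDNF, GRIN2B,DRD2") == ["BDNF", "GRIN2B", "DRD2"]
assert parse_gene_list("BDNF BDNF") == ["BDNF"]
assert parse_gene_list("   ") == []
```

Symbols are deliberately not upper-cased, since some gene symbols contain lowercase letters and changing case could break matching against the expression matrix.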

Which other tests are essential for ensuring that the results can be trusted?
