neurodata / connectocross

Connectocross: statistical characterizations and comparisons of nanoscale connectomes across taxa (A paper in progress)

Home Page: https://docs.neurodata.io/connectocross/

License: MIT License

Languages: Jupyter Notebook 78.36%, Python 21.64%
Topics: connectome

connectocross's Introduction

Datasets


C. elegans male and hermaphrodite, full body

Paper Link
Data Link
Raw data location
# nodes ~300
# edges
# synapses
# graphs 2

Notes

  • has chemical and gap junction graphs
  • has some single-cell transcriptomics
  • has cell lineage

C. elegans timeseries, nerve ring

Paper Link
Data
Raw data location
# nodes ~50 - 150 per graph?
# edges
# synapses
# graphs 8

Notes

  • time series of graphs (though from different animals)
  • 2 animals at the last timepoint
  • I have code to pull data

Drosophila larva brain

Paper not yet available
Data we have it
Raw data location CATMAID
# nodes 2971
# edges ~100k
# synapses ~300k
# graphs 1

Notes

  • Have incomplete cell lineage
  • I think Marta's lab has some single-cell RNA-seq (scRNA-seq)
  • Have edge types split by axon and dendrite

Drosophila adult brain chunk (hemibrain)

Paper Link
Data Link
Raw data location neuPrint
# nodes 20 - 25k, 67k more small objects
# edges
# synapses 64M
# graphs 1

Drosophila adult brain sparse (FAFB)

Paper Link
Data Link to overview, Link to CATMAID
Raw data location CATMAID
# nodes
# edges
# synapses
# graphs 1

Platynereis larva full

Paper Link
Data not yet available (I think)
Raw data location CATMAID
# nodes 2728
# edges 11437
# synapses
# graphs 1

MiCRONS

Bryan Jones Retina

Ciona intestinalis

Paper Link
Data
# nodes ~200?
# edges
# synapses
# graphs

Simple a priori models

a.k.a. look at the data, more or less

Simplest statistics

Things that we always want to know about a graph. Usually:

  • Number of nodes
  • Number of edges
  • For a connectome, maybe number of actual synapses

Density (ER)

  • compute the density (p) for each connectome; these can simply be plotted side by side.
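
A minimal sketch of the counts and density above, assuming each connectome is available as a square NumPy array whose entry (i, j) is the number of synapses from neuron i to neuron j, and that self-loops are ignored. The function name `simple_stats` and the toy matrix are made up for illustration.

```python
import numpy as np


def simple_stats(adj):
    """Return node, edge, and synapse counts plus directed density (p)."""
    off_diag = np.array(adj)  # copy, so the caller's matrix is untouched
    np.fill_diagonal(off_diag, 0)  # assumption: self-loops are ignored
    n_nodes = off_diag.shape[0]
    n_edges = int(np.count_nonzero(off_diag))
    n_synapses = int(off_diag.sum())
    n_possible = n_nodes * (n_nodes - 1)  # ordered pairs, no self-loops
    density = n_edges / n_possible if n_possible > 0 else float("nan")
    return {
        "n_nodes": n_nodes,
        "n_edges": n_edges,
        "n_synapses": n_synapses,
        "density": density,
    }


# Toy 3-neuron graph with synapse counts as weights.
toy_adj = np.array([[0, 2, 0], [1, 0, 0], [0, 3, 0]])
toy_stats = simple_stats(toy_adj)
```

Under an ER model the density is the single estimated parameter, so one number per connectome is enough for a bar chart.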

Left/right (SBM/DCSBM)

  • Test different hypotheses about $\hat{B}$ (see statistical connectomics)
    • is it more densely connected within block than between? To what extent?
      • maybe can compare this for many of the connectomes. probably not all
    • core-periphery
    • etc.
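
One way to get the $\hat{B}$ that these hypotheses are about, as a rough sketch: for a directed graph without self-loops, each entry is the fraction of possible edges from one block to another that are present. The function name `estimate_block_probs`, the "L"/"R" labels, and the toy graph are assumptions for illustration only.

```python
import numpy as np


def estimate_block_probs(adj, labels):
    """Estimate the block probability matrix for fixed node labels."""
    binary = (np.asarray(adj) != 0).astype(float)
    labels = np.asarray(labels)
    blocks = np.unique(labels)
    b_hat = np.full((len(blocks), len(blocks)), np.nan)
    for a, block_a in enumerate(blocks):
        rows = np.flatnonzero(labels == block_a)
        for b, block_b in enumerate(blocks):
            cols = np.flatnonzero(labels == block_b)
            sub = binary[np.ix_(rows, cols)]
            n_edges = sub.sum()
            n_possible = sub.size
            if a == b:
                # Within a block, drop the diagonal (self-loops).
                n_edges -= np.trace(sub)
                n_possible -= len(rows)
            if n_possible > 0:
                b_hat[a, b] = n_edges / n_possible
    return blocks, b_hat


# Toy graph: two left and two right neurons.
toy_adj = np.array(
    [
        [0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 0],
    ]
)
toy_blocks, toy_b_hat = estimate_block_probs(toy_adj, ["L", "L", "R", "R"])
```

Comparing the diagonal of the estimate (within-hemisphere) to the off-diagonal (between-hemisphere) gives a first look at the "within versus between" question before any formal test.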

Left/right + any known metadata (SBM/DCSBM)

  • If any putative cell types are known, use those
  • now we get a more refined SBM than the above, maybe interesting, maybe not?
    • cell type data may not be available for all of the above
  • can do similar tests, results may or may not be different

General low rank (RDPG)

  • Scree plots
  • estimation of rank (ZG2)
  • not sure whether this will be interesting to compare across connectomes. would have to normalize for the number of nodes somehow, i'd think.
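
A sketch of the scree and rank-estimation steps, assuming "ZG2" means the second elbow found by the Zhu and Ghodsi profile-likelihood method. This is a plain NumPy re-implementation for illustration, not the routine any particular library ships; the function names and toy values are made up.

```python
import numpy as np


def singular_values(adj):
    """Singular values of the adjacency matrix, largest first (scree input)."""
    return np.linalg.svd(np.asarray(adj, dtype=float), compute_uv=False)


def first_elbow(values):
    """Split point q maximizing the two-group Gaussian profile likelihood."""
    d = np.sort(np.asarray(values, dtype=float))[::-1]
    n = len(d)
    best_q, best_ll = 1, -np.inf
    for q in range(1, n):
        top, rest = d[:q], d[q:]
        # Each group has its own mean; the variance is pooled across groups.
        ss = ((top - top.mean()) ** 2).sum() + ((rest - rest.mean()) ** 2).sum()
        var = max(ss / (n - 2), 1e-12)  # floor avoids log(0) on exact ties
        ll = -0.5 * n * np.log(2 * np.pi * var) - ss / (2 * var)
        if ll > best_ll:
            best_q, best_ll = q, ll
    return best_q


def elbows(values, n_elbows=2):
    """Successive elbows: re-apply the split to the values after each elbow."""
    d = np.sort(np.asarray(values, dtype=float))[::-1]
    found, start = [], 0
    for _ in range(n_elbows):
        if len(d) - start < 3:  # too few values left to split
            break
        start += first_elbow(d[start:])
        found.append(start)
    return found


# Toy spectrum with two clear gaps, standing in for singular_values(adj).
toy_values = [10, 9.8, 9.6, 5, 4.9, 4.8, 0.3, 0.2, 0.1]
toy_elbows = elbows(toy_values)
toy_svals = singular_values(np.diag([3.0, 1.0, 2.0]))
```

The last entry of the returned list would be the rank estimate under this reading of ZG2.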

Distribution of weights, degrees

  • Can just look at the distribution of edge weights for each connectome, where (i guess) weight is the number of synapses
  • in/out degree distributions, marginal and joint, are easy enough to plot.
    • again, don't know whether it'll be meaningful to compare across connectomes
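
A small sketch of the quantities to plot, assuming a synapse-count adjacency matrix with rows as presynaptic neurons and no self-loops. The function names are illustrative only.

```python
import numpy as np


def edge_weights(adj):
    """Synapse counts of all present edges, in row-major order."""
    adj = np.asarray(adj)
    return adj[adj > 0]


def degree_sequences(adj):
    """In-degree and out-degree of every node (unweighted)."""
    binary = np.asarray(adj) != 0
    in_degree = binary.sum(axis=0)  # column sums: incoming edges
    out_degree = binary.sum(axis=1)  # row sums: outgoing edges
    return in_degree, out_degree


def joint_degree_counts(in_degree, out_degree):
    """Table whose entry (a, b) counts nodes with in-degree a and out-degree b."""
    joint = np.zeros((in_degree.max() + 1, out_degree.max() + 1), dtype=int)
    np.add.at(joint, (in_degree, out_degree), 1)
    return joint


toy_adj = np.array([[0, 2, 0], [1, 0, 0], [0, 3, 0]])
toy_weights = edge_weights(toy_adj)
toy_in, toy_out = degree_sequences(toy_adj)
toy_joint = joint_degree_counts(toy_in, toy_out)
```

The marginals are histograms of the two degree sequences, and the joint table can be shown as a heatmap.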

More complicated a priori models

Homotypic affinity

  • can test for whether cell pairs (or blocks?) are more likely than chance to connect (homotypic affinity)
  • requires having cell pairs
    • probably only maggot and C. elegans

Testing left vs right, quantify correlation, spectral similarity, GM performance, etc.

Testing for gaia's directedness (or just quantifying to what extent it happens)

  • degree of reciprocal feedback? had thought about testing for the difference between left and right latent positions, but maybe a simpler first statistic to compute is: P(edge from j to i | edge from i to j)
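
The conditional probability above can be estimated directly as the fraction of edges whose reverse edge is also present. A minimal sketch, assuming an unweighted reading of the adjacency matrix and ignoring self-loops; the function name is made up.

```python
import numpy as np


def reciprocity(adj):
    """Estimate P(edge from j to i | edge from i to j)."""
    binary = np.asarray(adj) != 0  # fresh boolean array, safe to modify
    np.fill_diagonal(binary, False)  # assumption: self-loops are ignored
    n_edges = binary.sum()
    # An edge i -> j is reciprocated when j -> i is also present.
    n_reciprocated = (binary & binary.T).sum()
    return n_reciprocated / n_edges if n_edges > 0 else float("nan")


# Toy graph: 0 <-> 1 is reciprocal, 2 -> 1 is not.
toy_adj = np.array([[0, 2, 0], [1, 0, 0], [0, 3, 0]])
toy_reciprocity = reciprocity(toy_adj)
```

Comparing this value to the graph density gives a quick sense of whether reciprocal edges are more common than an ER model would suggest.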

A posteriori models

Spectral clustering and estimating an SBM, DCSBM, DDSBM

  • can try to incorporate homotypic affinity also... or correlation L/R
  • figure 3 from maggot paper

Feedforward layout and proportion of feedforward edges

Models with biological metadata

Testing for Peters' rule via the contact graph

  • is the adjacency a noisy version of the contact graph?
  • how does rank change as we jitter xyz of synapses
  • could we also just swap synapses in an epsilon ball and see how structure changes?

Spectral clustering that uses morphology

Configuration models that swap synapses within an epsilon ball

Can we cluster edges via connectivity + space?

  • had talked about trying to cluster the line graph
  • spectral embedding of the line graph looked bad when I tried it. Need to follow up.

Niche models that may not work for all data

Different hypotheses for a multilayer SBM-like model

  • maggot data

Matching FAFB and hemibrain or either to maggot

  • could be spectral, could be GM
  • results maybe bad?
  • could use morphology, could not

Spectral coarsening between maggot and adult

connectocross's People

Contributors

bdpedigo, caseyweiner, jingyan230, pauladkisson, spencer-loggia

connectocross's Issues

define a file spec for attributed graphs

  • should be well suited to the data we care about here as a first use case
  • in the past we have discussed a csv edgelist for the graph itself with separate json(s) for metadata
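
A sketch of what the csv-plus-json idea could look like, using only the standard library. The file names (`edgelist.csv`, `nodes.json`), the column names (`source`, `target`, `weight`), and the metadata layout are all placeholders, since the actual spec is still to be decided.

```python
import csv
import json
import tempfile
from pathlib import Path


def save_graph(directory, edges, node_metadata):
    """Write edges to edgelist.csv and node metadata to nodes.json."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    with open(directory / "edgelist.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["source", "target", "weight"])
        writer.writerows(edges)
    with open(directory / "nodes.json", "w") as f:
        json.dump(node_metadata, f, indent=2)


def load_graph(directory):
    """Read back the edge list and node metadata written by save_graph."""
    directory = Path(directory)
    with open(directory / "edgelist.csv", newline="") as f:
        edges = [
            (row["source"], row["target"], int(row["weight"]))
            for row in csv.DictReader(f)
        ]
    with open(directory / "nodes.json") as f:
        node_metadata = json.load(f)
    return edges, node_metadata


# Round trip on a toy graph with string node IDs.
edges = [("n1", "n2", 4), ("n2", "n1", 1)]
node_metadata = {"n1": {"hemisphere": "L"}, "n2": {"hemisphere": "R"}}
with tempfile.TemporaryDirectory() as tmp:
    save_graph(tmp, edges, node_metadata)
    loaded_edges, loaded_nodes = load_graph(tmp)
```

Keeping node IDs as strings avoids surprises when they are used as JSON keys, which are always strings.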

L/R homotypic affinity?

to what extent are edges between L/R pairs more probable than any L/R connection?

(which datasets have L/R pairs?)

(If they don't have pairs, can we use graph matching to predict?)
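
One rough way to put a number on the first question, assuming the pairs are given as two aligned index lists (`left[k]` and `right[k]` form the k-th homologous pair) and that every listed neuron has a partner. The function name and the toy graph are hypothetical.

```python
import numpy as np


def homotypic_affinity(adj, left, right):
    """Compare edge probability between paired neurons to any L/R edge."""
    binary = (np.asarray(adj) != 0).astype(int)
    left, right = np.asarray(left), np.asarray(right)
    n_pairs = len(left)
    lr = binary[np.ix_(left, right)]  # left -> right block
    rl = binary[np.ix_(right, left)]  # right -> left block
    # Paired neurons sit on the diagonal of each block.
    p_pair = (np.trace(lr) + np.trace(rl)) / (2 * n_pairs)
    p_any = (lr.sum() + rl.sum()) / (2 * n_pairs**2)
    return p_pair, p_any


# Toy graph: pairs are (0, 2) and (1, 3).
toy_adj = np.zeros((4, 4), dtype=int)
for pre, post in [(0, 2), (2, 0), (1, 3), (0, 3)]:
    toy_adj[pre, post] = 1
toy_p_pair, toy_p_any = homotypic_affinity(toy_adj, left=[0, 1], right=[2, 3])
```

A ratio of the first value to the second well above 1 would point toward homotypic affinity; a formal test would still be needed.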

write data pulling functions for the connectomes of interest

for the connectomes listed in the readme (roughly, exact ones may change)

  • want a simple, clear script to pull necessary graph + metadata from wherever it's hosted online
  • saves that data to the format specified in #1

Note: obviously a bit downstream of #1, but some work on pulling the data could probably start concurrently. configuring how to save to whatever format we pick is likely not the bottleneck for this issue.

Consistent styling

Palettes

As a group, please decide on a consistent color palette for species/dataset:
https://seaborn.pydata.org/tutorial/color_palettes.html
https://matplotlib.org/stable/tutorials/colors/colormaps.html

It may make sense to do something like have two different shades of the same color for multiple related datasets (e.g. two C. elegans could be two shades of blue or something) as long as this is less distinct than the species annotation.

I'd like this palette to just be saved somewhere as a json that any script can just import
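
A sketch of the save-and-import idea using only the standard library. The dataset keys and hex colors below are placeholders, not a decided palette, and the file name is arbitrary.

```python
import json
import tempfile
from pathlib import Path

# Placeholder colors only: two shades of blue for the two C. elegans datasets.
palette = {
    "c_elegans_hermaphrodite": "#1f77b4",
    "c_elegans_male": "#aec7e8",
    "drosophila_larva": "#2ca02c",
}


def save_palette(palette, path):
    """Write the dataset-to-color mapping as JSON."""
    with open(path, "w") as f:
        json.dump(palette, f, indent=2, sort_keys=True)


def load_palette(path):
    """Read the dataset-to-color mapping back as a dict."""
    with open(path) as f:
        return json.load(f)


# Round trip through a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    palette_path = Path(tmp) / "palette.json"
    save_palette(palette, palette_path)
    loaded_palette = load_palette(palette_path)
```

A dict like this can generally be passed as the `palette` argument of seaborn plotting functions when `hue` is the dataset name.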

Style

As a group, please decide on a consistent style for matplotlib. This is also something that can be saved and imported easily. One example is here, which you are welcome to use or modify.

E.g. each notebook that you each make separately can then just call set_theme(), and everyone's plots will look the same

simple statistics

  • number of nodes
  • number of edges
  • max degree
  • graph density

chart of the above for each connectome we have

fit a priori SBMs to the connectomes

Spencer:

  • seems like something that is easy to do and we could do for all of them
  • could just use the metadata, find some features that we care about.

Ben:

  • maybe we fit using various node metadata columns and just report likelihood, number of parameters, bic or something like that
  • maybe also just plot them and show that we can fit these models

graph matching stuff

we could run graph matching on a lot of these connectomes
maybe even match some of them to other connectomes

(decide if we need a) lightweight graph + metadata object

i often find something like this helpful.

Just a light graph object that stores adjacency matrix + pandas metadata on the nodes dataframe

haven't done anything smart for multigraphs or edges with features

there is likely a better way to implement the above

unsure if it is even necessary or if networkx is enough, but i end up using the adjacency matrix representation so much that it was convenient
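
For discussion, a minimal sketch of what such an object could look like: an adjacency matrix plus a pandas DataFrame with one row per node, kept in the same order. The class name `MetaGraph` and its single method are hypothetical, not the existing implementation mentioned above.

```python
from dataclasses import dataclass

import numpy as np
import pandas as pd


@dataclass(eq=False)
class MetaGraph:
    """Adjacency matrix with row-aligned node metadata."""

    adj: np.ndarray
    nodes: pd.DataFrame

    def __post_init__(self):
        self.adj = np.asarray(self.adj)
        n = len(self.nodes)
        if self.adj.shape != (n, n):
            raise ValueError("adjacency must be n x n for n metadata rows")

    def subset(self, mask):
        """Keep nodes where mask is True, in both matrix and metadata."""
        idx = np.flatnonzero(np.asarray(mask))
        return MetaGraph(self.adj[np.ix_(idx, idx)], self.nodes.iloc[idx])


toy_nodes = pd.DataFrame({"hemisphere": ["L", "R", "L"]}, index=["a", "b", "c"])
toy_graph = MetaGraph(np.array([[0, 1, 2], [3, 0, 4], [5, 6, 0]]), toy_nodes)
left_graph = toy_graph.subset(toy_graph.nodes["hemisphere"] == "L")
```

The main point of the design is that any filtering or reordering goes through one method, so the matrix and the metadata cannot drift out of alignment.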

write data pulling functions for the connectomes of interest - xlsx

for the connectomes listed in the readme (roughly, exact ones may change)

  • want a simple, clear script to pull necessary graph + metadata from wherever it's hosted online
  • saves that data to the format specified in #1

Note: Sub-issue of #2

Specifically, focus on datasets stored as .xlsx files.
