cytodata-hackathon-2021's Introduction

Welcome to CytoData Hackathon 2021!

Welcome to CytoData Hackathon 2021. Grab a cup of your favourite drink and join us on a journey where you will learn new skills, make new friends and put your quantitative skills to the test on an interesting challenge in stem cells.


Pluripotency is the capacity of certain stem cells to give rise to virtually any cell of the body. Pluripotent cells can be taken from the early stages of development or induced through transduction of specific transcription factors (reprogramming).

Human induced pluripotent stem cells (iPSCs) continuously face a choice between making exact copies of themselves (‘self-renewal’) and turning into any human cell type (‘differentiation’), which gives them great potential for research and therapeutic uses.

Cell death and loss of pluripotency are common in iPSC culturing. Thus, assessing cell state (e.g. good/pluripotent or bad/differentiated) is a critical step in stem cell research, both in academia (to study the cells) and in biotech (for quality control).


  • ‘Good’: Colonies with pluripotent cells
  • ‘Bad’: Colonies with differentiated cells
  • ‘Empty’: Empty wells, no or very few cells

We imaged human iPSCs on 96-well plates over 2 weeks using a Sartorius Incucyte device. For your convenience, we provide a training set divided as above into good, bad and empty, as well as test sets with real-life examples.

Images show single cells at early timepoints dividing to form colonies over time; feel free to take full advantage of cell morphology, colony morphology and/or context features such as how far cells are from each other over time.

This year, we challenge you to build pipelines that can analyse and assess the pluripotency of iPSC based just on phase microscopy images. The result of this hackathon could lead to more efficient processes in iPSC culturing and analysis.



Table of Contents

Challenges

Prizes

Dataset

Who

When

Where

What

Questions?



Challenges

The CytoData Hackathon 2021 offers 3 challenges for your quantitative/qualitative minds. Please use our standardised format sheet for reporting your clustering predictions. Evaluation will consist of two quantitative scores and one qualitative jury prize.

Challenge 1:

Using the dataset in the ‘Training’ folder, cluster the images in the ‘Test 1’ folder into ‘Good’ and ‘Bad’ categories. The team that clusters the most images correctly will get the highest score.
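
As a rough illustration of how "most images clustered correctly" could be tallied, here is a minimal sketch that computes plain accuracy from two label mappings. This is an assumption about the scoring, not the organisers' official script. The function name, the image names and the rule that a missing prediction counts as wrong are all hypothetical.

```python
def accuracy(predictions, ground_truth):
    """Fraction of images whose predicted label matches the reference label.

    Both arguments map an image name to a label ("Good" or "Bad").
    Images missing from `predictions` count as incorrect (an assumption).
    """
    if not ground_truth:
        raise ValueError("ground_truth must not be empty")
    correct = sum(
        1 for image, label in ground_truth.items()
        if predictions.get(image) == label
    )
    return correct / len(ground_truth)


# Hypothetical image names, for illustration only.
truth = {
    "img_001.png": "Good",
    "img_002.png": "Bad",
    "img_003.png": "Good",
    "img_004.png": "Bad",
}
preds = {
    "img_001.png": "Good",
    "img_002.png": "Good",   # wrong
    "img_003.png": "Good",
    # img_004.png has no prediction and is counted as wrong
}
score = accuracy(preds, truth)
print(f"accuracy = {score:.2f}")
```

Plain accuracy is easy to compare across teams, but it can be misleading if one class dominates the test set. A per-class breakdown may be worth reporting alongside it.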

Challenge 2a:

How generalisable is your algorithm? Can it assess a single image independently of its time point? Please use your algorithm to cluster the images in the ‘Test 2a’ folder. The team with the most images correctly clustered will get the highest score.

Challenge 2b:

Can you obtain spatial information from the images? Would you be able to tell which part of a colony is good and which part is bad for the images in our folders? You can start from the Training and Test 1 images and move on to the others too.

Challenge 3:

Come up with your own standards to assess the pluripotency of iPSCs. Any other bright idea is welcome: let your imagination run, the sky's the limit. The aim is a feedback loop that clusters culture status into good/bad and can be used for quality control.



Prizes

Prizes consist of a combination of CytoData glory, legendary bit.bio mugs and other surprises:

1. Best score on cluster Test 1

2. Best score on cluster Test 2a

3. Best jury prize (visualisation-showcasing)



Dataset

75 human iPS colonies

2 different human iPS cell lines

90 images per well over 2 weeks

Training set (Good - Bad - Empty)

Test1 6550 files -> 13GB (containing time information)

Test2a 6550 files -> 13GB (time information scrambled)


All data can be downloaded using the following links:

Training.zip (4.7 GB)

Test.zip

If you find Test.zip too large to download, you can download subsets of the Test dataset separately using the links below:

Test 1.zip

Test 2a.zip


We have done some preliminary colony segmentation using CellProfiler. We have not segmented further objects inside each colony. The measurements for colony segmentation can be downloaded here as csv files: docs.csv. The CellProfiler pipeline can be found in this main branch as '2021-09-30 incucyte mask processing v1.0.cpproj' (works on CP version 4.2.0 and above). Feel free to use any of the parameters as a starting point to analyse colony pluripotency state: colony area, diameter, circularity, cell distances, cell morphology, any combination of the above, combined CellProfiler features, multidimensional hyperspaces, etc.

Thanks to CytoData and all scientists in our Cellular Phenotyping team for their support. The experiments for this dataset were set up and acquired by Sarah Hussain, Stefan Milde and Fiona Connolly at bit.bio using a Sartorius Incucyte device. The dataset was organised with help from Sanaullah Nazir at bit.bio.
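
To make one of the listed parameters concrete, the sketch below computes circularity with the commonly used definition 4πA/P², which is 1 for a perfect circle and lower for irregular outlines. The function name is hypothetical. As far as we know, CellProfiler reports a comparable shape measurement (FormFactor), but please check the header of the provided csv files for the actual column names before relying on it.

```python
import math


def circularity(area, perimeter):
    """Return 4*pi*area / perimeter**2.

    The value is 1.0 for an ideal circle and decreases as the outline
    becomes more irregular or elongated.
    """
    if perimeter <= 0:
        raise ValueError("perimeter must be positive")
    return 4.0 * math.pi * area / perimeter ** 2


# Sanity check with an ideal circle of radius 50 (arbitrary units).
r = 50.0
circle = circularity(math.pi * r ** 2, 2 * math.pi * r)

# A 100 x 100 square is less circular: 4*pi*10000 / 400**2 = pi / 4.
square = circularity(100.0 * 100.0, 4 * 100.0)

print(round(circle, 3), round(square, 3))
```

On real segmentation masks the perimeter is measured on a pixel grid, so values for round colonies can deviate slightly from 1.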



Who?

Over 70 hackathon participants will be grouped into 9 teams. Please select a team via the "CytoData 2021 - Teams Registration" link and enter your name and timezone. We encourage you to select team members in similar timezones.

Feel free to use the Slack channel to add a few words about yourself or a link to your website or workplace. This is a great chance to meet people who share your passion for image analysis and computation.



When (all times in BST, UK time)

Wednesday 6th October:

  • Dataset and instructions released
  • Please make sure you can download the data and the README

Friday 8th October:

  • 13:00-14:00: Introduction to dataset
  • 14:00-16:00: Open desk for participants to ask questions

Monday 11th October:

  • 13:00-14:00: Refresher for dataset background
  • 14:00-16:00: Open desk for participants to ask questions

Wednesday 13th October:

  • 13:00-16:30: Cytodata conference
  • 16:30-18:00: Supervised hacking

Thursday 14th October:

  • 13:00-16:30: Cytodata conference
  • 16:30-18:00: Supervised hacking

Friday 15th October:

  • 13:00-14:00: Supervised hacking wrap up
  • 14:00-17:00: All hackers to showcase their work
  • 17:00-17:30: The Jury gathers
  • 17:30-18:00: Closing ceremony and winners announced



Where?

Conference Zoom channel

Main Zoom channel for Hackathon

https://us02web.zoom.us/j/89647782734?pwd=Nmx5d1VSazdZZlRZRXNyQmhML2N5dz09

Once you have finalised your teams, please join the corresponding Zoom breakout room in the link above. We will do our best to have one of us available from 1 to 6pm UK time. That will give you a chance to ask questions so that the tasks are clear to everyone.



What?

Please note this is a chance to bring together a community of creators, scientists and analysts with the spirit of learning together. We are confident everyone will respond to this spirit and we will not tolerate disruptive, offensive or inappropriate behaviour.

The Hackathon and conference are organised by bit.bio and CytoData together. All code submissions should be available under the MIT license (open source). The data from the shared dataset should be made available under the CC0 license (open source).

This will allow anyone to use the results freely and to build on the community's expertise, as in past years. It is also an occasion to build connections, learn about new opportunities and share experience with other participants.



Questions?

Feel free to reach out to us on the main hackathon Zoom channel. Alternatively, please feel free to post on the CytoData Slack channel.

See https://society.cytodata.org and https://www.bit.bio/events/cytodata2021

Git link: https://github.com/Esme233/Cytodata-hackathon-2021

cytodata-hackathon-2021's Issues

Useless data

Hey, there are a few columns with duplicated or constant data in the metadata set:

        "Metadata_TimePoint.1",  # duplicate
        "Metadata_Well.1",  # duplicate
        "Metadata_Series",  # always 1
        "Metadata_Site",  # always 0
        "Metadata_Site.1",  # always 0 and duplicate
        "Metadata_FileLocation",  # always blank
        "Metadata_prefix",  # always blank
        "Location_MaxIntensity_Z_Phase",  # always 0
        "Metadata_Frame",  # always 0
        "Location_CenterMassIntensity_Z_Phase",  # always 0
        "Location_Center_Z",  # always 0
        "Number_Object_Number",  # duplicate

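A check like the one behind the list above can be automated. Here is a minimal sketch, using only the standard library, that flags columns whose value never changes and columns that repeat an earlier column exactly. The function name is hypothetical, and the tiny table is made up for illustration: two column names are borrowed from the list, while the `Area` column and all values are invented.

```python
import csv
import io


def find_redundant_columns(rows):
    """Return (constant, duplicates) for a list of dict rows.

    constant   -- columns whose value is identical in every row
    duplicates -- maps a column to the earlier column it repeats exactly
    """
    if not rows:
        return [], {}
    columns = list(rows[0].keys())
    values = {c: tuple(row[c] for row in rows) for c in columns}
    constant = [c for c in columns if len(set(values[c])) == 1]
    duplicates, seen = {}, {}
    for c in columns:
        if c in constant:
            continue  # constant columns are reported separately
        if values[c] in seen:
            duplicates[c] = seen[values[c]]
        else:
            seen[values[c]] = c
    return constant, duplicates


# Tiny made-up table standing in for the real measurements file.
sample = io.StringIO(
    "Metadata_Well,Metadata_Well.1,Metadata_Site,Area\n"
    "A1,A1,0,120\n"
    "B2,B2,0,340\n"
)
rows = list(csv.DictReader(sample))
constant, duplicates = find_redundant_columns(rows)
keep = [c for c in rows[0] if c not in constant and c not in duplicates]
print(constant, duplicates, keep)
```

Values are compared as strings here, so "0" and "0.0" would count as different. Convert the columns to numbers first if that matters for your file.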