2018-august-workshop's People

Contributors

acharbonneau, charlesreid1, raynamharris

2018-august-workshop's Issues

Multi-Stack Demo for Oct 16th Breakout (can we combine a few breakouts?)

Description

I think we should have a breakout that spans both days that focuses on what needs to be done between now and Oct 1st (or so) leading up to the final phase 1 Oct 16th demo.

For details of the plan the full stacks are putting together, see the Multi Team - End-of-pilot (October) Full Stack Demo Script.

Topics should include some of the big issues we're facing that could best be solved by the full stacks working in the same room:

  • First step, just recap the current plan for those unfamiliar with the doc above, review our timeline (internal demo due Oct 1st)
  • Getting access for TOPMed/GTEx/MOD data on Google (see #15)
  • Confirm which group is doing which part of the overall compute
  • Testing the alignment and variant calling workflows provided by TOPMed and ported to WDL by Calcium and CWL by Xenon
  • Sharing analysis results from stack to stack with DOS (see #14)
  • Timeline
  • Identifying users from each stack and onboarding them into other stacks, retrieving an access token for use with DOS to access results from the cross stack compute
  • get together with KC2 and KC6 by the end of this, what do we implement?
  • working on our "messaging" for the presentation to the IC directors
    • what are the hard-problems we're trying to solve in Commons? Link to the use cases here.
    • what have we built in phase 1 to address these hard problems?
    • what is our plan in phase 2 for addressing/continuing to address these hard problems?
  • More questions for discussion:
    • Will we have some subset of metadata harmonized on the TOPMed and GTEx samples, in a form that can be exchanged between the full stacks?
    • Will we do variant calling at each full stack or on a single full stack?
    • see more here

Type and size of room desired (conference room or lounge)

16

Target audience

Full stacks

modEnrichr: Prototype Gene Set Enrichment Analysis Tools for MODs

Thanks for suggesting a breakout group! We have 6 conference rooms reserved that seat 6, 10, and 16 people as well as comfortable seating in lounges around the building.

Please provide the following:

  • Description
    For a summer student project, Zach Flamholz, a senior from Princeton University, developed modEnrichr. modEnrichr has gene set libraries for yeast, fly, fish, and worm. These libraries were created from AGR resources but also using GeneRIF and ARCHS4 Zoo. Zach would like to present his project to representatives from AGR for feedback and possible future collaboration.
  • Type and size of room desired (conference room or lounge)
    Small room.
  • Target audience
    AGR data stewards and people interested in data integration.

KC7: Crosscut Metadata Model Schema and Instances

Description

  • Review and deep dive into the Crosscut Metadata Model
  • Update on release timeline
  • Update from Full Stacks on ingest plans
  • Discussion of metadata exchange between stacks
  • Feedback

Room Request

Conference room for approximately 15

Target Audience

  • KC7 participants
  • Full Stacks

Learn more about TOPMed

During this breakout session, the Data Stewards will give an in-depth presentation of their data, QC processes, curation, users and use cases.

Proposed by Mary Shimoyama

Data Commons Communication: Who, how, why, when

Description: We have probably all suffered from information overload or miscommunication during Phase 0 and Phase I of the DCPPC, and now, more than ever, we need to think about how to improve internal and external communication. This breakout group will discuss challenges and opportunities for communication including but not limited to social media, websites, newsletters, video interviews, blogs, infographics, publications, protocols, and best practices.

Audience: Anyone is welcome to attend. A small group of folks from NIH, Team Phosphorus and Team Copper have been meeting bi-weekly to discuss this, but we welcome extra input.

Room: An informal lounge area seems appropriate.

KC1 - Reporting about the Analysis of the FAIRness of >4,000 Published Bioinformatics Tools

Thanks for suggesting a breakout group! We have 6 conference rooms reserved that seat 6, 10, and 16 people as well as comfortable seating in lounges around the building.

Please provide the following:

  • Description
    KC1 would like to engage the consortium about what can be done to improve the state of bioinformatics tools and pipelines. Many tools disappear after publication, there is currently no clear way to benchmark tools, and there are no requirements to make bioinformatics tools FAIRer. Megan Wojciechowicz performed a global analysis of the status of published bioinformatics tools and prepared a presentation that can initiate a discussion about this topic.
  • Type and size of room desired (conference room or lounge)
    Small room.
  • Target audience
    Phase II brainstorming enthusiasts

KC2 - GUID creation

Description
Discussion about GUID creation, related to DEMO GUIDs for TOPMed (https://github.com/dcppc/dcppc-deliverables/issues/58)

Type and size of room desired
Conference room for up to 10 people

Target audience
KC2, particularly Sodium and full stacks (Helium, Calcium and Xenon)

What data and where?

Description:
Let's align on the data each group has available and identify next steps to ensure that each full stack has the exact same data:

  • What are the data sets? Within each set, what are the expected data types? (WGS, RNAseq, etc.)
  • How many files are in each set?
  • Are they on AWS and GCP?
  • Compare manifests

Desired room:
Conference room, 10 or 16 people

Target audience:
Full stacks
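
The "Compare manifests" step above could be sketched roughly as follows. This is only an illustration: the tab-separated layout and the `file_name` and `md5` column names are assumptions, not the format any stack actually uses.

```python
import csv
import io


def load_manifest(text):
    """Parse a tab-separated manifest into {file_name: md5}.

    The column names are hypothetical placeholders.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {row["file_name"]: row["md5"] for row in reader}


def compare_manifests(first, second):
    """Report files missing from either manifest and checksum mismatches."""
    shared = set(first) & set(second)
    return {
        "only_in_first": sorted(set(first) - set(second)),
        "only_in_second": sorted(set(second) - set(first)),
        "checksum_mismatch": sorted(n for n in shared if first[n] != second[n]),
    }


# Made-up example data for two stacks.
stack_a = load_manifest("file_name\tmd5\na.cram\t111\nb.cram\t222\n")
stack_b = load_manifest("file_name\tmd5\nb.cram\t999\nc.cram\t333\n")
report = compare_manifests(stack_a, stack_b)
```

An empty report for every pair of stacks would suggest that the stacks hold the same files with the same checksums.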

All Things Auth

  • Description: A sidebar for those interested in working through some of the policy-based and technical approaches related to improved methods for authenticating and authorizing users for the Data Commons, including what we can do now/soon, and what we might want to do in the future.
  • Type and size of room desired (conference room or lounge): Large conference room preferred
  • Target audience: Full stacks, KC6, & those interested in topics related to whitelist, VDS, SDDP, Passport, DUOS, etc. (Note: This will not be a rehash of the meeting in NC, but instead a new conversation informed by those discussions.)

Day 2 Talks

Presenters can add their name here or in the comments below.

Please upload slides here: https://drive.google.com/drive/u/0/folders/16oNTuVSPMa4sjCRUG1q60DpoFi43GRn2

Each speaker will have 5 minutes with 3 minutes for questions.

- 09:10 Phosphorus: Anup Mahurkar
- 09:19 Data Stewards (TOPMed): Albert Smith
- 09:30 Helium: Kira Bradford
- 09:40 Data Stewards (AGR): Nathan Dunn
- 09:49 Nitrogen: Denis Torre
- 09:58 Argon: Ravi Madduri
- 10:07 Carbon: Jessica Lyons 
- 10:16 Copper: Charles Reid
- 10:50 STRIDES: Nick Webber
- 10:59 Calcium: Bob Grossman
- 11:08 Oxygen: Hua Xu
- 11:26 Data Stewards (GTEx):
- 11:35 Xenon: Charlotte Whicher
- 11:44 Sodium: Martin Fenner
- 11:53 Breakout group planning

(P.S. Order generated from a random sequence: 13 12 7 8 1 5 10 14 4 9 2 3 11 6)

Rooms for breakouts

Day 2
Number of people, room type, room number, breakout group
24, 1050 (10th floor) - Full Stacks
6, conference, 916 - KC7 Cross Cut
10, conference, 912 - KC2 Schema
16, conference, 19th-floor room 2, TOPMED
10, conference, 19th-floor room 3, KC1 FAIR

Extra space
6, conference, 901 -
60, lecture, this room
60, roundtables, 19th-floor lobby
8, couches, adjacent to room 912
8, couches, room 914

Day 1
Number of people, room type, room number, breakout group
6, conference, 916 - KC2 GUIDS (starting with Full Stacks, then separating after break)
10, conference, 912 - Full Stacks
6, conference, 901 - communication
16, conference, 19th-floor room 2, MODS
10, conference, 19th-floor room 3, All things Auth

Learn more about MODs and AGR

During this breakout session, the Data Stewards will give an in-depth presentation of their data, QC processes, curation, users and use cases.

Proposed by Mary Shimoyama

DOS service for multi-stack compute

Description:
The full stack teams have agreed to each stand up a DOS instance to serve the data coming out of our cross-stack compute demo plan for October. Ideally, we do our shared, multi-stack recompute, and each stack (Helium, Xenon, and Calcium) shares the resulting GTEx CRAMs and VCFs using a DOS service. This DOS service is usable with a token from each full stack (we already showed at the July F2F that we could onboard users in each stack) and gives the DOS caller 1) a native path with temporary credentials and/or 2) a signed URL. For Argon, we get a manifest/bdbag and a shared bucket location from which to pull RNAseq analysis data files. Each stack then uses the DOS endpoints and the bdbag from Argon to onboard these data into its system as appropriate.

  • Let's start with a 1-page primer describing what needs to be implemented
  • Can we also run the aligner and checker workflow during this time to ensure they are functioning as expected?

Desired Room:
Conference room, 16 people

Target audience:
Full stacks
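
As a starting point for the primer, here is a minimal sketch of how a caller might choose between a native path and a signed URL once a DOS lookup has returned. The response shape (a `urls` list of `{"url": ...}` entries) and the scheme preference order are assumptions made for illustration; each stack's DOS instance may differ, and the token and network call are left out.

```python
from typing import Optional, Sequence


def pick_access_url(
    data_object: dict,
    preferred_schemes: Sequence[str] = ("gs", "s3", "https"),
) -> Optional[str]:
    """Return the first URL matching the scheme preference order.

    Assumes a hypothetical response shape: {"urls": [{"url": "..."}, ...]}.
    Native paths (gs://, s3://) are tried before signed HTTPS URLs.
    """
    urls = [entry.get("url", "") for entry in data_object.get("urls", [])]
    for scheme in preferred_schemes:
        for url in urls:
            if url.startswith(scheme + "://"):
                return url
    return None


# Made-up example object; the bucket and file names are placeholders.
example = {
    "id": "example-object",
    "urls": [
        {"url": "https://example.org/signed/sample.cram"},
        {"url": "gs://example-bucket/sample.cram"},
    ],
}
chosen = pick_access_url(example)
signed_only = pick_access_url(example, preferred_schemes=("https",))
```

A stack that cannot use native cloud paths could pass `preferred_schemes=("https",)` to fall back to the signed URL.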

Day 1 Talks

Talks should focus on awesome things your team has accomplished. Presenters can add their name here or in the comments below.

Please upload slides here: https://drive.google.com/drive/u/0/folders/16oNTuVSPMa4sjCRUG1q60DpoFi43GRn2

Each speaker will have 5 minutes with 3 minutes for questions.

  • 09:20 Calcium: David Siedzik
  • 09:29 Data Stewards (AGR):
  • 09:38 Copper:
  • 09:47 Argon:
  • 09:56 Xenon:
  • 10:05 Phosphorus:
  • 10:15 Data Stewards (GTEx):
  • 10:30 Coffee break (~20 minutes)
  • 10:51 Data Stewards (TOPMed): Ken Rice
  • 11:00 Oxygen: Lucila, on blockchain (10 min + 5 for questions)
  • 11:15 Sodium:
  • 11:24 Helium: Ray Idaszak
  • 11:33 Carbon:
  • 11:42 Nitrogen: Avi Ma'ayan

(P.S. Order generated from a random sequence: 2 5 4 1 9 14 11 12 6 13 8 7 3 10)

KC1 - FAIR assessment of GTEx and GTEx interface to BioJupies

Thanks for suggesting a breakout group! We have 6 conference rooms reserved that seat 6, 10, and 16 people as well as comfortable seating in lounges around the building.

Please provide the following:

  • Description
    Members of KC1 manually reviewed all the digital objects on the GTEx portal and scored them for FAIRness using FAIRshake. We would like to meet with GTEx representatives to discuss this work to see if this feedback would be helpful for them to make the portal FAIRer.
  • Type and size of room desired (conference room or lounge)
    Small room.
  • Target audience
    GTEx data stewards.

KC2 - Schema.org/bioschemas with identifiers.org

Description
Discuss using compact identifiers for MODs data to resolve to MODs landing pages with schema.org/bioschemas metadata in JSON-LD.

Type and size of room desired (conference room or lounge)
A conference room for about 10 people

Target audience
MODs, KC2, KC7
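
For discussion, a minimal sketch of turning a compact identifier of the form `prefix:accession` into an identifiers.org resolver URL. The example identifier is a placeholder and is not taken from any MOD. The landing page's JSON-LD is out of scope here.

```python
def resolver_url(compact_id: str) -> str:
    """Build an identifiers.org URL from a `prefix:accession` compact identifier."""
    # Split on the first colon only, since accessions may contain colons.
    prefix, sep, accession = compact_id.partition(":")
    if not sep or not prefix or not accession:
        raise ValueError(f"not a compact identifier: {compact_id!r}")
    return f"https://identifiers.org/{prefix}:{accession}"


# Placeholder identifier for illustration only.
url = resolver_url("sgd:S000003424")
```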
