The main objective of the KnowEnG Project 1 team (Organizing Data) is to collect community datasets that curate annotations of genes and proteins and the interactions between them, and to transform these datasets into a comprehensive, heterogeneous network called the ‘Knowledge Network’ (KN). The Knowledge Network will help researchers interpret their experimental genomic data spreadsheets through the network-based machine learning and graph mining tools developed by Project 2 (Analytics Suite) and the visualizations created by Project 4 (User Interfaces). Project 1 is committed to identifying important public data collections to augment the KN and to building a library of parsers that convert their inconsistent data formats into standard network representations while preserving provenance and relevant metadata. We have incorporated our parsers into an automated, containerized pipeline that will be run at regular intervals to maintain current, versioned databases of the Knowledge Network.
The most complete documentation can be found at the Read the Docs