grimmlab / phenotype_prediction

A comparison of classical and machine learning-based phenotype prediction methods on simulated data and three plant species

Home Page: https://dx.doi.org/10.3389/fpls.2022.932512

License: MIT License

Languages: Jupyter Notebook (100%)

Topics: arabidopsis-thaliana, blup, deep-neural-networks, genomic-selection, machine-learning, phenotype-prediction, plant-phenotyping

phenotype_prediction's Introduction

Comparison Phenotype Prediction

This repository contains all simulated phenotypes, the precomputed permutation-based GWAS results, and the code for running the simulations and generating all figures in the publication cited below. In that work, we systematically compare eight phenotype prediction models on a variety of synthetic and real-world data. All prediction models were optimized using easyPheno.

Simulations

We generated synthetic phenotypes using the code in the Jupyter notebook Simulations.ipynb. The genotype matrix of 10k markers that we used for the simulations is in the Simulations folder, together with all simulation configurations, including the SNP IDs of the causal and background SNPs and their effect sizes. An overview of the prediction results for all simulated scenarios is in Results.
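To illustrate how causal SNPs, background SNPs, and effect sizes can combine into a synthetic phenotype, here is a minimal sketch. It is not the code from Simulations.ipynb: the function name `simulate_phenotype`, the additive model, the `heritability` parameter, and the Gaussian noise are all assumptions made for illustration.

```python
import random
import statistics


def simulate_phenotype(genotypes, causal_effects, background_effects,
                       heritability, seed=0):
    """Hypothetical additive simulation (an assumption, not the authors' method).

    genotypes: list of samples, each a list of numeric marker values.
    causal_effects / background_effects: dict mapping marker index -> effect size.
    heritability: target ratio var(genetic) / var(phenotype), in (0, 1].
    """
    if not 0.0 < heritability <= 1.0:
        raise ValueError("heritability must be in (0, 1]")
    rng = random.Random(seed)

    # Merge both effect sets; a causal entry overrides a background entry
    # if the same marker index appears in both.
    effects = {**background_effects, **causal_effects}

    # Genetic value of each sample: weighted sum over the selected markers.
    genetic = [sum(beta * row[j] for j, beta in effects.items())
               for row in genotypes]

    # Choose the noise variance so that var_g / (var_g + var_e) == heritability.
    var_g = statistics.pvariance(genetic)
    var_e = var_g * (1.0 - heritability) / heritability
    sigma = var_e ** 0.5

    return [g + rng.gauss(0.0, sigma) for g in genetic]


if __name__ == "__main__":
    # Toy data: 5 samples, 4 binary markers (made-up values).
    toy_genotypes = [[0, 1, 1, 0], [1, 0, 1, 1], [1, 1, 0, 0],
                     [0, 0, 1, 1], [1, 1, 1, 0]]
    print(simulate_phenotype(toy_genotypes, {0: 2.0}, {2: 0.3, 3: 0.1},
                             heritability=0.7))
```

With `heritability=1.0` the noise term vanishes and the phenotype equals the genetic value, which is a convenient sanity check. For the actual causal SNPs, background SNPs, and effect sizes, refer to the configuration files in the Simulations folder.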

Real-world data

We performed GWAS for four phenotypes of Arabidopsis thaliana using permGWAS. The GWAS results can be found in the GWAS_results folder. An overview of the prediction results on Arabidopsis thaliana can be found in Results.

Results hyperparameter optimization

We also included a detailed overview of the hyperparameter optimization for each prediction model and phenotype, for both simulated and real-world data, in a .zip archive in Results. There is one .xlsx file per phenotype, each with several sheets: one sheet gives an overview of the results, one sheet per prediction model lists the results per outer fold, and one sheet contains the runtime overview and the hyperparameter combinations tried for each outer fold and prediction model.
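The per-outer-fold results are easier to read with the fold structure in mind. Assuming "outer fold" refers to nested cross-validation (outer folds for testing, inner folds for choosing hyperparameters), the splitting could be sketched as follows. The helper names and the fold counts are hypothetical and not taken from easyPheno or the publication.

```python
import random


def k_fold_indices(indices, n_folds, seed=0):
    """Yield (train, test) index lists for a shuffled k-fold split."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    # Deal the shuffled indices round-robin into n_folds groups.
    folds = [shuffled[i::n_folds] for i in range(n_folds)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def nested_cv_splits(n_samples, n_outer=3, n_inner=5, seed=0):
    """Yield (outer_fold, train_val, test, inner_splits) for nested CV.

    Hyperparameters would be tuned on inner_splits only; the outer test
    set is held back to evaluate the tuned model once per outer fold.
    """
    outer = k_fold_indices(range(n_samples), n_outer, seed)
    for outer_fold, (train_val, test) in enumerate(outer):
        inner_splits = list(
            k_fold_indices(train_val, n_inner, seed + outer_fold + 1))
        yield outer_fold, train_val, test, inner_splits


if __name__ == "__main__":
    for fold, train_val, test, inner in nested_cv_splits(30):
        print(fold, len(train_val), len(test), len(inner))
```

Under this reading, each row of a per-model sheet would correspond to one outer test set, and the tried hyperparameter combinations would correspond to the inner splits of that outer fold.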

Plots

The Jupyter notebook Plot_Scripts.ipynb contains the code to generate all plots shown in the publication and to reproduce all analyses of feature importance.

Citation

When using parts of this repository, please cite our publication:

A comparison of classical and machine learning-based phenotype prediction methods on simulated data and three plant species
Maura John, Florian Haselbeck, Rupashree Dass, Christoph Malisi, Patrizia Ricca, Christian Dreischer, Sebastian J. Schultheiss and Dominik G. Grimm
Frontiers in Plant Science (https://dx.doi.org/10.3389/fpls.2022.932512)

Keywords: Phenotype Prediction, Genomic Selection, Plant Phenotyping, Machine Learning, Arabidopsis thaliana.

Contributors

dominikgrimm, fhaselbeck, maurajohn
