A Python 3 package for operations on pedigree and genotype data, including simulation of genotype-phenotype associations under quantitative genetic models.
Requires: Python 3.4+, numpy, scipy, pandas, cython
In addition to basic pedigree data manipulation, pydigree also includes submodules for more complicated tasks:
- simulation: Provides classes for simulating genetic data
- stats: Classes and functions for statistical genetics
- mixedmodel: Provides classes for using mixed models with family data
- io: Provides functions for importing/exporting data from common data formats, including:
  - `plink`: Functions for working with plink format PED/MAP data
  - `vcf`: Functions for working with the VCF genotype format
  - `genomesimla`: Includes a function for reading genomeSIMLA format chromosome templates
- sgs: Functions for shared genomic segment data
- `Individual`: Models an individual with pedigree and phenotype data
- `Population`: Models groups of Individuals with a common genetic background
- `Pedigree`: A special case of Population for related individuals. Implements kinship/inbreeding functions
- `PedigreeCollection`: A container class handling multiple pedigrees
- `ChromosomeTemplate`: Models a chromosome with information on allele frequency and marker position
- `ChromosomeSet`: The set of `ChromosomeTemplate`s for a population
- `Alleles`: Stores a haploid set of alleles
- `SparseAlleles`: Stores a haploid set of alleles as differences from a reference
- `LabelledAlleles`: An efficient container for storing references to a founder chromosome
- `MixedModel`: A class for fitting mixed-effect models with related individuals
- `MLEResult`: A class containing the maximum likelihood estimates of parameters and values pertaining to the likelihood function at the MLE
- `Architecture`: A class describing the genetic architecture for a trait to be used in simulation
- `GeneDroppingSimulation`: A base class from which other gene-drop simulation objects inherit
- `NaiveGeneDroppingSimulation`: Simulates genetic data for pedigrees by random gene dropping
- `ConstrainedMendelianSimulation`: Simulates genetic data for pedigrees from a prespecified inheritance structure
- `SGSAnalysis`: A class containing the result of a shared genomic segment (SGS) analysis
- `SGS`: A class containing the segments shared between a pair of individuals
- `Segment`: A class describing the location of a shared segment between a pair of individuals
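
To make the idea of random gene dropping concrete, here is a minimal, self-contained sketch for a single locus. It does not use pydigree's classes or reproduce its implementation: the pedigree layout and the names `PEDIGREE`, `drop_genes`, and `estimate_autozygosity` are assumptions made for illustration. Founders receive unique allele labels, each child inherits one randomly chosen allele from each parent, and repeating the drop many times approximates the probability that an individual carries two alleles identical by descent.

```python
import random

# Hypothetical example pedigree: (id, father, mother), parents listed before children.
PEDIGREE = [
    ("gf", None, None),
    ("gm", None, None),
    ("a", "gf", "gm"),
    ("b", "gf", "gm"),
    ("c", "a", "b"),  # offspring of a full-sibling mating
]


def drop_genes(pedigree, rng):
    """Perform one gene drop at a single locus and return {id: (paternal, maternal)}."""
    genotypes = {}
    for ind, father, mother in pedigree:
        if father is None and mother is None:
            # Each founder allele gets a unique label so descent can be traced.
            genotypes[ind] = ((ind, 0), (ind, 1))
        else:
            # Mendelian transmission: one allele from each parent, chosen at random.
            genotypes[ind] = (
                rng.choice(genotypes[father]),
                rng.choice(genotypes[mother]),
            )
    return genotypes


def estimate_autozygosity(pedigree, ind, n_reps=20000, seed=0):
    """Estimate the probability that both alleles of `ind` are identical by descent."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_reps):
        paternal, maternal = drop_genes(pedigree, rng)[ind]
        hits += paternal == maternal
    return hits / n_reps


if __name__ == "__main__":
    # The theoretical value for the child of full siblings is 0.25.
    print(estimate_autozygosity(PEDIGREE, "c"))
```

A real simulation would also model recombination along a chromosome and multiple markers; this sketch only shows the core sampling step.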
- `IterationError`: Raised when an iterative algorithm exceeds the maximum allowed number of iterations
- `NotMeaningfulError`: Raised when a comparison does not make sense (e.g. is one genotype greater than the other)
- `SimulationError`: Raised when an error occurs in a simulation
- `FileFormatError`: Raised when an input file can't be parsed successfully
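
As an illustration of the pattern behind an iteration-limit exception, the sketch below defines a local stand-in class with the same name and raises it from a simple Newton iteration. The class here is not imported from pydigree, and `newton_sqrt` with its `tol` and `maxiter` parameters is a hypothetical helper chosen only to show how such an exception might be raised and handled.

```python
class IterationError(Exception):
    """Local stand-in for illustration only; not pydigree's own class."""


def newton_sqrt(x, tol=1e-12, maxiter=50):
    """Approximate sqrt(x) by Newton's method, giving up after `maxiter` steps."""
    if x < 0:
        raise ValueError("x must be non-negative")
    if x == 0:
        return 0.0
    guess = x if x >= 1 else 1.0
    for _ in range(maxiter):
        new = 0.5 * (guess + x / guess)
        # Stop once successive estimates agree to the relative tolerance.
        if abs(new - guess) <= tol * new:
            return new
        guess = new
    raise IterationError("no convergence after {} iterations".format(maxiter))


if __name__ == "__main__":
    print(newton_sqrt(2.0))
    try:
        newton_sqrt(2.0, maxiter=1)
    except IterationError as err:
        print("caught:", err)
```

Catching the dedicated exception lets calling code distinguish a failure to converge from other errors, for example to retry with a larger iteration limit.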
Pydigree includes a few useful scripts for dealing with pedigree data including:
- `simulate_pedigree_data.py`: Simulates data from template pedigrees
- `bitsize.py`: Calculates bit sizes for each pedigree
- `kinship.py`: Calculates inbreeding coefficients and pairwise kinship coefficients for pedigrees
- `genedrop.py`: Performs gene dropping simulations to approximate the actual probability of an IBD configuration
- `polygenic.py`: Calculates variance components for normally distributed continuous traits.
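
For readers unfamiliar with the quantities the kinship script reports, the following is a minimal sketch of the standard recursive definition of kinship and inbreeding coefficients. It is not pydigree's code and does not call its API: the pedigree dictionary and the functions `kinship` and `inbreeding` are hypothetical names used only for this example. The kinship of an individual with itself is (1 + F) / 2, where F is its inbreeding coefficient, and the kinship between two different individuals is the average of the kinships between the younger individual's parents and the other individual.

```python
from functools import lru_cache

# Hypothetical pedigree: id -> (father, mother); founders have (None, None).
# Entries are listed so that parents appear before their children.
PEDIGREE = {
    "gf": (None, None),
    "gm": (None, None),
    "a": ("gf", "gm"),
    "b": ("gf", "gm"),
    "c": ("a", "b"),  # offspring of a full-sibling mating
}
ORDER = {ind: rank for rank, ind in enumerate(PEDIGREE)}


@lru_cache(maxsize=None)
def kinship(i, j):
    """Kinship coefficient between i and j (0 if either is an unknown parent)."""
    if i is None or j is None:
        return 0.0
    if i == j:
        father, mother = PEDIGREE[i]
        return 0.5 * (1.0 + kinship(father, mother))
    # Always recurse through the parents of the individual listed later,
    # so the recursion never passes through a descendant of the other.
    if ORDER[i] < ORDER[j]:
        i, j = j, i
    father, mother = PEDIGREE[i]
    return 0.5 * (kinship(father, j) + kinship(mother, j))


def inbreeding(ind):
    """Inbreeding coefficient: the kinship between an individual's parents."""
    father, mother = PEDIGREE[ind]
    return kinship(father, mother)


if __name__ == "__main__":
    print(kinship("a", "b"))  # full siblings
    print(inbreeding("c"))    # child of full siblings
```

Memoizing with `lru_cache` keeps the recursion tractable on small pedigrees; very deep or large pedigrees would call for an iterative, matrix-based approach instead.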
J.E. Hicks (2017) Pydigree: a python module for manipulation and simulation and of genetic datasets. biorxiv preprint doi:10.1101/213413
- Charles II of Spain: http://en.wikipedia.org/wiki/Charles_II_of_Spain