Giter Site home page Giter Site logo

magnusdv / ibdsim2 Goto Github PK

View Code? Open in Web Editor NEW
5.0 5.0 2.0 9.54 MB

Simulate patterns of DNA sharing between pedigree members

Home Page: https://magnusdv.github.io/pedsuite/

R 93.54% C++ 5.18% CSS 1.10% HTML 0.18%
simulation identity-by-descent relatedness

ibdsim2's Introduction

ibdsim2

CRAN status


NEWS! Updated online app for IBD simulations: ibdsim2-shiny

As of version 2.1.0, the shiny app is integrated as part of the package. To run it locally, simply run the command

ibdsim2::launchApp()

Introduction

The purpose of ibdsim2 is to simulate and analyse the gene flow in pedigrees. In particular, such simulations can be used to study distributions of chromosomal segments shared identical-by-descent (IBD) by pedigree members. In each meiosis, the recombination process is simulated using sex specific recombination rates in the human genome (Halldorsson et al., 2019), or with recombination maps provided by the user. Additional features include calculation of realised relatedness coefficients, distribution plots of IBD segments, and estimation of two-locus relatedness coefficients.

ibdsim2 is part of the pedsuite collection of packages for pedigree analysis in R. A detailed presentation of these packages, including a separate chapter on ibdsim2, is available in the book Pedigree analysis in R (Vigeland, 2021).

Installation

To get ibdsim2, install from CRAN as follows:

install.packages("ibdsim2")

Alternatively, the latest development version can be installed from GitHub:

# install.packages("devtools") # if needed
devtools::install_github("magnusdv/ibdsim2")

Example 1: A simple simulation

The most important function in ibdsim2 is ibdsim(), which simulates the recombination process in a given pedigree. In this example we demonstrate this for in a family quartet, and show how to visualise the result.

We start by loading ibdsim2.

library(ibdsim2)

The main input to ibdsim() is a pedigree and a recombination map. In our case we use pedtools::nuclearPed() to create the pedigree, and we load chromosome 1 of the built-in map of human recombination.

# Pedigree with two siblings
x = nuclearPed(2)

# Recombination map
chr1 = loadMap("decode19", chrom = 1)

Now run the simulation! The seed argument ensures reproducibility.

sim = ibdsim(x, N = 1, map = chr1, seed = 1, verbose = F)

The output of ibdsim() is a matrix (or a list of matrices, if N > 1). Here are the first few rows of the simulation we just made:

head(sim)
#>      chrom   startMB     endMB   startCM     endCM 1:p 1:m 2:p 2:m 3:p 3:m 4:p 4:m
#> [1,]     1  1.431813  4.578604  0.000000  8.166722   1   2   3   4   1   3   2   3
#> [2,]     1  4.578604  9.307115  8.166722 17.248719   1   2   3   4   1   4   2   3
#> [3,]     1  9.307115 19.734735 17.248719 39.094436   1   2   3   4   2   4   2   3
#> [4,]     1 19.734735 22.758328 39.094436 43.760248   1   2   3   4   2   4   1   3
#> [5,]     1 22.758328 41.436476 43.760248 66.998786   1   2   3   4   2   4   1   4
#> [6,]     1 41.436476 61.366417 66.998786 86.491160   1   2   3   4   2   4   1   3

Each row of the matrix corresponds to a segment of the genome, and describes the allelic state of the pedigree in that segment. Each individual has two columns, one with the paternal allele (marked by the suffix “:p”) and one with the maternal (suffix “:m”). The founders (the parents in our case) are assigned alleles 1, 2, 3 and 4.

The function haploDraw() interprets the founder alleles as colours and draws the resulting haplotypes onto the pedigree. See ?haploDraw for an explanation of pos and other arguments.

haploDraw(x, sim, pos = c(2, 4, 2, 4))

Example 2: Distributions of IBD segments

In this example we will compare the distributions of counts/lengths of IBD segments between the following pairwise relationships:

  • Grandparent/grandchild (GR)
  • Half siblings (HS)
  • Half uncle/nephew (HU)

Note that GR and HS have the same relatedness coefficients kappa = (1/2, 1/2, 0), meaning that they are genetically indistinguishable in the context of unlinked loci. In contrast, HU has kappa = (3/4, 1/4, 0).

For simplicity we create a pedigree containing all the three relationships we are interested in.

x = halfSibPed() |> addSon(5)
plot(x)

We store the ID labels of the three relationships in a list.

ids = list(GR = c(2,7), 
           HS = 4:5, 
           HU = c(4,7))

Next, we use ibdsim() to produce 500 simulations of the underlying IBD pattern in the entire pedigree.

s = ibdsim(x, N = 500, map = "decode19", seed = 1234)
#> Simulation parameters:
#> Simulations  : 500
#> Chromosomes  : 1-22
#> Genome length: 2753.93 Mb
#>                2602.29 cM (male)
#>                4180.42 cM (female)
#> Recomb model : chi
#> Target indivs: 1-7
#> Skip recomb  : -
#> Total time used: 2.23 secs

The plotSegmentDistribution() function, with the option type = "ibd1" analyses the IBD segments in each simulation, and makes a nice plot. Note that the names of the ids list are used in the legend.

plotSegmentDistribution(s, type = "ibd1", ids = ids, shape = 1:3)

We conclude that the three distributions are almost completely disjoint. Hence the three relationships can typically be distinguished on the basis of their IBD segments, if these can be determined accurately enough.

A Shiny app for visualising IBD distributions is available here: https://magnusdv.shinyapps.io/ibdsim2-shiny/.

ibdsim2's People

Stargazers

 avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar

ibdsim2's Issues

ibdsim and founder_inbreeding

The below tells me that founder inbreeding is ignored by ibdsim2
and perhaps it should, but could notify user:

library(pedtools)
library(ibdsim2)
cou = doubleCousins(0,0)  #remove 's'?
founder_inbreeding(cou, founders(cou)) = 1
cousins = leaves(cou)
s = ibdsim(cou, 1, chromosomes = 21, seed = 17, verbose = FALSE)
a = alleleSummary(s[[1]], leaves(cou), ibd.status=TRUE)
a = as.data.frame(a)
all(a$ibd == 2)
#> [1] FALSE

Created on 2018-08-25 by the reprex package (v0.2.0).

Labels/annotation in `haploDraw()`?

Is it possible to add labels to haploDraw()? Below they are shown in pedigree plot, but not in haploDraw(). Maybe it can be done better via .pedAnnotation() with e.g. textInside, but I couldn't figure out how to do it.

library(ibdsim2)
#> Indlæser krævet pakke: pedtools
x <- nuclearPed(2)
labs <- c("1" = "Dad", 
          "2" = "Mom",
          "3" = "Boy1",
          "4" = "Boy2")
x <- relabel(x, new = labs)
plot(x)

chr1 <- loadMap("decode19", chrom = 1)
set.seed(2)
sim <- ibdsim(x, N = 1, map = chr1, verbose = FALSE)
haploDraw(x, sim, unit = "cm", pos = c(2, 4, 2, 4))

Created on 2024-07-18 with reprex v2.0.2

plot_ibd1

The below plot could perhaps be even nicer with kappa_1 (or k1) placed in legend outside
when legend_inside = FALSE, and contour lines stapled differently. Optionally remove title and axis labels?

library(pedtools)
library(ibdsim2)
x1 = cousinPed(1)
x2 = halfCousinPed(1)
map = "uniform.sex.aver"
sims = 100
s1 = ibdsim(x1, sims = sims, map=map, seed = 17, verbose = FALSE)
s2 = ibdsim(x2, sims = sims, map=map, seed = 17777, verbose = FALSE)
plot_ibd1(s1, s2, labels = c("cousins", "half cousins"), 
          legend_inside = FALSE)

Created on 2018-08-25 by the reprex package (v0.2.0).

Feature request: karyo plot

It would be very nice with a plot simillar to that produced by karyo_haploid but with with different colors or shading to indicate no IBD, maternal IBD, paternal IBD or paternal and maternal IBD.

haploDraw

I need to remove (change?) the pos argument to see the colorinG

library(ribd)
#> Loading required package: pedtools
library(ibdsim2)
x = avuncularPed("u", "ni") |> addSon(parents = c(3,6))
sims = ibdsim(x, N = 1, ids = 3, seed = 123, verbose = F)
s = sims[[1]]
haploDraw(x, s, chrom = 1, cols = 2:7, pos = 2, 
          height = 5, margin = c(2,1,1,1),ids = 3)

Created on 2022-06-14 by the reprex package (v2.0.1)

estimateTwoLocusKappa

Element (1,1) of output maytrix should be 1:

library(ibdsim2)
#> Loading required package: pedtools
G = linearPed(2)
estimateTwoLocusKappa(G, id = c(1,5), rho = 0.25, Nsim = 2, Xchrom = T)
#>      ibd0 ibd1 ibd2
#> ibd0    0    0   NA
#> ibd1    0    0   NA
#> ibd2   NA   NA   NA

Created on 2022-05-19 by the reprex package (v2.0.1)

plot_ibd1

I suggest an extension of the very useful function plot_ibd1 to allow
for input of the pair of individuals to be compared. This would allow for comparison
of 'grandfather - paternal grand child' and 'grandfather - maternal grand child'.
The function returns some useful summary statistics: would it be possibleto also return
the standard deviation of the segment lengths? Currently e.g. the count and average is returned.

Error when running `profileSimIBD()`

Hi!

I get an error when I try to run profileSimIBD(). I have created a sibling pedigree, and attach 10 markers without genotype data, then use ibdsim() to simulate 100 genomes with IBD patterns . At the end I want to simulate DNA profiles for each of these IBD patterns.

library(ibdsim2)
#> Loading required package: pedtools
library(forrel)

Ind = c("ind1","ind2")
x = nuclearPed(nch = 2, children = Ind)

nm = c("D3S1358","TH01","D21S11","D18S51","PENTA_E","D5S818","D13S317","D7S820","D16S539","CSF1PO")
pos = setNames(c(45.557, 2.149, 19.476, 59.100, 95.175, 123.139, 81.620, 83.433, 84.944, 149.436),nm)
chroms = setNames(c("3","11","21","18", "15", "5", "13", "7", "16", "5"),nm)


M = 10
Mlist = vector(mode = "list", length = M)
for (i in 1:M){
  Mlist[[i]] =list(afreq = NorwegianFrequencies[[nm[i]]],name = nm[i], posMb = pos[nm[i]], chrom = chroms[nm[i]])
}

x = setMarkers(x,locusAttributes = Mlist)

IBDgenomes = ibdsim(x,N = 100,ids = Ind, seed = 44)
#> Simulation parameters:
#> Simulations  : 100
#> Chromosomes  : 1-22
#> Genome length: 2753.93 Mb
#>                2602.29 cM (male)
#>                4180.42 cM (female)
#> Recomb model : chi
#> Target indivs: ind1, ind2
#> Skip recomb  : -
#> Total time used: 1.19 secs
DNAprofiles = lapply(IBDgenomes, function(v) profileSimIBD(x = x,ibdpattern = v, markers = nm))
#> Error in vapply(seq_len(nMark), function(i) {: values must be length 1,
#>  but FUN(X[[3]]) result is length 0

Created on 2022-01-04 by the reprex package (v2.0.1)

Implement alleleSummary() for X chromosomal simulations

Need to think through what this should look like.

library(ibdsim2)
x = pedtools::nuclearPed(1)
sim = ibdsim(x, sims = 1, chrom = "X", seed = 111, verbose = F)

alleleSummary(sim[[1]])
#>      chrom    start       end   length 1p 1m 2p 2m 3p 3m
#> [1,]    23  0.00000  10.06673 10.06673  1  1  2  3  3  3
#> [2,]    23 10.06673  68.26989 58.20316  1  1  2  3  2  2
#> [3,]    23 68.26989 154.58261 86.31272  1  1  2  3  3  3

# In particular this output is meaningless for hemizygote males:
alleleSummary(sim[[1]], ids = c(1,3), ibd.status = TRUE)
#>      chrom    start       end   length 1p 1m 3p 3m ibd ibd_pp ibd_pm
#> [1,]    23  0.00000  10.06673 10.06673  1  1  3  3   0      0      0
#> [2,]    23 10.06673  68.26989 58.20316  1  1  2  2   0      0      0
#> [3,]    23 68.26989 154.58261 86.31272  1  1  3  3   0      0      0
#>      ibd_mp ibd_mm
#> [1,]      0      0
#> [2,]      0      0
#> [3,]      0      0

Created on 2018-08-10 by the reprex package (v0.2.0).

Export `uniformMap()`

Perhaps:

  • vectorise, so that e.g. uniformMap(M = 1:2) creates a list of two maps
  • create S3 classes "chromosomeMap" and "genomeMap"

plot_ibd1() and summary statistics

Regarding #7 and the wish for standard deviation I was thinking about the below
(which you perhaps get more easily in other ways):

library(ibdsim2, quietly = TRUE)
x1 = linearPed(2)
s1 = ibdsim(x1, sims = 2, verbose = FALSE)
res = plot_ibd1(s1, pairs = list(c(1,5)))
res$data
#>   count  averlen genomeLength relation ibd1_theory
#> 1    20 93.90920     2864.399        1         0.5
#> 2    25 48.74954     2864.399        1         0.5

Created on 2018-10-11 by the reprex package (v0.2.1)

Problems installing ibdsim2

I have problems installing ibdsim2.

devtools::install_github("magnusdv/ibdsim2")
#> Downloading GitHub repo magnusdv/ibdsim2@master
#> ribd (32b26100e... -> a35d08c5f...) [GitHub]
#> Downloading GitHub repo magnusdv/ribd@master
#> 
#>   
  
  
   checking for file 'C:\Users\hildla\AppData\Local\Temp\RtmpIpJT6X\remotes34a81ce82c8a\magnusdv-ribd-a35d08c/DESCRIPTION' ...
  
   checking for file 'C:\Users\hildla\AppData\Local\Temp\RtmpIpJT6X\remotes34a81ce82c8a\magnusdv-ribd-a35d08c/DESCRIPTION' ... 
  
v  checking for file 'C:\Users\hildla\AppData\Local\Temp\RtmpIpJT6X\remotes34a81ce82c8a\magnusdv-ribd-a35d08c/DESCRIPTION' (475ms)
#> 
  
  
  
-  preparing 'ribd':
#>    checking DESCRIPTION meta-information ...
  
   checking DESCRIPTION meta-information ... 
  
v  checking DESCRIPTION meta-information
#> 
  
  
  
-  checking for LF line-endings in source and make files and shell scripts
#> 
  
  
  
-  checking for empty or unneeded directories
#> 
  
  
  
-  looking to see if a 'data/datalist' file should be added
#> 
  
  
  
-  building 'ribd_0.1.0.tar.gz'
#> 
  
   
#> 
#> Installing package into 'C:/Users/hildla/Rpackages'
#> (as 'lib' is unspecified)
#>   
  
  
   checking for file 'C:\Users\hildla\AppData\Local\Temp\RtmpIpJT6X\remotes34a8518a3c20\magnusdv-ibdsim2-c387b5b/DESCRIPTION' ...
  
   checking for file 'C:\Users\hildla\AppData\Local\Temp\RtmpIpJT6X\remotes34a8518a3c20\magnusdv-ibdsim2-c387b5b/DESCRIPTION' ... 
  
v  checking for file 'C:\Users\hildla\AppData\Local\Temp\RtmpIpJT6X\remotes34a8518a3c20\magnusdv-ibdsim2-c387b5b/DESCRIPTION' (445ms)
#> 
  
  
  
-  preparing 'ibdsim2':
#>    checking DESCRIPTION meta-information ...
  
   checking DESCRIPTION meta-information ... 
  
v  checking DESCRIPTION meta-information
#> -  cleaning src
#> 
  
  
  
-  checking for LF line-endings in source and make files and shell scripts
#> 
  
  
  
-  checking for empty or unneeded directories
#> 
  
  
  
-  looking to see if a 'data/datalist' file should be added
#> 
  
  
  
-  building 'ibdsim2_1.0.9000.tar.gz'
#> 
  
   
#> 
#> Installing package into 'C:/Users/hildla/Rpackages'
#> (as 'lib' is unspecified)
#> Error in i.p(...): (converted from warning) installation of package 'C:/Users/hildla/AppData/Local/Temp/RtmpIpJT6X/file34a87beaae1/ibdsim2_1.0.9000.tar.gz' had non-zero exit status

Created on 2019-03-25 by the reprex package (v0.2.1)

plot_ibd1

plot_ibd1 fails, apparently because there are no ibd segments

library(ibdsim2, quietly = TRUE)
x = halfCousinPed(10)
s = ibdsim(x, sims = 1, verbose = FALSE)
plot_ibd1(s)
#> Error in seq.default(xmin, max.x, length = 10): 'from' must be a finite number

Created on 2018-10-09 by the reprex package (v0.2.1)

Inaccurate scaling in `haploDraw()`

library(ibdsim2)
#> Loading required package: pedtools
x = nuclearPed()
s = ibdsim(x, verbose = F)
haploDraw(x, s, chrom = 1, pos = c(3,3,1))

(By default, haplotypes should be 1 'symbol width' away from the pedigree symbols.)

Caused by a rescaling step performed after calculation of the haplotype positions. Not trivial to fix,

karyo_diploid

karyo_diploid understandably fails below , but could respond differently.
The below works for cou = cousinPed(0) but all chromosomes are then shown,
not only the requested 1 and 2.

library(pedtools)
library(ibdsim2)
library(ggplot2)
cou = cousinPed(1)
cousins = leaves(cou)
s = ibdsim(cou, 1, chromosomes = 1:2, seed=17, verbose = FALSE)
a = alleleSummary(s[[1]], leaves(cou), ibd.status=TRUE)
pat = a[a[,"ibd_pp"] == 1, ]
mat = a[a[,"ibd_mm"] == 1, ]
karyo_diploid(pat, mat, color = c(paternal = "blue", 
                                maternal = "red"),alpha = 0.8)
#> Error in `$<-.data.frame`(`*tmp*`, "fill", value = structure(1L, .Label = "1", class = "factor")): replacement has 1 row, data has 0

Created on 2018-08-25 by the reprex package (v0.2.0).

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.