Giter Site home page Giter Site logo

theopannetier / comrad Goto Github PK

View Code? Open in Web Editor NEW
1.0 2.0 2.0 12.91 MB

An R package to simulate an individual-based model with diversity-dependent diversification emerging from competition

License: GNU General Public License v3.0

R 35.80% C++ 1.23% HTML 62.97%

comrad's Introduction

comrad: A more mechanistic approach to diversity-dependent diversification

An example output of the simulation

Branch Travis CI logo Codecov logo
master R-CMD-check Codecov test coverage
develop R-CMD-check Codecov test coverage
brute_force* R-CMD-check Codecov test coverage

*see the Installation section

Description

comrad is an individual-based model of diversity-dependent diversification, where branching (and by extension, speciation), extinction and equilibrium diversity emerge from Lotka-Volterra like competition between individuals.

This package contains the algorithm for the model and functions used to run it, extract, analyse and visualise its output, including the phylogenetic tree of the community.

An overview of the model and the main features of the package can be found in the vignette.

The package accompanies the third chapter of my doctoral dissertation, and the corresponding manuscript, currently in prep. The functions it contains were used to produce the results and figures presented through the paper.

Installation

I maintain two versions of the package. The default version, suitable for all applications that do not include running the simulation for large communities, can be installed from the master branch:

remotes::install_github("TheoPannetier/comrad")

An alternative version, which lives on branch brute_force, is more suitable for running simulations with large communities ($N > 10^4$). Indeed, because the pairwise effects of competition must be calculated at every generation, the number of calculations grows quadratically with the size of the community, and the simulation may take a very long time to complete in such cases.

A solution has been implemented by @HHildenbrandt to address this using parallel- (via xsmid) and multi-processing (via OpenMP), which greatly reduces the time the simulation takes to process the whole community. This version of the package lives on branch `brute_force`, and can be installed with

remotes::install_github("TheoPannetier/comrad", ref = "brute_force")

*To best exploit these solutions, compilation is optimised for the local system on which comrad is run via the compiler flags -march=native, -mtune=native. These flags are not portable, so this version of the package cannot pass the CRAN check.

In case of doubt over which version is installed, the user can call has_brute_force_opt(), which evaluates to FALSE for the main version and to TRUE for the brute_force option.

Contributing / Bug report / Support

  • For questions or bug reports, please open an Issue
  • For contributions, please make a branch for yourself and open a pull request when you are done

comrad's People

Contributors

hhildenbrandt avatar theopannetier avatar thijsjanzen avatar

Stargazers

Shelly Gaynor avatar

Watchers

James Cloos avatar  avatar

comrad's Issues

Allow simulation to start from pre-existing community

I would like to be able to run the simulation starting from a point where the simulation previously exited, e.g. with a community read from an .csv file.

The file from which the initial community was read from, if different from default, should also be recorded in metadata.

#19 Get comrad to branch

So far, it seems that the population evolves and occupies the whole niche space, but no branching appears to be produced.
Instead, the trait values spread continuously along the spectrum of possible values
output2

This is surprising, the parameter values I've used are the same as in the original publication.
Maybe I haven't run the simulation for enough generations, but I rather suspect that this has to do with the way I sample mutations, which is quite different from how Pontarp proceeded.

#3 Make carrying capacity parameters consistent

carr_cap parameters are currently provided as a single vector (carr_cap_pars) to fitness(), but as separate elements to carr_cap.

Make this consistent across the two functions and tests.

#6 Make fitness function population-level

fitness() returns the fitness of a single individual and compares its trait with the rest of the population.

For the generation step perspective, it would be simpler and more efficient to instead return the fitness for the entire population.

#8 Write faster effective pop size function

Current get_eff_pop_sizes sums comp_coeff() for each individual against each other individual in the population.

comp_coeff() is symmetric - comp_coeff(z_i, z_j) == comp_coeff(z_j z_i) so we don't need to compute all the possible interactions - just the unique one, which would then make get_eff_pop_sizes loop over less values and make the fitness computation faster.

The first step is to extract all unique competing trait combinations in a population, using either expand.grid() or combn()

#13 Consider Rcpp

If speed proves unsufficient, one resort is to throw the main computations in C++ and call them with Rcpp

library(Rcpp)
sourceCpp("MyFunc.cpp")
system.time (output <- myFunc(df))

where a minimal MyFunc.cpp would be

#include <rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
void myFunc() {
 return 0;
}

Write tests for `convert_sim_to_phylo()`

Simulated communities produced with run_simulation() can be converted to ape-style phylo class objects with assemble_phylo_tbl() + convert_to_newick() + ape::read.tree(), all piped in convert_sim_to_phylo(). I'm not confident this will always work as intended so let's experiment with a bit of TDD.

#25 Treat multiple simultaneous speciation events

As of now, apply_speciation() can only apply one speciation event per species per generation. That is there are multiple gaps in the trait values of a species, only one will result in speciation.

Although this is no ideal, this should not be an issue for the simulation, because an untreated speciation event will be caught at the next generation.
In other, phylogenetic words, polytomies are treated as soft polytomies.

Make fitness function #1

The fitness function links the trait value of a focal individual to the number of offspring it will produces, based on values of traits of other individuals in the population, and the overall fitness of the trait without competition, as defined by the carrying capacity.

#12 Fix LaTeX blocks

Mathematical exponents such as exponents and greater that don't appear like in LaTeX in the compiled docs.

#7 Pick consistent model variable and parameter names

Dilemma: descriptive variable names, or concise names reflecting conventions?
E.g.

  • n_eff or eff_pop_size ?
  • sigma_alpha or competition_width; or the current trade-off sigma_comp ?
  • trait_ind and traits_pop or z_ind and z_pop ?

#5 Implement generation step

Pseudocode
— For t in 1:nb_gen
—— For each individual

  • Get fitness G(z)
  • Draw the nb of offspring from Pois(G(z))
    ——— For each offspring
  • Apply mutation
    ——— done

—— done
Replace current generation with new generation
— done

Automate read_comrad_tbl skip

The nb of lines of metadata in the simulation output of comrad keeps changing across versions.
So far I specify the numebr of lines that should be skipped via an argument, but this causes frequent bugs and is annoying.
Is there any regexp magic I could use to automatically find where the table starts?

#14 Solve anticipated merge conflict

I made a mistake and worked on master on last commits. Now back on develop but I forgot to merge master into develop now, so I anticipate a merge conflict when I do it.

#26 Change species identity

charlatan::ch_taxonomic_epithet() is great fun, but unfortunately there is a pretty high chance of drawing twice the same name for different species.
I need an alternative that provides random, unique identifiers for new species.

Add vignette?

Salut @TheoPannetier,

I checked this package and I am missing a vignette that illustrates what the package does. Do you consider adding it?

Cheers, Richel

#4 Solve the issue of NaN fitness

As of 0.3.0, fitness() can evaluate to a NaN if both n_eff and k take value 0.

This can happen when an individual takes a trait value that is very far from the optimal value and from the trait values of other individuals in the population; in which case carr_cap() evaluates to 0 (poor fitness) and comp_coeff() evaluates to 0 for all individuals (as there is no individual nearby in the trait space).

This is currently caught before fitness() evaluation, but still crashes the simulation.
The error can be called by running fitness(trait_ind = 30, traits_pop = rep(0, 1000)).

#2 Nest numeric testargs

testarg_pos, testarg_prop, testarg_not_zero should call testarg_num to assert that the provided argument is a numeric in the first place.

Nesting testargs in this way is not compatible with a single argument to testargs, because the latter relies on substitute which does not do great in arguments passed from upper functions.

For now, the solution was to unnest the statements, but finding a way to make both features compatible would save a few lines of code in every function.

Some attempts can be found in testargs_test.

#15 Fix carrying capacity

The population currently reaches a size of around 2500 individuals, vs the K = 1000 specified carrying capacity. That makes an overshoot of x2.5!

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.