Tree estimation from discrete molecular data is often based on very precise observational procedures: conserved DNA regions are chosen, the same region is sequenced many times, and the sequencing techniques are robust. In other words, it is often safe to assume that the observation error is zero or negligibly small. Therefore, it has not been necessary to incorporate observation error into the evolutionary model (with some exceptions; see the single-cell sequencing literature).
With morphological data, it is not as reasonable to assume that the observation error is zero. Several factors can contribute:
- Among-person scoring variation
- Within-person scoring variation
- Within-species phenotypic variation
- Taphonomic processes, differences in preservation
- Variation in specimen condition, for example developmental stage, developmental deformities, injuries, or tumors
Since it is not safe to assume that the observation error is zero or negligible, one approach is to incorporate it into the model. This repository:
- Simulates trees under the birth-death (BD) model
- Simulates character data under the Markov-chain model and adds random observation error to the tip data
- Infers unrooted trees using RevBayes, modeling the observation error as a random variable
- Evaluates the bias and precision of (a) branch lengths and (b) topology
- Tests the approach on an empirical dataset: what is the best estimate of the magnitude of the observation error ($\beta$)?
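
To make the idea of observation error on the tip data concrete, here is a minimal Python sketch. It is not the code used in this repository, and it rests on assumptions that the text above does not confirm: that $\beta$ is the probability that a tip state is misrecorded, and that a misrecorded state is drawn uniformly from the other $k - 1$ states. The function names `add_observation_error` and `tip_likelihoods` are hypothetical. The second function shows how the same assumption could enter inference, by replacing the usual 0/1 tip likelihoods with the probability of the observed state given each possible true state.

```python
import random


def add_observation_error(states, beta, k, rng):
    """Return a copy of `states` with each entry misrecorded with probability beta.

    Assumption (not from the repository): an erroneous observation is drawn
    uniformly from the k - 1 states that differ from the true state.
    """
    observed = []
    for s in states:
        if rng.random() < beta:
            alternatives = [x for x in range(k) if x != s]
            observed.append(rng.choice(alternatives))
        else:
            observed.append(s)
    return observed


def tip_likelihoods(observed_state, beta, k):
    """P(observed state | true state) for each possible true state 0..k-1.

    With beta = 0 this reduces to the usual 0/1 tip vector.
    """
    return [
        1.0 - beta if s == observed_state else beta / (k - 1)
        for s in range(k)
    ]


# Hypothetical example: 10,000 binary characters scored with beta = 0.1.
rng = random.Random(1)
true_states = [rng.randrange(2) for _ in range(10000)]
observed = add_observation_error(true_states, beta=0.1, k=2, rng=rng)

# The realized fraction of misrecorded characters should be close to beta.
error_rate = sum(t != o for t, o in zip(true_states, observed)) / len(true_states)
print(f"realized error rate: {error_rate:.3f}")
```

Other error models are possible, for example one in which the misrecorded state depends on the true state; the uniform choice is used here only because it needs a single parameter.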