
Rocks and Clocks: From fossils to genomes

  • Writer: Elise Baugh
  • Mar 6, 2024
  • 11 min read

Updated: Aug 27


Comparative Morphology of Marine Vertebrates

Approaches to estimating evolutionary timelines:


Fossils and Geologic records

Fossil and geologic records once served as the main temporal anchor for understanding evolutionary timelines (Donoghue & Benton, 2007). However, fossil evidence has limitations. These include gaps in the fossil record, taxonomic biases (some taxa are preserved more readily than others), and stratigraphic uncertainties. Additionally, the age of a fossil occurrence may not represent the true origin of its group. The oldest fossil of a lineage may only reflect the time when a stable population with diagnostic morphological traits became widespread enough to be preserved. Incorporating phylogenetic approaches from multiple biological fields may enable a more comprehensive view of macroevolutionary patterns.




Comparative Morphology

Comparative morphology provides excellent insight into phenotypic relationships, revealing changes in form and function over time across extant taxa. However, researchers in this field face their own set of challenges. For example, convergent (homoplastic) evolution is a potential source of error that can result in misleading phylogenetic inferences (Lee & Palci, 2015). The field also contends with human error introduced through incomplete data sampling and the misidentification of species and characters, which can lead to inadequate models.




The molecular clock

Mid-century advancements in molecular biology shifted the focus toward molecular analyses as a way to answer questions about the mechanisms that underlie organismal divergence. This added new pieces of evidence to the puzzle of life on Earth (Martinez, 2018). In 1965, Zuckerkandl and Pauling introduced the idea of a molecular clock. Their research inferred an approximately constant rate of amino acid substitution over time, which they compared to the "ticks" of a clock (Zuckerkandl & Pauling, 1965). In 1968, Motoo Kimura introduced the "neutral theory of molecular evolution", suggesting that the clock reflects the action of random drift, not natural selection (Kimura, 1968). Today, the term "molecular clock" describes the use of accumulated genome sequence changes to estimate elapsed time.
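
As a rough illustration of the clock idea, the sketch below converts a pairwise sequence difference into a divergence time under a strict (constant rate) clock. It is a minimal sketch, not the method of any paper cited here: the function names, the Jukes-Cantor correction for repeated substitutions at one site, the toy sequences, and the example rate are all assumptions chosen for illustration, and it works on nucleotides rather than the amino acid data Zuckerkandl and Pauling examined.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of aligned sites that differ between two sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, equal length, and non-empty")
    diffs = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return diffs / len(seq_a)


def jc69_distance(p):
    """Jukes-Cantor corrected distance in substitutions per site.

    Corrects the observed proportion p for repeated hits at the same site.
    Only defined for p < 0.75.
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def divergence_time(distance, rate):
    """Time since two lineages split under a strict clock.

    distance: substitutions per site separating the two sequences
    rate: substitutions per site per unit time, per lineage (assumed known)
    The factor of 2 reflects that both lineages accumulate changes.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    return distance / (2.0 * rate)


# Hypothetical toy alignment and rate, for illustration only.
p = p_distance("ACGTACGTAC", "ACGTACGTAA")
d = jc69_distance(p)
t = divergence_time(d, rate=1e-9)  # rate in substitutions/site/year
```

The rate is the weak point: it has to come from somewhere outside the sequences, which is why calibration against fossils or other dated events matters so much in the sections that follow.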


Nearly Neutral Theory

In 1973, Ohta presented a modified version of the neutral theory, calling it the ‘nearly neutral theory’. She proposed that many mutations are not strictly neutral but have small beneficial or deleterious effects, so their fate depends on both drift and weak selection. The nearly neutral theory allows for variable rates of evolution and changes in heterozygosity, and it has provided a framework for developing relaxed molecular clock models. Another major advance in molecular dating was coalescent theory, a basis for statistical techniques that extend classical population genetics with mathematical models reflecting genomic data and biological influences (Kingman, 1982).
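
To make the coalescent more concrete, here is a minimal simulation sketch of the standard Kingman coalescent for a single population of constant size. The function names are hypothetical, and time is measured in coalescent units (an assumption of this sketch: for a diploid population one unit corresponds to 2N generations). While k lineages remain, the waiting time to the next merger is exponential with rate k(k-1)/2.

```python
import random


def simulate_tmrca(n_samples, rng):
    """Simulate the time to the most recent common ancestor (TMRCA).

    Walks backward in time: with k lineages remaining, the wait until any
    pair coalesces is exponential with rate k * (k - 1) / 2.
    """
    if n_samples < 2:
        raise ValueError("need at least two sampled lineages")
    total_time = 0.0
    for k in range(n_samples, 1, -1):
        pair_rate = k * (k - 1) / 2.0
        total_time += rng.expovariate(pair_rate)
    return total_time


def expected_tmrca(n_samples):
    """Analytical expectation of the TMRCA: 2 * (1 - 1/n) coalescent units."""
    return 2.0 * (1.0 - 1.0 / n_samples)


rng = random.Random(42)  # fixed seed so the sketch is repeatable
draws = [simulate_tmrca(10, rng) for _ in range(20000)]
mean_tmrca = sum(draws) / len(draws)
```

Comparing `mean_tmrca` with `expected_tmrca(10)` is a quick sanity check that the simulation matches theory. Individual draws vary widely, which is one reason single-gene date estimates carry so much uncertainty.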


Total Evidence Dating

These modifications are widely used in population genetics because they account for variable substitution rates and population sizes. Building on this research, total evidence dating is a modeling method that accounts for these uncertainties while still using fossil evidence as an indicator of age. Bridging fields of data in ways that incorporate morphological, molecular, fossil, and other relevant lines of evidence reduces discrepancies and produces more accurate divergence time estimates.


Benefits and challenges of Interdisciplinary 'Big Data' Approaches

Collaboration between fields of study and their datasets has led to better-informed conclusions and more contextually meaningful inferences. However, despite advances in our ability to create accurate timelines of complex lineages, limitations, inaccuracies, and discrepancies remain prevalent. Recent studies have found the potential for substantial errors when substitution and speciation rates vary within lineages (Ritchie, Hua, & Bromham, 2022). Improvements in applied methodologies are needed to increase the fidelity of divergence time estimates. This paper aims to identify both the challenges and the innovations in phylogenetics that successfully merge datasets from different fields of science, each with its strengths and weaknesses, to form well-rounded conclusions.


  We review recent research tools and approaches in molecular dating and phylogenetics by highlighting studies that advance the combined effort of calibrating the molecular clock. The studies chosen use, or argue in support of, big data approaches that draw on many lines of evidence (e.g., morphological, fossil and stratigraphic, climate, and biogeographic datasets) and combine them with morphological and molecular dating methods. The following studies (Hipsley & Müller, 2014; May et al., 2021; Fernández et al., 2017) demonstrate 'big data' approaches to phylogenetic analysis.


  Collaborative efforts between scientific fields cultivate a deeper understanding of historic evolutionary relationships and influences through space and time. Understanding the past is a preface to being able to predict future outcomes. These are vital tools for conservation decision-making in a future filled with dynamic environmental change. Developing robust, accurate phylogenetic modeling tools is relevant to all fields of science conducting biological investigations. 


STATE OF THE FIELD

  A macroevolutionary lens and interdisciplinary knowledge are required to understand the events influencing Earth's evolutionary history. By integrating molecular, phenotypic, morphological, and paleontological lines of evidence, estimates in biological systematics have improved significantly, providing a clearer picture of phylogenetic relationships. Advances in computational biology have increased the accuracy with which we can estimate divergence times. Key advancements include:

1) Phylogenomics: genome-scale sequencing datasets are used in phylogenetic studies to analyze relative rates of change in the genome over time (Kumar et al., 2012).


2) Coalescent theory models: a statistical approach to modeling genetic drift and lineage coalescence over time, enabling the estimation of population parameters and the reconstruction of phylogenetic trees (Kingman, 1982).


3) The multispecies coalescent model: describes gene trees as independent random variables generated along the lineages of the species tree. Because the model allows gene trees to vary across genes, coalescent-based methods account for heterogeneous gene trees in phylogenomic data analysis (Jiao, Flouri, & Yang, 2021).


4) Bayesian Markov chain Monte Carlo (MCMC) algorithms: accommodate flexible parameters and prior distributions. Calibration points act as anchors that provide temporal bounds for estimating divergence times among clades and, combined with morphological data from extant taxa, they enable exploration of the posterior distribution (Larget & Simon, 1999).
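
The Bayesian MCMC idea in item 4 can be sketched with a deliberately tiny example: one divergence time, a known clock rate, and a calibration that bounds the age between a minimum and a maximum. Everything here is an assumption for illustration (the function names, the Poisson model for the substitution count, the uniform calibration prior, and the numbers). Real packages sample trees, rates, and many other parameters jointly, so treat this as a toy and not as how any particular tool works.

```python
import math
import random


def log_prior(t, t_min, t_max):
    """Uniform calibration prior: the age must fall inside [t_min, t_max]."""
    return 0.0 if t_min <= t <= t_max else -math.inf


def log_likelihood(t, n_subs, rate, n_sites):
    """Poisson log-likelihood (up to a constant) of the observed substitutions.

    Expected count is 2 * rate * t * n_sites because both lineages evolve.
    """
    expected = 2.0 * rate * t * n_sites
    return n_subs * math.log(expected) - expected


def metropolis_divergence_time(n_subs, rate, n_sites, t_min, t_max,
                               n_iter=20000, step=5.0, seed=1):
    """Random-walk Metropolis sampler for a single divergence time."""
    rng = random.Random(seed)
    t = 0.5 * (t_min + t_max)  # start in the middle of the calibration
    current = log_prior(t, t_min, t_max) + log_likelihood(t, n_subs, rate, n_sites)
    samples = []
    for _ in range(n_iter):
        proposal = t + rng.gauss(0.0, step)
        lp = log_prior(proposal, t_min, t_max)
        if lp > -math.inf:  # proposals outside the calibration are rejected
            candidate = lp + log_likelihood(proposal, n_subs, rate, n_sites)
            accept_prob = math.exp(min(0.0, candidate - current))
            if rng.random() < accept_prob:
                t, current = proposal, candidate
        samples.append(t)
    return samples[n_iter // 5:]  # discard the first 20% as burn-in


# Hypothetical data: 100 substitutions over 1000 sites, rate per site per Myr.
posterior = metropolis_divergence_time(
    n_subs=100, rate=0.001, n_sites=1000, t_min=30.0, t_max=80.0)
posterior_mean = sum(posterior) / len(posterior)
```

Changing `t_min` and `t_max` and rerunning shows directly how much the calibration, and not just the sequence data, shapes the answer.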



Case Studies in Calibration: Lessons from Recent Research


  1. Beyond fossil calibrations: realities of molecular clock practices in evolutionary biology (Hipsley & Müller, 2014)


"Variations in calibration methods resulted in the highest levels of discrepancies between divergence time estimates."

This study, published in 2014, discusses the importance of accuracy and precision in divergence dating analyses and the factors involved in calibrating the molecular clock. Fossils provide a way to date the molecular clock externally (absolute dating), while the molecular clock itself provides relative dating (Hipsley & Müller, 2014). Variations in calibration methods resulted in the highest levels of discrepancy between divergence time estimates. The authors aimed to identify recent patterns in published clock calibration studies and analyze potential pitfalls associated with each methodology. To do this, they conducted a literature survey of 600 publications from 2007-2013. They found that the most commonly used calibration method was fossil evidence (approximately 50%), followed by geologic events and secondary dating methods.



The researchers found that using fossil evidence as anchoring points is a potential source of error in estimation. In addition, due to the prevalence of taxonomic biases, many taxonomic groups were under-represented in the fossil record and therefore neglected in calibration implementation guidelines. The authors also warned against using geologic events alone to explain allopatric dispersal patterns, suggesting that many clades may be older than assumed from evidence of geographic isolation. Finally, they discuss the strengths and weaknesses of molecular dating and the need for continued progress in developing standards for substitution and mutation rates per clade.

The year 2007 marked the release of a revolutionary computational tool, BEAST (Bayesian evolutionary analysis by sampling trees). Its developers describe this tool as an "evolutionary analysis package for molecular sequence variation. It also provides a resource for further developing new models and statistical methods" (Drummond & Rambaut, 2007). The tool enabled the incorporation of multiple sources of evidence and various statistical analysis models. Hipsley and Müller encourage the use of combined evidence methodologies for improved estimation, stating that "Age constraints based on other types of data provide alternative means that, when well justified, can contribute critical information on the evolutionary history of life."

  2. Inferring the Total-Evidence Timescale of Marattialean Fern Evolution in the Face of Model Sensitivity (May et al., 2021)




This study tested how best to date the diversification of Marattialean ferns. The authors used two advances in phylogenetics: total evidence dating and the fossilized birth death model. Fossilized birth death supplies a statistical framework that links speciation, extinction, and fossil discovery, so fossils and living species can be analysed in one system. Total evidence dating brings fossils in as tips alongside living taxa, rather than only as constraints on interior nodes. The team compared tip placement of fossils with the older node based practice. The choice changed both divergence time estimates and some relationships in the tree. Tip dating gave the best fit and the most coherent results.
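
The three linked processes in the fossilized birth death framework (speciation, extinction, and fossil discovery) can be illustrated with a small forward simulation. This is a hedged sketch under simple assumptions: constant per-lineage rates, counts of lineages instead of a full tree, and hypothetical names and parameter values throughout. It is not the inference procedure May et al. used, which estimates these rates from data.

```python
import random


def simulate_fbd(birth, death, sampling, duration, seed=0, max_lineages=10000):
    """Forward-simulate lineage counts and fossil sampling times.

    Each living lineage speciates at rate `birth`, goes extinct at rate
    `death`, and leaves a sampled fossil at rate `sampling`.
    Returns (number of lineages alive at the end, list of fossil times),
    with times measured forward from the origin of the clade.
    """
    rng = random.Random(seed)
    per_lineage = birth + death + sampling
    elapsed = 0.0
    alive = 1
    fossil_times = []
    while 0 < alive < max_lineages:
        total_rate = alive * per_lineage
        if total_rate <= 0:
            break  # nothing can happen
        elapsed += rng.expovariate(total_rate)
        if elapsed >= duration:
            break  # reached the present
        u = rng.random() * per_lineage
        if u < birth:
            alive += 1
        elif u < birth + death:
            alive -= 1
        else:
            fossil_times.append(elapsed)
    return alive, fossil_times


# Hypothetical rates per lineage per Myr over a 10 Myr window.
extant, fossils = simulate_fbd(birth=0.5, death=0.3, sampling=0.2,
                               duration=10.0, seed=11)
```

In every run the first fossil appears some time after the origin at time zero, which is the simulated version of the point made earlier: the oldest known fossil gives a minimum age for a group, not its true origin.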



Their preferred models inferred that stem divergences for extinct Marattiales lineages began in the middle Devonian and that crown divergences of living Marattiaceae occurred in the late Cretaceous. The model suggested elevated speciation during the Mississippian and elevated extinction during the Cisuralian. Diversity peaked near the end of the Carboniferous at about two thousand eight hundred species, followed by a rapid decline that ended in the extinction of Psaroniaceae, while Marattiaceae persisted to the present.


Why it matters

Node based dating leans on the quality and completeness of the fossil record at specific nodes and can bias dates. Total evidence dating lets uncertainty in fossil placement be estimated within the model, which generally yields more defensible timelines, especially when morphology is informative.

  3. The Opiliones tree of life: shedding light on harvestmen relationships through transcriptomics (Fernández et al., 2017)



The authors of this study analyse relationships within Opiliones, the harvestmen commonly known as daddy longlegs. This is the first study to apply total evidence dating to this lineage. The model outperformed alternatives by jointly estimating fossil positions and divergence times, using fossil evidence, biogeographic information, molecular sequence data, and transcriptomic datasets to calibrate the phylogeny. Transcriptomics here refers to RNA sequencing of the full set of transcripts in a sample, which can signal developmental stage and physiological state.


This approach mirrors the Marattialean fern study. Both papers use a total evidence dating approach and show the benefit of treating fossils as tips rather than as node constraints where accuracy hinges on fossil placement. Total evidence dating makes that uncertainty explicit by letting well described fossils enter as terminals and by estimating their divergence points along branches using morphological matrices. The study also stresses the need for dense taxon sampling in molecular dating for diverse arthropod clades.





Outstanding questions and obstacles to progress


Modern statistical modelling, computational biology and greater access to quality genomic data have sharpened how we calibrate the molecular clock. Even so, every model is a simplification and only as strong as its data and its priors. Choices about clock models, tree priors, and morphology models can shift dates by large amounts. Data quality, taxon sampling, and gaps in the fossil record still set the main limits. The biggest gains in molecular dating come from better data, clearer priors, and open workflows, not from any single algorithm or tool.


Real progress also depends on open and reusable data. Reporting of alignments, morphology matrices, calibration rationales, and code is still uneven. When datasets and workflows are not shared, meta analyses stall and results cannot be checked or extended. Preregistration, registered reports, and mandatory sharing of data and code improve transparency and reproducibility (O’Dea et al., 2021).



What would move the field forward
  1. Build benchmark datasets that pair morphology and molecules with vetted fossil justifications across multiple clades.

  2. Test model adequacy with posterior predictive checks and broad sensitivity analyses of priors.

  3. Expand taxon sampling for both living and fossil lineages, with clear reporting of sampling decisions.

  4. Compare alternative clock models and tree priors within the fossilized birth death framework and total evidence dating.

  5. Report full uncertainty, including alternative fossil placements and complete interval estimates.

  6. Share everything needed to rerun the study, including data, code, and containers or scripts with DOIs.
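
Item 5 in the list above asks for complete interval estimates. One common way to summarise a posterior sample of divergence times is a highest posterior density (HPD) interval, the narrowest window that holds a chosen share of the samples. The sketch below is a minimal version with a hypothetical function name and made-up sample values. It assumes the posterior has a single peak, since a multi-peaked posterior is not well described by one interval.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Narrowest interval containing at least `mass` of the samples.

    Sorts the samples, slides a window that covers the required number of
    points, and keeps the window with the smallest width.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0.0 < mass <= 1.0:
        raise ValueError("mass must be in (0, 1]")
    ordered = sorted(samples)
    n = len(ordered)
    window = max(1, math.ceil(mass * n))  # points the interval must contain
    best_start = min(
        range(n - window + 1),
        key=lambda i: ordered[i + window - 1] - ordered[i],
    )
    return ordered[best_start], ordered[best_start + window - 1]


# Made-up posterior ages in Myr, for illustration only.
ages = [48.2, 51.0, 49.5, 50.3, 47.9, 52.4, 50.8, 49.1, 61.7, 50.0]
low, high = hpd_interval(ages, mass=0.5)
```

Reporting `low` and `high` alongside the point estimate, and repeating the summary for each alternative fossil placement, keeps the uncertainty visible to readers instead of hiding it behind a single date.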



CITATIONS: 

Brown, J.W. and Smith, S.A. (2018) ‘The Past Sure is Tense: On Interpreting Phylogenetic Divergence Time Estimates’, Systematic Biology, 67(2), pp. 340–353. Available at: https://doi.org/10.1093/sysbio/syx074.


Budd, G.E. and Mann, R.P. (2022) Two notorious nodes: a critical examination of MCMCTree relaxed molecular clock estimates of the bilaterian animals and placental mammals. preprint. Paleontology. Available at: https://doi.org/10.1101/2022.07.01.498494.


Cunningham, J.A. et al. (2017) ‘The origin of animals: Can molecular clocks and the fossil record be reconciled?’, BioEssays, 39(1), p. e201600120. Available at: https://doi.org/10.1002/bies.201600120.

Donoghue, P.C.J. and Benton, M.J. (2007) ‘Rocks and clocks: calibrating the Tree of Life using fossils and molecules’, Trends in Ecology & Evolution, 22(8), pp. 424–431. Available at: https://doi.org/10.1016/j.tree.2007.05.005.


Drummond, A.J. and Rambaut, A. (2007) ‘BEAST: Bayesian evolutionary analysis by sampling trees’, BMC Evolutionary Biology, 7(1), p. 214. Available at: https://doi.org/10.1186/1471-2148-7-214.


Eaton, K. et al. (2023) ‘Plagued by a cryptic clock: insight and issues from the global phylogeny of Yersinia pestis’, Communications Biology, 6(1), p. 23. Available at: https://doi.org/10.1038/s42003-022-04394-6.

Fernández, R. et al. (2017) ‘The Opiliones tree of life: shedding light on harvestmen relationships through transcriptomics’, Proceedings of the Royal Society B: Biological Sciences, 284(1849), p. 20162340. Available at: https://doi.org/10.1098/rspb.2016.2340.


Hipsley, C.A. and Müller, J. (2014) ‘Beyond fossil calibrations: realities of molecular clock practices in evolutionary biology’, Frontiers in Genetics, 5. Available at: https://doi.org/10.3389/fgene.2014.00138.

Ho, S.Y.W. et al. (2015) ‘Biogeographic calibrations for the molecular clock’, Biology Letters, 11(9), p. 20150194. Available at: https://doi.org/10.1098/rsbl.2015.0194.


Jiao, X., Flouri, T. and Yang, Z. (2021) ‘Multispecies coalescent and its applications to infer species phylogenies and cross-species gene flow’, National Science Review, 8(12), p. nwab127. Available at: https://doi.org/10.1093/nsr/nwab127.


Kimura, M. (1968) ‘Evolutionary Rate at the Molecular Level’, Nature, 217(5129), pp. 624–626. Available at: https://doi.org/10.1038/217624a0.


Kingman, J.F.C. (1982) ‘The coalescent’, Stochastic Processes and their Applications, 13(3), pp. 235–248. Available at: https://doi.org/10.1016/0304-4149(82)90011-4.


Larget, B. and Simon, D.L. (1999) ‘Markov Chain Monte Carlo Algorithms for the Bayesian Analysis of Phylogenetic Trees’, Molecular Biology and Evolution, 16(6), pp. 750–759. Available at: https://doi.org/10.1093/oxfordjournals.molbev.a026160.


Lee, M.S.Y. and Palci, A. (2015) ‘Morphological Phylogenetics in the Genomic Age’, Current Biology, 25(19), pp. R922–R929. Available at: https://doi.org/10.1016/j.cub.2015.07.009.


Marris, E. (2004) ‘Molecular clock tied to fossil record’, Nature, pp. news041011-2. Available at: https://doi.org/10.1038/news041011-2.


Martinez, P. (2018) ‘The Comparative Method in Biology and the Essentialist Trap’, Frontiers in Ecology and Evolution, 6, p. 130. Available at: https://doi.org/10.3389/fevo.2018.00130.


Matschiner, M. (2019) ‘Selective Sampling of Species and Fossils Influences Age Estimates Under the Fossilized Birth–Death Model’, Frontiers in Genetics, 10, p. 1064. Available at: https://doi.org/10.3389/fgene.2019.01064.


May, M.R. et al. (2021) ‘Inferring the Total-Evidence Timescale of Marattialean Fern Evolution in the Face of Model Sensitivity’, Systematic Biology. Edited by R. Folk, 70(6), pp. 1232–1255. Available at: https://doi.org/10.1093/sysbio/syab020.


Mott, T. and Vieites, D.R. (2009) ‘Molecular phylogenetics reveals extreme morphological homoplasy in Brazilian worm lizards challenging current taxonomy’, Molecular Phylogenetics and Evolution, 51(2), pp. 190–200. Available at: https://doi.org/10.1016/j.ympev.2009.01.014.


O’Dea, R.E. et al. (2021) ‘Preferred reporting items for systematic reviews and meta‐analyses in ecology and evolutionary biology: a PRISMA extension’, Biological Reviews, 96(5), pp. 1695–1722. Available at: https://doi.org/10.1111/brv.12721.


dos Reis, M., Donoghue, P.C.J. and Yang, Z. (2016) ‘Bayesian molecular clock dating of species divergences in the genomics era’, Nature Reviews Genetics, 17(2), pp. 71–80. Available at: https://doi.org/10.1038/nrg.2015.8.


Ritchie, A.M., Hua, X. and Bromham, L. (2022) ‘Investigating the reliability of molecular estimates of evolutionary time when substitution rates and speciation rates vary’, BMC Ecology and Evolution, 22(1), p. 61. Available at: https://doi.org/10.1186/s12862-022-02015-8.


Ronquist, F. et al. (2012) ‘A Total-Evidence Approach to Dating with Fossils, Applied to the Early Radiation of the Hymenoptera’, Systematic Biology, 61(6), pp. 973–999. Available at: https://doi.org/10.1093/sysbio/sys058.


Rota, J. et al. (2018) ‘A simple method for data partitioning based on relative evolutionary rates’, PeerJ, 6, p. e5498. Available at: https://doi.org/10.7717/peerj.5498.


Wang, Z., Gerstein, M. and Snyder, M. (2009) ‘RNA-Seq: a revolutionary tool for transcriptomics’, Nature Reviews Genetics, 10(1), pp. 57–63. Available at: https://doi.org/10.1038/nrg2484.


Warnock, R.C.M., Yang, Z. and Donoghue, P.C.J. (2017) ‘Testing the molecular clock using mechanistic models of fossil preservation and molecular evolution’, Proceedings of the Royal Society B: Biological Sciences, 284(1857), p. 20170227. Available at: https://doi.org/10.1098/rspb.2017.0227.


Wortel, M.T. et al. (2023) ‘Towards evolutionary predictions: Current promises and challenges’, Evolutionary Applications, 16(1), pp. 3–21. Available at: https://doi.org/10.1111/eva.13513.


Wu, X. and Schepartz, L.A. (2009) ‘Application of computed tomography in paleoanthropological research’, Progress in Natural Science, 19(8), pp. 913–921. Available at: https://doi.org/10.1016/j.pnsc.2008.10.009.


Yates, F. and Healy, M.J.R. (1964) ‘How Should we Reform the Teaching of Statistics?’, Journal of the Royal Statistical Society. Series A (General), 127(2), p. 199. Available at: https://doi.org/10.2307/2344003.

Zuckerkandl, E. and Pauling, L. (1965) ‘Evolutionary Divergence and Convergence in Proteins’, in Evolving Genes and Proteins. Elsevier, pp. 97–166. Available at: https://doi.org/10.1016/B978-1-4832-2734-4.50017-6





 
 
 
