The molecular clock is a figurative term for a technique that uses the mutation rate of biomolecules to deduce the time in prehistory when two or more life forms diverged. The biomolecular data used for such calculations are usually nucleotide sequences for DNA or amino acid sequences for proteins. The benchmarks for determining the mutation rate are often fossil or archaeological dates. The molecular clock was first tested in 1962 on the hemoglobin protein variants of various animals, and is commonly used in molecular evolution to estimate times of speciation or radiation. It is sometimes called a gene clock or an evolutionary clock.
The notion of a so-called "molecular clock" was first attributed to Émile Zuckerkandl and Linus Pauling who, in 1962, noticed that the number of amino acid differences in hemoglobin between different lineages changes roughly linearly with time, as estimated from fossil evidence. They generalized this observation to assert that the rate of evolutionary change of any specified protein was approximately constant over time and over different lineages, a claim known as the molecular clock hypothesis (MCH).
The genetic equidistance phenomenon was first noted in 1963 by Emanuel Margoliash, who wrote: "It appears that the number of residue differences between cytochrome c of any two species is mostly conditioned by the time elapsed since the lines of evolution leading to these two species originally diverged. If this is correct, the cytochrome c of all mammals should be equally different from the cytochrome c of all birds. Since fish diverges from the main stem of vertebrate evolution earlier than either birds or mammals, the cytochrome c of both mammals and birds should be equally different from the cytochrome c of fish. Similarly, all vertebrate cytochrome c should be equally different from the yeast protein." For example, the difference between the cytochrome c of a carp and a frog, turtle, chicken, rabbit, and horse is a very constant 13% to 14%. Similarly, the difference between the cytochrome c of a bacterium and yeast, wheat, moth, tuna, pigeon, and horse ranges from 64% to 69%. Together with the work of Emile Zuckerkandl and Linus Pauling, the genetic equidistance result directly led to the formal postulation of the molecular clock hypothesis in the early 1960s.
Similarly, Vincent Sarich and Allan Wilson in 1967 demonstrated that molecular differences among modern Primates in albumin proteins showed that approximately constant rates of change had occurred in all the lineages they assessed. The basic logic of their analysis involved recognizing that if one species lineage had evolved more quickly than a sister species lineage since their common ancestor, then the molecular differences between an outgroup (more distantly related) species and the faster-evolving species should be larger (since more molecular changes would have accumulated on that lineage) than the molecular differences between the outgroup species and the slower-evolving species. This method is known as the relative rate test. Sarich and Wilson's paper reported, for example, that human (Homo sapiens) and chimpanzee (Pan troglodytes) albumin immunological cross-reactions suggested they were about equally different from Ceboidea (New World Monkey) species (within experimental error). This meant that they had both accumulated approximately equal changes in albumin since their shared common ancestor. This pattern was also found for all the primate comparisons they tested. When calibrated with the few well-documented fossil branch points (such as no Primate fossils of modern aspect found before the K-T boundary), this led Sarich and Wilson to argue that the human-chimp divergence probably occurred only ~4-6 million years ago.
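The logic of the relative rate test can be sketched in a few lines of Python. This is a minimal illustration, not Sarich and Wilson's actual procedure: the function name, the tolerance, and the distance values are hypothetical placeholders rather than published measurements.

```python
def relative_rate_test(d_outgroup_a, d_outgroup_b, tolerance):
    """Compare two sister lineages A and B against a shared outgroup.

    d_outgroup_a and d_outgroup_b are molecular distances from the outgroup
    to A and to B. The path from the outgroup to the common ancestor of A
    and B is shared, so any difference between the two distances reflects
    unequal change on the A and B branches.
    """
    difference = d_outgroup_a - d_outgroup_b
    if abs(difference) <= tolerance:
        return "equal rates (within tolerance)"
    return "A faster" if difference > 0 else "B faster"


# Hypothetical distances in arbitrary units, for illustration only.
print(relative_rate_test(10.0, 10.5, tolerance=1.0))
print(relative_rate_test(14.0, 10.0, tolerance=1.0))
```

The tolerance stands in for experimental error; in practice it would come from the measurement method rather than being chosen by hand.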
The observation of a clock-like rate of molecular change was originally purely phenomenological. Later, Motoo Kimura developed the neutral theory of molecular evolution, which predicted a molecular clock. Let there be N individuals, and to keep this calculation simple, let the individuals be haploid (i.e. have one copy of each gene). Let the rate of neutral mutations (i.e. mutations with no effect on fitness) in a new individual be μ. The probability that this new mutation will become fixed in the population is then 1/N, since each copy of the gene is as good as any other. Every generation, each individual can have new mutations, so there are μN new neutral mutations in the population as a whole. That means that each generation, μN × 1/N = μ new neutral mutations will become fixed. If most changes seen during molecular evolution are neutral, then fixations in a population will accumulate at a clock-rate that is equal to the rate of neutral mutations in an individual.
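As a numerical check on this argument, the sketch below multiplies the number of new neutral mutations arising per generation by the fixation probability 1/N for several population sizes. The name mu for the per-individual neutral mutation rate, the function name, and the chosen values are assumptions made for illustration.

```python
from fractions import Fraction


def neutral_substitution_rate(population_size, mu):
    """Expected number of neutral mutations fixed per generation (haploid)."""
    new_mutations = population_size * mu                 # arise in the whole population
    fixation_probability = Fraction(1, population_size)  # each copy equally likely to fix
    return new_mutations * fixation_probability


mu = Fraction(1, 1_000_000)  # hypothetical rate per individual per generation
for n in (100, 10_000, 1_000_000):
    # The population size cancels, so the result equals mu every time.
    print(n, neutral_substitution_rate(n, mu))
```

Exact fractions are used so that the cancellation of the population size is visible without floating-point rounding.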
The molecular clock alone can only say that one time period is twice as long as another: it cannot assign concrete dates. For viral phylogenetics and ancient DNA studies—two areas of evolutionary biology where it is possible to sample sequences over an evolutionary timescale—the dates of the intermediate samples can be used to more precisely calibrate the molecular clock. However, most phylogenies require that the molecular clock be calibrated against independent evidence about dates, such as the fossil record. There are two general methods for calibrating the molecular clock using fossil data: node calibration and tip calibration.
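Calibration against a single independently dated split reduces to simple arithmetic. The sketch below is a minimal illustration that assumes one constant rate on all lineages (a strict clock); the function names and the numbers are hypothetical and are not taken from any study.

```python
def calibrate_rate(distance, calibration_age_myr):
    """Substitution rate per site per Myr per lineage.

    The distance between two living taxa accumulates along both lineages
    since their split, hence the factor of 2.
    """
    return distance / (2.0 * calibration_age_myr)


def estimate_divergence_time(distance, rate):
    """Divergence time in Myr for a pair of taxa under a strict clock."""
    return distance / (2.0 * rate)


# Hypothetical calibration: two taxa differ at 10% of sites and fossils
# date their split to 50 Myr ago.
rate = calibrate_rate(0.10, 50.0)
# A second pair differing at 4% of sites is then dated with the same rate.
print(estimate_divergence_time(0.04, rate))
```

Without the calibration step only the ratio of the two distances is known, which is why the clock alone gives relative rather than absolute dates.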
Sometimes referred to as node dating, node calibration is a method for phylogeny calibration that is done by placing fossil constraints at nodes. A node calibration fossil is the oldest discovered representative of its clade, which is used to constrain the clade's minimum age. Due to the fragmentary nature of the fossil record, the true most recent common ancestor of a clade will likely never be found. In order to account for this in node calibration analyses, a maximum clade age must be estimated. Determining the maximum clade age is challenging because it relies on negative evidence: the absence of older fossils in that clade. There are a number of methods for deriving the maximum clade age using birth-death models, fossil stratigraphic distribution analyses, or taphonomic controls. Alternatively, instead of a maximum and a minimum, a prior probability of the divergence time can be established and used to calibrate the clock. Several prior probability distributions (including normal, lognormal, exponential, gamma, and uniform) can be used to express the probability of the true age of divergence relative to the age of the fossil; however, there are very few methods for estimating the shape and parameters of the probability distribution empirically. The placement of calibration nodes on the tree informs the placement of the unconstrained nodes, giving divergence date estimates across the phylogeny. Historical methods of clock calibration could only make use of a single fossil constraint (non-parametric rate smoothing), while modern analyses (BEAST and r8s) allow for the use of multiple fossils to calibrate the molecular clock. Simulation studies have shown that increasing the number of fossil constraints increases the accuracy of divergence time estimation.
Sometimes referred to as tip dating, tip calibration is a method of molecular clock calibration in which fossils are treated as taxa and placed on the tips of the tree. This is achieved by creating a matrix that includes a molecular dataset for the extant taxa along with a morphological dataset for both the extinct and the extant taxa. Unlike node calibration, this method reconstructs the tree topology and places the fossils simultaneously. Molecular and morphological models work together simultaneously, allowing morphology to inform the placement of fossils. Tip calibration makes use of all relevant fossil taxa during clock calibration, rather than relying on only the oldest fossil of each clade. This method does not rely on the interpretation of negative evidence to infer maximum clade ages.
Total evidence dating is an approach to tip calibration that goes a step further by simultaneously estimating fossil placement, topology, and the evolutionary timescale. In this method, the age of a fossil can inform its phylogenetic position in addition to morphology. By allowing all aspects of tree reconstruction to occur simultaneously, the risk of biased results is decreased. This approach has been improved upon by pairing it with different models. One current method of molecular clock calibration is total evidence dating paired with the fossilized birth-death (FBD) model and a model of morphological evolution. The FBD model is novel in that it allows for “sampled ancestors,” which are fossil taxa that are the direct ancestor of a living taxon or lineage. This allows fossils to be placed on a branch above an extant organism, rather than being confined to the tips.
Sometimes only a single divergence date can be estimated from fossils, with all other dates inferred from that. Other sets of species have abundant fossils available, allowing the MCH of constant divergence rates to be tested. DNA sequences experiencing low levels of negative selection showed divergence rates of 0.7–0.8% per Myr in bacteria, mammals, invertebrates, and plants. In the same study, genomic regions experiencing very high negative or purifying selection (encoding rRNA) were considerably slower (1% per 50 Myr).
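The two rates quoted above are easier to compare once both are expressed per million years. The short calculation below uses only the figures given in the text; the variable names are illustrative.

```python
# Rates from the text, converted to percent divergence per Myr.
fast_low, fast_high = 0.7, 0.8   # sequences under weak negative selection
slow = 1.0 / 50.0                # rRNA-encoding regions: 1% per 50 Myr

print(f"rRNA-encoding regions: {slow:.2f}% per Myr")
print(f"weakly constrained sequences are {fast_low / slow:.0f} "
      f"to {fast_high / slow:.0f} times faster")
```

On these figures the weakly constrained sequences diverge roughly 35 to 40 times faster than the rRNA-encoding regions.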
In addition to such variation in rate with genomic position, since the early 1990s variation among taxa has proven fertile ground for research too, even over comparatively short periods of evolutionary time (for example mockingbirds). Tube-nosed seabirds have molecular clocks that on average run at half the speed of many other birds, possibly due to long generation times, and many turtles have a molecular clock running at one-eighth the speed it does in small mammals, or even slower. Effects of small population size are also likely to confound molecular clock analyses. Researchers such as Francisco Ayala have more fundamentally challenged the molecular clock hypothesis. According to Ayala's 1999 study, five factors combine to limit the application of molecular clock models.
Molecular clock users have developed workaround solutions using a number of statistical approaches including maximum likelihood techniques and later Bayesian modeling. In particular, models that take into account rate variation across lineages have been proposed in order to obtain better estimates of divergence times. These models are called relaxed molecular clocks because they represent an intermediate position between the 'strict' molecular clock hypothesis and Joseph Felsenstein's many-rates model and are made possible through MCMC techniques that explore a weighted range of tree topologies and simultaneously estimate parameters of the chosen substitution model. It must be remembered that divergence dates inferred using a molecular clock are based on statistical inference and not on direct evidence.
The molecular clock runs into particular challenges at very short and very long timescales. At long timescales, the problem is saturation. When enough time has passed, many sites have undergone more than one change, but it is impossible to detect more than one. This means that the observed number of changes is no longer linear with time, but instead flattens out. Even at intermediate genetic distances, with phylogenetic data still sufficient to estimate topology, signal for the overall scale of the tree can be weak under complex likelihood models, leading to highly uncertain molecular clock estimates.
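Saturation can be illustrated with the Jukes-Cantor model, which is not named in the text above and is used here only as the simplest standard substitution model. Under it, the expected proportion of differing sites p after a true distance of d substitutions per site is p = (3/4)(1 - exp(-4d/3)), which levels off at 0.75. The function names below are illustrative.

```python
import math


def expected_observed_difference(d):
    """Expected proportion of differing sites under Jukes-Cantor for true distance d."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


def jukes_cantor_distance(p):
    """Estimate the true distance from an observed proportion p < 0.75."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


for d in (0.1, 0.5, 1.0, 2.0, 5.0):
    p = expected_observed_difference(d)
    print(f"true distance {d:.1f} -> observed proportion {p:.3f}")
```

As d grows, the observed proportion approaches 0.75 and stops tracking elapsed time. The correction in `jukes_cantor_distance` becomes unstable near that limit, which is one way to see why very old divergences are hard to date.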
At very short time scales, many differences between samples do not represent fixation of different sequences in the different populations. Instead, they represent alternative alleles that were both present as part of a polymorphism in the common ancestor. The inclusion of differences that have not yet become fixed leads to a potentially dramatic inflation of the apparent rate of the molecular clock at very short timescales.
The molecular clock technique is an important tool in molecular systematics, the use of molecular genetics information to determine the correct scientific classification of organisms or to study variation in selective forces. Knowledge of an approximately constant rate of molecular evolution in particular sets of lineages also facilitates establishing the dates of phylogenetic events, including those not documented by fossils, such as the divergence of living taxa and the formation of the phylogenetic tree. In these cases, especially over long stretches of time, the limitations of the MCH (above) must be considered; such estimates may be off by 50% or more.
Allan Charles Wilson (18 October 1934 – 21 July 1991) was a Professor of Biochemistry at the University of California, Berkeley, a pioneer in the use of molecular approaches to understand evolutionary change and reconstruct phylogenies, and a revolutionary contributor to the study of human evolution. He was one of the most controversial figures in post-war biology; his work attracted a great deal of attention both from within and outside the academic world. He is the only New Zealander to have won the MacArthur Fellowship. He is best known for experimental demonstration of the concept of the molecular clock (with his doctoral student Vincent Sarich), which was theoretically postulated by Linus Pauling and Emile Zuckerkandl, revolutionary insights into the nature of the molecular anthropology of higher primates and human evolution, and the so-called Mitochondrial Eve hypothesis (with his doctoral students Rebecca L. Cann and Mark Stoneking).
Archaeamphora
Archaeamphora longicervia is a disputed fossil species, described as an extinct flowering plant, and the only member of the genus Archaeamphora. Fossil material assigned to this taxon originates from the Yixian Formation of northeastern China, dated to the Early Cretaceous (around 145 to 101 million years ago). The species was originally described as a pitcher plant with close affinities to extant members of the family Sarraceniaceae. This would make it the earliest known carnivorous plant and the only known fossil record of Sarraceniaceae, or pitcher plants. Archaeamphora is also one of the three oldest known genera of angiosperms (flowering plants). Li (2005) wrote that "the existence of a so highly derived Angiosperm in the Early Cretaceous suggests that Angiosperms should have originated much earlier, maybe back to 280 mya as the molecular clock studies suggested". Subsequent authors have questioned the identification of Archaeamphora as a pitcher plant.
Cichliformes
Cichliformes is an order of fishes, previously classified under the order Perciformes; many authorities now consider it to be an order within the subseries Ovalentaria.
Circa
Circa (from Latin, meaning 'around, about, roughly, approximately') – frequently abbreviated c., ca., or ca and less frequently circ. or cca. – signifies "approximately" in several European languages and as a loanword in English, usually in reference to a date. Circa is widely used in historical writing when the dates of events are not accurately known.
When used in date ranges, circa is applied before each approximate date, while dates without circa immediately preceding them are generally assumed to be known with certainty.
1732–1799: Both years are known precisely.
c. 1732 – 1799: The beginning year is approximate; the end year is known precisely.
1732 – c. 1799: The beginning year is known precisely; the end year is approximate.
c. 1732 – c. 1799: Both years are approximate.
Emile Zuckerkandl
Émile Zuckerkandl (July 4, 1922 – November 9, 2013) was an Austrian-born French biologist considered one of the founders of the field of molecular evolution. He is best known for introducing, with Linus Pauling, the concept of the "molecular clock", which enabled the neutral theory of molecular evolution.
Euarchonta
The Euarchonta are a proposed grandorder of mammals containing four orders: the Scandentia or treeshrews, the Dermoptera or colugos, the extinct Plesiadapiformes, and the Primates.
The term "Euarchonta" (Waddell et al. 1999, meaning "true ancestors") appeared in 1999, when molecular evidence suggested that the morphology-based Archonta should be trimmed down to exclude Chiroptera (Waddell et al. 1999b). Further DNA sequence analyses (Madsen et al. 2001, Murphy et al. 2001, Waddell et al. 2001) supported the Euarchonta hypothesis. Despite multiple papers pointing out that some mitochondrial sequences showed unusual properties (particularly murid rodents and hedgehogs) and were likely distorting the overall tree (Sullivan and Swofford 1997, Waddell et al. 1999c), and despite Waddell et al. (2001) showing near total congruence of mtDNA-based and nuclear-based trees when such sequences were excluded, some authors continued to produce misleading trees (Arnason et al. 2002). A study investigating retrotransposon presence/absence data has claimed strong support for Euarchonta (Kriegs et al. 2007). Some interpretations of the molecular data link Primates and Dermoptera in a clade (mirorder) known as Primatomorpha, which is the sister of Scandentia. In some, the Dermoptera are a member of the primates rather than a sister group. Other interpretations link the Dermoptera and Scandentia together in a group called Sundatheria as the sister group of the primates.
Euarchonta and Glires together form the Euarchontoglires, one of the four Eutherian clades.
The current hypothesis, based on molecular clock evidence, suggests that the Euarchonta arose in the Cretaceous period, about 88 million years ago, and diverged 86.2 million years ago into the groups of tree shrews and Primatomorpha. The latter diverged prior to 79.6 million years ago into the orders of Primates and Dermoptera. The earliest fossil species often ascribed to Euarchonta (Purgatorius coracis) dates to the early Paleocene, 65 million years ago, but it appears to have been a non-placental eutherian. Although it is known that Scandentia is one of the most basal Euarchontoglires clades, its exact phylogenetic position is not yet considered resolved, and it may be a sister of Glires, Primatomorpha, or Dermoptera, or of all other Euarchontoglires.
Gene orders
Gene orders are the permutation of genome arrangement. A fair amount of research has been done trying to determine whether gene orders evolve according to a molecular clock (molecular clock hypothesis) or in jumps (punctuated equilibrium).
Some research on gene orders in animals' mitochondrial genomes reveals that the mutation rate of gene orders is, to some degree, not constant.
History of molecular evolution
The history of molecular evolution starts in the early 20th century with "comparative biochemistry", but the field of molecular evolution came into its own in the 1960s and 1970s, following the rise of molecular biology. The advent of protein sequencing allowed molecular biologists to create phylogenies based on sequence comparison, and to use the differences between homologous sequences as a molecular clock to estimate the time since the last common ancestor. In the late 1960s, the neutral theory of molecular evolution provided a theoretical basis for the molecular clock, though both the clock and the neutral theory were controversial, since most evolutionary biologists held strongly to panselectionism, with natural selection as the only important cause of evolutionary change. After the 1970s, nucleic acid sequencing allowed molecular evolution to reach beyond proteins to highly conserved ribosomal RNA sequences, the foundation of a reconceptualization of the early history of life.
Human mitochondrial DNA haplogroup
In human genetics, a human mitochondrial DNA haplogroup is a haplogroup defined by differences in human mitochondrial DNA. Haplogroups are used to represent the major branch points on the mitochondrial phylogenetic tree. Understanding the evolutionary path of the female lineage has helped population geneticists trace the matrilineal inheritance of modern humans back to human origins in Africa and the subsequent spread around the globe.
The letter names of the haplogroups (not just mitochondrial DNA haplogroups) run from A to Z. As haplogroups were named in the order of their discovery, the alphabetical ordering does not have any meaning in terms of actual genetic relationships.
The hypothetical woman at the root of all these groups (meaning just the mitochondrial DNA haplogroups) is the matrilineal most recent common ancestor (MRCA) for all currently living humans. She is commonly called Mitochondrial Eve.
The rate at which mitochondrial DNA mutates is known as the mitochondrial molecular clock. It is an area of ongoing research, with one study reporting one mutation per 8000 years.
Human mitochondrial molecular clock
The human mitochondrial molecular clock is the rate at which mutations have been accumulating in the mitochondrial genome of hominids during the course of human evolution. The archeological record of human activity from early periods in human prehistory is relatively limited and its interpretation has been controversial. Because of the uncertainties from the archeological record, scientists have turned to molecular dating techniques in order to refine the timeline of human evolution. A major goal of scientists in the field is to develop an accurate hominid mitochondrial molecular clock which could then be used to confidently date events that occurred during the course of human evolution.
Estimates of the mutation rate of human mitochondrial DNA (mtDNA) vary greatly depending on the available data and the method used for estimation. The two main methods of estimation, phylogeny-based methods and pedigree-based methods, have produced mutation rates that differ by almost an order of magnitude. Current research has been focused on resolving the high variability obtained from different rate estimates.
Malpighiales
The Malpighiales comprise one of the largest orders of flowering plants, containing about 16,000 species, about 7.8% of the eudicots. The order is very diverse, containing plants as different as the willow, violet, poinsettia, and coca plant, and its members are hard to recognize as a group except with molecular phylogenetic evidence. It is not part of any of the classification systems based only on plant morphology. Molecular clock calculations estimate the origin of stem group Malpighiales at around 100 million years ago (Mya) and the origin of crown group Malpighiales at about 90 Mya. The Malpighiales are divided into 32 to 42 families, depending upon which clades in the order are given the taxonomic rank of family. In the APG III system, 35 families are recognized. Medusagynaceae, Quiinaceae, Peraceae, Malesherbiaceae, Turneraceae, Samydaceae, and Scyphostegiaceae are consolidated into other families. The largest family, by far, is the Euphorbiaceae, with about 6300 species in about 245 genera. In a 2009 study of DNA sequences of 13 genes, 42 families were placed into 16 groups, ranging in size from one to 10 families. Almost nothing is known about the relationships among these 16 groups. Malpighiales and Lamiales are the two large orders whose phylogeny remains mostly unresolved.
Myzopoda
Myzopoda, which has two described species, is the only genus in the bat family Myzopodidae. Myzopodidae is unique as the only family of bats presently endemic to Madagascar. However, fossil discoveries indicate that the family has an ancient lineage in Africa, extending from the Pleistocene as far back as the late Eocene. Based on nuclear DNA sequence data, Myzopodidae appears to be basal in the Gondwanan superfamily Noctilionoidea, most of whose members are neotropical. The origin and initial diversification of Noctilionoidea may have occurred in Africa prior to their dispersal to Australia and South America, probably via Antarctica. On the basis of fossil and molecular clock evidence, myzopodids are estimated to have split off from the rest of Noctilionoidea about 50 (46 to 57) million years ago.
Oriental magpie
The Oriental magpie (Pica sericea) is a species of magpie found from south-eastern Russia and Myanmar to eastern China, Taiwan and northern Indochina. It is also a common symbol of the Korean identity, and has been adopted as the "official bird" of numerous South Korean cities, counties and provinces. Other names for the Oriental magpie include Korean magpie and Asian magpie.
Rosids
The rosids are members of a large clade (monophyletic group) of flowering plants, containing about 70,000 species, more than a quarter of all angiosperms. The clade is divided into 16 to 20 orders, depending upon circumscription and classification. These orders, in turn, together comprise about 140 families. Fossil rosids are known from the Cretaceous period. Molecular clock estimates indicate that the rosids originated in the Aptian or Albian stages of the Cretaceous, between 125 and 99.6 million years ago.
Salientia
The Salientia (Latin salere (salio), "to jump") are a total group of amphibians that includes the order Anura, the frogs and toads, and various extinct proto-frogs that are more closely related to the frogs than they are to the Urodela, the salamanders and newts. The oldest fossil "proto-frog" appeared in the early Triassic of Madagascar, but molecular clock dating suggests their origins may extend further back to the Permian, 265 million years ago.
Schizomida
Schizomida (common name shorttailed whipscorpion) is an order of arachnids, generally less than 5 millimetres (0.20 in) in length.
The order is not yet widely studied. As of 2005, more than 230 species of schizomids have been described worldwide, most belonging to the Hubbardiidae family. A systematic review including a full catalogue may be found in Reddell & Cokendolpher (1995). The Schizomida is sister to the order Uropygi, the two clades together forming the Thelyphonida. Based on molecular clock dates, both orders likely originated in the late Carboniferous somewhere in the tropics of Pangea, and the Schizomida underwent substantial diversification starting in the Cretaceous.
Shimanskya
Shimanskya is a late Carboniferous fossil tentatively interpreted as an early spirulid. This identification was based on:
the well-developed phragmocone [which] possesses comparatively long camerae and [a] comparatively wide marginal siphuncle, the [absence of the] rostrum (at adult stages at least), and the [construction of the] shell wall, which is as thin as septa, has no nacreous layer and is subdivided into the inner and outer plates
Doguzhaeva et al. also identify these features in living Spirula and in the fossil 'Spirulida' Naefia, Groenlandibelus and Adygeya, though see the respective articles for discussion of whether these extinct genera are themselves spirulids.
Some authors are happy to accept this designation. But others have argued that none of the characters observed in Shimanskya is clearly diagnostic of the spirulids. For example, a nacreous layer may have been lost more than once in cephalopod evolution. Others view the microstructural evidence as ambiguous. Interpreting Shimanskya as a spirulid creates a large gap in the fossil record of the lineage. Moreover, some molecular clock results predict that spirulids evolved much later than the Carboniferous, leading some to suggest that Shimanskya ought to be assigned to the coleoid stem group. Other clock analyses, however, are consistent with its position in the spirulid lineage.
Substitution model
In biology, a substitution model describes the process by which a sequence of symbols changes into another sequence. For example, in cladistics, each position in the sequence might correspond to a property of a species which can either be present or absent. The alphabet could then consist of "0" for absence and "1" for presence. Then the sequence 00110 could mean, for example, that a species does not have feathers or lay eggs, does have fur, is warm-blooded, and cannot breathe underwater. Another sequence 11010 would mean that a species has feathers, lays eggs, does not have fur, is warm-blooded, and cannot breathe underwater. In phylogenetics, sequences are often obtained by first producing a nucleotide or protein sequence alignment, and then taking the bases or amino acids at corresponding positions in the alignment as the characters. Sequences obtained this way might look like AGCGGAGCTTA and GCCGTAGACGC.
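The two encodings described above can be made concrete in a few lines of Python. The sketch is illustrative only: the trait names follow the example in the text, while the function names are hypothetical.

```python
TRAITS = ["has feathers", "lays eggs", "has fur", "is warm-blooded",
          "can breathe underwater"]


def decode_characters(sequence):
    """Map a binary character string such as '00110' to {trait: present?}."""
    if len(sequence) != len(TRAITS):
        raise ValueError("one character is expected per trait")
    return {trait: state == "1" for trait, state in zip(TRAITS, sequence)}


def count_differences(seq_a, seq_b):
    """Number of aligned positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


print(decode_characters("00110"))
print(count_differences("AGCGGAGCTTA", "GCCGTAGACGC"))  # 7 of 11 positions differ
```

A count of differing positions like this is the raw observation that a substitution model then interprets, for instance by allowing for sites that have changed more than once.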
Substitution models are used for a number of things:
Constructing evolutionary trees in phylogenetics or cladistics.
Simulating sequences to test other methods and algorithms.
Yinpterochiroptera
The Yinpterochiroptera, or Pteropodiformes, is a suborder of the Chiroptera, which includes taxa formerly known as megabats and five of the microbat families: Rhinopomatidae, Rhinolophidae, Hipposideridae, Craseonycteridae, and Megadermatidae. This suborder is primarily based on molecular genetics data. This proposal challenged the traditional view that megabats and microbats form monophyletic groups of bats. Further studies are being conducted, using both molecular and morphological cladistic methodology, to assess its merit. The term Yinpterochiroptera is constructed from the words Pteropodidae (the family of megabats) and Yinochiroptera (a term proposed in 1984 by Karl F. Koopman to refer to certain families of microbats).
Recent studies using transcriptome data have found strong support for the Yinpterochiroptera-Yangochiroptera classification system.
Researchers have created a relaxed molecular clock that estimates the divergence between Yinpterochiroptera and Yangochiroptera around 63 million years ago. The most recent common ancestor of Yinpterochiroptera, corresponding to the split between Rhinolophoidea and Pteropodidae (Old World Fruit bats), is estimated to have occurred 60 million years ago. The first appearance of the term Yinpterochiroptera was in 2001, in an article by Mark Springer and colleagues. As an alternative to the subordinal names Yinpterochiroptera and Yangochiroptera, some researchers use the terms Pteropodiformes and Vespertilioniformes, basing the names on the oldest valid genus description in each group, Pteropus and Vespertilio. Under this new proposed nomenclature, Pteropodiformes is the suborder that would replace Yinpterochiroptera.