Cladistics (Greek: klados = branch) is a branch of biology that determines the evolutionary relationships between organisms based on derived similarities. It is the most prominent of several forms of phylogenetic systematics, which study the evolutionary relationships between organisms. Cladistics is a method of rigorous analysis, using "shared derived properties" (synapomorphies: see below) of the organisms being studied. Cladistic analysis forms the basis for most modern systems of biological classification, which seek to group organisms by evolutionary relationships. In contrast, phenetics groups organisms based on their overall similarity, while approaches that are more traditional tend to rely on key characters. Willi Hennig (1913 - 1976) is widely regarded as the founder of cladistics.
This cladogram shows the relationship among various insect groups. In some cladograms, the length of the horizontal lines indicates time elapsed since the last common ancestor.
As the end result of a cladistic analysis, treelike relationship-diagrams called "cladograms" are drawn up to show different hypotheses of relationships. A cladistic analysis can be based on as much or as little information as the researcher selects. Modern systematic research is likely to be based on a wide variety of information, including DNA-sequences (so called "molecular data"), biochemical data and morphological data.
This representation emphasises that cladograms are trees.
In a cladogram, all organisms lie at the leaves, and each inner node is ideally binary (two-way). The two taxa on either side of a split are called sister taxa or sister groups. Each subtree, whether it contains one item or a hundred thousand items, is called a clade. A natural group comprises all the organisms in a single clade, which share a unique ancestor for that clade (one that they do not share with any other organisms on the diagram). Each clade is set off by a series of characteristics that appear in its members but not in the other forms from which it diverged. These identifying characteristics of a clade are called synapomorphies (shared, derived characters). For instance, hardened front wings (elytra) are a synapomorphy of beetles, while circinate vernation, or the unrolling of new fronds, is a synapomorphy of ferns.
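The tree structure described above can be illustrated with a short sketch. It assumes a cladogram stored as nested Python tuples; the taxa, the topology and the function names are illustrative choices, not part of any standard tool.

```python
# Illustrative cladogram: leaves are strings, inner nodes are 2-tuples.
# The taxa and the topology are assumptions chosen for this example.
TREE = ("frog", ("mammal", ("lizard", ("crocodile", "bird"))))


def leaves(node):
    """Return the set of leaf names at or below a node."""
    if isinstance(node, str):
        return frozenset([node])
    left, right = node
    return leaves(left) | leaves(right)


def clades(node):
    """Yield the leaf set of every subtree, i.e. of every clade."""
    yield leaves(node)
    if not isinstance(node, str):
        for child in node:
            yield from clades(child)


def sister_group(node, target):
    """Return the leaf set of the sister of the clade `target`, or None."""
    if isinstance(node, str):
        return None
    left, right = node
    if leaves(left) == target:
        return leaves(right)
    if leaves(right) == target:
        return leaves(left)
    return sister_group(left, target) or sister_group(right, target)


ALL_CLADES = list(clades(TREE))
BIRD_SISTER = sister_group(TREE, frozenset(["bird"]))
```

In this sketch every subtree counts as a clade, including single leaves, so a strictly binary tree with five leaves yields nine clades. A set such as lizard plus crocodile does not appear among them, because no subtree contains exactly those two leaves.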
A character state (see below) that is present in both the outgroups (the nearest relatives of the group that are not part of the group itself) and in the ancestors is called a plesiomorphy (meaning "close form", also called the ancestral state). A character state that occurs only in later descendants is called an apomorphy (meaning "separate form", also called the "derived" state) for that group. The adjectives plesiomorphic and apomorphic are used instead of "primitive" and "advanced" to avoid placing value-judgements on the evolution of the character states, since both may be advantageous in different circumstances.
Several more terms are defined for the description of cladograms and the positions of items within them. A species or clade is basal to another clade if it holds more plesiomorphic characters than that other clade. Usually a basal group is very species-poor compared with a more derived group. A basal group need not be present. For example, when considering birds and mammals together, neither is basal to the other: both have many derived characters.
A clade or species located within another clade can be described as nested within that clade.
A cladistic analysis is applied to a certain set of information. To organize this information, a distinction is made between characters and character states. Consider the color of feathers: it may be blue in one species but red in another. Thus, "red feathers" and "blue feathers" are two character states of the character "feather-color."
Traditionally, the researcher would decide which character states were present before the last common ancestor of the species group (plesiomorphies) and which were present in the last common ancestor (synapomorphies). Usually this was done by considering one or more outgroups (organisms that are considered not to be part of the group in question, but that are related to the group). Only synapomorphies are of any use in characterising cladistic divisions.
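Outgroup comparison can be sketched in a few lines. The example below is a minimal illustration, assuming binary characters and a single outgroup whose states are taken to be plesiomorphic; the matrix, the taxa and the function name are hypothetical.

```python
# Hypothetical character matrix: 0 and 1 are the two states of each character.
# Silverfish serve as the wingless outgroup; all scores are illustrative.
MATRIX = {
    "silverfish": {"wings": 0, "elytra": 0, "halteres": 0},
    "ground beetle": {"wings": 1, "elytra": 1, "halteres": 0},
    "ladybird": {"wings": 1, "elytra": 1, "halteres": 0},
    "housefly": {"wings": 1, "elytra": 0, "halteres": 1},
}


def candidate_synapomorphies(matrix, outgroup):
    """Map each character to the ingroup taxa that share its derived state.

    Assumes binary characters and treats the outgroup's state as
    plesiomorphic. A derived state is kept only if at least two ingroup
    taxa carry it, because a state found in one taxon groups nothing.
    """
    result = {}
    for character, ancestral_state in matrix[outgroup].items():
        carriers = sorted(
            taxon
            for taxon, states in matrix.items()
            if taxon != outgroup and states[character] != ancestral_state
        )
        if len(carriers) >= 2:
            result[character] = carriers
    return result


CANDIDATES = candidate_synapomorphies(MATRIX, "silverfish")
```

Here "elytra" groups the two beetles and "wings" groups all three ingroup taxa, while "halteres" is dropped because only one taxon carries the derived state. Real analyses use several outgroups and do not assume that the outgroup state is always ancestral.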
Next, different possible cladograms were drawn up and evaluated. Clades should have as many synapomorphies as possible. The hope is that the true synapomorphies will be numerous enough to overwhelm any unintended homoplasies, that is, characters that resemble each other because of convergent evolution under similar environmental conditions or functional demands, not because of common ancestry. A well-known example of homoplasy due to convergent evolution is the character "wings". Though the wings of birds and insects may superficially resemble one another and serve the same function, each evolved independently. If a bird and an insect are both inadvertently scored "POSITIVE" for the character "presence of wings", a homoplasy is introduced into the dataset, which may cause erroneous results.
When competing possibilities turn up, one is usually chosen based on the principle of parsimony: the arrangement that requires the fewest character-state changes is taken as the best hypothesis of relationship (a variation of Occam's razor). Another approach, particularly useful in molecular evolution, is maximum likelihood, which selects the cladogram that has the highest likelihood under a specific probability model of character change.
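The parsimony criterion can be made concrete by counting the minimum number of character-state changes each candidate tree requires. The sketch below uses Fitch's algorithm for unordered characters on a binary tree, which is one standard way of doing this count and is not necessarily what any particular program implements; the matrix and both trees are hypothetical.

```python
# Hypothetical binary character matrix: one string of states per taxon.
# Characters 1 and 2 support A+B; character 3 conflicts with them.
MATRIX = {"A": "111", "B": "110", "C": "001", "D": "000"}

TREE_1 = (("A", "B"), ("C", "D"))
TREE_2 = (("A", "C"), ("B", "D"))


def fitch(node, states):
    """Return (candidate ancestral states, minimum changes) for a subtree."""
    if isinstance(node, str):
        return {states[node]}, 0
    left, right = node
    left_states, left_changes = fitch(left, states)
    right_states, right_changes = fitch(right, states)
    shared = left_states & right_states
    if shared:
        # The children agree on at least one state: no extra change needed.
        return shared, left_changes + right_changes
    # The children disagree: one change must occur on a branch below.
    return left_states | right_states, left_changes + right_changes + 1


def parsimony_score(tree, matrix):
    """Sum the minimum number of state changes over all characters."""
    n_characters = len(next(iter(matrix.values())))
    return sum(
        fitch(tree, {taxon: row[i] for taxon, row in matrix.items()})[1]
        for i in range(n_characters)
    )


SCORES = {
    "TREE_1": parsimony_score(TREE_1, MATRIX),
    "TREE_2": parsimony_score(TREE_2, MATRIX),
}
```

With this matrix, TREE_1 needs four changes and TREE_2 needs five, so parsimony prefers TREE_1 and treats the third character as a homoplasy. Real programs must also search among the possible trees, which this sketch does not attempt.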
The analysis is no longer carried out by hand in this way, because researcher bias is something to be avoided. These days much of the analysis is done by software: besides the software that calculates the trees themselves, there is sophisticated statistical software to provide a more objective basis.
Cladistics has taken a while to settle in, and there is still wide debate over how to apply Hennig's ideas in the real world. There is concern that the use of widely different data sets (for instance, structural versus genetic characteristics) may produce widely different trees. On the whole, however, cladistics has proven useful in resolving phylogenies and has gained widespread support.
As DNA sequencing has become easier, phylogenies are increasingly constructed with the aid of molecular data. Computational systematics allows the use of these large data sets to construct objective phylogenies, which can distinguish some true synapomorphies from parallel evolution more accurately. A powerful method of reconstructing phylogenies is the use of genomic retrotransposon markers, which are virtually ambiguity-free according to current knowledge (though this is an assumption based on statistics and may, although it is unlikely, not hold in a specific case). Ideally, morphological, molecular and possibly other (e.g. behavioral) phylogenies should be combined: none of the methods is "superior", but each has different intrinsic sources of error. For example, true character convergence is much more common in morphology than in molecular sequences, but true character reversions usually occur only in the latter (see Long branch attraction). Dating based on molecular information is usually more precise than dating of fossils, but more fraught with error (see Molecular clock). By combining and comparing, many errors can be eliminated.
Cladistics does not assume any particular theory of evolution, only the background knowledge of descent with modification. Thus, cladistic methods can be, and recently have been, usefully applied to non-biological systems, including determining language families in historical linguistics and the filiation of manuscripts in textual criticism.
There are three ways to define a clade for use in a cladistic taxonomy:
- Node-based: the most recent common ancestor of A and B and all its descendants.
- Stem-based: all descendants of the oldest common ancestor of A and B that is not also an ancestor of Z.
- Apomorphy-based: the most recent common ancestor of A and B possessing a certain apomorphy (derived character), and all its descendants.
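The node-based and stem-based definitions can be sketched on a cladogram stored as nested tuples. The example is a simplification under stated assumptions: only sampled taxa appear as leaves, so the stem-based clade reduces to the largest subtree that contains A and B but not Z. The taxa and the function names are illustrative, and the apomorphy-based definition is left out because it needs character data as well.

```python
# Illustrative cladogram: leaves are strings, inner nodes are 2-tuples.
TREE = ("frog", ("mammal", ("lizard", ("crocodile", "bird"))))


def leaves(node):
    """Return the set of leaf names at or below a node."""
    if isinstance(node, str):
        return frozenset([node])
    left, right = node
    return leaves(left) | leaves(right)


def node_based_clade(node, a, b):
    """Leaves of the smallest subtree that contains both a and b."""
    if not {a, b} <= leaves(node):
        raise ValueError("both specifiers must be in the tree")
    if not isinstance(node, str):
        for child in node:
            if {a, b} <= leaves(child):
                return node_based_clade(child, a, b)
    return leaves(node)


def stem_based_clade(node, a, b, z):
    """Leaves of the largest subtree that contains a and b but not z."""
    members = leaves(node)
    if not {a, b} <= members:
        raise ValueError("both internal specifiers must be in the tree")
    if z not in members:
        return members
    if isinstance(node, str):
        raise ValueError("z cannot also be an internal specifier")
    for child in node:
        if {a, b} <= leaves(child):
            return stem_based_clade(child, a, b, z)
    raise ValueError("no subtree contains a and b while excluding z")


NODE_CLADE = node_based_clade(TREE, "crocodile", "bird")
STEM_CLADE = stem_based_clade(TREE, "crocodile", "bird", "mammal")
```

With the same two specifiers, the node-based clade contains only crocodile and bird, while the stem-based clade that excludes mammal also takes in lizard, because lizard branches off after the split from mammals.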
A trend in biology since the 1960s, called cladism or cladistic taxonomy, requires taxa to be clades. In other words, cladists argue that the classification system should be reformed to eliminate all non-clades. In contrast, other taxonomists insist that groups reflect phylogenies and often make use of cladistic techniques, but allow both monophyletic and paraphyletic groups as taxa.
A monophyletic group is a clade, comprising an ancestral form and all of its descendants, and so forming one (and only one) evolutionary group. A paraphyletic group is similar, but excludes some of the descendants that have undergone significant changes. For instance, the traditional class Reptilia excludes birds even though they evolved from the ancestral reptile. Similarly, the traditional invertebrates are paraphyletic because vertebrates are excluded, although the latter evolved from an invertebrate.
A group with members from separate evolutionary lines is called polyphyletic. For instance, the once-recognized Pachydermata was found to be polyphyletic because elephants and rhinoceroses arose from non-pachyderms separately. Evolutionary taxonomists consider polyphyletic groups to be errors in classification, often occurring because convergence or other homoplasy was misinterpreted as homology.
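Whether a proposed group is monophyletic can be checked mechanically once a cladogram is given. The sketch below finds the smallest clade that contains the group and reports which of its leaves the group leaves out; an empty result means the group is a clade. The tree is a greatly simplified, illustrative topology and the function names are hypothetical. The sketch does not try to separate paraphyletic from polyphyletic groups, which would also require reconstructing what the common ancestor was like.

```python
# Illustrative cladogram: leaves are strings, inner nodes are 2-tuples.
TREE = ("frog", ("mammal", ("lizard", ("crocodile", "bird"))))


def leaves(node):
    """Return the set of leaf names at or below a node."""
    if isinstance(node, str):
        return frozenset([node])
    left, right = node
    return leaves(left) | leaves(right)


def subtrees(node):
    """Yield a node and every node below it."""
    yield node
    if not isinstance(node, str):
        for child in node:
            yield from subtrees(child)


def smallest_clade(tree, group):
    """Leaf set of the smallest subtree that contains every group member."""
    group = frozenset(group)
    containing = [leaves(s) for s in subtrees(tree) if group <= leaves(s)]
    if not containing:
        raise ValueError("some group members are not in the tree")
    return min(containing, key=len)


def excluded_descendants(tree, group):
    """Leaves of the smallest enclosing clade that the group omits."""
    return smallest_clade(tree, group) - frozenset(group)


def is_monophyletic(tree, group):
    """True when the group is exactly the leaf set of one subtree."""
    return not excluded_descendants(tree, group)


REPTILIA_GAP = excluded_descendants(TREE, {"lizard", "crocodile"})
```

On this tree a "Reptilia" made of lizard and crocodile omits bird, mirroring the traditional class that excludes birds, while a group of bird and mammal omits both lizard and crocodile.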
Following Hennig, cladists argue that paraphyly is as harmful as polyphyly. The idea is that monophyletic groups can be defined objectively, in terms of common ancestors or the presence of synapomorphies. In contrast, paraphyletic and polyphyletic groups are both defined based on key characters, and the decision of which characters are of taxonomic import is inherently subjective. Many argue that such groups lead to "gradistic" thinking, where groups advance from "lowly" grades to "advanced" grades, which can in turn lead to teleology. In evolutionary studies, teleology is usually avoided because it implies a plan that cannot be empirically demonstrated.
Going further, some cladists argue that ranks for groups above species are too subjective to present any meaningful information, and so argue that they should be abandoned. Thus they have moved away from Linnaean taxonomy towards a simple hierarchy of clades.
Other evolutionary systematists argue that all taxa are inherently subjective, even when they reflect evolutionary relationships, since living things form an essentially continuous tree. Any dividing line is artificial, and creates both a monophyletic section above and a paraphyletic section below. Paraphyletic taxa are necessary for classifying earlier sections of the tree – for instance, the early vertebrates that would someday evolve into the family Hominidae cannot be placed in any other monophyletic family. They also argue that paraphyletic taxa provide information about significant changes in organisms' morphology, ecology, or life history – in short, that both taxa and clades are valuable but distinct notions, with separate purposes. Many use the term monophyly in its older sense, where it includes paraphyly, and use the alternate term holophyly to describe clades (monophyly in Hennig's sense).
A formal code of phylogenetic nomenclature, the PhyloCode, is currently under development for cladistic taxonomy. It is intended for use by both those who would like to abandon Linnaean taxonomy and those who would like to use taxa and clades side by side.