To analyze the evolutionary emergence of structural complexity in physical processes we introduce a general, but tractable, model of objects that interact to produce new objects. Since the objects - epsilon-machines - have well-defined structural properties, we demonstrate that complexity in the resulting population-dynamical system emerges on several distinct organizational scales during evolution - from individuals to nested levels of mutually self-sustaining interaction. The evolution to increased organization is dominated by the spontaneous creation of structural hierarchies and this, in turn, is facilitated by the innovation and maintenance of relatively low-complexity, but general individuals.
J. P. Crutchfield and Olof Gornerup, "Objects That Make Objects: The Population Dynamics of Structural Complexity". Santa Fe Institute Working Paper 04-06-020. arxiv.org e-print adap-org/0406058.
Understanding the origin of diversity is a fundamental problem in biology. Traditional evolutionary theory predicts uniformity: acting on organisms under given environmental conditions and developmental constraints, natural selection produces a unique, optimally adapted phenotype. According to this view, different types only come about through a change in conditions over space or time. In particular, the process of diversification, that is, the split of an ancestral population into distinct descendent lineages, is a by-product of geographical separation. This traditional view misses out on the important perspective that diversification itself can be an adaptive process. In this talk I will review recent theoretical work showing that diversification as an adaptive response to biological interactions is a plausible evolutionary process. This work is based on the mathematical framework of adaptive dynamics, and in particular on the phenomenon of evolutionary branching due to frequency-dependent ecological interactions. I will describe basic models for evolutionary branching based on resource competition, as well as models of diversification in spatially structured populations. I will then describe ongoing efforts to test the theory of evolutionary branching in evolving Escherichia coli populations, which provide a promising experimental model system for studying adaptive diversification.
Viroids are small (246-399 nucleotides), single-stranded, covalently closed circular RNAs that replicate autonomously in susceptible plant cells. Usually they induce strong pathologies in crops and, hence, represent a tremendous agronomical problem. It has been proposed that the different viroid species have a monophyletic origin and may represent relics of the hypothetical precellular RNA world. A key characteristic of viroids, and perhaps their only known phenotype, is a highly complex secondary structure in which different domains are involved in different phases of the replication cycle and in interactions with host factors. While members of the Pospiviroidae family require cellular factors to complete every step in their replication cycle, members of the Avsunviroidae family contain a hammerhead-type ribozyme that self-catalyzes the production of monomers from the multimeric intermediates of replication.
In the last few years, we have been using viroid species to explore the evolution of mutational robustness in simple replicons. Our approach is twofold. First, we are using an in silico approach to quantify the effect of all possible mutations on RNA folding shape and stability. We are also exploring the sign and strength of epistasis among pairs of random mutations. Our results show that the two families of viroids have significant differences in terms of robustness. On average, the size of the neutral neighbourhood for the Avsunviroidae is about twice as large as for the Pospiviroidae. While antagonistic epistasis is a common feature of all viroids, on average Avsunviroidae show a larger epistasis coefficient than Pospiviroidae. Finally, we found a negative correlation between epistasis coefficient and the average effect of deleterious mutations.
Our second approach is experimental. We have focused on validating the "survival of the flattest" hypothesis. The "survival of the fittest" represents the classic paradigm of Darwinian evolution by which genotypes with high growth rates are favored by natural selection. However, if the mutation rate is so high that each newly synthesized genotype carries more mutations than its progenitor, a genotype showing robustness against deleterious mutational effects would be favored by natural selection instead of the faster replicator, even at the cost of a low growth rate. This situation has been dubbed "the survival of the flattest" by C. O. Wilke et al., in reference to the low and flat fitness peaks occupied by robust organisms. So far, this concept has only been demonstrated in digital organisms. Here, we show that it also applies to biological entities by analyzing the accumulation of two viroid species coinfecting the same plant. Under optimal growth conditions, CSVd, a pospiviroid characterized by a high population growth rate and genetic homogeneity, out-competed an avsunviroid species, CChMVd, with a low population growth rate and high genetic variability. In contrast, CChMVd was able to out-compete CSVd when the mutation rate was artificially increased. The experimental results are confirmed by an in silico model of competing quasispecies.
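The trade-off described above can be illustrated with a minimal sketch. It is not the authors' in silico quasispecies model. It assumes a common textbook approximation in which a genotype's effective growth rate is its replication rate multiplied by the probability that an offspring carries no deleterious mutation, with mutations Poisson-distributed and a fixed fraction of them neutral. All parameter values and names (`FIT`, `FLAT`, `neutral_fraction`) are hypothetical.

```python
import math

def effective_growth(replication_rate, neutral_fraction, mutation_rate):
    """Growth rate after discounting offspring hit by a deleterious mutation.

    Assumes Poisson-distributed mutations with mean `mutation_rate` per genome
    per replication, of which a fraction `neutral_fraction` has no effect and
    the remainder is lethal (an illustrative simplification).
    """
    return replication_rate * math.exp(-mutation_rate * (1.0 - neutral_fraction))

def critical_mutation_rate(w_fit, nu_fit, w_flat, nu_flat):
    """Mutation rate above which the robust ("flat") genotype grows faster."""
    return math.log(w_fit / w_flat) / (nu_flat - nu_fit)

# Hypothetical parameters: a fast but fragile replicator and a slow but robust one.
FIT = dict(replication_rate=2.0, neutral_fraction=0.2)
FLAT = dict(replication_rate=1.5, neutral_fraction=0.6)

def winner(mutation_rate):
    """Return which genotype has the higher effective growth rate."""
    fit = effective_growth(mutation_rate=mutation_rate, **FIT)
    flat = effective_growth(mutation_rate=mutation_rate, **FLAT)
    return "fit" if fit > flat else "flat"

u_crit = critical_mutation_rate(2.0, 0.2, 1.5, 0.6)
print(f"critical mutation rate: {u_crit:.3f}")
for u in (0.1, 2.0):
    print(f"U = {u}: {winner(u)} genotype wins")
```

Under these assumptions the fast replicator wins at low mutation rates and the robust one wins once the mutation rate exceeds the critical value, which mirrors the qualitative reversal reported for CSVd and CChMVd.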
RNA viruses are of great biological importance because of their role as agents of disease and their presumed similarity to the replicating molecules that inhabited the "RNA world". Herein I will present an overview of the patterns and mechanisms of genome evolution in RNA viruses. I will begin by reviewing what we know of the phylogenetic history of RNA viruses, revealed from whole genome analyses. The bad news is that these genomes are so diverse in sequence and genome structure that constructing a virus "tree of life" may ultimately be a futile exercise. I will then discuss how RNA genomes evolve. In contrast to eukaryotes and bacteria, gene duplication and lateral gene transfer do not appear to be important mechanisms of evolutionary change (although I will show that cellular genomes have "captured" a variety of viral genes). Rather, the process of genome evolution in RNA viruses is in large part determined by the need to retain a small genome size and by a high rate of deleterious mutation. This, in turn, means that the evolution of RNA viruses is characterized by complex fitness trade-offs and epistatic interactions. Finally I will show that recombination is unlikely to be an adaptation for sexual reproduction but rather acts to control gene expression.
Given an organism with a set of identical parts, in the absence of selection, random variation will produce differences among the parts in the next generation. And in the next generation, random variation will tend to make the parts even more different from each other. In the absence of selection, this spontaneous rise in internal differentiation should continue indefinitely, without limit. Internal differentiation is a type of complexity, what I call pure complexity, divorced from any notion of function. It follows that there should be a vector in evolution tending to increase the internal pure-complexity of organisms. The vector should act pervasively. In the absence of selection, complexity should increase in every set of parts (not just initially identical ones), in every property of those parts, in every species, over the entire history of life. What about natural selection? I will make two points: 1) Complexity in the sense of differentiation - pure complexity - doesn't need natural selection. It arises spontaneously, from the accumulation of variation. 2) We often marvel at the complexity of modern organisms. Given a universal complexity vector, however, the question is not why they are complex, but why they are not more complex. Complexity does increase sometimes, but not in every generation in every species in history, so something must be opposing the vector. The only force we know that could oppose it is natural selection, selection against complexity. It must be that selection opposes complexity, and hugely.
Vesicular stomatitis virus (VSV) is a useful model to unravel the rules that govern the evolution of RNA virus populations. We carried out different experiments to test whether alternation between insect infection and mammalian infection could delay the rate of evolution in VSV and other arthropod-borne viruses (arboviruses). We observed that during acute experimental passages there was no tradeoff between infection of the two cell types, nor decreased evolutionary rates in time-dependent regimes. In contrast, there was a tradeoff between acute and persistent infections. Persistent infections had a dominant role in determining how VSV evolved, independently of the inclusion of periodic mammalian replication. All selective regimes resulted in a high degree of phenotypic and genotypic parallel evolution. We determined how low-fitness, persistent populations recovered in mammalian cells, and observed that the changes in the sequences of higher-fitness populations could only be explained by the preexistence of a minority of mammalian-adapted genomes in the persistent populations that became dominant after the environmental switch. Furthermore, these minority genomes are likely to coexist during persistence with dominant, persistence-adapted genomes for relatively long periods of time. We propose that complementation contributes to the extended survival of minority, mammalian-adapted genomes in our studies.
Digital organisms are self-replicating computer programs that exist in a complex digital environment. The organisms are subject to mutations and limited resources, leading the populations to evolve by natural selection. We have been able to use digital organisms for a variety of fundamental experiments to study evolutionary dynamics. In this talk, I present an overview of research with digital organisms, with a special focus on the evolutionary origin of complex traits.
When Darwin first proposed his theory of evolution by natural selection, he realized that it had a problem explaining the origins of complex features such as the vertebrate eye. Darwin noted that "In considering transitions of organs, it is so important to bear in mind the probability of conversion from one function to another." That is, populations do not evolve complex new features de novo, but instead modify existing, less complex features for use as building blocks of the new feature. Darwin further hypothesized that "Different kinds of modification would [...] serve for the same general purpose", noting that just because any one particular complex solution may be unlikely, there may be many other possible solutions, and we only witness the single one lying on the path evolution took. As long as the aggregate probability of all solutions is high enough, the individual probabilities of the possible solutions are almost irrelevant.
Substantial evidence now exists that supports Darwin's general model for the evolution of complexity, but it is still difficult to provide a complete account of the origin of any complex feature due to the extinction of the intermediate forms, imperfection of the fossil record, and incomplete knowledge of the genetic and developmental mechanisms that produce such features. Digital evolution has allowed us to surmount these difficulties and track all genotypic and phenotypic changes during the evolution of a complex trait, with enough replication to obtain statistically powerful results.
Molecular interaction networks at the genome scale have evolved to exhibit rich and complex structures with a number of interesting and clean topological features, including the power law behavior of the distributions of certain centrality measures and the small-world effect. Perhaps the most powerful evidence that these networks are not really "random" but rather organized in some fashion comes from studies in the recent past that report correlations between aspects of gene/protein function and the position of the corresponding gene/protein in an interaction network. Examples of such correlations include the preponderance of essential proteins among the hubs of a protein interaction network, evolutionary conservation of "date" hubs, and a positive correlation between the number of common nearest neighbors of two genes in a synthetic genetic network and the presence of a physical interaction between their protein products. We are therefore interested in the question of whether it is possible to systematically elucidate gene/protein function using topological properties within interaction networks. In this context, we present results from our work on three problems: the problem of extracting functionally meaningful sub-networks from protein interaction networks in a systematic manner, the problem of predicting synthetic lethality from protein interaction networks, and an analysis of correlations between single-node properties in protein-protein, DNA-protein, and genetic networks. Our results show that mathematical network analysis can often serve to elucidate function and is therefore complementary to other homology-based methods.
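Two of the topological quantities mentioned above, node degree (used to identify hubs) and the number of common nearest neighbors of two genes, can be computed directly from an edge list. The following is a minimal sketch with a made-up toy network. The gene names and the helper functions are hypothetical and are not taken from the work described.

```python
def build_neighbors(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from an edge list."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    return neighbors

def common_neighbors(neighbors, a, b):
    """Nodes adjacent to both a and b (excluding a and b themselves)."""
    return (neighbors.get(a, set()) & neighbors.get(b, set())) - {a, b}

def degree(neighbors, node):
    """Number of interaction partners of a node; hubs are high-degree nodes."""
    return len(neighbors.get(node, set()))

# Hypothetical toy interaction network.
edges = [("g1", "g3"), ("g2", "g3"), ("g1", "g4"), ("g2", "g4"), ("g1", "g5")]
net = build_neighbors(edges)

shared = common_neighbors(net, "g1", "g2")
print(sorted(shared), degree(net, "g1"))
```

In a study of the kind described, such counts would be computed on one network (for example, a synthetic genetic network) and compared against edges in another (for example, physical protein interactions).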
Information theory was introduced by Claude Shannon in 1948 to precisely characterize data flows in communications systems. The same mathematics can also be fruitfully applied to molecular biology problems. We start with the problem of understanding how proteins interact with DNA at specific sequences called binding sites. Information theory allows us to make an average picture of the binding sites and this can be shown with a computer graphic called a sequence logo (http://www.ccrnp.ncifcrf.gov/~toms/glossary.html#sequence_logo).
Sequence logos show how strongly parts of a binding site are conserved in bits of information. They have been used to study a variety of genetic control systems. More recently the same mathematics has been used to look at individual binding sites using another computer graphic called a sequence walker (http://www.ccrnp.ncifcrf.gov/~toms/glossary.html#sequence_walker). Sequence walkers are being used to predict whether changes in human genes cause mutations or are neutral polymorphisms. It may be possible to predict the degree of colon cancer by this method.
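The idea of measuring conservation in bits can be sketched as follows: at each position of a set of aligned binding sites, the information is the maximum possible uncertainty (2 bits for DNA) minus the Shannon entropy of the observed base frequencies. This is a simplified illustration that omits the small-sample correction used in practice. The example sites and the function name are hypothetical.

```python
import math
from collections import Counter

def information_per_position(sites, alphabet="ACGT"):
    """Return the conservation, in bits, of each column of aligned sites.

    Computes log2(len(alphabet)) minus the Shannon entropy of each column.
    No small-sample correction is applied (a deliberate simplification).
    """
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    max_bits = math.log2(len(alphabet))  # 2 bits for a four-letter alphabet
    n = len(sites)
    bits = []
    for col in range(length):
        counts = Counter(s[col] for s in sites)
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        bits.append(max_bits - entropy)
    return bits

# Hypothetical aligned binding sites, for illustration only.
sites = ["TATAAT", "TATGAT", "TACAAT", "TATACT"]
info = information_per_position(sites)
print([round(b, 2) for b in info], round(sum(info), 2))
```

Fully conserved positions score 2 bits, while positions with mixed bases score less. The sum over positions gives the total information of the site.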
How do genetic systems gain information by evolutionary processes? Information theory was used to observe information gain in the binding sites for an artificial 'protein' in a computer model of evolution. The model begins with zero information and, as in naturally occurring genetic systems, the information measured in the fully evolved binding sites is close to that needed to locate the sites in the genome. The transition is rapid, demonstrating that information gain can occur by punctuated equilibrium (http://www.ccrnp.ncifcrf.gov/~toms/paper/ev).
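The information "needed to locate the sites in the genome" can be estimated with a short calculation. The formula below is not spelled out in the abstract; it is the usual information-theoretic estimate, assumed here: choosing a given number of sites out of all possible positions in a genome requires log2(genome size / number of sites) bits per site. The genome size and site count are illustrative values.

```python
import math

def bits_to_locate_sites(genome_size, num_sites):
    """Bits needed per site to pick out num_sites positions among genome_size.

    Assumes every genome position is an equally likely candidate location.
    """
    if num_sites <= 0 or genome_size < num_sites:
        raise ValueError("need 0 < num_sites <= genome_size")
    return math.log2(genome_size / num_sites)

# Illustrative values: 16 binding sites in a genome of 256 positions.
r_frequency = bits_to_locate_sites(genome_size=256, num_sites=16)
print(r_frequency)
```

Comparing this number with the total information measured in the evolved binding sites is the kind of check the abstract describes.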
Mutational (genetic) robustness is phenotypic constancy in the face of mutational changes to the genome. Robustness is critical to the understanding of evolution because phenotypically expressed genetic variation is the fuel of natural selection. Nonetheless, the evidence for adaptive evolution of mutational robustness in biological populations is controversial. Robustness should be selectively favored when mutation rates are high, a common feature of RNA viruses. However, selection for robustness may be relaxed under virus co-infection because complementation between virus genotypes can buffer mutational effects. We therefore hypothesized that selection for genetic robustness in viruses will be weakened with increasing frequency of co-infection. To test this idea, we used populations of RNA phage phi-6 that were experimentally evolved at low and high levels of co-infection and subjected lineages of these viruses to mutation accumulation through population bottlenecking. The data demonstrate that viruses evolved under high co-infection show relatively greater mean magnitude and variance in the fitness changes generated by addition of random mutations, confirming our hypothesis that they experience weakened selection for robustness. Our study further suggests that co-infection of host cells may be advantageous to RNA viruses only in the short term. In addition, we observed higher mutation frequencies in the more robust viruses, indicating that evolution of robustness might foster less-accurate genome replication in RNA viruses.
Different genes in yeast evolve at dramatically different rates, spanning three orders of magnitude from the fastest evolving to the slowest evolving gene. What is the cause of these differences in evolutionary rate, and what can we learn about genes from their evolutionary rate? It is generally believed that the slowly evolving genes do so because of strong evolutionary constraints. Many different constraints have been proposed, including the genes' importance to the cell (dispensability), the number of interaction partners in the protein interaction network, the genes' size, and the genes' expression levels. I will review the evidence for and against these theories, and will demonstrate that the major determinant of evolutionary rate in yeast is related to how frequently a gene is translated. I will then explain how selective pressure to avoid translation-error-induced misfolding of proteins can slow down the rate of evolution of frequently translated genes, and demonstrate that the predictions from this hypothesis agree well with the available data from yeast.
Recent genome analyses revealed intriguing correlations between variables characterizing the functioning of a gene, such as expression level, connectivity of genetic and protein-protein interaction networks, and knockout effect, and variables describing gene evolution, such as sequence evolution rate and propensity for gene loss. Typically, variables within each of these classes are positively correlated, e.g., products of highly expressed genes also tend to have many protein-protein interactions, whereas variables between classes are negatively correlated, e.g., highly expressed genes tend to evolve slowly. Here we describe principal component (PC) analysis of 7 genome-related variables and propose biological interpretations for the first three principal components. The first two PCs together reflect the intuitive notion of a gene's "importance", or the "status" of a gene in the genomic community, with positive contributions from knockout lethality, expression level and the number of paralogs, and negative contributions from sequence evolution rate and gene loss propensity. The third PC may be interpreted as a gene's "adaptability" whereby genes with high adaptability evolve fast, are relatively often lost during evolution, readily duplicate and are highly expressed, but only under certain conditions. Functional classes of genes substantially vary in status and adaptability, with the highest status characteristic of the translation system and cytoskeletal proteins, and highest adaptability seen in metabolic enzymes and transporters.