September 3, 2002
Neutral evolution is the simplest model of molecular evolution and thus the most amenable to a comprehensive theoretical investigation. In this paper, we characterize the statistical properties of neutral evolution of proteins under the requirement that the native state remains thermodynamically stable, and compare them to those of Kimura's model of neutral evolution. Our study is based on the Structurally Constrained Neutral (SCN) model, which we recently proposed. We sh...
October 10, 2024
Evolutionary theorizing resembles building an aircraft while also piloting it; new results change the scaffold for older ideas, requiring revised strategy to remain airborne. A calculated kinetic pathway exists that, under explicit quantitative assumptions, delivers the SGC (Standard Genetic Code). The pathway and the evidence for it are summarized below, striving for a clearer, more complete account than was possible during its construction. Beginning with experimental amino acid...
November 24, 2017
The standard genetic code is well known to be optimized for minimizing the phenotypic effects of single nucleotide substitutions, a property that was likely selected for during the emergence of a universal code. Given the fitness advantage afforded by high standing genetic diversity in a population in a dynamic environment, it is possible that selection to explore a large fraction of the space of functional proteins also occurred. To determine whether selection for such a pro...
February 14, 2001
The evolution in coding DNA sequences brings new flexibility and freedom to the codon words, even as the underlying nucleotides get significantly ordered. These curious contra-rules of gene organisation are observed from the distribution of words and the second moments of the nucleotide letters. These statistical data give us the physics behind the classification of bacteria.
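As a rough illustration of the two statistics named in this abstract, the sketch below counts non-overlapping three-letter codon words in a coding sequence and computes a second moment of the nucleotide composition. The abstract does not define its "second moments", so the sum of squared nucleotide frequencies used here is only one assumed reading. The function names and the toy sequence are hypothetical and do not come from the paper.

```python
from collections import Counter


def codon_counts(seq):
    """Count non-overlapping 3-letter words, reading in frame from position 0."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # drop a trailing partial codon, if any
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))


def nucleotide_second_moment(seq):
    """Sum of squared nucleotide frequencies.

    ASSUMPTION: this is one possible reading of "second moment"; the
    abstract does not give a definition. The value is 0.25 for a uniform
    A/C/G/T composition and 1.0 when a single letter dominates completely.
    """
    seq = seq.upper()
    n = len(seq)
    return sum((count / n) ** 2 for count in Counter(seq).values())


toy = "ATGGCCATGGCCTAA"  # made-up sequence, for illustration only
words = codon_counts(toy)
moment = nucleotide_second_moment(toy)
```

Under this reading, a larger second moment means a more skewed, more ordered nucleotide composition, while a flatter codon-count distribution means freer use of codon words.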
January 14, 2003
We have found that the effective survival time of amino acids in organisms follows a power law with respect to the frequency of their occurrence in genes. We have used the mutation data matrix PAM1 PET91 to calculate the selection pressure on each kind of amino acid. The results have been compared to the MPM1 matrix (Mutation Probability Matrix), which represents the pure mutational pressure in the Borrelia burgdorferi genome. The results are universal in the sense that the survival time of amino ...
December 30, 2015
The common understanding of protein evolution has been that neutral or slightly deleterious mutations are fixed by random drift, and evolutionary rate is determined primarily by the proportion of neutral mutations. However, recent studies have revealed that highly expressed genes evolve slowly because of fitness costs due to misfolded proteins. Here we study selection maintaining protein stability. Protein fitness is taken to be $s = \kappa \exp(\beta\Delta G) (1 - \exp(\be...
September 29, 2011
Complete genome sequences contain valuable information about natural selection, but extracting this information for short, widely scattered noncoding elements remains a challenging problem. Here we introduce a new computational method for addressing this problem called Inference of Natural Selection from Interspersed Genomically coHerent elemenTs (INSIGHT). INSIGHT uses a generative probabilistic model to contrast patterns of polymorphism and divergence in the elements of int...
August 6, 1997
We argue that the phenomenon of symmetry breaking in genetics can enhance the adaptability of a species to changes in the environment. In the case of a virus, the claim is that the codon bias in the neutralization epitope improves the virus' ability to generate mutants that evade the induced immune response. We support our claim with a simple "toy model" of a viral epitope evolving in competition with the immune system. The effective selective advantage of a higher mutabili...
June 4, 2021
During their evolution, proteins explore sequence space via an interplay between random mutations and phenotypic selection. Here we build upon recent progress in reconstructing data-driven fitness landscapes for families of homologous proteins, to propose stochastic models of experimental protein evolution. These models predict quantitatively important features of experimentally evolved sequence libraries, like fitness distributions and position-specific mutational spectra. T...
July 29, 2008
The genetic code is nearly universal, and the arrangement of the codons in the standard codon table is highly non-random. The three main concepts on the origin and evolution of the code are the stereochemical theory, the coevolution theory, and the error minimization theory. These theories are not mutually exclusive and are also compatible with the frozen accident hypothesis. Mathematical analysis of the structure and possible evolutionary trajectories of the code shows that it i...