December 30, 2016
Assuming that mutation and fixation processes are reversible Markov processes, we prove that the equilibrium ensemble of sequences obeys a Boltzmann distribution with $\exp(4N_e m(1 - 1/(2N)))$, where $m$ is Malthusian fitness and $N_e$ and $N$ are effective and actual population sizes. On the other hand, the probability distribution of sequences with maximum entropy that satisfies a given amino acid composition at each site and a given pairwise amino acid frequency at each s...
October 19, 2004
Many modified genetic codes are found in specific genomes in which one or more codons have been reassigned to a different amino acid from that in the canonical code. We present a model that unifies four possible mechanisms for reassignment, based on the observation that reassignment involves a gain and a loss. The loss could be the deletion or loss of function of a tRNA or release factor. The gain could be the gain of a new type of tRNA for the reassigned codon, or the gain o...
October 16, 2012
Weak purifying selection, acting on many linked mutations, may play a major role in shaping patterns of molecular evolution in natural populations. Yet efforts to infer these effects from DNA sequence data are limited by our incomplete understanding of weak selection on local genomic scales. Here, we demonstrate a natural symmetry between weak and strong selection, in which the effects of many weakly selected mutations on patterns of molecular evolution are equivalent to a sm...
March 22, 2018
These lecture notes introduce key concepts of mathematical population genetics within the most elementary setting and describe a few recent applications to microbial evolution experiments. Pointers to the literature for further reading are provided, and some of the derivations are left as exercises for the reader.
March 31, 2017
Recent experiments and simulations have demonstrated that proteins can fold on the ribosome. However, the extent and generality of fitness effects resulting from co-translational folding remain open questions. Here we report a genome-wide analysis that uncovers evidence of evolutionary selection for co-translational folding. We describe a robust statistical approach to identify loci within genes that are both significantly enriched in slowly translated codons and evolutionari...
October 20, 2004
Evolving genomes increase the number of their genes through gene duplication. To escape degradation into a functionless pseudogene, a gene duplicate must be protected by negative (purifying) selection against the otherwise inevitable fixation of degenerative mutations. In the present study, we focus on the evolutionary stage at which new duplicates come under such surveillance. Our analyses of several genomes indicate that in about 10% of gene pairs, selection begins to guard a new gene c...
September 25, 2019
A major goal of molecular evolutionary biology is to identify loci or regions of the genome under selection versus those evolving in a neutral manner. Correct identification allows accurate inference of the evolutionary process and thus comprehension of historical and contemporary processes driving phenotypic change and adaptation. A fundamental difficulty lies in distinguishing sites targeted by selection from both sites linked to these targets and sites fully independent of...
August 21, 2009
A quantitative theory of the construction and evolution of the genetic code is proposed. By introducing the concept of mutational deterioration (MD) and developing a theoretical formalism for MD minimization, we have proved that (1) the redundancy distribution of codons in the genetic code obeys the MD minimization principle; (2) the hydrophilic-hydrophobic distribution of amino acids on the code table is global MD (GMD) minimal; (3) the standard genetic code can be deduced from ...
August 6, 1997
A survey of the patterns of synonymous codon preferences in the HIV env gene reveals a relation between the codon bias and the mutability requirements in different regions in the protein. At hypervariable regions in $gp120$, one finds a greater proportion of codons that tend to mutate non-synonymously, but to a target that is similar in hydrophobicity and volume. We argue that this strategy results from a compromise between the selective pressure placed on the virus by the in...
December 9, 2004
In special coordinates (codon position--specific nucleotide frequencies), bacterial genomes form two straight lines in 9-dimensional space: one line for eubacterial genomes, another for archaeal genomes. All 348 distinct bacterial genomes available in GenBank in April 2007 lie on these lines with high accuracy. The main challenge now is to explain the observed high accuracy. The new phenomenon of complementary symmetry for codon position--specific nucleotide frequencie...