May 21, 1999
With the help of a simple 20-letter lattice model of heteropolymers, we investigate the energy landscape in the space of designed good-folder sequences. Low-energy sequences form clusters, interconnected via neutral networks, in the space of sequences. Residues which play a key role in the foldability of the chain and in the stability of the native state are highly conserved, even among the chains belonging to different clusters. If, according to the interaction matrix, som...
May 14, 2004
We have exactly enumerated all sequences and conformations of HP proteins with chains of up to 19 monomers on the simple cubic lattice. For two variants of the hydrophobic-polar (HP) model, where only two types of monomers are distinguished, we determined and statistically analyzed designing sequences, i.e., sequences that have a non-degenerate ground state. Furthermore we were interested in characteristic thermodynamic properties of HP proteins with designing sequences. In o...
January 2, 2002
The hydrophobic/polar HP model on the square lattice has been widely used to investigate basics of protein folding. In the cases where all designing sequences (sequences with unique ground states) were enumerated without restrictions on the number of contacts, the upper limit on the chain length N has been 18-20 because of the rapid exponential growth of the numbers of conformations and sequences. We show how a few optimizations push this limit by about 5 units. Based on thes...
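The notion of a designing sequence used in the abstract above can be illustrated by brute force for very short chains. The following is a minimal sketch, not the optimized enumeration the authors describe: it assumes the standard HP energy of -1 per non-bonded H-H nearest-neighbor contact, counts conformations once per rotation/reflection class, and uses hypothetical helper names (`conformations`, `energy`, `designing_sequences`).

```python
from itertools import product

MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def conformations(n):
    """Self-avoiding walks of n >= 2 monomers on the square lattice.

    The first bond is fixed along +x (removes rotations) and the first
    off-axis step must point up (removes mirror images).
    """
    results = []

    def extend(path, turned):
        if len(path) == n:
            results.append(tuple(path))
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            if not turned and dy < 0:
                continue  # skip the mirror image of an already counted walk
            nxt = (x + dx, y + dy)
            if nxt in path:
                continue  # self-avoidance
            path.append(nxt)
            extend(path, turned or dy != 0)
            path.pop()

    extend([(0, 0), (1, 0)], False)
    return results

def energy(seq, conf):
    """Assumed HP energy: -1 for each H-H pair that is adjacent on the
    lattice but not adjacent along the chain."""
    e = 0
    for i in range(len(seq)):
        for j in range(i + 2, len(seq)):
            if seq[i] == "H" and seq[j] == "H":
                (x1, y1), (x2, y2) = conf[i], conf[j]
                if abs(x1 - x2) + abs(y1 - y2) == 1:
                    e -= 1
    return e

def designing_sequences(n):
    """Map each sequence with a unique ground state to that conformation."""
    confs = conformations(n)
    found = {}
    for letters in product("HP", repeat=n):
        seq = "".join(letters)
        energies = [energy(seq, c) for c in confs]
        e_min = min(energies)
        if energies.count(e_min) == 1:  # non-degenerate ground state
            found[seq] = confs[energies.index(e_min)]
    return found

if __name__ == "__main__":
    for seq, conf in sorted(designing_sequences(4).items()):
        print(seq, conf)
```

For chains of four monomers the only conformation with a contact is the U-shaped one, so the four sequences with H at both ends come out as designing. The cost grows exponentially in both the number of sequences and the number of conformations, which is why exhaustive studies are limited to short chains.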
March 28, 2003
Only about 1,000 qualitatively different protein folds are believed to exist in nature. Here, we review theoretical studies which suggest that some folds are intrinsically more designable than others, i.e., are lowest energy states of an unusually large number of sequences. The sequences associated with these folds are also found to be unusually thermally stable. The connection between highly designable structures and highly stable sequences is generally known as the "de...
December 15, 1998
Analysis of the geometric properties of a mean-field HP model on a square lattice for protein structure shows that structures with a large number of switchbacks between surface and core sites are chosen favorably by peptides as unique ground states. Global comparison of model (binary) peptide sequences with concatenated (binary) protein sequences listed in the Protein Data Bank and the Dali Domain Dictionary indicates that the highest correlation occurs between model peptides ...
July 10, 2002
We study the space of all compact structures on a two-dimensional square lattice of size $N=6\times6$. Each structure is mapped onto a vector in $N$-dimensions according to a hydrophobic model. Previous work has shown that the designabilities of structures are closely related to the distribution of the structure vectors in the $N$-dimensional space, with highly designable structures predominantly found in low density regions. We use principal component analysis to probe and c...
September 6, 1997
Protein structures are a very special class among all possible structures. It was suggested that a "designability principle" plays a crucial role in nature's selection of protein sequences and structures. Here we provide a theoretical base for such a selection principle, using a novel formulation of the protein folding problem based on hydrophobic interactions. A structure is reduced to a string of 0's and 1's which represent the surface and core sites, respectively, as the...
August 23, 2002
Here we present an approximate analytical theory for the relationship between a protein structure's contact matrix and the shape of its energy spectrum in amino acid sequence space. We demonstrate a dependence of the number of sequences of low energy in a structure on the eigenvalues of the structure's contact matrix, and then use a Monte Carlo simulation to test the applicability of this analytical result to cubic lattice proteins. We find that the lattice structures with th...
September 17, 2001
Despite the variety of protein sizes, shapes, and backbone configurations found in nature, the design of novel protein folds remains an open problem. Within simple lattice models it has been shown that not all structures are equally suitable for design. Rather, certain structures are distinguished by unusually high designability: the number of amino-acid sequences for which they represent the unique ground state; sequences associated with such structures possess both robustne...
November 25, 1998
Making use of a simplified model for protein folding, it can be shown that conformations which are particularly stable when their energy is minimized with respect to amino acid sequence (in the sense that they display a large energy gap to the lowest structurally dissimilar conformation), aside from leading to fast folding, are highly designable (in the sense that many sequences target onto them in the folding process). These results are quite general, do not depend on the parti...