January 2, 2025
Proteins are sequences of amino acids that serve as the basic building blocks of living organisms. Despite rapidly growing databases documenting structural and functional information for various protein sequences, our understanding of proteins remains limited because of the large possible sequence space and the complex inter- and intra-molecular forces. Deep learning, which is characterized by its ability to learn relevant features directly from large datasets, has demonstrat...
June 11, 2020
The prediction of protein secondary and tertiary structures from the primary amino acid sequence is both incredibly important and incredibly difficult. Accurate prediction of a protein's native structure can provide critical insights about its function, ultimately leading to breakthroughs in drug design and disease diagnosis. The field has a rich history, from the earliest folding experiments in the 1960s to the use of state-of-the-art algorithms today; this articl...
April 17, 2018
MOTIVATION: Proteins fold into complex structures that are crucial for their biological functions. Experimental determination of protein structures is costly and therefore limited to a small fraction of all known proteins. Hence, different computational structure prediction methods are necessary for the modelling of the vast majority of all proteins. In most structure prediction pipelines, the last step is to select the best available model and to estimate its accuracy. This ...
October 27, 2022
Deep learning and big data have shown tremendous success in bioinformatics and computational biology in recent years; artificial intelligence methods have also contributed significantly to the task of protein function classification. This review paper analyzes recent developments in approaches to predicting protein function using deep learning. We explain the importance of determining protein function and why automating this task is crucial. Then...
August 13, 2018
Beta-turn prediction is useful in protein function studies and experimental design. Although recent approaches using machine-learning techniques such as SVM, neural networks, and K-NN have achieved good results for beta-turn prediction, there is still significant room for improvement. As previous predictors utilized features in a sliding window of 4-20 residues to capture interactions among sequentially neighboring residues, such feature engineering may result in incomplete ...
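
The sliding-window featurization mentioned in this abstract can be illustrated with a minimal sketch. The function name `sliding_windows` and the example sequence are hypothetical and are not taken from the paper; the sketch only shows how fixed-width windows of neighboring residues are cut from a sequence, not how any particular predictor encodes them.

```python
from typing import List


def sliding_windows(sequence: str, size: int) -> List[str]:
    """Return every contiguous window of `size` residues, moving one residue at a time.

    Hypothetical helper for illustration; real predictors typically turn each
    window into numeric features (for example, profile columns) afterwards.
    """
    if size <= 0:
        raise ValueError("window size must be positive")
    # A sequence shorter than the window yields no windows at all.
    return [sequence[i:i + size] for i in range(len(sequence) - size + 1)]


if __name__ == "__main__":
    # Toy six-residue sequence, window of 4 (the lower end of the 4-20 range).
    for window in sliding_windows("ACDEFG", 4):
        print(window)
```

Because each window sees only `size` consecutive residues, any interaction between residues farther apart than the window width is invisible to the features, which is the limitation the abstract points to.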
August 28, 2017
Computational elucidation of membrane protein (MP) structures is challenging, partly because too few solved structures are available for homology modeling. Here we describe a high-throughput deep transfer learning method that first predicts MP contacts by learning from non-membrane proteins (non-MPs) and then predicts three-dimensional structure models using the predicted contacts as distance restraints. Tested on 510 non-redundant MPs, our method has contact prediction accur...
November 6, 2013
This thesis aims to solve the template-free protein folding problem by tackling two important components: efficient sampling in a vast conformation space, and the design of knowledge-based potentials with high accuracy. We have proposed the first-order and second-order CRF-Sampler to sample structures from the continuous local dihedral angle space by modeling the lower- and higher-order conditional dependency between neighboring dihedral angles given the primary sequence informa...
September 12, 2017
Motivation: Protein secondary structure prediction can provide important information for protein 3D structure prediction and protein functions. Deep learning, which has been successfully applied to various research fields such as image classification and voice recognition, provides a new opportunity to significantly improve the secondary structure prediction accuracy. Although several deep-learning methods have been developed for secondary structure prediction, there is room ...
December 2, 2015
Protein secondary structure (SS) prediction is important for studying protein structure and function. When only the sequence (profile) information is used as the input feature, the best current predictors obtain ~80% Q3 accuracy, a figure that has not improved in the past decade. Here we present DeepCNF (Deep Convolutional Neural Fields) for protein SS prediction. DeepCNF is a Deep Learning extension of Conditional Neural Fields (CNF), which is an integration of Conditional Ra...
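
For readers unfamiliar with the Q3 metric cited above, the following is a minimal sketch of how it is commonly computed: the fraction of residues whose predicted three-state label (helix H, strand E, coil C) matches the observed label. The function name `q3_accuracy` and the toy label strings are assumptions for illustration and do not come from the DeepCNF paper.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues whose three-state label (H, E, or C) is predicted correctly.

    Hypothetical helper for illustration; both strings must have one label per residue.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    # Count positions where the predicted state equals the observed state.
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


if __name__ == "__main__":
    # Toy example: five of six residues agree.
    print(q3_accuracy("HHHECC", "HHHEEC"))
```

A reported Q3 of ~80% therefore means that roughly four in five residues are assigned the correct state, usually averaged over a benchmark set of proteins.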
November 29, 2018
The inapplicability of amino acid covariation methods to small protein families has limited their use for structural annotation of whole genomes. Recently, deep learning has shown promise in allowing accurate residue-residue contact prediction even for shallow sequence alignments. Here we introduce DMPfold, which uses deep learning to predict inter-atomic distance bounds, the main chain hydrogen bond network, and torsion angles, which it uses to build models in an iterative f...