Computing Reviews, the leading online review service for computing literature.

Accurate prediction of protein disordered regions by mining protein structure data
Cheng J., Sweredoski M., Baldi P. Data Mining and Knowledge Discovery 11(3): 213-222, 2005. Type: Article

Date Reviewed: Oct 11 2006

Disordered regions play an essential role in proteins: they affect biological properties and behavior, as well as the methods used to investigate them. It is therefore important to determine and predict where these regions lie. The authors report on their method, program, and results for predicting intrinsically disordered regions in proteins. The method draws on the Protein Data Bank (PDB), a structural database available at the University of California.
This is not the authors' first such project: the program, DISpro, is one of several protein data mining tools developed at the Institute for Genomics and Bioinformatics at the University of California, Irvine. These tools are available over the Internet for non-profit applications. Tutorials on bioinformatics topics are also provided, and some of them can help newcomers understand this paper.
The authors have developed a sophisticated method, based on recursive neural networks, that takes long-range contextual information into account while learning only a fixed number of weights. The inputs are the structural properties, predicted secondary structure class, and predicted relative solvent accessibility of nonredundant protein chains selected from the PDB. The network was trained and tested on these chains by ten-fold cross-validation. The resulting network was then tested on CASP5, whose proteins are essentially different from those taken from the PDB. On CASP5, DISpro predicted disordered regions with higher precision than the other predictors tested.
Finally, the paper lists some ideas for refining the method, such as treating short and long disordered regions separately (the authors show, using DISpro, that the two behave differently), and also presents predictions for homologous proteins. Further variations of the method could be incorporated into methods for protein tertiary structure prediction.
The method might be used to cross-relate different types of protein databases: structural, pathway, and protein interaction.
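
The ten-fold cross-validation mentioned above can be illustrated with a short sketch. It is not the authors' code: the function name k_fold_splits and the chain identifiers are hypothetical placeholders, and no real PDB data or neural network is involved. It only shows how a set of protein chains might be divided so that each chain is held out for testing exactly once.

```python
import random


def k_fold_splits(items, k=10, seed=0):
    """Yield (train, test) lists for k-fold cross-validation.

    Hypothetical helper for illustration only.
    """
    if k < 2 or k > len(items):
        raise ValueError("k must be between 2 and the number of items")
    shuffled = list(items)
    # A fixed seed keeps the shuffle reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    # Fold i takes every k-th item starting at offset i,
    # so fold sizes differ by at most one.
    folds = [shuffled[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        # The training set is everything outside the held-out fold.
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test


# Placeholder chain identifiers, not real PDB entries.
chains = [f"chain_{n:03d}" for n in range(25)]
splits = list(k_fold_splits(chains, k=10))
```

In each of the ten rounds, a model would be trained on the train list and evaluated on the test list, and the ten scores would then be averaged.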

Reviewers: K. Balogh, Zsofia Balogh	Review #: CR133422 (0708-0813)

Data Mining (H.2.8 ... )

Biology And Genetics (J.3 ... )

Pattern Analysis (I.5.2 ... )

Pattern Matching (F.2.2 ... )

Structural (I.5.1 ... )

Design Methodology (I.5.2)

Reproduction in whole or in part without permission is prohibited. Copyright 1999-2024 ThinkLoud®