Computing Reviews, the leading online review service for computing literature.

Literature extraction of protein functions using sentence pattern mining
Chiang J., Yu H. IEEE Transactions on Knowledge and Data Engineering 17(8): 1088-1098, 2005. Type: Article

Date Reviewed: Feb 2 2006

Modern high-throughput biology requires computationally accessible descriptions of the roles of proteins. The Gene Ontology (GO) controlled vocabulary for annotation of protein function, biological process, and cellular location has become the community standard for protein annotation. Chiang and Yu describe an approach to sentence pattern mining that automatically extracts GO terms for proteins from the biomedical literature. GO annotation has been one of the recent tasks for the Text Retrieval Conference (TREC) information retrieval competition.

Automated methods for extracting GO term-protein relationships must first identify the natural language expressions that correspond to both protein names and GO terms. Extraction of protein names is relatively well studied, and Chiang and Yu found it to be the easier of the two tasks. Extraction of GO terms relies on recognizing variants of each term based on morphological, syntactic, and semantic rules. After GO terms and protein names are identified, sentences in which a GO term and a protein name co-occur are parsed to obtain phrases describing protein function. The resulting phrase structure is used as input for sentence pattern mining.

Chiang and Yu’s results on the TREC 2003 data are comparable to the benchmark results, although their figures make the comparison somewhat difficult. This paper provides a nice overview of the critical issues that must be addressed in extracting GO-protein relationships from the literature, describes a promising approach for automated extraction, illustrates the difficulty of extracting GO terms at the same depth in the hierarchy as that obtained by manual annotation, and points out promising avenues for future research.
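The co-occurrence step described above can be illustrated with a minimal sketch. It is not Chiang and Yu's implementation: the tiny lexicons, the naive sentence splitter, and the function names `split_sentences` and `find_cooccurrences` are all assumptions made for illustration, and the morphological, syntactic, and semantic variant rules are reduced to a hand-written list of variants per GO term.

```python
import re

# Hypothetical toy lexicons (assumptions for illustration only). A real system
# would use a curated protein dictionary and rule-generated GO term variants.
PROTEIN_NAMES = {"p53", "BRCA1"}
GO_TERM_VARIANTS = {
    "GO:0006915": ["apoptosis", "apoptotic process", "programmed cell death"],
    "GO:0006281": ["DNA repair", "repair of DNA"],
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_cooccurrences(text):
    """Return (sentence, protein, go_id) for sentences mentioning both."""
    hits = []
    for sentence in split_sentences(text):
        lowered = sentence.lower()
        # Whole-word, case-insensitive match on protein names.
        proteins = [
            p for p in sorted(PROTEIN_NAMES)
            if re.search(r"\b" + re.escape(p.lower()) + r"\b", lowered)
        ]
        # A GO term counts as mentioned if any listed variant appears.
        go_ids = [
            go_id for go_id, variants in sorted(GO_TERM_VARIANTS.items())
            if any(v.lower() in lowered for v in variants)
        ]
        for protein in proteins:
            for go_id in go_ids:
                hits.append((sentence, protein, go_id))
    return hits


if __name__ == "__main__":
    abstract = (
        "p53 induces apoptosis in damaged cells. "
        "BRCA1 is expressed in breast tissue. "
        "BRCA1 participates in DNA repair."
    )
    for sentence, protein, go_id in find_cooccurrences(abstract):
        print(protein, go_id, "|", sentence)
```

Only the sentences returned here would go on to parsing and pattern mining. The second sentence in the sample text is dropped because it names a protein but no GO term.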

Reviewer: Susan Bridges	Review #: CR132392 (0609-0962)

Selection Process (H.3.3 ... )

Biology And Genetics (J.3 ... )

Data Mining (H.2.8 ... )

Search Process (H.3.3 ... )

Text Processing (I.5.4 ... )

Applications (I.5.4)
