Computing Reviews, the leading online review service for computing literature.

Search

Probabilistic parsing strategies
Nederhof M., Satta G. Journal of the ACM53 (3):406-436,2006.Type:Article

Date Reviewed: May 29 2007

Probabilistic parsing is an area of natural language processing that has attracted considerable attention from researchers in the last decade. This growing interest has been fostered by widespread availability of corpora (or treebanks), which have served as benchmarks for much empirical work. Yet, while heuristics for improving parsing accuracy have been intensively investigated, the theoretical foundations of probabilistic parsing have received far more scattered scrutiny. Nederhof and Satta’s paper tackles the latter issue, exploring the relationship between well-understood, purely symbolic context-free parsing strategies and their less-understood probabilistic counterparts. The authors start by characterizing a general notion of parsing strategy as a mapping from context-free grammars to equivalent push-down automata, thus abstracting away the manner in which particular parsing algorithms handle nondeterminism. According to this view, a probabilistic parsing strategy is one such mapping where rules and transitions are assigned probabilities. The main question addressed in the paper is therefore the following: Under which conditions can a parsing strategy be employed to map probabilistic context-free grammars to probabilistic push-down automata so that the probability distribution induced on the former is preserved in the latter? The paper shows, through mapping to push-down transducers, that parsing strategies whose transducers allow “dead” computations (those that lack the correct prefix property) cannot be extended to become probabilistic parsing strategies. A “strong predictiveness property” is also defined, which, together with the correct prefix property, gives a sufficient condition for the extendibility of a parsing strategy. In light of these properties, the authors analyze various parsing algorithms, showing for instance that generalized LR parsing, which has the correct prefix but lacks the strong predictiveness property, cannot be extended to become a probabilistic parsing strategy. It is slightly disappointing that the paper says very little about lexicalized parsing, which is currently the topic of greatest practical interest in the area. Still, the paper is timely and contains plenty of practical as well as theoretical contributions to recommend its reading.

Reviewer: Saturnino Luz	Review #: CR134330 (0804-0381)

Parsing (F.4.2 ... )

Language Models (I.2.7 ... )

Statistical Computing (G.3 ... )

Natural Language Processing (I.2.7 )

Probability And Statistics (G.3 )

Would you recommend this review?

yes

Other reviews under "Parsing":	Date

Parsing theory volume 2: LR(K) and LL(K) parsing Sippu S., Soisalon-Soininen E., Springer-Verlag New York, Inc., New York, NY, 1990. Type: Book (9780387517322)	Oct 1 1991

Upper bounds on the size of LR(k) parsers Ukkonen E. Information Processing Letters 20(2): 99-103, 1985. Type: Article	Nov 1 1985

LR parsing: theory and practice Chapman N., Cambridge University Press, New York, NY, 1987. Type: Book (9789780521304139)	Sep 1 1988

more...

Reproduction in whole or in part without permission is prohibited. Copyright 1999-2024 ThinkLoud^®
Terms of Use | Privacy Policy