Computing Reviews, the leading online review service for computing literature.


Graph data management : fundamental issues and recent developments
Fletcher G., Hidders J., Larriba-Pey J., Springer International Publishing, New York, NY, 2018. 186 pp. Type: Book (978-3-319961-92-7)

Date Reviewed: Nov 13 2020

Graph databases are now key in supporting the semantic web and linked open data, as well as in application domains such as software engineering, geographical databases, social networks, telecommunication, bioinformatics, and chemistry, and for datasets like DBLP. Leading experts in the area of graph data management give an overview of current developments and recent advances in both theory and practice.

Chapter 1 provides an introduction to graph data management, a field that includes “the research and development of powerful technologies for storing, processing, and analyzing large volumes of graph data.” The chapter introduces typical graph database models, including a classification of graph queries based on the usual formal apparatus. While it presents “a short review of graph query languages,” it contains only a few short examples of queries. The chapter also mentions graph database management systems (GDBMS), graph database systems (GDBS), and graph processing frameworks, and includes a list of related software products.

Chapter 2 is devoted to graph visualization, that is, the process of creating and drawing a graph in a way that humans can understand. It discusses the geometric properties of graph visualization methods. The authors present two general methods: the topology-shape-metrics approach and the energy-based layout method. They then briefly discuss some other methods suited to special types of graphs, such as directed graphs and trees.

A special problem related to GDBS is motif discovery, that is, finding subgraphs that occur with statistically significant frequency. This task occurs, for example, “in biological, natural, and even economic systems.” Chapter 3 presents “a brief [overview] of the state of the art in motif finding.” The approach is based on gTrie data structures extended to support labels; this speeds up the search for subgraphs, which matters because the number of possible subgraphs can grow exponentially.
The chapter presents the algorithms in detail and discusses possible indexes for querying motifs. The rest of the chapter is devoted to alternative methods for calculating statistical significance.

The longest chapter (4), “Applications of Flexible Querying to Graph Data,” covers “graph query languages, applications, [and] flexible querying techniques and implementations.” Flexibility in this context refers to automatic “changes to the user’s query so as to find additional or different answers,” thus helping users retrieve relevant information. The author focuses on query relaxation and approximate answering in the context of Resource Description Framework (RDF) data and the SPARQL query language, including the computation of answers for associated queries. All these approaches rest on sound theoretical foundations and are supported by nontrivial figures. This chapter is one of the most successful parts of the book.

Chapter 5 focuses on the parallel processing of large graphs, covering representative graph processing systems, general design principles, and various graph computation paradigms. Typically, for GDBS, “different graph representations lead to different computation [problems],” each of which “may be suitable for solving a certain range of problems.” The authors review “a few graph computation paradigms for online query processing and offline analytics.” For the former, they consider large distributed graphs with an asynchronous fan-out search and index-free query processing. The latter is based on the “MapReduce computation paradigm and vertex-centric computation paradigm.” The chapter ends with valuable tables containing, for example, representative graph processing systems and their properties. A survey of subgraph matching algorithms documents that indexing is not feasible in large graphs for some types of queries; consequently, index-free query processing is necessary.
Chapter 6 concludes the book and offers a survey of “benchmarking approaches for graph processing systems.” The authors describe the main features of such benchmarks and then look at “benchmarks for RDF databases, benchmarks for graph databases, benchmarks for parallel and distributed graph processing systems, and data-only benchmarks.” The detailed tables sufficiently document this analysis.

In conclusion, the book is recommended for almost anyone working on the storage and processing of graph data, though it is aimed at information technology (IT) professionals rather than beginners. The book may benefit specialists developing new GDBMS, GDBS, and graph processing frameworks.

Reviewer: J. Pokorny	Review #: CR147107 (2104-0072)

Database Manager (H.2.4 ... )

Distributed Databases (H.2.4 ... )

General (H.2.0 )

Data Structures (E.1 )


Reproduction in whole or in part without permission is prohibited. Copyright 1999-2024 ThinkLoud®