Compact data structures address two aspects of the problem posed by large quantities of data: efficient storage of the data, and efficient access and processing of the data by the algorithms suitable for defined classes of problems. (The concept is closely related to succinct data structures, but allows for more divergence from information-theoretic lower bounds.) The field deals with the design of data structures that manage the interactions between space requirements for data storage and supporting the desired operations (generally query) as efficiently as is feasible. These interactions are not necessarily conflicts, and managing them efficiently requires that consideration be given to machine architecture issues (such as cache capacity, main memory, and “disk” capacity--even if the last is a modern solid-state device) as well as representational structure, algorithms for the desired operations, and pure space and time complexity. This book focuses on theory, data structures, and algorithms, and is intended as an introduction to the field of compact data structures, focusing on those that have been implemented or whose implementation is believed by the author to be feasible and efficient.

The introductory chapter is followed by an introduction to basic concepts in information theory and data compression. Ten chapters follow, dealing with increasingly advanced aspects of the field, followed by a final chapter that surveys research topics and specialized classes of problems. Concepts are discussed progressively and even readers interested primarily in one of the more advanced applications mentioned in one of the later chapters will find it necessary to read--or at least skim--some of the material in the preceding chapters. Chapter 12 deals with dynamic data structures, which allow updates as well as queries. Examples of applications of the theory and techniques are briefly discussed in each appropriate chapter. For example, chapters 7 and 8 include (among other things) examples of applying the concepts to storage of Extensible Markup Language (XML) documents; Chapter 10, on grid problems, discusses window queries for map services for street networks; and chapter 11 covers more advanced text retrieval applications.

Bibliographic material is included in each chapter, some of which will be familiar to computer science (CS) undergraduate and graduate students, researchers, and professionals. A summary of relevant current software repositories is included in chapter 1.

The intended audience includes advanced CS undergraduates, graduate students, researchers, and practitioners faced with the problem of implementing applications operating on large quantities of data. Prior training and programming experience in algorithms and data structures, such as may be obtained from at least a mid-level CS undergraduate course, is essential to getting the most out of this book.

The explanations are clear and accessible, the content substantial and the information level high, the illustrations appropriate and informative, and the subject matter timely and increasingly relevant. This text will provide a good foundation in theory and practical issues and should be a useful introduction to the field for students, researchers, and developers desiring a solid foundational text. Those with specific problems not covered here should be able to find their way to the relevant advanced literature with the guidance of the basic theory and bibliographic notes and references provided in this volume.