Computing Reviews
Big data analytics in genomics
Wong K., Springer International Publishing, New York, NY, 2016. 428 pp. Type: Book (978-3-319412-78-8)
Date Reviewed: Jul 10 2017

The discovery of the DNA molecule’s structure and sequence underlines the importance of its diverse functions within living organisms. The problem of understanding those functions is analogous to, but much more complicated than, reverse-engineering abstract data types implemented in object-oriented languages from their binary code. Even a partial understanding of gene structure, the process of gene expression, and its regulation is essential from both scientific and medical points of view. The huge number of nucleotides along the DNA molecule is an immediately apparent burden of the problem. The complexity of their role in the eukaryotic cell cycle, proliferation, and differentiation makes the problem extremely difficult. For instance, genotype-phenotype correlations, the polygenic causes of numerous diseases, and the many-to-many relationship between genotypes and phenotypes in cancer make the problem one of the most difficult intellectual challenges.

To handle this complexity, research and practice initially dealt only with selected groups of genes (candidate-/dedicated-gene approaches) in the DNA, within the context given by pathological/morphological analysis of the tissue sample. This has been extended to the investigation of RNA or protein biomarkers, enabling prognostic and theranostic applications. Classical molecular diagnostic methods range from real-time polymerase chain reaction (PCR), fluorescent in situ hybridization (FISH), and classical DNA and RNA sequencing methods to immunohistochemistry (IHC) and flow cytometry analysis. During recent decades, new molecular methods have been developed, such as comparative genomic hybridization (CGH), single nucleotide polymorphism (SNP) arrays, and new platforms for high-throughput next- and third-generation sequencing, which are efficient for analyzing the whole genome/exome. These techniques produce a growing number of datasets, but only a fraction of them are well understood and properly annotated. Direct manual annotation is hindered by limited human capacity, so computer-aided methods should be applied. Because of data size and complexity, different statistical and machine learning methods are used, each dedicated to specific subgoals and suited to scientific or therapeutic purposes. For instance, the tools of Illumina sequencing are too laborious and time-consuming for clinical application, but are useful for research. By contrast, the Ion Torrent targeted sequencing services efficiently support daily clinical practice.

As a consequence of the size and complexity of the problems, international consortia build and share clinical-quality datasets that have standardized structure and annotations and are equipped with interoperable access methods. Besides The Cancer Genome Atlas (TCGA; http://cancergenome.nih.gov/) mentioned in this book, a newer project with similar aims looks promising: the Genomics Evidence Neoplasia Information Exchange (GENIE; http://www.aacr.org/RESEARCH/RESEARCH/PAGES/AACR-PROJECT-GENIE.ASPX) from the American Association for Cancer Research (AACR).

This book refers to most of the previously mentioned methods and tools and, in 13 chapters, presents many others that operate on the whole genome/exome (and therefore on big data). Some dimensions of the rich repertoire covered are outlined in the following:

  • the object of the method/implementation is either the DNA, the RNA, or the protein;
  • investigations are applied to somatic and constitutional genetics;
  • the themes are research or applications in healthcare; and
  • the type, purpose, and efficiency of the method/tool/knowledge base are described.

The themes are arranged in three parts, from introductory to advanced/specific: statistical, computational, and cancer analytics. The structure and style of the chapters are clear; the notions used, the aims, the solutions, and the results are well described; and the problems and methods are discussed understandably. The chapters deal with their themes in a straightforward way, giving details without getting lost in them. Each chapter has a long list of references (but only some of them are cited in the chapter text).

Some shortcomings are of minor importance: a global index and a list of abbreviations are lacking. The book does not contain color pages; pictures are grayscale only. Many pictures obviously had a color original (and some of their descriptions still refer to specific colors); however, in most cases the gray tones give enough information to deliver the intended message.

In my opinion, the main value of this book is its introduction to the information and communications technology (ICT) support of genetics: a broad range of state-of-the-art methods and developed tools applied in diagnostics and research. I recommend this rich overview to both newcomers and professionals, with either ICT or medical backgrounds, who tackle research problems or treat patients in the field of molecular biology/genetics/oncology.

Reviewers: K. Balogh, Zsofia Balogh
Review #: CR145410 (1709-0589)
Content Analysis And Indexing (H.3.1)
Biology And Genetics (J.3 ...)
Data Mining (H.2.8 ...)
Database Applications (H.2.8)
 

Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®