Computing Reviews
EpiMiner: a three-stage co-information based method for detecting and visualizing epistatic interactions
Shang J., Zhang J., Sun Y., Zhang Y. Digital Signal Processing 24: 1-13, 2014. Type: Article
Date Reviewed: Jul 7 2014

Common human diseases are the result of complex interactions between many genetic and environmental factors. Parametric statistical methods such as logistic regression are underpowered to detect interactions because parameter estimates become unstable in sparse data. Computational alternatives such as those from machine learning and artificial intelligence have an important role to play in embracing the complexity of the genotype-to-phenotype relationship. Previous work in this area has focused on existing methods such as association rule mining, neural networks, and random forests and on novel approaches such as multifactor dimensionality reduction. Despite this prior work, there is still a tremendous need for novel genetic analysis strategies.

The study by Shang et al. approaches the problem from a digital signal processing (DSP) point of view. That is, the true genetic variations that confer risk need to be detected and amplified from a genome-wide sea of noisy genetic variation. They present here a three-stage approach called epistasis Miner or epiMiner, whose name comes from the Greek-derived word epistasis, which literally means one gene standing upon (interacting with) another gene. Stage one of this method employs a co-information index (CII) approach that selects a subset of genetic variants from a ranked list. The CII is an entropy-based measure of the information about disease risk that one genetic variant provides through its interaction with one or more other variants. Stage two employs a permutation test to identify those genetic variants that have CII values different from what would be expected under the null hypothesis of no association. The final stage is to visualize the genetic interactions using networks.
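The first two stages can be illustrated with a small sketch. This is not the authors' CII or their MATLAB code: it substitutes a generic entropy-based interaction measure (the information a pair of variants carries about disease status beyond what each carries alone) and a plain label-shuffling permutation test. The function names, the number of permutations, and the toy data are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def entropy(*columns):
    """Shannon entropy (in bits) of the joint distribution of the columns."""
    n = len(columns[0])
    counts = Counter(zip(*columns))
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def interaction_gain(a, b, y):
    """I(A,B;Y) - I(A;Y) - I(B;Y): a stand-in for an interaction score.

    Positive values mean the pair (a, b) tells us more about y together
    than the two variants do separately. This is NOT the paper's CII.
    """
    mi_a = entropy(a) + entropy(y) - entropy(a, y)
    mi_b = entropy(b) + entropy(y) - entropy(b, y)
    mi_ab = entropy(a, b) + entropy(y) - entropy(a, b, y)
    return mi_ab - mi_a - mi_b


def permutation_p_value(a, b, y, n_perm=200, seed=0):
    """Share of label shuffles whose score is at least the observed one."""
    rng = random.Random(seed)
    observed = interaction_gain(a, b, y)
    shuffled = list(y)  # copy so the caller's labels are not modified
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if interaction_gain(a, b, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): disease status is the XOR of two binary variants,
# so neither variant is informative alone but the pair is fully informative.
variant_a = [0, 0, 1, 1] * 25
variant_b = [0, 1, 0, 1] * 25
status = [x ^ z for x, z in zip(variant_a, variant_b)]

gain = interaction_gain(variant_a, variant_b, status)
p_value = permutation_p_value(variant_a, variant_b, status)
```

In this toy case each variant has zero mutual information with disease status on its own, which is the kind of purely interactive effect that single-variant tests miss. Real genotypes take three values per variant and the score would be computed over many candidate pairs, so the permutation step would also need a correction for multiple testing.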

The epiMiner approach was implemented in MATLAB and applied to simulated gene-gene interactions where the truth is known. The simulation results were generally positive, highlighting the usefulness of the algorithm. A strength of the study was the application to a real dataset consisting of 103,611 genetic variants measured in 96 subjects with age-related macular degeneration (AMD) and 50 healthy controls. The algorithm confirmed several known genetic associations and revealed some new candidates. There is tremendous potential for this algorithm and other similar approaches that have been previously published in this area.

It is important to point out some of the limitations of the study and some of the questions that should motivate future work in this area. One concern is that the sample size of the real data used was exceedingly small given current standards in human genetics. Numerous publicly available datasets include thousands of human subjects with hundreds of thousands or even more than a million genetic measurements. Human genetics is squarely in the era of big data. It will be important to apply this algorithm to larger, higher-quality AMD datasets. Furthermore, the gold standard in human genetics is to report genetic associations only when they replicate across multiple independent datasets. This presents an interesting issue that needs to be explored further in the context of detecting nonlinear gene-gene interactions, since these types of genetic effects may not replicate as allele frequencies shift from sample to sample. Another opportunity for future work is to explore ways in which experimental data can be directly integrated into methods such as epiMiner to provide expert knowledge for guiding search algorithms and, importantly, to assist with the interpretation of results.

There is great opportunity to develop and apply computational methods in this domain. The study by Shang et al. makes an important contribution to this growing area.

Reviewer:  Jason Moore Review #: CR142477 (1410-0893)
Signal Processing (I.5.4 ... )
Biology And Genetics (J.3 ... )