Computing Reviews
Pattern recognition of big nutritional data in RCT
Wang J., Fang H., Wang H., Olendzki G., Wang C., Ma Y. BodyNets 2013 (Proceedings of the 8th International Conference on Body Area Networks, Boston, MA, Sep 30-Oct 2, 2013), 394-400. 2013. Type: Proceedings
Date Reviewed: Feb 18 2014

Categorized broadly as unsupervised learning algorithms, clustering techniques are often used to help reveal underlying patterns, particularly in large datasets. This paper describes research to analyze the effectiveness of different dietary prescriptions on a limited set of patients of different races and genders. For a detailed analysis, the authors chose a nutritional dataset from a randomized controlled trial (RCT) in the medical diagnosis domain, a choice that sets this research apart from similar studies. The authors claim that their multi-clustering approach to the analysis of nutritional data from an RCT, along with multiple validation steps, is the first of its kind.

The multi-clustering approach uses some popular and emerging unsupervised clustering techniques, such as the probability-based Gaussian mixture model (GMM), hidden Markov random fields (HMRFs), neural networks based on self-organizing maps (SOMs), k-means, and agglomerative hierarchical methods. With these tools, the authors attempt to identify hidden dietary patterns in the nutritional dataset. The work aims to provide a more accurate and comprehensive recognition of patterns in big data from RCTs, especially when they contain a large number of variables (over one hundred, in this case). I found it interesting that while most of these clustering techniques have been used independently on prior occasions, seldom have they been used collectively on a single problem for comparative studies. The authors also list some interesting highlights and drawbacks of each clustering approach when used independently.
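
To make the idea of running several clustering techniques on one problem concrete, here is a minimal sketch, not the authors' implementation. It assumes hypothetical toy data (three 2-D blobs from `make_blobs`), covers only two of the five methods (k-means and average-linkage agglomerative clustering; the linkage choice is an assumption), and compares the resulting partitions with the Rand index, a measure the paper may or may not have used.

```python
import itertools
import math
import random


def make_blobs(seed=1, per_blob=20):
    """Hypothetical toy data: three well-separated 2-D blobs with known labels."""
    rng = random.Random(seed)
    centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    points, truth = [], []
    for label, (cx, cy) in enumerate(centers):
        for _ in range(per_blob):
            points.append((cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1)))
            truth.append(label)
    return points, truth


def nearest(point, centers):
    """Index of the center closest to the point."""
    return min(range(len(centers)), key=lambda j: math.dist(point, centers[j]))


def kmeans(points, k, iters=50):
    """Lloyd's k-means with a deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    for _ in range(iters):
        labels = [nearest(p, centers) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # leave an empty cluster's center where it is
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return [nearest(p, centers) for p in points]


def agglomerative(points, k):
    """Naive average-linkage agglomerative clustering, stopped at k clusters."""
    clusters = [[i] for i in range(len(points))]

    def avg_link(a, b):
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > k:
        a, b = min(itertools.combinations(range(len(clusters)), 2),
                   key=lambda ab: avg_link(clusters[ab[0]], clusters[ab[1]]))
        clusters[a].extend(clusters[b])  # a < b, so deleting b keeps a valid
        del clusters[b]
    labels = [0] * len(points)
    for lab, members in enumerate(clusters):
        for i in members:
            labels[i] = lab
    return labels


def rand_index(a, b):
    """Fraction of point pairs on which two labelings agree (same vs. different)."""
    pairs = list(itertools.combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


if __name__ == "__main__":
    points, truth = make_blobs()
    results = {"k-means": kmeans(points, 3), "hierarchical": agglomerative(points, 3)}
    for (name_a, a), (name_b, b) in itertools.combinations(results.items(), 2):
        print(name_a, "vs", name_b, rand_index(a, b))
```

The Rand index compares partitions without needing the cluster labels to match, which is why it suits a side-by-side comparison of methods that number their clusters arbitrarily.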

For this study, the authors applied multiple validation criteria in a cross-validation exercise, varying the number of clusters from two to eight. To determine the optimal number of clusters, they used the Bayesian information criterion (BIC) for GMM, k-means, and SOM, and the deviance information criterion (DIC) for HMRFs. Both the BIC and the DIC were lowest at three clusters, which was therefore taken as the optimal number for the nutritional dataset. This value was then used to assess the accuracy of each clustering technique. The k-means approach achieved the highest accuracy (over 91 percent) and the hierarchical method the lowest (below 62 percent).
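
The selection step described above, scoring each candidate number of clusters and keeping the one with the lowest criterion value, can be sketched as follows. This is an illustration under stated assumptions rather than the paper's procedure: the data are hypothetical 1-D samples, the model is a small Gaussian mixture fitted by EM, and the function names (`fit_gmm_1d`, `bic`, `select_k`) are invented for the example. The BIC is computed in its standard form, -2 log L + p ln n, with p = 3k - 1 free parameters for a k-component 1-D mixture.

```python
import math
import random


def log_gauss(x, mean, var):
    """Log density of a 1-D normal distribution."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def logsumexp(values):
    """Numerically stable log(sum(exp(v)))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def fit_gmm_1d(data, k, iters=150, var_floor=1e-3):
    """Fit a k-component 1-D Gaussian mixture by EM; return its log-likelihood."""
    xs = sorted(data)
    n = len(xs)
    # Deterministic start: means at evenly spaced quantiles, shared variance.
    means = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    grand = sum(xs) / n
    variances = [max(sum((x - grand) ** 2 for x in xs) / n, var_floor)] * k
    weights = [1.0 / k] * k

    def point_logs(x):
        # log(weight_j * N(x | mean_j, var_j)) for every component j
        return [math.log(w) + log_gauss(x, m, v)
                for w, m, v in zip(weights, means, variances)]

    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            logs = point_logs(x)
            norm = logsumexp(logs)
            resp.append([math.exp(v - norm) for v in logs])
        # M-step: re-estimate weights, means, and (floored) variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(spread, var_floor)
    return sum(logsumexp(point_logs(x)) for x in xs)


def bic(log_lik, k, n):
    """BIC for a k-component 1-D mixture: k means, k variances, k-1 weights."""
    return -2.0 * log_lik + (3 * k - 1) * math.log(n)


def select_k(data, candidates=range(2, 9)):
    """Score every candidate number of clusters; the lowest BIC wins."""
    scores = {k: bic(fit_gmm_1d(data, k), k, len(data)) for k in candidates}
    return min(scores, key=scores.get), scores


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical data: three groups centred at 0, 10, and 20.
    data = [rng.gauss(c, 1.0) for c in (0.0, 10.0, 20.0) for _ in range(60)]
    best, scores = select_k(data)
    print("lowest BIC at k =", best)
```

The parameter penalty is what stops the criterion from always preferring more clusters: each extra component must improve the log-likelihood enough to offset an additional 3 ln n.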

Once they had determined the optimal number of clusters, the authors proceeded step-wise, testing to find the optimal number of variables in the nutritional data that still yields the same clustering accuracy. Identifying these significant variables helped establish the relationships among dietary behaviors, weights, and trial conditions.
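
The review does not spell out the step-wise procedure, so the following is only one plausible reading: a greedy backward elimination that drops a variable whenever clustering accuracy does not fall below the all-variables baseline. The data (`make_rows`, with two informative variables and one noise variable), the use of k-means, and the accuracy measure (best match over label permutations) are all assumptions made for illustration.

```python
import itertools
import math
import random


def make_rows(seed=2, per_group=20):
    """Hypothetical data: variables 0 and 1 separate three groups, variable 2 is noise."""
    rng = random.Random(seed)
    centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    rows, truth = [], []
    for label, (cx, cy) in enumerate(centers):
        for _ in range(per_group):
            rows.append((cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1),
                         rng.uniform(-1, 1)))
            truth.append(label)
    return rows, truth


def nearest(point, centers):
    """Index of the center closest to the point."""
    return min(range(len(centers)), key=lambda j: math.dist(point, centers[j]))


def kmeans_labels(points, k, iters=50):
    """Lloyd's k-means with a deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    for _ in range(iters):
        labels = [nearest(p, centers) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return [nearest(p, centers) for p in points]


def accuracy(pred, truth, k):
    """Best agreement with the true labels over all relabelings of the clusters."""
    best = 0
    for perm in itertools.permutations(range(k)):
        best = max(best, sum(perm[p] == t for p, t in zip(pred, truth)))
    return best / len(truth)


def project(rows, variables):
    """Keep only the selected variables of every row."""
    return [tuple(row[v] for v in variables) for row in rows]


def backward_eliminate(rows, truth, k):
    """Drop variables one at a time while accuracy stays at or above the baseline."""
    kept = list(range(len(rows[0])))
    baseline = accuracy(kmeans_labels(project(rows, kept), k), truth, k)
    for v in list(kept):
        trial = [u for u in kept if u != v]
        if not trial:
            break  # never drop the last remaining variable
        if accuracy(kmeans_labels(project(rows, trial), k), truth, k) >= baseline:
            kept = trial
    return kept, baseline


if __name__ == "__main__":
    rows, truth = make_rows()
    kept, baseline = backward_eliminate(rows, truth, 3)
    print("baseline accuracy:", baseline, "variables kept:", kept)
```

A greedy pass like this depends on the order in which variables are tried and needs reference labels to measure accuracy, so it should be read as a sketch of the general idea rather than a reconstruction of the authors' method.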

Overall, I was impressed by the artful comparison of the popular and emerging clustering algorithms in terms of their suitability for tasks involving nutritional big data. However, the readability of the paper is greatly reduced by the presentation of medical diagnostics, and in particular, the interlacing of medical analysis with the discussions toward the end. This situation could have been avoided if the authors had included supporting figures and descriptions to clarify the medical problem and make the diagnostics simpler and easier to understand.

Reviewer: CK Raju
Review #: CR142012 (1405-0371)
  Reviewer Selected
Editor Recommended
 
 
Mathematical Software (G.4)
Clustering (I.5.3)
Model Development (I.6.5)
Life and Medical Sciences (J.3)
 
