The classification of gene data is one of the most interesting areas in biomedical signal processing. The efficiency of a classification algorithm depends mainly on the feature extraction, feature selection, and classification mechanisms employed. Support vector machines (SVMs) have been a popular choice among classifiers for over a decade. The top scoring pair (TSP) classifier was introduced in 2004 for classifying gene expression profiles.
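For readers unfamiliar with TSP, the basic idea can be sketched as follows: among all gene pairs (i, j), choose the pair whose ordering "gene i is expressed below gene j" differs most in frequency between the two classes, then classify a new sample by that single comparison. This is a minimal illustration under stated assumptions, not the implementation evaluated in the paper. The function names `fit_tsp` and `predict_tsp` and the toy data are hypothetical, both classes are assumed to be non-empty, and the tie-breaking step of the original method is omitted (the first best pair found is kept).

```python
from itertools import combinations


def fit_tsp(X, y):
    """Find the top scoring gene pair for a two-class problem.

    X: list of samples, each a list of expression values (one per gene).
    y: list of class labels (0 or 1), one per sample; both classes must occur.
    Returns (i, j, class_if_less, class_otherwise).
    """
    n_genes = len(X[0])
    groups = {c: [x for x, label in zip(X, y) if label == c] for c in (0, 1)}
    best, best_score = None, -1.0
    for i, j in combinations(range(n_genes), 2):
        # p[c] estimates P(X_i < X_j | class c) from the training samples.
        p = {c: sum(x[i] < x[j] for x in rows) / len(rows)
             for c, rows in groups.items()}
        score = abs(p[0] - p[1])
        if score > best_score:  # strict: ties keep the earlier pair
            best_score = score
            # Predict the class in which "gene i < gene j" is more frequent.
            less_class = 0 if p[0] > p[1] else 1
            best = (i, j, less_class, 1 - less_class)
    return best


def predict_tsp(model, x):
    """Classify one sample by comparing the two selected genes."""
    i, j, less_class, other_class = model
    return less_class if x[i] < x[j] else other_class


# Toy data (made up for illustration): 4 samples, 3 genes.
X = [[1, 5, 3], [2, 6, 1], [7, 3, 4], [8, 2, 9]]
y = [0, 0, 1, 1]
model = fit_tsp(X, y)
print(model, predict_tsp(model, [0.5, 9, 4]))
```

Because the decision rests on the relative order of two genes rather than on their absolute values, the rule does not change under any monotone rescaling of a sample, and the selected pair is itself a very small feature set.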
Yoon and Kim performed extensive experiments, employing 10 classifiers (SVM, TSP, and their variants) on 10 publicly available microarray datasets. The paper presents the TSP classifier and its variants comprehensively. For SVM, the authors used two feature selection methods, recursive feature elimination (RFE) and mutual information, as well as bagging and boosting ensemble methods.
Data from the Kent Ridge Biomedical Dataset Repository were used in the experiments. The authors compared the 10 classifiers on measures such as classification accuracy, number of features used, Cohen’s kappa coefficient, and standard error. According to the authors, “TSP family classifiers serve as good feature selection schemes.” In addition, they find that SVM-RFE, a variant of SVM, is the best for practical use, provided that an appropriate number of features is used for the classification.
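Of the measures listed, Cohen’s kappa may be the least familiar: it corrects the observed agreement between true and predicted labels for the agreement expected by chance, kappa = (p_o - p_e) / (1 - p_e). The sketch below shows the computation. The function name `cohen_kappa` and the labels are made up for illustration and are not results from the paper, and the code assumes p_e < 1, that is, the true and predicted labels are not all one and the same class.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa between true and predicted class labels."""
    n = len(y_true)
    # Observed agreement: fraction of samples labeled identically.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement: product of the marginal label frequencies,
    # summed over classes (a class missing from y_pred counts as 0).
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_o - p_e) / (1 - p_e)


# Illustrative labels: 6 of 8 predictions are correct.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
print(cohen_kappa(y_true, y_pred))
```

In this example the accuracy is 0.75, but half of the agreement would be expected by chance, so kappa is lower than the accuracy. This is why kappa is a useful complement to accuracy, particularly on datasets with unbalanced classes.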
The presentation is good and the comparison is interesting. However, the authors do not critically analyze the results of the comparison; had they done so, this paper would have become sought-after material. As it stands, the paper will be useful for bioinformatics researchers working on microarray classification.