Computing Reviews

A novel malware analysis for malware detection and classification using machine learning algorithms
Sethi K., Chaudhary S., Tripathy B., Bera P. SIN 2017 (Proceedings of the 10th International Conference on Security of Information and Networks, Jaipur, India, Oct 13-15, 2017), 107-113, 2017. Type: Proceedings
Date Reviewed: 11/08/18

The authors describe a system that identifies malware. The system works both at the macro level, which distinguishes malware from clean software, and at the micro level, which distinguishes between types of malware, for example, Trojan horse, spyware, and so on. The approach uses machine learning, more specifically various algorithms taken from the Weka package.

An outline of the process follows. A collection of files was scraped from the web to provide the dataset. Sixty percent were used for training and the remaining 40 percent for testing. The files were executed in Cuckoo Sandbox; this run identified which examples were malware for the purposes of training. Custom software extracted the application program interface (API) calls made by each file during its Cuckoo Sandbox run. This analysis produced, for each file, a feature vector recording which API calls were made and how many times each was called; these vectors were the input to the machine learning tools. The authors used the J48 decision tree, random forest, and sequential minimal optimization (SMO) classifiers from the Weka framework.
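To make the feature extraction step concrete, the following is a minimal sketch of how an API-call count vector and a 60/40 split might be built. It is not the authors' custom software: the function names, the sample traces, and the use of a seeded random shuffle before splitting are all assumptions made for illustration.

```python
from collections import Counter
import random


def api_count_vector(calls, vocabulary):
    """Count how often each API name in `vocabulary` occurs in one trace."""
    counts = Counter(calls)
    return [counts[name] for name in vocabulary]


def split_60_40(items, seed=0):
    """Shuffle and split into 60% training / 40% testing.

    The shuffle and the seed are assumptions; the review only states
    the 60/40 proportions.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.6 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical API-call traces, standing in for what a sandbox run might log.
traces = {
    "sample_a": ["CreateFileW", "WriteFile", "WriteFile", "RegSetValueExW"],
    "sample_b": ["CreateFileW", "ReadFile"],
}

# One column per distinct API name seen across all traces.
vocabulary = sorted({name for calls in traces.values() for name in calls})
vectors = {
    sample: api_count_vector(calls, vocabulary)
    for sample, calls in traces.items()
}
```

With a dataset of 220 files, `split_60_40` would place 132 files in the training set and 88 in the test set.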

While all three algorithms produced good results on what the authors admit is a relatively small dataset of 220 files, J48 proved to be 100 percent accurate on both the macro and micro classifications. SMO was at least 90 percent accurate on both problems, whereas random forest achieved 97 percent accuracy on detection but only 66 percent accuracy on classification. The paper concludes with guidance, based on time complexity, accuracy, and precision, as to which algorithm is best suited to different application situations.
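For readers less familiar with the two quality measures named here, the sketch below shows the standard definitions of accuracy and precision. The labels are made-up toy values, not results from the paper, and the function names are chosen for illustration only.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of files whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def precision(true_labels, predicted_labels, positive):
    """Of the files predicted as `positive`, the fraction that truly are."""
    flagged = [t for t, p in zip(true_labels, predicted_labels) if p == positive]
    if not flagged:
        return 0.0  # nothing was predicted as `positive`
    return sum(t == positive for t in flagged) / len(flagged)


# Toy labels for illustration only (not data from the paper).
true_labels = ["malware", "malware", "clean", "clean", "malware"]
predicted_labels = ["malware", "clean", "clean", "malware", "malware"]

toy_accuracy = accuracy(true_labels, predicted_labels)
toy_precision = precision(true_labels, predicted_labels, "malware")
```

On small test sets such as the one implied by a 220-file dataset, each misclassified file shifts these figures noticeably, which is worth keeping in mind when reading the reported percentages.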

Reviewer:  J. P. E. Hodgson Review #: CR146312 (1902-0067)

Reproduction in whole or in part without permission is prohibited.   Copyright 2024 ComputingReviews.com™