Malware Analytics: Review of Data Mining, Machine Learning and Big Data Perspectives


Recent advances in cyber technologies have made human life easier, but they can also carry a heavy cost in economic, psychological, or reputational damage. Such damage may be caused, for instance, by malware variants that propagate in a hidden and mostly untraceable way. Malware analytics deals with the approaches and techniques used to extract the distinguishing characteristics of malware in support of robust cyber defenses. This paper presents the current status of malware research, its challenges, and the methods used to overcome those challenges from the data mining, machine learning, and big data perspectives. We consider these three perspectives because of their extensive computational value; they are often combined to solve a wide range of problems, from security to medicine, finance, and industry. Whether these domains are applied as independent techniques or in combination depends on the nature of the dataset considered. We also propose a framework to address the challenges and open issues prevalent in malware analytics. We hope that this simplified presentation of the most vital approaches in malware analytics will help aspiring researchers and newcomers to the security field explore further, and encourage budding engineers to choose malware analysis as their field of study. Specifically, the analysis of state-of-the-art approaches, with their evaluation, a discussion of pros and cons, and the current challenges and future directions, is intended to benefit everyone interested in malware research.

Publication Title

2019 IEEE Symposium Series on Computational Intelligence, SSCI 2019