A Framework for Analyzing Ransomware using Machine Learning


Ransomware attacks have increased in recent years, causing significant damage and disruption to businesses. Forensic analysis, such as reverse engineering of executables (binary files), is the common practice for examining the characteristics of such malware. In this work, we developed a reverse engineering framework that incorporates feature generation engines and machine learning (ML) to detect ransomware efficiently. The framework performs multi-level analysis (raw binaries, assembly code, libraries, and function calls) to better examine and interpret the purpose of malware code segments. We leverage the object-code dump tool (Linux) and a portable executable (PE) parser to decode binaries into assembly-level instructions and dynamic link libraries (DLLs). The experiments use both ransomware and normal binaries: samples are first pre-processed to extract features, and then different supervised ML techniques are applied to classify them. In the experimental results, the detection accuracy for ransomware samples varied from 76% to 97% depending on the ML technique used. In particular, seven of the eight ML classifiers tested achieved a detection rate of at least 90%. The study also demonstrated that combining static analysis at the ASM level and the DLL level can better distinguish ransomware from normal binaries.
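
The pre-processing step described above can be illustrated with a minimal sketch. The abstract does not state which features the framework generates, so the opcode n-grams and DLL presence flags used here are assumptions chosen for illustration, as are the function names `opcode_sequence`, `ngram_counts`, and `feature_vector`. The sketch assumes the disassembly text is in the tab-separated layout printed by an object-code dump tool, and that the list of imported DLL names has already been obtained from a PE parser.

```python
import re
from collections import Counter

# Matches one instruction line of objdump-style output, for example:
#   "  401000:<TAB>55                   <TAB>push   %ebp"
# Byte-only continuation lines have no mnemonic column and are skipped.
_INSN = re.compile(r"^\s*[0-9a-f]+:\t[0-9a-f ]+\t(\S+)")


def opcode_sequence(disassembly):
    """Return the mnemonics, in order, found in a disassembly listing."""
    ops = []
    for line in disassembly.splitlines():
        match = _INSN.match(line)
        if match:
            ops.append(match.group(1))
    return ops


def ngram_counts(ops, n=2):
    """Count contiguous opcode n-grams (an assumed ASM-level feature)."""
    return Counter(" ".join(ops[i:i + n]) for i in range(len(ops) - n + 1))


def feature_vector(disassembly, dll_names, n=2):
    """Combine ASM-level n-gram counts with DLL-level presence flags."""
    features = {}
    for gram, count in ngram_counts(opcode_sequence(disassembly), n).items():
        features["asm:" + gram] = count
    for name in dll_names:
        # DLL names are case-insensitive on Windows, so normalize them.
        features["dll:" + name.lower()] = 1
    return features
```

Dictionaries of this shape, one per sample, could then be vectorized and passed to any supervised classifier. Keeping the `asm:` and `dll:` prefixes separate makes it easy to train on either level alone or on both combined.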

Publication Title

Proceedings of the 2018 IEEE Symposium Series on Computational Intelligence, SSCI 2018