Machine learning in cybersecurity: a comprehensive survey


Today’s world is highly interconnected owing to the pervasiveness of small personal devices (e.g., smartphones) as well as large computing devices and services (e.g., cloud computing or online banking). As a result, every minute millions of bytes of data are generated, processed, exchanged, shared, and used to yield outcomes in specific applications. Securing the data, machines (devices), and users’ privacy in cyberspace has therefore become a paramount concern for individuals, business organizations, and national governments. In recent years, machine learning (ML) has been widely employed in cybersecurity, for example in intrusion or malware detection and biometric-based user authentication. However, ML algorithms are vulnerable to attacks in both the training and testing phases, which usually lead to substantial performance degradation and security breaches. Comparatively few studies have been conducted to understand the nature and degree of the vulnerabilities of ML techniques to security threats, and the corresponding defensive mechanisms. It is therefore imperative to systematize recent work on cybersecurity using ML and bring it to the attention of researchers, scientists, and engineers. In this paper, we provide a comprehensive survey of recent work (from 2013 to 2018) on ML in cybersecurity, describing the basics of cyber-attacks and corresponding defenses, the basics of the most commonly used ML algorithms, and proposed ML and data mining schemes for cybersecurity in terms of features, dimensionality reduction, and classification/detection techniques. In this context, the article also provides an overview of adversarial ML, including the security characteristics of deep learning methods. Finally, open issues and challenges in cybersecurity are highlighted and potential future research directions are discussed.

Publication Title

Journal of Defense Modeling and Simulation