Volume 2018, Number May (2018), Pages 1-14
While machine learning has proven to be promising in several application domains, our understanding of its behavior and limitations is still in its nascent stages. One such domain is that of cybersecurity, where machine learning models are replacing traditional rule-based systems, owing to their ability to generalize and deal with large-scale attacks that have not been seen before. However, the naive transfer of machine learning principles to the domain of security needs to be undertaken with caution. Machine learning was not designed with security in mind and as such is prone to adversarial manipulation and reverse engineering. While most data-based learning models rely on a static assumption of the world, the security landscape is one that is especially dynamic, with an ongoing, never-ending arms race between the system designer and the attackers. Any solution designed for such a domain needs to take into account an active adversary and needs to evolve over time, in the face of emerging threats. We term this the "Dynamic Adversarial Mining" problem, and this paper provides motivation and foundation for this new interdisciplinary area of research, at the crossroads of machine learning, cybersecurity, and streaming data mining.
Living in an "always-on" world, where 40 percent of the population is reachable at the click of a button, has provided us with immense economic and social opportunities. The growing scale and impact of modern-day digital applications has led to the adoption of data-driven techniques at the core of several complex systems, promising to be a tractable and scalable solution. Machine learning, with its ability to extrapolate from previously seen data, has become an integral part of modern-day digital applications. Whether for detecting fraudulent transactions, eliminating spam, performing high-frequency trading, or forecasting market trends, the use of machine learning is ubiquitous. The domain of cybersecurity has also benefited from the use of machine-learning techniques, as traditional rule/signature-based approaches have proven to be inadequate at the scale and level of sophistication of modern-day attacks.
However, as is common with any new technological advancement, the overenthusiasm to develop and deploy a solution often leads to the overlooking of its flaws and security risks. The early success and adoption of machine-learning techniques also suffers from the same affliction. Machine learning was not designed with security in mind, and it has recently been recognized that the goals of machine learning are often in conflict with the notion of system security [1, 2]. With commercial off-the-shelf solutions being advertised by technological giants, like Amazon AWS and Google Cloud Platform, it seems machine learning has become a general engineering solution to most problems. These companies offer black-box "Machine Learning-as-a-Service" models, which developers can utilize to harness the potential of machine learning for their application needs. However, this domain-agnostic usage of machine learning is flawed, as there is "no free lunch" when it comes to machine learning. Recent works have demonstrated a greater than 89 percent attack rate on these black-box services, exposing the vulnerabilities of machine learning to adversarial manipulation. It was shown that the accuracy of a model has little meaning in an adversarial environment, because even a model having greater than 99 percent accuracy was found to be easily evadable (more than 99 percent of the time), using commodity hardware and limited probing effort.
Machine learning is susceptible to adversarial activity, where an attacker can manipulate the input data to deceive the deployed machine-learning model. Machine learning is increasingly being used at the core of several critical applications, such as self-driving cars, drug recommendation systems, and high-volume trading algorithms. These systems affect lives, and adversarial manipulation of them can lead to devastating results. At the scale of solution that machine learning promises, we are now more susceptible than ever to adversarial activity, as the next attack can originate from anywhere and can take any previously unseen form. The vulnerabilities that machine learning introduces to a system's integrity have largely been overlooked in many application domains, due to its perceived benefits. This problem is especially critical when machine learning is used in cybersecurity applications, as it introduces a whole new dimension of the very security risks it had set out to fix. There is therefore a need for a thorough analysis of the vulnerabilities of machine learning in adversarial domains, leading to a timely understanding of its limitations and possibilities.
This article highlights the innate vulnerabilities of machine learning (especially classification systems), which call for caution in its use in adversarial environments. We also discuss future directions that promote research in the area of dynamic adversarial machine learning.
Evading Machine Learning Models
A machine-learning-based security model (a classifier is considered here) is developed by training it on a set of labeled data samples called the training dataset. In a binary classification task, these samples can be considered to belong to either of two classes, malicious or benign, without loss of generality. A classifier trained on this dataset aims to maximize prediction performance, typically measured as accuracy or F-score. A well-trained model is expected to perform well on future, unseen testing data, which it encounters after being deployed in the real world. However, this is a static view of the machine learning process, which is often violated in a dynamic and adversarial real-world setting. In an adversarial domain, the attackers do not gain anything by generating samples that follow distributions similar to the testing dataset, as these are readily blocked by the defender's model. Instead, the attacker tries to morph its input attack samples so they can avoid being detected. More formally, the attacker wants to modify its test set data samples of class malicious so that they masquerade as benign, thereby increasing the misclassification rate of the defender's machine learning model. This general class of attacks is termed evasion attacks, and they affect the performance of a deployed model at test time.
Attacks on machine learning begin with a reconnaissance effort, where the adversary tries to observe the behavior of the model by making limited probes to it, posing as a benign client user. The learned information is then leveraged to morph the attack payload, so as to avoid detection by the machine-learning model. An example of evasion is shown in Figure 1, where an attacker tries to evade a spam detection system that uses a cloud-based machine-learning service. The attacker learns the behavior of the model by sending two probe emails and observing the response of the detection system. The two emails differ in a single word and elicit opposite responses from the service. As such, the importance of the word to the classification process is inferred. An intelligent adversary can then modify the word "sale" to "sa1e," which looks visually similar but fools the detection model. This example shows the general observe-evade process, under which adversarial manipulations on test data can be performed, leading to degradation in the performance and credibility of the machine-learning model.
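The observe-evade process described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy black_box_is_spam function stands in for the cloud service (a real service would not be a simple word list), and the names find_trigger_words and evade are hypothetical. The attacker swaps out one word at a time, watches which swap flips the verdict, and then disguises the word it has identified.

```python
# Hypothetical stand-in for a black-box spam service: it flags a message
# when any word from a hidden list appears. The attacker never sees the
# list; only the spam / not-spam answer is observable.
_HIDDEN_BLOCKLIST = {"sale", "winner"}


def black_box_is_spam(message: str) -> bool:
    return any(word in _HIDDEN_BLOCKLIST for word in message.lower().split())


def find_trigger_words(spam_probe: str, filler: str = "hello") -> list[str]:
    """Observe: replace one word at a time with a neutral filler. A word is
    a trigger if removing it flips the answer from spam to not-spam."""
    words = spam_probe.split()
    triggers = []
    for i, word in enumerate(words):
        probe = words[:i] + [filler] + words[i + 1:]
        if not black_box_is_spam(" ".join(probe)):
            triggers.append(word)
    return triggers


def evade(message: str, triggers: list[str]) -> str:
    """Evade: rewrite each trigger word with a look-alike character (l -> 1)."""
    lookalike = str.maketrans({"l": "1"})
    return " ".join(
        w.translate(lookalike) if w in triggers else w for w in message.split()
    )


original = "big sale today"
triggers = find_trigger_words(original)   # one probe per word in the message
morphed = evade(original, triggers)
print(triggers, "->", morphed, "| flagged:", black_box_is_spam(morphed))
```

Note that this one-word-at-a-time probing only isolates a trigger when the message contains a single flagged word; a message with several would need combinations of words to be probed.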
Evasion leads to a degradation of the test-time performance of a classifier, even though it had a high initial accuracy at the time of deployment. Evasion attacks are common in email spam, in mimicry attacks on biometric systems, in adaptive malware binaries, and in "crowdturfing" applications. The development of smart click bots, which aim at defrauding the click-ad-based economy of the internet, also relies on learning the behavior of the fraud detection algorithms to modify attack characteristics over time. Speed and click time are features that the bot can modify as it gains more information about what the defender's system blocks and what it permits. Evasion attacks are possible on machine-learning systems even if the only feedback available is a tacit accept/reject from the black-box machine-learning service. Generic black-box attack frameworks are starting to be developed to evade machine learning in a domain-independent and data-driven manner [2, 7]. An example is the EvadeML framework, which used genetic algorithms to morph PDF documents to avoid detection, with an evasion success rate of 100 percent.
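To make concrete how much an attacker can learn from accept/reject feedback alone, here is a small sketch built on strong simplifying assumptions: the hypothetical detector_accepts function judges a single feature (the interval between clicks) against a hidden threshold, and its decision is monotone in that feature. Under those assumptions a binary search locates the fastest click rate that still passes, using only a dozen probes. It is not a model of any real fraud detection system.

```python
def detector_accepts(click_interval_s: float) -> bool:
    """Hypothetical black-box fraud detector: rejects clicks that arrive
    faster than a threshold the attacker cannot see."""
    return click_interval_s >= 2.7


def fastest_accepted_interval(rejected: float, accepted: float, probes: int) -> float:
    """Binary search between a known-rejected and a known-accepted interval,
    using nothing but the accept/reject answer of each probe."""
    lo, hi = rejected, accepted
    for _ in range(probes):
        mid = (lo + hi) / 2
        if detector_accepts(mid):
            hi = mid   # still accepted: the bot can click faster
        else:
            lo = mid   # rejected: back off
    return hi          # smallest interval seen to be accepted


interval = fastest_accepted_interval(rejected=0.1, accepted=60.0, probes=12)
print(f"bot can click every {interval:.2f} s without being blocked")
```

Each probe halves the uncertainty, so 12 probes narrow a 60-second range to under 0.02 seconds. This is why limiting the number of probes an untrusted client can make is only a partial defense.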
There is considerable scope in the understanding and research of general-purpose evasion strategies on machine learning models, as a first step toward understanding its vulnerabilities.
From Evasion to Reverse Engineering
While evasion emphasizes increasing the error rate of the model, by causing it to misclassify on a select group of morphed samples, a dedicated adversary can do long-term damage by successfully reverse engineering the model parameters. Reverse engineering is especially dangerous because: a) It helps the adversary to better understand the prediction landscape, enabling it to launch a large-scale evasion attack, and b) Reverse engineering discloses the important learned attributes of the data, leading to privacy concerns and intellectual property leakage. As an example of the latter case, an adversary of a financial market system can reverse engineer the trading model of a competitor, so as to force them into making deals that indirectly benefit their competitors. In another scenario, an adversary can trick the machine-learning-based model of a real-time advertisement bidding system into making exorbitant bids on non-revenue generating pages, thereby draining their advertisement budget. Reverse engineering can also be a first step toward large-scale evasion attacks, which are harder to detect and to stop, as the reverse engineered model provides higher confidence and extra validation on the submitted attack samples.
Reverse engineering a machine-learning model presents a symmetric flipside to the task of learning from presented data. Instead of inferring from the provided data, the task here is to generate new samples that will convey maximum information about the defender's model. In doing so, it is similar to the problem of active learning, where unlabeled samples are selected for labeling based on how much information they might impart to learning the underlying model. Both problems are constrained by the number of probes available. Intelligent algorithms can therefore be developed that generate probes to submit to the model, receive its feedback, and assimilate the received information into a surrogate model that mimics the behavior of the original model. This process is shown in Figure 2, where the trained model (green line) is probed, and then the reverse engineered model is obtained (red line). Probing means submitting input data samples to the black-box model and observing its response. By sufficiently spreading out the samples in the data space, a decent understanding of the prediction space is obtained. These informative probes enable the reverse engineering of the original model, which is summarized as the surrogate model (red). This is an extension of the observe-evade paradigm depicted in Figure 1, with more focus on learning the model characteristics, as opposed to simple modifications to targeted attack samples. It should be noted that, although reverse engineering might be only partially successful, it could still be effective for the task of evasion and adversarial perturbation.
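The probe-and-learn loop described above can be sketched in a few lines. This is an illustration under stated assumptions, not a reconstruction of any published attack: the hidden black_box model is a made-up linear boundary in two dimensions, probes are spread on a plain grid rather than chosen by an active-learning strategy, and the surrogate is fitted with a basic perceptron. All function names are hypothetical.

```python
import random


def black_box(x: float, y: float) -> int:
    """Hypothetical defender model; the attacker only sees its 0/1 answers."""
    return 1 if 2.0 * x - 1.0 * y + 0.5 > 0 else 0


def probe_grid(steps: int):
    """Spread probes evenly over [-1, 1] x [-1, 1] and record each response."""
    axis = [-1 + 2 * i / (steps - 1) for i in range(steps)]
    return [(x, y, black_box(x, y)) for x in axis for y in axis]


def fit_surrogate(samples, max_epochs: int = 5000):
    """Perceptron training: nudge the weights on every misclassified probe."""
    w1 = w2 = bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y, label in samples:
            pred = 1 if w1 * x + w2 * y + bias > 0 else 0
            if pred != label:
                step = label - pred  # +1 or -1
                w1 += step * x
                w2 += step * y
                bias += step
                mistakes += 1
        if mistakes == 0:  # surrogate now reproduces every probe response
            break
    return w1, w2, bias


def surrogate(weights, x: float, y: float) -> int:
    w1, w2, bias = weights
    return 1 if w1 * x + w2 * y + bias > 0 else 0


probes = probe_grid(steps=15)            # 225 probes to the black box
weights = fit_surrogate(probes)

# Compare surrogate and original on points that were never probed.
rng = random.Random(0)
fresh = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(2000)]
agreement = sum(
    surrogate(weights, x, y) == black_box(x, y) for x, y in fresh
) / len(fresh)
print(f"agreement on unseen points: {agreement:.3f}")
```

The surrogate need not recover the exact parameters of the original; it only has to agree with it over most of the data space, which is what makes partially successful reverse engineering still useful to an attacker.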
Adversarial reverse engineering is a threat to machine-learning-based security, as it enables an attacker to use the same tools as the defender to subvert it. This problem is not specific to the task of classification alone. In regression-based problems, reverse engineering conveys information about attribute coefficients and their impact on the predicted value. Consider, for example, a simple regression model for the price of a commodity: a*Supply - b*Demand = Price. Reverse engineering this model would give us approximations for the coefficients a and b. An adversary could then control price by manipulating the term weighted by a or by b, whichever is easier to influence. In this example, a competitor adversary could manipulate a supply-critical model (a > b) by creating source bottlenecks, either by coercion or by hoarding. Reverse engineering is also possible in the case of time-series-based systems, where an alerting algorithm could be attacked by manipulating the rate of attack, so as to gradually impact the system, causing the attacks to go unnoticed for a longer time. In outlier detection systems, reverse engineering the model could direct an attacker straight to the restrictive space of training data, leading to leakage of private data. Even clustering-based models are susceptible to reverse engineering, as they reveal information about spatial data density and regions of interest. At the very least, reverse engineering is an extreme case of evasion, where new samples can be created so they will go undetected by the defender's model. However, the exploitation of reverse engineering to generate novel attack scenarios is unbounded.
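For the regression example above, a short sketch shows how few queries are needed to approximate a and b. The hidden coefficient values, the probe points, and the function names are all assumptions chosen for illustration; the fit is an ordinary least-squares solve of the 2x2 normal equations, written out by hand so the sketch has no dependencies.

```python
def black_box_price(supply: float, demand: float) -> float:
    """Hypothetical hidden model of the form a*Supply - b*Demand = Price."""
    a, b = 3.0, 1.5  # unknown to the attacker
    return a * supply - b * demand


def recover_coefficients(probes):
    """Least-squares estimate of (a, b) from (supply, demand, price) probes.

    Writing u = supply and v = -demand gives price = a*u + b*v, which is
    solved through the 2x2 normal equations using Cramer's rule.
    """
    suu = sum(s * s for s, d, p in probes)
    svv = sum(d * d for s, d, p in probes)
    suv = sum(s * -d for s, d, p in probes)
    sup = sum(s * p for s, d, p in probes)
    svp = sum(-d * p for s, d, p in probes)
    det = suu * svv - suv * suv
    a_hat = (sup * svv - svp * suv) / det
    b_hat = (suu * svp - suv * sup) / det
    return a_hat, b_hat


inputs = [(1.0, 0.0), (0.0, 1.0), (2.0, 1.0), (1.0, 3.0)]
probes = [(s, d, black_box_price(s, d)) for s, d in inputs]
a_hat, b_hat = recover_coefficients(probes)
print(f"a ~ {a_hat:.2f}, b ~ {b_hat:.2f}, supply-critical: {a_hat > b_hat}")
```

With noise-free responses, two well-chosen probes would already suffice; the least-squares form is used because it extends directly to noisy responses and more probes.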
Avoiding reverse engineering is not an easy task, as it is non-intrusive and can occur independently of the defender's learning process. However, thwarting learning by obfuscating the output response, and misleading adversarial learning by preemptively including honeypots in the training phase, are possible solutions that require further research to ascertain their applicability.
Why Is Machine Learning at Risk?
Machine learning offers promising solutions to complex and large-scale problems. However, machine learning itself introduces a new gamut of vulnerabilities when used in an adversarial environment. We recognize two core assumptions of machine learning that make it susceptible to adversarial activity: a) stationarity and b) generalizability. Stationarity refers to the equivalence in distributions between the training and the testing data, which is necessary to ensure the generalization and extrapolation guarantees of machine-learning models [10, 11]. Why each of these assumptions is suspect is discussed next.
Stationarity implies a static environment, where analysis done at the training stage can be extrapolated to the testing stage. However, real-world data is dynamic and can change over time, based on uncontrollable environmental factors. This leads to a phenomenon referred to as "concept drift" [10, 11], which is a result of distribution changes in data. These changes lead to the degradation of performance over time, as the trained models become obsolete and ineffective over the new distribution. This change is depicted in Figure 3(a), where a binary classification model classifies samples into class 0 or 1. The original margin depicts the learned probabilistic classifier. Over time, due to concept drift, the data distribution is seen to have shifted, causing the margin to lose its predictive ability. Concept drift warrants a dynamic view of the ML problem, with retraining and detection being important and ongoing steps over the course of prediction. In an adversarial domain, the problem of concept drift is exacerbated, as adversaries actively try to change the attack samples in their attempt to avoid detection. In such a milieu, nonstationarity is not just a possibility but in fact the norm. As such, static ML models fail to uphold system integrity in dynamic environments.
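The ongoing detection step mentioned above can be sketched with a simple sliding-window monitor. This is one basic way to flag drift, not the method of the cited works, and it rests on assumptions that are generous in a security setting: true labels arrive for every sample, and the baseline error rate is known. The class name, the window size, and the tolerance are hypothetical choices; the synthetic stream shifts its true class boundary from 0.5 to 0.8 at sample 300 while the deployed model stays fixed.

```python
from collections import deque


class ErrorRateDriftMonitor:
    """Flags drift when the error rate over the most recent window of
    labelled predictions exceeds the baseline by more than a tolerance."""

    def __init__(self, baseline_error: float, window: int = 50, tolerance: float = 0.15):
        self.baseline_error = baseline_error
        self.tolerance = tolerance
        self.recent = deque(maxlen=window)  # 1 = misclassified, 0 = correct

    def update(self, was_error: bool) -> bool:
        self.recent.append(1 if was_error else 0)
        if len(self.recent) < self.recent.maxlen:
            return False  # not enough evidence yet
        window_error = sum(self.recent) / len(self.recent)
        return window_error > self.baseline_error + self.tolerance


def model_predicts(x: float) -> int:
    """Static model trained before the drift: boundary fixed at 0.5."""
    return 1 if x > 0.5 else 0


def true_label(i: int, x: float) -> int:
    """Synthetic concept: the true boundary moves to 0.8 at sample 300."""
    boundary = 0.5 if i < 300 else 0.8
    return 1 if x > boundary else 0


monitor = ErrorRateDriftMonitor(baseline_error=0.0)
detected_at = None
for i in range(600):
    x = ((i * 37) % 100) / 100  # deterministic values spread over [0, 1)
    drifted = monitor.update(model_predicts(x) != true_label(i, x))
    if drifted and detected_at is None:
        detected_at = i
print("drift first flagged at sample", detected_at)
```

A monitor of this kind treats adversarial change like any other drift; it reports that the model has degraded but cannot say whether an adversary caused it.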
The other limitation of machine learning is the overlooking of adversarial possibilities and intentions at test time. Learning is primarily concerned with the problem of generalizability, which is the ability to extrapolate from known data to the unknown realm. However, this central goal of the learning process is directly in conflict with its ability to ensure security [3, 12]. As seen in Figure 3, the broader the generalization margin of a classifier, the more space adversaries receive to evade and go undetected. This is a direct result of a model's ability to provide a best guess over unseen spaces of data, without explicit validation. While maximizing the margin width, or generalizability, has been the goal of any traditional machine-learning model, its application in the domain of security would benefit from a novel definition of its objectives and evaluation.
The inability to account for a dynamic and adversarial environment puts machine learning at risk of a new class of attacks, which were not possible before. By extension, systems employing machine learning at their core are also exposed to this class of risks. System designers need to make themselves aware of the limitations of machine learning and the unique requirements of their field, which could need a more customized solution. As underscored by Andrew Ng, simply applying machine learning models to a dataset does not guarantee results; machine learning needs to be customized for the business context and end goals.
Toward a Dynamic-Adversarial-Learning Paradigm
The vulnerabilities of machine learning are a result of its naive adoption across different application domains. The benefits of machine learning are immense, as it seems to be a promising general purpose, scalable solution, to modern day scale requirements. To harness these benefits, while ensuring security and reliability, a new paradigm in the applicability of machine learning is needed. This paradigm needs to account for the dynamic and adversarial nature of the problem, to ensure a scalable and safe learning process.
Existing research activities in adversarial mining approach the problem from the following two perspectives:
- As a streaming data mining problem [8, 10, 11]. Here, adversarial activity is regarded as a concept drift, which needs to be detected and fixed over time. The specific nature of adversarial drift, which differentiates it from a natural change such as seasonality or aging, is not considered or understood by this class of research. Nevertheless, the dynamics of the system are understood as an important problem, which has led to the development of streaming data algorithms. Spam classification, network intrusion detection, and click fraud detection are some of the domains that have benefited from this research.
- As an adversarial learning problem [8, 12]. In this class of research, the effect of an adversary is considered at the training time of an ML model, with the goal of making the models more robust to attacks. The focus is on delaying attacks and making them more expensive for the adversary. However, this research fails to account for the dynamics of the problem, as nothing is done once a model is deployed and an attack starts. Game theory approaches and obfuscation techniques [12, 15] are popular ways to increase the robustness of the learned models.
There is a research gap between these two communities, as they fail to take a complete view of the security of a system. A holistic approach would understand that security is a cyclic and ongoing process, in which delaying attacks, detecting them, and then recovering from them are all equally important phases. A dynamic-adversarial approach to the security of machine learning (see Figure 4) would take proactive measures to delay the onset of attacks, and would take preemptive measures to fix vulnerabilities on the fly. These approaches would actively engage the adversary, in an attempt to mislead them and trick them into getting detected. In such an environment, metrics such as difficulty to reverse engineer, recovery time, and detection rate would take precedence over the traditional metrics of accuracy and precision. Also, the effective involvement of a human in the loop, in a never-ending learning scheme providing expertise for retraining and detection, would enhance the practical appeal of such systems.
A holistic view of machine learning security will require research contributions in the interdisciplinary field of dynamic adversarial mining (see Figure 4), which incorporates and integrates lessons learned from the vast background of work in machine learning, cybersecurity and streaming data mining. This new paradigm of machine learning will enable a reinvention of machine learning in adversarial domains, leading to a "security by design" development for long-term benefits. Some core ideas that can benefit from immediate research are:
- What preemptive strategies at the training phase can enable attack detection and faster retraining? Could data honeypots be integrated to trick adversaries at test time?
- How effective is randomization and obfuscation to long-term security?
- How to measure security in machine learning from a streaming and cyclic perspective?
- How to understand and formalize the attack vulnerabilities, from a purely data driven perspective, to effectively test future developed counter measures?
This new direction of research poses many questions, the pursuit of which can lead to several interesting research projects in the near future. Our understanding of machine learning and its abilities has thus far been largely unidirectional. We need to analyze machine learning from different views, and only then will we be able to truly understand its potential and long-term effects. The adversarial analysis of machine learning has shown that, although machine learning has set out to solve some of the most sophisticated problems in security, it has led to the introduction of a whole new set of challenges waiting to be tackled. These challenges reaffirm the respect machine learning deserves, and the need to account for customization and expertise in its application. Equipped with a better understanding of machine learning, we can usher in a new era of digital economy, which will benefit immensely from the creation of secure and trustworthy machines for the next generation of cashless and digital societies.
 Wang, G., Wang, T., Zheng, H., and Zhao, B. Y. Man vs. machine: Practical adversarial detection of malicious crowdsourcing workers. In 23rd USENIX Security Symposium (USENIX Security '14). USENIX Association, Berkeley, CA, 2014, 239-254.
 Tramèr, F., Zhang, F., Juels, A., Reiter, M. K., and Ristenpart, T. Stealing machine learning models via prediction APIs. In Proceedings of the 25th USENIX Security Symposium (Aug. 10-12, Austin). USENIX Association, Berkeley, CA, 2016, 601-616.
 Sethi, T. S., Kantardzic, M., and Hu, H. A grid density based framework for classifying streaming data in the presence of concept drift. Journal of Intelligent Information Systems 46, 1 (2016), 179-211.
 Ng, A. What Artificial Intelligence Can and Can't Do Right Now. Harvard Business Review. https://hbr.org/2016/11/what-artificial-intelligence-can-and-cant-do-right-now. Nov. 9, 2016.
 Sethi, T. S., and Kantardzic, M. Monitoring Classification Blindspots to Detect Drifts from Unlabeled Data. In 17th IEEE International Conference on Information Reuse and Integration (IRI). IEEE, Washington D.C., 2016.
Tegjyot Singh Sethi is a final year Ph.D. student at the Department of Computer Engineering and Computer Science, University of Louisville, USA. His research interests lie in the area of data mining, adversarial machine learning, change detection, learning with limited labeling, and data stream mining. He is a program committee member for INNS – Big Data, and has published several international conference and journal papers.
Mehmed Kantardzic is a Full Professor in the Department of Computer Engineering and Computer Science (CECS) at the University of Louisville. Currently, he is the Director of the Data Mining Lab as well as the Director of CECS Graduate Studies at the CECS Department. His research focuses on data mining and knowledge discovery, machine learning, soft computing, click fraud detection and prevention, concept drift in streaming data, and distributed intelligent systems. He is the author of six books, including the textbook Data Mining: Concepts, Models, Methods, and Algorithms (John Wiley, second edition, 2011). He has served on the editorial boards of several international journals, and he is currently Associate Editor for WIREs Data Mining and Knowledge Discovery Journal.
©2018 ACM $15.00
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.