Getting into machine learning is all fine and well. Yes, it has revolutionized businesses and opened up so many opportunities for better products and services. But, as with everything that brings good, there are threats to it. Enter adversarial machine learning (AML), a threat to machine learning and its outcomes. Since ML is a field that is constantly evolving, much is still unknown, and this presents a playground for future attacks on ML integrity and model accuracy.
What actually is adversarial machine learning?
Adversarial machine learning (AML) is a set of techniques used mostly to manipulate ML models, fooling or misguiding them with malicious input. It is the process of extracting information about ML systems in order to rig and distort their outcomes, which reduces accuracy and degrades the model's performance.
AML has been on the rise ever since ML started gaining momentum. Even now, we can't cover all the ways adversarial attacks can influence machine learning training and modeling, because new attack methods keep turning up.
Machine learning and AI are not simple. Thus, each new form of attack on them will be harder and harder to identify and eliminate.
When it comes to attacks
In machine learning, there are two main types of attacks: black box attacks and white box attacks.
Black box attack – This kind of attack happens when the attackers do not have full knowledge of the targeted model: its architecture, parameters, or inner workings.
White box attack – This attack is the opposite of the black box one. Here the attackers have full knowledge of the model's architecture and parameters.
It is obvious that white box attacks are easier to perform and more dangerous, but both present obstacles to machine learning progress. And companies that deal in machine learning are investing more and more into battling such threats.
When we talk about attacks, we can also distinguish between those that happen at the learning level and those that happen at the modeling level: the former affect the training data (train-time attacks), the latter strike after model deployment (inference attacks).
Data poisoning or poisoning attacks
These attacks influence the data during training time. Attackers infiltrate the dataset by changing existing data or introducing incorrectly labeled data. The model then makes incorrect predictions because it was trained on wrong or corrupted data. The point of the attack is to reduce the accuracy of the model and its predictions.
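A minimal sketch of label-flipping poisoning, using a toy nearest-centroid classifier and made-up numbers (not any particular library's poisoning API): flipping a single training label drags a class centroid far enough to misclassify a borderline test point.

```python
import numpy as np

# Toy 1-D nearest-centroid classifier; all data here is illustrative.
def fit_centroids(X, y):
    return {c: X[y == c].mean() for c in np.unique(y)}

def predict(centroids, X):
    classes = sorted(centroids)
    return np.array([min(classes, key=lambda c: abs(x - centroids[c]))
                     for x in X])

X_train = np.array([0.0, 1.0, 2.0, 8.0, 9.0, 10.0])
y_clean = np.array([0, 0, 0, 1, 1, 1])

# The attacker flips one training label (x = 10.0 becomes class 0),
# dragging the class-0 centroid toward the class-1 cluster.
y_poisoned = y_clean.copy()
y_poisoned[5] = 0

X_test = np.array([1.0, 5.5, 9.0])
y_test = np.array([0, 1, 1])

clean_acc = (predict(fit_centroids(X_train, y_clean), X_test) == y_test).mean()
poisoned_acc = (predict(fit_centroids(X_train, y_poisoned), X_test) == y_test).mean()
```

With clean labels the centroids sit at 1.0 and 9.0 and every test point is classified correctly; after the single flip they shift to 3.25 and 8.5, and the borderline point at 5.5 is misclassified.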
Evasion attacks
These attacks target the inputs a deployed model receives. They refer to designing an input that seems normal to humans but is wrongly classified by the machine learning model. Most often this happens in image processing, where a layer of noise is added to the original image: the image looks fine to the human eye, but with the noise layer, it is actually different to the model.
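The noise-layer idea can be sketched with a tiny FGSM-style example on a hypothetical linear classifier (weights and inputs are made up): each feature is nudged by at most a small epsilon, yet the predicted class flips.

```python
import numpy as np

# Hypothetical linear classifier: class 1 if w.x + b > 0.
w = np.array([1.0, -2.0, 3.0])
b = -0.5

def predict(x):
    return int(w @ x + b > 0)

x = np.array([1.0, 0.5, 0.2])   # score 0.1, so classified as class 1
eps = 0.2                       # small per-feature perturbation budget

# FGSM-style step: nudge every feature in the direction that lowers
# the class-1 score, staying within +/- eps per feature.
x_adv = x - eps * np.sign(w)    # score drops to -1.1, now class 0
```

The perturbed input differs from the original by at most 0.2 per feature, which is the image-noise analogy: a change too small for a human to care about, but enough to cross the model's decision boundary.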
Model stealing or extraction attacks
These attacks also focus on the model after training. They aim either to reconstruct the model, basically stealing its structure, or to extract the data it was trained on.
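A minimal sketch of the reconstruction idea, assuming the simplest possible victim (a hidden linear model behind a query "API", all values invented): the attacker sends chosen inputs, records the outputs, and fits a surrogate that recovers the hidden parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# The victim's model: the attacker only ever sees outputs of this "API".
w_secret = np.array([2.0, -1.0, 0.5])   # hidden parameters

def victim_api(X):
    return X @ w_secret

# Extraction: query the API on chosen inputs, then fit a surrogate
# to the (input, output) pairs by least squares.
X_queries = rng.normal(size=(50, 3))
outputs = victim_api(X_queries)
w_stolen, *_ = np.linalg.lstsq(X_queries, outputs, rcond=None)
```

For a real, nonlinear model the attacker would instead train a surrogate network on the query/response pairs, but the principle is the same: enough queries leak enough structure.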
How to combat adversarial attacks?
If one of the above scenarios worries you, as it should, there are certain methods you can use to combat adversarial attacks. Note that with each day and each advance in machine learning, new types of attacks emerge. So it's vital to stay on top of every point where your model could be affected by malicious intent.
Adversarial training
You can train the model to recognize adversarial attacks. This is a supervised method that feeds adversarial examples into the model so it recognizes them as threats and prevents future attacks. The model continuously learns about possible adversarial examples and tries to diminish their effects.
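A minimal sketch of that loop with made-up data (a toy logistic model, not a production recipe): at each step, FGSM-style perturbed copies of the training points are generated against the current model and trained on alongside the clean points.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: the class is the sign of the first feature (illustrative only).
X = rng.normal(size=(200, 2))
y = (X[:, 0] > 0).astype(float)

def sigmoid(z):
    return 1 / (1 + np.exp(-np.clip(z, -30, 30)))

w, b, eps, lr = np.zeros(2), 0.0, 0.1, 0.5
for _ in range(200):
    # FGSM on the current model: for logistic loss, dL/dx = (p - y) * w.
    p = sigmoid(X @ w + b)
    X_adv = X + eps * np.sign(np.outer(p - y, w))
    # Train on clean and adversarial copies together.
    X_all, y_all = np.vstack([X, X_adv]), np.concatenate([y, y])
    p_all = sigmoid(X_all @ w + b)
    w -= lr * X_all.T @ (p_all - y_all) / len(y_all)
    b -= lr * (p_all - y_all).mean()

# Robust accuracy: attack the trained model on the same points.
p = sigmoid(X @ w + b)
X_attack = X + eps * np.sign(np.outer(p - y, w))
robust_acc = ((sigmoid(X_attack @ w + b) > 0.5) == (y == 1)).mean()
```

Only the points sitting within the perturbation budget of the decision boundary remain vulnerable, so the model stays accurate even when each input is attacked.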
Switching models
Use multiple models in your system, so it's harder for attackers to target one. Since the models keep switching, an attacker won't know which one is in use or where to strike. They could target all the models, but it's much harder to poison all of them than to poison one.
Ensemble methods
Multiple models are used here as well, but they are combined into one generalized model, and all of them contribute to the final result. Attackers might hit one of the models, but it would be much harder to hit them all.
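Both ideas can be sketched together, with stub functions standing in for independently trained classifiers (the thresholds are invented): switching serves each query from a randomly chosen model, while the ensemble lets every model vote.

```python
import random

random.seed(0)

# Stub pool: three "independently trained" scorers with slightly
# different decision thresholds (hypothetical models).
models = [lambda x: x > 0.50, lambda x: x > 0.45, lambda x: x > 0.55]

def serve_switching(x):
    # A random model answers each query, so an attacker who tailors
    # inputs to one specific model can't rely on hitting it.
    return random.choice(models)(x)

def serve_ensemble(x):
    # Majority vote: every model contributes to the final answer,
    # so compromising a single model isn't enough to flip it.
    votes = [m(x) for m in models]
    return sum(votes) > len(votes) / 2
```

Inputs far from every threshold get the same answer no matter which model is picked, while the ensemble resolves borderline inputs by majority rather than by any single, possibly compromised, model.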
Defensive distillation
In distillation, you train one model to predict the outputs of another model that was trained on real data: a teacher model trained on real data, and a student model trained on the teacher's outputs. Student models are often smaller and faster than their teachers. This approach to combating attacks needs less human intervention and adapts to unknown threats.
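A minimal distillation sketch with made-up data: the "teacher" is stubbed as a smooth scoring function, and a small linear student is fit to the teacher's soft probability outputs rather than to raw labels.

```python
import numpy as np

rng = np.random.default_rng(2)

# Teacher: stands in for a large model trained on real data; it emits
# soft class-1 probabilities instead of hard labels.
def teacher_probs(X):
    return 1 / (1 + np.exp(-4 * X[:, 0]))

X = rng.normal(size=(300, 2))
soft_targets = teacher_probs(X)        # the student only sees these

# Student: a small logistic model trained to match the soft targets.
w, b = np.zeros(2), 0.0
for _ in range(500):
    p = 1 / (1 + np.exp(-np.clip(X @ w + b, -30, 30)))
    g = p - soft_targets               # cross-entropy gradient
    w -= 0.1 * X.T @ g / len(g)
    b -= 0.1 * g.mean()

# The student should agree with the teacher's hard decisions.
student_acc = (((X @ w + b) > 0) == (X[:, 0] > 0)).mean()
```

Training on soft probabilities rather than hard labels is what smooths the student's decision surface; in the defensive-distillation literature this smoothing is the property that makes gradient-based attacks harder to craft.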
How do adversarial attacks impact decision-making?
If the model is attacked and its final results or predictions are skewed, then every subsequent decision will be built on a false basis. The consequences can range from minor to catastrophic, depending on the model's use case.
If the decision is related to financial outcomes, the result will be a monetary loss. But it can also manifest in lost opportunities.
You, as a decision-maker, expect certain outcomes from a machine learning model. When maliciously affected, the model no longer presents results accurately. Each insight is wrong and leads related decisions astray. Decision-makers need accurate data and insights to create strategies based on true findings.
If the model is oriented externally, to serve users, their experience will suffer, resulting in negative feedback or lost customers wherever the model's outcomes serve those customers (recommendations, shopping experiences, taxi services, delivery, etc.).
Another issue is delay. If the model is affected, it needs to be reworked, and setbacks like these hold up decisions and feature deployments.
Whichever way you look at it, inferences drawn from machine learning influence further actions, and their accuracy and relevancy determine whether those actions turn out positive or negative.
How can you remain responsible in ML and AI?
There is no perfect recipe for avoiding attacks since machine learning and AI are continuously evolving. Each attack is different, so consequently, each approach to it is different.
So you have to stay alert at all times, keeping up with changing trends and possible entry points for attacks. You have to be aware of your systems' weaknesses and how they can be exploited.
Feature stores and feature engineering seem to be a way to minimize adversarial attacks and threats. So, for major companies, this might be one part of the solution.
The rising question of ethics in ML and AI, especially around private data protection, is bringing these attacks to the forefront. It emphasizes the social responsibility we have toward those affected by ML and AI models and their outcomes. Providing extra security for something so popular and impactful is non-negotiable.
How to be responsible? First, as we said, be aware of possible attacks and weaknesses. Second, train your models to recognize and prevent attacks. Third, don't let your model run wild. Don't treat it as one-and-done: monitor it and continuously check what's going on and whether it's working correctly. It's an ongoing process of making sure your models and data aren't compromised. Think ahead and research adversarial machine learning.
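One concrete form that ongoing monitoring can take (the window size and threshold below are hypothetical) is tracking rolling accuracy on labeled feedback and flagging the model when it drops:

```python
from collections import deque

WINDOW = deque(maxlen=100)   # most recent (prediction == truth) outcomes
FLOOR = 0.85                 # hypothetical minimum acceptable accuracy
MIN_SAMPLES = 20             # don't alert on too little evidence

def record_outcome(correct: bool) -> bool:
    """Log one labeled outcome; return True if the model needs a look."""
    WINDOW.append(correct)
    rolling = sum(WINDOW) / len(WINDOW)
    return len(WINDOW) >= MIN_SAMPLES and rolling < FLOOR
```

A sudden slide in this rolling number is exactly the symptom a poisoning or evasion campaign would produce, which is why this kind of continuous check belongs next to every deployed model.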