Ceci n’est pas un pipe

Author: Ariel Waissbein
Senior Researcher

Magritte was right. A representation of a pipe is, simply put, not a pipe.

This is not a cat

Image classification has improved greatly over the last few years thanks to artificial intelligence. When it comes to picking cats out of random images, software does marvels. The progress is not limited to image classification: face recognition has improved as well, in a variety of settings.

Adversarial machine learning is the study of attacks on machine learning algorithms and of the defenses against such attacks. Generative adversarial networks (GANs) are networks that, given a training set, learn to generate new data with the same statistics as the training set [2]. For example, a GAN trained on photographs can generate new photographs that look at least superficially authentic to human observers, having many realistic characteristics. In the same way that machine learning can be used to classify images, one may design machine learning algorithms that generate new images which a given classification algorithm fails to recognise.
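To make the idea of an input that a classifier fails to recognise concrete, here is a minimal sketch. It does not use a GAN: it swaps in the simpler fast-gradient-sign idea, applied to a hand-written linear classifier. The weights, the four "pixel" values and the step size are all made up for illustration, and the step is far larger than anything that would pass as imperceptible in a real image.

```python
import math

# Hypothetical weights of an already trained linear "cat vs. not cat" model
# over four pixel features. These numbers are invented for the example.
WEIGHTS = [0.9, -0.4, 0.7, -0.6]
BIAS = -0.1


def cat_probability(pixels):
    """Logistic score: the model's probability that the input is a cat."""
    z = sum(w * x for w, x in zip(WEIGHTS, pixels)) + BIAS
    return 1.0 / (1.0 + math.exp(-z))


def sign(value):
    """Return -1, 0 or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def perturb(pixels, epsilon):
    """Shift every pixel by at most epsilon in the direction that lowers the score.

    For a linear model, the gradient of the score with respect to each input
    has the same sign as the matching weight, so stepping against sign(w)
    is the gradient-sign step.
    """
    return [x - epsilon * sign(w) for w, x in zip(WEIGHTS, pixels)]


image = [0.6, 0.3, 0.5, 0.4]
adversarial = perturb(image, epsilon=0.2)

print(cat_probability(image) > 0.5)        # True: classified as a cat
print(cat_probability(adversarial) > 0.5)  # False: no longer classified as a cat
```

No pixel moves by more than the chosen step, yet the verdict flips. A GAN-based attack would learn to produce such inputs instead of computing them from known weights.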

This is not spam

Spam detection is much easier today than it was a decade ago. Bulk spammers are filtered out easily by mail-as-a-service giants like Gmail, which have become good at detecting spam when it is sent in volume. They block spammers as the email is being sent (because similar emails come from the same outgoing mail server or the same account), or they block the spam as it is received (because a single vendor sees thousands of samples, some of which its users manually mark as spam). We have become good at detecting large-scale events with the data science toolkit, so a spammer is very unlikely to get a few thousand emails through without them being marked as spam.

However, it would be easy to hand-craft an (unwanted) email and send it to a few recipients so that it passes their spam filters.
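The contrast between bulk and hand-crafted mail can be sketched with a toy volume-based filter. This is only an illustration of the principle, not how any real mail provider works: the fingerprint function, the threshold and the sample messages are all assumptions.

```python
import re
from collections import Counter

# Hypothetical cut-off: copies seen before a message is treated as bulk.
BULK_THRESHOLD = 1000


def fingerprint(body):
    """Crude content fingerprint: lowercase words, digits and punctuation dropped."""
    return " ".join(re.findall(r"[a-z]+", body.lower()))


def flag_bulk(messages, threshold=BULK_THRESHOLD):
    """Return the set of fingerprints seen at least `threshold` times."""
    counts = Counter(fingerprint(body) for body in messages)
    return {fp for fp, n in counts.items() if n >= threshold}


# A campaign of near-identical messages, and one message written by hand.
campaign = [f"Cheap watches!!! Order #{i} now" for i in range(5000)]
handcrafted = ["Hi Dana, following up on the invoice we discussed. Please see the link."]

bulk = flag_bulk(campaign + handcrafted)
print(fingerprint(campaign[0]) in bulk)     # True: the campaign is flagged
print(fingerprint(handcrafted[0]) in bulk)  # False: the single message is not
```

The filter has nothing to count for the single message, which is exactly why low-volume mail slips through a detector built around volume.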

Once again, automated recognition can be defeated.

This is not an attack

Over the years we have added logging capabilities to network devices and to the applications that run in a network. Desktops and other endpoints, routers, web servers and other network equipment, and the applications that run on them, all produce logs. SIEM (Security Information and Event Management) tools receive (or retrieve) these logs.

Producing and properly recording logs is a requirement for many security capabilities, including protection, detection and probably recovery, but something must then be done with these logs. They are practically meaningless without an information extraction step. Basically, logs are raw data, and one can answer different questions by processing them. In particular, one can detect a breach while it is happening, or shortly after, by processing logs. This processing is done with machine assistance, which may sometimes be as simple as applying a set of rules to the logs (e.g., every time the antivirus on a desktop logs a compromise attempt, the rule reports an attack). However, as attacks get more sophisticated, they can no longer be detected from a single sensor (i.e., a single source of logs), but only from a combination of logs that may come from more than one source.
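The two kinds of rule described above can be sketched as follows. The log schema, the event names and the ten-minute window are hypothetical; real SIEM products have their own rule languages and field names.

```python
from dataclasses import dataclass


@dataclass
class LogEvent:
    timestamp: int  # seconds since an arbitrary start
    source: str     # sensor that produced the log, e.g. "antivirus"
    host: str
    user: str
    kind: str       # event type; every value used here is made up


def antivirus_rule(events):
    """Single-sensor rule: any antivirus compromise attempt is an alert."""
    return [
        e for e in events
        if e.source == "antivirus" and e.kind == "compromise_attempt"
    ]


def correlation_rule(events, window=600):
    """Multi-source rule: a VPN login from a new country followed, within
    `window` seconds, by a bulk download by the same user on a file server."""
    logins = [e for e in events
              if e.source == "vpn" and e.kind == "login_new_country"]
    downloads = [e for e in events
                 if e.source == "fileserver" and e.kind == "bulk_download"]
    return [
        (login, download)
        for login in logins
        for download in downloads
        if login.user == download.user
        and 0 <= download.timestamp - login.timestamp <= window
    ]


events = [
    LogEvent(100, "antivirus", "desk-7", "bob", "compromise_attempt"),
    LogEvent(200, "vpn", "gw-1", "alice", "login_new_country"),
    LogEvent(450, "fileserver", "fs-2", "alice", "bulk_download"),
    LogEvent(900, "fileserver", "fs-2", "carol", "bulk_download"),
]

print(len(antivirus_rule(events)))    # 1
print(len(correlation_rule(events)))  # 1
```

Neither the VPN log nor the file server log is alarming on its own; only the pair, matched on user and time, is reported.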

Ultimately, it is a human or a system that receives the collected logs and determines whether there has been an attack. The volume of logs produced is so high that deciding which logs to examine becomes a problem in itself. The SIEM is tuned to show its human reviewers either only a portion of the logs or all of them. The tradeoff here is obvious: a reduced workload comes at the cost of possibly missing an attack.
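A small sketch of this tradeoff, assuming a hypothetical numeric severity field and invented event counts: raising the display threshold shrinks the review queue, and at some point it hides an event that matters.

```python
def triage(events, min_severity):
    """Split events into those shown to analysts and those filtered out."""
    shown = [e for e in events if e["severity"] >= min_severity]
    hidden = [e for e in events if e["severity"] < min_severity]
    return shown, hidden


# Hypothetical day of logs: mostly low-severity noise, a few loud malicious
# events, and one quiet event that is actually a step of an attack.
events = [{"severity": 1, "malicious": False} for _ in range(9000)]
events += [{"severity": 3, "malicious": False} for _ in range(900)]
events += [{"severity": 5, "malicious": True} for _ in range(3)]
events += [{"severity": 1, "malicious": True}]

for threshold in (1, 3, 5):
    shown, hidden = triage(events, threshold)
    missed = sum(e["malicious"] for e in hidden)
    print(f"threshold={threshold} shown={len(shown)} missed_malicious={missed}")
# threshold=1 shown=9904 missed_malicious=0
# threshold=3 shown=903 missed_malicious=1
# threshold=5 shown=3 missed_malicious=1
```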

Enter Artificial Intelligence. Yes, an artificial intelligence that receives logs can be trained to detect attacks, and any number of different designs and training techniques may be used. In the end, a (dynamic) mechanism that includes the sensors, the pipelines used to transmit the logs to a SIEM, the SIEM itself, and the AI processing the logs in the SIEM becomes a unit whose purpose is detecting attacks. The advertisements write themselves: an AI that learns how to detect attacks! It is no longer limited to known patterns or known attacks; it also detects new attacks!
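As a deliberately tiny stand-in for such a trained detector, the sketch below learns a baseline from benign data only and flags whatever deviates strongly from it. The feature (failed logins per hour), the sample values and the z-score threshold are assumptions chosen for illustration; a production system would use far richer features and models.

```python
import statistics


def train(benign_samples):
    """Learn a baseline (mean, standard deviation) from benign data only."""
    return statistics.mean(benign_samples), statistics.stdev(benign_samples)


def is_anomalous(model, value, z_threshold=3.0):
    """Flag values that lie more than z_threshold deviations from the mean."""
    mean, stdev = model
    return abs(value - mean) / stdev > z_threshold


# Hypothetical feature: failed logins per hour for one account.
benign = [0, 1, 0, 2, 1, 0, 1, 3, 0, 1, 2, 1]
model = train(benign)

print(is_anomalous(model, 2))   # False: within the learned baseline
print(is_anomalous(model, 40))  # True: flagged without any attack signature
```

No rule describing an attack was written; the detector only knows what normal looks like, which is the sense in which it can flag something new.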

Is AI the silver bullet of computer security?

Certainly not. Just as GANs may be used to defeat image classification, they may be used to defeat attack classification. Image recognition is not the only example (see, e.g., [3]). It simply boils down to using GANs with a training set for attack recognition. Doing this against the specific SIEM + AI pair deployed at a given company may be difficult (e.g., because we may not know the training set or the AI configuration), but this difficulty does not guarantee that such attacks will not happen. In computer security, a defense that only probably works is not enough: in many cases it means that an attacker with enough time can win.
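The role of time can be shown with a much simpler technique than a GAN: black-box probing. In this sketch the deployed detector is reduced to a hidden threshold on an abstract event rate, and the attacker sees only its verdicts. The detector, the threshold and the workload figure are all hypothetical.

```python
def detector(events_per_hour):
    """Stand-in for a deployed SIEM + AI pair, reduced to a hidden threshold.

    The caller can observe only the verdict, never the threshold itself.
    """
    return events_per_hour > 4  # True means "attack detected"


def highest_undetected_rate(is_detected, upper=1000):
    """Binary search for the largest rate that the detector lets through.

    Assumes a rate of 0 passes and a rate of `upper` is detected.
    """
    low, high = 0, upper  # invariant: low passes, high is detected
    while high - low > 1:
        mid = (low + high) // 2
        if is_detected(mid):
            high = mid
        else:
            low = mid
    return low


rate = highest_undetected_rate(detector)
actions_needed = 10_000  # hypothetical size of the attacker's task

print(rate)                   # 4
print(actions_needed / rate)  # 2500.0 hours: slow, but never flagged
```

A handful of queries is enough to locate the boundary without knowing the training set or the configuration; after that, staying undetected costs only patience.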

* * *

We can come up with more examples. Artificial intelligence cannot possibly detect all instances of X and, more importantly, once attackers know we are using technique Y to detect attacks, they may train an AI to circumvent Y’s detection. Hence the double-edged sword that is artificial intelligence.

The Berryville Institute of Machine Learning (BIML) focuses on building a taxonomy of known attacks on ML and of different aspects of ML risk. Their website includes an extensive bibliography on the pitfalls of using machine learning to build defenses.

This is open research. New frontiers are explored daily, and it is becoming clear that, a priori, every machine learning classification algorithm is vulnerable to attacks. There is no silver bullet.


[1] Hong Chen, Liwei Geng, Hongdong Zhao, Cuijie Zhao and Aiyong Liu. “Image recognition algorithm based on artificial intelligence.” Neural Computing and Applications, volume 34, pp. 6661–6672 (2022).

[2] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville and Yoshua Bengio. “Generative adversarial networks.” Communications of the ACM, volume 63, issue 11, pp. 139–144 (2020).

[3] Nicholas Boucher, Ilia Shumailov, Ross Anderson and Nicolas Papernot. “Bad Characters: Imperceptible NLP Attacks.” https://arxiv.org/abs/2106.09898 (2021).
