Thursday, April 13, 2017

"Magic AI: These are the Optical Illusions that Trick, Fool, and Flummox Computers"

Following up on yesterday's "Another Way To Fool The Facial Recognition Algos" in general, and more specifically on the MIT-linked "Adversarial Images, Or How To Fool Machine Vision" post.

First though a bit of housekeeping.
Just so you know, I don't actually use the make-up techniques featured in the earlier posts. Although they have some efficacy at fooling the camera, they make you look like a moron to human observers on the street. Better to just put on some glasses and blend into the crowd.

More on Adversarial Images, this time at The Verge, April 12, 2017:
There’s a scene in William Gibson’s 2010 novel Zero History, in which a character embarking on a high-stakes raid dons what the narrator refers to as the “ugliest T-shirt” in existence — a garment which renders him invisible to CCTV. In Neal Stephenson’s Snow Crash, a bitmap image is used to transmit a virus that scrambles the brains of hackers, leaping through computer-augmented optic nerves to rot the target’s mind. These stories, and many others, tap into a recurring sci-fi trope: that a simple image has the power to crash computers. 

But the concept isn’t fiction — not completely, anyway. Last year, researchers were able to fool a commercial facial recognition system into thinking they were someone else just by wearing a pair of patterned glasses. A sticker overlay with a hallucinogenic print was stuck onto the frames of the specs. The twists and curves of the pattern look random to humans, but to a computer designed to pick out noses, mouths, eyes, and ears, they resembled the contours of someone’s face — any face the researchers chose, in fact. These glasses won’t delete your presence from CCTV like Gibson’s ugly T-shirt, but they can trick an AI into thinking you’re the Pope. Or anyone you like.

Researchers wearing simulated pairs of fooling glasses, and the people the facial recognition system thought they were.
Image by Mahmood Sharif, Sruti Bhagavatula, Lujo Bauer, and Michael K. Reiter

These types of attacks are bracketed within a broad category of AI cybersecurity known as “adversarial machine learning,” so called because it presupposes the existence of an adversary of some sort, in this case a hacker. Within this field, the sci-fi tropes of ugly T-shirts and brain-rotting bitmaps manifest as “adversarial images” or “fooling images,” but adversarial attacks can take other forms, including audio and perhaps even text. The existence of these phenomena was discovered independently by a number of teams in the early 2010s. They usually target a type of machine learning system known as a “classifier,” something that sorts data into different categories, like the algorithms in Google Photos that tag pictures on your phone as “food,” “holiday,” and “pets.”

To a human, a fooling image might look like a random tie-dye pattern or a burst of TV static, but show it to an AI image classifier and it’ll say with confidence: “Yep, that’s a gibbon,” or “My, what a shiny red motorbike.” Just as with the facial recognition system that was fooled by the psychedelic glasses, the classifier picks up visual features of the image that are so distorted a human would never recognize them.
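
To make the idea concrete, here is a minimal sketch of how a fooling image could be searched for. It is not the method of any paper mentioned in the article. It assumes a hypothetical toy linear classifier with made-up weights and labels, and uses plain random hill-climbing: start from static, change one pixel at a time, and keep a change only if the classifier's confidence in the chosen label goes up.

```python
import math
import random

LABELS = ["gibbon", "motorbike", "school bus"]  # hypothetical labels
N_PIXELS = 64

# Hypothetical toy weights: pixel i votes for class i % 3 and against the others.
WEIGHTS = [[1.0 if i % 3 == k else -0.5 for i in range(N_PIXELS)]
           for k in range(len(LABELS))]

def classify(image):
    """Return softmax class probabilities for a flat list of pixel values."""
    scores = [sum(w * p for w, p in zip(row, image)) for row in WEIGHTS]
    top = max(scores)  # subtract the max score for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def evolve_fooling_image(target, steps=2000, seed=0):
    """Hill-climb from random static toward high confidence in `target`."""
    rng = random.Random(seed)
    image = [rng.random() for _ in range(N_PIXELS)]
    start = classify(image)[target]
    best = start
    for _ in range(steps):
        candidate = list(image)
        candidate[rng.randrange(N_PIXELS)] = rng.random()  # mutate one pixel
        confidence = classify(candidate)[target]
        if confidence > best:  # keep only mutations that raise the confidence
            image, best = candidate, confidence
    return image, start, best

image, start_conf, final_conf = evolve_fooling_image(LABELS.index("gibbon"))
print(f"confidence in 'gibbon': {start_conf:.3f} at start, {final_conf:.3f} after search")
```

The search never looks at what a gibbon is; it only follows the classifier's own confidence score, which is why the result can mean nothing to a human while the classifier is nearly certain.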

These patterns can be used in all sorts of ways to bypass AI systems, and have substantial implications for future security systems, factory robots, and self-driving cars — all places where AI’s ability to identify objects is crucial. “Imagine you’re in the military and you’re using a system that autonomously decides what to target,” Jeff Clune, co-author of a 2015 paper on fooling images, tells The Verge. “What you don’t want is your enemy putting an adversarial image on top of a hospital so that you strike that hospital. Or if you are using the same system to track your enemies; you don’t want to be easily fooled [and] start following the wrong car with your drone.”

These scenarios are hypothetical, but perfectly viable if we continue down our current path of AI development. “It’s a big problem, yes,” Clune says, “and I think it’s a problem the research community needs to solve.”

The challenge of defending against adversarial attacks is twofold: not only are we unsure how to effectively counter existing attacks, but we keep discovering more effective attack variations. The fooling images described by Clune and his co-authors, Jason Yosinski and Anh Nguyen, are easily spotted by humans. They look like optical illusions or early web art, all blocky color and overlapping patterns, but there are far more subtle approaches to be used.

One type of adversarial image — referred to by researchers as a “perturbation” — is all but invisible to the human eye. It exists as a ripple of pixels on the surface of a photo, and can be applied to an image as easily as an Instagram filter. These perturbations were first described in 2013, and in a 2014 paper titled “Explaining and Harnessing Adversarial Examples,” researchers demonstrated how flexible they were. That pixely shimmer is capable of fooling a whole range of different classifiers, even ones it hasn’t been trained to counter. A recently revised study named “Universal Adversarial Perturbations” made this feature explicit by successfully testing the perturbations against a number of different neural nets — exciting a lot of researchers last month....

On the left is the original image; in the middle, the perturbation; and on the right, the final, perturbed image.
Image by Ian Goodfellow, Jonathon Shlens, and Christian Szegedy
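
The 2014 paper cited above introduced the "fast gradient sign method" for building such perturbations: nudge every pixel by a tiny fixed amount in whichever direction increases the classifier's error. Below is a minimal sketch of that idea, assuming a hypothetical toy logistic classifier with made-up weights and a made-up 1,000-pixel "image" rather than the deep networks used in the paper.

```python
import math

def predict(w, b, x):
    """Probability of class 1 under a logistic (linear) classifier."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(w, b, x, y, eps):
    """Shift each pixel by +/-eps in the direction that raises the loss for label y."""
    p = predict(w, b, x)
    # Gradient of the cross-entropy loss with respect to the input is (p - y) * w.
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

# Hypothetical toy model and input (all values are assumptions for illustration).
N = 1000
W = [0.1 if i % 2 == 0 else -0.1 for i in range(N)]
B = 0.0
EPS = 0.02
x = [0.51 if i % 2 == 0 else 0.49 for i in range(N)]  # classified as class 1

x_adv = fgsm(W, B, x, y=1, eps=EPS)

print(f"class-1 probability, original:  {predict(W, B, x):.3f}")
print(f"class-1 probability, perturbed: {predict(W, B, x_adv):.3f}")
print(f"largest per-pixel change:       {max(abs(a - c) for a, c in zip(x, x_adv)):.3f}")
```

In this toy setup no pixel moves by more than 0.02, yet the many small shifts add up across 1,000 pixels and flip the predicted class. That accumulation over a high-dimensional input is one reason a barely visible ripple can be enough.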