Researchers at Google Brain have discovered a technique by which a black-box decider, successfully trained for one task, can be made to perform an unrelated computation: embed the inputs for that computation in the input to the decider, and extract the result of the unrelated computation from the decider's output.
One of the proof-of-concept experiments that the paper describes uses ImageNet (strictly speaking, a classifier trained on the ImageNet dataset) for recognition of handwritten numerals. The inputs for the numeral-recognition problem are small images (twenty-eight pixels high and twenty-eight pixels wide), and the task is to determine which of the ten decimal numerals each input represents. Normally ImageNet takes much larger, full-color images as inputs and outputs a tag identifying what's in the picture, chosen from a fixed list of a thousand tags. Numerals aren't on that list, so ImageNet never outputs a numeral. It's not designed to be a recognizer for handwritten numerals.
But ImageNet can be co-opted. The researchers took the first ten tags from the ImageNet tag list and associated them with numerals (tench ↦ 0, goldfish ↦ 1, and so on). Then they set up an optimization problem: find the pattern of pixels making up a large image that maximizes ImageNet's success in “interpreting” the images that result when each small image from the numeral-recognition training set is embedded at the center of the large image. An interpretation counts as correct, for this purpose, if ImageNet returns the tag that is mapped to the correct numeral.
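The remapping is nothing more than a lookup table over the first ten entries of the standard ImageNet class list. A minimal sketch in Python (the dictionary and function names here are invented for illustration):

```python
# The first ten tags in the standard ImageNet class ordering, mapped to
# decimal digits. The classifier still answers, say, "ostrich"; this
# table just reinterprets that answer as the digit 9.
IMAGENET_TAG_TO_DIGIT = {
    "tench": 0, "goldfish": 1, "great white shark": 2, "tiger shark": 3,
    "hammerhead": 4, "electric ray": 5, "stingray": 6, "cock": 7,
    "hen": 8, "ostrich": 9,
}

def tag_to_digit(tag):
    """Reinterpret an ImageNet tag as the digit it was mapped to."""
    return IMAGENET_TAG_TO_DIGIT[tag]
```

The classifier itself never changes; only the meaning assigned to its outputs does.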
The pixel pattern that emerges from this optimization looks like video snow; it has no human-recognizable elements. When one of the small handwritten numerals is embedded at the center, the image looks to a human being like a white handwritten numeral in a small black square surrounded by random-looking video snow. But if the numeral is a 9, ImageNet thinks the image looks very much like an ostrich, whereas if it's a 3, ImageNet thinks it depicts a tiger shark.
Note that ImageNet is not being retrained here and isn't doing anything that it wouldn't do right out of the box. The “training” step is just solving the optimization problem: What pattern of pixels will most effectively trick ImageNet into doing the computation we want when the input data for our problem is embedded in that pattern of pixels?
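The mechanics of that optimization can be sketched in a few dozen lines. This is not the paper's setup: as a stand-in for ImageNet it uses a small frozen random network, and as a stand-in for the numeral data it uses random images with deliberately imbalanced toy labels (so the demo visibly reduces loss); all sizes, names, and the learning rate are invented. What it does show faithfully is the mechanism: gradient descent updates only the border pixels, while the central window and the frozen weights never change.

```python
import numpy as np

rng = np.random.default_rng(0)
BIG, SMALL, HIDDEN, N_CLASSES = 40, 28, 64, 10

# Frozen stand-in for a trained classifier: a fixed random two-layer MLP.
# In the paper this would be an ImageNet network; nothing below updates it.
W1 = rng.normal(scale=0.05, size=(BIG * BIG, HIDDEN))
W2 = rng.normal(scale=0.05, size=(HIDDEN, N_CLASSES))

def forward(x):                          # x: (batch, BIG*BIG)
    h = np.maximum(x @ W1, 0.0)          # frozen ReLU features
    logits = h @ W2
    logits = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(logits)
    return h, e / e.sum(axis=1, keepdims=True)

# Pixels outside the central SMALL x SMALL window belong to the program.
mask = np.ones((BIG, BIG))
lo = (BIG - SMALL) // 2
mask[lo:lo + SMALL, lo:lo + SMALL] = 0.0
mask = mask.ravel()

# Toy stand-in for the numeral data: random small images, with labels
# drawn from only three classes so a constant program can visibly help.
n = 128
small = rng.random((n, SMALL, SMALL))
labels = rng.integers(0, 3, size=n)

# Embed each small image at the center of an otherwise blank canvas.
canvas = np.zeros((n, BIG, BIG))
canvas[:, lo:lo + SMALL, lo:lo + SMALL] = small
canvas = canvas.reshape(n, -1)

program = np.zeros(BIG * BIG)            # the adversarial program
losses = []
for step in range(150):
    x = canvas + program * mask          # program fills only the border
    h, p = forward(x)
    losses.append(-np.log(p[np.arange(n), labels] + 1e-12).mean())
    # Manual backprop of the mean cross-entropy down to the input pixels.
    dlogits = p.copy()
    dlogits[np.arange(n), labels] -= 1.0
    dh = (dlogits @ W2.T) * (h > 0)
    dx = dh @ W1.T
    program -= 0.5 * dx.mean(axis=0) * mask   # only program pixels move

print(f"loss: {losses[0]:.3f} to {losses[-1]:.3f}")
```

The masked gradient step is the whole trick: the network's weights appear only in the forward and backward passes, never on the left-hand side of an update.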
The researchers call the optimized pixel patterns “adversarial programs.”
Besides the numeral-recognition task, the researchers were also able to trick ImageNet — six different ImageNet-trained networks, in fact — into doing two other tasks (counting the squares in an image, and CIFAR-10 image classification), just by finding optimal pixel patterns — adversarial programs — in which to embed the input data.
“Adversarial Reprogramming of Neural Networks”
Gamaleldin F. Elsayed, Ian Goodfellow, and Jascha Sohl-Dickstein, arXiv, June 28, 2018