The tool, called Nightshade, messes up training data in ways that could cause serious damage to image-generating AI models. It is intended as a way to fight back against AI companies that use artists’ work to train their models without the creator’s permission.
ARTICLE - Technology Review
ARTICLE - Mashable
ARTICLE - Gizmodo
The researchers tested the attack on Stable Diffusion’s latest models and on an AI model they trained themselves from scratch. When they fed Stable Diffusion just 50 poisoned images of dogs and then prompted it to create images of dogs itself, the output started looking weird—creatures with too many limbs and cartoonish faces. With 300 poisoned samples, an attacker can manipulate Stable Diffusion to generate images of dogs to look like cats.
I’m interested to know how they fool the AI while keeping it invisible to the human eye. Do they make additional layers? Do they change every nth pixel? Is every poisoning associated with another poisoned object? (Will a dog always be poisoned towards a cat, etc.?)
Interesting, but a bit hard to understand.
“how they fool the AI while keeping it invisible to the human eye”
My guess is that AI companies will try to scrape as much as possible without a human ever looking at the data.
When poisoned data starts to become enough of a problem that humans have to look over every sample, this would increase training costs to a point where it’s no longer worth bothering with in the first place.
Boy, these conservative artists just keep trying, bless their little hearts. Nobody tell them adversarial training was invented by us already.
I absolutely love this. I’m not even an artist, but I’m giddy over this.