Stock Photo Companies Randomize Their Watermarks to Foil Google’s Thieving Algorithm

Last month at the Computer Vision and Pattern Recognition conference, Google showed off an algorithm capable of removing watermarks from photos. Using neural networks, researchers in the company’s artificial intelligence lab could train an algorithm to identify recurring visual patterns in a watermark (the sans-serif type in Shutterstock’s logo, or the dense logomark of Adobe Stock, for instance) and automatically strip them from an image.

To anyone who produces or sells stock photography, this was troubling news—watermarks have since the early 1990s provided the first line of defense against the theft of unlicensed photos. Granted, anyone with serious Photoshop skills can eliminate a watermark in about an hour. But Google’s technology makes it possible for a computer to remove watermarks from hundreds of images in just minutes, essentially automating the wholesale theft of copyrighted images.

Google didn’t set out to undermine stock photo companies. It was simply doing yet more research into machine learning and how it might apply to images. In fact, the algorithm could one day help make your photos a little better. And the researchers gave the stock photo firms a heads-up, contacting them weeks before demonstrating the algorithm to explain the research and show how, as they outline in their paper, the firms might protect themselves. “They gave us sufficient time so we could mitigate the risk,” says Sultan Mahmood, director of engineering for content at Shutterstock, one of the web’s stock photo giants.

Mahmood and his team have spent the past month developing an algorithm capable of tricking Google’s technology by giving each of Shutterstock’s 150 million photos a unique watermark. To understand why that works, you must know what Google’s algorithm does.

The issue, as Google explains it, is that most stock photo companies apply the same watermark to their entire image library. Feed enough images—fewer than 1,000 in this case—into a neural network, and eventually the network discerns patterns within the watermark. It can identify, for example, gradients, opacity, and shadows, which means even the most complex geometries can be isolated.

Google’s algorithm can separate the foreground (the watermark) from the background (the photograph) and then begin removing the watermark. “If a similar watermark is embedded in many images, the watermark becomes the signal in the collection and the images become the noise, and simple image operations can be used to pull out a rough estimation of the watermark pattern,” the researchers write.
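To make the "signal versus noise" idea concrete, here is a minimal sketch in Python with NumPy. It is not Google's code. It assumes every image carries the same overlay at the same position, models the blend as J = alpha * W + (1 - alpha) * I, and uses the per-pixel median of image gradients across the collection as the "simple image operation." The function names `composite` and `estimate_watermark_gradient` are hypothetical, chosen for illustration.

```python
import numpy as np


def composite(photo, mark, alpha):
    """Blend a watermark over a photo: J = alpha * W + (1 - alpha) * I.

    This blending model is an assumption made for the sketch.
    """
    return alpha * mark + (1.0 - alpha) * photo


def estimate_watermark_gradient(images):
    """Take the per-pixel median of image gradients across a collection.

    The photos differ from image to image, so their gradients tend to
    cancel out in the median. The shared overlay does not, so its edges
    survive as a rough outline of the watermark.
    """
    stack = np.stack(images)        # shape (N, H, W), grayscale
    gx = np.diff(stack, axis=2)     # horizontal differences, (N, H, W-1)
    gy = np.diff(stack, axis=1)     # vertical differences, (N, H-1, W)
    return np.median(gx, axis=0), np.median(gy, axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    alpha = np.zeros((32, 32))
    alpha[8:24, 8:24] = 0.5         # a square, half-transparent "logo"
    mark = np.ones((32, 32))        # white overlay
    # Random noise stands in for a library of unrelated photographs.
    images = [composite(rng.random((32, 32)), mark, alpha) for _ in range(500)]
    gx, gy = estimate_watermark_gradient(images)
    print("median gradient at the logo edge:", gx[16, 7])
    print("median gradient away from the logo:", gx[2, 2])
```

In this toy setup the median gradient is clearly nonzero along the logo's outline and close to zero elsewhere, which is the sense in which a repeated watermark becomes the signal. If the overlay differed from image to image, the edges would no longer line up and the median would wash them out as well.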

After isolating the watermark, the algorithm can erase the overlay and fill in the blank pixels by extrapolating the surrounding image data.
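The two steps in that sentence can be sketched in a few lines of Python with NumPy. This is an illustration under stated assumptions, not the researchers' method: it assumes the watermark image and its opacity map are already known exactly, inverts the blend J = alpha * W + (1 - alpha) * I, and fills fully hidden pixels by repeatedly averaging their four neighbors, a basic diffusion-style fill. The names `remove_overlay` and `fill_holes` are hypothetical.

```python
import numpy as np


def remove_overlay(watermarked, mark, alpha):
    """Invert J = alpha * W + (1 - alpha) * I to recover the photo I.

    Assumes the mark and its opacity map are known. Where alpha is 1 the
    photo is fully hidden, so the denominator is clipped and those pixels
    should be passed to fill_holes instead.
    """
    denom = np.clip(1.0 - alpha, 1e-6, None)
    return (watermarked - alpha * mark) / denom


def fill_holes(image, hole_mask, iterations=200):
    """Fill masked pixels by repeatedly averaging their four neighbors.

    A simple diffusion-style fill used here as a stand-in for
    'extrapolating the surrounding image data'.
    """
    out = image.astype(float).copy()
    out[hole_mask] = out[~hole_mask].mean()   # neutral starting guess
    for _ in range(iterations):
        p = np.pad(out, 1, mode="edge")
        avg = (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]) / 4.0
        out[hole_mask] = avg[hole_mask]       # known pixels stay fixed
    return out
```

A fill like this reproduces smooth regions well but blurs texture and fine detail, which is one reason removal is easiest when the overlay is semi-transparent and most of the underlying photo can be recovered directly.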
