Our Veil Content Protection Benchmarks
Last updated
There are several content protection techniques available to content creators, from digital watermarking services to more advanced technologies like Glaze and Mist.
At Living Assets, we've developed Veil to let you easily protect image files from non-consensual AI scraping through our Discord. In this section, we'll briefly discuss our intention with this service and our benchmarks of Veil's effectiveness.
If this topic interests you, we encourage you to Join Our Discord.
In the three drop-downs below, you can see the Original image, an example of the adversarial Veil that is applied to it, and the Veiled image after the Veil has been applied.
Computers see images and videos as rows and columns of numbers.
If we want to know the impact Veil has on those numbers, Living Assets measures that impact in two ways.
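The idea that an image is just rows and columns of numbers can be sketched in a few lines of Python. The perturbation values below are illustrative, not Veil's actual technique:

```python
import numpy as np

# A tiny 3x3 grayscale "image": each number is a pixel brightness (0-255).
original = np.array([
    [ 12,  40,  80],
    [ 40, 128, 200],
    [ 80, 200, 255],
], dtype=np.uint8)

# A hypothetical veil: a small perturbation added to every pixel.
# (Veil's real perturbation is more sophisticated; this just shows the idea.)
perturbation = np.array([
    [ 3, -2,  4],
    [-1,  5, -3],
    [ 2, -4,  1],
])

veiled = np.clip(original.astype(int) + perturbation, 0, 255).astype(np.uint8)

print(original.shape)   # (3, 3) -- rows and columns of numbers
print(veiled[0, 0])     # 15 -- the top-left pixel, shifted by the veil
```

To a human viewer these two arrays look essentially identical; the two metrics below quantify how different they are to a machine.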
Structural dissimilarity tells us how dissimilar images are to one another.
Big differences (more than 10%) indicate that we've created a new data object.
We can be assured that the object ingested into a machine learning model, without our permission, is fundamentally different from what the scraper was looking for.
The percentage is the level of dissimilarity: high dissimilarity implies greater stochasticity and entropy in statistical models.
Structural Dissimilarity: To an AI model, the Veiled image is 56.41% different from the original. (That's great for you, and bad for non-consensual AI trainers! 😉)
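A common way to compute structural dissimilarity is as the complement of the SSIM index. The sketch below uses a simplified single-window SSIM; Veil's actual metric may be computed differently (an assumption on our part):

```python
import numpy as np

def structural_dissimilarity(x, y):
    """SSIM-based dissimilarity: 0 = identical, larger = more different.

    Simplified, single-window version of the standard SSIM formula.
    Multiply by 100 to express it as a percentage.
    """
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    c1, c2 = (0.01 * 255) ** 2, (0.03 * 255) ** 2  # standard SSIM constants
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    ssim = ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
    return 1.0 - ssim

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64))
noisy = np.clip(img + rng.normal(0, 25, size=img.shape), 0, 255)

print(round(structural_dissimilarity(img, img), 4))  # 0.0 -- identical images
print(structural_dissimilarity(img, noisy) > 0.0)    # True -- perturbation detected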
Measuring Mutual Information tells us how much the Veiled image and the original have in common.
We know how different it is, but how adversarial is that for an AI scraper?
We answer the question, "How disrupted will an AI model be after it ingests veiled content?"
We've shifted the concepts that a statistical model tries to predict when comparing our Veiled asset to things it already knows.
Normalized Mutual Information (NMI): 1.42 In the example above, this is the NMI measurement between the Original and the Veiled image.
- NMI < 1 represents an image no longer recognizable to a human, as too much information is lost.
- NMI == 2 represents near-identical correlation between information spaces.
- 1 < NMI < 2 represents the "sweet spot": maximum disruption for illicit training with no change to the user experience.
We can use structural dissimilarity and mutual information to model the impact our adversarial techniques will have on a statistical AI model.
By introducing these slightly changed images via Veil, we're able to confuse an AI model.
It starts to see things differently, leading to more mistakes in recognizing what’s in the images.
This is like a librarian putting books on the wrong shelves because of minor changes in their appearance, leading to confusion and misplacement.
One misplaced library book is not a big deal. But most AI models have millions or billions of parameters. Imagine more and more misplaced books in a giant library of four billion books. Over time, the misplacements add up and ultimately the library is no longer useful. It is simply a maze of books without organization.
In an AI model's high-dimensional embedding space, each feature (e.g., fur texture, snout shape) corresponds to certain dimensions. The Veil changes the values along these dimensions, causing the embedding to move from the original "cat" position.
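This shift can be pictured as a distance between two vectors. The 8-dimensional embeddings and feature interpretations below are purely illustrative; real models use hundreds or thousands of dimensions:

```python
import numpy as np

# Hypothetical embeddings for the same image before and after veiling.
# Each dimension loosely stands for a learned feature (fur texture,
# snout shape, ...) -- an illustrative assumption, not a real model's output.
original_cat = np.array([0.9, 0.8, 0.7, 0.1, 0.2, 0.3, 0.5, 0.6])
veiled_cat   = np.array([0.6, 0.8, 0.2, 0.1, 0.5, 0.3, 0.5, 0.9])

# The veil perturbs values along some dimensions, moving the embedding
# away from the original "cat" position in the model's feature space.
shift = veiled_cat - original_cat
distance = np.linalg.norm(shift)   # Euclidean distance in embedding space

print(np.count_nonzero(shift))     # 4 -- dimensions the veil changed
print(round(distance, 2))          # 0.72 -- how far the embedding moved
```

The farther the veiled embedding lands from the original, the less the model's learned concept of "cat" applies to what it actually ingested.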
In an AI model with 100 million parameters, the example Veiled image's embedding lands 2,168,637.7 units away from the original.
In a dictionary, the definition of "cat" might be something like:
Cat: A small domesticated carnivorous mammal with soft fur, a short snout, and retractile claws. Often kept as a pet or for catching mice.
With the Veil, the embedding shifts slightly. Imagine this as adding a few additional words or altering a few terms in the dictionary definition:
Cat: A small domesticated carnivorous mammal with slightly rough fur, an unusual snout shape, and claws that are not entirely retractile. Sometimes seen in atypical environments.
This metric tells us how much the librarian's confusion affects the organization of the library.
A higher score (greater than 1000) means the librarian is making more mistakes in placing books on the correct shelves.
What is the average parameter shift likely to be observed in a statistical model? In other words: how much are we "throwing off" an AI model?
Impact Estimate: 2,168,637.47 With a single Veiled image, we've impacted the accuracy of an AI model with 100 million parameters by shifting around 2 million of those parameters.
By what factor do we estimate each individual parameter associated with the Veiled image will shift, compared with the original?
Parameter Shift Statistic: -0.4 In other words, each individual parameter is likely to shift by a factor of about -0.4.
If this topic interests you, we encourage you to Join Our Discord.