Our Veil Content Protection Benchmarks
Last updated
There are several content protection techniques available to content creators, from digital watermarking services to more advanced technologies like Glaze and Mist.
At Living Assets, we've developed Veil to let you easily protect image files from non-consensual AI scraping through our Discord. In this section, we'll briefly discuss our intention with this service and our benchmarks of Veil's effectiveness.
If this topic interests you, we encourage you to Join Our Discord.
In the three drop-downs below, you can see the Original image, an example of the adversarial Veil that is applied to it, and the Veiled image after the Veil has been applied.
Computers see images and videos as rows and columns of numbers.
If we want to know the impact Veil has on those numbers, Living Assets measures that impact in two ways.
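The idea that an image is just rows and columns of numbers can be sketched in a few lines of Python. The perturbation values below are illustrative, not Veil's actual technique:

```python
import numpy as np

# A tiny 3x3 grayscale "image": each number is a pixel brightness (0-255).
original = np.array([
    [ 12,  40,  80],
    [ 40, 128, 200],
    [ 80, 200, 255],
], dtype=np.uint8)

# A hypothetical veil: a small perturbation added to every pixel.
# (Veil's real perturbation is more sophisticated; this just shows the idea.)
perturbation = np.array([
    [ 3, -2,  4],
    [-1,  5, -3],
    [ 2, -4,  1],
])

veiled = np.clip(original.astype(int) + perturbation, 0, 255).astype(np.uint8)

print(original.shape)   # (3, 3) -- rows and columns of numbers
print(veiled[0, 0])     # 15 -- the top-left pixel, shifted by the veil
```

To a human viewer these two arrays look essentially identical; the two metrics below quantify how different they are to a machine.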
Structural dissimilarity tells us how dissimilar images are to one another.
Big differences (more than 10%) indicate that we've created a new data object.
We can be assured that the object ingested into a machine learning model, without our permission, is fundamentally different from what the scraper was looking for.
The percentage is the level of dissimilarity: high dissimilarity implies greater stochasticity and entropy in statistical models.
Structural Dissimilarity: To an AI model, the Veiled image is 56.41% different from the original. (That's great for you, and bad for non-consensual AI trainers! 😉)
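A common way to compute structural dissimilarity is as the complement of the SSIM index. The sketch below uses a simplified single-window SSIM; Veil's actual metric may be computed differently (an assumption on our part):

```python
import numpy as np

def structural_dissimilarity(x, y):
    """SSIM-based dissimilarity: 0 = identical, larger = more different.

    Simplified, single-window version of the standard SSIM formula.
    Multiply by 100 to express it as a percentage.
    """
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    c1, c2 = (0.01 * 255) ** 2, (0.03 * 255) ** 2  # standard SSIM constants
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    ssim = ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
    return 1.0 - ssim

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64))
noisy = np.clip(img + rng.normal(0, 25, size=img.shape), 0, 255)

print(round(structural_dissimilarity(img, img), 4))  # 0.0 -- identical images
print(structural_dissimilarity(img, noisy) > 0.0)    # True -- perturbation detected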
Measuring Mutual Information tells us how much the Veiled image and the original have in common.
We know how different it is, but how adversarial is that for an AI scraper?
We answer the question, "How disrupted will an AI model be after it ingests veiled content?"
We've shifted the concepts that a statistical model tries to predict when comparing our Veiled asset to things it already knows.
Normalized Mutual Information (NMI): 1.42 In the example above, this is the NMI measurement between the Original and the Veiled image.
- NMI < 1 represents an image no longer recognizable to a human, as too much information is lost.
- NMI == 2 represents near-identical correlation between information spaces.
- 1 < NMI < 2 represents the "sweet spot": maximum disruption for illicit training with no change to the user experience.
We can use structural dissimilarity and mutual information to model the impact our adversarial techniques will have on a statistical AI model.
By introducing these slightly changed images via Veil, we're able to confuse an AI model.
It starts to see things differently, leading to more mistakes in recognizing what’s in the images.
This is like a librarian putting books on the wrong shelves because of minor changes in their appearance, leading to confusion and misplacement.
One misplaced library book is not a big deal. But most AI models have millions or billions of parameters. Imagine more and more misplaced books in a giant library of four billion books. Over time, the misplacements add up and ultimately the library is no longer useful. It is simply a maze of books without organization.
In an AI model's high-dimensional embedding space, each feature (e.g., fur texture, snout shape) corresponds to certain dimensions. The Veil changes the values along these dimensions, causing the embedding to move from the original "cat" position.
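This shift can be pictured as a distance between two vectors. The 8-dimensional embeddings and feature interpretations below are purely illustrative; real models use hundreds or thousands of dimensions:

```python
import numpy as np

# Hypothetical embeddings for the same image before and after veiling.
# Each dimension loosely stands for a learned feature (fur texture,
# snout shape, ...) -- an illustrative assumption, not a real model's output.
original_cat = np.array([0.9, 0.8, 0.7, 0.1, 0.2, 0.3, 0.5, 0.6])
veiled_cat   = np.array([0.6, 0.8, 0.2, 0.1, 0.5, 0.3, 0.5, 0.9])

# The veil perturbs values along some dimensions, moving the embedding
# away from the original "cat" position in the model's feature space.
shift = veiled_cat - original_cat
distance = np.linalg.norm(shift)   # Euclidean distance in embedding space

print(np.count_nonzero(shift))     # 4 -- dimensions the veil changed
print(round(distance, 2))          # 0.72 -- how far the embedding moved
```

The farther the veiled embedding lands from the original, the less the model's learned concept of "cat" applies to what it actually ingested.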
In an AI model with 100 million parameters, the example Veiled image's embedding lands 2,168,637.7 units away from the original.
In a dictionary, the definition of "cat" might be something like:
Cat: A small domesticated carnivorous mammal with soft fur, a short snout, and retractile claws. Often kept as a pet or for catching mice.
With the Veil, the embedding shifts slightly. Imagine this as adding a few additional words or altering a few terms in the dictionary definition:
Cat: A small domesticated carnivorous mammal with slightly rough fur, an unusual snout shape, and claws that are not entirely retractile. Sometimes seen in atypical environments.
This metric tells us how much the librarian's confusion affects the organization of the library.
A higher score (greater than 1000) means the librarian is making more mistakes in placing books on the correct shelves.
What is the average parameter shift likely to be observed in a statistical model? In other words: how much are we "throwing off" an AI model?
Impact Estimate: 2,168,637.47 With a single Veiled image, we've impacted the accuracy of an AI model with 100 million parameters by shifting around 2 million of those parameters.
By what factor do we estimate each individual parameter associated with the Veiled image will shift, compared with the original?
Parameter Shift Statistic: -0.4 In other words, each individual parameter is likely to shift by a factor of about -0.4.
If this topic interests you, we encourage you to Join Our Discord.