How AI is stealing your art

Today I want to take a closer look at AI, or machine learning, from the perspective of an artist. After looking at what this technology is, how it works, and how it is changing the creative community, I’m going to tell you how AI stole my own art and how I think about the entire topic as an illustrator.

Lately, there’s been a lot of talk about AI-generated art and texts on the internet, and it has caused a lot of discussion in the creative community. AI generators are using artworks, texts, and all kinds of intellectual property to generate new versions of them – without the consent of the human creators. This is already changing our culture, and it raises a lot of questions about the value of creating any kind of art.

What is AI

In case you don’t know, AI stands for artificial intelligence, a term that’s a bit misleading: right now, in 2023, there’s no actual intelligence as we understand it behind the processes and algorithms of the tools described as AI. The AI language models and image generators have no real understanding of the data they’re processing, they don’t make decisions, they aren’t aware of themselves, and they do not have an understanding of the world. Don’t assign them human qualities just yet: They’re clever algorithmic programming, and they put together their answers or images from the data they were fed before (the companies call that “training”). A less misleading term than AI would be machine learning, although – again – the machines in question are not doing any independent learning similar to humans or other lifeforms. For all those limitations, they’re shockingly effective in a few areas – there are texts and images floating around the web that you only recognize as AI-generated at second glance.

How does AI work

Let’s take a look at how AI works. The language models and image generators need samples before they can generate anything – they can’t create text or images from nothing. The companies call this “training” – the AI is being provided huge datasets that are run through several algorithms. Where does that data come from? From us. Human creations. Books, media, the internet. In the case of image generators, the entire internet is scraped so that as many images as possible can be put into datasets for the AI to learn from. It’s very probable that what you can read here will soon be scraped for Chat GPT (a very refined chat bot) to integrate into its training dataset, and I’ve already found my art in the datasets of image generators like Stable Diffusion (more on that in a bit). Basically, if you can see it on the internet, it will be taken. Without our data (or “content”, as corporations like to call it), none of these tools would exist.

The current limitations of AI

AI-generated art can often look a bit creepy, since the computer doesn’t have an understanding of human anatomy, for example. It just puts together new images based on thousands of images that were put into a dataset. So sometimes AI-generated images have weird-looking humans with 7 fingers or deformed heads. But it can more or less reliably spit out a red panda dancing in the pale moonlight in the style of Albrecht Dürer if you want it to. Every art style of the past or present is within its reach, provided it has been fed with enough scraped data about it. If it’s on the internet, it’s very likely you can find it in a dataset used by AI image generators. AI art is already beginning to flood stock image services. It’s also already being used by magazines and web publications, especially in fields like editorial illustration that are often based on very abstract concepts and imagery. We are only in the beginning stages of this technology, so it’s very likely that the results will become more refined over time.

Texts written by language models often seem weirdly hollow, but they can imitate all kinds of styles. Poems, product reviews, summaries of novels, talks, code, fairy tales, scientific papers – there’s nothing Chat GPT won’t generate for you, and it always presents the results sounding very confident and matter-of-fact, even if the facts don’t hold up and are entirely fabricated. In the case of scientific papers, it can make up non-existent citations to make the text sound like a scientific paper. In the case of code, it can produce code that won’t work. If you ask it to summarize a novel, it will do so without even having access to the novel, and simply claim random things about it. If you ask it to write your bio, it will do so and simply make stuff up. This technology has no built-in fact checker, and it doesn’t announce that it is making things up out of thin air – or rather out of the data it has been fed, which it puts together in convincing-sounding sentences.

How AI is changing the creative industry

For all of its current limitations, AI and computer-generated images or texts could make a huge part of today’s creative industry obsolete in the near future. If you use artificial image generation tools like Stable Diffusion or Midjourney to put together illustrations (think of editorial illustrations, children’s books, book covers), and let language models like Chat GPT write your texts (think of essays, articles, novels), you don’t need artists or writers anymore. Or do you? This is not just a dystopian fantasy, it’s happening all over the web and in the real world right now.

The publisher Tor Books recently had to admit they had been using an AI image generator for the cover art of one of their books. There is even a children’s book that has been created entirely through the new technology: it was written using Chat GPT, and the illustrations were generated with Midjourney, an AI image generator. You just need to take a look at it to see that right now, it cannot really replace human writers and illustrators, but give the technology a few years and this might look very different.

The internet is also already being flooded with texts generated through Chat GPT (with no one really fact-checking them). I’ve heard from a freelance writer who was contacted by their client: the writer’s services wouldn’t be needed anymore for writing entire texts, but the client offered to let them edit text generated by AI (to make it sound more human), for a much lower fee. Buzzfeed recently announced they will use AI for content creation, and their stock went up 200%. They will also cut their workforce by 12%. There are schools which have already banned Chat GPT, because students (of course) will use the technology to write essays for them – with no consideration for, let’s say, historical facts. The text generator also likes to “balance out” more controversial topics, in the sense of false balance: for example, it will give arguments why colonial racism or the Holocaust might not have been entirely negative. I hope you can see why this is a very scary development.

From a more poetic and existential perspective, right from the heart of an artist, the musician Nick Cave has commented on what he thinks about Chat GPT writing song lyrics in the style of Nick Cave. (If you’re not familiar with Nick Cave & the Bad Seeds, or any of his other work, definitely give it a listen.)

And soon, the internet might become unusable through tools like this: Even now, the internet is already flooded with bullshit texts that sound like they’ve been through a translation engine a few dozen times. Maybe you’ve already noticed that searching for high quality information on the internet has already become almost impossible – the search results are filled with scammy sites that are just generated and optimized for search engine results, mainly to squeeze a bit of ad money out of them. When I search for things like “What’s the difference between Mac and Windows computers” or “What are foods with a lot of iron” (or even “How to care for your hamster”) I either get a spammy bot-generated site with nonsense text, or a video that has been created by someone who reads those spammy texts out loud. This will only get worse with tools like Chat GPT, but hey, at least we’ll get AI-generated art next to it to brighten up the text! And it’s already happening: There are questions on Quora (a platform for getting answers to questions posted by its users) that are flooded with copy & paste answers by Chat GPT. In this case, it was not as bad, because the questions were also generated by Chat GPT!

You can often distinguish AI writing from human writing through its tendency to provide very long, non-committal answers, a bit like writing that is optimized for search engines (which makes sense if the language model sources a lot of its texts from the internet).

Of course you need to make sure that your precious AI language model itself isn’t trained on those stylistically questionable robot-written texts, but only on human texts – so there will likely be a watermark to make sure of that. I’m sure there will be similar watermarks in AI-generated images. And this will also be the business model behind Chat GPT – imagine how you can sell this stuff to schools, or newspapers, because they depend on detecting AI-generated texts. By the way, Microsoft has already planned to make AI text creation a part of MS Word.

Who is funding and creating these AI tools?

Billionaire investors are funding Silicon Valley start-up companies with venture capital, and those companies then build these tools. These are the same kind of people who have brought us social media and corporate platforms that destroy existing communities and the way societies and the public work. The people behind this proudly call this “disrupting” and “breaking things”. Let’s take a look at our AI friends: The LAION-5B dataset that provides the data for the image generator Stable Diffusion is put together by a group funded by venture capital. Their website (laion.ai – not gonna link it directly) says “We believe that machine learning research and its applications have the potential to have huge positive impacts on our world and therefore should be democratized.” “Democratizing” in this case simply means stealing data from people.

How I found out that an AI had stolen my art

So just out of curiosity I wanted to check if my art has been scraped from the web and used by Stable Diffusion. They graciously allow you to do that on haveibeentrained.com. The site is run by Spawning.ai, and they state in their Q&A that “copyright is an outdated system that is a bad fit for the AI era.” They also state that “We believe that […] the artist community will benefit from this [AI] training to be consensual.” Let me explain briefly why I shudder when I read this: As an artist, my intellectual property rights are the most important rights I have. If I can’t manage the rights to my art and decide who is allowed to do what with it, it becomes worthless. If everyone can just take it, how am I going to get paid?

I didn’t expect to find anything because my art isn’t like the digital AI art style that you see a lot from these image generators. I go field sketching and create scientific illustrations, and I’m not a very well-known artist either. But I was curious. The search result was a surprise. The site found countless examples of my art, scraped directly from my website and from other places on the web. What I found really unsettling is that the algorithm grouped many of my pictures directly together. I’m not sure if it really does that based on my style or on alt text or URL properties, but it was a creepy moment.

What I saw on haveibeentrained.com after I uploaded one of my images to see if it’s been scraped – it found plenty more! The images with the little box in the upper right corner are the ones I’ve marked for removal.

How you can remove your art from AI training datasets

The site claims to let you opt out of this kind of image scraping. To opt out of your images being used, you have to create an account and mark every single image by right-clicking it. As of right now, you can’t even exclude your own domain or mark several images at once. If you have lots of pictures of your art on the net, which artists tend to have, you will be busy for a long time.

Here’s the catch: The “opt out” only works for future versions of the datasets, which are the basis of machine learning systems like Stable Diffusion. LAION-5B, for example, is announced on their website as “a new era of open large-scale multi-modal datasets”. In their FAQ they mention that these datasets are “simply indexes to the internet”, meaning lists of URLs and alt texts. According to the FAQ, no images are stored in the dataset itself, and any images that might have been stored at some point to train the AI have been deleted again. They don’t address any other copyright issues in their FAQ (laion.ai/faq).
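To make the idea of a dataset as a list of URLs and alt texts a bit more concrete, here is a minimal Python sketch of what looking for your own domain in such a list could look like. Everything in it is an assumption for illustration: the sample rows, the domain names, and the function `rows_for_domain` are made up, and this is neither the real LAION file format nor how haveibeentrained.com works.

```python
from urllib.parse import urlparse

# Hypothetical index rows: (image URL, alt text). Only links and
# descriptions are listed here, no image data.
INDEX = [
    ("https://example-artist.com/img/heron.jpg", "watercolor sketch of a grey heron"),
    ("https://cdn.example.org/cat.png", "photo of a cat"),
    ("https://www.example-artist.com/img/fern.jpg", "ink drawing of a fern"),
]


def rows_for_domain(index, domain):
    """Return the rows whose URL host is `domain` or a subdomain of it."""
    matches = []
    for url, alt_text in index:
        host = (urlparse(url).hostname or "").lower()
        if host == domain or host.endswith("." + domain):
            matches.append((url, alt_text))
    return matches


if __name__ == "__main__":
    for url, alt_text in rows_for_domain(INDEX, "example-artist.com"):
        print(url, "|", alt_text)
```

A domain-wide lookup like this is the kind of thing the opt-out page doesn’t offer right now, which is why marking images one by one takes so long.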

I didn’t volunteer any of my images for web scraping. I certainly didn’t allow them to use any of my art, and I suspect none of the artists whose images I saw did. The people building these datasets and image generators are literally stealing our intellectual property, and are telling us it’s for our own good.

Can we protect our images and the copyright on our images?

A group of artists has just filed a lawsuit against Stability AI, Midjourney, and DeviantArt for their use of the image generator Stable Diffusion, which uses millions of copyrighted works. The stock image agency Getty Images has also sued Stability AI for copyright violation, but I don’t think they’re doing this to stop the technology, but rather to get good licensing deals in the future. It will still be interesting to see how these lawsuits play out.

Of course, we also need more direct resistance to this in the meantime. There are a few technical or legal measures that individuals can use to protect their images from being used like this. You can take technical measures, like adding a NoAI meta tag to your website, or tell web crawlers not to index your site via a small robots.txt file, but it’s uncertain if this will be respected by AI image scrapers. The opt-out on haveibeentrained.com I have described above should also work, even if it’s very cumbersome. Wired.com has an interesting piece about what else we can do in different areas to take back control over our images and texts.
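If you want to try the robots.txt route, here is a small sketch of what such a file could look like, plus a way to check what it says using Python’s built-in `urllib.robotparser`. Treat it as an illustration under assumptions: the file contents, the example URL, and the helper `is_allowed` are made up for this post. CCBot is the name Common Crawl publishes for its crawler, and it is used here only as an example. The check only tells you what a well-behaved crawler would do; it can’t force any scraper to obey.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt (an assumption, not a complete list of crawlers):
# block one named crawler from the whole site, allow everyone else.
ROBOTS_TXT = """\
User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if these robots.txt rules let `user_agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    image_url = "https://example.com/portfolio/heron.jpg"
    for agent in ("CCBot", "Googlebot"):
        print(agent, "allowed:", is_allowed(ROBOTS_TXT, agent, image_url))
```

The file itself goes into the root folder of your website, so that it can be reached at yourdomain/robots.txt.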

In any case, the discussion has just started, and I think it will be critical that we as a society find a way to establish rules around these technologies. It’s already hard for independent artists and writers to protect their intellectual property, but the companies behind these tools are on their way to turning our art into empty, replicated shards of bullshit while they’re data farming us and our creativity and selling it to us as a huge positive impact on the world. It’s not a world I want to live in.

16 thoughts on “How AI is stealing your art”

  1. This is just perfectly explained and really necessary!
    Thank you very much, Julia.
    I am going to share it everywhere I can find good readers. Awareness is a first step. I think we have to slow those developments so artists can figure out how to save Intellectual Property as a fundamental human right, but also to protect human creativity for the future.

    1. Thank you Georgina for spreading the word! Yes, artists should care about this, although it is all very depressing.

  2. This is a battle that is going to escalate beyond the control of every artist. We do not have the capacity (even the energy that’s going to be stolen from creative time) to take on the “big boys”. Copyright means nothing to thieves.
    Good on you for speaking up.

    1. Yes, big tech has the whole world in its grip by now. I still hope there are people (or governments) with the time and energy to take those companies on.

  3. I have a feeling that the AI generating companies are not going to respect copyright, no matter what the results of how-ever many lawsuits happen to be. If they did respect copyright, they just about could not develop their applications. Almost everything is subject to copyright restrictions.

    Those things not subject to copyright are limited to public domain items, either because the artist/author is long dead or because the original creator put those items in the public domain themselves. There simply isn’t enough public domain content for a creditable AI generator to work with.

    The only thing I can be absolutely sure of, is that there is no simple resolution to the problem.

  4. Sounds really creepy. Unfortunately, it is very likely that despite all the protests this AI marching will continue and general public will greet it with hoorays and applause as they will be made to believe this being a good and innovative “new normal” (with just some weirdos not wanting to accept this for some “unknown” reason – probably they are just selfish and do not want to share what they have created, you know). People are by nature lazy and will only gladly accept that there is somebody writing, e.g. essays, for them. By the way, recently there were several articles about this in the Economist magazine – they explained, rather objectively to my mind, how it works and how the owners of AI and its content can become rulers of the world. There was not so much about the intellectual property thing, more about the generating abilities of AI – and it was scary because the authors of the articles were quite sure of the AI ability to produce very human-like writing and art.

  5. Very unbelievable this is happening. Thank you for sharing your knowledge in a straightforward way. I appreciate your blog. Keep it up!

  6. I was having this exact same conversation with a group of friends the other day, who are super hyped about Chat GPT and Midjourney. I see people getting used to mediocre texts and image generation, therefore shunning more complex and elaborate art or writing.

    The problem, my friends, is that the REAL problem is not so much AI, but capitalism. Billionaires will stop at nothing to squeeze every last drop of life from our dead bodies if that means making more money and ranking better as one of the richest people in the world. The sooner we accept that this economic system will lead us to our ultimate doom, the better, but it seems that corporate media has been successful at taming us to believe that all we need is a little bit of regulation and more ‘liberal’ policies that pour money into the deep pockets of big corporations.

    If AI is capable of coming up with rather convincing image stills in 2023, it won’t be long until it belches out motion picture and music, making swathes of people unemployed and fighting violently for their survival.

  7. I follow you on You Tube and left a comment there as I am also an artist whose work was stolen by LAION-5B and used the Have I Been Trained Website to find and try to opt out what I found there. 12 of my paintings had been stolen. The one thing I keep trying to tell my fellow artists is that no artist is obscure enough not to have their work taken if they’ve ever put an image online. As much as I despise social media, I’m still keeping up an artist page on Facebook and on an alumni page with my fellow art majors, so I’m going to share this blog there. Thank you for every word you said on this topic and I feel your pain at having your work stolen.

  8. I just found 2 dozen of my comic pages and various paintings in haveibeentrained. It’s excruciating and violating. I’m glad I landed here because I don’t know how to explain this pain to anyone who hasn’t experienced it.
