By José Ignacio Orlando, PhD (Subject Matter Expert in AI/ML @ Arionkoder) and Nicolás Moreira (Head of Engineering @ Arionkoder).
Recent advances in automated image generation allow users to create fake images from a text description in a matter of seconds. The quality of the outputs is so remarkable that most of them are indistinguishable from human-made pictures.
These tools are fueled by AI algorithms that combine the power of large language models, which analyze the input prompt, with diffusion models, which use that information to generate plausible artistic outcomes. Some of these models can be tried out for free, either through a custom-made website or by prompting a bot in Discord. Moreover, companies like Stability AI have gone one step further, open-sourcing the pre-trained weights of their generative model, Stable Diffusion, so that it can be embedded in any other application.
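For readers curious about the mechanics, the "diffusion" in these models refers to gradually corrupting an image with Gaussian noise and training a network to reverse that corruption, step by step. A minimal NumPy sketch of the forward (noising) process, using an illustrative noise schedule rather than Stable Diffusion's actual one, looks like this:

```python
import numpy as np

def forward_diffuse(x0, t, alpha_bars, rng):
    """Sample x_t ~ q(x_t | x_0): sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * noise."""
    eps = rng.standard_normal(x0.shape)
    a_bar = alpha_bars[t]
    return np.sqrt(a_bar) * x0 + np.sqrt(1.0 - a_bar) * eps, eps

# Toy linear noise schedule over T steps (illustrative values only).
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alpha_bars = np.cumprod(1.0 - betas)

rng = np.random.default_rng(0)
x0 = rng.standard_normal((8, 8))  # stand-in for an image (or latent)
x_mid, _ = forward_diffuse(x0, 500, alpha_bars, rng)      # partially noised
x_end, _ = forward_diffuse(x0, T - 1, alpha_bars, rng)    # almost pure noise

# By the final step nearly all of the original signal is gone.
print(alpha_bars[-1])  # very close to 0
```

A denoising network is then trained to predict the added noise from `x_t` and `t`; image generation runs this process in reverse, starting from pure noise and guided by the embedding of the text prompt.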
Public access to this technology has let everyone push their creativity to unexpected levels. From reimagining movies by swapping their characters to creating new tattoo designs, fake portraits, and landscapes, it has redefined the way artists and creators work, speeding up their first drafts and letting them focus on the finer details.
However, every new technology comes with downsides that must be addressed; in this case, the challenge is to empower people rather than simply eradicate their jobs. Over the last two weeks there has been extensive controversy around image generation tools, after some renowned artists saw their creative styles and techniques being used by these AIs to create fake pieces of their art.
The first wave of outrage was around Lensa AI, a mobile application that creates artificial portraits of yourself in contemporary art styles. The process is simple: you choose a few of your selfies, upload them to the app, and, after paying, receive 20 to 50 gorgeous images of yourself. Behind the scenes, Lensa uses your pictures to fine-tune a Stable Diffusion model so that it learns your facial features and how to insert them into those synthetic creations.
The secret is not so much learning your facial features, but rather producing the most pleasing output from a piece of text. To this end, the creators of the app identified the right words to instruct the model to create nice pictures, a task known as "prompt engineering" (remember those words, it might be the job of the future!).
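We don't know Lensa's actual prompts, of course, but the idea behind prompt engineering is simple enough to sketch: wrap the user's subject in style keywords and quality modifiers that reliably steer the model toward appealing outputs. A hypothetical Python helper, where every keyword list is purely illustrative and not Lensa's:

```python
# Toy prompt builder: combines a subject with style and quality modifiers.
# All keyword lists are illustrative; Lensa's real prompts are secret.
STYLE_PRESETS = {
    "fantasy": ["digital painting", "epic fantasy", "dramatic lighting"],
    "sci-fi": ["cyberpunk", "neon glow", "concept art"],
}
QUALITY_TAGS = ["highly detailed", "sharp focus", "trending on artstation"]

def build_prompt(subject: str, style: str) -> str:
    """Assemble a comma-separated prompt for a text-to-image model."""
    parts = [f"portrait of {subject}"] + STYLE_PRESETS[style] + QUALITY_TAGS
    return ", ".join(parts)

print(build_prompt("sks person", "fantasy"))
# → portrait of sks person, digital painting, epic fantasy, dramatic lighting,
#   highly detailed, sharp focus, trending on artstation
```

The placeholder subject "sks person" mimics how fine-tuning approaches such as DreamBooth bind a rare token to the user's face; the same template can then be reused across dozens of styles to produce the batch of portraits the app delivers.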
Everything was going smoothly until an Australian artist noticed that some of the outputs resembled the style of other well-known creators… and even her own! After she publicly complained about it, people started to notice that some portraits even include fragments of signatures taken from the original artists' work, which further suggests that a certain level of plagiarism might be going on.
From a technological point of view, this is easy to explain. Stable Diffusion was trained on large corpora of text and millions of images scraped from the Internet, including some created by these very artists. It is therefore not surprising that the model has learned their styles and how to imitate them. This, combined with the fact that Lensa has probably identified the right words to force the model to reproduce those styles on your face, essentially solves the case. Of course, we will never know for sure, since those words are kept under lock and key for Lensa to keep profiting. But trust us, the answer is there.
Artists fairly complained about this point, arguing that they never agreed to Stable Diffusion being trained on their work. Lensa answered that the model simply learned a style, as other humans do, and is not copying exactly what they made. Current legislation is on Lensa's side: since these images borrow ideas from artists' work but do not contain any actual snippets of it, they do not violate copyright law. So the only option for creators seems to be explicitly opting their images out of the training sets of upcoming versions of these generative models. There is already a website, created by a group of artists, to check whether a given image was included in the training set, along with a tutorial on how to avoid being part of it in the future.
This debate started to spill over to other apps and websites. ArtStation and DeviantArt, two sites used by artists to showcase their portfolios, have recently been flooded with banners protesting against AI-generated images. The reason? Creators consider it unfair for their art, which took years of practice and learning to look the way it does, to share space with images produced in seconds by an AI model that was even trained on their work. ArtStation responded by updating its terms of service to let artists protect their productions from being used for training, although some artists considered this reaction too weak.
At this point you’re probably asking yourself who’s right or wrong in this debate, right? In our opinion, everyone has a point.
On the one hand, it's reasonable for artists to stand up and worry about the future of their work in this context. The debate around San Francisco Ballet's promo for The Nutcracker, which showcased an AI-generated picture, went in this direction. If an AI tool can produce multiple outputs in a matter of seconds, aren't artists at risk of no longer being hired to make banners, covers, cartoons, or article illustrations? This becomes even riskier when pieces resembling their own work can be instantly produced by commercial tools, all because their images were included in a training set without their consent. While opting out seems a reasonable short-term fix for this last issue, we should expect efforts in the upcoming years to update our laws to accommodate these technological changes and make peace with both sides of the debate.
On the other hand, similar debates have occurred before with other technologies. In a super interesting blog post, AI expert Alberto Romero draws a parallel with previous historical episodes such as the invention of photography: while many thought it would kill painting for good, it turned out to be yet another disruption in the world of visual art, opening a whole new space for creativity. What if the same happens with AI? What if the next generations use these tools to find new ways to creatively express themselves and profit from it?
As we usually say, it's a developing story. What are your opinions on this debate? Are you concerned about the impact of image generation models on the art market, or do you think this technology will unveil a whole fascinating world of creations? We'd be happy to hear your opinions on our social media! And if you envision applications of these algorithms in your business, let us know so we can help you make the most of them!