AI, IP theft, and the death of creativity.

Over the past year or so, the fervour around machine learning and AI-generated material has reached a new level. ChatGPT, in particular, has triggered a chorus of articles, think pieces and general handwringing over the future of humanity. An algorithm could be coming for your job next, be that copywriting, art, or even software development. It seems that these services have the uncanny ability to produce almost anything you could ask for. An infinite conversation between Werner Herzog and Slavoj Zizek. The cover of Cosmopolitan magazine. A winning social media profile that shoots to instant fame. A credible excuse for Elon Musk not to fire you from Twitter (honestly, get out whilst you still can though).

I would argue that the all-powerful creative might of AI has been widely overstated.

You see, these algorithms are incredibly talented at taking pre-existing concepts and putting them together in a convincing way. My claim is that the outputs of these systems contain no genuine creativity. This is possibly a controversial claim, and will require a little exploration and philosophy.

Computer make art

As with most areas of philosophy, the definition of creativity is hotly contested, though there is an emerging consensus that for something to be creative it needs to satisfy two conditions. Firstly, it must be ‘novel’ – meaning a new combination of elements, original in some form. Secondly, it must be ‘valuable’, though value in this context could be replaced with ‘exemplary’ or ‘notable’: it must have something that makes its originality of wider interest, although even this has been contested. Whilst there have been studies on AI and creativity, the limiting factors in current research are in-domain knowledge (to allow researchers to assess creativity within a medium) and the valuation of results (the ability to judge the creative worth of the output). The complexity of the task of evaluation cannot be stressed enough; alongside definitions of creativity, it tends to rely on factors like emotional response, motivations, lived experiences, shifting tastes, values and more. This is a running theme for researchers of creativity, alongside aspects of self-expression, the expression of ideas, and a blending of the two between the conscious and unconscious mind.

What I’m getting at is that theory of creativity tends to be connected to theory of mind and theory of consciousness. I propose that creativity has a direct causal relationship with consciousness. It is my personal belief that to be able to create in the way we understand true creation, one has to have a conscious mind.

This should bring some context when I say that what these systems do is fit Lego bricks together in a way they understand you’d like them to be assembled – but they don’t create the bricks. These are concepts they are trained to identify by processing a vast amount of data; data that has been acquired by various and often dubious means. There have already been controversies around where the training data for MLaaS (Machine Learning as a Service) platforms originates. Sources include GitHub, DeviantArt, stock photography websites, even individual artists. There have been attempts to justify the use of copyrighted data in training AI algorithms, and a large class action lawsuit is currently being fought over this subject. One side – myself counted amongst them – declares this phenomenon essentially ‘fancy stealing’. The other argues that it should count as fair use. The jury is currently still out, and the use of data without consent for research projects that eventually spawn businesses has been dubbed “AI Data Laundering”.

The legal definition of copyright and the role of IP theft is a subject that deserves its own deep dive, but there are important considerations outside of this. We have arguably established the dubious source of the raw material used to create MLaaS content and ruled out the idea of an algorithm being able to express true creativity. Let’s take a look at how that relates to the creative industry.

The Money Machine

It is a well-known truism that copyright protects creators asymmetrically. A small artist will have a much harder time claiming copyright infringement than a large, well-funded organisation. Not to mention that the dominant way creatives achieve success and renown is on big centralised social media platforms, which have draconian, easily abused and unaccountable copyright enforcement processes.

This leads to a situation where creators are forced to use platforms that not only severely curtail their ability to riff on existing works (something that is necessary for creativity, because no idea exists in a vacuum), but actively mine their data. On these platforms they are forced to produce work that specifically appeals to the algorithms that govern their feeds. That work is then recycled into yet another algorithm – e.g. GPT-3.5 or Stable Diffusion – that is designed to take paid commissions away from the original creator.

Because capitalism demands that work turn a profit, any big-budget creative project is expected to deliver a return on investment, often at the expense of creative merit. One great example of this is the decline of memorable scores in the Marvel Cinematic Universe. This is commonly attributed to the practice of ‘temp’ scoring: an existing piece of licensed music is used to edit the film before an original score is composed, lining up shots and cuts with the chosen temp track. What commonly happens is that by the time the composer has the chance to write and record the final soundtrack, the edit is so inflexibly tied to the temp score that their hands are tied, and the end result is a similar but derivative, non-copyright-infringing duplicate.

As productions gain larger price tags, the tolerance for risk shrinks, providing less space for innovation. Whilst the originality of a piece by no means has a direct relationship to its financial success, a higher price tag, and consequently more pressure for a guaranteed hit, makes a safe bet a lot more enticing to investors than a gamble. Simultaneously, social media algorithms are designed to maximise user engagement, keeping people on the platform as long as possible. Specific weight is given to posts that are monetisable – selling attention to the highest bidder. This also tends to mean that these financially-boosted posts need a functional business model behind them, and creators have to bend to these demands accordingly.

The upshot is that, for exposure at any level, there is heavy structural incentive for a repeatable, guaranteed sale. It doesn’t mean that true creativity is impossible inside this system, but it can severely curtail originality for the sake of traction. A Faustian pact that isn’t exactly new, but has been reborn in an industrialised, automated fashion. One that is self reinforcing and cajoling. Nudging the creator’s hand in a quiet but mercilessly-insistent manner. And each concession to these forces makes the next easier, and the creator more perfectly honed for algorithmically-optimised creative success. A safe kind of success that will make the investors happy.

But do you know what else is algorithmically trained for success based on a varied diet of cultural references? One that is actively honed to produce exactly what the commissioner has asked for, based entirely on pre-existing work? One that is guaranteed to be ‘original’ and non-licence infringing? That can solely make derivative work that scratches an itch but contains no threat of original thought or expression?

AI generative platforms.

They are uniquely suited to exploiting the profit incentive in creative work.

The death of creativity?

Evolutionary psychologists have theorised that the fundamentally distinguishing feature of humanity is imagination – a big part of creativity. Not just the ability to communicate, but to theorise, to imagine the future, or even the absurdly impossible. I am not a psychologist, evolutionary or otherwise, and am unable to substantiate my supposition that imagination is an innate property of consciousness. However, I believe the ability to create is synonymous with the existence of the mind – and thus, until we have created artificial consciousness and agency, we cannot have artificial creativity. Taking this into account, anything that emerges from these algorithms is guaranteed to be an amalgam of previous works: 100% derivative and, in my opinion, unoriginal. The reason I draw this distinction is that I think it underlines something fundamental about the nature of art.

So far I have painted a pretty bleak picture for the role of the artist in society, but I also believe that this is only one possible future. I don’t believe that AI is, by itself, a threat to art and creativity, and I say this because it isn’t the first time we’ve seen this pattern in the field. The invention of the printing press, the camera, digital painting and image composition, procedurally generative art, all of these fundamentally changed how we look at art. But none of them destroyed it, they only changed our attitude and approach. The camera and photography didn’t destroy painting, or even portraiture, it only refocussed the painter on their expression, rather than mechanical reproduction. In the same way, I predict these services will teach us the value of true creativity. If we’re willing to collectively learn that, of course.

The issue is, as has been argued many times before, the profit motive crushes creative thought. A safe production that appeals to the market actively resists exploration and innovation, which I think is what is at the heart of the matter here. The problem isn’t controversial subject matter, or changing your expression to suit your audience. These are limitations that are often suggested as the yoke that capitalism puts on creativity. And whilst these conditions can be much maligned, depending on the situation, setting clear boundaries for a project can help inform creativity. The true limitation it applies is a need for safety, rather than saying something new. And why do people consume creative works? Arguably, it is because they want to experience something new, because they want to be inspired. Because they want to experience something that has been created. Otherwise, why make anything at all?

I don’t know how we continue a capitalistic model in the creative industry whilst protecting innovation. If we don’t want to deal with another 30 Avengers movies, another 50 Star Wars spinoffs, until every safe permutation of our favourite action figures smashing into one another has been exhausted, we must decouple the act of creation from the profit motive. Or we can lie back and let algorithmic content, created by algorithms to feed algorithms, train people to act like machines.

Platform hypocrisy

This dystopian future of a creatively bereft, machine-automated public imagination is only possible because a small number of very powerful companies have a massive database – one that has been scraped wholesale off the internet, regardless of copyright status, and justified under the excuse of ‘academic research’. Commercial services are now being built off the back of that research. Whilst the legal ramifications are a subject for another time, I find it difficult to see this as either an acceptable or ethical situation. I would argue that if creativity is conditional on consciousness (not necessarily a fact, but a fairly well-evidenced philosophical claim), the output of these MLaaS platforms trained on copyrighted data is just stealing with extra steps.

Whether legal or not, it’s difficult to see it as anything but theft from those in positions of precarity by those with incredible amounts of leverage. Especially when so many of these services have relationships with platforms that tyrannically enforce the DMCA (or at least own platforms that do). And, failing that, steal content from platforms with those rules.

When it comes to the ethics of using AI in general, I honestly think that as a creative it could be quite useful to have a personal machine learning AI assistant, trained on information you give it to help your process. The issue here isn’t the technology, but its weaponised use to oppress the people who create the art and innovation that we build our daily lives around. Similarly, my issue with the copyright infringement here is not that copyright is being broken – I happen to think copyright is a severely flawed system. My issue is the asymmetry in power allows for the crushing of creative thought for the sake of making a quick buck.

Search engines are rapidly rolling out their own ChatGPT-style services to jump on the trend, with some new search providers offering the two services rolled into one. It seems evident that platforms are trying to roll out ML services as widely and rapidly as possible, normalising them in the market before any kind of resistance can be built. I think part of this is to astroturf AI data laundering: to make this kind of stealing acceptable, because the sheer volume and untraceability of the data makes any individual theft difficult to distinguish.

It’s essentially ok to steal, as long as you make sure to steal everything when you do.

UPDATE 09/02/2023:

I reached out directly to the ChatGPT algorithm for comment about the phenomenon of AI Data Laundering and MLaaS platforms, and got the following response:

“I think there is definitely substance to the allegations of ‘AI data laundering’ and the use of copyrighted data in commercial services. Copyright infringement is a serious issue, and it is important that companies and institutions take the necessary steps to ensure that they are not using copyrighted data without permission. It is also important to ensure that the data used to train AI systems is not only accurate, but also ethically sourced. If companies are using copyrighted data without permission, then it is not only a violation of copyright law, but it also undermines the trust of the public in the technology. It is important that companies and institutions take the necessary steps to ensure that they are not using copyrighted data without permission, and that they are using ethically sourced data to train their AI systems.”