The tech company’s latest proposal about generative AI turns copyright law on its head, and could especially hurt smaller content creators, say experts
In its submission to the Australian government’s review of the regulatory framework around AI, Google said that copyright law should be altered to allow for generative AI systems to scrape the internet.
I agree with Google, only I go a step further and say any AI model trained on public data should likewise be public for all and have its data sources public as well. You can't have it both ways, Google.
Copyright law already allows generative AI systems to scrape the internet. You need to change the law to forbid something; it isn't forbidden by default. Currently, if something is published publicly then it can be read and learned from by anyone (or anything) that can see it. Copyright law only prevents making copies of it, which a large language model does not do when trained on it.
It’s not turning copyright law on its head; in fact, asserting that copyright needs to be expanded to cover training on a data set IS turning it on its head. This is not a reproduction of the original work; it’s learning about that work and making a transformative use of it. A generative work using a trained data set isn’t copying the original; it’s learning about the relationships that original has to the other pieces in the data set.
To be honest I'm fine with it in isolation, copyright is bullshit and the internet is a quasi-socialist utopia where information (an infinitely-copyable resource which thus has infinite supply and 0 value under capitalist economics) is free and humanity can collaborate as a species. The problem becomes that companies like Google are parasites that take and don't give back, or even make life actively worse for everyone else. The demand for compensation isn't so much because people deserve compensation for IP per se, it's an implicit understanding of the inherent unfairness of Google claiming ownership of other people's information while hoarding it and the wealth it generates with no compensation for the people who actually made that wealth. "If you're going to steal from us, at least pay us a fraction of the wealth like a normal capitalist".
If they made the models open source then it'd at least be debatable, though still suss since there's a huge push for companies to replace all cognitive labor with AI whether or not it's even ready for that (which itself is only a problem insofar as people need to work to live, professionally created media is art insofar as humans make it for a purpose but corporations only care about it as media/content so AI fits the bill perfectly). Corporations are artificial metaintelligences with misaligned terminal goals so this is a match made in superhell. There's a nonzero chance corporations might actually replace all human employees and even shareholders and just become their own version of skynet.
Really what I'm saying is we should eat the rich, burn down the googleplex, and take back the means of production.
Can we get some young politicians elected who have degrees in IT? Boomers don't understand technology; that's why these companies keep screwing the people.
Personally, I’d rather stop posting creative endeavours entirely than simply let them be stolen and regurgitated by every single company that’s built a thing on the internet.
OK, so I shall create a new thread, because I was harassed. Why bother publishing anything original if it's just going to be subsumed by these corporations? Why bother being an original human being with thoughts to share that are significant to the world if, in the end, they're just something to be sucked up and exploited? I'm pretty smart. Keeping my thoughts to myself.
Worth considering that this is already the law in the EU. Specifically, the Directive (EU) 2019/790 of the European Parliament and of the Council of 17 April 2019 on copyright and related rights in the Digital Single Market has exceptions for text and data mining.
Article 3 has a very broad exception for scientific research: "Member States shall provide for an exception to the rights provided for in Article 5(a) and Article 7(1) of Directive 96/9/EC, Article 2 of Directive 2001/29/EC, and Article 15(1) of this Directive for reproductions and extractions made by research organisations and cultural heritage institutions in order to carry out, for the purposes of scientific research, text and data mining of works or other subject matter to which they have lawful access." There is no opt-out clause to this.
Article 4 has a narrower exception for text and data mining in general: "Member States shall provide for an exception or limitation to the rights provided for in Article 5(a) and Article 7(1) of Directive 96/9/EC, Article 2 of Directive 2001/29/EC, Article 4(1)(a) and (b) of Directive 2009/24/EC and Article 15(1) of this Directive for reproductions and extractions of lawfully accessible works and other subject matter for the purposes of text and data mining." This one's narrower because it also provides that, "The exception or limitation provided for in paragraph 1 shall apply on condition that the use of works and other subject matter referred to in that paragraph has not been expressly reserved by their rightholders in an appropriate manner, such as machine-readable means in the case of content made publicly available online."
So, effectively, this means scientific research can data mine freely without rightsholders being able to opt out, while other uses of data mining, such as commercial applications, are allowed provided the rightsholder has not opted out through machine-readable means.
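To make the "machine-readable means" part concrete, here is a rough sketch of how a crawler might check for an opt-out before mining a page. Note the assumptions: the Directive does not name any particular mechanism, so treating robots.txt as the reservation signal is my assumption, and `ExampleAIBot`, `may_mine`, and the example.org URL are made-up names for illustration only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a site owner might publish to reserve their
# rights against one specific (made-up) AI crawler while allowing others.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def may_mine(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt does not disallow user_agent from url.

    Assumption: a robots.txt Disallow rule is being treated as the
    rightsholder's machine-readable opt-out. The Directive itself does
    not say that this is the required or sufficient mechanism.
    """
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.org/essay.html"
    for agent in ("ExampleAIBot", "SomeOtherBot"):
        print(agent, "may mine:", may_mine(ROBOTS_TXT, agent, page))
```

A real crawler would fetch the site's actual robots.txt and would probably need to look at other signals too (page metadata, terms of service), so take this as the shape of the check rather than a compliance recipe.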
This is like the beginning of The Hitchhiker's Guide to the Galaxy, where they put the responsibility on the main character to go to the department of transportation basement and see that they had posted a notice that they're going to destroy his house. No, Google, you don't get to dictate that people come to your dark-pattern website and tell you you're not allowed to use their content. Disapproval is implied until people OPT IN! It's a good thing Google changed their motto from Don't Be Evil or we'd have quite the conundrum.
🤖 I'm a bot that provides automatic summaries for articles:
The company has called for Australian policymakers to promote “copyright systems that enable appropriate and fair use of copyrighted content to enable the training of AI models in Australia on a broad and diverse range of data, while supporting workable opt-outs for entities that prefer their data not to be trained in using AI systems”.
The call for a fair use exception for AI systems is a view the company has expressed to the Australian government in the past, but the notion of an opt-out option for publishers is a new argument from Google.
Dr Kayleen Manwaring, a senior lecturer at UNSW Law and Justice, told Guardian Australia that copyright would be one of the big problems facing generative AI systems in the coming years.
“The general rule is that you need millions of data points to be able to produce useful outcomes … which means that there’s going to be copying, which is prima facie a breach of a whole lot of people’s copyright.”
“If you want to reproduce something that’s held by a copyright owner, you have to get their consent, not an opt out type of arrangement … what they’re suggesting is a wholesale revamp of the way that exceptions work.”
Toby Murray, associate professor at the University of Melbourne’s computing and information systems school, said Google’s proposal would put the onus on content creators to specify whether AI systems could absorb their content or not, but he indicated existing licensing schemes such as Creative Commons already allowed creators to mark how their works can be used.