Most instances don't have a specific copyright clause in their ToS, which is basically how copyright is handled on corporate social media (Meta/X/Reddit owns license rights to whatever you post on their platform when you click "Agree"). I've noticed some people including copyright notices in posts (mostly to prevent AI use). Is this necessary, or is the creator the automatic copyright owner? Does adding the copyright/license information do anything?
Please note if you have legal credentials in your reply. (I'm in the USA, but I'd be interested to hear about other jurisdictions if there are differences)
It's crazy to me that anyone thinks it does anything. How can someone who cares enough about AI not know the controversies about OpenAI's training data?
The people and organizations building LLMs do not give a fuck if you add that garbage to your comment or not.
Does that mean Creative Commons doesn't really mean anything? I have my website under CC BY-SA and am thinking of changing it to CC BY-NC-SA, but I feel like companies would still take my stuff from my website.
Yeah, it's unclear whether copyright is even relevant when it comes to training AI. It feels a lot like people who feel very strongly about intellectual property but have clearly confused trademarks, patents, copyright, and maybe even regular old property law - they've got an idea of what they think is "right" and "wrong" but it's not closely attached to any actual legal theory.
In the vast majority of countries, everything written down is automatically copyrighted by default and if you want to release it into the public domain or under a free license you have to make it explicit.
It’s not really fully determined whether you can actually release something to the public domain, since the “public domain” is not a legally sanctioned entity. It’s just the name we use for things that are uncopyrightable or otherwise not copyrighted (like certain government works, or works old enough that the copyrights have expired). The CC0 license from Creative Commons gets around this by waiving all copyrights instead.
This waiver nullifies and voids all copyright on a work. It also provides a fallback all-permissive license in case the waiver is deemed legally invalid. In the worst case that even the license is deemed invalid, the license contains a promise from the copyright holder not to exercise any copyrights he/she owns in the work.
I'm writing this response mainly for the purpose of bringing it to the public domain. Feel free to screenshot, copy, and distribute however you see fit.
does adding the copyright/license information do anything?
Not a lawyer, but I'd be sore amazed if "your honor, he copy/pasted my Lemmy comment" flies in court, regardless of your copyright status. The same goes for those AI use notices--they're a nice feel-good statement, but the scrapers won't care, and good luck (a) proving they scraped your comment, (b) proving they made money on it, and (c) getting a single red dime for your troubles.
On top of all comments generally being copyrighted by their author automatically, the licence at the bottom of a comment is like a no trespassing sign. The sign itself doesn't stop people from trespassing. You still need to call police when someone trespasses. If you never call police then the sign is literally useless.
The licence is the same thing. If someone includes it at the bottom of all their comments, but never launches legal action when someone violates that licence agreement, then it's literally useless. Given that launching legal action is incredibly expensive, I highly doubt the people using these licences will ever follow up. Also, how will they even know? How will they know a company used their comment as training data for their commercial AI? How are they going to even enforce the terms of the licence?
This is a really good point. If someone did violate your copyright, you have to enforce it. Almost no one is going to do that, so it's effectively not copyrighted.
There's a lot of "you couldn't have been murdered because that's illegal" thinking here, as if putting up a license on your posts somehow stops these AI companies from scraping.
If someone includes it at the bottom of all their comments, but never launches legal action when someone violates that licence agreement, then it’s literally useless.
Well, it's 'poisoning the well'. What happens next depends.
AI companies that actually honor licensing, or are fearful of getting caught at some point, will follow the license for the content.
And for those who do not, if they get caught with their hands in the cookie jar, Creative Commons (and other license creators) will have something to say about it. And they will get caught, we all know about black-box programming their models from the outside via our comments.
Finally, Congress right this second is considering new laws about this, so you never know. Companies in the future may be forced to explicitly state where the content they train their AI models on comes from.
As far as wasting my time, all I do is copy/paste this one line of text via a macro keypress ...
The creator is the automatic copyright owner, or in some cases their employer. Copyright is automatic through international treaties like the Berne Convention. The Berne Convention is from the 19th century and was created by the authoritarian European empires of the time. The US joined only in 1989. I think your question shows that the idea has not fully taken hold of the public consciousness. Automatic copyright is now the global norm. (I always wonder how much its better copyright laws helped the US copyright industry to become globally dominant.)
Very short and/or simple texts are not copyrighted, i.e. they are public domain.
Adding a license statement gives others the right to use these posts accordingly. It only serves to give away rights but is not necessary to retain them. The real tricky question is the status of the other posts. I'd guess most jurisdictions have something like the concept of an implied license. Given how fanatical some Lemmy users are on intellectual property, not having it in writing is really asking for trouble, though.
What such a license means for AI training is hard to say at this point. The right-wing tradition of EU copyright law gives owners much power. They can use a machine-readable opt-out. Whether such a notice qualifies is questionable. However, there is no standard for such a machine-readable opt-out, so who knows?
US copyright has a more left-wing tradition and is constitutionally limited to certain purposes. It's unlikely that such a notice has any effect.
which is basically how copyright is handled on corporate social media (Meta/X/Reddit owns license rights to whatever you post on their platform when you click “Agree”).
Yes, this is how it works. You give them a license to your posts.
I’ve noticed some people including Copyright notices in posts (mostly to prevent AI use). Is this necessary, or is the creator the automatic copyright owner?
The creator automatically owns the copyright. People can put in license terms, but they're effectively useless in this context. Let's say OpenAI violates the copyright on your post (it's still an open question whether or not training AI on copyrighted data constitutes copyright infringement, but we'll assume it does). Your only recourse is to sue them if they do this. Because you never registered the copyright, you're limited to recovering actual damages -- if you do register the copyright you can get statutory damages, which are up to $150k per work for willful infringement. So how much money did you lose on the ability to commercially exploit this post that OpenAI took away from you by copying your posts? Less than the cost to bring the suit, I'm sure.
So the TL;DR here is that the anti-AI licensing thing is only effective if you're registering the copyright on your posts/comments. And even then, that's only true if AI training is considered to be copyright infringement.
I don't think it exists at all on the fediverse. I've talked about it before, not a lawyer, but from a technical standpoint I don't know how anyone can claim copyright.
All fediverse apps start on your instance, you write a post. Great, maybe there's a disclaimer there. But then it's shotgunned out to literally anyone or anything that's listening. Other instances, governments, corporate, whatever. You're literally giving it to anyone who would listen.
So to me copyright is like saying "only people I approve of can look at this sign" and then posting that sign on every tree and post in town.
Copyright isn't about who can look upon something so much as who can reproduce it. However, due to the way federation works, it has to be assumed that fediverse users are agreeing to allow anyone using the protocol to reproduce their "works"
Copyright is more than that, like who is allowed to make commercial use of a given work. Just because something is written down in a public forum doesn't give everyone free rein to do whatever they want with it under copyright.
But the fediverse isn't them taking that data, it's you giving your data to them. It's you placing your data directly on their server.
It's more than my flyer metaphor, it's you literally placing your flyers in someone's house and then saying "but you can't do x y or z". Even if morally you are in the right, how would you ever enforce that or prove something in court? You still have the hurdle of "if you didn't want them to have it, you shouldn't have handed it to them"
Copyright is more than that, like who is allowed to make commercial use of a given work. Just because something is written down in a public forum doesn’t give everyone free rein to do whatever they want with it under copyright.
In legal terms, what does that mean? Every post is presumed to be public domain? Or in the process of posting, is there an implied license to generate unlimited copies for the purposes of federation? If someone likes a post and decides to make it into a chapter in their book, which they sell, is the original author entitled to attribution? To compensation?
I would argue that yes, you're posting publicly on a public forum, whose contents are shotgunned out to any listening servers/APIs/whoever.
If this were in a courtroom, I'd expect the defense to say that the poster chose to post it on a public forum which was then shared with whoever was listening, that there was no way to expect it to remain private, and that there could be no assumption of privacy with the way it was shared.
For enforcement, there is no way to enforce any sort of licensing with the fediverse model. You handed your post to me; if you didn't want your post handled in a certain way, then the response is "Why did you hand it out in the first place?" If someone did make your post into a book, then it's on you, the poster, to make the case that what they did was wrong, and I think it's enough of a grey area here to say that they were simply listening. To flip it around, what if their server has posted terms saying "Anything you give to us will be used for training and publishing"? You sent it out to anyone listening, they posted their terms, so who is right then?
This is different from normal social media where you posted to a walled garden, where you're bound by just their terms. Now any server can have any rules or terms, and we're blasting our data out to all of them (unless they are explicitly defederated)
You’re literally giving it to anyone who would listen.
The quantity of sharing does not diminish the licensing of the content.
So to me copyright is like saying “only people I approve of can look at this sign” and then posting that sign on every tree and post in town
I mean, ProPublica has explicit instructions on how to share their content with others, content that is licensed with a Creative Commons license, and that includes displaying the license number, when you share the content.
Well, the first thing is that the license is a copyleft license, so the content is still allowed to be used, distributed, etc. The only real difference between this license and public domain (as far as I know) is me saying that I don't want it being used for commercial purposes. That's it.
Also, for me it's more just a way to say fuck you to everything having to be commercialized, so even if it doesn't hold legal water I don't care.
Right but if they use your content anyway and you find out (and that's a big if, because it'll just disappear into some AI data set and you'll never see it again), what are you going to do? Sue?
Ah well, then I might try to find a license that doesn't require attribution, because I don't care about that part. But the rest seems to be exactly what I'm going for.