I made a short post last night explaining why image uploads had been disabled. This was in the middle of the night for me, so I did not have time to go into a lot of detail, but I'm writing a more detailed post now to clear up where we are now and where we plan to go.
What's the problem?
As shared by the lemmy.world team, over the past few days, some people have been spamming one of their communities with CSAM images. Lemmy has been attacked in various ways before, but this is clearly on a whole new level of depravity, as it's first and foremost an attack on actual victims of child abuse, in addition to being an attack on the users and admins on Lemmy.
What's the solution?
I am putting together a plan, both for the short term and for the longer term, to combat and prevent such content from ever reaching lemm.ee servers.
For the immediate future, I am taking the following steps:
1) Image uploads are completely disabled for all users
This is a drastic measure, and I am aware that it's the opposite of what many of our users have been hoping, but at the moment, we simply don't have the necessary tools to safely handle uploaded images.
2) All images which have federated in from other instances will be deleted from our servers, without any exception
At this point, we have millions of such images, and I am planning to just indiscriminately purge all of them. Posts from other instances will not be broken by the deletion; the deleted images will simply be loaded directly from their original instances.
3) I will apply a small patch to the Lemmy backend running on lemm.ee to prevent images from other instances from being downloaded to our servers
Lemmy has always loaded some images directly from other servers, while saving other images locally and serving them directly. I am eliminating the second option for the time being, forcing all images uploaded on external instances to always be loaded from those servers. This will somewhat increase the number of servers users will fetch images from when opening lemm.ee, which certainly has downsides, but I believe this is preferable to opening up our servers to potentially illegal content.
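Roughly, the caching decision after the patch boils down to something like the sketch below (Python pseudocode for illustration only — the actual change is in the Rust backend, and the function name and host here are my placeholders):

```python
# Illustrative sketch, not actual Lemmy/pictrs code: only images uploaded
# directly to our own instance are ever stored locally; federated images
# are left to be served by their origin instance.
from urllib.parse import urlparse

def should_cache_locally(image_url: str, local_host: str = "lemm.ee") -> bool:
    """Return True only for images hosted on our own instance."""
    return urlparse(image_url).netloc == local_host
```

Under this rule, a remote pictrs URL such as `https://example.social/pictrs/image/x.png` would always be fetched by clients from `example.social` directly, never mirrored onto lemm.ee.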
For the longer term, I have some further ideas:
4) Invite-based registrations
I believe that one of the best ways to effectively combat spam and malicious users is to implement an invite system on Lemmy. I have wanted to work on such a system ever since I first set up this instance, but real life and other things have been getting in the way, so I haven't had a chance. However, with the current situation, I believe this feature is more important than ever, and I'm very hopeful I will be able to make time to work on it very soon.
My idea would be to grant our users a few invites, which would replenish every month if used. An invite will be required to sign up on lemm.ee after that point. The system will keep track of the invite hierarchy, and in extreme cases (such as spambot sign-ups), inviters may be held responsible for rule-breaking users they have invited.
While this will certainly create a barrier of entry to signing up on lemm.ee, we are already one of the biggest instances, and I think at this point, such a barrier will do more good than harm.
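As a rough illustration of the mechanics described above — all names, and the monthly allowance of three invites, are my placeholder assumptions, not settled design:

```python
# Sketch of a replenishing, hierarchy-tracking invite system.
# Not actual lemm.ee code; persistence and federation are omitted.
import secrets

MAX_INVITES = 3  # assumed monthly allowance per user

class InviteTracker:
    def __init__(self):
        self.invites = {}      # invite code -> inviter
        self.inviter_of = {}   # new user -> who invited them
        self.remaining = {}    # user -> invites left this month

    def register_user(self, user):
        self.remaining[user] = MAX_INVITES

    def create_invite(self, inviter):
        if self.remaining.get(inviter, 0) <= 0:
            raise ValueError("no invites left this month")
        self.remaining[inviter] -= 1
        code = secrets.token_urlsafe(8)
        self.invites[code] = inviter
        return code

    def redeem(self, code, new_user):
        inviter = self.invites.pop(code)  # KeyError if code is invalid
        self.inviter_of[new_user] = inviter
        self.register_user(new_user)

    def replenish_monthly(self):
        for user in self.remaining:
            self.remaining[user] = MAX_INVITES

    def invite_chain(self, user):
        """Walk up the invite hierarchy, e.g. to trace who invited a spammer."""
        chain = []
        while user in self.inviter_of:
            user = self.inviter_of[user]
            chain.append(user)
        return chain
```

The `invite_chain` walk is what would make it possible to hold inviters responsible in extreme cases like spambot sign-ups.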
5) Account requirements for specific activities
This is something that many admins and mods have been discussing for a while now, and I believe it would be an important feature for lemm.ee as well. Essentially, I would like to limit certain activities to users which meet specific requirements (maybe account age, number of comments, etc). These activities might include things like image uploads, community creation, perhaps even private messages.
This could, in theory, limit the creation of new accounts made just to break rules (or laws).
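A gating check like this could be sketched as follows — the specific activities, age thresholds, and comment counts here are entirely made up for illustration, not decided policy:

```python
# Hypothetical per-activity requirements: (minimum account age,
# minimum comment count). All thresholds are placeholders.
from datetime import datetime, timedelta

REQUIREMENTS = {
    "image_upload":       (timedelta(days=14), 20),
    "community_creation": (timedelta(days=30), 50),
    "private_message":    (timedelta(days=3),  5),
}

def may_perform(activity, account_created, comment_count, now=None):
    """Check whether an account meets the requirements for an activity."""
    now = now or datetime.now()
    min_age, min_comments = REQUIREMENTS[activity]
    return (now - account_created) >= min_age and comment_count >= min_comments
```

A brand-new account would then fail the `image_upload` check regardless of how many comments it racks up on day one, which is the point of combining both metrics.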
6) Automated ML based NSFW scanning for all uploaded images
I think it makes sense to apply automatic scanning on all images before we save them on our servers, and if it's flagged as NSFW, then we don't accept the upload. While machine learning is not 100% accurate and will produce false positives, I believe this is a trade-off that we simply need to accept at this point. Not only will this help against any potential CSAM, it will also help us better enforce our "no pornography" rule.
This would potentially also allow us to resume caching images from other instances, which will improve both performance and privacy on lemm.ee.
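The pipeline described above is simple in shape: scan first, store only what passes. A minimal sketch, where the classifier and the 0.7 threshold are stand-ins for whatever model and cut-off get chosen:

```python
# Sketch of a scan-before-save upload pipeline. `nsfw_score` is any
# callable returning a probability in [0, 1]; the threshold is illustrative.
NSFW_THRESHOLD = 0.7

def handle_upload(image_bytes, nsfw_score, storage):
    """Return True if the upload was accepted and stored."""
    if nsfw_score(image_bytes) >= NSFW_THRESHOLD:
        # Rejected before it ever touches disk; false positives are
        # the accepted trade-off.
        return False
    storage.append(image_bytes)
    return True
```

The key property is that a flagged image is refused before persistence, so nothing illegal is ever written to the server even transiently in pictrs.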
With all of the above in place, I believe we will be able to re-enable image uploads with a much higher degree of safety. Of course, most of these ideas come with some significant downsides, but please keep in mind that users posting CSAM present an existential threat to Lemmy (in addition to just being absolutely morally disgusting and actively harmful to the victims of the abuse). If the choice is between having a Lemmy instance with some restrictions, or not having a Lemmy instance at all, then I think the restrictions are the better option.
I also would appreciate your patience in this matter, as all of the long term plans require additional development, and while this is currently a high priority issue for all Lemmy admins, we are all still volunteers and do not have the freedom to dedicate huge amounts of hours to working on new features.
As always, your feedback and thoughts are appreciated, so please feel free to leave a comment if you disagree with any of the plans or if you have any suggestions on how to improve them.
Personally, I say just leave hosting of images to dedicated sites built for that purpose. Your efforts are better spent on how to render them. That being said, I used to be in charge of managing abuse on a site that had an average of 20 million posts a month (seriously).
The way I essentially defeated these kinds of attacks was with an image scanning service. It scans for anything NSFW and blocks it. Sometimes things would make it through, but once an admin flagged it, we could use that to block the user's IP and account. It's not cheap, but the volume is also not huge yet for lemm.ee, so it might not be too bad.
For step 6 - are you aware of the tooling the admin at dbzero has built to automate the scanning of images in Lemmy instances? It looks pretty promising.
IMO Lemmy shouldn't have media uploading of any kind. Aside from the CSAM risk, it's unsustainable and I think one of the reasons Reddit went to shit is by getting into the whole image/video/gif hosting.
Dozens of media hosts exist out there, and the mobile/web clients should focus instead on showing remote content better.
The success of a forum like this depends on people being able to join and express their thoughts freely. Reddit and digg would never have gotten where they are if they had a closed system.
I almost didn't join lemmy because the first two instances I heard about (lemmy.ml and beehaw) had closed registration. I think I applied and then forgot about it for 2 weeks. Thankfully I saw a post about lemmy on reddit yet again and finally found an open instance.
Don't let the actions of a few scumbags ruin a good thing for everyone. You'll be giving them exactly what they want.
All of this seems good to me except 4 - I hate the thought of any instances being invited only. I'd much prefer it was just a verified user approach (even just an email) with a waiting period for doing things like posting images. Maybe even limit newish users after that period to a small number of image posts a day.
Making an instance feel like a club is going to turn off a lot of people. For sure do what you need to do, but I hope you can avoid that one.
thank you for your work sunaurus, and i'm sorry you had to sort through this
(particularly annoying though, as i never got around to adding a user banner; and i had one in mind as well. i wish there was some way to externally host avatars and banners)
Forums have existed on the internet forever and have already dealt with this thousands of times. You don't need to overthink it or reinvent the wheel. It didn't stop forums existing very comfortably in the past and isn't an issue that should be that different to deal with today.
Simply limit image uploads to a certain account age threshold and karma threshold and you will eliminate 99% of the ability to abuse this.
This has been a great instance since day one, and it's good to see you once again being so proactive. Thank you for the update!
There are downsides with all kinds of moderation, but ultimately most of us accept that the internet can't function as a true free-for-all. Absolutely in support of whatever you feel is necessary to keep the server safe, but please watch out for yourself too and make sure you're asking for help where needed.
p.s. anyone reading this who doesn't donate to the server yet, here's a reminder that that's a thing you can do.
Could you post a guide on disabling the local image cache? I compile from scratch so I’m not afraid of making changes in the code, I just don’t really know rust. I shut down my personal instance and this would allow me to turn it back on.
This is something that many admins and mods have been discussing for a while now, and I believe it would be an important feature for lemm.ee as well. Essentially, I would like to limit certain activities to users which meet specific requirements (maybe account age, amount of comments, etc). These activities might include things like image uploads, community creation, perhaps even private messages.
Sounds like the old karma requirements some reddit subs had. While I'm not against that, it would restrict locally registered users more than users from other instances posting in lemm.ee communities, when their host instance has no such system in place. I'm aware that if they post images, those would be uploaded to their home instance and linked here with the patch you mentioned above, but the downside is that local users might feel more inconvenienced than others. Not saying it's a bad idea though, if we are thinking from a "protect lemm.ee" angle first and foremost.
Automated ML based NSFW scanning for all uploaded images
You might want to reach out to the dev of Sync for Lemmy, ljdawson on !syncforlemmy@lemmy.world, he just implemented an anti-NSFW upload feature in the app to do his part. Essentially, Sync users currently can't post any kind of porn. While I don't think that the CP spammers were using his particular app, or any app to begin with, I do think it's a neat feature to have, but would make much more sense to run server-side.
Got to be honest, having an invite-based system and locking certain features behind age of accounts, karma, etc. seems like the opposite of the freedom everyone promised me the Fediverse represented when we moved over.
I personally don't really care about images and would prefer image uploads just stay deactivated and we operate as a text only forum but with open membership.
I like almost everything on this plan, except for the last 2 items. The account requirements for "extra activities" best be chosen carefully as to not encourage the good old "karma farming" that we got away from in leaving Reddit.
And the ML thing for recognizing NSFW is also something to be carefully considered. Too strict and it gets annoying with false positives, it can restrict posting actual content, and too lax won't make a difference for the people actually looking to circumvent it. I think a "vetting" system like the previous item could be better in the long run, in only letting "trusted" people upload content.
I hope there is another option besides just deleting images indiscriminately. I run several comic strip communities and it would be a shame to lose all the posts and work I've put in.
What about implementing Imgur or something similar, assuming they scan for CSAM on their end. For example I often use the Lemmy iOS app and I noticed that all my image uploads using the app are through Imgur.
I prefer a more text based main post experience so this is gonna be good for me. Reddit used to be a fantastic discussion forum until every single post on /all was either an image post or video post. I wish there was a way to completely disable media posts so I could just view discussion posts.
Lemmy admins need to do whatever they can to handle CSAM if and when it arises. Users need to be understanding in this, because as I've argued in other threads, CSAM poses a threat to the instance itself, as it poses a threat to the admins if they cannot clean up the material in a timely manner.
This is going to likely get weird for a bit, including but not limited to:
instances going offline temporarily
communities going offline temporarily
image uploads being turned off
sign ups being disabled
applications and approval processes for sign ups
ip or geoip limiting (not sure if this feature currently exists in lemmy, I suspect it doesn’t but this is merely a guess)
totally SFW images being flagged as CSAM. Not advocating against use of ML / CV approaches, but historically they aren’t 100% and have gotten legit users incorrectly flagged. Example
I just want folks to know that major sites like reddit and facebook usually have (not very well) paid teams of people whose sole job is to remove this material. Lemmy has overworked volunteers. Please have patience, and if you feel like arguing about why any of the methods I mentioned above are BS, or have any questions, reply to this message.
I’m not an admin, but I’m planning on being one and I’m sort of getting a feel for how the community responds to this sort of action. We don’t get to see it a lot in major social media sites because they aren’t as transparent (or as understaffed) as lemmy instances are.
Seems like a good plan. I have been very impressed with your approach to administering lemm.ee.
Regarding the planned invite system, what would be the consequences of inviting a malicious user? I would think it would be hard to enforce any consequences simply because of the open nature of lemmy as an ecosystem.
It is now impossible to add an avatar or banner to profiles because the only way to do so through the UI is uploading to the instance. There’s no way to add an external URL. Just wanted to point that out in case it wasn’t intentional. Very understandable if that’s something we have to sacrifice for the time being.
Edit: I noticed that images will upload to the account's home instance instead of the community's home instance. This means that one workaround for the time being to change your lemm.ee community's icon and banner is to create an account on another instance and then add that account as a moderator to your lemm.ee community. You can then use that external account to change the icon and banner of your lemm.ee community because images will be uploaded to whatever instance your account is on instead of lemm.ee.
I understand that admins need to take whatever measures needed to protect themselves from legal pursuits
At the same time I hate to see the promised federated network revert to what commercial platforms have become, karma and account age requirement, phone and identity verification , forced 2fa and what not.
Maybe lemmy should implement a shared database whereby, if an admin of an instance marks a post as potentially illegal, it gets replicated to other instances automatically and queued for deletion.
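Sketched out, that sharing scheme might look something like this — the class, methods, and SHA-256 fingerprinting are all hypothetical illustrations, not anything Lemmy actually implements:

```python
# Sketch of a shared blocklist: instances exchange hashes of content
# flagged as potentially illegal, and each instance queues matching
# local copies for deletion. Illustrative only.
import hashlib

class SharedBlocklist:
    def __init__(self):
        self.flagged_hashes = set()
        self.deletion_queue = []

    @staticmethod
    def fingerprint(image_bytes):
        return hashlib.sha256(image_bytes).hexdigest()

    def flag(self, image_bytes):
        """Called when an admin marks content as potentially illegal."""
        self.flagged_hashes.add(self.fingerprint(image_bytes))

    def replicate_from(self, other):
        """Pull flags federated in from another instance."""
        self.flagged_hashes |= other.flagged_hashes

    def check_local(self, image_bytes):
        """Queue local content for deletion if its hash is flagged."""
        digest = self.fingerprint(image_bytes)
        if digest in self.flagged_hashes:
            self.deletion_queue.append(digest)
            return True
        return False
```

One caveat worth noting: exact hashes like SHA-256 are trivially defeated by re-encoding or resizing an image, so a real deployment would likely need perceptual hashing (PhotoDNA-style matching) rather than byte-exact fingerprints.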
A karma system is sounding pretty good right now... /me lifts shield and ducks
Even if it's just a limited tiered system with numbers to obsess about.
Level 1 - browsing rights.
Graduate to level 2 after 5 days and a total of more than 30 minutes of logged-in activity.
Level 2 - commenting rights. Limited to 10 comments daily for 5 days.
Graduate after at least 3 comments, a total upvote count >3, and 5 days.
Level 3 - posting rights. Limited to 3 posts daily for 5 days. Unlimited commenting.
Graduate after 5 days and a total upvote count >50.
Level 4 - image posting rights. 10 images per day max.
Graduate after 2 weeks and a total upvote count >100.
Level 5 - you've made it, everyone is equal here. Entry-level users are still enjoying and growing into the community. No need to be a tool about trying to get more karma / points, and the number of bots / temp accounts / total losers should be minimal by this screening level.
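The ladder above could be expressed as a single graduation table — the numbers mirror the suggestion (minus the logged-in-minutes rule, dropped here for brevity), and everything is illustrative:

```python
# Each tuple: (days spent at this level, upvote total to exceed,
# minimum comment count) required to graduate to the next level.
GRADUATION = [
    (5,  -1,  0),   # level 1 -> 2 (time-based only)
    (5,   3,  3),   # level 2 -> 3
    (5,  50,  0),   # level 3 -> 4
    (14, 100, 0),   # level 4 -> 5
]

def current_level(days_since_signup, upvotes, comments):
    """Compute a user's level from account age and totals."""
    level, days_left = 1, days_since_signup
    for need_days, need_votes, need_comments in GRADUATION:
        if (days_left >= need_days
                and upvotes > need_votes
                and comments >= need_comments):
            level += 1
            days_left -= need_days
        else:
            break
    return level
```

So a two-week-old account with 60 upvotes would sit at level 4 (image posting), while a day-one account can only browse.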
These are great ideas especially the ability for users to invite others. I think it’s also a good way to get new people into the fediverse since inviting someone will have them easily know what instance to go to.
Will you submit all these features to the official lemmy backend too?
Top work - I notice DBZero's instance reckons they've implemented AI scanning-and-blocking for CSAM, it may be worth getting in touch/investigating there.
Hey there!
Why not talk with the main lemmy developer to try and integrate such a content blocker directly into the lemmy stack so that it’s easier to implement for smaller instances?
Thanks for keeping this instance up and runnin’!
Cheers!
I didn’t even know there was an option to load images directly from the source instance instead of caching the content locally. I know it’s a resource issue and it can slow things down a bit for users, but I think ultimately it should be done that way by default, to mitigate exponential propagation of illegal content. Wasn’t caching the main reason why lemmy.world preemptively blocked piracy communities?
That, or admins should be able to selectively choose what communities to cache content from, like maybe the ones where they can confirm there is active moderation.
A way to deal with false positives of an ML NSFW scanner would be: Once per day, each user can "overwrite" the scanner. If a user is caught abusing this, they get banned.
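The per-day budget is simple to track; a rough sketch (names hypothetical, ban logic and persistence omitted):

```python
# Sketch of a one-override-per-day escape hatch for scanner false
# positives. Illustrative only.
from datetime import date

class OverrideBudget:
    def __init__(self):
        self.last_used = {}  # user -> date of their last override

    def try_override(self, user, today=None):
        """Return True if the user may overrule the scanner today."""
        today = today or date.today()
        if self.last_used.get(user) == today:
            return False  # already spent today's override
        self.last_used[user] = today
        return True
```

Pairing this with abuse review (ban anyone who repeatedly overrides genuinely bad content) is what keeps the escape hatch from becoming a bypass.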
Has there been any developments on the Github in regards to all this? Really, the only things that will solve this long term are proper and granular moderation tools.
Overall it's a tough situation to be in. I feel a combination of account restrictions would be a way to mitigate the majority of these low quality troll accounts who get verified and then immediately start spamming.
Having image uploads tied to user metrics such as interactions, time since creation, upvote / downvote count, etc. would, I feel, be a good indicator of a "real" user. You'll always have bad actors coming in causing issues, but at least making new users jump through hoops will make this process slower.
Closing registrations temporarily to add in extra mod features is fine, but leaving it closed and switching to an invite only system feels like it's going to slow adoption (unless in the request an invite form it's explicit that the request will be processed quickly, people will just move on otherwise)
All images which have federated in from other instances will be deleted from our servers, without any exception
At this point, we have millions of such images, and I am planning to just indiscriminately purge all of them. Posts from other instances will not be broken after the deletion, the deleted images will simply be loaded directly from other instances.
My impression was that this was how this worked from the beginning, but apparently that's wrong. I thought the host instance (that is, the instance of the user making the post, not necessarily the instance of the community) would be the host of the image. Instead, it seems like instances share images and whatnot between themselves, to distribute the load to their own users.
Maybe this core principle is flawed. It should definitely be reviewed, anyway.
I think images should never be cached from other instances in the first place. That is a huge oversight in pictrs: not only does it have the potential to cache unwanted content, it also causes the hosted images to rapidly accumulate, which increases storage requirements and is unfair to people who want to self-host a personal instance. Hosting a personal instance should not come with monstrous storage requirements or serious liability risk from automatically caching all images; it should only cache what is uploaded to the instance itself, like profiles, banners, and posts that include images from the instance.
I have reservations about fully invite-based registrations on lemmy instances. While I do think it might be good to have invites as a way for users to skip filling out an application, I don't really like the idea of requiring them like Tildes does; it makes the instance feel like an elitist, exclusive club where you have to beg users for an invite. I don't think invites should be an alternative to application-based registration, but rather a supplement to it: if someone can get an invite from a user, that's great, but if not, they should still be able to write an application to join. The application could be extensive and lower priority, since invites exist, but it should still be an option.
Account requirements really depend on what they are and what they restrict (and also on who on the instance is allowed to impose restrictions). For example, on instances with downvotes enabled, I think score/upvote requirements are a bad idea, since they essentially mean that people who disagree are locked out, like with karma restrictions on Reddit. I do not support this; it creates an echo chamber where unpopular opinions are silenced. It will also lead to upvote farming if there are penalties for having a lower score.
Comment or post requirements would just lead to post or comment farming, similar to vote farming, though that's not as bad as score requirements: posts and comments people make naturally (whether they are liked or not) can't be taken away by other people based on opinions (only if they break the rules and get posts removed, which isn't remotely similar, since they broke the rules).
Limiting image uploading is a fair requirement in my opinion since uploads can be particularly harmful if the uploads are malicious, and also uploads aren't really needed since people can externally host almost all their images without the need for uploads.
When it comes to DMs and restrictions around them, I feel that should be up to individual users: deciding whether to allow private communication from certain users, or to allow DMs at all, shouldn't be something globally applied to everyone. Maybe it could be a default in user settings, with a requirement set by the admins, but people should be able to turn it off if they don't care or want to accept messages from new users. I know I certainly will; I hate being nannied about who's allowed to send me messages. Annoying or uncomfortable DMs are a fact of life, and I prefer to deal with issues when they happen rather than block any new user who might want to talk to me. It's one of the things I hated that Reddit does without giving me the option to opt out and receive messages from everyone.
I think having a machine-learning based system to identify malicious images is actually a pretty good idea going forward. I know how some people feel about AI and machine learning, but I think it's probably our best defense, considering that none of us want to see that content. It might have false positives, but I'd rather have that than allow CSAM to live here. Ultimately the choice is to have ML scanning or to disable pictrs here, and I think ML is the better option, because people are going to want avatars, and without pictrs that isn't possible (unless Lemmy adds UI support for externally hosted avatars and banners).
I understand that this would be a temporary measure, and I hope this gets revisited in the near future.
Got to do what you have to do.
same as 2
I do not agree with invite-based registrations and would prefer other ways to limit sign ups such as what others have already suggested in this thread.
This will be tricky, but if done correctly would be something I can support.
Agreed.
Once again, thank you for this wonderful instance and I'm glad this is my home.
Might I suggest banning reported users? I think with a combination of users reporting posts for rule violations and mods and admins confirming and keeping them banned, it could be a better alternative for the time being.
So, Lemmy.World images seem to be 're-federating' here. I couldn't find any news items over there, but... did the CSAM issue finally get patched at the Lemmy software level?
Well as always users that did nothing wrong are the ones that suffer. I think banning images is overkill. Let the forum police themselves. It’s the way this is supposed to work. Just banning images site wide is pretty draconian and defeats the purpose of the fediverse. Blocking any images that could contain any level of nudity is also overkill. I’ll probably move to a self hosted server eventually.