Self-hosted lemmy without serving arbitrary federated content?
I want to self-host Lemmy and participate in federation. However, I wonder whether it's possible to have a setup where only I and trusted users are allowed to browse federated content.
Basically, guests should not be allowed to use my instance to browse other federated content. So requests to "mydomain.tld/c/whatever@otherdomain.tld" should not be possible. Only users logged in on my instance should be able to do that.
Despite that, guests should be allowed to see posts of communities posted on my instance, and users of other instances should be allowed to comment.
I know I can choose which other instances mine should link with, but this would make the experience inconvenient for me, because I would then need to adjust the config whenever I want to subscribe to a community on an instance I have not yet linked with.
Is such a setup possible? Unfortunately, I could not find the answer in the docs.
The only thing I can think of is something like blocking UI requests and allowing them only from localhost (so I would create an "ssh -L" tunnel to the server). Federation API endpoints would not be blocked. But this seems shaky. Does Lemmy support a cleaner, built-in solution?
My main concern is someone using my instance to browse illegal content from another instance, leaving me legally on the hook for it because my instance is now publicly serving that content.
Exactly this is what I am worried about, you can get into trouble quickly. Good luck explaining to lawyers/judges that it's not your content actually but just federation. Even if you could, in most jurisdictions you wouldn't be off-the-hook anyway.
I said this in a comment higher up, along with other info, but try looking up a remote community that isn't already known by an instance without being logged in. It won't look it up for you; it just silently fails. If unwanted content is what you're worried about, unfortunately a malicious actor can basically just drop content directly into your instance without prior notice if your federation is open. This is why db0 is working on systems that will in the future work like shared blacklists (opt-in, of course).
Seriously, if you really want to host your own instance, it is more or less your responsibility as an admin to moderate that instance. That includes purging and blocking unwanted content. There is no way to avoid that.
As for your suggestion, it largely boils down to restricting anonymous access to the search-related APIs of an instance. It is no doubt a good feature, especially for read-only instances. I think you can create an issue about it on GitHub to get more visibility from the devs.
Wouldn't this do basically nothing to prevent a 3rd party client from browsing your instance without authentication? I don't know that there's much that can really be done about this, because you need open APIs for other instances to be able to access the content of your instance in order to make federation possible. That said, it's an important point for anybody running a single-person instance to consider. If you run a single-person instance, people can learn a lot about you just by seeing which communities are available on your instance. The only way to obfuscate your actual interests is to have a dummy account subscribe to all the top communities on the biggest instances. (Honestly, this isn't a bad strategy to employ anyway if you want a fresh All feed.)
Yes, the basic auth approach I suggested only protects lemmy-ui from being accessed, which is the lowest-hanging fruit in the equation. That's also why I call it the "simplest way". "Interested parties" can still access your instance via the API if they know their way around.
open APIs for other instances to be able to access the content of your instance in order to make federation possible.
The federation API is independent of the front-end client API. You can run headless, without lemmy-ui, and federation still works. The API structure for federation is standardized, while the front-end client API is unique to Lemmy.
It would not affect federation as the endpoints are still open. But a word of caution. This only protects the lemmy-ui from being accessed without the basic auth credentials. If someone tries to access your instance via API, it will still work.
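To make the split concrete, here is a rough Python sketch of the decision a reverse proxy would make if basic auth covered only lemmy-ui. The path prefixes and Accept header values are assumptions modeled on a typical Lemmy reverse-proxy layout, and `needs_basic_auth` is a hypothetical helper name, so check them against your own proxy config before relying on this.

```python
# Assumed backend path prefixes (API, images, feeds, federation discovery).
BACKEND_PREFIXES = ("/api", "/pictrs", "/feeds", "/nodeinfo", "/.well-known")
# Assumed Accept values that mark a request as ActivityPub federation traffic.
ACTIVITYPUB_TYPES = ("application/activity+json", "application/ld+json")


def needs_basic_auth(method: str, path: str, accept: str = "") -> bool:
    """Return True only for requests that would be served by lemmy-ui."""
    if path.startswith(BACKEND_PREFIXES):
        return False  # client API and federation discovery stay open
    if method.upper() == "POST":
        return False  # assumed: POSTs (e.g. inbox deliveries) go to the backend
    if any(t in accept for t in ACTIVITYPUB_TYPES):
        return False  # other instances fetching ActivityPub objects
    return True  # everything else is a browser hitting the web UI
```

This also shows the limitation mentioned above: anything under the API prefix never sees the basic auth prompt, so a third-party client still gets through.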
It might. Some mods or instance admins might see your comments, decide to check your instance, find it suspicious because it's protected behind basic auth, and decide to block your instance. You can see in the modlog that people sometimes ban private instances (instances that don't let you see anything unless you're logged in) out of suspicion that they are a source of bot traffic.
A better way is probably to only protect your search page behind basic auth so no one can hook in new communities in your instance.
Anonymous users can't actually look up other instances' communities through yours in the same way logged-in users can. They'll only be able to see a remote community if a logged-in user on your instance has searched for it before and/or is subscribed, but they can't just arbitrarily make your instance look up other instances' communities.
Then I guess you could configure nginx to not allow /c/ requests that contain an @ unless the "jwt" cookie is present, and do the same with your search endpoints. Of course, someone could just add an arbitrary jwt cookie to try to bypass it, but if the point is more to keep the average anon user from wasting your server resources, I think that should do. Without search, and without the communities visible via /c/, nothing within them would be indexed in search results, so the only way for them to see a federated post through your instance would be a direct link to one.
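As a rough illustration of that rule, here is a minimal Python sketch of the check the proxy would perform. The search path prefixes and the `allow_request` and `is_remote_community_path` names are assumptions for illustration, not taken from Lemmy or nginx. Like the nginx idea, it only tests that a "jwt" cookie is present, not that it is valid.

```python
from urllib.parse import unquote, urlsplit

# Assumed search-related paths to gate; adjust to what your instance exposes.
SEARCH_PREFIXES = ("/search", "/api/v3/search", "/api/v3/resolve_object")


def is_remote_community_path(path: str) -> bool:
    """True for paths like /c/name@otherdomain.tld (also when @ is %40)."""
    decoded = unquote(path)
    if not decoded.startswith("/c/"):
        return False
    community = decoded[len("/c/"):].split("/", 1)[0]
    return "@" in community


def allow_request(url: str, cookies: dict[str, str]) -> bool:
    """Allow gated paths only when a non-empty jwt cookie is present."""
    path = urlsplit(url).path
    has_jwt = bool(cookies.get("jwt"))
    gated = is_remote_community_path(path) or path.startswith(SEARCH_PREFIXES)
    return has_jwt or not gated
```

For example, `allow_request("/c/whatever@otherdomain.tld", {})` is rejected, while the same path with any jwt cookie, or a local path such as `/c/local`, is allowed.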
Lemmy has a feature/setting called "Private instance" that I think could be used to achieve this, but I think that got broken at some point because it got tied to turning federation off... not sure what the current state is but may be worth looking into.