I have a ton of stuff running at home, ranging from a 5-node Pi cluster with various containers on it (including self-written Python scripts doing "super important" stuff and Node-RED running everything about my energy setup) to pfSense, TrueNAS, etc.
Logging is painful, and I've just lost about 4 hours trying to find a fault that stopped car charging. I went down a lot of rabbit holes because I currently don't have an end-to-end logging solution.
I tried some tools, and the one I'm currently using is OpenObserve. It's light, has very good compression, and is simple to manage. As an observability platform, I think OpenObserve has features that can stand in for Datadog, like log ingestion and OpenTelemetry traces.
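For anyone curious, pushing a log record into it over HTTP looks roughly like the sketch below. The endpoint path, org/stream names, and credentials are placeholders for illustration, so double-check them against the docs and your own instance:

    # Rough sketch: send one JSON log record to an OpenObserve instance over HTTP.
    # URL path, org ("default"), stream ("homelab"), and credentials are placeholders.
    import requests

    OO_URL = "http://localhost:5080/api/default/homelab/_json"
    AUTH = ("root@example.com", "change-me")  # basic auth: your OpenObserve user/password

    record = [{
        "level": "error",
        "service": "ev-charger",
        "message": "charge session aborted: wallbox unreachable",
    }]

    resp = requests.post(OO_URL, json=record, auth=AUTH, timeout=5)
    resp.raise_for_status()
    print(resp.json())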
They state in their documentation that the software is alpha (https://openobserve.ai/docs/: "OpenObserve is currently in alpha, but don't let that stop you from trying it out."). To be honest, I didn't bother to investigate why ingesting data stops working after a few days; it might just be my installation.
I'm very curious which organisation uses alpha software in production.
It's a hybrid solution, but I prefer putting my logs with an S3 provider; it's just cheap storage that I don't have to care about. And there are a lot of tools that can do that, Loki for example.
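If you want to sanity-check that the log store is actually writing its chunks into the bucket, a quick listing is enough. Bucket name and endpoint below are made up; any S3-compatible provider works via endpoint_url:

    # Sketch: list a few objects in the bucket the log store writes to.
    # Bucket and endpoint are placeholders for whatever S3 provider you use.
    import boto3

    s3 = boto3.client("s3", endpoint_url="https://s3.example.com")
    resp = s3.list_objects_v2(Bucket="homelab-logs", MaxKeys=10)
    for obj in resp.get("Contents", []):
        print(obj["Key"], obj["Size"])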
https://github.com/openobserve/openobserve. Built in Rust, no JVM. Much lighter than the alternatives mentioned here, with an extremely good UI and beautiful dashboards. It could even run on a Raspberry Pi.
I come from a cybersecurity background, which might explain my answer: Security Onion has proven adept at cross-referencing logs and pcaps, which is pretty awesome for troubleshooting.
For most self-hosted use cases, Splunk's free 500 MB/day license should be enough.
It's way easier to set up and maintain than ELK, and it has tons of free extensions for parsing log formats and for dashboards.