See MongoDB
NoSQL has always been a niche thing.
For some stuff, no ACID is no problem. These databases have their place. What I'm more suspicious of is things like Google offering distributed databases and pretending they can break the CAP theorem.
And yet my uni treats it like the biggest thing in existence. Meanwhile I've never used anything other than an RDBMS and Redis (only as a cache), neither privately nor at work.
MongoDB is huge though, for all the wrong reasons: businesses think that just because it's JS, they can have frontend devs (sorry, they are "fullstack" now) doing DBA work.
I worked as one of two NoSQL DBAs for a Fortune 50 finance company, and there is a ton of CV-driven development going on, giving NoSQL a bad name. Most use cases don't need NoSQL. And for those that do, NoSQL is almost always harder to implement than a simple SQL-based RDBMS.
A sharded RDBMS gets you very, very far, in my experience at least.
If you need to run queries that aggregate large amounts of data in reasonable time and at reasonable cost, you'll need something built for it. For example, something with a column-oriented file format instead of the row-oriented file format found in traditional relational databases.
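To make the row vs. column point concrete, here is a toy sketch in Python. The table, the field names and the two helper functions are made up for illustration; real columnar formats add compression, encoding and on-disk layout on top of this idea.

```python
# Row-oriented layout: all fields of one record are stored together.
rows = [
    {"id": 1, "country": "DE", "amount": 10.0},
    {"id": 2, "country": "FR", "amount": 25.5},
    {"id": 3, "country": "DE", "amount": 4.5},
]

# Column-oriented layout: all values of one field are stored together.
columns = {
    "id": [1, 2, 3],
    "country": ["DE", "FR", "DE"],
    "amount": [10.0, 25.5, 4.5],
}


def total_amount_rows(rows):
    # Has to walk every record, including fields the query never uses.
    return sum(row["amount"] for row in rows)


def total_amount_columns(columns):
    # Only reads the single column the aggregate needs.
    return sum(columns["amount"])
```

Both functions return the same total; the difference is how much unrelated data an aggregate has to read to get there, which is what starts to matter at large volumes.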
It always depends on the context... My current job is 100% on Elasticsearch and I'm not missing transactions at all.
What's ACID?
Atomicity: either all parts of the transaction complete, or all parts of the transaction don't complete; there's no "partly complete" state
Consistency: the state of the database after a transaction is stable; all "downstream" effects (e.g. triggers) of the query are complete before the transaction is confirmed.
Isolation: concurrent transactions behave the same as sequential transactions
Durability: a power failure or crash won't lose any transactions
Traditionally, ACID is where relational databases shine.
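If it helps, here is a minimal sketch of the atomicity and consistency parts using Python's built-in sqlite3 module. The accounts table, the names and the transfer helper are invented for the example; isolation and durability are not demonstrated here.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as a single transaction."""
    try:
        with conn:  # commits if the block succeeds, rolls back on an exception
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the debit, so the credit was undone too.
        return False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # both updates are applied
failed = transfer(conn, "alice", "bob", 500)  # debit violates the CHECK, nothing is applied
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

The second transfer credits bob first and only then fails on alice's debit. Because both statements are in one transaction, the credit is rolled back as well, so there is no "partly complete" state.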
You've gotten good answers from other folks but I'll provide an ELI5:
Basically a set of rules in the database to make sure that it is immediately consistent.
Many NoSQL databases offer eventual consistency in exchange for speed, so they are generally not considered ACID compliant.
Most traditional databases (MySQL, PostgreSQL, etc.) are.
There are a couple of newer databases that try to tackle scaling for traditional databases. CockroachDB is a PostgreSQL-compatible database that scales more like NoSQL while still offering ACID transactions.
TiDB is a similar product, but MySQL-compatible.
https://en.m.wikipedia.org/wiki/ACID
Atomicity (something happens in its entirety or not at all), consistency (database is always in a valid state: if the database has constraints, they will always be honored), isolation (transactions don't step on each other), durability (a completed transaction stays completed even if there's a power failure).
Not a database expert, my parenthetical explanations may need work.
If that is what it takes to get these kick-ass benchmarks.
Amazing video
Or does writes to S3 via LSM
This is kinda absolute BS at this point, though.
Mongo has ACID transactions, and has for years now. Although this is only within the same database, there are plenty of DBMSs (including RDBMSs) that don't support cross-database transactions.
Mongo also, since time immemorial, has had "write concern" to ensure that it's written to disk (to the journal) before the transaction is completed.
This post is very timely because I was just introducing some new people to Mongo earlier this week and led off with "Now you might still hear people say 'mongo is trash, it's not even ACID compliant!' but those people are dumb... it's had that for years and years and is just another DBMS at this point (but not relational)"
... the last part also answers the other reply to this post. Yes.
Every time I'm assigned to a project that uses a document database
"So how are you guys handling all your related data?"
Finds collection of massive JSON documents containing all the related data
"Oh boy."
What's the problem with that? In my previous team, we had a structure with four levels of nesting where we only ever needed to query the first two levels. At first we used Postgres with normalized tables, but it was just slow as hell. Switching to MongoDB actually made our performance issues vanish.
Of course it all depends on what kinds of queries you need to run, but I don't think that large JSON documents are necessarily a problem.
They're talking about relations between data. For example, when you delete a user, you may also want to delete their stored data.
To some degree, this is less of a problem with document databases, because they don't force you to chop your data into small parts like relational databases do (e.g. you can have lists of that user's stored data as part of the JSON document). But you will likely still need some relations at some point.
Chances are you have a layer in your application code which enforces these relations instead.
Which is fine in my opinion. With relational databases, there are also often some relations which you cannot model in the database.
But yeah, it requires somewhat more software architecture awareness, to not lump the relation checking logic into general application logic. And you can't connect a second application to that database, without having to implement the relations another time or at least pulling them out into a shared library.
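A rough sketch of what such a layer could look like, in Python. Everything here is hypothetical: the UserStore class, the field names, and the plain dicts standing in for document collections. The point is only that the relation checks live in one place instead of being scattered through the application logic.

```python
class UserStore:
    """Hypothetical data-access layer; dicts stand in for collections."""

    def __init__(self):
        self.users = {}  # user_id -> user document
        self.files = {}  # file_id -> document with an "owner_id" reference

    def add_user(self, user_id, name):
        self.users[user_id] = {"name": name}

    def add_file(self, file_id, owner_id, path):
        # The check a foreign key would do for you in an RDBMS.
        if owner_id not in self.users:
            raise KeyError(f"unknown owner {owner_id!r}")
        self.files[file_id] = {"owner_id": owner_id, "path": path}

    def delete_user(self, user_id):
        # Hand-written cascade: remove the user's stored data first.
        self.files = {
            file_id: doc
            for file_id, doc in self.files.items()
            if doc["owner_id"] != user_id
        }
        self.users.pop(user_id, None)


store = UserStore()
store.add_user("u1", "Ada")
store.add_user("u2", "Bob")
store.add_file("f1", "u1", "/data/a.txt")
store.add_file("f2", "u2", "/data/b.txt")
store.delete_user("u1")
```

A second application talking to the same database would have to go through this same class (or a shared library containing it) to keep the relations intact.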
Those are rookie numbers.