6 Comments
Tomasz Ducin

Curious: if you had a hypothetical need to use a document database, what would be your first choice? The question is deliberately broad 😉

Oskar Dudycz

Good question! Before I answer, a follow-up one: do you mean document databases specifically or key-value databases in general? 🤔

Tomasz Ducin

Document databases per se.

As for key-value, I guess that _could be_ Redis?

Context: Mongo used to be the thing, but it seems it's lost traction. In favor of what? Here comes the question.

Oskar Dudycz

I think I see where you're going 😅 If it was my choice, then I'd use my Pongo, so a document approach on top of Postgres. But if we put aside my pet projects, then...

I think the realistic options are:

1. Relational databases with JSONB support - PostgreSQL has the best one; MySQL and SQLite also offer binary JSON storage. JSONB (similarly to Mongo's BSON) is actually a binary format. The document is stored in a decomposed form rather than as raw text, so nested properties can be addressed by paths like `property.nested1.nested2.final` and array items like `property[0]`, which means they can be indexed. I should probably do a blog article on that :)

The downside is that the JSONPath API is not super friendly. Libraries like Pongo and Marten, which use PostgreSQL in this way, make this syntax easier to use.

There are also tools like FerretDB that sit in the middle as a proxy speaking the Mongo protocol. So you can use a Mongo client but connect to PostgreSQL or SQLite. The downside is that you need to keep the stateful proxy running somewhere.

2. Cloud-native clones of MongoDB - so DocumentDB in AWS, or the CosmosDB Mongo API. But well, they're also built on top of PostgreSQL, like FerretDB. So that's that... ;)

3. Key-value databases (DynamoDB, CosmosDB) or blob storage (S3, Azure Blob Storage). Document databases are a type of key-value database whose values have a structured schema. So, just as you can build one on top of a relational database, you can build one on top of a key-value store or even blob storage. Thanks to that, we can get serverless capabilities, which are hard to get with relational databases (as they need to be constantly running somewhere).

Blob storage can be cheaper, but it comes with higher latency and weaker consistency guarantees. You can use tools like DuckDB to query it.
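
To make option 3 a bit more concrete, here is a minimal sketch of a document collection layered on a key-value store. Everything in it is an assumption made for illustration: `KeyValueStore` is an in-memory stand-in for something like DynamoDB or S3, and `DocumentCollection` with its `insert`, `find_by_id`, and `find` methods is a hypothetical API, not taken from Pongo or any other library.

```python
import json
import uuid


class KeyValueStore:
    """In-memory stand-in for any store with get/put by key (hypothetical)."""

    def __init__(self):
        self._items = {}

    def put(self, key, value):
        self._items[key] = value

    def get(self, key):
        return self._items.get(key)

    def keys(self, prefix):
        return [k for k in self._items if k.startswith(prefix)]


class DocumentCollection:
    """Documents stored as JSON strings under '<collection>/<id>' keys."""

    def __init__(self, store, name):
        self._store = store
        self._prefix = f"{name}/"

    def insert(self, document):
        # Reuse a caller-provided _id, otherwise generate one.
        doc_id = document.get("_id") or str(uuid.uuid4())
        body = json.dumps({**document, "_id": doc_id})
        self._store.put(self._prefix + doc_id, body)
        return doc_id

    def find_by_id(self, doc_id):
        raw = self._store.get(self._prefix + doc_id)
        return json.loads(raw) if raw is not None else None

    def find(self, predicate):
        # No secondary indexes here: filtering means scanning every document.
        keys = self._store.keys(self._prefix)
        docs = (json.loads(self._store.get(k)) for k in keys)
        return [d for d in docs if predicate(d)]


store = KeyValueStore()
users = DocumentCollection(store, "users")
anita_id = users.insert({"name": "Anita", "address": {"city": "Wroclaw"}})
users.insert({"name": "Roger", "address": {"city": "Berlin"}})
in_berlin = users.find(lambda d: d["address"]["city"] == "Berlin")
```

Lookups by id map directly onto the store's `get`, while `find` has to scan the whole collection. That is the trade-off behind this option: indexing by anything other than the key has to be built separately.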

Of course, as always, pick your poison.
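
As a rough illustration of option 1, the sketch below stores documents in a relational table and indexes a nested path. It uses SQLite's JSON functions through Python's built-in `sqlite3` module, with JSON kept as text rather than PostgreSQL's JSONB, purely because it runs without a server. The `users` table, the `users_city` index, and the sample documents are made up for the example, and it assumes the bundled SQLite was built with JSON support (the default in recent versions).

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, data TEXT NOT NULL)")

documents = [
    {"name": "Anita", "address": {"city": "Wroclaw"}, "tags": ["admin", "dev"]},
    {"name": "Roger", "address": {"city": "Berlin"}, "tags": ["dev"]},
]
conn.executemany(
    "INSERT INTO users (data) VALUES (?)",
    [(json.dumps(doc),) for doc in documents],
)

# Expression index on a nested path; the query planner can use it
# for filters written with the same expression.
conn.execute(
    "CREATE INDEX users_city ON users (json_extract(data, '$.address.city'))"
)

# Filter on a nested property.
names_in_berlin = [
    row[0]
    for row in conn.execute(
        "SELECT json_extract(data, '$.name') FROM users "
        "WHERE json_extract(data, '$.address.city') = ?",
        ("Berlin",),
    )
]

# Address an array element by position.
first_tags = [
    row[0]
    for row in conn.execute(
        "SELECT json_extract(data, '$.tags[0]') FROM users ORDER BY id"
    )
]
```

The raw path expressions are the unfriendly part mentioned above. Document-style libraries on top of a relational database exist largely to hide that syntax behind a more familiar query API.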

Does this answer your question? If not, feel free to follow up. You can also check my past articles:

- https://event-driven.io/en/key-value-stores/

- https://www.architecture-weekly.com/p/using-s3-but-not-the-way-you-expected

- https://www.architecture-weekly.com/p/building-your-own-ledger-database

That's probably a good topic to expand on in one of the upcoming editions :)

Szymon Bernad

What's my paranoia level? Quite often it is above average, but it's fine. Not everyone has a talent for predicting actual worst-case scenarios ;)

> The key isn't to become less paranoid as you learn - it's to become more precisely paranoid. <

I like this one. When discussing tradeoffs or aiming for a "good enough" architecture, it's usually helpful to provide well-documented concerns rather than a gut feeling. It might turn out that in some cases blocking incorrect behavior when it occurs is just fine, and the system does not need to prevent the entire thing from happening.

Oskar Dudycz

Yes, it always amazes me how many fancy words and approaches we try to invent to name "gut feeling". 😅

That's why I like to look for a more systematic way to have proper discussions, focusing on the merits, not battle stories.

> Quite often it is above average, but it's fine. Not everyone has a talent for predicting actual worst-case scenarios

Yeah, I probably trust others more than myself. That's also why I'm blogging: to reevaluate my ideas and thoughts 😅
