Discussion about this post

Neural Foundry

Really solid walkthrough of the tenant isolation tradeoffs. The consistent hashing explainer tied to Kafka's partitioning model is super helpful for teams trying to wrap their heads around dynamic distribution. We hit a similar problem last year when we tried spinning up per-tenant workers, and coordination overhead killed us at around 300 tenants. The partitionBy approach with internal worker threads sounds way more practical than jumping straight to Raft/Paxos. Curious how checkpointing behaves when a thread crashes mid-batch, though.
