We take a Kafka client, call the producer, send the message, and boom, we expect it to be delivered on the other end. And that's actually how it goes. But wouldn't it be nice to understand better what happens behind the scenes? How is this data actually stored on disk? Where? When? That's what I did today: I built a dummy Kafka Producer and I'm taking you on the journey as the message goes through the broker, the partition, and the disk. Bon Appétit!
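If you want a concrete starting point, here is roughly what that "call the producer, send the message" step looks like with the official Java client. The broker address, topic name, and payload are placeholders for illustration, not the code from the article.

```java
// A minimal sketch of the "send and expect delivery" happy path using the
// official Java client. Broker address, topic, and payload are illustrative.
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;
import java.util.Properties;

public class DummyProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // From the caller's point of view this is the whole story...
            producer.send(new ProducerRecord<>("orders", "order-123", "{\"status\":\"confirmed\"}"));
            // ...while under the hood the record is batched, routed to the
            // partition leader, appended to a log segment, and eventually
            // flushed to disk.
            producer.flush();
        }
    }
}
```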
That's an amazing post, Oskar. I'm impressed by its depth. This is CodeCrafters-worthy (https://app.codecrafters.io/catalog) material - have you considered creating a course from this type of content? It might reach an even bigger audience that wants a deeper understanding of event streaming.
Thank you, Thiago! I have been thinking for a long time about an online course on Event Sourcing, but I haven't decided to do it yet. Maybe, indeed, I could turn this blog article series into a course or e-book. I'll think about it :)
I like this post very much! The most interesting part for me was the one where you showed how the high-level concept of the WAL is implemented while staying aware of the physical constraints of fsync concurrency. With workloads where I expect high load, it’s important to me to ensure I’ve configured the crucial parts of the app settings correctly (like the batch or replica sync configuration), and this post was one of the most concise guides on basic producer/broker configuration I’ve read in recent months.
Having this kind of cross-layer understanding helps build significantly more robust software in fewer iterations than usual.
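For context, the batch and replica-sync knobs mentioned above live mostly on the producer side. Below is a sketch using the official Java client; the values are arbitrary examples for illustration, not recommendations from the post.

```java
// Illustrative only: producer-side batching and durability settings.
// Values are arbitrary examples, not tuning advice from the article.
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;
import java.util.Properties;

public class TunedProducerConfig {
    public static KafkaProducer<String, String> create() {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        // Batching: accumulate up to 64 KB per partition, or wait at most 10 ms
        // before sending a batch to the broker.
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, 65536);
        props.put(ProducerConfig.LINGER_MS_CONFIG, 10);

        // Durability: acknowledge only once all in-sync replicas have the record;
        // on the broker/topic side this pairs with min.insync.replicas.
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);

        return new KafkaProducer<>(props);
    }
}
```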
Thank you! Such a comment is exactly the one I hoped to get 😅 My goal was precisely to show that the magic is that there's no magic, and the laws of physics still apply to Kafka and the other tools we use. I personally like to learn new things this way: zoom in, zoom out, to understand both the micro and macro scale.
As you said, understanding how fsync works can be seen as a technical detail, but it's not, as it helps to understand where the limitations are and how much we can bend the tool to our will. Plus, it helps to understand the original use case behind it. 🙂
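To make that concrete: by default Kafka leans on replication and the OS page cache rather than fsync-ing every record, and it exposes that trade-off as topic-level flush settings. Here's a sketch (not from the article, and with illustrative names and values) of setting them when creating a topic with the admin client:

```java
// A sketch of how the fsync trade-off surfaces as topic-level settings.
// Topic name and values are illustrative, not recommendations.
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;
import java.util.List;
import java.util.Map;
import java.util.Properties;

public class CreateTopicWithFlushSettings {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            NewTopic topic = new NewTopic("orders", 3, (short) 3)
                .configs(Map.of(
                    "min.insync.replicas", "2",  // the replica-sync side of durability
                    "flush.messages", "10000",   // fsync after this many messages...
                    "flush.ms", "1000"           // ...or at least this often
                ));
            admin.createTopics(List.of(topic)).all().get();
        }
    }
}
```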
Great explanation, Oskar! I would really like to read the ending - the consumer view :)
Thank you, happy that it was helpful!
I’ll try to cover that on Monday 🙂 Are there any specific parts that I should expand on?
I'm also curious which part of this article clicked with you the most 🙂
Mostly the part about file segments and the broker's perspective.
Thank you for expanding. 🙂
Such feedback is helpful to fine-tune the scope and style of further releases.