<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Architecture Weekly]]></title><description><![CDATA[Weekly Software Architecture resources to boost your knowledge and developer skills.]]></description><link>https://www.architecture-weekly.com</link><image><url>https://substackcdn.com/image/fetch/$s_!dCx9!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F3ab6cef1-9ce5-47fa-9306-17854b383dc1_358x358.png</url><title>Architecture Weekly</title><link>https://www.architecture-weekly.com</link></image><generator>Substack</generator><lastBuildDate>Wed, 29 Apr 2026 18:35:29 GMT</lastBuildDate><atom:link href="https://www.architecture-weekly.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Oskar Dudycz]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[architectureweekly@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[architectureweekly@substack.com]]></itunes:email><itunes:name><![CDATA[Oskar Dudycz]]></itunes:name></itunes:owner><itunes:author><![CDATA[Oskar Dudycz]]></itunes:author><googleplay:owner><![CDATA[architectureweekly@substack.com]]></googleplay:owner><googleplay:email><![CDATA[architectureweekly@substack.com]]></googleplay:email><googleplay:author><![CDATA[Oskar Dudycz]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Vibing, Harness and OODA loop]]></title><description><![CDATA[On why Vibing and Harness are not new and why feedback loops are important]]></description><link>https://www.architecture-weekly.com/p/vibing-harness-and-ooda-loop</link><guid 
isPermaLink="false">https://www.architecture-weekly.com/p/vibing-harness-and-ooda-loop</guid><dc:creator><![CDATA[Oskar Dudycz]]></dc:creator><pubDate>Mon, 27 Apr 2026 05:27:27 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!8Ufw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c9101dc-f367-4ec3-99a5-98a1ebaab933_626x391.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8Ufw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c9101dc-f367-4ec3-99a5-98a1ebaab933_626x391.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8Ufw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c9101dc-f367-4ec3-99a5-98a1ebaab933_626x391.png 424w, https://substackcdn.com/image/fetch/$s_!8Ufw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c9101dc-f367-4ec3-99a5-98a1ebaab933_626x391.png 848w, https://substackcdn.com/image/fetch/$s_!8Ufw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c9101dc-f367-4ec3-99a5-98a1ebaab933_626x391.png 1272w, https://substackcdn.com/image/fetch/$s_!8Ufw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c9101dc-f367-4ec3-99a5-98a1ebaab933_626x391.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!8Ufw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c9101dc-f367-4ec3-99a5-98a1ebaab933_626x391.png" width="626" height="391" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3c9101dc-f367-4ec3-99a5-98a1ebaab933_626x391.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:391,&quot;width&quot;:626,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;cover&quot;,&quot;title&quot;:&quot;cover&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="cover" title="cover" srcset="https://substackcdn.com/image/fetch/$s_!8Ufw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c9101dc-f367-4ec3-99a5-98a1ebaab933_626x391.png 424w, https://substackcdn.com/image/fetch/$s_!8Ufw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c9101dc-f367-4ec3-99a5-98a1ebaab933_626x391.png 848w, https://substackcdn.com/image/fetch/$s_!8Ufw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c9101dc-f367-4ec3-99a5-98a1ebaab933_626x391.png 1272w, https://substackcdn.com/image/fetch/$s_!8Ufw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c9101dc-f367-4ec3-99a5-98a1ebaab933_626x391.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><blockquote><p>Hey, have a look at what I made during the weekend. I had some time, grabbed a beer, turned on the computer and tried to code this feature. If I could do so much during the weekend, how much could you and your team do with it in 2 weeks?</p></blockquote><p>It&#8217;s almost a 1:1 quote of what I heard from a startup founder I worked with over 10 years ago. I&#8217;m sure that you&#8217;ve heard similar phrases from people you&#8217;ve worked with. We all know the annoying type of person who doesn&#8217;t code anymore but thinks, <em>&#8220;I still got it!&#8221;</em>. Then they throw a piece of stuff at you to <em>&#8220;just fine-tune it a bit and do final touches&#8221;</em>. 
Then they&#8217;re the first ones to ask &#8220;Why so long?&#8221;.</p><p>Nowadays, the Internet is full of such people. They shout about what they did with Claude or how much progress LLM tools have made. Some even predict the end of coding. I already wrote that <a href="https://event-driven.io/en/the_end_of_coding_wrong_question/">this is the wrong perspective</a>. I won&#8217;t repeat that, but I want to say that&#8230;</p><p><strong>Vibing isn&#8217;t new and isn&#8217;t always an issue.</strong></p><p>I&#8217;m saying that LLM tools are a praise of ignorance. The more ignorant we are of the topic we&#8217;re working with, the better the outcomes look to us. And that, by itself, is not always bad, as there&#8217;s <a href="https://event-driven.io/en/power_of_ignorance/">power in ignorance</a> if we focus on getting it done with the simplest tools we have.</p><p>Still, this can be terrible if we fall too much in love with what we&#8217;ve vibed.</p><p>To understand why that &#8220;weekend beer&#8221; energy is both a superpower and a liability, we need to look at the OODA Loop.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PCEe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c8fad63-3571-4523-8385-8f1ec279f98e_800x436.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PCEe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c8fad63-3571-4523-8385-8f1ec279f98e_800x436.png 424w, https://substackcdn.com/image/fetch/$s_!PCEe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c8fad63-3571-4523-8385-8f1ec279f98e_800x436.png 848w, 
https://substackcdn.com/image/fetch/$s_!PCEe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c8fad63-3571-4523-8385-8f1ec279f98e_800x436.png 1272w, https://substackcdn.com/image/fetch/$s_!PCEe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c8fad63-3571-4523-8385-8f1ec279f98e_800x436.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PCEe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c8fad63-3571-4523-8385-8f1ec279f98e_800x436.png" width="800" height="436" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5c8fad63-3571-4523-8385-8f1ec279f98e_800x436.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:436,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;OODA loop&quot;,&quot;title&quot;:&quot;OODA loop&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="OODA loop" title="OODA loop" srcset="https://substackcdn.com/image/fetch/$s_!PCEe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c8fad63-3571-4523-8385-8f1ec279f98e_800x436.png 424w, https://substackcdn.com/image/fetch/$s_!PCEe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c8fad63-3571-4523-8385-8f1ec279f98e_800x436.png 848w, 
https://substackcdn.com/image/fetch/$s_!PCEe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c8fad63-3571-4523-8385-8f1ec279f98e_800x436.png 1272w, https://substackcdn.com/image/fetch/$s_!PCEe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c8fad63-3571-4523-8385-8f1ec279f98e_800x436.png 1456w" sizes="100vw"></picture></div></a></figure></div><p>Disclaimer: this isn&#8217;t a competitor to the Ralph Wiggum Loop. 
It&#8217;s much older and more generic.</p><p>Military strategist John Boyd developed the OODA loop (Observe, Orient, Decide, Act) for fighter pilots. In a dogfight, the pilot who cycles through these four stages the fastest and most accurately survives.</p><p>In software, the &#8220;dogfight&#8221; is the gap between your intent and the production-ready feature.</p><p><strong>The OODA loop is built from four steps:</strong></p><ol><li><p><strong>Observe</strong> - This is the intake of raw, unfiltered information. In our world, this means looking at the state of the system.</p></li><li><p><strong>Orient</strong> - This is the most critical and difficult stage. It&#8217;s where you filter your observations through your experience, culture, and technical knowledge.</p></li><li><p><strong>Decide</strong> - Based on your orientation, you formulate a hypothesis.</p></li><li><p><strong>Act</strong> - You execute.</p></li></ol><p>Getting back to my favourite founder and LLM-based tools.</p><p><strong>The reason the founder could build a PoC in a weekend while the team needed more than two weeks is that he bypassed the Observe and Orient phases. He went straight from a vague idea to Act.</strong></p><p>If we skip or brush past the observation step, it feels like lightning speed. If the fancy UI grid is there and it does something we wanted, we move on. We&#8217;ve outsourced Orientation to our own ego. It&#8217;s too easy to assume that because we wrote it, it works.</p><p><strong>Observation is the intake of raw data.</strong> In a professional environment, our eyes aren&#8217;t enough. We need a Harness. If we don&#8217;t have automation, tests, integration tests, and pristine traces, we aren&#8217;t observing the system; we&#8217;re just looking at it. If the inputs are messy, our observation is clouded.</p><p>But real engineering, the kind that takes those &#8220;two weeks&#8221;, is about closing the loop properly. 
That&#8217;s also where we need different perspectives and knowledge sharing.</p><p><strong>Orientation is where you process those observations.</strong> This is the part where LLMs make us feel smarter than we are. If we don&#8217;t understand how a database handles concurrent connections, our &#8220;orientation&#8221; on a generated script will be shallow. We&#8217;ll see code that &#8220;looks&#8221; right, decide it&#8217;s fine, and act by deploying it.</p><p>The &#8220;I still got it&#8221; crowd loves the Decide and Act phases because that&#8217;s where the visible progress happens. LLM tools have made these phases nearly instantaneous. We can decide to build a feature and have the code for it in ten seconds.</p><p>The problem is that the faster we Act, the faster we need to Observe. If our &#8220;Act&#8221; phase takes seconds but our &#8220;Observe&#8221; phase requires a manual weekend of clicking around and drinking beer, our OODA loop is broken. We&#8217;re just generating a pile of stuff that we haven&#8217;t actually verified.</p><p><strong>That&#8217;s why the team usually needs more than an imaginary &#8220;two weeks&#8221;.</strong> They are not &#8220;fine-tuning&#8221; a single brilliant dude&#8217;s masterpiece. They are building the infrastructure required to make the OODA loop sustainable.</p><p>And to make that possible, they need to run the full loop: Observe, Orient, Decide, Act. And do it multiple times. That takes time, but it&#8217;s required to assess the direction, automate what needs to be automated, and ensure they can iterate further and run this loop sustainably. That&#8217;s critical for delivering the outcome at the expected pace.</p><p>Of course, there&#8217;s a danger here: overfocusing on Orient and Decide can lead to overengineering and building stuff we don&#8217;t need. That&#8217;s where ignorance can be blissful, especially when we combine it with humility. 
That means being humble about what we don&#8217;t know, trying things the easiest way, then learning and making enhancements. Still, humility fails under deadline pressure. The harness doesn&#8217;t.</p><p>Let me give you&#8230;</p><h2>The example</h2><p>I&#8217;m adding proper Observability and OpenTelemetry to <a href="https://github.com/event-driven-io/emmett">Emmett</a> right now. I spent some time working on it and instrumented the first component: <a href="https://event-driven-io.github.io/emmett/getting-started.html#command-handling">Command Handling</a>.</p><p>Of course, I had tests to prove it works, but I didn&#8217;t trust them enough, and I wanted to try it on a real sample, since you never know until you run it. Even the best test suite won&#8217;t tell you everything.</p><p>So I decided to plug it into the <a href="https://github.com/event-driven-io/emmett/tree/main/samples/webApi/expressjs-with-postgresql">sample</a>. I wanted to see if it works, how ergonomic the API is, and how it fits the conventions in this area.</p><p>To do it, I decided to use the <a href="https://grafana.com/">Grafana stack</a> and set it up with Docker Compose. So, a stable, boring stack. Not going to lie, I vibed the config. Not that there are no docs, but I intentionally wanted to see the typical config people use.</p><p>If someone says LLM-based tools are great at proofs of concept, they don&#8217;t run the stuff they vibed. Had I based my observation on the initial config alone, the oriented decision would have been that it doesn&#8217;t work. Of course, then I did the typical back-and-forth, with the LLM tool doing some Linux command Voodoo to make it work. Once. Then, if you try to repeat it, you won&#8217;t know how to do it without doing Voodoo again.</p><p>Again, that&#8217;s not much different from the other stuff we do. 
I&#8217;m sure you&#8217;ve seen multiple cases where someone didn&#8217;t use Continuous Deployment tools, but clicked through the Azure, AWS, or GCP portal, deployed the stack, and then there was no trace of how to set it up again (e.g. to have a different environment for testing or demos for customers).</p><p><strong>So, we need a harness, we need a leash to keep our process on track.</strong></p><p>How to do the harness? My advice is to start simple. We may ask LLMs to give us shell scripts, and we may ask them to run them multiple times. We also need experience and knowledge of what we want to achieve and the tools we use. It&#8217;s fine not to remember all the YAML config to set up the Grafana stack, but it&#8217;s not fine not to understand why you even use it, how its parts relate, and how to set it up.</p><p>Still, our first loop can close on the first working solution, even a manually vibed one. But that&#8217;s not even a PoC. We need to automate it.</p><p>I asked the LLM to take notes on the issues it had and how it solved them. Then, based on that, I asked it to research how to code it in TypeScript, using tools I know and have used in the past, while checking whether there are newer, more modern ones. For instance, I was a big fan of <a href="https://gulpjs.com/">Gulp.js</a> and <a href="https://github.com/adamralph/bullseye">Bullseye</a> in the past, but they&#8217;re mostly dead. 
I wanted to have something in the same spirit, using native, maintained tooling.</p><p>I ended up with the following tools:</p><ul><li><p><a href="https://github.com/sindresorhus/execa">execa</a> for running shell scripts,</p></li><li><p><a href="https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API">native fetch</a> for calling HTTP endpoints,</p></li><li><p><a href="https://nodejs.org/api/test.html">native Node.js test tools</a> for checking if the stack works as expected.</p></li></ul><p>Then I asked it to create the script to automate the shell Voodoo it did to make the Grafana stack and Docker Compose work.</p><p><strong>Essentially, it should:</strong></p><ol><li><p>Run Docker Compose to start the services (Grafana, Prometheus, Loki, Tempo, PostgreSQL, etc.).</p></li><li><p>Wait until they&#8217;re ready (it usually takes some time).</p></li><li><p>Start the application and make a request.</p></li><li><p>Check if the predefined dashboard with Emmett metrics appears and shows the expected traces and metrics.</p></li></ol><p>The initial diagnostic tools looked like this:</p><pre><code><code>// Wraps fetch and logs the status and body of any non-2xx response before returning it.
async function fetchWithDiag(label: string, url: string, init?: RequestInit) {
  const res = await fetch(url, init);
  if (!res.ok) {
    const body = await res.text().catch(() =&gt; '(could not read body)');
    console.error(`\n  &#10007; ${label} &#8594; HTTP ${res.status}\n  body: ${body}\n`);
  }
  return res;
}

// Prints up to five emmett_* samples exposed by the collector, or notes that none exist.
async function diagnoseCollector() {
  const text = await fetch(URLS.otelCollectorMetrics)
    .then((r) =&gt; r.text())
    .catch(() =&gt; 'unreachable');
  const emmett = text
    .split('\n')
    .filter((l) =&gt; l.startsWith('emmett_') &amp;&amp; !l.startsWith('#'))
    .slice(0, 5);
  console.log(
    emmett.length
      ? `\n  collector /metrics (emmett lines):\n  ${emmett.join('\n  ')}`
      : '\n  collector /metrics: no emmett_* lines found',
  );
}

// Lists the emmett_* metric names Prometheus knows about so far.
async function diagnosePrometheus() {
  const json = await fetch(
    `${URLS.prometheus}/api/v1/label/__name__/values`,
  )
    .then((r) =&gt; r.json() as Promise&lt;{ data: string[] }&gt;)
    .catch(() =&gt; ({ data: [] as string[] }));
  const emmett = json.data.filter((n) =&gt; n.startsWith('emmett_'));
  console.log(
    emmett.length
      ? `\n  Prometheus emmett_* metrics: ${emmett.join(', ')}`
      : '\n  Prometheus: no emmett_* metrics found yet',
  );
}

// Lists the label names Loki reports, as a quick hint whether any logs have arrived.
async function diagnoseLoki() {
  const labels = await fetch(`${URLS.loki}/loki/api/v1/labels`)
    .then((r) =&gt; r.json() as Promise&lt;{ data?: string[] }&gt;)
    .catch(() =&gt; ({ data: [] as string[] }));
  console.log(`\n  Loki labels: ${(labels.data ?? []).join(', ') || '(none)'}`);
}

// Prints the last few log lines of the given Docker Compose service.
async function diagnoseDockerLogs(service: string, lines = 10) {
  const { stdout } = await execa('docker', [
    ...COMPOSE,
    'logs',
    '--tail',
    String(lines),
    service,
  ]).catch(() =&gt; ({ stdout: '(could not get logs)' }));
  console.log(`\n  docker logs ${service} (last ${lines}):\n  ${stdout.split('\n').join('\n  ')}`);
}</code></code></pre><p>Are they pretty? No. Can they be improved? Yes. Do they have to be improved at this specific moment? No.</p><p>The setup uses the test infrastructure:</p><pre><code><code>
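// Imports inferred from the calls used in these snippets; the article omits them.
import assert from 'node:assert/strict';
import { randomUUID } from 'node:crypto';
import { after, before, test } from 'node:test';
import { execa } from 'execa';

// Opt-in switches read from environment variables ('1' or 'true' turns them on).
const CLEANUP = process.env['CLEANUP'] === '1' || process.env['CLEANUP'] === 'true';
const CLEANUP_AFTER = process.env['CLEANUP_AFTER'] === '1' || process.env['CLEANUP_AFTER'] === 'true';
const NO_START = process.env['NO_START'] === '1' || process.env['NO_START'] === 'true';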

// &#9472;&#9472;&#9472; configuration &#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;

const COMPOSE = ['compose', '-f', 'docker-compose.yml', '--profile', 'observability'];

const URLS = {
  app: 'http://localhost:3000',
  prometheus: 'http://localhost:9090',
  tempo: 'http://localhost:3200',
  loki: 'http://localhost:3100',
  grafana: 'http://localhost:3001',
  otelCollectorMetrics: 'http://localhost:8889/metrics',
};

// Fresh client per run &#8212; avoids stale cart state from previous runs.
const SERVICE_NAME = 'expressjs-with-postgresql';
const CLIENT_ID = randomUUID();
const CART_ENDPOINT = `${URLS.app}/clients/${CLIENT_ID}/shopping-carts/current/product-items`;
const CONFIRM_ENDPOINT = `${URLS.app}/clients/${CLIENT_ID}/shopping-carts/current/confirm`;

// Matches the .http file &#8212; unitPrice is resolved server-side.
const ADD_PRODUCT_BODY = JSON.stringify({ productId: randomUUID(), quantity: 10 });
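
// --- polling helpers: a hypothetical sketch, not the code from the repository ---
// The hooks and tests in this file call waitFor(...) and checkUrl(...), which the
// excerpt does not show. The versions below are assumptions inferred only from
// those call sites (argument order and the timeout/interval/label options); the
// 1-second default interval and the log wording are made up for illustration.
type WaitOptions = { timeout: number; interval?: number; label: string };

// Polls `check` until it returns a truthy value; throws once `timeout` ms have passed.
async function waitFor(check: () => unknown, options: WaitOptions) {
  const { timeout, interval = 1_000, label } = options;
  const deadline = Date.now() + timeout;
  for (;;) {
    if (await check()) return;
    if (Date.now() > deadline) {
      throw new Error(`Timed out after ${timeout} ms waiting for ${label}`);
    }
    await new Promise((resolve) => setTimeout(resolve, interval));
  }
}

// Returns true when `url` answers with a 2xx status and the optional `validate`
// callback accepts the response. Connection errors count as "not ready yet".
async function checkUrl(label: string, url: string, validate?: (res: Response) => unknown) {
  try {
    const res = await fetch(url);
    if (!res.ok) {
      console.log(`    ${label}: HTTP ${res.status}`);
      return false;
    }
    return validate ? Boolean(await validate(res)) : true;
  } catch {
    console.log(`    ${label}: not reachable yet`);
    return false;
  }
}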


before(async () =&gt; {
  console.log(`\n&#9654; client ID for this run: ${CLIENT_ID}\n`);

  if (NO_START) {
    console.log('&#9654; --no-start: skipping docker compose and app startup');
    return;
  }

  if (CLEANUP) {
    console.log('&#9654; --cleanup: killing port 3000 and tearing down stack (down -v)&#8230;');
    await execa('bash', ['-c', 'fuser -k 3000/tcp 2&gt;/dev/null || true']).catch(() =&gt; {});
    await new Promise((r) =&gt; setTimeout(r, 500));
    await execa('docker', [...COMPOSE, 'down', '-v', '--remove-orphans'], {
      stdio: 'inherit',
    });
  }

  const stackReady = await fetch(`${URLS.prometheus}/-/ready`)
    .then((r) =&gt; r.ok)
    .catch(() =&gt; false);

  if (stackReady) {
    console.log('&#9654; observability stack already up &#8212; skipping docker compose up');
  } else {
    console.log('&#9654; starting observability stack&#8230;');
    await execa('docker', [...COMPOSE, 'up', '-d'], { stdio: 'inherit' });
  }

  console.log('&#9654; waiting for backends&#8230;');
  await waitFor(() =&gt; checkUrl('Prometheus', `${URLS.prometheus}/-/ready`), {
    timeout: 90_000, label: 'Prometheus',
  });
  await waitFor(() =&gt; checkUrl('Grafana', `${URLS.grafana}/api/health`), {
    timeout: 90_000, label: 'Grafana',
  });
  await waitFor(() =&gt; checkUrl('Tempo', `${URLS.tempo}/ready`), {
    timeout: 90_000, label: 'Tempo',
  });
  await waitFor(() =&gt; checkUrl('Loki', `${URLS.loki}/ready`), {
    timeout: 90_000, label: 'Loki',
  });

  // /health returns { status: 'ok', service: 'expressjs-with-postgresql' } &#8212;
  // checking service name lets us distinguish our app from other processes on :3000.
  const checkOurApp = () =&gt;
    checkUrl('app /health', `${URLS.app}/health`, async (res) =&gt; {
      const json = (await res.json().catch(() =&gt; ({}))) as { service?: string };
      if (json.service !== SERVICE_NAME) {
        console.log(
          `    app /health: service="${json.service ?? '(missing)'}", expected="${SERVICE_NAME}"`,
        );
        return false;
      }
      return true;
    });

  const appIsOurs = stackReady &amp;&amp; (await checkOurApp());

  if (appIsOurs) {
    console.log('&#9654; app already running and healthy &#8212; skipping npm start');
  } else {
    const portTaken = await fetch(URLS.app).then(() =&gt; true).catch(() =&gt; false);
    if (portTaken) {
      // Port is occupied but not by our app &#8212; stale process or unrelated service.
      console.error(
        '\n  &#10007; Port 3000 is occupied by a process that is not this app.\n' +
          '  It may be a stale version of this app (connected to a wiped database)\n' +
          '  or a completely different service.\n' +
          '  Fix: run  npm run verify:observability:cleanup  to kill it and restart,\n' +
          '  or manually free port 3000.\n',
      );
      process.exit(1);
    }

    console.log('&#9654; starting app&#8230;');
    app = execa('npm', ['start'], { stdio: 'inherit' });

    await waitFor(checkOurApp, { timeout: 60_000, label: 'app /health' });
  }

  console.log('&#9654; setup complete\n');
});</code></code></pre><p>As you can see, nothing fancy. The cleanup is even simpler:</p><pre><code><code>after(async () =&gt; {
  if (app) {
    console.log('\n&#9654; stopping app&#8230;');
    app.kill('SIGTERM');
    await app.catch(() =&gt; {});
    console.log('&#9654; app stopped');
  }

  if (CLEANUP_AFTER) {
    console.log('&#9654; tearing down stack (down -v)&#8230;');
    await execa('docker', [...COMPOSE, 'down', '-v', '--remove-orphans'], {
      stdio: 'inherit',
    });
    console.log('&#9654; stack torn down');
  } else {
    console.log('&#9654; stack is still running');
    console.log('&#9654; to clean up: npm run verify:observability:cleanup');
  }
});</code></code></pre><p>With that in place, we can run the tests:</p><pre><code><code>
test('successful command returns x-trace-id header', async () =&gt; {
  const res = await fetchWithDiag('POST add product', CART_ENDPOINT, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: ADD_PRODUCT_BODY,
  });

  assert.equal(res.status, 204, `Expected 204 &#8212; body logged above`);

  const header = res.headers.get('x-trace-id');
  if (!header) {
    console.error(
      '  &#10007; x-trace-id missing &#8212; verify the wrapper app in src/index.ts ' +
        'adds it via @opentelemetry/api before mounting the emmett app',
    );
  }
  assert.ok(header, 'x-trace-id header missing');
  assert.match(header, /^[0-9a-f]{32}$/, `"${header}" is not a 32-hex trace ID`);

  traceId = header;
  console.log(`  trace ID: ${traceId}`);
});

test('OTel collector exposes Emmett metrics on port 8889', async () =&gt; {
  // Send a few more requests so metrics are definitely recorded.
  for (let i = 0; i &lt; 5; i++) {
    await fetch(CART_ENDPOINT, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: ADD_PRODUCT_BODY,
    });
  }

  try {
    await waitFor(
      async () =&gt; {
        let text: string;
        try {
          const res = await fetch(URLS.otelCollectorMetrics);
          text = await res.text();
        } catch {
          console.log('    collector :8889: connection refused');
          return false;
        }
        // Keep only sample lines: drop blanks and '#' comment lines.
        const lines = text.split('\n').filter((l) =&gt; l &amp;&amp; !l.startsWith('#'));
        const emmettLines = lines.filter((l) =&gt; l.startsWith('emmett_'));
        if (emmettLines.length === 0) {
          const allFamilies = [...new Set(lines.map((l) =&gt; l.split('{')[0]))].slice(0, 5);
          console.log(`    collector :8889: no emmett_* metrics yet. Present: ${allFamilies.join(', ') || '(none)'}`);
          return false;
        }
        return true;
      },
      { timeout: 90_000, interval: 5_000, label: 'emmett metrics on collector :8889' },
    );
  } catch (err) {
    await diagnoseCollector();
    await diagnoseDockerLogs('otel-collector');
    throw err;
  }
});</code></code></pre><p>I put it into a <a href="https://github.com/event-driven-io/emmett/blob/a937ff98ba39d3e504540886d8cd918843b28149/samples/webApi/expressjs-with-postgresql/src/observability.spec.ts">single file</a> that can be run as a regular Node.js script.</p><p>It already showed me (and Claude) that what it initially did didn&#8217;t work if you tried to run it multiple times. It also showed that doing a full cleanup and rebuild, and making it reproducible, needs more work.</p><p>Is it done? Not yet; it takes too much time and resources to run it continuously in the pipeline. The code is a bit messy, so it needs to be organised. Still, it&#8217;s segmented into blocks, includes basic automation and tests, and has already gone through some failures along the way.</p><p>Could I do it better? Sure, and I will improve it, but that&#8217;s not the point. I wanted to show you my findings from weekend vibing (without beer, though): the real, unpolished iteration, before I run the next one.</p><p><strong>The main idea behind OODA loops is not to be perfect, but to iterate quickly, gather feedback as soon as possible, learn from it, develop another theory, and verify it through action.</strong></p><p>It&#8217;s not about vibing, but it&#8217;s also not about analysis paralysis.</p><p>I hope you&#8217;re now better equipped to think about when vibing, with beer or without, with LLMs or without, actually helps, and when it doesn&#8217;t.</p><p>Vibe coding is just high-frequency steering. It only works if you have a Harness: a mechanical way to observe and orient, so you don&#8217;t steer the whole project into a wall.</p><p>Act takes seconds now. Observe takes as long as it always did. 
Without a harness, you&#8217;re not going faster; you&#8217;re just making more stuff you haven&#8217;t checked.</p><p>Harness is not magic, a new discipline, or the next buzzword; I hope this article showed you a bit of what it may look like.</p><p><strong>So iterate fast, but wisely, remembering to do the full loop.</strong> It&#8217;s great that LLMs can help us make Acting faster, but we should not skip the other steps. We should aim for a fast feedback loop to iterate in the right direction and achieve continuous improvement, to deliver proper value.</p><p>Just like Vibing isn&#8217;t new, we shouldn&#8217;t abandon <em>&#8220;old&#8221;</em> engineering practices. We should also not replace collaboration with solitary self-high fives.</p><p>See also:</p><ul><li><p><a href="https://github.com/event-driven-io/emmett/pull/335">Emmett Pull Request with mentioned changes</a></p></li><li><p><a href="https://event-driven.io/en/interactive_rubber_ducking_with_gen_ai/">Interactive Rubber Ducking with GenAI</a></p></li><li><p><a href="https://event-driven.io/en/the_end_of_coding_wrong_question/">The End of Coding? Wrong Question</a></p></li><li><p><a href="https://event-driven.io/en/tricks_on_how_to_set_up_related_docker_images/">A few tricks on how to set up related Docker images with docker-compose</a></p></li><li><p><a href="https://event-driven.io/en/docker_compose_profiles/">Docker Compose Profiles, one the most useful and underrated features</a></p></li></ul><p>Cheers!</p><p>Oskar</p><p>p.s. <strong>Ukraine is still under brutal Russian invasion. A lot of Ukrainian people are hurt, without shelter and need help.</strong> You can help in various ways, for instance, directly helping refugees, spreading awareness, putting pressure on your local government or companies. You can also support Ukraine by donating e.g. 
to <a href="https://www.icrc.org/en/donate/ukraine">Red Cross</a>, <a href="https://savelife.in.ua/en/donate/">Ukraine humanitarian organisation</a> or <a href="https://www.gofundme.com/f/help-to-save-the-lives-of-civilians-in-a-war-zone">donate Ambulances for Ukraine</a>.</p>]]></content:encoded></item><item><title><![CDATA[Yoda Principle for better integrations]]></title><description><![CDATA[On why doing is better than trying for api and workflows design]]></description><link>https://www.architecture-weekly.com/p/yoda-principle-for-better-integrations</link><guid isPermaLink="false">https://www.architecture-weekly.com/p/yoda-principle-for-better-integrations</guid><dc:creator><![CDATA[Oskar Dudycz]]></dc:creator><pubDate>Mon, 20 Apr 2026 13:17:33 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!H0d4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F946c3f85-a03a-40e9-90ad-95a44231c9fa_626x391.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!H0d4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F946c3f85-a03a-40e9-90ad-95a44231c9fa_626x391.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!H0d4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F946c3f85-a03a-40e9-90ad-95a44231c9fa_626x391.png 424w, https://substackcdn.com/image/fetch/$s_!H0d4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F946c3f85-a03a-40e9-90ad-95a44231c9fa_626x391.png 848w, 
https://substackcdn.com/image/fetch/$s_!H0d4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F946c3f85-a03a-40e9-90ad-95a44231c9fa_626x391.png 1272w, https://substackcdn.com/image/fetch/$s_!H0d4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F946c3f85-a03a-40e9-90ad-95a44231c9fa_626x391.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!H0d4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F946c3f85-a03a-40e9-90ad-95a44231c9fa_626x391.png" width="626" height="391" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/946c3f85-a03a-40e9-90ad-95a44231c9fa_626x391.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:391,&quot;width&quot;:626,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;cover&quot;,&quot;title&quot;:&quot;cover&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="cover" title="cover" srcset="https://substackcdn.com/image/fetch/$s_!H0d4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F946c3f85-a03a-40e9-90ad-95a44231c9fa_626x391.png 424w, https://substackcdn.com/image/fetch/$s_!H0d4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F946c3f85-a03a-40e9-90ad-95a44231c9fa_626x391.png 848w, 
https://substackcdn.com/image/fetch/$s_!H0d4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F946c3f85-a03a-40e9-90ad-95a44231c9fa_626x391.png 1272w, https://substackcdn.com/image/fetch/$s_!H0d4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F946c3f85-a03a-40e9-90ad-95a44231c9fa_626x391.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><blockquote><p>Try not. Do. Or do not. 
There is no try!</p></blockquote><p>I&#8217;m calling this the Yoda Principle.</p><p><a href="https://www.youtube.com/watch?v=BQ4yd2W50No">Master Yoda said that to Luke Skywalker a long time ago in a galaxy far, far away</a>. He was teaching Luke how to name commands properly while trying to untangle some legacy enterprise mess.</p><p>I&#8217;m sure you&#8217;ve also seen a death star of weirdly-named stuff. Some of it has probably already cost you hours of wondering whether someone just named a thing badly, or whether there&#8217;s some hidden truth behind it.</p><p><strong>Let&#8217;s discuss that using an example: E-Commerce order fulfilment.</strong></p><p>The order is placed automatically once the customer confirms the items in the shopping cart. We&#8217;re not making any product reservations before the shopping cart is confirmed, as this would lock the products for other customers, and, as you know, customers tend to drop items from their carts.</p><p>Once we receive the event notification that the cart has been confirmed, we&#8217;ll start the order fulfilment process. It starts (as mentioned) with the order initiation, which acknowledges and initiates the multi-step fulfilment process.</p><p>The first step is checking product availability before confirming the order. We need to determine whether we can proceed with completing the shipment and initiating product payment. If the product is unavailable, we need to either ask the customer to wait until we have it again or cancel the order.</p><p>We have dedicated modules for order fulfilment and for inventory. Fulfilment is the orchestrator, and inventory is responsible for tracking the state in warehouses.</p><p>The ordering module would need to call the inventory module to verify product availability. We could send a command (through the messaging system or web API). How would we name it?</p><p>What about <em><strong>VerifyProductExists</strong></em>? 
We&#8217;d send the product id and quantity from the order information, and get back true if there are enough products, false otherwise. Sounds fair, right?</p><p>Well, it may seem nice at first glance, but what happens if more than one order verifies the same product availability, and we&#8217;re running short?</p><p>Then we&#8217;re vulnerable to the race conditions I described in <a href="https://event-driven.io/en/tell_dont_ask_how_to_keep_an_eye_on_boiling_milk/">Tell, don&#8217;t ask! Or, how to keep an eye on boiling milk</a>. The information we get is only valid at the time of querying. If we don&#8217;t lock the product quantity, it can change before we get a response (think: Black Friday-like demand).</p><p><strong>Naming our command like </strong><em><strong>VerifyProductExists</strong></em><strong> is a mistake</strong>.</p><p><em>VerifyProductExists</em> is not even a command; it&#8217;s a query. A command is a request (intention) to run business logic. A query is a request to return data.</p><p>Of course, pragmatically, our <a href="https://event-driven.io/en/can_command_return_a_value/">command can return status information about the result of our operation</a>. But the intention is different.</p><p><strong>What&#8217;s our real intention here?</strong></p><p>The real intention is that we&#8217;d like to reserve products so we can ship them and get payment for them, not to verify if they exist. It&#8217;d be better to name our command <em>ReserveProducts</em> or <em>LockProducts</em>.</p><p>Why does it matter?</p><p>If we&#8217;re naming our commands with Verify/Validate/Check prefixes, we&#8217;re putting ourselves into the wrong mindset. We&#8217;re not focused on actions and integrations, but just on brief checks. If we&#8217;re in such a mode, it&#8217;s easy to handwave the integration complexity.</p><p>Locking for shipment may require sending someone to double-check that the product is in the warehouse and hasn&#8217;t been stolen or damaged, or other steps. 
It may be an async process on its own. Still, from the ordering module&#8217;s perspective, we should not care: we state what our intention is and expect to get events informing us whether the reservation succeeded, failed, or timed out (we don&#8217;t want to lock products forever in case of order fulfilment issues, but only for some time).</p><p><strong>Prefixes like Verify/Validate/Check, etc., are just synonyms for trying.</strong> And well, commands are always a form of trying. The handling module can reject the command, as its business rules and state are the source of truth.</p><p>We should always assume that the command processing can fail. We should not be discouraged by that, and we should double-check everything. We should not be intimidated by the potential failure, but prepared for it.</p><p><strong>We should try not. Do or do not. There is no try.</strong></p><p>What if we have both? So <em>VerifyProductExists</em> and <em>LockProducts</em>? It can work if the first one is a query used by the Shopping Cart module, without any guarantee that the data isn&#8217;t stale, on a best-effort basis.</p><p>If we always require calling <em>VerifyProductExists</em> on the handling module before <em>LockProducts</em>, we&#8217;re making our communication chatty. As I described in <a href="https://event-driven.io/en/what_does_mr_bean_opening_the_car_have_to_do_with_programming/">What does Mr Bean opening the car have to do with programming?</a>, this is not only a bad developer experience, but also just redundancy. 
Locking should already verify whether the product exists, so why require someone to memorise those scenarios instead of checking it internally?</p><p>The same goes for cases like:</p><ul><li><p>verify payment has been made,</p></li><li><p>check if the order wasn&#8217;t already fulfilled,</p></li><li><p>validate if the shipment has been completed,</p></li><li><p>etc.</p></li></ul><p>All of them either hide a missing business concept or should be a business rule verified within the specific action (e.g., confirming an order).</p><p>I recognise this may seem nitpicky, but big things are built on small details.</p><p>If we don&#8217;t think about such things, we&#8217;ll not only end up with misnamed integrations but also fight <a href="https://event-driven.io/en/dealing_with_race_conditions_in_eda_using_read_models/">race conditions</a> and incorrect boundaries.</p><p>Then <a href="https://www.youtube.com/watch?v=cTwZZz0HV8I">we&#8217;ll be doomed</a>.</p><p>So better think twice and do or do not.</p><p>May the force be with you!</p><p>Oskar</p><p>p.s. <strong>Ukraine is still under brutal Russian invasion. A lot of Ukrainian people are hurt, without shelter and need help.</strong> You can help in various ways, for instance, directly helping refugees, spreading awareness, putting pressure on your local government or companies. You can also support Ukraine by donating e.g. 
to <a href="https://www.icrc.org/en/donate/ukraine">Red Cross</a>, <a href="https://savelife.in.ua/en/donate/">Ukraine humanitarian organisation</a> or <a href="https://www.gofundme.com/f/help-to-save-the-lives-of-civilians-in-a-war-zone">donate Ambulances for Ukraine</a>.</p>]]></content:encoded></item><item><title><![CDATA[Anti-patterns in event modelling - Passive-Aggressive Events]]></title><description><![CDATA[On why event-driven communication should not be only about events.]]></description><link>https://www.architecture-weekly.com/p/passive-aggresive-event</link><guid isPermaLink="false">https://www.architecture-weekly.com/p/passive-aggresive-event</guid><dc:creator><![CDATA[Oskar Dudycz]]></dc:creator><pubDate>Mon, 13 Apr 2026 11:00:45 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!UjMX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19ccb285-3907-4ebb-86fe-663cb917f31f_1408x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UjMX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19ccb285-3907-4ebb-86fe-663cb917f31f_1408x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UjMX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19ccb285-3907-4ebb-86fe-663cb917f31f_1408x768.png 424w, https://substackcdn.com/image/fetch/$s_!UjMX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19ccb285-3907-4ebb-86fe-663cb917f31f_1408x768.png 848w, 
https://substackcdn.com/image/fetch/$s_!UjMX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19ccb285-3907-4ebb-86fe-663cb917f31f_1408x768.png 1272w, https://substackcdn.com/image/fetch/$s_!UjMX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19ccb285-3907-4ebb-86fe-663cb917f31f_1408x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UjMX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19ccb285-3907-4ebb-86fe-663cb917f31f_1408x768.png" width="1408" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/19ccb285-3907-4ebb-86fe-663cb917f31f_1408x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1853124,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.architecture-weekly.com/i/194057953?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19ccb285-3907-4ebb-86fe-663cb917f31f_1408x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UjMX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19ccb285-3907-4ebb-86fe-663cb917f31f_1408x768.png 424w, 
https://substackcdn.com/image/fetch/$s_!UjMX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19ccb285-3907-4ebb-86fe-663cb917f31f_1408x768.png 848w, https://substackcdn.com/image/fetch/$s_!UjMX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19ccb285-3907-4ebb-86fe-663cb917f31f_1408x768.png 1272w, https://substackcdn.com/image/fetch/$s_!UjMX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F19ccb285-3907-4ebb-86fe-663cb917f31f_1408x768.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Have you ever heard phrases like:</p><blockquote><p>Just an update, the milk ran out. Someone finished it and put the empty carton back.</p></blockquote><p>Or</p><blockquote><p>So everyone is aware, the meeting started 15 minutes ago.</p></blockquote><p>Or</p><blockquote><p>Heads up: the coffee machine is empty again.</p></blockquote><p>I&#8217;m sure you&#8217;ve either heard or used such phrases.</p><p>We all know that there&#8217;s some hidden intention behind them.</p><p>The intention is not to inform, but to trigger a certain action.</p><p>Formally, we&#8217;re reporting on events to announce the facts, but in practice, we&#8217;re using passive-aggressive words. The real intention is to command someone.</p><p>We don&#8217;t want to inform that the trash bin is full; we want someone to take it out. We don&#8217;t want to inform that the coffee machine needs a coffee bean refill; we want someone to do it.</p><p>A passive-aggressive tone is the worst. It&#8217;s toxic for both sides of the communication. Usually, it&#8217;s just better to ask someone to do it.</p><p>The same rule applies in event-driven modelling. We should avoid passive-aggressive communication at all costs.</p><p><strong>We should watch out for Passive-Aggressive Events: events that should be commands.</strong></p><p>I already warned you in the past <a href="https://event-driven.io/pl/dont_let_event_driven_architecture_buzzwords_fool_you/">not to let Event-Driven Architecture buzzwords fool us</a>.</p><p>Event-Driven Architecture is an integration architecture style. We&#8217;re trying to model our business processes to run smoothly. To achieve that, we prefer a non-blocking communication flow, with things happening in parallel at their own pace. The goal is to achieve autonomous components, reducing the time needed to understand them. 
That helps <a href="https://event-driven.io/pl/removability_over_maintainability/">maintain, or even replace them</a> as your business evolves.</p><p>And events are enablers for that. They notify of what has happened, allowing other components to interpret facts and take the next steps.</p><p>But&#8230; Let me show one more photo.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sdC9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66ed3c6a-e9ab-42af-a170-8d452d66db09_800x440.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sdC9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66ed3c6a-e9ab-42af-a170-8d452d66db09_800x440.jpeg 424w, https://substackcdn.com/image/fetch/$s_!sdC9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66ed3c6a-e9ab-42af-a170-8d452d66db09_800x440.jpeg 848w, https://substackcdn.com/image/fetch/$s_!sdC9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66ed3c6a-e9ab-42af-a170-8d452d66db09_800x440.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!sdC9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66ed3c6a-e9ab-42af-a170-8d452d66db09_800x440.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!sdC9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66ed3c6a-e9ab-42af-a170-8d452d66db09_800x440.jpeg" width="800" height="440" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/66ed3c6a-e9ab-42af-a170-8d452d66db09_800x440.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:440,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;parliament&quot;,&quot;title&quot;:&quot;parliament&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="parliament" title="parliament" srcset="https://substackcdn.com/image/fetch/$s_!sdC9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66ed3c6a-e9ab-42af-a170-8d452d66db09_800x440.jpeg 424w, https://substackcdn.com/image/fetch/$s_!sdC9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66ed3c6a-e9ab-42af-a170-8d452d66db09_800x440.jpeg 848w, https://substackcdn.com/image/fetch/$s_!sdC9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66ed3c6a-e9ab-42af-a170-8d452d66db09_800x440.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!sdC9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66ed3c6a-e9ab-42af-a170-8d452d66db09_800x440.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>It&#8217;s parliament, per the official definition: a room full of angry, shouting people.</p><p><strong>If we model our communication only in terms of events, our system will look just like that.</strong> We&#8217;d just announce new facts in a passive-aggressive style and not be interested in what happens next. Oh, wait, are we really not interested? Actually, we are. If someone doesn&#8217;t do what we expect with our information, we&#8217;ll be even angrier.</p><p><a href="https://event-driven.io/pl/whats_the_difference_between_event_and_command/">What&#8217;s the difference between a command and an event?</a> Both are messages. They convey specific information: a command indicates an intent to do something; an event describes what has happened. From the computer&#8217;s point of view, they are no different. 
Only the business logic and the interpretation of the message can distinguish between an event and a command.</p><p>And that&#8217;s the main difference: commands can be rejected by the command handler. Events can only be ignored.</p><p>If we publish an event, we expect one or more consumers to be interested in it. Yet, we don&#8217;t know which components will consume it. We just broadcast information.</p><p>This can easily turn passive-aggressive:</p><blockquote><p>I did my work, now it&#8217;s your turn.</p></blockquote><p>And here&#8217;s the crucial part. If we always have a single consumer for an event that needs to run the specific logic and expect to get the particular event back, then it should be a command. It&#8217;s not an event; we&#8217;re not informing. We want some component to take the next specific step and let us know when it&#8217;s finished.</p><p>Aren&#8217;t we making our communication synchronous?</p><p>What does it even mean, synchronous or asynchronous?</p><p>That&#8217;s what <a href="https://www.youtube.com/watch?v=2LMEJ-WGFTk">Sam Newman discussed in his great talk</a>. The main conclusion is that the synchronous vs asynchronous discussion is actually about blocking or non-blocking processing. And that&#8217;s a much broader topic than the technical solution (so whether we call something in-process, via an HTTP endpoint, or through a messaging system).</p><p>It&#8217;s a common misconception that events are published asynchronously through a messaging system (e.g. Kafka, RabbitMQ, SQS, WhateverQueue) and commands are sent through synchronous WebAPI. That can be true for a specific solution, but not as the general rule. As said, both events and commands are messages; we can send them through a messaging system or via HTTP (e.g. events via webhooks).</p><p>This misleading split came from our expectations about handling. We expect the command handler to give us the result. For an event handler, we don&#8217;t expect a specific result. 
At least in theory.</p><p><strong>If we publish a specific event to the messaging system and expect a specific critical path of follow-up events, then we&#8217;re not making our communication non-blocking. It&#8217;s still sequential.</strong> We cannot proceed until the expected sequence occurs.</p><p>Whether something is blocking or not is not established by the tools we use, but by how our business process looks.</p><p>Speaking of which.</p><p>Let&#8217;s get back to our favourite E-Commerce Order scenario (read more in <a href="https://www.architecture-weekly.com/p/predictable-identifiers-enabling">Predictable Identifiers: Enabling True Module Autonomy in Distributed Systems</a>).</p><p>We could model it so we just publish the <em>OrderConfirmed</em> event and passively-aggressively expect that others will take it from there. So:</p><ul><li><p>The payment module will initiate the payment.</p></li><li><p>Inventory will start completing shipments.</p></li><li><p>The notification module will send a confirmation e-mail.</p></li><li><p>The fraud detection module will check if the order is not rigged.</p></li></ul><p>Once we receive information about a successful shipment and payment registration, we can complete the order.</p><p><strong>You may notice two paths for order processing:</strong></p><ol><li><p><strong>Blocking</strong> - We need to wait for information about payments and shipments. This is our critical path.</p></li><li><p><strong>Non-blocking</strong> - The order process shouldn&#8217;t stop if the notification wasn&#8217;t sent or the data warehouse wasn&#8217;t able to process events. 
We&#8217;d like that to happen, but it&#8217;s expected rather than critical.</p></li></ol><p>Now, both steps on the critical path can go wrong: the payment may fail (if our customer doesn&#8217;t have enough money), and the shipment may not be completed (if it&#8217;s Black Friday, and multiple people are competing for the same product).</p><p>If that happens, the ordering module needs to take action, for instance, issue a refund if the shipment wasn&#8217;t completed, and eventually cancel the order.</p><p>If we don&#8217;t foresee that and stay in passive-aggressive mode, we tend to forget about <em>&#8220;negative&#8221;</em> scenarios. It&#8217;s too easy to stay in I-Already-Did-My-Job mode. This will have severe consequences: blocked orders, missed communication, and dissatisfied customers.</p><p>We may learn too late that another module can actually say no:</p><ol><li><p>Payment module can say: <em>Man, that&#8217;s not going to happen, you&#8217;ve already run out of money</em>.</p></li><li><p>Shipment module can say: <em>Man, I&#8217;m sorry, but you weren&#8217;t fast enough and we&#8217;ve run out of product</em>.</p></li></ol><p>And both of those scenarios will block successful order completion.</p><p>How to find such cases? <a href="https://event-driven.io/pl/intro_to_example_mapping/">Doing Example Mapping during modelling can be a good option for that</a>.</p><p>Most importantly, we need to embrace the fact that some processes require direct, blocking communication, and others don&#8217;t. Just like in real life, sometimes it&#8217;s just more effective to tell someone to do something. 
We should avoid micromanagement and aim for autonomy, but not end up with anarchy.</p><p>In our case, it&#8217;d be better to have a coordinator (<a href="https://event-driven.io/pl/how_to_have_fun_with_typescript_and_workflow/">workflow</a>, <a href="https://event-driven.io/pl/saga_process_manager_distributed_transactions/">saga, process manager</a>, <a href="https://event-driven.io/pl/to_do_list_and_passage_of_time_patterns_combined/">To-Do List</a> etc.) that publishes the <em>OrderConfirmed</em> event for modules not on the critical path and sends commands like <em>RecordPayment</em> and <em>InitiateShipment</em> to the modules that are on it.</p><p>That way, we&#8217;re separating responsibilities and making explicit what should be explicit. This also helps in understanding the business process, as you have a central place to see the critical flow and get proper observability.</p><p>Lack of tracing and observability of the business process is one of the most common issues <a href="https://event-driven.io/pl/training/">I see in my clients&#8217; projects</a>. As said, if we don&#8217;t want to end up with a parliament instead of proper communication in our system, we need to be explicit about our intentions.</p><p>Is that all? Not quite. There&#8217;s one more message type that we tend to model as an event even though it shouldn&#8217;t be one.</p><p><strong>Gregor Hohpe, in <a href="https://www.enterpriseintegrationpatterns.com/patterns/messaging/Message.html">&#8220;Enterprise Integration Patterns&#8221;</a>, besides <a href="https://www.enterpriseintegrationpatterns.com/patterns/messaging/EventMessage.html">Event</a> and <a href="https://www.enterpriseintegrationpatterns.com/patterns/messaging/CommandMessage.html">Command</a>, defines one more message type: <a href="https://www.enterpriseintegrationpatterns.com/patterns/messaging/DocumentMessage.html">Document</a>.</strong></p><p>What&#8217;s a Document? It&#8217;s state. Or, to be precise: self-contained data we have at a certain point in time. 
We can store it, but we can also publish information about its new value.</p><p>That&#8217;s probably why Martin Fowler frames it as <a href="https://martinfowler.com/articles/201701-event-driven.html">Event-Carried State Transfer</a>, and I don&#8217;t like that term. For me, it&#8217;s extremely misleading, as it doesn&#8217;t tell us what has happened, but what has changed. It just carries the new version of the state (or the diff).</p><p>In my opinion, it&#8217;s a variation of the <a href="https://event-driven.io/pl/state-obsession/">State Obsession anti-pattern</a>. Many people fall into that trap and believe it&#8217;s fine to connect the messaging system to the database, use tools like <a href="https://en.wikipedia.org/wiki/Change_data_capture">Change Data Capture</a>, and publish the changes automatically to others. They end up with a passive-aggressive communication style:</p><blockquote><p>You have all you need. The whole state is in the <em>events</em>, just interpret it.</p></blockquote><p>How can you reason about what has happened if, instead of <em>OrderConfirmed</em>, you get <em>OrderCreated</em>, <em>OrderUpdated</em>, <em>OrderDeleted</em>? You&#8217;d need to do the diff, compare it with previous values, and guess the reason for the specific change.</p><p>You deal with <a href="https://event-driven.io/pl/clickbait_event/">Clickbait Events</a> and have a leaky business abstraction. All consumers need to understand the internals of your processing to detect a specific type of change. I wrote about it in detail in <a href="https://event-driven.io/pl/internal_external_events/">Internal and external events, or how to design event-driven API</a>.</p><p>Again, the loose coupling of event-driven processing is only loose for producers; consumers need to adapt. This can lead to hidden coupling, where a change in the producer breaks consumer flows. 
And that&#8217;s the worst type of coupling you can get.</p><p><strong>If we&#8217;re making commands explicit, we&#8217;re also making the relationship between components explicit.</strong> It&#8217;s no longer flattened to producer &lt;=&gt; consumer, where the producer always shapes the communication. Now, if the other component exposes a command, that&#8217;s the driving force behind its behaviour. This helps to shape autonomy. In our case, we could make the Payment Module a generic module with a stable public API for registering payments, with the ordering module requesting them, and do the same for the Shipment Module. Fraud Detection could continue subscribing to events, as it already does. <a href="https://github.com/ddd-crew/context-mapping">Context Mapping</a> can greatly help in finding those relationships.</p><h2>TLDR</h2><p>We tend to be all about events these days, but they&#8217;re not the only message type. In our systems, messages take various forms: Events, Commands, and Documents, each serving a distinct purpose:</p><ul><li><p><strong>Documents are all about state transitions</strong>, which are essential for syncing data across services but lack deeper business insight.</p></li><li><p><strong>Commands represent a clear intent to act</strong>, directed with an expectation of execution, and can be accepted or rejected.</p></li><li><p><strong>Events are immutable facts</strong>, announced without waiting for a response. 
They&#8217;re like broadcasting news, hoping it catches the right ears.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kihU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F052a764b-cf75-4728-a914-9512c78242e0_800x452.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kihU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F052a764b-cf75-4728-a914-9512c78242e0_800x452.png 424w, https://substackcdn.com/image/fetch/$s_!kihU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F052a764b-cf75-4728-a914-9512c78242e0_800x452.png 848w, https://substackcdn.com/image/fetch/$s_!kihU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F052a764b-cf75-4728-a914-9512c78242e0_800x452.png 1272w, https://substackcdn.com/image/fetch/$s_!kihU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F052a764b-cf75-4728-a914-9512c78242e0_800x452.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kihU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F052a764b-cf75-4728-a914-9512c78242e0_800x452.png" width="800" height="452" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/052a764b-cf75-4728-a914-9512c78242e0_800x452.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:452,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;message types&quot;,&quot;title&quot;:&quot;message types&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="message types" title="message types" srcset="https://substackcdn.com/image/fetch/$s_!kihU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F052a764b-cf75-4728-a914-9512c78242e0_800x452.png 424w, https://substackcdn.com/image/fetch/$s_!kihU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F052a764b-cf75-4728-a914-9512c78242e0_800x452.png 848w, https://substackcdn.com/image/fetch/$s_!kihU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F052a764b-cf75-4728-a914-9512c78242e0_800x452.png 1272w, https://substackcdn.com/image/fetch/$s_!kihU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F052a764b-cf75-4728-a914-9512c78242e0_800x452.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p>Event-Driven Architectures enable loose coupling, but only for producers. To make consumers loosely coupled, we need to take extra steps, embrace different message types, and have them participate in modelling business processes.</p><p>If we go too far with an event-all-the-things communication style, we&#8217;ll make our system a room full of shouting people, with a passive-aggressive communication style. Or just aggressive.</p><p>In consequence, we won&#8217;t know what&#8217;s happening in our system, will see only noise, and will have a hard time making it reliable, observable and predictable. We should treat our messages as a communication contract, API and model their flow in a way that shapes our regular communication.</p><p><strong>So next time, ask yourself if your event shouldn&#8217;t be a command. 
If it has always had a single consumer and you expect a specific event back, then it probably should.</strong> It&#8217;s all about being clear about the intention, not lying to yourself and others.</p><p>I hope this article will equip you with the knowledge to fix that.</p><p><strong>If you&#8217;re dealing with such issues, I&#8217;m happy to help you through consulting, <a href="https://event-driven.io/en/training">training</a> or mentoring. <a href="mailto:oskar@event-driven.io">Contact me</a> and we&#8217;ll find a way to unblock you!</strong></p><p><strong>See more in the series about <a href="https://event-driven.io/en/anti-patterns/">event modelling anti-patterns</a>:</strong></p><ul><li><p><a href="https://event-driven.io/en/state-obsession/">State Obsession</a>,</p></li><li><p><a href="https://event-driven.io/en/property-sourcing/">Property Sourcing</a>,</p></li><li><p><a href="https://event-driven.io/en/i_will_just_add_one_more_field/">I&#8217;ll just add one more field</a>,</p></li><li><p><a href="https://event-driven.io/en/clickbait_event/">Clickbait event</a>,</p></li><li><p><a href="https://event-driven.io/en/passive_aggressive_events">Passive Aggressive Events</a>,</p></li><li><p><a href="https://event-driven.io/en/one_or_more_event_that_is_the_question/">Should you record multiple events from business logic?</a>,</p></li><li><p><a href="https://event-driven.io/en/on_putting_stream_id_in_event_data/">Stream ids, event types prefixes and other event data you might not want to slice off</a>.</p></li></ul><p><strong>Also check out more general considerations:</strong></p><ul><li><p><a href="https://event-driven.io/en/events_should_be_as_small_as_possible/">Events should be as small as possible, right?</a>,</p></li><li><p><a href="https://event-driven.io/en/whats_the_difference_between_event_and_command/">What&#8217;s the difference between a command and an event?</a>,</p></li><li><p><a 
href="https://event-driven.io/en/internal_external_events/">Internal and external events, or how to design event-driven API</a>,</p></li><li><p><a href="https://event-driven.io/en/event_streaming_is_not_event_sourcing/">Event Streaming is not Event Sourcing!</a>,</p></li><li><p><a href="https://event-driven.io/en/dont_let_event_driven_architecture_buzzwords_fool_you/">Don&#8217;t let Event-Driven Architecture buzzwords fool you</a>,</p></li><li><p><a href="https://event-driven.io/en/how_to_design_software_architecture_pragmatically/">How to design software architecture pragmatically</a>,</p></li><li><p><a href="https://event-driven.io/en/gdpr_in_event_driven_architecture/">How to deal with privacy and GDPR in Event-Driven systems</a>.</p></li></ul><p>Cheers!</p><p>Oskar</p><p>p.s. <strong>Ukraine is still under brutal Russian invasion. A lot of Ukrainian people are hurt, without shelter and need help.</strong> You can help in various ways, for instance, directly helping refugees, spreading awareness, putting pressure on your local government or companies. You can also support Ukraine by donating e.g. 
to <a href="https://www.icrc.org/en/donate/ukraine">Red Cross</a>, <a href="https://savelife.in.ua/en/donate/">Ukraine humanitarian organisation</a> or <a href="https://www.gofundme.com/f/help-to-save-the-lives-of-civilians-in-a-war-zone">donate Ambulances for Ukraine</a>.</p>]]></content:encoded></item><item><title><![CDATA[The one where Oskar explains Example Mapping]]></title><description><![CDATA[Learn one of the simplest and most actionable modelling techniques]]></description><link>https://www.architecture-weekly.com/p/the-one-where-oskar-explains-example</link><guid isPermaLink="false">https://www.architecture-weekly.com/p/the-one-where-oskar-explains-example</guid><dc:creator><![CDATA[Oskar Dudycz]]></dc:creator><pubDate>Mon, 30 Mar 2026 16:02:25 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!wiRE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd37784ae-b61e-458f-bb5d-62b57dfdfc59_800x671.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wiRE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd37784ae-b61e-458f-bb5d-62b57dfdfc59_800x671.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wiRE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd37784ae-b61e-458f-bb5d-62b57dfdfc59_800x671.png 424w, https://substackcdn.com/image/fetch/$s_!wiRE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd37784ae-b61e-458f-bb5d-62b57dfdfc59_800x671.png 848w, 
https://substackcdn.com/image/fetch/$s_!wiRE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd37784ae-b61e-458f-bb5d-62b57dfdfc59_800x671.png 1272w, https://substackcdn.com/image/fetch/$s_!wiRE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd37784ae-b61e-458f-bb5d-62b57dfdfc59_800x671.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wiRE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd37784ae-b61e-458f-bb5d-62b57dfdfc59_800x671.png" width="800" height="671" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d37784ae-b61e-458f-bb5d-62b57dfdfc59_800x671.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:671,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;cover&quot;,&quot;title&quot;:&quot;cover&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="cover" title="cover" srcset="https://substackcdn.com/image/fetch/$s_!wiRE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd37784ae-b61e-458f-bb5d-62b57dfdfc59_800x671.png 424w, https://substackcdn.com/image/fetch/$s_!wiRE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd37784ae-b61e-458f-bb5d-62b57dfdfc59_800x671.png 848w, 
https://substackcdn.com/image/fetch/$s_!wiRE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd37784ae-b61e-458f-bb5d-62b57dfdfc59_800x671.png 1272w, https://substackcdn.com/image/fetch/$s_!wiRE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd37784ae-b61e-458f-bb5d-62b57dfdfc59_800x671.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>One of the first indications of getting old(er) is when people stop getting your movie or music references. 
Of course, based on this rule, some people are always old. That happens.</p><p>Recently, I realised during <a href="https://event-driven.io/en/training/">my workshops</a> that referencing Friends is not so cool anymore. It started to happen when I was explaining the Example Mapping technique.</p><blockquote><p>It always starts with &#8220;The one where&#8221;.</p><p>Just like in Friends.</p></blockquote><p>I started to notice a bit slower head nodding and a bit more awkward smiles from the attendees. I repeated:</p><blockquote><p>You know, like in the titles of Friends episodes.</p></blockquote><p>And the head-nodding stopped; only totally awkwardly polite smiles remained. Definitely, it wasn&#8217;t The One Where Everybody Finds Out. So I finally asked:</p><blockquote><p>You haven&#8217;t watched Friends, have you?</p></blockquote><p>Obviously, the answer was:</p><blockquote><p>Emmm. No&#8230;</p></blockquote><p><strong>Okay, then, if you don&#8217;t know Friends or the Example Mapping technique, this will be The One Where You Find Out.</strong></p><p>Let&#8217;s say that we&#8217;re working on the guest checkout feature for a hotel management system.</p><p>We could start by asking the business how it works. We could get an answer like:</p><blockquote><p>The guest approaches the desk and requests checkout. The clerk inquires about the quality of the products and services, and after receiving an answer, requests the room key. After gathering the key, the clerk checks whether the balance is settled. If it&#8217;s settled, the clerk proceeds with the checkout, marking the stay as completed.</p></blockquote><p>Sounds straightforward, but we should already have several questions popping up, e.g. what does it mean that &#8220;balance is settled&#8221;? 
We could get quick feedback that:</p><blockquote><p>This means that the difference between the sums of all charges and payments is equal to zero.</p></blockquote><p>Then we could try to come up with an example:</p><blockquote><p>Ah, so for instance, when guests haven&#8217;t paid upfront for their stay, right?</p></blockquote><p>Right.</p><blockquote><p>Oh, then we need to charge them, right?</p></blockquote><p>Right.</p><p>We could visualise what we discovered in the following way:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9AoP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc990952-1b69-4e5c-a86e-ca2e6f47fe2a_800x871.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9AoP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc990952-1b69-4e5c-a86e-ca2e6f47fe2a_800x871.jpeg 424w, https://substackcdn.com/image/fetch/$s_!9AoP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc990952-1b69-4e5c-a86e-ca2e6f47fe2a_800x871.jpeg 848w, https://substackcdn.com/image/fetch/$s_!9AoP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc990952-1b69-4e5c-a86e-ca2e6f47fe2a_800x871.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!9AoP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc990952-1b69-4e5c-a86e-ca2e6f47fe2a_800x871.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!9AoP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc990952-1b69-4e5c-a86e-ca2e6f47fe2a_800x871.jpeg" width="800" height="871" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dc990952-1b69-4e5c-a86e-ca2e6f47fe2a_800x871.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:871,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;em01&quot;,&quot;title&quot;:&quot;em01&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="em01" title="em01" srcset="https://substackcdn.com/image/fetch/$s_!9AoP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc990952-1b69-4e5c-a86e-ca2e6f47fe2a_800x871.jpeg 424w, https://substackcdn.com/image/fetch/$s_!9AoP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc990952-1b69-4e5c-a86e-ca2e6f47fe2a_800x871.jpeg 848w, https://substackcdn.com/image/fetch/$s_!9AoP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc990952-1b69-4e5c-a86e-ca2e6f47fe2a_800x871.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!9AoP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc990952-1b69-4e5c-a86e-ca2e6f47fe2a_800x871.jpeg 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p>Now, this generated another flow for us. We have a new feature we weren&#8217;t aware of: the guest&#8217;s stay payment registration. 
Let&#8217;s try to start this time from the visualisation.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gG9G!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81974702-c0c2-4703-b415-a417e97294a8_800x727.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gG9G!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81974702-c0c2-4703-b415-a417e97294a8_800x727.jpeg 424w, https://substackcdn.com/image/fetch/$s_!gG9G!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81974702-c0c2-4703-b415-a417e97294a8_800x727.jpeg 848w, https://substackcdn.com/image/fetch/$s_!gG9G!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81974702-c0c2-4703-b415-a417e97294a8_800x727.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!gG9G!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81974702-c0c2-4703-b415-a417e97294a8_800x727.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gG9G!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81974702-c0c2-4703-b415-a417e97294a8_800x727.jpeg" width="800" height="727" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/81974702-c0c2-4703-b415-a417e97294a8_800x727.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:727,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;em02&quot;,&quot;title&quot;:&quot;em02&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="em02" title="em02" srcset="https://substackcdn.com/image/fetch/$s_!gG9G!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81974702-c0c2-4703-b415-a417e97294a8_800x727.jpeg 424w, https://substackcdn.com/image/fetch/$s_!gG9G!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81974702-c0c2-4703-b415-a417e97294a8_800x727.jpeg 848w, https://substackcdn.com/image/fetch/$s_!gG9G!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81974702-c0c2-4703-b415-a417e97294a8_800x727.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!gG9G!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81974702-c0c2-4703-b415-a417e97294a8_800x727.jpeg 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p>It&#8217;s the one where Oskar pays for his stay, because he wants to check out but didn&#8217;t pay upfront. The payment is registered, and we can try checking out again. Sounds fine, but we should ask whether there are any rules for payments. 
It may turn out that:</p><blockquote><p>Yes, there are some, for instance:</p><ul><li><p>Only guests with a valid credit card can pay with it for their stay,</p></li><li><p>Guests paying in cash need to hand it over before the payment is accepted.</p></li></ul></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!V8g4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcb9e395-718e-4076-b22e-bbfc273b8885_800x1024.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!V8g4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcb9e395-718e-4076-b22e-bbfc273b8885_800x1024.jpeg 424w, https://substackcdn.com/image/fetch/$s_!V8g4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcb9e395-718e-4076-b22e-bbfc273b8885_800x1024.jpeg 848w, https://substackcdn.com/image/fetch/$s_!V8g4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcb9e395-718e-4076-b22e-bbfc273b8885_800x1024.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!V8g4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcb9e395-718e-4076-b22e-bbfc273b8885_800x1024.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!V8g4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcb9e395-718e-4076-b22e-bbfc273b8885_800x1024.jpeg" width="800" height="1024" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fcb9e395-718e-4076-b22e-bbfc273b8885_800x1024.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;em03&quot;,&quot;title&quot;:&quot;em03&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="em03" title="em03" srcset="https://substackcdn.com/image/fetch/$s_!V8g4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcb9e395-718e-4076-b22e-bbfc273b8885_800x1024.jpeg 424w, https://substackcdn.com/image/fetch/$s_!V8g4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcb9e395-718e-4076-b22e-bbfc273b8885_800x1024.jpeg 848w, https://substackcdn.com/image/fetch/$s_!V8g4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcb9e395-718e-4076-b22e-bbfc273b8885_800x1024.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!V8g4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcb9e395-718e-4076-b22e-bbfc273b8885_800x1024.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>And hey, we just discovered new business rules! Let&#8217;s put them on the board and make our flow more precise by adding a note that this scenario represents a guest paying with a credit card.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YAJv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7b76a54-8f85-45ec-b5cf-979e9f003cb1_800x1070.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YAJv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7b76a54-8f85-45ec-b5cf-979e9f003cb1_800x1070.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!YAJv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7b76a54-8f85-45ec-b5cf-979e9f003cb1_800x1070.jpeg 848w, https://substackcdn.com/image/fetch/$s_!YAJv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7b76a54-8f85-45ec-b5cf-979e9f003cb1_800x1070.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!YAJv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7b76a54-8f85-45ec-b5cf-979e9f003cb1_800x1070.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YAJv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7b76a54-8f85-45ec-b5cf-979e9f003cb1_800x1070.jpeg" width="800" height="1070" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c7b76a54-8f85-45ec-b5cf-979e9f003cb1_800x1070.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1070,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;em05&quot;,&quot;title&quot;:&quot;em05&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="em05" title="em05" srcset="https://substackcdn.com/image/fetch/$s_!YAJv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7b76a54-8f85-45ec-b5cf-979e9f003cb1_800x1070.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!YAJv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7b76a54-8f85-45ec-b5cf-979e9f003cb1_800x1070.jpeg 848w, https://substackcdn.com/image/fetch/$s_!YAJv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7b76a54-8f85-45ec-b5cf-979e9f003cb1_800x1070.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!YAJv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc7b76a54-8f85-45ec-b5cf-979e9f003cb1_800x1070.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Now, what if the payment fails? Can it fail? Let&#8217;s ask the business!</p><blockquote><p>Payment may fail if the payment gateway is unavailable or the issuer rejects it.</p></blockquote><p>What if our internet connection is down for the whole day?</p><blockquote><p>The clerk should ask the guest for cash.</p></blockquote><p>What if the guest doesn&#8217;t have cash? &#129300;</p><blockquote><p>Then the shift manager can authorise unsettled balance checkout and register a charge with a delayed due date.</p></blockquote><p>Here&#8217;s the updated flow. The one where Oskar pays for his entire stay with a credit card, but the Internet is down, and he doesn&#8217;t have cash.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fIY9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b8acd24-7828-40d9-9d08-aa9d1f2622e2_800x1059.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fIY9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b8acd24-7828-40d9-9d08-aa9d1f2622e2_800x1059.jpeg 424w, https://substackcdn.com/image/fetch/$s_!fIY9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b8acd24-7828-40d9-9d08-aa9d1f2622e2_800x1059.jpeg 848w, https://substackcdn.com/image/fetch/$s_!fIY9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b8acd24-7828-40d9-9d08-aa9d1f2622e2_800x1059.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!fIY9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b8acd24-7828-40d9-9d08-aa9d1f2622e2_800x1059.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!fIY9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b8acd24-7828-40d9-9d08-aa9d1f2622e2_800x1059.jpeg" width="800" height="1059" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2b8acd24-7828-40d9-9d08-aa9d1f2622e2_800x1059.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1059,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;em04&quot;,&quot;title&quot;:&quot;em04&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="em04" title="em04" srcset="https://substackcdn.com/image/fetch/$s_!fIY9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b8acd24-7828-40d9-9d08-aa9d1f2622e2_800x1059.jpeg 424w, https://substackcdn.com/image/fetch/$s_!fIY9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b8acd24-7828-40d9-9d08-aa9d1f2622e2_800x1059.jpeg 848w, https://substackcdn.com/image/fetch/$s_!fIY9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b8acd24-7828-40d9-9d08-aa9d1f2622e2_800x1059.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!fIY9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b8acd24-7828-40d9-9d08-aa9d1f2622e2_800x1059.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Now, we found out:</p><ul><li><p><strong>A new outcome</strong>, failed payment,</p></li><li><p><strong>A new rule</strong>, that we need Internet access to authorise credit card payment,</p></li><li><p><strong>A new feature</strong>, the shift manager can authorise unsettled balance checkout and register a charge with a delayed due date.</p></li></ul><p>How 
would the authorisation look? How should we register a delayed charge?</p><blockquote><p>We should follow double-entry bookkeeping and register the authorisation (type of payment) with the unbalanced amount to settle the balance for today, and register an additional delayed charge for the same amount</p></blockquote><p>The flow will look like:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Iilt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb80f7897-1f9c-4f38-9148-dce8270beb51_800x1065.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Iilt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb80f7897-1f9c-4f38-9148-dce8270beb51_800x1065.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Iilt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb80f7897-1f9c-4f38-9148-dce8270beb51_800x1065.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Iilt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb80f7897-1f9c-4f38-9148-dce8270beb51_800x1065.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Iilt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb80f7897-1f9c-4f38-9148-dce8270beb51_800x1065.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Iilt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb80f7897-1f9c-4f38-9148-dce8270beb51_800x1065.jpeg" width="800" height="1065" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b80f7897-1f9c-4f38-9148-dce8270beb51_800x1065.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1065,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;em06&quot;,&quot;title&quot;:&quot;em06&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="em06" title="em06" srcset="https://substackcdn.com/image/fetch/$s_!Iilt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb80f7897-1f9c-4f38-9148-dce8270beb51_800x1065.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Iilt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb80f7897-1f9c-4f38-9148-dce8270beb51_800x1065.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Iilt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb80f7897-1f9c-4f38-9148-dce8270beb51_800x1065.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Iilt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb80f7897-1f9c-4f38-9148-dce8270beb51_800x1065.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>And that&#8217;s precisely what an Example Mapping session looks like. It&#8217;s a structured conversation format created by <a href="https://mattwynne.net/about">Matt Wynne</a>. 
You take a user story, gather a small group (usually a developer, tester, and someone from the business side), and spend around 25-30 minutes breaking it down together.</p><p>You don&#8217;t need a big setup or a huge ceremony. You don&#8217;t even need sticky notes; you can just use plain text like:</p><p><strong>Given:</strong> Example</p><p><strong>When:</strong> We use specific features</p><p><strong>Then:</strong> Based on business rules, we get a specific outcome.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!brqo!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc56c7142-5150-4ef8-9a36-46a0b5fd9854_800x576.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!brqo!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc56c7142-5150-4ef8-9a36-46a0b5fd9854_800x576.jpeg 424w, https://substackcdn.com/image/fetch/$s_!brqo!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc56c7142-5150-4ef8-9a36-46a0b5fd9854_800x576.jpeg 848w, https://substackcdn.com/image/fetch/$s_!brqo!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc56c7142-5150-4ef8-9a36-46a0b5fd9854_800x576.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!brqo!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc56c7142-5150-4ef8-9a36-46a0b5fd9854_800x576.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!brqo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc56c7142-5150-4ef8-9a36-46a0b5fd9854_800x576.jpeg" width="800" height="576" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c56c7142-5150-4ef8-9a36-46a0b5fd9854_800x576.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:576,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;em07&quot;,&quot;title&quot;:&quot;em07&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="em07" title="em07" srcset="https://substackcdn.com/image/fetch/$s_!brqo!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc56c7142-5150-4ef8-9a36-46a0b5fd9854_800x576.jpeg 424w, https://substackcdn.com/image/fetch/$s_!brqo!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc56c7142-5150-4ef8-9a36-46a0b5fd9854_800x576.jpeg 848w, https://substackcdn.com/image/fetch/$s_!brqo!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc56c7142-5150-4ef8-9a36-46a0b5fd9854_800x576.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!brqo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc56c7142-5150-4ef8-9a36-46a0b5fd9854_800x576.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" 
class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Business people don&#8217;t need to give you examples in this form. You can run an interview, as I showed above, and note the examples down yourself while you&#8217;re discussing. It&#8217;s also a nice way to collaborate and visualise your discussions.</p><p>You don&#8217;t even need to start with an interview; you can use Example Mapping as a brainstorming tool to generate as many examples as possible for your (part of the) system. Then, try to model it as you see fit and ask the business for clarifications in the preferred form. It can help facilitate discussion with your team, not only with business stakeholders.</p><p>It&#8217;s super helpful, as misunderstandings are expensive. 
When multiple people try to describe the same rule using real examples, you&#8217;ll quickly see where assumptions don&#8217;t match. Better to find that out in a 30-minute chat than after two weeks of coding the wrong thing.</p><p>It also works as a readiness check. Too many red cards? The story isn&#8217;t ready. Too many blue cards? The story is probably too big. If examples come easily and everyone nods along, you&#8217;re good to go.</p><p>There&#8217;s more: you can also try to distil business rules from the examples and outcomes the business describes.</p><p>It&#8217;s worth noting that I cheated a bit with the colours. The original looks like this:</p><p>Why did I change them?</p><p>Example Mapping plays nicely with other collaboration techniques like EventStorming. If you&#8217;re familiar with the EventStorming colour scheme, that&#8217;s the reason: I aligned the colours with it. See:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!laHZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa59f21c1-131b-41ce-b9c5-9ebc4285fc6c_800x535.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!laHZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa59f21c1-131b-41ce-b9c5-9ebc4285fc6c_800x535.png 424w, https://substackcdn.com/image/fetch/$s_!laHZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa59f21c1-131b-41ce-b9c5-9ebc4285fc6c_800x535.png 848w, 
https://substackcdn.com/image/fetch/$s_!laHZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa59f21c1-131b-41ce-b9c5-9ebc4285fc6c_800x535.png 1272w, https://substackcdn.com/image/fetch/$s_!laHZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa59f21c1-131b-41ce-b9c5-9ebc4285fc6c_800x535.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!laHZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa59f21c1-131b-41ce-b9c5-9ebc4285fc6c_800x535.png" width="800" height="535" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a59f21c1-131b-41ce-b9c5-9ebc4285fc6c_800x535.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:535,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;source: https://cucumber.io/blog/bdd/example-mapping-introduction/&quot;,&quot;title&quot;:&quot;source: https://cucumber.io/blog/bdd/example-mapping-introduction/&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="source: https://cucumber.io/blog/bdd/example-mapping-introduction/" title="source: https://cucumber.io/blog/bdd/example-mapping-introduction/" srcset="https://substackcdn.com/image/fetch/$s_!laHZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa59f21c1-131b-41ce-b9c5-9ebc4285fc6c_800x535.png 424w, 
https://substackcdn.com/image/fetch/$s_!laHZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa59f21c1-131b-41ce-b9c5-9ebc4285fc6c_800x535.png 848w, https://substackcdn.com/image/fetch/$s_!laHZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa59f21c1-131b-41ce-b9c5-9ebc4285fc6c_800x535.png 1272w, https://substackcdn.com/image/fetch/$s_!laHZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa59f21c1-131b-41ce-b9c5-9ebc4285fc6c_800x535.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line 
x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I typically use it during modelling sessions to:</p><ul><li><p>brainstorm (read more in <a href="https://www.architecture-weekly.com/p/start-alone-then-together-why-software">Start Alone, Then Together: Why Software Modelling Needs Solitary Brainstorming</a>),</p></li><li><p>challenge existing models with real-world examples,</p></li><li><p>expand the model with use cases uncovered through examples,</p></li><li><p>find business rules,</p></li><li><p>help with facilitation by looking at the model from a different perspective.</p></li></ul><p>Hot spots and notes, as known from EventStorming, are also super helpful here. Read more in <a href="https://www.architecture-weekly.com/p/the-underestimated-power-of-hot-spots">The Underestimated Power of Hot Spots and Notes in EventStorming</a>.</p><p>What&#8217;s more, if you look at the Given/When/Then pattern, you may notice that it works nicely with Behaviour-Driven Design. I already wrote that <a href="https://event-driven.io/en/behaviour_driven_design_is_not_about_tests/">Behaviour-Driven Design is more than tests</a>. How to do it? Check <a href="https://event-driven.io/en/testing_event_sourcing_emmett_edition/">here</a>.</p><p>I&#8217;ll also expand on it in upcoming articles. 
I&#8217;m doing extreme Example Mapping with events, so stay tuned, more will come.</p><p>For now, also check these materials:</p><ul><li><p><a href="https://www.youtube.com/watch?v=EtoTML8cuko">Seb Rose - short, practical and actionable intro to Example Mapping</a>,</p></li><li><p><a href="https://www.youtube.com/watch?v=WvkBKvMnyuc">Kenny Baas-Schwegler - showing how to use Example Mapping with EventStorming</a>,</p></li><li><p><a href="https://cucumber.io/blog/bdd/example-mapping-introduction/">An introduction by Matt Wynne himself</a>,</p></li><li><p><a href="https://draft.io/example/example-mapping">Another quick intro by Gojko Adzic</a>.</p></li></ul><p>And most importantly, try it. Take one feature from your system, try to crunch it, or start your design session with this technique.</p><p><a href="https://event-driven.io/en/training/">Most of the teams I work with</a> enjoy this technique, as it&#8217;s a fun way to get quick, actionable outcomes.</p><p>Cheers!</p><p>Oskar</p><p>p.s. <strong>Ukraine is still under brutal Russian invasion. A lot of Ukrainian people are hurt, without shelter and need help.</strong> You can help in various ways, for instance, directly helping refugees, spreading awareness, putting pressure on your local government or companies. You can also support Ukraine by donating e.g. 
to <a href="https://www.icrc.org/en/donate/ukraine">Red Cross</a>, <a href="https://savelife.in.ua/en/donate/">Ukraine humanitarian organisation</a> or <a href="https://www.gofundme.com/f/help-to-save-the-lives-of-civilians-in-a-war-zone">donate Ambulances for Ukraine</a>.</p>]]></content:encoded></item><item><title><![CDATA[Interactive Rubber Ducking with GenAI]]></title><description><![CDATA[On how to work on the design with GenAI, without getting mad at you're absolutely right!]]></description><link>https://www.architecture-weekly.com/p/interactive-rubber-ducking-with-genai</link><guid isPermaLink="false">https://www.architecture-weekly.com/p/interactive-rubber-ducking-with-genai</guid><dc:creator><![CDATA[Oskar Dudycz]]></dc:creator><pubDate>Mon, 16 Mar 2026 13:26:11 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!jfcr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0c6f36c-9c5b-44bb-9768-beb5da788b1d_1408x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jfcr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0c6f36c-9c5b-44bb-9768-beb5da788b1d_1408x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jfcr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0c6f36c-9c5b-44bb-9768-beb5da788b1d_1408x768.png 424w, https://substackcdn.com/image/fetch/$s_!jfcr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0c6f36c-9c5b-44bb-9768-beb5da788b1d_1408x768.png 848w, 
https://substackcdn.com/image/fetch/$s_!jfcr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0c6f36c-9c5b-44bb-9768-beb5da788b1d_1408x768.png 1272w, https://substackcdn.com/image/fetch/$s_!jfcr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0c6f36c-9c5b-44bb-9768-beb5da788b1d_1408x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jfcr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0c6f36c-9c5b-44bb-9768-beb5da788b1d_1408x768.png" width="1408" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f0c6f36c-9c5b-44bb-9768-beb5da788b1d_1408x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2240324,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.architecture-weekly.com/i/191127846?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0c6f36c-9c5b-44bb-9768-beb5da788b1d_1408x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jfcr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0c6f36c-9c5b-44bb-9768-beb5da788b1d_1408x768.png 424w, 
https://substackcdn.com/image/fetch/$s_!jfcr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0c6f36c-9c5b-44bb-9768-beb5da788b1d_1408x768.png 848w, https://substackcdn.com/image/fetch/$s_!jfcr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0c6f36c-9c5b-44bb-9768-beb5da788b1d_1408x768.png 1272w, https://substackcdn.com/image/fetch/$s_!jfcr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0c6f36c-9c5b-44bb-9768-beb5da788b1d_1408x768.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>You may already know that <a href="https://event-driven.io/en/the_end_of_coding_wrong_question/">I&#8217;m a GenAI sceptic</a>. And a general sceptic.</p><p>Did you know that scepticism comes from the Greek <em>&#963;&#954;&#941;&#960;&#964;&#959;&#956;&#945;&#953;</em> (<em>skeptomai</em>), meaning &#8216;to search, to think about, or look for&#8217;? So my intention is not to say no to everything new, but rather to think about it first and understand it before I say yes.</p><p>There&#8217;s a lot of stuff about GenAI that makes me smile, but I still understand that my way is my way, and I won&#8217;t stop the world. I won&#8217;t even try. Thus, I want to research and consider how those tools can help me. I already wrote that I don&#8217;t feel like a <a href="https://www.architecture-weekly.com/p/requiem-for-a-10x-engineer-dream">10x Dev</a>, but I&#8217;m finding more ways to get help from GenAI.</p><p><strong>One of the ways that helped me is something I call </strong><em><strong>&#8220;Interactive Rubber-Ducking&#8221;</strong></em><strong>.</strong></p><p>Initially, I called it just <a href="https://www.architecture-weekly.com/p/start-alone-then-together-why-software">brainstorming</a>, but that wouldn&#8217;t be precise, as I&#8217;m not using it to brainstorm ideas, but rather to challenge and clarify them.</p><p>Most of the code I write nowadays is done in <a href="https://github.com/oskardudycz/">my OSS projects</a>. I&#8217;m grateful to have a <a href="https://discord.gg/fTpqUTMmVa">great community</a> with people actively contributing in different ways; still, the canonical design and code work is on my side. As I work in an event-driven niche, I&#8217;m often alone with my own thoughts. 
I try to use the <a href="https://www.architecture-weekly.com/p/workflow-engine-design-proposal-tell">RFC process</a> and discuss it with fellow humans, but they are not always available. Even if they are, to avoid wasting their time, I need to know what to tell or ask them. I need to give some proposals (with alternatives) to have an <a href="https://event-driven.io/en/fifteen_tips_on_how_to_run_meetings_effectively/">effective discussion</a>. I may seem organised, but that&#8217;s not always the case. Sitting in your own head is not a great place to be in general. If you&#8217;re a technical leader or an architect, I&#8217;m sure that you know that solitude too well.</p><p>GenAI tools are not great sparring partners. They&#8217;re <em>Yes men</em>. If they read this article, they&#8217;d for sure confirm it. They&#8217;d probably do it even without reading it. Of course, you can ask them not to be <a href="https://www.merriam-webster.com/dictionary/sycophant">sycophantic</a>. You can add numerous MUSTs in capital letters and a bolded <strong>NEVER</strong> here and there, and it can help, but it won&#8217;t fully beat the way they were trained.</p><p>And talking to yourself is a similar experience: you&#8217;ll find numerous ways to justify your own decisions, and looking at the same place for too long will make you miss obvious blind spots.</p><p>Ok, so why not take those two blind &#8220;people&#8221; and make them help each other?</p><p><strong>That&#8217;s kinda what &#8220;Interactive Rubber-Ducking&#8221; is.</strong> It takes a blind human with an idea, and another blind not-so-human asking questions. It starts with a prompt like this:</p><blockquote><p>Ask me one question at a time so we can develop a thorough, step-by-step spec for this idea. Each question should build on my previous answers, and our end goal is to have a detailed specification I can hand off to a developer. Let&#8217;s do this iteratively and dig into every relevant detail. 
Remember, only one question at a time.</p><p>Once we are done, save the spec as spec.md</p><p>Before asking another question, store the previous one with the answer in qa.md. Write literally the question and answer, not just a summary.</p><p>Here&#8217;s the idea:</p></blockquote><p>Don&#8217;t treat it as <em>&#8220;one magical prompt that will change your life&#8221;</em>. What matters most is why we&#8217;re doing it, what happens next, and who&#8217;s actually doing the work. Spoiler alert: it&#8217;s not the LLM.</p><p>I&#8217;m using it as a command in Claude Code, and, most importantly, with beefier models like Opus, which can reason better and ask better questions. Doing it with lower-tier models always gave me much worse results.</p><p>I&#8217;m using it with Claude Code, not Claude Chat, because I want the model to scan my codebase. I can ask it to look in certain areas or to reference my answers. I can even ask it to search the web or use MCPs like <a href="https://context7.com/">Context 7</a> to check documentation and APIs for popular libraries. Then it sometimes gets closer to brainstorming than rubber-ducking, but that&#8217;s fine.</p><p>As a result, we&#8217;ll get two artefacts:</p><ul><li><p><strong>qa.md</strong> - with the log of the back-and-forth discussion,</p></li><li><p><strong>spec.md</strong> - in theory, a spec built by the LLM, but imho it&#8217;s more of a concise summary.</p></li></ul><p>It may look like Specification-Driven Design, but it&#8217;s not.</p><p>My goal for this exercise is not to get an actionable specification.</p><p><strong>The goal is to get our LLM-based Rubber Duck to ask us hard questions and make us think, not to make the LLM think for us. To find blind spots and challenge our thinking.</strong></p><p>But we&#8217;re the drivers: we need to know what we want to do, we need to know all the WHYs, and we also need to know the HOW. The LLM is here to help, but not to do the creative work for us. 
It just pulls it out of our heads.</p><p>It also helps to see how our design may be seen by others, especially such mediocre thinkers as LLMs.</p><p>I don&#8217;t expect the Agent to be able to start implementing the spec. I expect the spec to reflect all considerations and summarise the findings. I&#8217;m always double-checking to make sure it includes all the important points. If not, I&#8217;ll keep doing Q&amp;A until I&#8217;m satisfied.</p><p>Having both of those files lets us keep the full discussion, without losing important details, alongside a shorter version. We can feed that to another model for review, or try to work on tasks and develop a more detailed plan. Sometimes I plan on my own; for simpler tasks, I may ask the LLM to do it fully. Usually, I&#8217;m driving the LLM step by step, passing just specific asks.</p><p>An example? Why not.</p><p><strong>I recently did such an exercise, trying to narrow down how to introduce a </strong><em><strong>Second-level cache</strong></em><strong> to <a href="https://github.com/event-driven-io/pongo">Pongo</a> and <a href="https://github.com/event-driven-io/emmett">Emmett</a>.</strong> What&#8217;s a second-level cache? It&#8217;s a local store of data managed by the persistence provider to improve application performance.</p><p><strong>Why do I want to introduce it?</strong> Because I got an <a href="https://github.com/event-driven-io/emmett/issues/322">issue from a user</a> that <a href="https://event-driven.io/en/rebuilding_event_driven_read_models/">rebuilding projections</a> with a lot of events can take too long. One of the reasons is that applying an event to a projection requires:</p><ul><li><p>loading the current state,</p></li><li><p>updating it,</p></li><li><p>storing the result.</p></li></ul><p>Loading and storing are two database operations per event. If we have a batch of 100 events, that&#8217;d mean 200 operations; for 1000 events, this would be 2000, so the classical N+1 problem. 
We could do it differently, and within a batch:</p><ul><li><p>group events by the target documents,</p></li><li><p>load all target documents in one operation by finding the documents whose ids are in an array (taken from the events),</p></li><li><p>cache them,</p></li><li><p>apply events in memory,</p></li><li><p>store the updated documents in one operation.</p></li></ul><p>Then we&#8217;ll get at worst 102 operations for 100 events and 1002 for 1000 events (one load, one store, and an in-memory apply per event), with only two of them hitting the database.</p><p>So, I decided to add batching and introduce a 2nd-level cache as the solution. My experience and I were the driving force behind that. I could, of course, have rubber-ducked that decision too, but I already had it planned, just not detailed. I also did some initial research before doing the session.</p><p>Let me show you the result below. Not too bad if you ask me. Of course, the task wasn&#8217;t very innovative; it&#8217;s a standard approach in mature solutions. Still, the questions were surprisingly good and helped me.</p><p>I hope this log will show you (besides how bad I am with typos) some insight into how I think, talk, and consider the design.</p><p>Tell me your thoughts and whether that matches your way. If you haven&#8217;t done it yet, try it and tell me how it went!</p><div><hr></div><h2>Interactive Rubber Ducking example</h2><h3><strong>My idea</strong></h3><p>I&#8217;d like to provide 2nd-level cache to Pongo. I want to either use keyv (https://www.npmjs.com/package/keyv) or lru-cache (https://www.npmjs.com/package/lru-cache) packages.</p><p>For now, I&#8217;d like to do it by id.</p><p>I probably need to either extend pongoCollection src/packages/pongo/src/core/collection/pongoCollection.ts with findeOneById and findManyByIds methods, or check if _id is passed in and then try to access the cache.</p><p>Cache should be set up either on the pongo collection, pongo db or pongoClient.</p><p>Optimistic Concurrency errors should invalidate cache (as that means someone else updated the record in db, e.g. 
through a different node). </p><p>InsertMany and insertOne should put it into the cache.</p><p>I&#8217;d also like to expand the handle method to allow passing either an id or an array of IDs to support batch loading of entities and improve performance, and to support batch update/insert/delete.</p><p>Thoughts?</p><p>Which defaults should I set? (Of course, allowing users to override it).</p><p>Check the related code I pasted. I want to keep it aligned and as non-invasive as possible.</p><h3>Q1: Cache library choice &#8212; lru-cache, keyv, or custom interface?</h3><p><strong>Question:</strong> You mentioned both <code>keyv</code> and <code>lru-cache</code>. They pull in different directions. Which trade-off do you prefer?</p><ul><li><p>A) Start with <code>lru-cache</code>, optionally let users swap adapters later.</p></li><li><p>B) Start with <code>keyv</code> so cross-node sharing works out of the box.</p></li><li><p>C) Define a thin cache interface (<code>get</code>/<code>set</code>/<code>delete</code>/<code>clear</code>) and ship <code>lru-cache</code> as the default adapter.</p></li></ul><p><strong>Answer:</strong> Custom cache interface (option C). No dependency on external APIs &#8212; keep it pluggable. Start with an in-memory provider only. Considered writing a custom LRU with me, but wants to ship fast. Key point: don&#8217;t stick to any third-party API surface, allow pluggability.</p><h3>Q2: Where should the cache live in Pongo&#8217;s hierarchy?</h3><p><strong>Question:</strong> Three attachment points: A) PongoClient level (shared across all), B) PongoDb level (per database), C) PongoCollection level (per collection). Or configurable at multiple levels?</p><p><strong>Answer:</strong> Allow defining it at different levels &#8212; cascading configuration. 
Set a default higher up, override per collection if needed.</p><h3>Q3: Inheritance/override behavior for cascading cache config?</h3><p><strong>Question:</strong> For the cascading config &#8212; what should the inheritance/override behavior be?</p><p>Two sub-questions:</p><ol><li><p>Should a lower level be able to <em>disable</em> caching even if a parent enables it?</p></li><li><p>Should cache <em>settings</em> (like max size, TTL) also cascade, or only the cache provider instance itself?</p></li></ol><p><strong>Answer:</strong></p><ol><li><p>Yes. Cache should be an object with type, general params and specific for this type, or <code>'disabled'</code>. That&#8217;s a proper devex.</p></li><li><p>Yes, as above. But if one selects a different cache type then general params should be inherited but type-specific should not be inherited and should fallback to defaults.</p></li></ol><p><strong>Claude&#8217;s thoughts:</strong> The <code>CacheConfig | 'disabled'</code> pattern is clean &#8212; discriminated unions play nicely with TypeScript. The inheritance rule makes sense: general params (TTL, max size) cascade, but type-specific params reset to defaults when you switch providers, avoiding leaking nonsensical config (e.g., a Redis connection string bleeding into an in-memory cache). Also worth distinguishing between &#8220;I didn&#8217;t specify anything&#8221; (inherit from parent via <code>undefined</code>) and &#8220;I explicitly want defaults&#8221; (reset) &#8212; <code>undefined</code> vs <code>'disabled'</code> vs <code>CacheConfig</code> gives a clean three-state: inherit / off / explicit.</p><h3>Q4: Cache interface shape &#8212; async, batch ops, scoping?</h3><p><strong>Question:</strong> Settling the cache interface details:</p><ol><li><p>Async or sync? Making the interface <code>Promise</code>-based from the start keeps it pluggable. 
Sound right?</p></li><li><p>Batch methods (<code>getMany</code>, <code>setMany</code>, <code>deleteMany</code>) &#8212; should the cache interface support batch ops natively, or loop internally?</p></li><li><p>Should <code>clear()</code> be scoped per collection, or always clear the entire cache?</p></li></ol><p><strong>Answer:</strong></p><ol><li><p>Maybe we could return either promise or no promise, then if we await it, it would support both? Also ensure that settings are close to what lru-cache and keyv are having.</p></li><li><p>Having them (even as syntactic sugar) would be a good devex.</p></li><li><p>We could pass down not only settings, but also the cache provider instance, then if settings are the same we can reuse cache without recreating it. If set had ttl provided, then we could do it by overriding those TTLs on item level. ClearAll probably would be something we should expose only for advanced, typically user should not be aware of cache.</p></li></ol><p><strong>Claude&#8217;s thoughts:</strong> <code>MaybePromise&lt;T&gt;</code> (i.e. <code>T | PromiseLike&lt;T&gt;</code>) works &#8212; <code>await</code> on a non-Promise just resolves immediately. No overhead for sync providers, no breaking change for async ones. Passing a cache instance is smart: config becomes either settings (Pongo creates the cache) or a pre-built instance (user brings their own). Multiple collections can explicitly share one cache instance. <code>clear()</code> should be internal/advanced &#8212; scope by collection prefix internally so a collection-level clear doesn&#8217;t nuke another collection&#8217;s entries when sharing an instance. For aligning with lru-cache/keyv settings: both support <code>max</code> (max entries) and <code>ttl</code> (TTL in ms). 
Start with <code>max</code> and <code>ttl</code> as the general params.</p><h3>Q5: Cache key strategy</h3><p><strong>Question:</strong> Since cache instances might be shared across collections, how to avoid key collisions?</p><ul><li><p>A) Pongo manages key prefixing internally (<code>{dbName}:{collectionName}:{id}</code>). The cache interface works with plain string keys, namespacing is Pongo&#8217;s concern.</p></li><li><p>B) Key is just the document <code>_id</code>. Simpler but only safe with per-collection instances.</p></li></ul><p><strong>Answer:</strong> A, definitely.</p><h3>Q6: Cache invalidation and consistency with optimistic concurrency</h3><p><strong>Question:</strong> How should cache interact with different operations?</p><ol><li><p>Successful writes &#8594; update/evict cache. Clear.</p></li><li><p>Concurrency errors &#8594; evict stale entry. Clear.</p></li><li><p>Reads with <code>_id</code> filter &#8594; cache hit or miss-then-populate. Clear.</p></li><li><p><code>handle</code> method &#8212; should the read inside <code>handle</code> use the cache (risking more concurrency errors on stale data) or always go to DB?</p></li></ol><p><strong>Answer:</strong> Definitely use cache. We could consider disabling cache through handle options.</p><p><strong>Claude&#8217;s thoughts:</strong> Use cache by default &#8212; optimistic concurrency is the safety net. If cached version is stale, write fails, we evict, caller can retry. Adding <code>skipCache?: boolean</code> to <code>HandleOptions</code> is a clean escape hatch for high-contention cases.</p><h3>Q7: Should <code>skipCache</code> be available on read methods too?</h3><p><strong>Question:</strong> Should <code>skipCache</code> also be available on <code>findOne</code> and other read methods for symmetry, or keep it <code>handle</code>-only?</p><p><strong>Answer:</strong> Good point, having <code>skipCache</code> on find methods (also <code>findMany</code> etc.) 
would be good.</p><p><strong>Claude&#8217;s thoughts:</strong> Consistent API surface. Add <code>skipCache?: boolean</code> on the shared options type (e.g. <code>CollectionOperationOptions</code>) so it&#8217;s available everywhere without repeating yourself.</p><h3>Q8: Batch operations on <code>handle</code> &#8212; signature and handler shape</h3><p><strong>Question:</strong> You want <code>handle</code> to accept an id or array of ids. Two design choices:</p><ol><li><p>Overloaded signature (<code>handle(id: string | string[], ...)</code>) vs separate method (<code>handleMany</code>)?</p></li><li><p>Handler shape for batch: A) still receives/returns one document at a time (Pongo batches DB ops internally) or B) receives/returns full array?</p></li></ol><p>Also noted: since PongoDocuments always have <code>_version</code>, if someone provides <code>expectedVersion</code> we could compare it against the cached version and decide if it&#8217;s stale without hitting the DB.</p><p><strong>Answer:</strong> Option A for handler shape. <code>handleMany</code> would be consistent with other APIs, but single id or array (<code>string | string[]</code>) is clean enough.</p><p><strong>Claude&#8217;s thoughts:</strong> Version comparison in cache is a cheap short-circuit &#8212; if cached <code>_version</code> doesn&#8217;t match <code>expectedVersion</code>, we know it&#8217;s stale without a round trip. On signature: TypeScript overloads can narrow the return type (<code>string</code> &#8594; single result, <code>string[]</code> &#8594; array). <code>handle</code> is a higher-level orchestration method, not direct CRUD, so a union input feels natural. Go with <code>handle(id: string | string[], handler, options?)</code>.</p><h3>Q9: Defaults &#8212; max entries, TTL, enabled by default?</h3><p><strong>Question:</strong> What defaults to set? 
Proposed: <code>max: 1000</code>, <code>ttl: 60_000</code> (1 min), and should cache be enabled by default or opt-in?</p><p><strong>Answer:</strong> Stick to recommended lru-cache defaults for now. Enable cache by default.</p><p><strong>Claude&#8217;s thoughts:</strong> Enabling by default works because: (1) optimistic concurrency is always the safety net, (2) it&#8217;s document-by-id only so invalidation is straightforward, (3) users can set <code>'disabled'</code>. <code>lru-cache</code> requires <code>max</code> to be set (no default) and has no default TTL. Suggested: <code>max: 1000</code>, no TTL (rely on LRU eviction + invalidation on writes/concurrency errors) &#8212; most lru-cache-aligned.</p><h3>Q10: TTL or no TTL as staleness backstop?</h3><p><strong>Question:</strong> No TTL means cached entries live until LRU-evicted or invalidated by a Pongo write. If something updates the DB outside this Pongo instance (another service, direct SQL), the cache won&#8217;t know. Options:</p><ul><li><p>A) No TTL, accept the trade-off &#8212; OC catches it on next write.</p></li><li><p>B) Set a conservative default TTL (e.g., 5 minutes) as a staleness bound.</p></li></ul><p><strong>Answer:</strong> Fine with using TTL and sometimes hitting OC issue. OC assumes rare conflicts.</p><h3>Q11: Where does cache integration hook into existing code?</h3><p><strong>Question:</strong> Two approaches for minimal invasiveness:</p><ol><li><p>Wrap at the <code>pongoCollection</code> level &#8212; intercept operations inside the factory function, driver/SQL layers untouched.</p></li><li><p>Decorator/wrapper pattern &#8212; <code>cachedPongoCollection</code> wraps <code>PongoCollection</code>, adds cache logic around delegation.</p></li></ol><p><strong>Answer:</strong> Leaning more to option 1. Thinks it&#8217;ll be easier to make observability precise eventually.</p><p><strong>Claude&#8217;s assessment:</strong> Agreed with option 1. 
Cache decisions depend on internal state (whether filter targets <code>_id</code>, document <code>_version</code>, write success vs concurrency error). A wrapper would need to re-inspect all of that from the outside. For observability, cache hit/miss logic next to the actual DB call enables precise metrics in context. Pongo&#8217;s factory functions already compose behavior internally (transactions, schema migration) &#8212; cache is another concern at the same level. A decorator would duplicate filter-inspection logic and add indirection that doesn&#8217;t carry its weight.</p><h3>Q12: Cache interaction with transactions</h3><p><strong>Question:</strong> How should cache interact with transactions?</p><ol><li><p>Should writes within a transaction update the cache immediately, or only after commit?</p></li><li><p>Should reads within a transaction check the cache?</p></li></ol><p><strong>Answer:</strong></p><ol><li><p>After commit.</p></li><li><p>We should use cache, but allow skipping it on transaction or session options. We can also allow setting cache settings per session.</p></li></ol><p><strong>Claude&#8217;s thoughts:</strong> Reads within a transaction benefiting from cache is more aggressive but logical &#8212; most transactions are short-lived. The narrow case: <code>insertOne</code> then <code>findOne</code> same <code>_id</code> within a transaction &#8212; cache won&#8217;t have the uncommitted insert, falls through to DB, gets the right answer anyway. No correctness issue, just no perf benefit for that case. Per-session cache settings fit the cascading pattern (e.g., <code>startSession({ cache: 'disabled' })</code> for batch migrations). 
Proposed cascade priority: client &#8594; db &#8594; collection &#8594; session &#8594; per-operation <code>skipCache</code>.</p><h3>Q13: Cascade priority &#8212; does session override collection?</h3><p><strong>Question:</strong> Does the cascade priority <strong>client &#8594; db &#8594; collection &#8594; session &#8594; per-operation </strong><code>skipCache</code> feel right? Session overriding collection is the non-obvious part &#8212; the argument being that session is the most immediate runtime context, while collection is structural config. Or would you prefer collection always wins over session?</p><p><strong>Answer:</strong> That&#8217;s fine. Session should precede collection, as normally you create session to e.g. share connection, and join the number of operations logically, so it&#8217;d make sense actually to also do it to e.g. override settings like caching.</p><p><strong>Claude&#8217;s thoughts:</strong> Session as logical grouping of operations with shared context (connection, transaction) &#8212; overriding cache settings at that level is extending the same concept. Bulk import in a session? Set cache disabled once on the session instead of <code>skipCache</code> on every call. Final cascade: client &#8594; db &#8594; collection &#8594; session &#8594; per-operation.</p><h3>Q14: Cache provider error handling</h3><p><strong>Question:</strong> If the cache provider throws (e.g., Redis connection drops), should Pongo:</p><ul><li><p>A) Treat it as a cache miss &#8212; fall through to DB silently, maybe log a warning. Cache failures never break the application.</p></li><li><p>B) Propagate the error &#8212; let the caller deal with it.</p></li></ul><p><strong>Answer:</strong> A &#8212; yes, when we add observability to Pongo, we need to make it also a metric and trace event/attribute/log entry.</p><p><strong>Claude&#8217;s thoughts:</strong> Cache is an optimization, not a correctness requirement. 
Silent degradation now, observable degradation when the observability layer lands.</p><h3>Q15: Cache warming, events/hooks, and delete operations</h3><p><strong>Question:</strong> Three remaining areas:</p><ol><li><p>Cache warming &#8212; should Pongo support pre-populating the cache on startup?</p></li><li><p>Cache events/hooks &#8212; should users be able to listen to cache hits/misses/evictions?</p></li><li><p>Delete operations &#8212; <code>deleteOne</code>/<code>deleteMany</code> should evict from cache. Anything non-obvious there?</p></li></ol><p><strong>Answer:</strong></p><ol><li><p>For now keep it as a future concept, or define pluggable strategy for users.</p></li><li><p>You can add the basic hook to the Pongo Cache wrapper. Shouldn&#8217;t be that hard to just call some function.</p></li><li><p>Yes, they should evict if they can, we can delete (for now) when <code>deleteMany</code> deletes by ids.</p></li></ol><p><strong>Claude&#8217;s thoughts:</strong> Cache warming deferred &#8212; can add a strategy hook later. Basic event callbacks (onHit, onMiss, onEvict, onError) on the cache wrapper are low-cost and useful for debugging even before full observability. Delete eviction scoped to id-based deletes for now &#8212; filter-based <code>deleteMany</code> doesn&#8217;t evict (no way to know which ids were affected without querying first).</p><div><hr></div><h2>Interactive Rubber Ducking result</h2><h3>Pongo 2nd Level Cache &#8212; Specification</h3><h3>Overview</h3><p>Add a document-level (by <code>_id</code>) 2nd level cache to Pongo with a pluggable provider interface. Ships with an in-memory provider backed by <code>lru-cache</code>. Enabled by default. Designed to be minimally invasive to existing code.</p><h3>Cache Interface</h3><h4><code>PongoCacheProvider&lt;T&gt;</code></h4><p>A thin, Pongo-owned interface. No coupling to any third-party API.</p><pre><code><code>type MaybePromise&lt;T&gt; = T | PromiseLike&lt;T&gt;;

interface PongoCacheProvider {
  get(key: string): MaybePromise&lt;PongoDocument | null | undefined&gt;;
  set(key: string, value: PongoDocument, options?: { ttl?: number }): MaybePromise&lt;void&gt;;
  delete(key: string): MaybePromise&lt;void&gt;;
  getMany(keys: string[]): MaybePromise&lt;(PongoDocument | null | undefined)[]&gt;;
  setMany(entries: { key: string; value: PongoDocument; ttl?: number }[]): MaybePromise&lt;void&gt;;
  deleteMany(keys: string[]): MaybePromise&lt;void&gt;;
  clear(): MaybePromise&lt;void&gt;;
}</code></code></pre><ul><li><p><code>MaybePromise</code> return types: sync providers (in-memory) return values directly, async providers (Redis) return Promises. <code>await</code> handles both transparently.</p></li><li><p>Batch methods (<code>getMany</code>, <code>setMany</code>, <code>deleteMany</code>) are first-class. Default in-memory implementation may loop internally, but the interface allows optimized batch ops for external providers.</p></li><li><p><code>clear()</code> is internal/advanced &#8212; not exposed to typical users. When sharing a cache instance across collections, scoping is handled via key prefixing by Pongo, not by the provider.</p></li></ul><h4>Cache key strategy</h4><p>Pongo manages key prefixing internally: <code>{dbName}:{collectionName}:{documentId}</code>.</p><p>The cache provider works with plain string keys &#8212; namespacing is Pongo&#8217;s concern, not the provider&#8217;s.</p><h4>Event hooks</h4><p>The Pongo cache wrapper supports basic callbacks:</p><ul><li><p><code>onHit?(key: string): void</code></p></li><li><p><code>onMiss?(key: string): void</code></p></li><li><p><code>onEvict?(key: string): void</code></p></li><li><p><code>onError?(error: unknown, operation: string): void</code></p></li></ul><p>These are optional and intended for debugging and future observability integration.</p><h3>Configuration</h3><h4><code>CacheConfig</code></h4><pre><code><code>type CacheConfig = {
  type: string;               // e.g., 'in-memory', 'redis', etc.
  max?: number;               // max entries (general param, cascades)
  ttl?: number;               // TTL in ms (general param, cascades)
  // type-specific options live here too, keyed by type
  [key: string]: unknown;
} | 'disabled';</code></code></pre><p>Three states:</p><ul><li><p><code>undefined</code> &#8212; inherit from parent level</p></li><li><p><code>'disabled'</code> &#8212; explicitly turn off caching at this level</p></li><li><p><code>CacheConfig</code> object &#8212; explicit configuration</p></li></ul><h4>Cascading configuration</h4><p>Cache config can be set at multiple levels. Each level inherits from its parent unless explicitly overridden:</p><p><strong>client &#8594; db &#8594; collection &#8594; session &#8594; per-operation</strong></p><p>Inheritance rules:</p><ul><li><p>General params (<code>max</code>, <code>ttl</code>) cascade down.</p></li><li><p>If a lower level switches <code>type</code>, type-specific params reset to defaults (not inherited from parent).</p></li><li><p>Session overrides collection &#8212; session is a logical grouping of operations, natural place to override runtime behavior (e.g., disable cache for a bulk import).</p></li><li><p>Per-operation <code>skipCache?: boolean</code> is the most granular escape hatch.</p></li></ul><h4>Passing a cache instance</h4><p>Users can provide either:</p><ul><li><p><strong>Settings</strong> &#8212; Pongo creates and manages the cache provider.</p></li><li><p><strong>A pre-built cache provider instance</strong> &#8212; Pongo uses it directly.</p></li></ul><p>If settings are the same across multiple collections, Pongo can reuse the same provider instance internally. 
When a user passes an instance, multiple collections can explicitly share one cache.</p><h4>Defaults</h4><ul><li><p><strong>Enabled by default</strong></p></li><li><p><code>max</code>: follow <code>lru-cache</code> recommended defaults (1000)</p></li><li><p><code>ttl</code>: follow <code>lru-cache</code> recommended defaults</p></li><li><p>Default provider: in-memory (<code>lru-cache</code>)</p></li></ul><h3>Integration points</h3><h4>Where: <code>pongoCollection</code> factory function</h4><p>Cache logic is added directly inside <code>pongoCollection</code>, not as an external decorator/wrapper. This gives cache operations access to internal state (filter inspection, <code>_version</code>, write outcomes) and keeps observability precise.</p><h4>Read operations</h4><p><code>findOne</code><strong>:</strong></p><ul><li><p>If the filter targets <code>_id</code>, check cache first.</p></li><li><p>Cache hit &#8594; return cached document.</p></li><li><p>Cache miss &#8594; query DB, populate cache, return.</p></li><li><p><code>skipCache?: boolean</code> option available.</p></li></ul><p><code>findMany</code><strong> / other query methods:</strong></p><ul><li><p>If the filter is a list of <code>_id</code> values, check cache for each.</p></li><li><p>Return cached hits, query DB for misses, populate cache with DB results.</p></li><li><p>Non-<code>_id</code> filters bypass cache entirely (cache is by-id only).</p></li></ul><h4>Write operations</h4><p><code>insertOne</code><strong>:</strong></p><ul><li><p>After successful insert, put the document into cache.</p></li></ul><p><code>insertMany</code><strong>:</strong></p><ul><li><p>After successful insert, put all documents into cache.</p></li></ul><p><code>updateOne</code><strong> / </strong><code>updateMany</code><strong> / </strong><code>replaceOne</code><strong>:</strong></p><ul><li><p>After successful write, update the cache entry with the new document 
state.</p></li></ul><p><code>deleteOne</code><strong>:</strong></p><ul><li><p>After successful delete, evict from cache.</p></li></ul><p><code>deleteMany</code><strong>:</strong></p><ul><li><p>If deleting by ids, evict those ids from cache.</p></li><li><p>Filter-based <code>deleteMany</code> does not evict (no way to know affected ids without extra query). Future improvement possible.</p></li></ul><h3>Optimistic concurrency</h3><ul><li><p>On concurrency error (version mismatch), <strong>evict the stale entry</strong> from cache. This is critical &#8212; a concurrency error means someone else updated the record (e.g., from another node), so the cached version is stale.</p></li><li><p>If a caller provides <code>expectedVersion</code> and the cached document has <code>_version</code>, compare them in memory. If they don&#8217;t match, we know it&#8217;s stale without hitting the DB &#8212; cheap short-circuit.</p></li></ul><h4><code>handle</code> method</h4><ul><li><p>Accepts <code>id: string | string[]</code> (overloaded: string returns single result, string[] returns array).</p></li><li><p>Read phase uses cache by default.</p></li><li><p>Write phase updates/evicts cache based on outcome.</p></li><li><p><code>skipCache?: boolean</code> available in handle options.</p></li><li><p>For batch: loads all ids from cache, fetches misses from DB in one query, processes handler per document, batch writes to DB, updates cache after commit.</p></li></ul><h4>Transactions</h4><ul><li><p><strong>Writes update cache only after commit.</strong> Not during the transaction &#8212; uncommitted data should not enter the cache.</p></li><li><p><strong>Reads within a transaction check cache</strong> by default (most transactions are short-lived, cached data is probably valid). 
If the transaction modified a document and then reads it, cache miss falls through to DB which returns the correct uncommitted state.</p></li><li><p>Cache can be disabled per-session: <code>startSession({ cache: 'disabled' })</code>.</p></li><li><p>Per-session cache settings follow the cascade (session overrides collection).</p></li></ul><h3>Error handling</h3><p>Cache provider errors are <strong>swallowed</strong> &#8212; treated as cache misses. The application never fails because of a cache failure.</p><ul><li><p>Cache is an optimization, not a correctness requirement.</p></li><li><p>Errors trigger <code>onError</code> hook for debugging.</p></li><li><p>Future: errors will become metrics and trace events when Pongo&#8217;s observability layer lands.</p></li></ul><h3>Future considerations (out of scope for v1)</h3><ul><li><p><strong>Cache warming</strong>: pluggable strategy for pre-populating cache on startup. Interface hook defined but no default implementation shipped.</p></li><li><p><strong>Query-result caching</strong>: caching results of non-<code>_id</code> queries. Complex invalidation &#8212; deferred.</p></li><li><p><strong>Cross-node cache sharing</strong>: Redis/Memcached providers. 
The interface supports it; no provider shipped in v1.</p></li><li><p><strong>Filter-based </strong><code>deleteMany</code><strong> eviction</strong>: would require querying affected ids before delete.</p></li><li><p><strong>Observability integration</strong>: metrics (hit ratio, latency saved, eviction counts), trace attributes, log entries for cache operations.</p></li></ul><h3>Implementation approach</h3><ol><li><p>Define <code>PongoCacheProvider</code> interface and <code>CacheConfig</code> types.</p></li><li><p>Implement in-memory provider using <code>lru-cache</code>.</p></li><li><p>Build Pongo cache wrapper (key prefixing, event hooks, error swallowing).</p></li><li><p>Wire cascading config resolution (client &#8594; db &#8594; collection &#8594; session &#8594; per-op).</p></li><li><p>Integrate into <code>pongoCollection</code>: intercept <code>_id</code>-based reads, update cache on writes, evict on deletes and concurrency errors.</p></li><li><p>Extend <code>handle</code> to accept <code>string | string[]</code> with batch support.</p></li><li><p>Add <code>skipCache</code> to operation options.</p></li><li><p>Tests: unit tests for cache provider, integration tests for cache + collection operations, e2e tests for concurrency scenarios.</p></li></ol><div><hr></div><p>Cheers!</p><p>Oskar</p><p><strong>p.s. And yes, Nano Banana also did pretty well generating the cover of this post!</strong></p><p>p.s.2. <strong>Ukraine is still under brutal Russian invasion. A lot of Ukrainian people are hurt, without shelter and need help.</strong> You can help in various ways, for instance, directly helping refugees, spreading awareness, putting pressure on your local government or companies. You can also support Ukraine by donating e.g. 
to <a href="https://www.icrc.org/en/donate/ukraine">Red Cross</a>, <a href="https://savelife.in.ua/en/donate/">Ukraine humanitarian organisation</a> or <a href="https://www.gofundme.com/f/help-to-save-the-lives-of-civilians-in-a-war-zone">donate Ambulances for Ukraine</a>.</p>]]></content:encoded></item><item><title><![CDATA[The End of Coding? Wrong Question]]></title><description><![CDATA[Be careful what you wish for, because your wish may come true.]]></description><link>https://www.architecture-weekly.com/p/the-end-of-coding-wrong-question</link><guid isPermaLink="false">https://www.architecture-weekly.com/p/the-end-of-coding-wrong-question</guid><dc:creator><![CDATA[Oskar Dudycz]]></dc:creator><pubDate>Mon, 09 Mar 2026 10:54:24 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!7ks-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdab4d70f-1e28-4ea4-9c02-1f84a903244d_800x500.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7ks-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdab4d70f-1e28-4ea4-9c02-1f84a903244d_800x500.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7ks-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdab4d70f-1e28-4ea4-9c02-1f84a903244d_800x500.png 424w, https://substackcdn.com/image/fetch/$s_!7ks-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdab4d70f-1e28-4ea4-9c02-1f84a903244d_800x500.png 848w, 
https://substackcdn.com/image/fetch/$s_!7ks-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdab4d70f-1e28-4ea4-9c02-1f84a903244d_800x500.png 1272w, https://substackcdn.com/image/fetch/$s_!7ks-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdab4d70f-1e28-4ea4-9c02-1f84a903244d_800x500.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7ks-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdab4d70f-1e28-4ea4-9c02-1f84a903244d_800x500.png" width="800" height="500" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dab4d70f-1e28-4ea4-9c02-1f84a903244d_800x500.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:500,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:319978,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.architecture-weekly.com/i/190367899?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdab4d70f-1e28-4ea4-9c02-1f84a903244d_800x500.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7ks-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdab4d70f-1e28-4ea4-9c02-1f84a903244d_800x500.png 424w, 
https://substackcdn.com/image/fetch/$s_!7ks-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdab4d70f-1e28-4ea4-9c02-1f84a903244d_800x500.png 848w, https://substackcdn.com/image/fetch/$s_!7ks-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdab4d70f-1e28-4ea4-9c02-1f84a903244d_800x500.png 1272w, https://substackcdn.com/image/fetch/$s_!7ks-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdab4d70f-1e28-4ea4-9c02-1f84a903244d_800x500.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Be careful what you wish for, because your wish may come true.</strong> </p><p>What LLMs revealed is how many people in our industry don't like to code. <br><br>It's intriguing that now they claim and showcase what they "built with Claude", whereas usually that means they generated a PoC.<br><br>It's funny, as people still focus on how they're building, so it's all about the code. And if that's the message sent outside, together with the thought that LLMs are already better than "average coder Joe", then the logical follow-up question is: why do we need those humans in the loop?<br><br><strong>I think that most people look at the forest and see trees.</strong> The current way of working with LLMs is not scalable. It's a transition phase. I can't imagine calling myself an engineer and doing ONLY stringly-typed development with chat or markdown.<br><br>What I can imagine is getting help from those tools: using them for research and for generating the OUTPUT, while I stay responsible for the OUTCOME.<br><br>Why am I saying that we're in a transition phase? Because prompting in natural language is imprecise and verbose, and adding a translation layer between our freeform prompts and programming languages is a waste of time (and tokens, which LLM vendors love).<br><br><strong>I think that we'll still be coding, but with some other layer, as LLMs are good with structured input, like programming languages.</strong> So we might need programming languages other than the ones we have at the moment. Might we need different tools to evaluate LLMs' output to make it deterministic? Might we need a different approach for engineering to make it scalable? Might we need more?<br><br>Still, I don't see those discussions. <br><br>I mostly see noise and celebrities who don't code showing off their beautiful PoCs. And doing a mic drop about the end of coding. 
As if a PoC represented the whole Software Development Life Cycle.<br><br><strong>So, will we code or not?</strong> <br><br>If the answer is yes, let's talk about what's next, then let's discuss how and when.<br><br>If the answer is no, then let's talk about how our job will change, ask if it's still engineering, etc.<br><br>Let's try to make our discussions more precise, more focused on the essence, and keep them from becoming just a bunch of anecdotal evidence. <br><br>No one outside our industry cares how we code unless it changes the cost, quality, or delivery time.<br><br>Let's discuss the impact that matters, rather than just the amount of code we produce (or not).</p><p>Now, guess who I am quoting:</p><blockquote><p>Imagine you&#8217;re a software application developer. Your programming language of choice (or the language that&#8217;s been foisted on you) is Java or TypeScript. You&#8217;ve been at this for quite a while and your job doesn&#8217;t seem to be getting any easier. </p><p>These past few years you&#8217;ve seen the growth of multiple incompatible architectures. Now you&#8217;re supposed to cope with all this and make your applications work in a distributed client-server environment. The growth of the Internet, the World-Wide Web, and &#8220;electronic commerce&#8221; have introduced new dimensions of complexity into the development process.</p><p>The tools you use to develop applications don&#8217;t seem to help you much. You&#8217;re still coping with the same old problems; the fashionable new object-oriented techniques seem to have added new problems without solving the old ones.</p><p>You say to yourself and your friends, &#8220;There has to be a better way&#8221;!</p><p><strong>The Better Way is Here Now</strong></p><p>Now there is a better way&#8212;it&#8217;s our new model. 
Imagine, if you will, this development world&#8230;</p><p>&#8226; It&#8217;s still dead simple.</p><p>&#8226; Your development cycle is much faster.</p><p>&#8226; Your applications can be created across multiple platforms. Write your spec once, and you never need to port them&#8212;they will be recreated for you, without your hand-rolled modifications, on multiple operating systems and hardware architectures.</p><p>&#8226; Your applications are adaptable to changing environments.</p><p>&#8226; Your end users can trust that your applications are secure, and you can use protection against viruses and tampering through security scans.</p><p>You don&#8217;t need to dream about these features. They&#8217;re here now. </p></blockquote><p>And also this from another source:</p><blockquote><p>When I started interviewing programmers in 2005, I would generally let them use any language or tool they wanted to solve the coding problems I gave them. 99% of the time, they chose Java.</p><p>Nowadays, they tend to choose LLMs.</p><p>Now, don&#8217;t get me wrong: there&#8217;s nothing wrong with LLM as an implementation tool.</p><p>Wait a minute, I want to modify that statement. I&#8217;m not claiming, <em>in this particular article,</em> that there&#8217;s anything wrong with LLM as an implementation tool. There are lots of things wrong with it but those will have to wait for a different article.</p><p>Instead what I&#8217;d like to claim is that LLM is not, generally, a hard enough programming tool that it can be used to discriminate between great programmers and mediocre programmers. It may be a fine tool to work in, but that&#8217;s not today&#8217;s topic. I would even go so far as to say that the fact that LLMs are not hard enough is a feature, not a bug, but it does have this one problem.</p></blockquote><p>Well, I cheated you, but only a bit. 
I changed &#8220;Java&#8221; to &#8220;LLM&#8221; and cut some phrases.</p><p>The first one is from <a href="https://www.stroustrup.com/1995_Java_whitepaper.pdf">&#8220;The Java Language Environment&#8221;</a> by Sun Microsystems, introducing Java in 1995. </p><p>The second one was from Joel Spolsky&#8217;s <a href="https://www.joelonsoftware.com/2005/12/29/the-perils-of-javaschools-2/">&#8220;The Perils of JavaSchools&#8221;</a> article, written in 2005.</p><p>Let me be clear. I&#8217;m not trying to give a grandpa talk about the old days and claim that it&#8217;s the same old thing.</p><p>What I&#8217;m trying to say is that we were continuously introducing new abstractions into our development cycle to scale it. By scale, I mean: getting more people to deliver more code. Even Java was invented for precisely this goal. Yes, the one that, together with craftsmanship madness, is presented as &#8220;the enterprisy complex environment&#8221;. In the early days, people just said that it would make us dumber.</p><p><strong>The goal of abstraction is not to gatekeep but to allow us to reduce cognitive load.</strong> We invented new languages to help us, but then added more components like a distributed environment, multiregion, because of globalisation. For the same reasons, we use cloud-native tooling so we don't have to deal with it.</p><p>Some of the architecture and security tools were commoditised by the cloud. We don&#8217;t need to think about much of the stuff we had to do before.</p><p>Will Claude-native do the same?</p><p>We don&#8217;t need to learn Assembler, C++, Lisp anymore; we have lost a lot of mechanical sympathy. We deal with higher abstractions, but we still engineer solutions, and we still code. Is it a different kind of coding? It is. Is it better or worse? Well, it is how it is. 
As Gerald Weinberg said:</p><blockquote><p>Things are the way they are because they got that way</p></blockquote><p>And now the question is whether we&#8217;re fine with the way we&#8217;re doing stuff, and where we're heading. </p><p>Personally, I don&#8217;t think that stringly-typed markdown or chat-based design will be the thing in the future.</p><p>Knowing how skilled we have always been at:<br>- breaking down tasks into smaller chunks,<br>- writing precisely what we had in mind,<br>- thinking before doing,<br>- weighing trade-offs,<br><br>I'm optimistic about the Spec-Driven Design idea. It's going to be great.</p><p>Not.</p><p>If we want to call ourselves engineers, we need to put in more structure and determinism.</p><p>I agree that reviewing all code generated by GenAI is not sustainable.</p><p>But I also don&#8217;t think that generating tons of code is, in general, sustainable.</p><p>Sure, with the current state of the art, just generating code with the tools we have is some solution.</p><p>But let&#8217;s start thinking about what&#8217;s next.</p><p>Simon Wardley <a href="https://www.linkedin.com/feed/update/urn:li:activity:7426977059677077504/">brought up an interesting point</a>:</p><blockquote><p>That said, it is fine for an entire culture to decide that producing outputs matters more than understanding mechanisms. You only have to compare the practical engineering of the Roman Empire and the loss of inquiry from science in the Hellenistic age to see this. When the Roman Empire collapsed, the practical knowledge embedded in those institutions (how to maintain aqueducts, how to produce certain grades of concrete) was lost remarkably quickly. Not because people decided to forget it, but because the knowledge was procedural, embedded in chains of practice rather than recorded as transferable understanding. When the chains of practice broke, that embedded knowledge went with them. We had to rediscover the art of inquiry (i.e. 
Science) to bring them back.</p></blockquote><p>So if we don&#8217;t code, how do we hone our skills? How will newcomers from LLMSchools (using Joel&#8217;s terms) be able to decide whether something is wrong or right? I don&#8217;t think that you can be good at something without doing it.</p><p>The paradox with code is that it&#8217;s not an asset; it&#8217;s a liability. Some say that code doesn&#8217;t matter, only proper design. But how do you define &#8220;proper design&#8221;? Yes, code style doesn&#8217;t matter as long as it works as expected. But&#8230;</p><p>But code eventually matters, as that&#8217;s the source of truth for what&#8217;s on production. As Alberto Brandolini said:</p><blockquote><p>It's <em>developers</em>' (mis)understanding, <em>not</em> domain experts' knowledge, that gets released in production.</p></blockquote><p>Now, it&#8217;s the developers&#8217; and LLMs&#8217; misunderstandings that are deployed to production, not the expert&#8217;s knowledge. Nor is the markdown spec.</p><p>And not coding is <em>just one danger.</em></p><p><strong>Outsourcing thinking is an even more dangerous path, as:</strong></p><ul><li><p>If LLMs are doing everything, then again, what are humans for? Aren&#8217;t we cutting the branch on which we&#8217;re sitting?</p></li><li><p>LLMs are statistical parrots. They repeat the most probable answer. Which means a mediocre one. This can still be fine enough for many cases, but for those where we want to make a difference? Definitely not.</p></li><li><p>Just like we&#8217;re losing our coding skills by not using them, we&#8217;re losing design skills by not practising them.</p></li></ul><p>Of course, whatever happens, LLMs will stay with us. How and where, it&#8217;s hard to say. Unless you have a Magic 8 Ball with 100% correct predictions. I don&#8217;t.</p><p>That&#8217;s why I&#8217;d like our industry to finally start mature discussions on the real impact. 
I would like us to stop acting like children, bragging about generating code and then claiming that code doesn&#8217;t matter.</p><p>I&#8217;d like us to think about how to reshape our SDLC process and make it sustainable.</p><p>I&#8217;d like us to think about what tools we need, and how to change what we have.</p><p>If we don&#8217;t finally start doing it, then things will be the way they are because they got that way. </p><p>And it might not be what we wished for.</p><p>Cheers</p><p>Oskar</p><p><strong>p.s.</strong> to kinda prove that I&#8217;m more of a sceptic and a pragmatist than a hater, I recently started playing with building an Agent with Emmett to better understand those tools. If you&#8217;d like to read about the findings and honest thoughts I have while doing it, please comment!</p><p>p.s.2. <strong>Ukraine is still under brutal Russian invasion. A lot of Ukrainian people are hurt, without shelter and need help.</strong> You can help in various ways, for instance, directly helping refugees, spreading awareness, putting pressure on your local government or companies. You can also support Ukraine by donating e.g. to <a href="https://www.icrc.org/en/donate/ukraine">Red Cross</a>, <a href="https://savelife.in.ua/en/donate/">Ukraine humanitarian organisation</a> or <a href="https://www.gofundme.com/f/help-to-save-the-lives-of-civilians-in-a-war-zone">donate Ambulances for Ukraine</a>.</p><p></p>]]></content:encoded></item><item><title><![CDATA[Parse, Don't Guess]]></title><description><![CDATA[Last time, I shared with you how sneaky I was on transaction handling. Today, the opposite: I&#8217;ll tell you how I fixed the issue when I tried to be too sneaky. I already told you that Sneaky Code Bites Back. The moral? 
Do as I tell, not how I do.]]></description><link>https://www.architecture-weekly.com/p/parse-dont-guess</link><guid isPermaLink="false">https://www.architecture-weekly.com/p/parse-dont-guess</guid><dc:creator><![CDATA[Oskar Dudycz]]></dc:creator><pubDate>Mon, 02 Mar 2026 17:18:15 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Rc3Y!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9abd1e13-5804-495b-bbe8-a3cca0867aca_800x420.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Rc3Y!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9abd1e13-5804-495b-bbe8-a3cca0867aca_800x420.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Rc3Y!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9abd1e13-5804-495b-bbe8-a3cca0867aca_800x420.png 424w, https://substackcdn.com/image/fetch/$s_!Rc3Y!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9abd1e13-5804-495b-bbe8-a3cca0867aca_800x420.png 848w, https://substackcdn.com/image/fetch/$s_!Rc3Y!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9abd1e13-5804-495b-bbe8-a3cca0867aca_800x420.png 1272w, https://substackcdn.com/image/fetch/$s_!Rc3Y!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9abd1e13-5804-495b-bbe8-a3cca0867aca_800x420.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Rc3Y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9abd1e13-5804-495b-bbe8-a3cca0867aca_800x420.png" width="800" height="420" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9abd1e13-5804-495b-bbe8-a3cca0867aca_800x420.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:420,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;cover&quot;,&quot;title&quot;:&quot;cover&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="cover" title="cover" srcset="https://substackcdn.com/image/fetch/$s_!Rc3Y!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9abd1e13-5804-495b-bbe8-a3cca0867aca_800x420.png 424w, https://substackcdn.com/image/fetch/$s_!Rc3Y!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9abd1e13-5804-495b-bbe8-a3cca0867aca_800x420.png 848w, https://substackcdn.com/image/fetch/$s_!Rc3Y!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9abd1e13-5804-495b-bbe8-a3cca0867aca_800x420.png 1272w, https://substackcdn.com/image/fetch/$s_!Rc3Y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9abd1e13-5804-495b-bbe8-a3cca0867aca_800x420.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div>
<p><a href="https://event-driven.io/en/cloudflare_d1_transactions_and_tradeoffs/">Last time, I shared with you how sneaky I was on transaction handling</a>. Today, the opposite: I&#8217;ll tell you how I fixed the issue when I tried to be too sneaky. I already told you that <a href="https://www.architecture-weekly.com/p/sneaky-code-bites-back">Sneaky Code Bites Back</a>. The moral? Do as I tell, not how I do.</p><p>In some environments, we&#8217;re spoiled. We&#8217;re getting a lot from a Base Class Library or standard frameworks, so we forget that certain issues even exist. For instance, serialisation. Do you know how many data types JSON has? 6. Six. 
Sze&#347;&#263;.</p><p>Exactly those:</p><ul><li><p>string,</p></li><li><p>number,</p></li><li><p>boolean,</p></li><li><p>object,</p></li><li><p>array,</p></li><li><p>and (TADA!) null.</p></li></ul><p>What about number precision and size? There is a number. That&#8217;s all I can tell you, and it&#8217;s not enough, e.g., to keep big int/long, etc. What about Dates? There are none, either. I wrote more about it, and about how much fun that brings, <a href="https://event-driven.io/en/fun_with_json_serialisation/">here</a>.</p><p>If you use statically typed languages and runtimes like C#, Java, etc., your serialiser can, in addition to parsing, perform mapping and, sometimes, validation. And it can also be tricky, as <a href="https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/">Alexis King nicely put it in her &#8220;Parse, don&#8217;t validate&#8221;</a>.</p><p>If you&#8217;re in a dynamic environment, like JavaScript, then you&#8217;re left with parsing and explicit mapping afterwards. What about TypeScript? Same case: types are only used during compilation, then erased and not visible at runtime. So they&#8217;re gone exactly in the place where we do parsing.</p><p>JSON was defined a long time ago. JavaScript has moved on since then and now supports bigints (Big Integers) and Dates natively (what an achievement!), which created a gap I wanted to fill.</p><p>As you know from my previous articles (e.g. <a href="https://event-driven.io/en/checkpointing_message_processing/">this one</a>), big integers are quite important in distributed processing. You can represent the position in the log with them. Since your log may be quite long, regular numbers aren&#8217;t enough. Or they&#8217;re long enough, until they overflow, then they&#8217;re not anymore.</p><p>I&#8217;m using those bigint types extensively in the internals of <a href="https://github.com/event-driven-io/emmett">Emmett</a> and <a href="https://github.com/event-driven-io/pongo">Pongo</a>. And I store them in JSON. 
I store them as numeric strings, because strings don&#8217;t have a max length (or at least none that I know of).</p><p>So, for instance, an event payload can look like:</p><pre><code><code>{
  "type": "InvoiceIssued",
  "data": {
    "invoiceNumber": "123",
    "version": 1,
    "issuer": "John Doe",
    "issuedAt": "2026-02-23T14:07:20Z"
  },
  "metadata": {
    "streamPosition": "3",
    "globalPosition": "928391"
  }
}</code></code></pre><p>As you can see, the stream and global positions in the metadata are bigints (even when they&#8217;d fit in a regular number), and data can also use bigints if the user decides to (e.g., invoice number).</p><p>And encoding data is simple: you convert it to a string, call it a day. But how to get it back?</p><p>And here&#8217;s where my struggles started. How do you know whether someone intentionally used a bigint or just wanted to store a number as a string?</p><p>There are several options. The first one is: encode the value.</p><p>We could store it, for instance, as:</p><ul><li><p>prefixed value: <strong>&#8220;_bigint:928391&#8221;</strong>. But then you need to find a prefix that will be unique enough not to cause conflicts,</p></li><li><p>nested object, e.g. <strong>{ &#8220;_kind&#8221;: &#8220;bigint&#8221;, &#8220;value&#8221;: &#8220;928391&#8221; }</strong>.</p></li></ul><p>Then, either based on the prefix or the object structure, we could automatically decode the value. Still, this creates other issues, as the stored structure no longer matches the original value. If we&#8217;re just storing and retrieving, that shouldn&#8217;t be so bad, but&#8230; But remember that in <a href="https://github.com/event-driven-io/pongo">Pongo</a> I&#8217;m allowing the use of PostgreSQL and SQLite as document databases, supporting such queries:</p><pre><code><code>const invoices = pongoDb.collection&lt;Invoice&gt;("invoices");

const invoiceNumber = 123n;
const invoice = await invoices.findOne({ invoiceNumber });</code></code></pre><p>That gets translated into a <a href="https://www.architecture-weekly.com/p/postgresql-jsonb-powerful-storage">fancy JSONB SQL query</a>.</p><p>Of course, I could work around it by encoding the value, but&#8230; But I was lazy!</p><p>I decided to use a Get Out of Jail Free Card and just treat all strings made only of digits as bigints. Sneaky. And it will get even sneakier.</p><p>In JavaScript, JSON.parse accepts a second parameter (a reviver) that allows you to provide custom mapping logic. I decided to use it and check if the string is numeric, and gulp, I&#8217;ve used a regular expression to do it:</p><pre><code><code>const bigIntReviver: JSONReviver = (_key, value) =&gt; {
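  if (typeof value === 'string' &amp;&amp; /^[+-]?\d+n?$/.test(value)) {
    // BigInt() rejects the literal-style "n" suffix, so strip it before converting
    return BigInt(value.endsWith('n') ? value.slice(0, -1) : value);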
  }

  return value;
};</code></code></pre><p>Yes, it&#8217;s either DNS or Regex. Or both.</p><p>I <a href="https://www.architecture-weekly.com/p/typescript-migrates-to-go-whats-really">explained in another article</a> that the JavaScript runtime doesn&#8217;t like CPU-heavy computations.</p><p>A small Regex isn&#8217;t CPU-heavy on its own. But consider that this will be done for each string in each document or event you try to deserialise, and multiply that by the number of concurrent requests. That can cause the <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Execution_model">JavaScript event loop to freeze</a>.</p><p>What&#8217;s more, I plugged that automatically into the <a href="https://node-postgres.com/">node-postgres driver</a> custom type handling, so each JSONB deserialization goes through it.</p><p>Again, no shit, Sherlock. I should have known it wasn&#8217;t the best choice, but I was busy trying to be sneaky at that moment.</p><p>Fortunately, a user, Dawid, benchmarked and noticed CPU freezes. It wasn&#8217;t catastrophic, but clearly needed a fix.</p><h2>The Shift</h2><p><strong>And here I had several options.</strong> I could keep hacking on the same idea: maybe replace the Regex with a simpler string check, still globally. Or I could just ignore bigints during deserialisation entirely, let them stay strings, call it a day, move on. Or I could apply the encoding I mentioned earlier: prefixed values or nested objects. All of those would fix the performance issue. And all of those would be the same kind of wrong choice I already made: trying to solve a schema problem without the schema.</p><p>Because that&#8217;s the actual mistake here, not the Regex. The <em>pg</em> driver has no idea what your schema looks like. It doesn&#8217;t know that <em>&#8220;928391&#8221;</em> is a bigint and <em>&#8220;John Doe&#8221;</em> is a name. It doesn&#8217;t know that <em>&#8220;123&#8221;</em> is an invoice number (bigint!) 
and <em>&#8220;90210&#8221;</em> is a zip code (string!). I asked it to guess, and it guessed wrong, because there is no right guess at that level.</p><p>Enough is enough. I had been planning to do this properly for a while, and the performance issue gave me the push I needed.</p><p><strong><a href="https://wiki.c2.com/?MakeItWorkMakeItRightMakeItFast">Old rule says: &#8220;Make it work, make it right, make it fast&#8221;.</a></strong> I had <em>&#8220;make it work&#8221;</em> covered for a long time. Now it was time for <em>&#8220;make it right&#8221;</em>.</p><p>And honestly? It wasn&#8217;t that hard. Maybe because &#8220;make it work&#8221; came first, I already understood the problem well enough to see the shape of the solution.</p><p>In <a href="https://github.com/event-driven-io/Pongo/pull/149">Pongo</a>, I dropped the automatic bigint parsing from the driver entirely. If you want bigint or date parsing, you say so at the client level:</p><pre><code><code>const client = pongoClient({
  driver: databaseDriver,
  connectionString: postgresConnectionString,
  serialization: {
    options: {
      parseBigInts: true,
      parseDates: true,
    }
  },
});</code></code></pre><p>By default, strings stay strings. You opt in. I didn&#8217;t want to break things for users who don&#8217;t care about bigint precision or don&#8217;t have performance-sensitive workloads. The serializer became an explicit parameter passed down to each query, each collection, each operation, instead of a global that silently changed everything.</p><p><strong>That was the &#8220;make it right&#8221; part for Pongo.</strong> But disabling alone isn&#8217;t a solution, it&#8217;s a band-aid. Users who need bigints and dates still need a way to get them back after deserialisation. The question is: where does that conversion happen?</p><p>And that&#8217;s where upcasting comes in. Let me start with a simple example in <a href="https://github.com/event-driven-io/Pongo/pull/149">Pongo</a>, then build up.</p><p>Say you have a user document. In the database, dates are stored as ISO strings (because JSON). But in your application, you want proper Date objects (the same approach applies to bigints stored as numeric strings):</p><pre><code><code>type UserDocStored = {
  name: string;
  createdAt: string;
  lastLogin: string;
};

type UserDoc = {
  name: string;
  createdAt: Date;
  lastLogin: Date;
};</code></code></pre><p>The upcast function does the conversion:</p><pre><code><code>const upcast = (doc: UserDocStored): UserDoc =&gt; ({
  name: doc.name,
  createdAt: new Date(doc.createdAt),
  lastLogin: new Date(doc.lastLogin),
});</code></code></pre><p>You wire it into the collection, and every read goes through it:</p><pre><code><code>const collection = pongoDb.collection&lt;UserDoc, UserDocStored&gt;(
  'users',
  {
    schema: { versioning: { upcast } },
  },
);</code></code></pre><p>What&#8217;s in the database:</p><pre><code><code>{ "name": "Alice", "createdAt": "2024-01-15T10:30:00.000Z", ... }</code></code></pre><p>What you get back:</p><pre><code><code>{ name: 'Alice', createdAt: Date, ... }</code></code></pre><p>That&#8217;s all. <em>new Date(str)</em> is cheap. Running a Regex against every string in the document is not. The CPU freeze Dawid spotted came from that check running millions of times at the driver level for every field on every concurrent request. With upcasting, the conversion runs only for the fields you declared, in a plain function, no Regex.</p><p>But this is just type mapping, the simplest case. As I wrote <a href="https://event-driven.io/en/fun_with_json_serialisation/">in my serialisation article</a>, the explicit mapping pattern is useful for much more than just fixing types. It&#8217;s the same pattern you need for schema versioning. It defines the stored schema and the application schema separately, along with a function to transform one into the other.</p><p>Let&#8217;s say business requirements changed. You now need to group user data differently: a <em>profile</em> object for identity data, and a <em>timestamps</em> object for temporal data. The V1 documents are flat. The new V2 shape is nested:</p><pre><code><code>type UserDocV1 = {
  name: string;
  createdAt: string;
  lastLogin: string;
};

type UserDocV2 = {
  profile: {
    name: string;
  };
  timestamps: {
    createdAt: Date;
    lastLogin: Date;
  };
};</code></code></pre><p>This isn&#8217;t just a type change like string-to-Date anymore. The structure itself is different. Flat fields became nested objects, and field names moved into sub-objects. And you have thousands of V1 documents already stored. You can&#8217;t migrate them all at once (or don&#8217;t want to, because it&#8217;s risky, and some consumers might still expect V1). But your application now expects V2.</p><h2>Compatibility FTW</h2><p>This is where backward and forward compatibility come in.</p><p><strong>Backward compatibility</strong> means: old data still works. V1 documents stored months ago need to be readable by the V2 code. The upcast handles this. It reads the document in whatever shape it has and transforms it into V2.</p><p><strong>Forward compatibility</strong> means: new data doesn&#8217;t break old consumers. If you have another service or an older deployment that still reads the V1 format, it needs to keep working. The downcast handles this. When storing V2 documents, it writes the V1 fields alongside the V2 fields, so older readers can still find what they expect.</p><p>Together:</p><pre><code><code>type StoredPayload = UserDocV1 &amp; UserDocV2;

const upcast = (doc: StoredPayload): UserDocV2 =&gt; ({
  profile: doc.profile ?? { name: doc.name },
  timestamps: {
    createdAt: new Date(doc.timestamps?.createdAt ?? doc.createdAt),
    lastLogin: new Date(doc.timestamps?.lastLogin ?? doc.lastLogin),
  },
});

const downcast = (doc: UserDocV2): StoredPayload =&gt; ({
  name: doc.profile.name,
  createdAt: doc.timestamps.createdAt.toISOString(),
  lastLogin: doc.timestamps.lastLogin.toISOString(),
  profile: doc.profile,
  timestamps: doc.timestamps,
});</code></code></pre><p>Look at the upcast: if the nested <em>profile</em> or <em>timestamps</em> fields exist (document was written by V2 code), it uses them. If they don&#8217;t exist (for the old V1 document), it falls back to the flat fields. One function handles both old and new documents: that&#8217;s backward compatibility.</p><p>And look at the downcast: it writes <em>name</em>, <em>createdAt</em>, <em>lastLogin</em> as flat string fields (V1 shape) alongside <em>profile</em> and <em>timestamps</em> (V2 shape). A service still reading V1 sees the flat fields and works fine. A service reading V2 sees the nested ones. That&#8217;s forward compatibility.</p><p>You wire both into the collection:</p><pre><code><code>const collection = pongoDb.collection&lt;UserDocV2, StoredPayload&gt;(
  'users',
  {
    schema: { versioning: { upcast, downcast } },
  },
);</code></code></pre><p>From here, your application code only deals with V2. The collection handles the translation in both directions:</p><pre><code><code>const v2Doc: UserDocV2 = {
  profile: { name: 'Alice' },
  timestamps: {
    createdAt: new Date('2024-01-15T10:30:00.000Z'),
    lastLogin: new Date('2024-06-20T14:45:00.000Z'),
  },
};

await collection.insertOne(v2Doc);</code></code></pre><p>What&#8217;s stored (downcasted, both shapes for compatibility):</p><pre><code><code>{
  "name": "Alice", 
  "createdAt": "2024-01-15T10:30:00.000Z",
  "lastLogin": "2024-06-20T14:45:00.000Z",
  "profile": { 
    "name": "Alice" 
  },
  "timestamps": { 
    "createdAt": "2024-01-15T10:30:00.000Z",
    "lastLogin": "2024-06-20T14:45:00.000Z" 
  } 
}</code></code></pre><p>Then you can read it back with:</p><pre><code><code>const doc = await collection.findOne({ ... });</code></code></pre><p>And get it back upcasted to V2 in your application code (the dates are printed here as ISO strings, but at this point they are Date objects):</p><pre><code><code>{
  "profile": { 
    "name": "Alice" 
  },
  "timestamps": { 
    "createdAt": "2024-01-15T10:30:00.000Z",
    "lastLogin": "2024-06-20T14:45:00.000Z" 
  } 
}</code></code></pre><p>Same collection, V1 and V2 documents coexisting. <em>insertMany</em>, <em>replaceOne</em>, <em>findOne</em>: all go through the upcast/downcast. No batch migration needed. You roll out the new code, and old documents are handled transparently.</p><p>There&#8217;s another thing the downcast gives you: querying remains backwards-compatible. Because the downcast writes the flat V1 fields alongside the nested V2 ones, a query like <em>collection.findOne({ name: &#8216;Alice&#8217; })</em> still works even though V2 code doesn&#8217;t use <em>name</em> directly anymore. The V1 field is there in the stored document. That matters if you have queries or indexes built against the old shape. They don&#8217;t break.</p><p>Now, for events, this matters even more. In event sourcing, stored events are immutable: the log is append-only, and you don&#8217;t modify what was already written. I wrote about <a href="https://event-driven.io/en/simple_events_versioning_patterns/">versioning patterns</a> in more detail. The core idea is: your business evolves, your code evolves, your event schemas evolve, but the events in the store stay as they were. You can&#8217;t go back and rewrite them (well, you can, but you really shouldn&#8217;t). Upcasting is how you bridge the gap.</p><p>For <a href="https://github.com/event-driven-io/emmett/pull/292">Emmett</a>, the same pattern works at the event store level. You define the stored shape (what JSON gives you from the database) and the application shape (what your code works with):</p><pre><code><code>type ShoppingCartOpenedFromDB = Event&lt;
  'ShoppingCartOpened',
  { openedAt: string; loyaltyPoints: string }
&gt;;

type ShoppingCartOpened = Event&lt;
  'ShoppingCartOpened',
  { openedAt: Date; loyaltyPoints: bigint }
&gt;;</code></code></pre><p>And an upcast that handles each event type:</p><pre><code><code>const upcast = (event: Event): ShoppingCartEventWithDatesAndBigInt =&gt; {
  switch (event.type) {
    case 'ShoppingCartOpened': {
      const e = event as ShoppingCartOpenedFromDB;
      return {
        ...e,
        data: {
          openedAt: new Date(e.data.openedAt),
          loyaltyPoints: BigInt(e.data.loyaltyPoints),
        },
      };
    }
    case 'ShoppingCartConfirmed': {
      const e = event as ShoppingCartConfirmedFromDB;
      return {
        ...e,
        data: {
          confirmedAt: new Date(e.data.confirmedAt),
          totalCents: BigInt(e.data.totalCents),
        },
      };
    }
    default:
      return event as ShoppingCartEventWithDatesAndBigInt;
  }
};</code></code></pre><p>You pass it when reading a stream:</p><pre><code><code>const { state } = await eventStore.aggregateStream&lt;
  ShoppingCartState,
  ShoppingCartEventWithDatesAndBigInt
&gt;(shoppingCartId, {
  evolve: evolveState,
  initialState,
  read: { schema: { versioning: { upcast } } },
});</code></code></pre><p>Or in a command handler:</p><pre><code><code>const handle = CommandHandler&lt;ShoppingCart, ShoppingCartEvent&gt;({
  evolve,
  initialState: () =&gt; ({ ... }),
  schema: { versioning: { upcast: upcastDatesAndBigInt } },
});</code></code></pre><p>The difference with events is that you can&#8217;t update them in place. For documents, you have both directions: upcast on read, downcast on write. For events, upcasting is the main tool because the event store is append-only. Old events stay as they were written. But downcasting has its place too.</p><p>Consider this: you have a projection or a subscriber that was built months ago against the old event schema. Maybe it&#8217;s a read model that listens to <em>ShoppingCartOpened</em> and expects <em>clientId</em> as a flat string. But your current code evolved. Now <em>ShoppingCartOpened</em> carries a <em>client</em> object with <em>id</em> and <em>name</em>:</p><pre><code><code>// What old subscribers expect
type ShoppingCartOpenedV1 = Event&lt;
  'ShoppingCartOpened',
  { clientId: string; openedAt: string }
&gt;;

// What current code produces
type ShoppingCartOpenedV2 = Event&lt;
  'ShoppingCartOpened',
  { client: { id: string; name: string }; openedAt: Date }
&gt;;</code></code></pre><p>Upcasting enables the current code to read older events, the ones stored with just <em>clientId</em>. Downcasting helps old subscribers consume new events. It transforms the new <em>client</em> object back into the flat <em>clientId</em> they expect. Same principle as with documents, but especially important here because event subscribers often live in separate services or deployments that you can&#8217;t update all at once.</p><p>And the same upcast function that started as simple type mapping (<em>string &#8594; Date</em>, <em>string &#8594; bigint</em>) handles this structural change too. You just add another case to the switch:</p><pre><code><code>case 'ShoppingCartOpened': {
  const e = event as ShoppingCartOpenedV1;
  return {
    ...e,
    data: {
      client: { id: e.data.clientId, name: 'Unknown' },
      openedAt: new Date(e.data.openedAt),
    },
  };
}</code></code></pre><p>Old events get the <em>client</em> object synthesised from the flat <em>clientId</em>. New events already have it. The evolve function only deals with the V2 shape.</p><p>And here&#8217;s where things started to click for me. I added upcasting to fix the bigint problem: explicit type mapping instead of a Regex. But the same mechanism, without any changes, also handles structural versioning. The simple <em>string &#8594; Date</em> mapping from the first example is the same code path as the <em>clientId &#8594; client</em> migration above. It&#8217;s one function, one place, one pattern for all of it: type coercion, field restructuring, schema migration.</p><h2>Right decisions stack</h2><p>Right decisions stack. The Regex hack was blocking the slot where upcasting should have been all along. Once I removed it, the performance got fixed, and I got schema versioning on top. One fix created room for the next one, which created room for the next. That doesn&#8217;t happen when you keep patching around the same bad decision.</p><p>Looking back, maybe the Regex wasn&#8217;t the wrong first move. The rule is <em>&#8220;make it work, make it right, make it fast&#8221;</em>, in that order. The Regex made it work. It had performance problems, but it still let me ship and learn where the real problem was. If I had tried to design the upcast/downcast system from scratch, without having lived with the Regex for a while, I might have over-engineered it or missed the connection to schema versioning entirely. The understanding came from living with the shortcut.</p><p>Dawid raised a performance issue with Pongo projections, but the same Regex was running in Emmett as well. I could have fixed it in one place and called it a day. Instead, I used it as a push to do the thing I&#8217;d been planning anyway and applied it to both Pongo and Emmett to keep things consistent. 
Because I already understood the problem well enough, &#8220;make it right&#8221; turned out easier than I expected.</p><p>You can recover from shortcuts. You should. But you also shouldn&#8217;t be afraid to take them in the first place, as long as you come back and do it properly.</p><p>Full changes:</p><ul><li><p><a href="https://github.com/event-driven-io/Pongo/pull/149">Pongo PR #149</a>,</p></li><li><p><a href="https://github.com/event-driven-io/emmett/pull/292">Emmett PR #292</a>.</p></li></ul><p>Cheers!</p><p>Oskar</p><p>p.s. <strong>Ukraine is still under brutal Russian invasion. A lot of Ukrainian people are hurt, without shelter and need help.</strong> You can help in various ways, for instance, directly helping refugees, spreading awareness, putting pressure on your local government or companies. You can also support Ukraine by donating e.g. to <a href="https://www.icrc.org/en/donate/ukraine">Red Cross</a>, <a href="https://savelife.in.ua/en/donate/">Ukraine humanitarian organisation</a> or <a href="https://www.gofundme.com/f/help-to-save-the-lives-of-civilians-in-a-war-zone">donate Ambulances for Ukraine</a>.</p>]]></content:encoded></item><item><title><![CDATA[How I cheated on transactions]]></title><description><![CDATA[Or how to make tradeoffs based on my Cloudflare D1 support]]></description><link>https://www.architecture-weekly.com/p/how-i-cheated-on-transactions</link><guid isPermaLink="false">https://www.architecture-weekly.com/p/how-i-cheated-on-transactions</guid><dc:creator><![CDATA[Oskar Dudycz]]></dc:creator><pubDate>Mon, 16 Feb 2026 16:53:30 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!rs0c!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff316a45c-df05-42c2-951e-c71583545c96_800x500.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>We&#8217;re being told that software design is the art of making tradeoffs. But... 
Are we taught how to make them?</p><p>Not that it&#8217;s easy to teach tradeoffs, it&#8217;s a subtle art of explanation. You need to provide enough context and be precise so others don&#8217;t treat tradeoffs as a general best practice. Because they&#8217;re usually not such, they typically come from the muddy banks of the Wishkah River.</p><p>I think that a decent way is to tell the story. Not the fairy tale, but the specific tradeoff applied in practice. That&#8217;s what I&#8217;m going to do today: tell you how I cheated on database transactions.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rs0c!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff316a45c-df05-42c2-951e-c71583545c96_800x500.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rs0c!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff316a45c-df05-42c2-951e-c71583545c96_800x500.png 424w, https://substackcdn.com/image/fetch/$s_!rs0c!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff316a45c-df05-42c2-951e-c71583545c96_800x500.png 848w, https://substackcdn.com/image/fetch/$s_!rs0c!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff316a45c-df05-42c2-951e-c71583545c96_800x500.png 1272w, https://substackcdn.com/image/fetch/$s_!rs0c!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff316a45c-df05-42c2-951e-c71583545c96_800x500.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!rs0c!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff316a45c-df05-42c2-951e-c71583545c96_800x500.png" width="800" height="500" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f316a45c-df05-42c2-951e-c71583545c96_800x500.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:500,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:173338,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.architecture-weekly.com/i/188131828?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff316a45c-df05-42c2-951e-c71583545c96_800x500.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!rs0c!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff316a45c-df05-42c2-951e-c71583545c96_800x500.png 424w, https://substackcdn.com/image/fetch/$s_!rs0c!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff316a45c-df05-42c2-951e-c71583545c96_800x500.png 848w, https://substackcdn.com/image/fetch/$s_!rs0c!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff316a45c-df05-42c2-951e-c71583545c96_800x500.png 1272w, https://substackcdn.com/image/fetch/$s_!rs0c!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff316a45c-df05-42c2-951e-c71583545c96_800x500.png 1456w" 
sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><h2><strong>Dumbo</strong></h2><p>Do you know Dumbo? Of course, flying elephant, such a sweetie. It&#8217;s also a codename of my Open Source project. 
I haven&#8217;t told you about it so far, as it&#8217;s a shared dependency for <a href="https://github.com/event-driven-io/pongo">Pongo</a> and <a href="https://event-driven-io.github.io/emmett/getting-started.html">Emmett</a>, responsible for:</p><ul><li><p>connection pooling,</p></li><li><p>safe connection lifetime management,</p></li><li><p>handling transactions,</p></li><li><p>migrations,</p></li><li><p>SQL execution.</p></li></ul><p>Not that small a scope for a Dumb tool, aye? Still, the intention is to make usage dumb, hiding the weirdness of a specific SQL database, so I solve it once and, instead of being constantly distracted by it, can focus on:</p><ul><li><p>How to append and process events the best way in Emmett,</p></li><li><p>How to make the best JSON handling and translation into specific SQL dialects in Pongo.</p></li></ul><p>Dumbo usage is quite simple:</p><pre><code><code>import { dumbo } from '@event-driven-io/dumbo';
import { pgDumboDriver } from '@event-driven-io/dumbo/pg';

// Assumption (not taken from Dumbo's docs): the connection string is read from
// an environment variable, with a hypothetical local fallback for development.
const connectionString =
  process.env.DATABASE_URL ?? 'postgresql://localhost:5432/postgres';

const pool = dumbo({ connectionString, driver: pgDumboDriver });</code></code></pre><p>You need to set up a specific database driver (e.g. <em>pg</em> for PostgreSQL or <em>sqlite3</em> for SQLite), as Dumbo now supports multiple relational databases.</p><p>Having that, you can do stuff like:</p><pre><code><code>import { SQL } from '@event-driven-io/dumbo';

await pool.execute.batchCommand([
  SQL`CREATE TABLE test_users (id SERIAL PRIMARY KEY, name TEXT)`,
  SQL`INSERT INTO test_users (name) VALUES ('Alice'), ('Bob')`,
]);</code></code></pre><p>And also do queries:</p><pre><code><code>const count = await pool.execute.query&lt;{count: number}&gt;(
  SQL`SELECT COUNT(*) as count 
      FROM test_users 
      WHERE ${SQL.in('id', userIds)}`,
);</code></code></pre><p>It&#8217;ll handle query parameterisation, data escaping, etc.</p><p>It can also handle transactions:</p><pre><code><code>const users = await pool.withTransaction(async (tx) =&gt; {
  await tx.execute.command(
    SQL`INSERT INTO test_users (name) 
        VALUES (${firstUserName}), (${secondUserName})`,
  );

  return tx.execute.query&lt;User&gt;(
    SQL`SELECT *
        FROM test_users 
        WHERE ${SQL.in('id', userIds)}`,
  );
});</code></code></pre><p>As simple as this looks, the first tradeoff I made should already have raised a question in your head.</p><p><strong>Why on earth do I think it&#8217;s a sane move to write my own multi-database driver?!</strong></p><p>Well, I agree, that&#8217;s not the best move at first glance, but let me explain to you why in <strong>MY</strong> context this actually makes more sense:</p><ol><li><p>Yes, there are tools like Knex, Kysely, Drizzle, etc. in Node.js land that handle similar stuff. They&#8217;re nice, I really like them, I really do, but&#8230; But they are all big and bring a lot of their own conventions, and when I&#8217;m building storage tools like Emmett and Pongo, I need to have more control. I don&#8217;t want those tools and their limitations to drive my architecture decisions. I also don&#8217;t want to be surprised when a creator decides to stop working on one of them, or when I become a victim of a supply chain attack. Still, when I want a new kitchen table, I don&#8217;t start by going to the forest with a saw. I&#8217;m still using existing database drivers; what&#8217;s more, I&#8217;m allowing people to choose the one they prefer. I just don&#8217;t want to be driven (or have my users driven) by some other higher abstraction tool.</p></li><li><p>Even with that, this decision can seem like a bold &#8220;how hard can it be?!&#8221; statement. And it is, but&#8230; But I&#8217;ve built, or co-authored, such tools in the past. Some were proprietary, some were Open Sourced (see <a href="https://github.com/JasperFx/weasel">Weasel</a>). And they served me well.</p></li><li><p>And last but not least, the last point. Well, I only had those two above. 
OK, I can add that I didn&#8217;t plan to make it a general-usage tool, just a small one for my needs.</p></li></ol><p>So I did.</p><p>And here we are, at the moment when I had to cheat on transactions because of those decisions.</p><h2>Cloudflare D1</h2><p>All relational databases seem similar, but only until you start using them extensively, or until you need to write your own storage library. Then you learn stuff that&#8217;s not necessarily the stuff you&#8217;d like to spend time on. For instance:</p><ul><li><p>what&#8217;s the difference between databases like PostgreSQL and single-threaded databases like SQLite or DuckDB, and how concurrent processing can be surprising,</p></li><li><p>or that sqlite3 only runs the first statement and silently ignores the rest,</p></li><li><p>etc.</p></li></ul><p>Still, well, abstractions like Dumbo make it possible to <em>massage</em> such cases behind the scenes.</p><p>And then Cloud Databases and SaaS Databases came into play, like Cloudflare D1.</p><p>I was motivated by the generous sponsorship from <a href="https://www.linkedin.com/in/samhatoum">Sam Hatoum</a> to make supporting D1 in Emmett and Pongo a top priority. Thanks Sam, appreciate that! And I made it, but I had to cheat a bit.</p><p>Services like Cloudflare D1 and Supabase expose databases as pay-as-you-go services. They optimise deployment, making it highly scalable, so you don&#8217;t need to care about it. It&#8217;s cost-effective for start-ups and scenarios when you&#8217;re not under a huge load. If you are, then you&#8217;ll need to pay more, but at least they also give you autoscaling.</p><p>They do it by exposing the database through an HTTP API, for instance <a href="https://docs.postgrest.org/en/v14/">PostgREST</a>. This gives them easier management around throughput, security, etc., as each call is made as an HTTP request through the exposed API.</p><p>Yet, that kills some options like: ekhm, transactions. 
The challenge with transactions is that they don&#8217;t scale (that&#8217;s why <a href="https://www.youtube.com/watch?v=b2F-DItXtZs">MongoDB is WebScale</a>). They don&#8217;t scale, as you&#8217;d need to open a transaction, do some freehand operations, and then commit or roll back. That means (typically) you need to keep a connection open during that time. If you&#8217;re building SaaS, it&#8217;s a no-go, because sneaky users would open it for a few hours, do crazy stuff, and kill your SaaS resources&#8217; utilisation.</p><p>But, boy, aren&#8217;t transactions one of the selling points of relational databases? They are, so how to proceed?</p><p><strong>Then you need to cheat, as I did.</strong></p><h2>Repeatable reads, batches, etc.</h2><p>Cloudflare D1 doesn&#8217;t provide transactions for the reasons I outlined, but it doesn&#8217;t leave us without other tools. The two tools that let me do the smoke-and-mirrors trick are:</p><ul><li><p>sessions,</p></li><li><p>SQL batches.</p></li></ul><p>What are sessions? <a href="https://developers.cloudflare.com/d1/best-practices/read-replication">Per Cloudflare Docs</a>:</p><blockquote><p>A session encapsulates all the queries from one logical session for your application. For example, a session may correspond to all queries coming from a particular web browser session. All queries within a session read from a database instance which is as up-to-date as your query needs it to be. Sessions API ensures <a href="https://developers.cloudflare.com/d1/best-practices/read-replication/#replica-lag-and-consistency-model">sequential consistency</a> for all queries in a session.</p></blockquote><p>Essentially, that means that we&#8217;re getting <a href="https://jepsen.io/consistency/models/repeatable-read">repeatable reads</a>. So when we start a session, the queries within it will be handled sequentially.</p><p><strong>And now, the second ingredient: Batches. 
Per <a href="https://developers.cloudflare.com/d1/worker-api/d1-database/#batch">Cloudflare docs</a></strong></p><blockquote><p>Sends multiple SQL statements inside a single call to the database. This can have a huge performance impact as it reduces latency from network round trips to D1. D1 operates in auto-commit. Our implementation guarantees that each statement in the list will execute and commit, sequentially, non-concurrently.</p><p>Batched statements are <a href="https://www.sqlite.org/lang_transaction.html">SQL transactions</a>. If a statement in the sequence fails, then an error is returned for that specific statement, and it aborts or rolls back the entire sequence.</p><p>To send batch statements, provide D1Database::batch a list of prepared statements and get the results in the same order.</p></blockquote><p>So Cloudflare didn&#8217;t allow us to do the full, freehand transaction, but they allowed us to send multiple statements that internally will be executed as a SQLite Transaction. When the request is handled, it&#8217;ll open a transaction, run the statements, and return the results.</p><p><strong>Cool, let&#8217;s mix this soup together, as having that, I decided to:</strong></p><ol><li><p>Fail automatically if someone tries to create a transaction on Cloudflare D1 with an error:</p></li></ol><blockquote><p>D1 does not support SQL transactions (BEGIN/COMMIT/ROLLBACK/SAVEPOINT). Use { mode: &#8220;session_based&#8221; } to opt-in to session+batch semantics, or use &#8216;connection.execute.batchCommand() for atomic multi-statement execution.</p></blockquote><p>Then they get clear information on the first try.</p><ol start="2"><li><p>Allow users to explicitly open a session-based transaction by providing mode:</p></li></ol><pre><code><code>const users = await pool.withTransaction(
  async (tx) =&gt; {
    await tx.execute.command(
      SQL`INSERT INTO test_users (name) 
      VALUES (${firstUserName}), (${secondUserName})`,
    );

    return tx.execute.query&lt;User&gt;(
      SQL`SELECT *
              FROM test_users 
              WHERE ${SQL.in('id', userIds)}`,
    );
  },  
  { mode: 'session_based' }
);</code></code></pre><p>When they do it, they will need to be aware of the limitations of the tool they have. This will internally create a D1 session and only handle a single batch of operations properly. We&#8217;re mimicking sequential processing with the session-based repeatable reads capability. Still, we need to remember that we won&#8217;t be able to roll back changes across multiple statements. Only a single command or batch command is an atomic operation.</p><p>We can&#8217;t, for instance, run a batch of updates, and fail the whole batch if one update didn&#8217;t change any record. The batch will only fail if the database throws an exception. In SQLite, an exception can only be raised by a table constraint or a trigger.</p><p>By requiring an explicit mode and naming it explicitly, I didn&#8217;t manage to cover all cases, but at least I made it safe: people need to learn about this non-default behaviour and be more careful about it. When designing an API, it&#8217;s usually better to start with a stricter option and do a bit of <em>&#8220;scarification&#8221;</em> sometimes.</p><p>Still, when I implemented that in <a href="https://event-driven-io.github.io/emmett/getting-started.html">Emmett</a> and <a href="https://github.com/event-driven-io/pongo">Pongo</a>, I intentionally used it to enable event appends, but document operations, etc.</p><p>If you&#8217;d like to try it, you can check Emmett&#8217;s or Pongo&#8217;s beta versions.</p><p>For Pongo, you can install it with:</p><pre><code><code>npm install @event-driven-io/pongo@0.17.0-beta.21</code></code></pre><p>And use it as:</p><pre><code><code>import { d1PongoDriver } from '@event-driven-io/pongo/cloudflare';
import { pongoClient } from '@event-driven-io/pongo';

const client = pongoClient({
  driver: d1PongoDriver,
  database,
  transactionOptions: { mode: 'session_based' },
});</code></code></pre><p>Or in Emmett by installing:</p><pre><code><code>npm install @event-driven-io/emmett-sqlite@0.43.0-beta.1</code></code></pre><p>And use it as:</p><pre><code><code>import { getSQLiteEventStore } from '@event-driven-io/emmett-sqlite';
import { d1EventStoreDriver } from '@event-driven-io/emmett-sqlite/cloudflare';

const eventStore = getSQLiteEventStore({
  driver: d1EventStoreDriver,
  database,
});</code></code></pre><p>Still, even if you don&#8217;t care about Emmett, Pongo, and my Open Source projects, I hope that this will give you decent inspiration for your own tradeoff analysis.</p><p>I hope that you learned a bit about how designing APIs works, how to check the guarantees of your tools, and how to work around them when you have to.</p><p>Cheers!</p><p>Oskar</p><p>p.s. <strong>Ukraine is still under brutal Russian invasion. A lot of Ukrainian people are hurt, without shelter and need help.</strong> You can help in various ways, for instance, directly helping refugees, spreading awareness, putting pressure on your local government or companies. You can also support Ukraine by donating e.g. to <a href="https://www.icrc.org/en/donate/ukraine">Red Cross</a>, <a href="https://savelife.in.ua/en/donate/">Ukraine humanitarian organisation</a> or <a href="https://www.gofundme.com/f/help-to-save-the-lives-of-civilians-in-a-war-zone">donate Ambulances for Ukraine</a>.</p>]]></content:encoded></item><item><title><![CDATA[On rebuilding read models, Dead-Letter Queues and Why Letting Go is Sometimes the Answer]]></title><description><![CDATA[In the last article, I explained how to rebuild Event-Driven Read Models in a safe and resilient way. 
I asked readers to let me know if they find any blind spots in my design.]]></description><link>https://www.architecture-weekly.com/p/on-rebuilding-read-models-dead-letter</link><guid isPermaLink="false">https://www.architecture-weekly.com/p/on-rebuilding-read-models-dead-letter</guid><dc:creator><![CDATA[Oskar Dudycz]]></dc:creator><pubDate>Mon, 19 Jan 2026 18:01:47 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!jBdJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F247ed8c5-16a9-4d5a-93cd-01077982a5a7_800x500.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jBdJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F247ed8c5-16a9-4d5a-93cd-01077982a5a7_800x500.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jBdJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F247ed8c5-16a9-4d5a-93cd-01077982a5a7_800x500.png 424w, https://substackcdn.com/image/fetch/$s_!jBdJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F247ed8c5-16a9-4d5a-93cd-01077982a5a7_800x500.png 848w, https://substackcdn.com/image/fetch/$s_!jBdJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F247ed8c5-16a9-4d5a-93cd-01077982a5a7_800x500.png 1272w, https://substackcdn.com/image/fetch/$s_!jBdJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F247ed8c5-16a9-4d5a-93cd-01077982a5a7_800x500.png 1456w" 
sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jBdJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F247ed8c5-16a9-4d5a-93cd-01077982a5a7_800x500.png" width="800" height="500" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/247ed8c5-16a9-4d5a-93cd-01077982a5a7_800x500.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:500,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;cover&quot;,&quot;title&quot;:&quot;cover&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="cover" title="cover" srcset="https://substackcdn.com/image/fetch/$s_!jBdJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F247ed8c5-16a9-4d5a-93cd-01077982a5a7_800x500.png 424w, https://substackcdn.com/image/fetch/$s_!jBdJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F247ed8c5-16a9-4d5a-93cd-01077982a5a7_800x500.png 848w, https://substackcdn.com/image/fetch/$s_!jBdJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F247ed8c5-16a9-4d5a-93cd-01077982a5a7_800x500.png 1272w, https://substackcdn.com/image/fetch/$s_!jBdJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F247ed8c5-16a9-4d5a-93cd-01077982a5a7_800x500.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button 
tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>In the last article, I explained <a href="https://event-driven.io/en/rebuilding_event_driven_read_models/">how to rebuild Event-Driven Read Models in a safe and resilient way</a>. I asked readers to let me know if they find any blind spots in my design.</p><p>Well, I found one myself.</p><p>This article is about that edge case, but more importantly, it&#8217;s about the rabbit hole I went down thinking about how to &#8220;fix&#8221; it. At the end of this hole, there is a nice learning about dealing with distributed systems. Sometimes the best engineering isn&#8217;t about preventing all failures. 
It&#8217;s about recognising your blind spots and making sure they don&#8217;t catch you off guard.</p><h2>Everyone has a plan until they get punched in the mouth</h2><p>Let me recap the situation described in <a href="https://event-driven.io/en/rebuilding_event_driven_read_models/">the previous article</a>. We&#8217;re using PostgreSQL to store events and read models. We want to rebuild an inline projection, one that&#8217;s applied in the same transaction as the appended event.</p><p>In the design from my previous article, during a rebuild, we:</p><ol><li><p>Mark the projection as &#8220;rebuilding&#8221;.</p></li><li><p>Skip inline projections (the rebuild process will catch up anyway).</p></li><li><p>Process all historical events.</p></li><li><p>Mark the projection as &#8220;active&#8221; again.</p></li></ol><p>The hybrid locking strategy with advisory locks and status checks ensures that inline projections know when to skip and that only one rebuild runs at a time.</p><p>Sounds solid. 
Here&#8217;s where it breaks:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!eFXU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9438d06-2f76-4cf8-bb2d-12db61049962_800x548.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eFXU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9438d06-2f76-4cf8-bb2d-12db61049962_800x548.png 424w, https://substackcdn.com/image/fetch/$s_!eFXU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9438d06-2f76-4cf8-bb2d-12db61049962_800x548.png 848w, https://substackcdn.com/image/fetch/$s_!eFXU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9438d06-2f76-4cf8-bb2d-12db61049962_800x548.png 1272w, https://substackcdn.com/image/fetch/$s_!eFXU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9438d06-2f76-4cf8-bb2d-12db61049962_800x548.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!eFXU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9438d06-2f76-4cf8-bb2d-12db61049962_800x548.png" width="800" height="548" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b9438d06-2f76-4cf8-bb2d-12db61049962_800x548.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:548,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;mermaid1&quot;,&quot;title&quot;:&quot;mermaid1&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="mermaid1" title="mermaid1" srcset="https://substackcdn.com/image/fetch/$s_!eFXU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9438d06-2f76-4cf8-bb2d-12db61049962_800x548.png 424w, https://substackcdn.com/image/fetch/$s_!eFXU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9438d06-2f76-4cf8-bb2d-12db61049962_800x548.png 848w, https://substackcdn.com/image/fetch/$s_!eFXU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9438d06-2f76-4cf8-bb2d-12db61049962_800x548.png 1272w, https://substackcdn.com/image/fetch/$s_!eFXU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9438d06-2f76-4cf8-bb2d-12db61049962_800x548.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The rebuild process checks for new events, sees none (because event 1000 is still in an uncommitted transaction), declares victory, and sets the projection to active. Meanwhile, the append transaction has already decided to skip the inline projection. When it finally commits, the event exists but was never projected. The rebuild process is done, so the projection will only be updated again when the next new event is appended.</p><p>We could handwave it away with <a href="https://event-driven.io/en/no_it_can_never_happen/">it won&#8217;t ever happen!</a>. But, well, under load, this will happen eventually. You might not even notice it, but it will. The timing window is small, but with enough throughput, it becomes a certainty.</p><h2>The Rabbit Hole of &#8220;Fixes&#8221;</h2><p>My first instinct was to engineer my way out of this. 
Surely with enough clever coordination, we can close this gap?</p><p>Let me walk you through the rabbit hole.</p><h3>Attempt 1: Wait for In-Flight Transactions</h3><p>PostgreSQL provides <em>pg_snapshot_xmin(pg_current_snapshot())</em> which tells us the oldest transaction that&#8217;s still running. The idea: after processing events, wait until all potentially-skipping transactions have completed before marking the projection as active.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NO-_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bd1f3e2-25af-4609-804c-220315d4a021_800x418.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NO-_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bd1f3e2-25af-4609-804c-220315d4a021_800x418.png 424w, https://substackcdn.com/image/fetch/$s_!NO-_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bd1f3e2-25af-4609-804c-220315d4a021_800x418.png 848w, https://substackcdn.com/image/fetch/$s_!NO-_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bd1f3e2-25af-4609-804c-220315d4a021_800x418.png 1272w, https://substackcdn.com/image/fetch/$s_!NO-_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bd1f3e2-25af-4609-804c-220315d4a021_800x418.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!NO-_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bd1f3e2-25af-4609-804c-220315d4a021_800x418.png" width="800" height="418" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4bd1f3e2-25af-4609-804c-220315d4a021_800x418.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:418,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;mermaid2&quot;,&quot;title&quot;:&quot;mermaid2&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="mermaid2" title="mermaid2" srcset="https://substackcdn.com/image/fetch/$s_!NO-_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bd1f3e2-25af-4609-804c-220315d4a021_800x418.png 424w, https://substackcdn.com/image/fetch/$s_!NO-_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bd1f3e2-25af-4609-804c-220315d4a021_800x418.png 848w, https://substackcdn.com/image/fetch/$s_!NO-_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bd1f3e2-25af-4609-804c-220315d4a021_800x418.png 1272w, https://substackcdn.com/image/fetch/$s_!NO-_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bd1f3e2-25af-4609-804c-220315d4a021_800x418.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" 
type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Why it fails:</strong> We don&#8217;t know what we&#8217;re waiting for. While we wait for existing transactions to complete, new transactions start and make their own skip decisions. The target keeps moving. We can never &#8220;catch up&#8221; because new skips happen while we&#8217;re waiting for old ones to become visible.</p><h3>Attempt 2: Use Transaction IDs as Boundaries</h3><p>PostgreSQL assigns monotonic transaction IDs to each transaction when it starts. 
This seems useful: what if we record a &#8220;sealing&#8221; transaction ID when we&#8217;re ready to complete the rebuild, and use it to make decisions?</p><p>The logic would be:</p><ul><li><p>When the rebuild is ready to complete, record <em>sealing_txid = current_transaction_id</em>.</p></li><li><p>Any event from a transaction with an ID lower than the sealing point was &#8220;in flight&#8221; during the rebuild, so it should be handled by async processing.</p></li><li><p>Events from transactions with higher IDs started after sealing, so they can safely use inline projections (the read model will be ready).</p></li></ul><p>Sounds reasonable?</p><p>Here&#8217;s why it doesn&#8217;t work.</p><p><strong>The core problem: transaction ID order &#8800; commit order.</strong></p><p>When a transaction starts, PostgreSQL assigns it the next available transaction ID. But transactions don&#8217;t commit in the order they started. A transaction that started earlier (lower ID) might take longer to complete and commit after a transaction that started later (higher ID).</p><p>Let&#8217;s trace through a concrete scenario:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xuir!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74809f4-37b6-43c5-af79-4b1bbb24c5ec_800x688.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xuir!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74809f4-37b6-43c5-af79-4b1bbb24c5ec_800x688.png 424w, https://substackcdn.com/image/fetch/$s_!xuir!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74809f4-37b6-43c5-af79-4b1bbb24c5ec_800x688.png 848w, 
https://substackcdn.com/image/fetch/$s_!xuir!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74809f4-37b6-43c5-af79-4b1bbb24c5ec_800x688.png 1272w, https://substackcdn.com/image/fetch/$s_!xuir!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74809f4-37b6-43c5-af79-4b1bbb24c5ec_800x688.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xuir!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74809f4-37b6-43c5-af79-4b1bbb24c5ec_800x688.png" width="800" height="688" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e74809f4-37b6-43c5-af79-4b1bbb24c5ec_800x688.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:688,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;mermaid3&quot;,&quot;title&quot;:&quot;mermaid3&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="mermaid3" title="mermaid3" srcset="https://substackcdn.com/image/fetch/$s_!xuir!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74809f4-37b6-43c5-af79-4b1bbb24c5ec_800x688.png 424w, https://substackcdn.com/image/fetch/$s_!xuir!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74809f4-37b6-43c5-af79-4b1bbb24c5ec_800x688.png 848w, 
https://substackcdn.com/image/fetch/$s_!xuir!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74809f4-37b6-43c5-af79-4b1bbb24c5ec_800x688.png 1272w, https://substackcdn.com/image/fetch/$s_!xuir!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe74809f4-37b6-43c5-af79-4b1bbb24c5ec_800x688.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Here&#8217;s what happened:</p><ol><li><p><strong>Tx 99</strong> started first and got the lowest transaction ID. 
It inserted an event and decided to skip the inline projection (status was &#8216;rebuilding&#8217;). But then it got slow&#8212;maybe network latency, maybe the application did other work.</p></li><li><p><strong>Tx 100</strong> (the rebuild) started second, recorded <em>sealing_txid = 100</em>, and prepared to complete.</p></li><li><p><strong>Tx 101</strong> started third. It checked: &#8220;Is my transaction ID (101) &gt;= sealing_txid (100)?&#8221; Yes, so it assumed the read model was ready and processed its inline projection. It committed successfully.</p></li><li><p><strong>Tx 100</strong> marked the projection as active and committed.</p></li><li><p><strong>Tx 99</strong> finally committed. But it had already decided to skip the projection back when status was &#8216;rebuilding&#8217;. That decision was made, the skip happened, and the event is now missing from the read model.</p></li></ol><p>The fundamental issue: <strong>we can&#8217;t see uncommitted transactions.</strong> When Tx 100 set the sealing point, it had no way to know that Tx 99 was still out there, holding an event that would skip projection. Transaction 99 is invisible until it commits, but by then it&#8217;s too late.</p><p>You might think: &#8220;Just wait until all transactions before the sealing point have committed!&#8221; PostgreSQL even provides <em>pg_snapshot_xmin(pg_current_snapshot())</em> which tells you the oldest active transaction. But this leads us back to Attempt 1&#8212;while we wait, new transactions start and make their own skip decisions. The target keeps moving.</p><p>I wrote about this exact problem in <a href="https://event-driven.io/en/ordering_in_postgres_outbox/">How Postgres sequences issues can impact your messaging guarantees</a>. The same visibility challenges that affect outbox ordering apply here. 
Transaction IDs are useful for ordering within committed data, but they can&#8217;t help us coordinate with transactions that haven&#8217;t committed yet.</p><h3>Attempt 3: Lock Appends During Transition</h3><p>What if we use advisory locks more aggressively? The idea: when the rebuild is ready to complete, acquire an exclusive lock that blocks the entire append path. While holding this lock, flip the status to &#8216;active&#8217;. No appends can be in progress during the flip, so no race condition, right?</p><p>Here&#8217;s the proposed flow:</p><ul><li><p>Rebuild finishes processing historical events</p></li><li><p>Acquire exclusive advisory lock (all new appends must wait)</p></li><li><p>Set status = &#8216;active&#8217;</p></li><li><p>Release lock</p></li><li><p>Waiting appends resume, see &#8216;active&#8217;, and process their inline projections normally</p></li></ul><p>It should work. In theory. We&#8217;re creating a synchronisation point where nothing can slip through. But there&#8217;s a subtle problem: <strong>the skip decision was already made inside the append transaction.</strong></p><p>An append transaction doesn&#8217;t just check the lock and status at one instant. It does several things:</p><ol><li><p>BEGIN transaction</p></li><li><p>INSERT the event</p></li><li><p>Try to acquire a shared advisory lock</p></li><li><p>Check projection status</p></li><li><p>Decide: process inline or skip?</p></li><li><p>Execute that decision</p></li><li><p>COMMIT</p></li></ol><p>The decision in step 5 happens <em>inside</em> the transaction. If the transaction saw the &#8216;rebuilding&#8217; status, it chose to skip the projection. That decision is now part of the transaction&#8217;s pending work. The transaction might be waiting to commit or doing other work, but the skip decision has already been made.</p><p>Our exclusive lock in the completion flow blocks step 3: new transactions can&#8217;t acquire the shared lock. 
But what about transactions that already passed step 5 and are just waiting to commit?</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mreF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43e69c73-b6c3-43d4-9b12-317eb5d3a66e_800x730.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mreF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43e69c73-b6c3-43d4-9b12-317eb5d3a66e_800x730.png 424w, https://substackcdn.com/image/fetch/$s_!mreF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43e69c73-b6c3-43d4-9b12-317eb5d3a66e_800x730.png 848w, https://substackcdn.com/image/fetch/$s_!mreF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43e69c73-b6c3-43d4-9b12-317eb5d3a66e_800x730.png 1272w, https://substackcdn.com/image/fetch/$s_!mreF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43e69c73-b6c3-43d4-9b12-317eb5d3a66e_800x730.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mreF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43e69c73-b6c3-43d4-9b12-317eb5d3a66e_800x730.png" width="800" height="730" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/43e69c73-b6c3-43d4-9b12-317eb5d3a66e_800x730.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:730,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;mermaid4&quot;,&quot;title&quot;:&quot;mermaid4&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="mermaid4" title="mermaid4" srcset="https://substackcdn.com/image/fetch/$s_!mreF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43e69c73-b6c3-43d4-9b12-317eb5d3a66e_800x730.png 424w, https://substackcdn.com/image/fetch/$s_!mreF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43e69c73-b6c3-43d4-9b12-317eb5d3a66e_800x730.png 848w, https://substackcdn.com/image/fetch/$s_!mreF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43e69c73-b6c3-43d4-9b12-317eb5d3a66e_800x730.png 1272w, https://substackcdn.com/image/fetch/$s_!mreF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43e69c73-b6c3-43d4-9b12-317eb5d3a66e_800x730.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The timing here is tricky. Advisory locks in PostgreSQL can be either:</p><ul><li><p><strong>Transaction-scoped</strong> (<em>pg_advisory_xact_lock</em>): released automatically when transaction commits</p></li><li><p><strong>Session-scoped</strong> (<em>pg_advisory_lock</em>): held until explicitly released or session ends</p></li></ul><p>If we use transaction-scoped locks for inline projections (which makes sense&#8212;you want the lock tied to the transaction lifetime), then the append transaction might have already released its shared lock by the time we try to acquire an exclusive lock. 
The lock protected the status check, but the transaction is still running with its skip decision already made.</p><p>Even if we could perfectly synchronise the lock acquisition, there&#8217;s another problem: <strong>advisory locks live only as long as the session or transaction that holds them; they are not durable.</strong></p><p>If the connection dies while holding the exclusive lock:</p><ul><li><p>The lock releases immediately (that&#8217;s how advisory locks work)</p></li><li><p>We might have partially updated the status</p></li><li><p>The system is now in an unknown state</p></li><li><p>Other transactions resume with potentially inconsistent data</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mBnK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ae43c21-aaa7-44eb-9c96-db54e695c00d_800x498.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mBnK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ae43c21-aaa7-44eb-9c96-db54e695c00d_800x498.png 424w, https://substackcdn.com/image/fetch/$s_!mBnK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ae43c21-aaa7-44eb-9c96-db54e695c00d_800x498.png 848w, https://substackcdn.com/image/fetch/$s_!mBnK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ae43c21-aaa7-44eb-9c96-db54e695c00d_800x498.png 1272w, https://substackcdn.com/image/fetch/$s_!mBnK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ae43c21-aaa7-44eb-9c96-db54e695c00d_800x498.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!mBnK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ae43c21-aaa7-44eb-9c96-db54e695c00d_800x498.png" width="800" height="498" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5ae43c21-aaa7-44eb-9c96-db54e695c00d_800x498.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:498,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;mermaid5&quot;,&quot;title&quot;:&quot;mermaid5&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="mermaid5" title="mermaid5" srcset="https://substackcdn.com/image/fetch/$s_!mBnK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ae43c21-aaa7-44eb-9c96-db54e695c00d_800x498.png 424w, https://substackcdn.com/image/fetch/$s_!mBnK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ae43c21-aaa7-44eb-9c96-db54e695c00d_800x498.png 848w, https://substackcdn.com/image/fetch/$s_!mBnK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ae43c21-aaa7-44eb-9c96-db54e695c00d_800x498.png 1272w, https://substackcdn.com/image/fetch/$s_!mBnK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ae43c21-aaa7-44eb-9c96-db54e695c00d_800x498.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" 
type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>We can&#8217;t build reliable coordination on something that vanishes when connections fail. This is exactly why the original article combined advisory locks with persistent status checks&#8212;but that combination doesn&#8217;t solve this particular race condition.</p><h3>TLDR on Attempts</h3><p>Every &#8220;fix&#8221; follows the same pattern:</p><ol><li><p>Identify a coordination point.</p></li><li><p>Discover there&#8217;s a window before that point we can&#8217;t see.</p></li><li><p>Try to close that window.</p></li><li><p>Create a new window somewhere else.</p></li><li><p>Repeat.</p></li></ol><p>We&#8217;re not solving the problem. We&#8217;re relocating it.</p><p>So is it a bug in the initial design? 
It depends on how you look at it. I see it as an example of a fundamental property of concurrent systems. PostgreSQL&#8217;s isolation guarantees mean uncommitted transactions are invisible to other transactions. That&#8217;s a feature, not a bug, but it means there&#8217;s always a window we can&#8217;t see into.</p><h2>Stop Fighting, Start Tracking</h2><p>After chasing my tail through various &#8220;solutions,&#8221; I stepped back and asked a different question.</p><p>Instead of: <em>&#8220;How do we prevent events from being skipped?&#8221;</em> (which requires blocking appends or seeing into uncommitted transactions, both unacceptable for performance and for the guarantees we want to keep)</p><p>I started asking: <em>&#8220;How do we know when an event was skipped?&#8221;</em> and <em>&#8220;How do we ensure skipped events get processed eventually?&#8221;</em></p><p>Both of these are solvable.</p><p>If we can&#8217;t prevent skips from happening, let&#8217;s make them visible. When an inline projection skips an event, it could record that skip in the same transaction as the event append.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wdhL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5513455f-be66-4a42-990f-f0e31ef00583_800x573.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wdhL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5513455f-be66-4a42-990f-f0e31ef00583_800x573.png 424w, https://substackcdn.com/image/fetch/$s_!wdhL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5513455f-be66-4a42-990f-f0e31ef00583_800x573.png 848w, 
https://substackcdn.com/image/fetch/$s_!wdhL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5513455f-be66-4a42-990f-f0e31ef00583_800x573.png 1272w, https://substackcdn.com/image/fetch/$s_!wdhL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5513455f-be66-4a42-990f-f0e31ef00583_800x573.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wdhL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5513455f-be66-4a42-990f-f0e31ef00583_800x573.png" width="800" height="573" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5513455f-be66-4a42-990f-f0e31ef00583_800x573.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:573,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;mermaid6&quot;,&quot;title&quot;:&quot;mermaid6&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="mermaid6" title="mermaid6" srcset="https://substackcdn.com/image/fetch/$s_!wdhL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5513455f-be66-4a42-990f-f0e31ef00583_800x573.png 424w, https://substackcdn.com/image/fetch/$s_!wdhL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5513455f-be66-4a42-990f-f0e31ef00583_800x573.png 848w, 
https://substackcdn.com/image/fetch/$s_!wdhL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5513455f-be66-4a42-990f-f0e31ef00583_800x573.png 1272w, https://substackcdn.com/image/fetch/$s_!wdhL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5513455f-be66-4a42-990f-f0e31ef00583_800x573.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>If the append transaction rolls back, the skip record also rolls back (there&#8217;s no event to worry about). 
If it commits, we have a durable record of exactly what was skipped.</p><p>In Emmett, I&#8217;m using a dedicated <em>emt_system_messages</em> table rather than reusing the regular <em>emt_messages</em> table or creating a simple &#8220;skipped events&#8221; table. This might seem like over-engineering&#8212;why not just create a simple table with projection ID and event position? Or why not just reuse the existing messages table?</p><p><strong>Why a dedicated system messages table?</strong></p><p>The regular <em>emt_messages</em> table is for business events&#8212;the actual domain events that drive your application. Mixing system-level concerns (like &#8220;this projection skipped this event during rebuild&#8221;) with business events pollutes the event log and makes it harder to reason about. A simple &#8220;skipped_events&#8221; table with just (projection_id, event_position) could be enough, but in the long term, it may be hard to maintain. As systems evolve, we&#8217;ll have more cases where we might want to skip events. Plus, having system events recorded with all metadata gives us full observability of the internals of our system. We could even make tables for projections and processors&#8217; status, built as read models from system events! Still, let&#8217;s hold our horses and get back to our use case.</p><p>The system messages table could mirror the structure of regular messages with:</p><ul><li><p>Global position sequencing (with all the transaction visibility handling from the <a href="https://event-driven.io/en/ordering_in_postgres_outbox/">outbox ordering article</a>)</p></li><li><p>Transaction ID tracking for proper visibility checks</p></li><li><p>Archiving support via <em>is_archived</em> flag</p></li><li><p>Partitioning for performance</p></li></ul><p>If we created a throwaway &#8220;skipped events&#8221; table, we&#8217;d need to solve all these problems again. 
We&#8217;d essentially be building a second event log with the same guarantees.</p><p>The skip could be stored as a system message where:</p><ul><li><p><strong>Stream ID</strong> = the projection identifier (name + version), so we can query all skips for a specific projection</p></li><li><p><strong>Message data</strong> = reference to the original event (sequence ID from the event log)</p></li><li><p><strong>Message metadata</strong> = processor ID, reason for skip, timestamp</p></li><li><p><strong>Message type</strong> = a marker that this is a &#8220;skip during rebuild&#8221; system event</p></li></ul><p><strong>Won&#8217;t this table grow too much? It may, but we can archive skip events when they&#8217;re no longer needed.</strong></p><p>When the rebuild processor handles a skipped event, we could archive it. In Emmett, this means setting is_archived = true on the record. The table is partitioned by this flag. Archived records automatically move to a separate partition, which can even be on a different disk drive.</p><p>Why archive instead of delete?</p><ul><li><p>Audit trail: You can see what was skipped and when it was processed</p></li><li><p>Debugging: If something goes wrong, you have history to investigate</p></li><li><p>Idempotency: If a processor crashes and restarts, it won&#8217;t reprocess already-archived skips</p></li></ul><p>Later, we could define retention policies and clean up old archived records. We could do it more aggressively than for business events, since these are operational records, not business history. You might keep business events forever, but archived skip records only need to stick around long enough for debugging (days or weeks, not years).</p><p>The same processor that handles regular events for the projection could also process its skipped records. When it reads an event from the main event log and applies the projection, it also checks for and archives any corresponding skip record for that event position. 
This keeps everything in sync.</p><p>As mentioned, this also opens the door for other system events in the future - not just rebuild skips, but potentially failed projections, poison messages, or audit records. The infrastructure would already be in place, separated from the business event log.</p><h2>The Final Flow</h2><p>Let me walk through how this works in practice, bringing together all the pieces.</p><h3>During rebuild</h3><p>The projection status is &#8216;rebuilding&#8217;. The async rebuild processor works through historical events from the beginning. Meanwhile, normal operations continue. Events are appended, and inline projections check the status. When they see &#8216;rebuilding&#8217;, they skip the projection but record a skip message in emt_system_messages within the same transaction.</p><h3>When rebuild catches up</h3><p>The rebuild processor eventually reaches the current position where no new events are visible. At this point, it sets the status to &#8216;active&#8217;. From now on, new appends will process their inline projections normally.</p><p>But what about those skip records? The events they reference exist, but their projections were never applied.</p><h3>Draining the skipped events</h3><p>The rebuild processor (or a dedicated processor, or a manual trigger&#8212;your choice) now queries emt_system_messages for skip records belonging to this projection. 
Using the same transaction visibility rules from the outbox pattern (transaction_id &lt; pg_snapshot_xmin), it only sees skip records from committed transactions.</p><p>For each skip record:</p><ol><li><p>Find the referenced event in the event log (or use its data if stored in skipped event)</p></li><li><p>Apply the projection for that event</p></li><li><p>Archive the skip record (set is_archived = true)</p></li></ol><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xE1k!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5445a586-1080-470f-bfe1-03689fe52bd9_800x730.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xE1k!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5445a586-1080-470f-bfe1-03689fe52bd9_800x730.png 424w, https://substackcdn.com/image/fetch/$s_!xE1k!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5445a586-1080-470f-bfe1-03689fe52bd9_800x730.png 848w, https://substackcdn.com/image/fetch/$s_!xE1k!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5445a586-1080-470f-bfe1-03689fe52bd9_800x730.png 1272w, https://substackcdn.com/image/fetch/$s_!xE1k!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5445a586-1080-470f-bfe1-03689fe52bd9_800x730.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xE1k!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5445a586-1080-470f-bfe1-03689fe52bd9_800x730.png" 
width="800" height="730" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5445a586-1080-470f-bfe1-03689fe52bd9_800x730.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:730,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;mermaid7&quot;,&quot;title&quot;:&quot;mermaid7&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="mermaid7" title="mermaid7" srcset="https://substackcdn.com/image/fetch/$s_!xE1k!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5445a586-1080-470f-bfe1-03689fe52bd9_800x730.png 424w, https://substackcdn.com/image/fetch/$s_!xE1k!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5445a586-1080-470f-bfe1-03689fe52bd9_800x730.png 848w, https://substackcdn.com/image/fetch/$s_!xE1k!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5445a586-1080-470f-bfe1-03689fe52bd9_800x730.png 1272w, https://substackcdn.com/image/fetch/$s_!xE1k!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5445a586-1080-470f-bfe1-03689fe52bd9_800x730.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" 
stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The skip record and the event are committed in the same transaction. This gives us a simple invariant: if an event exists, either its projection was applied inline (status was &#8216;active&#8217;) or a skip record exists (status was &#8216;rebuilding&#8217;). There&#8217;s no third option where an event exists without a trace.</p><p>The drain process might find new skip records appearing: transactions that were in flight during the status change may commit after we started draining. That&#8217;s fine. We keep querying until no more visible skip records exist. They will stop appearing, as we have already stopped the rebuilding processor, so no more inline projections should be skipped.</p><p>The &#8220;drain skipped events&#8221; phase can be triggered automatically when the projection rebuild finishes. It could be handled as the second phase of the rebuild processor, or by triggering a dedicated one. 
It could also simply be handled by a human initiating the drain.</p><p>This flexibility lets users choose the approach based on their specific operational requirements.</p><h2>The Dead Letter Queue Pattern</h2><p>What we&#8217;ve built is essentially a <strong>Dead Letter Queue (DLQ)</strong>: a place where messages that couldn&#8217;t be processed normally are stored for later handling.</p><p>This pattern exists in every serious messaging system:</p><ul><li><p><strong>Apache Kafka:</strong> Dead Letter Topics for messages that fail consumer processing</p></li><li><p><strong>RabbitMQ:</strong> Dead letter exchanges for rejected or expired messages</p></li><li><p><strong>AWS SQS:</strong> Redrive policies that move messages to a DLQ after N failures</p></li><li><p><strong>Azure Service Bus:</strong> Built-in dead-letter sub-queues for each queue</p></li></ul><p>The pattern is universal because all messaging systems face the same fundamental problem: sometimes messages can&#8217;t be processed immediately, and you need a place to store them without blocking the rest of the system.</p><p>It&#8217;s a topic in its own right, as the pattern is not perfect and should be used cautiously. Many teams fail to apply it correctly. The DLQ becomes like a car alarm in a parking lot: technically signalling a problem, practically ignored by everyone. And that&#8217;s what I see in many systems: teams set up DLQs and do nothing about them.</p><p>It starts innocently. You configure the DLQ &#8220;just in case.&#8221; A few messages end up there during a deployment. Someone says, &#8220;We&#8217;ll look at it later.&#8221;</p><p>More messages accumulate. The DLQ becomes background noise, a number on a dashboard that nobody checks. Eventually, something critical lands there, and nobody notices until a customer complains.</p><p>That&#8217;s why in the discussed design, skip records aren&#8217;t meant to accumulate indefinitely. 
The rebuild processor drains them during completion.</p><p>Retention policies clean up archived records after a reasonable period. If skip records exist for too long, that&#8217;s a signal that something is wrong with the rebuild process - and users should know about it.</p><p>A DLQ is only helpful if it&#8217;s monitored and processed, and if the team understands why messages end up there. Otherwise, it&#8217;s just a fancy way to lose data slowly rather than immediately.</p><h2>The Broader Lesson</h2><p>This article isn&#8217;t really about PostgreSQL advisory locks or projection rebuilding. It&#8217;s about how we approach problems in distributed systems.</p><p>When we find a race condition, the instinct is to fix it. Add a lock. Add a check. Add a coordination phase. But each &#8220;fix&#8221; often just moves the race condition somewhere else. We went through three attempts (waiting for transactions, using transaction ID boundaries, locking appends), and each one failed for a different reason. We weren&#8217;t solving the problem; we were relocating it.</p><p>At some point, I had to ask myself: am I making this system more reliable, or just more complicated?</p><p>The answer came when I changed the question. Instead of asking, &#8220;How do I prevent events from being skipped?&#8221; I asked, &#8220;How do I know when an event was skipped, and how do I make sure it gets processed eventually?&#8221;</p><p>The first question has no good answer, not without blocking appends, which defeats the purpose. The second question is straightforward: record the skip in the same transaction as the event, and process it later.</p><p>A system isn&#8217;t trustworthy because it never fails. That&#8217;s impossible for anything sufficiently complex. A system is trustworthy because you know when it can fail, how it will fail, and how to recover. The skip tracking approach doesn&#8217;t prevent failures during the transition period. It makes them visible and recoverable. 
That&#8217;s a stronger guarantee than complex coordination machinery with hidden edge cases.</p><p><strong>Sometimes the best engineering decision is to accept what you can&#8217;t control and focus on what you can.</strong> We can&#8217;t control PostgreSQL&#8217;s transaction visibility rules. We can&#8217;t see into uncommitted transactions. We can&#8217;t make the append decision and the rebuild completion atomic without stopping the world.</p><p>What we can control is recording skipped items, ensuring those skips are processed, and making the whole thing observable. That&#8217;s enough.</p><p>The blind spot I found wasn&#8217;t really about the specific race condition. It was a reminder that distributed systems have fundamental constraints we can&#8217;t engineer around, at least not without trade-offs worse than the original problem.</p><p>Transaction isolation and concurrent systems mean we can&#8217;t have zero-downtime rebuilds with perfect inline projection consistency and no coordination overhead, all at the same time.</p><p>Something has to give.</p><p>What we get instead is explicit tracking of what was skipped, guaranteed eventual processing via the system messages table, observability into the transition period, and crash recovery that doesn&#8217;t lose data. 
The read model might be briefly inconsistent during the transition, but we know exactly what&#8217;s missing, and we have a clear path to fix it.</p><p>That&#8217;s the kind of system I can reason about and trust.</p><div><hr></div><p><strong>If you&#8217;re dealing with similar challenges, I&#8217;m happy to help through <a href="mailto:oskar@event-driven.io">consulting or mentoring</a>.</strong> You can also join the discussion in our <a href="https://discord.gg/fTpqUTMmVa">Emmett Discord</a>, where we have a nice community working through these exact problems.</p><p>Or check out other related resources:</p><ul><li><p><a href="https://event-driven.io/en/rebuilding_event_driven_read_models/">Rebuilding Event-Driven Read Models in a safe and resilient way</a>,</p></li><li><p><a href="https://event-driven.io/en/ordering_in_postgres_outbox/">How Postgres sequences issues can impact your messaging guarantees</a>,</p></li><li><p><a href="https://event-driven.io/en/projections_and_read_models_in_event_driven_architecture/">Guide to Projections and Read Models in Event-Driven Architecture</a>,</p></li><li><p><a href="https://www.architecture-weekly.com/p/distributed-locking-a-practical-guide">Distributed Locking: A Practical Guide</a>,</p></li><li><p><a href="https://event-driven.io/en/consumers_processors_in_emmett/">Consumers, projectors, reactors and all that messaging jazz in Emmett</a>,</p></li><li><p><a href="https://event-driven.io/en/how_to_scale_projections_in_the_event_driven_systems/">How to scale projections in the event-driven systems?</a>,</p></li><li><p><a href="https://event-driven.io/en/checkpointing_message_processing/">Checkpointing the message processing</a>,</p></li><li><p><a href="https://event-driven.io/en/lets_talk_about_positions_in_event_stores/">Let&#8217;s talk about positions in event stores</a>.</p></li></ul><p>Cheers!</p><p>Oskar</p><p>p.s. <strong>Ukraine is still under brutal Russian invasion. 
A lot of Ukrainian people are hurt, without shelter and need help.</strong> You can help in various ways, for instance, directly helping refugees, spreading awareness, putting pressure on your local government or companies. You can also support Ukraine by donating e.g. to <a href="https://www.icrc.org/en/donate/ukraine">Red Cross</a>, <a href="https://savelife.in.ua/en/donate/">Ukraine humanitarian organisation</a> or <a href="https://www.gofundme.com/f/help-to-save-the-lives-of-civilians-in-a-war-zone">donate Ambulances for Ukraine</a>.</p>]]></content:encoded></item><item><title><![CDATA[Rebuilding Event-Driven Read Models in a safe and resilient way]]></title><description><![CDATA[And what distributed locking, PostgreSQL Advisory Locks have to do with that]]></description><link>https://www.architecture-weekly.com/p/rebuilding-event-driven-read-models</link><guid isPermaLink="false">https://www.architecture-weekly.com/p/rebuilding-event-driven-read-models</guid><dc:creator><![CDATA[Oskar Dudycz]]></dc:creator><pubDate>Mon, 05 Jan 2026 19:01:10 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!9xrQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75073730-6b26-4f4a-ac3e-74b77d704aee_800x500.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9xrQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75073730-6b26-4f4a-ac3e-74b77d704aee_800x500.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9xrQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75073730-6b26-4f4a-ac3e-74b77d704aee_800x500.png 
424w, https://substackcdn.com/image/fetch/$s_!9xrQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75073730-6b26-4f4a-ac3e-74b77d704aee_800x500.png 848w, https://substackcdn.com/image/fetch/$s_!9xrQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75073730-6b26-4f4a-ac3e-74b77d704aee_800x500.png 1272w, https://substackcdn.com/image/fetch/$s_!9xrQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75073730-6b26-4f4a-ac3e-74b77d704aee_800x500.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9xrQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75073730-6b26-4f4a-ac3e-74b77d704aee_800x500.png" width="800" height="500" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/75073730-6b26-4f4a-ac3e-74b77d704aee_800x500.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:500,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;cover&quot;,&quot;title&quot;:&quot;cover&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="cover" title="cover" srcset="https://substackcdn.com/image/fetch/$s_!9xrQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75073730-6b26-4f4a-ac3e-74b77d704aee_800x500.png 424w, 
https://substackcdn.com/image/fetch/$s_!9xrQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75073730-6b26-4f4a-ac3e-74b77d704aee_800x500.png 848w, https://substackcdn.com/image/fetch/$s_!9xrQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75073730-6b26-4f4a-ac3e-74b77d704aee_800x500.png 1272w, https://substackcdn.com/image/fetch/$s_!9xrQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75073730-6b26-4f4a-ac3e-74b77d704aee_800x500.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Let&#8217;s make a soup today: a blog soup. We&#8217;ll mix multiple ingredients like:</p><ul><li><p>events (obviously),</p></li><li><p>read models,</p></li><li><p>inline and async projections,</p></li><li><p>rebuilding read models,</p></li><li><p>backfilling new ones with data from existing events,</p></li><li><p>scaling async processing horizontally,</p></li><li><p>distributed locking,</p></li><li><p>PostgreSQL and its Advisory Locks.</p></li></ul><p>Sounds like a lot? Well, the soup should be nutritious.</p><p>In the event-driven approach, after handling business logic, we record new facts and call them events. They gather information about what has happened. That brings many benefits, such as business observability, since we keep a log of them. And if we&#8217;re doing Event Sourcing, we can also make the next decision based on them.</p><p>Typically, we&#8217;re using events in two ways:</p><ul><li><p>reacting to them, triggering and integrating the steps of our business workflow,</p></li><li><p>projecting them, and getting the flattened interpretation of our system state inside read models.</p></li></ul><p>Such processing can happen asynchronously, but doesn&#8217;t have to. If we&#8217;re using a transactional database (like PostgreSQL, SQLite, or even MongoDB), we can update our read models in the same atomic transaction that stores new events. Such a process is typically called <em>inline projection</em>. Event stores like <a href="https://event-driven.io/en/emmett_projections_testing/">Emmett</a> and <a href="https://event-driven.io/en/projections_in_marten_explained/">Marten</a> allow that. 
Still, if you&#8217;re using the <a href="https://event-driven.io/en/outbox_inbox_patterns_and_delivery_guarantees_explained/">outbox pattern</a>, you can achieve the same without Event Sourcing.</p><p>Inline processing is tempting because we&#8217;re getting immediate consistency. Yet, there&#8217;s no free lunch here. We&#8217;re slowing down our event appends, as each one has to do more work, and our transactions stay open longer, which can cause deadlocks and similar issues. We also can&#8217;t take advantage of batching event processing.</p><p>Also, as <a href="https://www.linkedin.com/in/bastian-waidelich-84865221?lipi=urn%3Ali%3Apage%3Ad_flagship3_profile_view_base_contact_details%3B8r9f%2Fno%2BRbyTnp4YOzWo9g%3D%3D">Bastian Waidelich</a> rightfully pointed out to me, using inline projections also increases the coupling and fragility of our business logic. If the inline projection fails (due to a bug in its projection logic, a database constraint or another random issue), then we won&#8217;t be able to append our event. That&#8217;s counterintuitive: why would a read model block our business logic (e.g. confirming a shopping cart)?</p><p>My rule of thumb is that for simple, single-stream projections, I prefer inline projections, but for more complex ones or for workflow processing, I&#8217;d go with async.</p><p>The big benefit of a durable event log is that we can correct past mistakes and gain more insights from existing data.</p><p>How does it look in practice? Let&#8217;s say we initially had a basic read model that showed a summary of a specific shopping cart. In TypeScript, this could look as follows:</p><pre><code><code>type ShoppingCartSummary = {
  _id?: string;
  productItemsCount: number;
  totalAmount: number;
};</code></code></pre><p>Besides the data, we also need a method that represents how we apply events on top of the existing state to get the next, evolved state.</p><pre><code><code>const evolve = (
  document: ShoppingCartSummary | null,
  event: ProductItemAdded | ProductItemRemoved,
): ShoppingCartSummary =&gt; {
  document = document ?? { totalAmount: 0, productItemsCount: 0 };

  switch (event.type) {
    case 'ProductItemAdded':
      return withAdjustedTotals({
        document,
        productItem: event.data.productItem,
        by: 'adding',
      });
    case 'ProductItemRemoved':
      return withAdjustedTotals({
        document,
        productItem: event.data.productItem,
        by: 'removing',
      });
  }
};

const withAdjustedTotals = (options: {
  document: ShoppingCartSummary;
  productItem: PricedProductItem;
  by: 'adding' | 'removing';
}) =&gt; {
  const { document, productItem, by } = options;
  const plusOrMinus = by === 'adding' ? 1 : -1;

  return {
    ...document,
    totalAmount:
      document.totalAmount +
      productItem.unitPrice * productItem.quantity * plusOrMinus,
    productItemsCount:
      document.productItemsCount + productItem.quantity * plusOrMinus,
  };
};</code></code></pre><p>Sounds fine, but it may turn out that either we, while implementing it, or our business, while specifying it, forgot about something. Or nobody forgot anything, and the requirements simply evolved, as they always tend to do.</p><p>What if we now also need to:</p><ul><li><p>show the shopping cart status, i.e. whether the shopping cart is opened, confirmed or cancelled,</p></li><li><p>show not only totals, but also a list of product items with their details.</p></li></ul><p>Our new read model would look like:</p><pre><code><code>type ShoppingCartSummary = {
  _id?: string;
  productItemsCount: number;
  totalAmount: number;
  status: 'Opened' | 'Confirmed' | 'Cancelled';
  productItems: ProductItem[];
};

type ProductItem = {
  productId: string;
  name: string;
  quantity: number;
  unitPrice: number;
};</code></code></pre><p>And here we should ask ourselves the following questions:</p><ul><li><p>Is it really the same read model or another one that&#8217;ll be used in another place in our UI? Maybe the initial one just shows basic data in the menu bar, and this one will be used as the summary before confirmation?</p></li><li><p>If the read model is the same, are we fine with downtime while we clean the old data and reprocess events?</p></li><li><p>If it&#8217;s a new read model, do we need to backfill it with the old data?</p></li></ul><p>There&#8217;s no best practice here; we need to do the drill and be prepared for multiple options.</p><h2>Add new vs update existing</h2><p>In our case, my initial guess would be that this should be a new read model. We need to add significantly more data and new event handling for confirmation and cancellation.</p><p>You could ask:</p><blockquote><p>Why add a new read model with similar data? Can&#8217;t we just do a subquery?</p></blockquote><p>Of course, you can. I wrote about that more in <a href="https://event-driven.io/en/how_to_create_projections_of_events_for_nested_object_structures/">How to create projections of events for nested object structures?</a>. This can make sense if those read models will always evolve together.</p><p>If you need to run multiple <em>views</em> on the same read model, you&#8217;re increasing coupling. When you&#8217;re rebuilding the read model, the potential downtime will impact all of them. Also, each time you adjust one view, you need to ensure you haven&#8217;t broken the others.</p><p>In return, you&#8217;re getting a smaller storage size, and you only need to maintain one read model instead of several.</p><p>My experience shows that optimising for storage size before we&#8217;ve checked that it&#8217;s actually too big isn&#8217;t a good driver. Nowadays, storage is cheap, so my default is to keep read models separated, and also not reuse the same evolve logic. 
A bit of code duplication won&#8217;t harm us, but we&#8217;ll see benefits as our models evolve. We&#8217;ll decrease the cognitive load.</p><p>Still, if those models are indeed the same, or we bet they&#8217;ll constantly evolve together, then it could be fine to reuse them.</p><h2>In-place update, blue/green rebuild and backfilling data</h2><p>OK, I&#8217;ve already used the word <em>rebuilding</em> multiple times. But how do we actually do it?</p><p>If you&#8217;ve read articles on <a href="https://event-driven.io/en/checkpointing_message_processing/">checkpointing</a> or <a href="https://event-driven.io/en/lets_talk_about_positions_in_event_stores/">positions in event stores</a>, you already know that each event in the event store/outbox can have its unique, monotonic position. We can subscribe to notifications about new events and process them one by one. That&#8217;s how <a href="https://event-driven.io/en/consumers_processors_in_emmett/">consumers and processors</a> work in <a href="https://github.com/event-driven-io/emmett">Emmett</a>. Once we process a specific event, we can store the checkpoint in the end storage. This enables resilient failover when our processor dies for some reason. When we restart, we&#8217;ll read the last processed checkpoint first and start listening for events from that point.</p><p>We could reuse this not only for failover but also for rebuilds. Instead of starting processing from the last known checkpoint, we could reset the checkpoint in our database to a specific point (e.g., the beginning of the log). Then we&#8217;ll get the recorded events again.</p><p>And this is the moment we need to decide whether to do an in-place update or a blue/green rebuild.</p><p>In-place means that we&#8217;re using the same storage target. Typically, we need to truncate it and then apply data from scratch. The downside is that when we clear it, our read models won&#8217;t be up to date with the current state of our event store. 
We already have newer events recorded, but our read models are empty. We need to fill them in.</p><p>The same happens when we create a new read model based on the existing events. When we create a new read model, it won&#8217;t magically contain data from existing events.</p><p>Typically, we need to spin up a background worker (e.g., <a href="https://event-driven.io/en/consumers_processors_in_emmett/">Emmett consumer with a projector plugged in</a>) that pulls all events since the beginning and runs projection logic. Obviously, that means we need to run it asynchronously, and depending on the data size, we&#8217;ll need to wait until it catches up. During that time, querying for the data will return outdated information - eventual consistency in practice.</p><p>That&#8217;s why there&#8217;s another option. Instead of truncating existing data, we can keep it as it is, or even keep updating it in the old form. Having that, we can build a new read model in parallel. This read model could have a different name, or just a suffix, e.g. V2, V2.0.1, whatever we find helpful.</p><p>Then, once this new read model catches up (so it has processed all events, or is close enough to the latest event with a defined threshold), we can switch queries from the old to the new read model storage.</p><p>This is actually the preferred way, but it&#8217;s a bit more challenging when it comes to dynamically switching the query target. If you&#8217;re using Pongo, that&#8217;s not that hard, since you just switch the text-based collection name, which is just another table. But if you&#8217;re using an ORM, adding a new table dynamically and mapping it can be much more challenging.</p><h2>Concurrency issues while (re)building read models</h2><p>To make it harder, eventual consistency is not the only challenge we&#8217;ll face. 
We also need to deal with concurrency and parallel processing.</p><p>What should we do when a new event is appended while we&#8217;re rebuilding/backfilling the inline projection? If we try to update the end storage in the middle of the processing, we can end up with an inconsistent or erroneous state.</p><p>Another issue: what do we do if we&#8217;re in a Kubernetes-like setup and don&#8217;t have full control over the number of instances of the same service? Or what happens if we accidentally spin up multiple rebuild workers?</p><p>Then we&#8217;re doomed, or at least the consistency of our read model data is.</p><p>How do we solve it? Let me explain my plan for <a href="https://github.com/event-driven-io/emmett">Emmett</a>.</p><h2>Distributed Locking and PostgreSQL advisory locks</h2><p>I encourage you to check my other article: <a href="https://www.architecture-weekly.com/p/distributed-locking-a-practical-guide">Distributed Locking: A Practical Guide</a>. Yet, don&#8217;t worry, I won&#8217;t leave you with a Read-The-Fucking-Manual type of answer.</p><p>Distributed locks are a fundamental tool for coordinating concurrency across systems. We can use a central place, typically scalable on its own, that&#8217;s shared by multiple instances of our service, to ensure that only one of them at a time can acquire a lock and run specific code.</p><p>In the mentioned article, I described multiple options and popular tools for handling that (e.g., Redis, ZooKeeper, Kubernetes Replica Sets), as well as PostgreSQL and its Advisory Locks. Let me focus today on the last one and describe how I want to use it in <a href="https://github.com/event-driven-io/emmett">Emmett</a>&#8217;s PostgreSQL projection handling.</p><p>PostgreSQL gives us two options for coordination.</p><p><strong>Row-level locks</strong> lock individual table rows. You do <em>SELECT &#8230; FOR UPDATE</em>, and anyone else trying to modify that row waits. 
The lock is tied to the specific row in a table.</p><pre><code><code>BEGIN;
SELECT * FROM some_table WHERE id = @key FOR UPDATE;
-- make changes
COMMIT;</code></code></pre><p>Such locks are straightforward, as we explicitly state what we want to lock. We could define the following table for projections:</p><pre><code><code>CREATE TABLE IF NOT EXISTS emt_projections(
      version                       INT                    NOT NULL DEFAULT 1,  
      type                          VARCHAR(1)             NOT NULL,
      name                          TEXT                   NOT NULL,
      status                        TEXT                   NOT NULL,  
      PRIMARY KEY (name, version)
);</code></code></pre><p>The background worker could lock the projection row for a specific name and version and set the status to &#8220;rebuilding&#8221;. Then, while handling the inline projection, we could take the same lock and check whether the status is &#8220;active&#8221;; if not, skip processing. Holding the lock in the inline projection would also prevent the rebuilding process from starting, as it wouldn&#8217;t be able to acquire the lock.</p><p>And that&#8217;s the nuclear option: it would work, but it could create a performance problem. If every inline projection needs to grab a row lock on a coordination row, they will all be processed sequentially, one at a time. That&#8217;s a throughput killer when you&#8217;re appending thousands of events per second. Each of these appends would require access to the lock for the specific projection type (e.g. our shopping cart summary), making not only updates but also event appends sequential. That&#8217;s not acceptable for most cases.</p><p><strong><a href="https://www.postgresql.org/docs/current/explicit-locking.html#ADVISORY-LOCKS">Advisory locks</a></strong> are different. They&#8217;re locks on arbitrary integers. PostgreSQL doesn&#8217;t care what the integer means; it just manages who holds the lock. No table rows are involved. It also allows two modes: exclusive or shared.</p><pre><code><code>-- Shared lock: many can hold simultaneously
SELECT pg_try_advisory_xact_lock_shared(12345);

-- Exclusive lock: only one holder, blocks shared locks
SELECT pg_advisory_lock(12345);</code></code></pre><p>Those locks are scoped either to an open connection session or to an open transaction (the functions with <em>xact</em> in their name), and are released automatically once the session or transaction ends.</p><p>Shared locks allow multiple sessions to hold the lock on the same value at the same time. An exclusive lock blocks both other exclusive locks and new shared locks.</p><p><strong>This allows a design where shared locks are used for readers</strong> and exclusive locks for writers, which maps directly to our problem:</p><ul><li><p>Inline projections take <strong>shared</strong> locks, as they run concurrently and just need to check that there&#8217;s no async job updating these projections.</p></li><li><p>Rebuilds take <strong>exclusive</strong> locks - they block inlines and other instances of async processing.</p></li></ul><p>We just need to do one more thing: since advisory locks take integers, we need to map our projection name and version to them. We can do it with a consistent hash algorithm, either in the application code or in PostgreSQL.</p><p>PostgreSQL provides a built-in MD5 hash function. It&#8217;s not perfect, as it&#8217;s not a sophisticated hash, but it&#8217;s fast enough and predictable. In our case, we won&#8217;t have thousands of projections in our application, so the risk of a <a href="https://en.wikipedia.org/wiki/Hash_collision">hash collision</a> is negligible. If you&#8217;re still worried it&#8217;s too high, we could store a numeric id in our projections table and use it instead of hash-mapping. Still, if we used the md5 function, it could look as follows:</p><pre><code><code>-- shared for inline projections
SELECT pg_try_advisory_xact_lock_shared(
        ('x' || substr(md5(?), 1, 16))::bit(64)::bigint
) AS acquired;

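-- Sketch, not from the original post: previewing the key that the expression
-- above yields for one projection. The name 'shopping_cart_summary', version 1
-- and the ':' separator are illustrative assumptions.
SELECT ('x' || substr(md5('shopping_cart_summary' || ':' || '1'), 1, 16))::bit(64)::bigint AS lock_key;
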
-- exclusive for async rebuilds
SELECT pg_try_advisory_xact_lock(
        ('x' || substr(md5(?), 1, 16))::bit(64)::bigint
) AS acquired;</code></code></pre><p>As the query parameter, we&#8217;d pass the projection name joined with its version.</p><p>Thanks to that, multiple inline projections can hold the lock at once, as long as it isn&#8217;t held exclusively by the async (re)building worker. That also means the inline projection lock doesn&#8217;t serialise event appends.</p><p>An exclusive lock can be acquired only when no inline projection is being applied at that moment.</p><p>What&#8217;s also cool is that advisory locks support two strategies: fail fast when the lock is held (the <em>try</em> variants), or wait for the lock to be released. The first is useful for inline projections, and the second for rebuilds.</p><p>Is that all? Well, not quite.</p><h2>Why advisory locks alone aren&#8217;t enough</h2><p>Advisory locks have a gap: they live only as long as the session (or transaction) holding them. If the connection holding the lock dies, the lock releases automatically. For most cases, that&#8217;s fine and expected, but think through this scenario:</p><ol><li><p>We have 1000 events in our event store.</p></li><li><p>Rebuild starts, acquires exclusive lock.</p></li><li><p>It truncates the projection and processes events 1-500.</p></li><li><p>Connection dies (network blip, out-of-memory kill, whatever).</p></li><li><p>Lock releases automatically.</p></li><li><p>Projection contains data only for events 1-500.</p></li><li><p>Inlines see no lock, start processing.</p></li><li><p>An inline applies event 1001, ending up with potentially corrupted state (events 501-1000 were never applied and may affect it).</p></li></ol><p>Advisory locks can&#8217;t persist across connection failures. We need to add something that does.</p><h2>The hybrid-locking approach</h2><p>Emmett already maintains tables for tracking processors and projections. The relevant ones:</p><pre><code><code>CREATE TABLE IF NOT EXISTS emt_projections(
    version         INT         NOT NULL DEFAULT 1,  
    type            VARCHAR(1)  NOT NULL,
    name            TEXT        NOT NULL,
    partition       TEXT        NOT NULL DEFAULT 'emt:default',
    kind            TEXT        NOT NULL, 
    status          TEXT        NOT NULL, 
    definition      JSONB       NOT NULL DEFAULT '{}'::jsonb, 
    PRIMARY KEY (name, partition, version)
) PARTITION BY LIST (partition);

CREATE TABLE IF NOT EXISTS emt_processors(
    last_processed_transaction_id XID8    NOT NULL,
    version                       INT     NOT NULL DEFAULT 1,
    processor_id                  TEXT    NOT NULL,
    partition                     TEXT    NOT NULL DEFAULT 'emt:default',
    status                        TEXT    NOT NULL DEFAULT 'stopped', 
    last_processed_checkpoint     TEXT    NOT NULL,    
    processor_instance_id         TEXT    DEFAULT 'emt:unknown',
    PRIMARY KEY (processor_id, partition, version)
) PARTITION BY LIST (partition);</code></code></pre><p>The <em>status</em> column in the projections table can do what advisory locks can&#8217;t: persist across connection crashes. Even if the connection dies, <em>status = &#8216;rebuilding&#8217;</em> stays in the table. Inlines can check this and skip processing.</p><p>The processors table tracks checkpoint progress. When a rebuild runs, it updates <em>last_processed_checkpoint</em> as it goes. If it crashes and restarts, it can resume from where it left off rather than starting over.</p><p>Potentially, we could reuse the processors table, but a dedicated projections table is also useful for diagnostics (e.g., storing the projection definition) and for switching queries when a new projection version catches up: we could select the highest active projection version.</p><p>With that in place, the query checking whether the inline projection should be processed or skipped could look as follows:</p><pre><code><code>WITH lock_check AS (
    SELECT pg_try_advisory_xact_lock_shared(
        ('x' || substr(md5($1), 1, 16))::bit(64)::bigint
    ) AS acquired
),
status_check AS (
    SELECT status = 'active' AS is_active
    FROM emt_projections
    WHERE partition = $2 AND name = $3 AND version = $4
)
SELECT
    COALESCE((SELECT acquired FROM lock_check), false) AS acquired,
    COALESCE((SELECT is_active FROM status_check), true) AS is_active;</code></code></pre><p>Two checks, both must pass:</p><ol><li><p>Can we get a shared advisory lock? Fails if a rebuild holds the exclusive lock.</p></li><li><p>Is the projection status &#8216;active&#8217;? Fails if it&#8217;s &#8216;rebuilding&#8217;.</p></li></ol><p>Then, for the async (re)building worker:</p><pre><code><code>WITH lock_check AS (
    SELECT pg_try_advisory_lock(
        ('x' || substr(md5($1), 1, 16))::bit(64)::bigint
    ) AS acquired
),
ownership_check AS (
    INSERT INTO emt_processors (
        processor_id,
        partition,
        version,
        processor_instance_id,
        status,
        last_processed_checkpoint,
        last_processed_transaction_id
    )
    VALUES ($2, $3, $4, $5, 'running', '0', '0'::xid8)
    ON CONFLICT (processor_id, partition, version) DO UPDATE
    SET processor_instance_id = $5,
        status = 'running'
    WHERE 
       -- We already own it
       emt_processors.processor_instance_id = $5 
       -- Unclaimed
       OR emt_processors.processor_instance_id = 'emt:unknown' 
       -- Previous instance finished or crashed
       OR emt_processors.status = 'stopped'
    RETURNING last_processed_checkpoint
)
SELECT
    COALESCE((SELECT acquired FROM lock_check), false) AS acquired,
    (SELECT last_processed_checkpoint FROM ownership_check) AS checkpoint;</code></code></pre><p>The two mechanisms work together: advisory locks prevent races at the transition point, when a rebuild is starting while inlines are still in flight, and the status check handles crash recovery, when the connection was closed and the rebuilding job is restarting.</p><p>The rebuild acquires the exclusive lock <em>before</em> updating the status. It waits for in-flight inlines (which hold shared locks) to finish.</p><p>If the checkpoint is null, the upsert&#8217;s UPDATE matched no rows, which means another instance owns the processor and is actively running. Back off.</p><p>If both succeed, you own the lock and the processor. The checkpoint tells you where to resume from (with a fallback to the beginning). Only then does the rebuild flip the status to &#8216;rebuilding&#8217; and start async processing.</p><p>If the rebuild crashes, the lock is released, but the status remains &#8216;rebuilding&#8217;. Inlines check the status and skip.</p><h2>The scenarios walkthrough</h2><p>As a good picture is worth a thousand words, let me give you some diagrams to summarise how it works.</p><p><strong>1. 
Inline projections during event appends:</strong></p><ul><li><p>Each inline projection grabs a shared lock before applying,</p></li><li><p>Multiple inlines can run concurrently (shared locks are compatible),</p></li><li><p>Projection updates happen normally.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Mrgj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F370212d5-cdbc-4712-9553-60c9c4b3bc40_800x695.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Mrgj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F370212d5-cdbc-4712-9553-60c9c4b3bc40_800x695.png 424w, https://substackcdn.com/image/fetch/$s_!Mrgj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F370212d5-cdbc-4712-9553-60c9c4b3bc40_800x695.png 848w, https://substackcdn.com/image/fetch/$s_!Mrgj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F370212d5-cdbc-4712-9553-60c9c4b3bc40_800x695.png 1272w, https://substackcdn.com/image/fetch/$s_!Mrgj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F370212d5-cdbc-4712-9553-60c9c4b3bc40_800x695.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Mrgj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F370212d5-cdbc-4712-9553-60c9c4b3bc40_800x695.png" width="800" height="695" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/370212d5-cdbc-4712-9553-60c9c4b3bc40_800x695.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:695,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;mermaid1&quot;,&quot;title&quot;:&quot;mermaid1&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="mermaid1" title="mermaid1" srcset="https://substackcdn.com/image/fetch/$s_!Mrgj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F370212d5-cdbc-4712-9553-60c9c4b3bc40_800x695.png 424w, https://substackcdn.com/image/fetch/$s_!Mrgj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F370212d5-cdbc-4712-9553-60c9c4b3bc40_800x695.png 848w, https://substackcdn.com/image/fetch/$s_!Mrgj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F370212d5-cdbc-4712-9553-60c9c4b3bc40_800x695.png 1272w, https://substackcdn.com/image/fetch/$s_!Mrgj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F370212d5-cdbc-4712-9553-60c9c4b3bc40_800x695.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><strong>2. 
When a rebuild starts:</strong></p><ul><li><p>Rebuild grabs an exclusive lock (waits for any in-flight inlines to finish),</p></li><li><p>Marks projection as &#8220;rebuilding&#8221;,</p></li><li><p>New inlines see the lock or status and skip,</p></li><li><p>Rebuild processes all historical events,</p></li><li><p>Marks the projection as &#8220;active&#8221; and releases the lock,</p></li><li><p>Inlines resume.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dgRz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d3af5c7-33e4-428f-965c-798c0630daf9_800x583.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dgRz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d3af5c7-33e4-428f-965c-798c0630daf9_800x583.png 424w, https://substackcdn.com/image/fetch/$s_!dgRz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d3af5c7-33e4-428f-965c-798c0630daf9_800x583.png 848w, https://substackcdn.com/image/fetch/$s_!dgRz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d3af5c7-33e4-428f-965c-798c0630daf9_800x583.png 1272w, https://substackcdn.com/image/fetch/$s_!dgRz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d3af5c7-33e4-428f-965c-798c0630daf9_800x583.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!dgRz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d3af5c7-33e4-428f-965c-798c0630daf9_800x583.png" width="800" height="583" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5d3af5c7-33e4-428f-965c-798c0630daf9_800x583.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:583,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;mermaid2&quot;,&quot;title&quot;:&quot;mermaid2&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="mermaid2" title="mermaid2" srcset="https://substackcdn.com/image/fetch/$s_!dgRz!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d3af5c7-33e4-428f-965c-798c0630daf9_800x583.png 424w, https://substackcdn.com/image/fetch/$s_!dgRz!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d3af5c7-33e4-428f-965c-798c0630daf9_800x583.png 848w, https://substackcdn.com/image/fetch/$s_!dgRz!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d3af5c7-33e4-428f-965c-798c0630daf9_800x583.png 1272w, https://substackcdn.com/image/fetch/$s_!dgRz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d3af5c7-33e4-428f-965c-798c0630daf9_800x583.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><strong>3. 
If rebuild crashes:</strong></p><ul><li><p>Lock releases automatically</p></li><li><p>Status stays &#8220;rebuilding&#8221;</p></li><li><p>Inlines keep skipping until another rebuild completes</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!s8cJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F471bbf4f-801c-41ac-9a0b-797ef45297a0_800x783.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!s8cJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F471bbf4f-801c-41ac-9a0b-797ef45297a0_800x783.png 424w, https://substackcdn.com/image/fetch/$s_!s8cJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F471bbf4f-801c-41ac-9a0b-797ef45297a0_800x783.png 848w, https://substackcdn.com/image/fetch/$s_!s8cJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F471bbf4f-801c-41ac-9a0b-797ef45297a0_800x783.png 1272w, https://substackcdn.com/image/fetch/$s_!s8cJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F471bbf4f-801c-41ac-9a0b-797ef45297a0_800x783.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!s8cJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F471bbf4f-801c-41ac-9a0b-797ef45297a0_800x783.png" width="800" height="783" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/471bbf4f-801c-41ac-9a0b-797ef45297a0_800x783.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:783,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;mermaid3&quot;,&quot;title&quot;:&quot;mermaid3&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="mermaid3" title="mermaid3" srcset="https://substackcdn.com/image/fetch/$s_!s8cJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F471bbf4f-801c-41ac-9a0b-797ef45297a0_800x783.png 424w, https://substackcdn.com/image/fetch/$s_!s8cJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F471bbf4f-801c-41ac-9a0b-797ef45297a0_800x783.png 848w, https://substackcdn.com/image/fetch/$s_!s8cJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F471bbf4f-801c-41ac-9a0b-797ef45297a0_800x783.png 1272w, https://substackcdn.com/image/fetch/$s_!s8cJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F471bbf4f-801c-41ac-9a0b-797ef45297a0_800x783.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><strong>4. 
Multiple async processors:</strong></p><ul><li><p>Each processor tries to acquire the exclusive lock,</p></li><li><p>First one wins, others wait or skip,</p></li><li><p>Guarantees that only one processor handles a projection at a time.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lA3t!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0788fea4-468c-4de6-8f8d-5c754e82fa93_800x822.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lA3t!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0788fea4-468c-4de6-8f8d-5c754e82fa93_800x822.png 424w, https://substackcdn.com/image/fetch/$s_!lA3t!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0788fea4-468c-4de6-8f8d-5c754e82fa93_800x822.png 848w, https://substackcdn.com/image/fetch/$s_!lA3t!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0788fea4-468c-4de6-8f8d-5c754e82fa93_800x822.png 1272w, https://substackcdn.com/image/fetch/$s_!lA3t!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0788fea4-468c-4de6-8f8d-5c754e82fa93_800x822.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lA3t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0788fea4-468c-4de6-8f8d-5c754e82fa93_800x822.png" width="800" height="822" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0788fea4-468c-4de6-8f8d-5c754e82fa93_800x822.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:822,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;mermaid4&quot;,&quot;title&quot;:&quot;mermaid4&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="mermaid4" title="mermaid4" srcset="https://substackcdn.com/image/fetch/$s_!lA3t!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0788fea4-468c-4de6-8f8d-5c754e82fa93_800x822.png 424w, https://substackcdn.com/image/fetch/$s_!lA3t!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0788fea4-468c-4de6-8f8d-5c754e82fa93_800x822.png 848w, https://substackcdn.com/image/fetch/$s_!lA3t!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0788fea4-468c-4de6-8f8d-5c754e82fa93_800x822.png 1272w, https://substackcdn.com/image/fetch/$s_!lA3t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0788fea4-468c-4de6-8f8d-5c754e82fa93_800x822.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><strong>5. When you add a new projection:</strong></p><ul><li><p>No status row exists yet,</p></li><li><p>Inlines skip automatically,</p></li><li><p>First rebuild creates the row and backfills data.</p></li></ul><h2>TLDR</h2><p>Just like making delicious soup, designing robust, fault-tolerant and performant distributed systems is not that easy. Building an event store is not that hard. But only if we exclude the async processing part.</p><p>I hope this walkthrough covers both the conceptual and practical aspects of handling projection rebuilds.</p><p>We used PostgreSQL and Advisory Locks, as PostgreSQL is cool and is a driving force in <a href="https://github.com/event-driven-io/emmett">Emmett</a>. 
But all the same principles apply to other tools and storage (with their specifics).</p><p>I explained why advisory locks and status columns complement each other:</p><ul><li><p><strong>Advisory locks</strong> handle the fast path (in-memory, no disk I/O for normal operations) and prevent races at transition points (rebuild can&#8217;t start until in-flight inlines finish)</p></li><li><p><strong>Status column</strong> handles crash recovery (persists across connection failures) and new projection bootstrapping (inlines skip until first rebuild completes)</p></li></ul><p>Neither alone is sufficient. Together, they provide the guarantees we need without external infrastructure. Just PostgreSQL doing what PostgreSQL does.</p><p>The cost on the hot path is microseconds: one in-memory lock check, one indexed read on a tiny cached table. For most systems, that&#8217;s an acceptable tradeoff.</p><p>I hope this article also shows you how to use distributed locking in practice.</p><p><strong>Please tell me your thoughts and concerns, especially if you see any blind spots in this design!</strong> You can do that in our <a href="https://discord.gg/fTpqUTMmVa">Emmett Discord</a>, come on in, we have a nice community!</p><p><strong>If you&#8217;re dealing with such issues, I&#8217;m happy to help you through consulting or mentoring. 
<a href="mailto:oskar@event-driven.io">Contact me</a> and we&#8217;ll find a way to unblock you!</strong></p><p>Or check also other related resources:</p><ul><li><p><a href="https://github.com/event-driven-io/emmett/pull/286">Emmett&#8217;s Pull Request implementing described approach</a></p></li><li><p><a href="https://event-driven.io/en/projections_and_read_models_in_event_driven_architecture/">Guide to Projections and Read Models in Event-Driven Architecture</a>,</p></li><li><p><a href="https://www.architecture-weekly.com/p/distributed-locking-a-practical-guide">Distributed Locking: A Practical Guide</a>,</p></li><li><p><a href="https://event-driven.io/en/consumers_processors_in_emmett/">Consumers, projectors, reactors and all that messaging jazz in Emmett</a>,</p></li><li><p><a href="https://event-driven.io/en/how_to_scale_projections_in_the_event_driven_systems/">How to scale projections in the event-driven systems?</a>,</p></li><li><p><a href="https://event-driven.io/en/checkpointing_message_processing/">Checkpointing the message processing</a>,</p></li><li><p><a href="https://event-driven.io/en/lets_talk_about_positions_in_event_stores/">Let&#8217;s talk about positions in event stores</a>.</p></li></ul><p>Cheers!</p><p>Oskar</p><p>p.s. <strong>Ukraine is still under brutal Russian invasion. A lot of Ukrainian people are hurt, without shelter and need help.</strong> You can help in various ways, for instance, directly helping refugees, spreading awareness, putting pressure on your local government or companies. You can also support Ukraine by donating e.g. to <a href="https://www.icrc.org/en/donate/ukraine">Red Cross</a>, <a href="https://savelife.in.ua/en/donate/">Ukraine humanitarian organisation</a> or <a href="https://www.gofundme.com/f/help-to-save-the-lives-of-civilians-in-a-war-zone">donate Ambulances for Ukraine</a>.</p>]]></content:encoded></item><item><title><![CDATA[Architecture Weekly is 5 years old! 
🎉]]></title><description><![CDATA[Yup, I just realised that it happened! Let's do a small celebration and recap how I got here!]]></description><link>https://www.architecture-weekly.com/p/architecture-weekly-is-5-years-old</link><guid isPermaLink="false">https://www.architecture-weekly.com/p/architecture-weekly-is-5-years-old</guid><dc:creator><![CDATA[Oskar Dudycz]]></dc:creator><pubDate>Mon, 29 Dec 2025 10:40:08 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!SC8-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50caeedf-229d-41f2-a0c5-0f1ddec01669_3088x2316.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Welcome to the new week!</p><p>Who said that LinkedIn notifications are useless? I did, several times. Yet! Today, I opened my computer for the first time since last Tuesday and just saw a notification that <strong>Architecture Weekly is 5 years old now!</strong> </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SC8-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50caeedf-229d-41f2-a0c5-0f1ddec01669_3088x2316.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SC8-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50caeedf-229d-41f2-a0c5-0f1ddec01669_3088x2316.jpeg 424w, https://substackcdn.com/image/fetch/$s_!SC8-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50caeedf-229d-41f2-a0c5-0f1ddec01669_3088x2316.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!SC8-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50caeedf-229d-41f2-a0c5-0f1ddec01669_3088x2316.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!SC8-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50caeedf-229d-41f2-a0c5-0f1ddec01669_3088x2316.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!SC8-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50caeedf-229d-41f2-a0c5-0f1ddec01669_3088x2316.jpeg" width="1456" height="1092" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/50caeedf-229d-41f2-a0c5-0f1ddec01669_3088x2316.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1092,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1003858,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.architecture-weekly.com/i/182844029?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50caeedf-229d-41f2-a0c5-0f1ddec01669_3088x2316.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!SC8-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50caeedf-229d-41f2-a0c5-0f1ddec01669_3088x2316.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!SC8-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50caeedf-229d-41f2-a0c5-0f1ddec01669_3088x2316.jpeg 848w, https://substackcdn.com/image/fetch/$s_!SC8-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50caeedf-229d-41f2-a0c5-0f1ddec01669_3088x2316.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!SC8-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50caeedf-229d-41f2-a0c5-0f1ddec01669_3088x2316.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>At this point, you probably know that I&#8217;m not the anniversary-type-of-guy (my wife <em>loves</em> that part of me&#8230;), but well, that&#8217;s a nice milestone!</p><p><strong>Architecture Weekly started as a way to organise my readings.</strong> I&#8217;m a curious guy, and I like to read a lot about different topics. During my career, I worked on a lot of weird projects, which exposed me to many technologies and approaches. It was not always pleasant, but it taught me that technology is a tool, and there&#8217;s a big diversity of tools that can lead you to a proper solution.</p><p><strong>So, I had always had hundreds of open browser tabs until my Firefox crashed from time to time (yes, I&#8217;m a Firefox-type-of-guy).</strong> I&#8217;m sure you know that too. I tried multiple ways to organise that: tools like Notion, e-mail threads, etc. None of them was perfect. At some point, my friends treated me like a librarian and asked me for interesting links on a topic or what interesting things I&#8217;d read recently. I decided to put my readings publicly into a Git repository and use Markdown, so I can easily share them. This repository still exists: <a href="https://github.com/oskardudycz/ArchitectureWeekly">https://github.com/oskardudycz/ArchitectureWeekly</a>. At first, just links organised in groups.</p><p>I started sharing it with my friends, and they told me it was useful, so I started sharing it with their friends. It seemed so useful to others that I shared it on the other channels, socials, and groups I was part of. Feedback was still positive, and some people started pushing me to create the newsletter, even though they could subscribe to GitHub release e-mail notifications; they still landed in the same notification swamp, as usual, with other notifications. So they asked if I could set up &#8220;a proper newsletter&#8221;. 
I was reluctant to do it, but well, as you already know, I did. I used Substack because it let me avoid the usual accidental complexity of writing my own blogging engine from scratch.</p><p>At some point, I decided to try making a paid version of the newsletter. I built a Discord community for paid subscribers and my <a href="https://github.com/sponsors/oskardudycz">GitHub Sponsors</a>. I also started running monthly webinars, including ones with invited guests. </p><p><strong>Complete list of webinars we&#8217;ve had so far:</strong></p><ul><li><p><a href="https://www.architecture-weekly.com/p/webinar-1-from-crud-to-event-sourcing">#1 - From CRUD to Event Sourcing</a></p></li><li><p><a href="https://www.architecture-weekly.com/p/webinar-2-keep-your-streams-short">#2 - Keep your streams short! Or how to model Event-Sourced systems efficiently</a></p></li><li><p><a href="https://www.architecture-weekly.com/p/webinar-3-implementing-distributed">#3 - Implementing Distributed Processes</a></p></li><li><p><a href="https://www.architecture-weekly.com/p/webinar-4-from-cqrs-to-crud-in-practice">#4 - From CRUD to CQRS in Practice</a></p></li><li><p><a href="https://www.architecture-weekly.com/p/webinar-5-architecture-weekly-100">#5 - Architecture Weekly 100 Edition - Live Q&amp;A</a></p></li><li><p><a href="https://www.architecture-weekly.com/p/webinar-6-webinar-with-alexey-zimarev">#6 - Alexey Zimarev - You don&#8217;t need an Event Sourcing framework. 
Or do you?</a></p></li><li><p><a href="https://www.architecture-weekly.com/p/webinar-7-design-and-test-event-driven">#7 - Design and test Event-Driven projections and read models</a></p></li><li><p><a href="https://www.architecture-weekly.com/p/webinar-8-slim-down-your-aggregates">#8 - Slim down your aggregates!</a></p></li><li><p><a href="https://www.architecture-weekly.com/p/webinar-9-radek-maziarka-modularization">#9 - Radek Maziarka - Modularization with Event Storming Process Level</a></p></li><li><p><a href="https://www.architecture-weekly.com/p/webinar-10-postgresql-superpowers">#10 - PostgreSQL Superpowers in Practice</a></p></li><li><p><a href="https://www.architecture-weekly.com/p/webinar-11-maciej-mj-jedrzejewski">#11 - Maciej &#8220;MJ&#8221; J&#281;drzejewski - Evolutionary Architecture: The What. The Why. The How.</a></p></li><li><p><a href="https://www.architecture-weekly.com/p/webinar-12-jeremy-d-miller-simplify">#12 - Jeremy D. Miller: Simplify your architecture with Wolverine</a></p></li><li><p><a href="https://www.architecture-weekly.com/p/webinar-13-yves-goeleven-the-fantastic">#13 - Yves Goeleven - The Fantastic 9</a></p></li><li><p><a href="https://www.architecture-weekly.com/p/webinar-14-mateusz-jendza-why-verified">#14 - Mateusz Jendza - Why Verified Credentials is the Future of Digital Identity!</a></p></li><li><p><a href="https://www.architecture-weekly.com/p/webinar-15-mario-bittencourt-leveraging">#15 - M&#225;rio Bittencourt: Leveraging BPMN for Seamless Team Collaboration in Software Development</a></p></li><li><p><a href="https://www.architecture-weekly.com/p/papers-we-love-1-sagas-hector-garcia">#16 - Papers We Love #1 - Sagas (Hector Garcia-Molina, Kenneth Salem)</a></p></li><li><p><a href="https://www.architecture-weekly.com/p/webinar-16-simple-patterns-for-events">#17 - Simple patterns for events schema versioning</a></p></li><li><p><a href="https://www.architecture-weekly.com/p/webinar-17-andrea-magnorsky-introducing">#18 - 
Andrea Magnorsky: Introducing Bytesize Architecture Sessions!</a></p></li><li><p><a href="https://www.architecture-weekly.com/p/webinar-18-laila-bougria-debug-your">#19 - La&#239;la Bougri&#226;: Debug your thinking</a></p></li><li><p><a href="https://www.architecture-weekly.com/p/papers-we-love-2-how-do-committees">#20 - Papers We Love #2 - How do committees invent? (Melvin E. Conway)</a></p></li><li><p><a href="https://www.architecture-weekly.com/p/webinar-21-michael-drogalis-building">#21 - Michael Drogalis: Building the product on your own terms</a></p></li><li><p><a href="https://www.architecture-weekly.com/p/webinar-22-on-performance-testing">#22 - On Performance Testing with Jaros&#322;aw Pa&#322;ka</a></p></li><li><p><a href="https://www.architecture-weekly.com/p/webinar-23-gojko-adzic-on-designing">#23 - Gojko Adzic on designing product development experiments with Lizard Optimization</a></p></li><li><p><a href="https://www.architecture-weekly.com/p/frontent-architecture-backend-architecture">#24 - Frontent Architecture, Backend Architecture or just Architecture? With Tomasz Ducin</a></p></li><li><p><a href="https://www.architecture-weekly.com/p/applying-observability-from-strategy">#25 - Applying Observability: From Strategy to Practice with Hazel Weakly</a></p></li><li><p><a href="https://www.architecture-weekly.com/p/react-query-a-solution-for-frontend">#26 - React Query: A solution for Frontend State Management challenges? 
With Tomasz Ducin</a></p></li><li><p><a href="https://www.architecture-weekly.com/p/documenting-event-driven-architecture">#27 - Documenting Event-Driven Architecture with EventCatalog and David Boyne</a></p></li><li><p><a href="https://www.architecture-weekly.com/p/practical-introduction-to-event-sourcing">#28 - Practical Introduction to Event Sourcing with Emmett</a></p></li></ul><p><strong>And now, yes, all of them are <a href="https://www.architecture-weekly.com/p/whole-architecture-weekly-content">available for free</a>.</strong> You can watch them. It&#8217;s around 50 hours of free knowledge. I believe that it is of much better quality than some paid courses. Why did I make them public? </p><p><strong>At some point, I realised that I don&#8217;t have much fun with organising those links; it was more of a burden, as curating such lists takes a lot of time.</strong> I&#8217;m also a perfectionist-type-of-guy, so I wanted to ensure that what I recommend is worth the read. It triggers the movement of brain cells. So if you read the article, you might disagree with it, but at least it&#8217;s sparking some thought. I also wanted to synthesise them and put my comments. It may seem like a simple thing to provide such articles. But try to keep it at the proper level week after week, and you&#8217;ll see it might not be as easy as it seems.</p><p><strong>There were not enough paid subscribers to justify the time spent on it, even though I was named a <a href="https://www.architecture-weekly.com/p/architecture-weekly-became-the-substack">Substack Bestseller</a>.</strong> Yes, running a newsletter is not a great business plan. There are some exceptions to that, but even if you&#8217;re in the top few per cent, then it might not be enough to even justify the time spent on it.</p><p>I decided to try a different angle and made Architecture Weekly a fully paid newsletter. 
It started with a <a href="https://www.architecture-weekly.com/p/architecture-weekly-189-mastering">deep dive into Database Connection Pooling</a>. I tried to see if I could grind for a while, providing high-quality content that&#8217;s not available elsewhere. I think I managed to deliver a lot of articles like that, and the paid subscriber count grew, but it was still not enough to justify my work. I enjoyed writing such articles a lot, and I still do, but it has put more pressure on me. And preparing such content takes time and effort: effort I could spend elsewhere, e.g., with my family, in better-paid work like my <a href="https://event-driven.io/en/training/">workshops and consulting</a>, or in my <a href="https://github.com/event-driven-io/">OSS projects</a>.</p><p>And now, here we are, 10 months later. I still delivered a lot of deep dives, but shifted closer to the stuff I do daily: helping other humans build systems and benefit from the Event-Driven approach. As you can see, I&#8217;ve recently been cross-posting articles to my <a href="https://event-driven.io">blog</a> and here. I also used the angle of my work on tools like Emmett to explain general concepts. </p><p><strong>What&#8217;s next?</strong> I&#8217;ll probably continue doing it as is; I might also consider merging the blog and newsletter into one. I even own the domain event-driven.news, so maybe some rebranding will happen. But it&#8217;s not yet decided.</p><p> I&#8217;ll definitely continue sharing what I learned and blogging, but the exact form? We&#8217;ll see. I&#8217;m open to suggestions! Comment under this post and tell me your thoughts. </p><p><strong>What type of content would you like to see here? 
Or is there something that you&#8217;d even be open to paying for?</strong> </p><p>I can&#8217;t promise that I&#8217;ll deliver all of it, but it&#8217;ll give me food for thought.</p><p>Nevertheless, it&#8217;s time for a small celebration!</p><p>Thanks for being here with me through this journey!</p><p>Cheers!</p><p>Oskar</p><p></p>]]></content:encoded></item><item><title><![CDATA[Multi-tenancy and dynamic messaging workload distribution]]></title><description><![CDATA[This time, I discussed how to set up and scale message processing when you have a multi-tenant environment or another dynamic setup. You'll see considerations on where to scale horizontally, where vertically, and why that matters to you]]></description><link>https://www.architecture-weekly.com/p/multi-tenancy-and-dynamic-messaging</link><guid isPermaLink="false">https://www.architecture-weekly.com/p/multi-tenancy-and-dynamic-messaging</guid><dc:creator><![CDATA[Oskar Dudycz]]></dc:creator><pubDate>Mon, 15 Dec 2025 15:50:13 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/32fd7ec9-9275-4d0d-be17-732832a1d518_800x500.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>There are several reasons why I&#8217;m blogging.</p><p>The first one is that I forget, and writing helps me to remember and organise my findings.</p><p>The other is that I like to share my journey with the intention of sparing you, my dear reader, some of my struggles.</p><p>Last but not least, it gives me the chance to learn from the discussions my articles inspire. 
It&#8217;s always a chance to meet new perspectives, correct what I did or just trigger some recollection.</p><p>After the article about <a href="https://event-driven.io/en/consumers_processors_in_emmett/">consumer, processors and all that messaging jazz</a>, there was a great discussion on <a href="https://discord.gg/fTpqUTMmVa">Emmett Discord</a>. There were a lot of interesting threads, but the one that inspired what you&#8217;re reading today came from a question by <a href="https://ismaelcelis.com/">Ismael Celis</a>. Ismael was curious about:</p><h2>Distributing the workload dynamically between processors</h2><p>That&#8217;s something I hadn&#8217;t planned to deliver initially, but it&#8217;s a great question, so let me put down some notes on my plan around it.</p><p>Why would we want to distribute the workload dynamically? What does that even mean?</p><p>Let&#8217;s start from the classical approach. We can define projections that will build read models based on the incoming events. In <a href="https://github.com/event-driven-io/emmett">Emmett</a>, it can look like this when <a href="https://event-driven-io.github.io/emmett/getting-started.html#read-models">using Pongo as the storage tool</a>:</p><pre><code><code>const cartsSummaryProjection = pongoSingleStreamProjection({
  collectionName: shoppingCartsSummaryCollectionName,
  getDocumentId: (event) =&gt; event.data.shoppingCartId,
  evolve,
  canHandle: ['ProductItemAdded', 'ShoppingCartConfirmed'],
  initialState: () =&gt; ({
    status: 'pending',
    productItemsCount: 0,
  }),
})</code></code></pre><p>This means that the projection handles only the <em>ProductItemAdded</em> and <em>ShoppingCartConfirmed</em> event types. It&#8217;ll insert or update rows in the table based on the shopping cart id from the event data.</p><p>We can plug this projection into a projector (the specific type of processor responsible for running projections).</p><pre><code><code>const cartsSummaryProjector = postgreSQLProjector({
  processorId: 'shoppingCartSummary',
  projection: cartsSummaryProjection,
});</code></code></pre><p>And plug it into the consumer:</p><pre><code><code>const consumer = postgreSQLEventStoreConsumer({
  connectionString,
  processors: [
    cartsSummaryProjector
  ],
});</code></code></pre><p>Now, the consumer will poll the PostgreSQL event store and pipe events, filtered by type, into our projector.</p><p>That&#8217;s simple: we know the filtering criteria, as the consumer knows which event types its processor(s) can handle.</p><p>Things get harder when we don&#8217;t know those criteria upfront. When can that happen? What if we were building an e-commerce SaaS product that lets shop owners buy a subscription to run their shops? Then we wouldn&#8217;t know all of our tenants upfront. They&#8217;d register over time.</p><h2>What options do we have for such a multi-tenant setup?</h2><p>The simplest option would be to add a custom filter that allows filtering events on the consumer by the tenant.</p><p><a href="https://github.com/event-driven-io/emmett">Emmett</a> doesn&#8217;t support such a feature yet, but it could (and will at some point). We could extend our event metadata to contain the tenant and spin up a consumer for a dedicated tenant.</p><p>We could even wrap it into a dedicated function.</p><pre><code><code>function tenantedConsumer(
  connectionString: string,
  tenant: string,
  processors: MessageProcessor[]) {
  return postgreSQLEventStoreConsumer({
    connectionString,
    processors,
    // filterBy is a hypothetical option; Emmett doesn't support it yet
    filterBy: (event) =&gt; event.metadata.tenant === tenant,
  });
}</code></code></pre><p>Of course, this wouldn&#8217;t be fully dynamic, but we could make it dynamic at the infrastructure level, e.g., by passing the tenant ID from an environment variable and spinning up a new Container (e.g., a Kubernetes Pod).</p><pre><code><code>const consumer = tenantedConsumer(
  process.env.PG_CONNECTION_STRING,
  process.env.TENANT_ID,
  [ cartsSummaryProjector ],
);</code></code></pre><p>We could also start a new job from the API endpoint, running it in the existing deployment; there are plenty of options. You can go wild and think about other ways.</p><p>Still, logically, they would look more or less like that. Either you&#8217;re scaling horizontally and sharding tenants physically:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!XKNu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61c397bb-090d-4464-be3d-b52f5fa3e483_800x640.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!XKNu!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61c397bb-090d-4464-be3d-b52f5fa3e483_800x640.png 424w, https://substackcdn.com/image/fetch/$s_!XKNu!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61c397bb-090d-4464-be3d-b52f5fa3e483_800x640.png 848w, https://substackcdn.com/image/fetch/$s_!XKNu!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61c397bb-090d-4464-be3d-b52f5fa3e483_800x640.png 1272w, https://substackcdn.com/image/fetch/$s_!XKNu!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61c397bb-090d-4464-be3d-b52f5fa3e483_800x640.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!XKNu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61c397bb-090d-4464-be3d-b52f5fa3e483_800x640.png" width="800" height="640" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/61c397bb-090d-4464-be3d-b52f5fa3e483_800x640.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:640,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;containers&quot;,&quot;title&quot;:&quot;containers&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="containers" title="containers" srcset="https://substackcdn.com/image/fetch/$s_!XKNu!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61c397bb-090d-4464-be3d-b52f5fa3e483_800x640.png 424w, https://substackcdn.com/image/fetch/$s_!XKNu!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61c397bb-090d-4464-be3d-b52f5fa3e483_800x640.png 848w, https://substackcdn.com/image/fetch/$s_!XKNu!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61c397bb-090d-4464-be3d-b52f5fa3e483_800x640.png 1272w, https://substackcdn.com/image/fetch/$s_!XKNu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F61c397bb-090d-4464-be3d-b52f5fa3e483_800x640.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Or you&#8217;re scaling vertically and running multiple tenants inside your box. 
Then, instead of spinning up new containers, you&#8217;re spinning up new jobs (processes, threads, virtual threads, etc.).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VKZ6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2494b73-0096-410a-8e04-676c899aa51c_800x494.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VKZ6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2494b73-0096-410a-8e04-676c899aa51c_800x494.png 424w, https://substackcdn.com/image/fetch/$s_!VKZ6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2494b73-0096-410a-8e04-676c899aa51c_800x494.png 848w, https://substackcdn.com/image/fetch/$s_!VKZ6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2494b73-0096-410a-8e04-676c899aa51c_800x494.png 1272w, https://substackcdn.com/image/fetch/$s_!VKZ6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2494b73-0096-410a-8e04-676c899aa51c_800x494.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VKZ6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2494b73-0096-410a-8e04-676c899aa51c_800x494.png" width="800" height="494" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f2494b73-0096-410a-8e04-676c899aa51c_800x494.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:494,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;threads&quot;,&quot;title&quot;:&quot;threads&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="threads" title="threads" srcset="https://substackcdn.com/image/fetch/$s_!VKZ6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2494b73-0096-410a-8e04-676c899aa51c_800x494.png 424w, https://substackcdn.com/image/fetch/$s_!VKZ6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2494b73-0096-410a-8e04-676c899aa51c_800x494.png 848w, https://substackcdn.com/image/fetch/$s_!VKZ6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2494b73-0096-410a-8e04-676c899aa51c_800x494.png 1272w, https://substackcdn.com/image/fetch/$s_!VKZ6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2494b73-0096-410a-8e04-676c899aa51c_800x494.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>That&#8217;s nothing special for Emmett; other tools also do this, e.g., <a href="https://docs.spring.io/spring-kafka/reference/kafka/receiving-messages/message-listener-container.html#using-ConcurrentMessageListenerContainer">Spring Boot Kafka concurrent listeners</a>.</p><p>And hey, btw. there&#8217;s one more reason why I&#8217;m writing about my Emmett design: to let you benchmark your design against it. 
Even if you&#8217;re not planning to use it, those are considerations for you either as:</p><ul><li><p>an internal tooling creator,</p></li><li><p>a user of other OSS tooling, to check how they solved it.</p></li></ul><p>Separating tenants through sharding:</p><ul><li><p>gives you a better option to scale and align resources with the specific tenant workload,</p></li><li><p>makes full separation possible in terms of networking, storage, etc.,</p></li><li><p>can be more costly.</p></li></ul><p>Separating tenants through partitioning the load inside one box:</p><ul><li><p>is usually cheaper,</p></li><li><p>is easier to manage,</p></li><li><p>doesn&#8217;t give you full separation and can run into the <a href="https://learn.microsoft.com/en-us/azure/architecture/antipatterns/noisy-neighbor/noisy-neighbor">Noisy Neighbour issue</a>.</p></li></ul><p>So there&#8217;s no golden rule for choosing; it depends on your tooling, needs, etc.</p><p>You can, of course, have a mixed solution: run most of the small tenants in the same box, and give the ones with bigger security needs a special setup. That&#8217;s especially true as the dynamic split doesn&#8217;t have to be only per tenant. You can, e.g., split per region, per product range, per chain, etc.</p><p>Also, such a dedicated setup can work for a limited number of dynamic options. If you have 1000 tenants, would you spin up 1000 containers or jobs? You can, but just because you can doesn&#8217;t mean that you should. The more containers or threads you have, the more you pay in coordination costs. At some point, it&#8217;s better to group processing into a manageable number of containers or threads.</p><p>Ok, but how do we know which messages to process where? We could use a <a href="https://en.wikipedia.org/wiki/Consistent_hashing">consistent hash</a>. 
I wrote about it in detail in <a href="https://www.architecture-weekly.com/p/understanding-kafkas-consumer-protocol">Understanding Kafka&#8217;s Consumer Protocol: A Deep Dive into How Consumers Talk to Brokers</a>.</p><p>Kafka, by default, partitions its data on the producer side. The topic represents a logical split (e.g. all messages from the E-Commerce module), and the partition represents the physical layout. Consumers within a consumer group receive messages. Kafka guarantees that exactly one consumer from the consumer group will receive messages from a specific partition. If we have fewer consumers than partitions, then, of course, a single consumer gets more than one partition to handle.</p><p>The pseudocode for distributing load to consumers could look as follows:</p><pre><code><code>const partitionId = message.headers.partitionId;
const hash = consistentHashFunction(partitionId);
const consumerId = hash % totalNumberOfConsumersWithinGroup;</code></code></pre><p>In our case, we could use the tenant id as the partition id.</p><p>Kafka&#8217;s strategy is also simpler than we might need. The partitioning is done on the producer side. The producer sends a message to the topic that already has a certain number of partitions:</p><pre><code><code>const hash = consistentHashFunction(message.headers.recordId);
const partition = hash % totalNumberOfPartitionsWithinTopic;</code></code></pre><p>In our case, if we&#8217;d like to allow distribution by any property, we&#8217;d need to load the message, read the field we want to partition/shard on, and send it to the appropriate container or thread.</p><p>The mechanism can get pretty hefty. You&#8217;d need a more sophisticated mechanism to know where to spin up what, how to distribute the load between containers or threads, and to make it resilient. We&#8217;re getting into the area of distributed consensus algorithms such as <a href="https://raft.github.io/">Raft</a> and <a href="https://en.wikipedia.org/wiki/Paxos_(computer_science)">Paxos</a>.</p><p>Do I want to go into this area with Emmett? Definitely not for free! I don&#8217;t think that&#8217;d be even worth it, as we have mature solutions like Kafka, RabbitMQ, and other messaging systems that <a href="https://cwiki.apache.org/confluence/display/KAFKA/KIP-500%3A+Replace+ZooKeeper+with+a+Self-Managed+Metadata+Quorum">implement such algorithms and specialise in that</a>. I&#8217;d prefer to make it easier to forward messages to them and let them do the work.</p><p>What is definitely on the plate is the second option, which allows partitioned producers. You could define it as such:</p><pre><code><code>const cartsSummaryProjector = postgreSQLProjector({ 
  processorId: 'shoppingCartSummary',
  projection: cartsSummaryProjection,
  partitionBy: (event) =&gt; event.metadata.tenant,
});</code></code></pre><p>Then the consumer would poll messages from all tenants, forward them to projectors, and the projectors would internally spin up <a href="https://nodejs.org/api/worker_threads.html">worker threads</a> per tenant or group them by consistent hashing.</p><p>If we add to that distributed locking or <a href="https://event-driven.io/en/checkpointing_message_processing/">detecting conflicting checkpointing</a> to ensure that there&#8217;s only one worker instance handling messages for the processor and partition id, then this should be good enough for the majority of cases.</p><p>Keeping in mind the flexibility with which you can group (or not) projectors within consumers, you can define your own topologies.</p><p>What are your thoughts? How do you deal with such cases? Would you like me to expand more on some cases?</p><p>Or maybe you&#8217;d like to help me and <a href="https://github.com/sponsors/event-driven-io">sponsor my work in this area in Emmett</a>? Then your project could also benefit faster from it!</p><p>Cheers!</p><p>Oskar</p><p>p.s. <strong>Ukraine is still under brutal Russian invasion. A lot of Ukrainian people are hurt, without shelter and need help.</strong> You can help in various ways, for instance, directly helping refugees, spreading awareness, putting pressure on your local government or companies. You can also support Ukraine by donating e.g. to <a href="https://www.icrc.org/en/donate/ukraine">Red Cross</a>, <a href="https://savelife.in.ua/en/donate/">Ukraine humanitarian organisation</a> or <a href="https://www.gofundme.com/f/help-to-save-the-lives-of-civilians-in-a-war-zone">donate Ambulances for Ukraine</a>.</p>]]></content:encoded></item><item><title><![CDATA[Checkpointing the message processing]]></title><description><![CDATA[I realised that checkpointing our (message) processing is similar to using cheat codes in Super Frog and similar old games. We need to recover and use the code to get where we were. 
Of course, sometimes that means going through half of the level again, but well, better than from the beginning. I also shared a PostgreSQL-based implementation showing how to do it, and how checkpointing can also help with a generic idempotence check (with caution) and with detecting noisy neighbours (with understanding).]]></description><link>https://www.architecture-weekly.com/p/checkpointing-the-message-processing</link><guid isPermaLink="false">https://www.architecture-weekly.com/p/checkpointing-the-message-processing</guid><dc:creator><![CDATA[Oskar Dudycz]]></dc:creator><pubDate>Mon, 08 Dec 2025 15:15:49 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!PhAK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ec7cb38-3378-4779-9848-eed89246e8c0_800x500.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Let&#8217;s start by asking you two questions.</p><ol><li><p><strong>What does <a href="https://en.wikipedia.org/wiki/Superfrog">Super Frog</a> have to do with messaging?</strong></p></li><li><p><strong>When was the last time you wrote IF statements in SQL?</strong> If it&#8217;s been a long time, have you at least seen them? If not, (don&#8217;t) worry, you&#8217;ll see them today.</p></li></ol><p>Will it be a post about weird SQL usage? 
Not necessarily.</p><p><strong>We&#8217;ll talk today about checkpointing our processing.</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PhAK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ec7cb38-3378-4779-9848-eed89246e8c0_800x500.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PhAK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ec7cb38-3378-4779-9848-eed89246e8c0_800x500.png 424w, https://substackcdn.com/image/fetch/$s_!PhAK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ec7cb38-3378-4779-9848-eed89246e8c0_800x500.png 848w, https://substackcdn.com/image/fetch/$s_!PhAK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ec7cb38-3378-4779-9848-eed89246e8c0_800x500.png 1272w, https://substackcdn.com/image/fetch/$s_!PhAK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ec7cb38-3378-4779-9848-eed89246e8c0_800x500.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PhAK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ec7cb38-3378-4779-9848-eed89246e8c0_800x500.png" width="800" height="500" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1ec7cb38-3378-4779-9848-eed89246e8c0_800x500.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:500,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;2025 12 08 cover&quot;,&quot;title&quot;:&quot;2025 12 08 cover&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="2025 12 08 cover" title="2025 12 08 cover" srcset="https://substackcdn.com/image/fetch/$s_!PhAK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ec7cb38-3378-4779-9848-eed89246e8c0_800x500.png 424w, https://substackcdn.com/image/fetch/$s_!PhAK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ec7cb38-3378-4779-9848-eed89246e8c0_800x500.png 848w, https://substackcdn.com/image/fetch/$s_!PhAK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ec7cb38-3378-4779-9848-eed89246e8c0_800x500.png 1272w, https://substackcdn.com/image/fetch/$s_!PhAK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ec7cb38-3378-4779-9848-eed89246e8c0_800x500.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" 
stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I&#8217;ve started my relationship with computers with games. I still have my Amiga 500. In those days, computers didn&#8217;t always have a hard disk. You&#8217;ve got a bunch of diskettes with different chapters of the game. Not all of them were simple games; many were quite sophisticated, and it took some time to finish them.</p><p>Yet they were dealing with limited diskette space, so it was best if they didn&#8217;t have to use any of it. How can you then allow you to stop playing and return to the previous state? Or how to not force you to start from the beginning of the game when you fell from the platform, and well, you died? 
You died in the game, of course; that, at least, you should be able to recover from, right?</p><p>As mentioned earlier, the limited space on diskettes and the additional complexity that came with it led many game makers to adopt a simple solution: checkpoints with codes.</p><p><strong>After you passed a level, you got a code you could type when you started the game, and instead of starting from the beginning, you could go directly to the place where you left off.</strong> That worked pretty well for platform and race car games, since the game&#8217;s storyline was always the same, immutable. If you had to go to level 27, the starting point and your character would always look the same. Of course, for RPG and strategy games, that&#8217;s a different story.</p><p>Surprisingly, this parallel also matches the recovery from a business process failure.</p><p>Let&#8217;s say we&#8217;re using message-based communication to streamline our process and make it more resilient. We don&#8217;t want to make it vulnerable to scenarios where we store information in one system, our process dies, and we don&#8217;t manage to notify the other parts.</p><p>We&#8217;re using the <a href="https://event-driven.io/en/outbox_inbox_patterns_and_delivery_guarantees_explained/">Outbox pattern</a> to enable that technically. We&#8217;re storing messages in a relational table within the same transaction that updates the state after running business logic. Thanks to that, either the state is updated and the message is scheduled, or neither happens. 
We&#8217;re getting (eventual) consistency thanks to that.</p><p>Now we&#8217;re on the receiving end, which is where we left off in the previous article explaining <a href="https://event-driven.io/en/consumers_processors_in_emmett/">Consumers, Processors and all that jazz</a>.</p><p>Let&#8217;s say that we&#8217;re using PostgreSQL and our Outbox structure looks as explained in the <a href="https://event-driven.io/en/ordering_in_postgres_outbox/">other article</a>:</p><pre><code><code>CREATE TABLE outbox(
   -- the autoincremented position of the message to respect the order
   position        BIGSERIAL                 PRIMARY KEY,
   -- used to detect gaps in numbering
   transaction_id    xid8 NOT NULL,
   -- unique message id, which can be used for deduplication or idempotency
   message_id       VARCHAR(250)             NOT NULL,
   -- the message type, e.g. `TransactionRecorded`
   message_type     VARCHAR(250)             NOT NULL,
   -- serialised message data, e.g. to JSON
   data             JSONB                    NOT NULL,
   -- diagnostic information on when the message was scheduled
   scheduled       TIMESTAMP WITH TIME ZONE  NOT NULL    default (now())
);</code></code></pre><p>As you can see, besides the message ID, type, and data, we&#8217;re also storing the (global) position number and the transaction ID (to ensure we don&#8217;t skip in-flight transactions that have already requested a global position number; read more <a href="https://event-driven.io/en/ordering_in_postgres_outbox/">here for the reasoning</a>).</p><p>Now we can poll it with a query like the one below, where last_processed_transaction_id and last_processed_position stand for the values from our last checkpoint:</p><pre><code><code>SELECT 
     position, message_id, message_type, data
FROM
     outbox
WHERE
     (
          (transaction_id = last_processed_transaction_id
               AND position &gt; last_processed_position)
          OR
          (transaction_id &gt; last_processed_transaction_id)
     )
     AND transaction_id &lt; pg_snapshot_xmin(pg_current_snapshot())
ORDER BY
    transaction_id ASC,
    position ASC
LIMIT 100;</code></code></pre><p>Now, thanks to that, we can have the global ordering guarantee on the receiving end. We&#8217;re trading a bit of performance for greater correctness. That&#8217;s not always acceptable, but for internal module communication or forwarding to the messaging system, that&#8217;s usually more than enough.</p><p>That&#8217;s also the place where we&#8217;re back to our checkpointing. How do we know the last processed position?</p><p>By default, that&#8217;s simple; we could either say:</p><ul><li><p><strong>start from the beginning</strong> and use -1 or some other hardcoded position,</p></li><li><p><strong>start from the end</strong> and use the position of the last message currently in the table (the query above only returns positions greater than the one we pass).</p></li></ul><p>Those cases can be fine when we need to handle all messages (e.g., when adding a new process or read model), or when we don&#8217;t care about the past and only need to process the newest notifications.</p><p>Still, in reality, just like in the old games, we&#8217;d like to start somewhere in the middle, precisely where we left off.</p><p>We need one more table for storing our checkpoints. It can look as follows:</p><pre><code><code>CREATE TABLE processor_checkpoints
(
   -- subscription name
   processor_id          VARCHAR(100) PRIMARY KEY,
   -- information about the position of the last processed message
   last_processed_position  BIGINT       NULL,
   -- used to detect gaps in numbering
   last_processed_transaction_id    xid8 NOT NULL
);</code></code></pre><p>Simple stuff: the processor ID should be unique and reflect the processor&#8217;s logical name, and next to it we keep the last processed position and transaction ID.</p><p>If that looks simple, then let me follow up with the next set of potentially simple questions:</p><ul><li><p>How to store it?</p></li><li><p>When to store it?</p></li></ul><p>If your answer is: &#8220;just do an upsert statement&#8221;, then you&#8217;re kinda right, but that wouldn&#8217;t be simple; it&#8217;d be an oversimplification. At least if you&#8217;d like to run it in production.</p><p>Let&#8217;s start with how. For that, let me bring you the promised stored procedure with IF statements; it&#8217;s a slightly simplified version of the one from Emmett:</p><pre><code><code>CREATE OR REPLACE FUNCTION store_processor_checkpoint(
  p_processor_id VARCHAR(100),
  p_position BIGINT,
  p_expected_position BIGINT,
  p_transaction_id xid8
) RETURNS INT AS $$
DECLARE
  current_position BIGINT;
BEGIN
  -- Handle the case when p_expected_position is provided
  IF p_expected_position IS NOT NULL THEN
      -- Try to update if the stored position matches p_expected_position
      UPDATE processor_checkpoints
      SET 
        "last_processed_position" = p_position, 
        "last_processed_transaction_id" = p_transaction_id
      WHERE "processor_id" = p_processor_id AND "last_processed_position" = p_expected_position;

      IF FOUND THEN
          RETURN 1;  -- Successfully updated
      END IF;

      -- Retrieve the current position
      SELECT "last_processed_position" INTO current_position
      FROM processor_checkpoints
      WHERE "processor_id" = p_processor_id;

      -- Return appropriate codes based on current position
      IF current_position = p_position THEN
          RETURN 0;  -- Idempotent check: position already set
      ELSIF current_position &gt; p_expected_position THEN
          RETURN 2;  -- Failure: current position is greater
      ELSE
          RETURN 3;  -- Default failure case for mismatched positions
      END IF;
  END IF;

  -- Handle the case when p_expected_position is NULL: Insert if not exists
  BEGIN
      INSERT INTO processor_checkpoints ("processor_id", "last_processed_position", "last_processed_transaction_id")
      VALUES (p_processor_id, p_position, p_transaction_id);
      RETURN 1;  -- Successfully inserted
  EXCEPTION WHEN unique_violation THEN
      -- If insertion failed, it means the row already exists
      SELECT "last_processed_position" INTO current_position
      FROM processor_checkpoints
      WHERE "processor_id" = p_processor_id;

      IF current_position = p_position THEN
          RETURN 0;  -- Idempotent check: position already set
      -- p_expected_position is NULL in this branch, so compare against the new position instead
      ELSIF current_position &gt; p_position THEN
          RETURN 2;  -- Insertion failed, row already exists with a greater position
      ELSE
          RETURN 3;  -- Default failure case for mismatched positions
      END IF;
  END;
END;
$$ LANGUAGE plpgsql;</code></code></pre><p>Oooh, even from copying and pasting, I&#8217;m already tired; there&#8217;s a fair reason why we&#8217;re not doing that too often nowadays.</p><p>Let me untangle that for you:</p><ol><li><p>We&#8217;re trying to update the existing position or insert it if we&#8217;re storing it for the first time.</p></li><li><p>If all went fine, we&#8217;re returning 1 as a result to denote complete success.</p></li><li><p>If the checkpoint in the database already has the value we wanted to store, we&#8217;re returning 0.</p></li><li><p>If the stored checkpoint differs from the expected one and is further ahead, we&#8217;re returning 2.</p></li><li><p>Otherwise, we&#8217;re returning 3, which means that the stored checkpoint differs from the expected one and is older.</p></li></ol><p>Essentially, by passing the expected position, we can detect whether we:</p><ul><li><p>already handled the specific position,</p></li><li><p>have some competing instance of our processor handling our data.</p></li></ul><p>That&#8217;s why we&#8217;re doing this fancy dance with IF statements and a stored procedure.</p><p>Detection assumes that we have a global ordering guarantee for processing (thus the tricky bits with the transaction ID).</p><p><strong>It also shows why global ordering is useful.</strong></p><p>By detecting that we&#8217;ve already handled a specific position, we can skip processing it, handling idempotency at the processor level.</p><p>By detecting that there&#8217;s another processor with the same id processing messages, we can make it more resilient and detect the <em>noisy neighbour</em> issue.</p><p>How would that look in the code?</p><pre><code><code>async function handleBatch(messageBatch: RecordedMessage[], context: ProcessorContext): Promise&lt;BatchHandlingResult&gt; {
  const { checkpoint } = messageBatch[messageBatch.length - 1].metadata;

  return context.pool.withTransaction(async (transaction) =&gt; {
    for (const message of messageBatch) {
      await context.onMessage(message);      
    }

    // No error was thrown: proceed to store checkpoint of the last processed message
    const result = await storeProcessorCheckpoint(transaction.execute, {
      processorId: context.processorId,
      newCheckpoint: checkpoint,
      lastProcessedCheckpoint: context.lastProcessedCheckpoint,
    });

    if(result.success) {
      await transaction.commit();
    } else {
      // nothing more to do here: either we already handled this batch,
      // or the expected and stored checkpoints don't match
      await transaction.rollback();
    }

    return result;
  });
}

type ProcessorContext = {
  processorId: string;
  lastProcessedCheckpoint: bigint | null;
  onMessage: (message: AnyMessage) =&gt; Promise&lt;void&gt;;
  pool: ConnectionPool;
}

type BatchHandlingResult =
  | {
      success: true;
      newCheckpoint: bigint | null;
 }
  | { success: false; reason: 'IGNORED' | 'FURTHER' | 'OLDER' };
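
// A hedged sketch, not part of Emmett's API: one way a polling loop could react
// to the outcome above. `nextPollingAction` and its return values are hypothetical names.
// 'IGNORED' means the batch was already handled, so it is safe to move on;
// 'FURTHER' or 'OLDER' suggests a competing processor instance, so this one stops.
function nextPollingAction(
  result: { success: boolean; reason?: string },
): 'CONTINUE' | 'SKIP' | 'STOP' {
  if (result.success) return 'CONTINUE';
  return result.reason === 'IGNORED' ? 'SKIP' : 'STOP';
}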

async function storeProcessorCheckpoint(
  execute: SQLExecutor,
  options: {
    processorId: string;
    newCheckpoint: bigint | null;
    lastProcessedCheckpoint: bigint | null;
    partition?: string;
 },
): Promise&lt;BatchHandlingResult&gt; {
  // Call the stored procedure; pg_current_xact_id() ties the checkpoint to the current transaction
  const { result } = await single(
    execute.command&lt;{ result: 0 | 1 | 2 | 3 }&gt;(
      SQL`SELECT store_processor_checkpoint(
          ${options.processorId}, 
          ${options.newCheckpoint}, 
          ${options.lastProcessedCheckpoint}, 
          pg_current_xact_id()
      ) as result;`,
    ),
  );

  // 1 = stored, 0 = already stored, 2 = stored checkpoint is further ahead, 3 = it is older
  return result === 1
    ? { success: true, newCheckpoint: options.newCheckpoint! }
    : { success: false, reason: result === 0 ? 'IGNORED' : result === 2 ? 'FURTHER' : 'OLDER' };
}</code></code></pre><p>As you can see, thanks to:</p><ul><li><p>global ordering,</p></li><li><p>checkpoint detection,</p></li><li><p>storing the checkpoint where our side effects will be stored,</p></li><li><p>transactional capabilities of our end storage,</p></li></ul><p>we can ensure that either the entire batch is processed or none of it is. We could even optimise it by storing the checkpoint first and not processing the business logic if there&#8217;s a mismatch, then committing only if the logic succeeds.</p><p>That gives us a generic idempotence check and detection of the noisy neighbour.</p><p>Of course, I still believe that <a href="https://event-driven.io/en/idempotent_command_handling/">idempotence check should happen on the business logic side</a>. But why not both?</p><p>Being able to detect a noisy neighbour can help you automatically stop (or pause) one of the competing consumers and avoid inconsistency conflicts.</p><p><strong>What are the tradeoffs of this approach?</strong></p><ol><li><p>This will work if we have a global ordering guarantee. Not many messaging solutions give us one. If we have a subscription-based outbox as explained, an event store like <a href="https://github.com/event-driven-io/emmett">Emmett</a>, Marten or KurrentDB, or Kafka (where ordering holds within a partition), this will work, but not necessarily for solutions like RabbitMQ, SQS, Google Pub Sub, etc.</p></li><li><p>This works best if you have transaction capabilities. Batching updates generally improves performance, but sometimes can lead to long-lived transactions; beware of that. The subscription-based solution with a transaction ID also works best if your transactions are short. If they&#8217;re open for a long time, it can cause delays.</p></li><li><p>Still, even without transactions, it will work fine, as long as you&#8217;re fine with having retries more often. If the business logic fails and the checkpoint is not committed, it&#8217;ll reprocess the already-handled messages starting from the previously stored checkpoint. 
That means they can be handled more than once, but you should not lose any messages. You may also not fully benefit from the idempotence check for skipping already handled messages.</p></li></ol><p>As always, the choice is yours.</p><p>Still, I hope that this article showed you:</p><ul><li><p>why having a global ordering guarantee can be useful,</p></li><li><p>why and how to checkpoint your processing, and how that relates to level codes from old games like Super Frog,</p></li><li><p>what the tradeoffs are, and how to consider them,</p></li><li><p>&#8230;and that SQL IF statements are sometimes justified. But don&#8217;t go wild with them!</p></li></ul><p><strong>And hey, I also hope that&#8217;s not something that you&#8217;d like to maintain on your own. There are mature tools to deal with such stuff, like Emmett, which implements this for you.</strong></p><p>What are your thoughts? Questions? Concerns?</p><p>Cheers!</p><p>Oskar</p><p>p.s. <strong>Ukraine is still under brutal Russian invasion. A lot of Ukrainian people are hurt, without shelter and need help.</strong> You can help in various ways, for instance, directly helping refugees, spreading awareness, putting pressure on your local government or companies. You can also support Ukraine by donating e.g. to <a href="https://www.icrc.org/en/donate/ukraine">Red Cross</a>, <a href="https://savelife.in.ua/en/donate/">Ukraine humanitarian organisation</a> or <a href="https://www.gofundme.com/f/help-to-save-the-lives-of-civilians-in-a-war-zone">donate Ambulances for Ukraine</a>.</p>]]></content:encoded></item><item><title><![CDATA[Consumers, projectors, reactors and all that messaging jazz in Emmett]]></title><description><![CDATA[Let me show you today how I designed messaging abstractions in Emmett, and why I split them into Consumers and Processors. Why I added abstractions like Projector and Reactor, and the trade-offs I chose. 
I think this can be a good learning experience on how messaging systems are built and the reasoning behind them!]]></description><link>https://www.architecture-weekly.com/p/consumers-projectors-reactors-and</link><guid isPermaLink="false">https://www.architecture-weekly.com/p/consumers-projectors-reactors-and</guid><dc:creator><![CDATA[Oskar Dudycz]]></dc:creator><pubDate>Mon, 01 Dec 2025 16:02:26 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/3a5c887e-55fd-47e3-a1e1-469bb0834f29_800x683.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>Did you know that you can build an event store in one hour?</strong> I even did it a few times on the conference stage. Actually, it took me usually around 25 minutes; the rest was mistyping, lame jokes and a bit of explanation. See:</p><div id="youtube2-gaoZdtQSOTo" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;gaoZdtQSOTo&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/gaoZdtQSOTo?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p><strong>Yet, my final thought was: Kids don&#8217;t do it at home.</strong></p><p>It&#8217;s a fun coding exercise, but using the outcome in production? Not as much fun running and maintaining it. Why though? How hard can it be?</p><p>Event Sourcing systems have two phases: appending events and processing them afterwards. The write side gets most of the attention in tutorials and talks - commands, deciders, event stores, optimistic concurrency, as you saw, I&#8217;m also one to blame.</p><p>Providing the guarantees on the write side is relatively simple, especially if you use a database like PostgreSQL as a storage. 
You need to provide features like:</p><ul><li><p>appending an event at the end of the stream,</p></li><li><p>reading all events from the stream,</p></li><li><p>a guarantee of the ordering within the stream,</p></li><li><p>being able to read your writes,</p></li><li><p>strongly consistent, atomic writes and optimistic concurrency.</p></li></ul><p>That can be solved with knowledge about transactions, database design, etc. So again, why is it so hard?</p><p>The processing side is where systems often struggle as they grow. This is where the Event Sourcing solution becomes an Event-Driven Messaging tool. And if you&#8217;ve read my previous articles, you know that this can be tricky at times.</p><p>How do you reliably process events to build read models? How do you trigger side effects without losing messages? How do you scale processing independently from writes? How do you make it performant and run multiple handlers in parallel?</p><p>I&#8217;ve been working on the message processing architecture in <a href="https://github.com/event-driven-io/emmett">Emmett</a> for a while now. 
I&#8217;ll try to explain how I designed the split between Consumers and Processors, the problems it solves, and the tradeoffs involved.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!OVgV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc97019ab-3f11-42d6-8915-eb69b57024a0_800x683.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!OVgV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc97019ab-3f11-42d6-8915-eb69b57024a0_800x683.jpeg 424w, https://substackcdn.com/image/fetch/$s_!OVgV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc97019ab-3f11-42d6-8915-eb69b57024a0_800x683.jpeg 848w, https://substackcdn.com/image/fetch/$s_!OVgV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc97019ab-3f11-42d6-8915-eb69b57024a0_800x683.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!OVgV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc97019ab-3f11-42d6-8915-eb69b57024a0_800x683.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!OVgV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc97019ab-3f11-42d6-8915-eb69b57024a0_800x683.jpeg" width="800" height="683" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c97019ab-3f11-42d6-8915-eb69b57024a0_800x683.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:683,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Consumers &amp; Processors&quot;,&quot;title&quot;:&quot;Consumers &amp; Processors&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Consumers &amp; Processors" title="Consumers &amp; Processors" srcset="https://substackcdn.com/image/fetch/$s_!OVgV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc97019ab-3f11-42d6-8915-eb69b57024a0_800x683.jpeg 424w, https://substackcdn.com/image/fetch/$s_!OVgV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc97019ab-3f11-42d6-8915-eb69b57024a0_800x683.jpeg 848w, https://substackcdn.com/image/fetch/$s_!OVgV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc97019ab-3f11-42d6-8915-eb69b57024a0_800x683.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!OVgV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc97019ab-3f11-42d6-8915-eb69b57024a0_800x683.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" 
stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Why Split Consumers and Processors?</h2><p>When processing messages, we already know that someone produced them. We&#8217;re on the receiving end. Facts are already known; now we need to do something about them.</p><p>When we process them, do we care about the source? Typically, we take the information the message carries and reason about it. For instance, when we receive an event indicating that a room reservation was made, we may need to send an email with details to the customer, update the reservations dashboard, and generate a pro forma invoice. We may have specific logic depending on whether it came from our internal reservation platform or Booking.com, but we know the source from the message payload.</p><p>That seems obvious, but it was an important realisation for me. 
When we&#8217;re building a read model in MongoDB, we don&#8217;t care whether events come from a PostgreSQL event store, EventStoreDB, a RabbitMQ queue or a Kafka topic.</p><p>The read model needs events and the projection logic. Of course, it needs to know the guarantees around delivery, ordering, idempotency, etc., but besides that? The message&#8217;s source doesn&#8217;t matter to its logic.</p><p>Similarly, a component polling PostgreSQL for messages to publish them doesn&#8217;t care what happens to those events - whether they update read models or trigger webhooks is irrelevant to the polling logic.</p><p>These concerns are orthogonal.</p><p>I realised that much of the complexity comes from coupling those two together. We wouldn&#8217;t like to change our processing logic because of an internal change in how messages are produced, or vice versa. I concluded that separating them means each can evolve independently. That&#8217;s how I came up with the initial idea for the split: Consumers and Message Processors.</p><p><strong>Consumers</strong> are responsible for getting messages from a source and forwarding them to processors. Think of them as the &#8220;delivery mechanism.&#8221; They handle the &#8220;where do events come from&#8221; concern. A consumer might connect to:</p><ul><li><p>A PostgreSQL event store, polling the events table,</p></li><li><p>EventStoreDB, using push-based catch-up subscriptions,</p></li><li><p>Kafka, consuming from topics,</p></li><li><p>Any other message source you might have.</p></li></ul><p><strong>Processors</strong> are responsible for doing something meaningful with those messages. They handle the &#8220;what do we do with messages&#8221; concern. 
A processor might:</p><ul><li><p>Update a read model in PostgreSQL or MongoDB,</p></li><li><p>Call an external API when certain events occur,</p></li><li><p>Publish events to Kafka or send webhooks,</p></li><li><p>Trigger workflow steps or saga operations.</p></li></ul><p>This separation follows Unix philosophy: small, focused components connected by simple interfaces. Each piece does one thing well. You can plug any processor into any consumer. This gives you flexibility that matters in practice:</p><ul><li><p>You can run the same projection logic against different event sources</p></li><li><p>Adding new processors doesn&#8217;t require changing consumer code</p></li><li><p>You can test processors in isolation with fake event streams</p></li><li><p>Consumers and processors can be scaled independently</p></li></ul><p>Let me show you how this looks in practice.</p><h2>How Consumers Work</h2><p>I deliberately decided to keep Consumers as dumb as possible. A consumer&#8217;s entire job is:</p><ol><li><p>Connect to a message source,</p></li><li><p>Poll messages in batches or subscribe to notifications (depending on the source specifics).</p></li><li><p>Forward them to all registered processors.</p></li><li><p>Go back to step 2.</p></li></ol><p>That&#8217;s it. No business logic. No complex state management. No decision-making about what to do with messages. Consumers are essentially routers.</p><p>Why this simplicity? I&#8217;m a simple guy; I like clear boundaries for responsibility. They help me reason about both how to use the tool and how to handle it when things go wrong. When message delivery breaks, you want to know exactly where to look. With a simple consumer, the question is binary: did it deliver messages or didn&#8217;t it? There&#8217;s no complex interaction between delivery logic and processing logic to untangle at 3 AM.</p><p>For EventStoreDB, the consumer creates a single subscription and fans out messages to all registered processors. 
For PostgreSQL, it polls the message table in batches, handling ordering guarantees (we&#8217;ll get to why later).</p><p>Here&#8217;s what a basic consumer setup looks like for PostgreSQL:</p><pre><code><code>const consumer = postgreSQLEventStoreConsumer({
  connectionString,
  processors: [
    shoppingCartDetailsProjector,
    customerAnalyticsProjector,
    orderNotificationReactor
  ]
});

await consumer.start();</code></code></pre><p>The setup looks similar for the other sources, with options specific to each source. For example, for the EventStoreDB store, you may want to provide the category stream name:</p><pre><code><code>const consumer = eventStoreDBEventStoreConsumer({
  connectionString,
  from: { stream: '$ce-roomReservations', options: { resolveLinkTos: true } },
});</code></code></pre><p>The consumer receives messages, batches them and forwards them to all processors. Each processor handles messages independently - they don&#8217;t know about each other, and they don&#8217;t need to.</p><h2>Why Batching Belongs to Consumers</h2><p>Who decides how many messages to fetch at once? That&#8217;s a tricky question. I think that batching can happen both at the consumer and processor levels. The consumer decides the batch size for the polling or receiving to tune the receiving throughput. The processor can either align with it, using those batches as a safe default or diverge to its specifics. Read more on <a href="https://event-driven.io/en/batching_async_enumerable/">Why you should batch message processing in my other article</a>.</p><p>Different message sources have different optimal batch sizes. PostgreSQL might be efficient with 100-row fetches. EventStoreDB subscriptions don&#8217;t have built-in batching; they deliver events as they arrive. Kafka has its own batching semantics. These are all source-specific optimisations we should be able to apply without ending up with the lowest common denominator.</p><p>Processors, by default, can just receive batches and process them. Then they can decide whether to split batches into smaller chunks, group them into even larger chunks, or process them as single messages. For instance, PostgreSQL can handle random single updates pretty well, whereas <a href="https://event-driven.io/en/projecting_from_marten_to_elasticsearch/">Elastic prefers batching updates</a>.</p><h2>How Processors Work</h2><p>If consumers are simple routers, processors are where the interesting work happens. They&#8217;re the smarter ones in this relationship. A processor is responsible for:</p><ul><li><p><strong>Processing logic</strong>: Actually doing something useful with events. 
Updating a read model, sending an email, calling an API.</p></li><li><p><strong>Checkpointing</strong>: Tracking which messages have been processed. This is crucial - without it, you&#8217;d reprocess everything from the beginning every time you restart.</p></li><li><p><strong>Error handling</strong>: Deciding what to do when processing fails. Retry? Skip? Stop everything?</p></li><li><p><strong>Idempotency</strong>: Doing their best to ensure that reprocessing the same event doesn&#8217;t cause problems. Of course, still assuming that handlers should be idempotent, read more in <a href="https://event-driven.io/en/idempotent_command_handling/">my other article</a>,</p></li><li><p><strong>Backpressure:</strong> They need to be able to tell consumers that they cannot process more messages at the moment, and that the consumer needs to slow down delivery.</p></li></ul><p>Those are general promises and common stuff for the message processing logic. Still, there are multiple reasons why you want to process incoming messages:</p><ul><li><p><strong>Read models (projections)</strong> transform events into queryable state. For instance, shopping cart events - <em>ProductItemAdded</em>, <em>ProductItemRemoved</em>, <em>ShoppingCartConfirmed</em> - need to become a document showing current items, quantities, and totals. Something your API can quickly return when a user opens their cart.</p></li><li><p><strong>Reactions</strong> trigger side effects after a business fact has happened. When a shopping cart is confirmed, you may want to send a confirmation email, notify the shipment module, and register a new order. These things need to happen, but they&#8217;re not part of the core business logic.</p></li><li><p><strong>Workflows</strong> coordinate multi-step processes across multiple streams. 
An order might involve payment processing, inventory reservation, and shipping coordination - each with its own state and events.</p></li><li><p><strong>Integration</strong> means forwarding events to other systems. Other services in your system might need to know about orders. External partners might need webhook notifications. You might publish to messaging systems for downstream consumers.</p></li></ul><p>Each of those processing needs calls for a slightly different way of handling reliability, ordering, throughput, etc. Also, each tool we integrate with requires a different approach: storing the read model in PostgreSQL will be <em>quite</em> different from forwarding a message to Kafka.</p><p>I wouldn&#8217;t like to handwave all of those specifics and end up with the lowest common denominator. That&#8217;s why I decided to group them into the following <em>archetypes</em>:</p><ul><li><p>projectors,</p></li><li><p>reactors,</p></li><li><p><a href="https://www.architecture-weekly.com/p/workflow-engine-design-proposal-tell">workflows</a>,</p></li><li><p>custom message processors, which let people tune processing fully to their needs,</p></li><li><p>and in the future, stuff like forwarders, webhooks and others we find useful.</p></li></ul><p>All of them should have a unified API that allows them to be plugged into different consumers, while also embracing differences in message processing and target API specifics.</p><p>That&#8217;s also why each message processing target (PostgreSQL, EventStoreDB, MongoDB, InMemory, Kafka, SQS, etc.) will have its own implementations.</p><p>I believe that this focused responsibility, different archetypes, and specific implementations for different tools will strike the right balance between reusability and avoiding the lowest common denominator. 
We&#8217;ll see if those aren&#8217;t famous last words.</p><p>Read more in:</p><ul><li><p><a href="https://github.com/event-driven-io/emmett/pull/257/files">My RFC for Workflow Processing</a></p></li><li><p><a href="https://www.architecture-weekly.com/p/compilation-isnt-just-for-programming">How message pipelines can be technically implemented</a>.</p></li></ul><p>An example projector can look like this:</p><pre><code><code>const projection = pongoSingleStreamProjection({
  collectionName: shoppingCartsSummaryCollectionName,
  evolve,
  canHandle: ['ProductItemAdded', 'ShoppingCartConfirmed'],
  initialState: () =&gt; ({
    status: 'pending',
    productItemsCount: 0,
  }),
});


const projector = postgreSQLProjector({ projection });</code></code></pre><p>An example reactor can look like this:</p><pre><code><code>const reactor = postgreSQLReactor({
  processorId: 'order-notifications',
  canHandle: ['ShoppingCartConfirmed'],
  eachMessage: (event) =&gt;
    emailService.sendOrderConfirmation(event.data.customerId),
});</code></code></pre><h2>Native implementations of processors</h2><p>Different storage requirements call for different capabilities, and getting proper guarantees might involve deeper knowledge. For instance, <a href="https://event-driven.io/en/ordering_in_postgres_outbox/">Postgres sequences issues can impact your messaging guarantees</a>. Those are cases you might not anticipate when you&#8217;re starting. Test environments may not even catch them; you might realise you&#8217;re losing business data only when you reach production. That&#8217;s why, imho, it&#8217;s better to have a tool that solves it than to maintain it on your own, which turns technical infrastructure into something you need to keep working on instead of your business features. How does <a href="https://github.com/event-driven-io/emmett">Emmett</a> solve them? Let&#8217;s discuss them briefly. I&#8217;ll try to expand on the details in future posts.</p><h2>Resilience</h2><p>What happens if the processor fails? By default, it stops processing. But only this one stops; the consumer keeps pushing events to the other processors, which can continue. The consumer stops when all its processors are inactive.</p><p>Why? Consider this scenario: you have two processors, one updating MongoDB and another updating PostgreSQL. MongoDB becomes temporarily unavailable. Should that stop PostgreSQL updates?</p><p>Still, failure behaviour is configurable; your message handler can return:</p><ul><li><p><strong>void/ACK</strong>: Message processed successfully, continue to the next one.</p></li><li><p><strong>Skip</strong>: Skip this message, useful for poison messages that consistently fail.</p></li><li><p><strong>Stop</strong>: Stop this processor entirely.</p></li></ul><p>Why have <code>skip</code> separate from ACK? Consider a poison message - a message that causes your processor to fail every time. 
Without <code>skip</code>, you have two bad options: fail forever (blocking all processing) or ACK it (pretending you processed it). With <em>Skip</em>, you can move it to a dead-letter queue for investigation while continuing to process other messages.</p><p>For now, <a href="https://github.com/event-driven-io/emmett">Emmett</a> doesn&#8217;t support Dead Letter/Poison Message Queues out of the box, but they will be supported in the future. You could already append those events to some specific stream.</p><p>In upcoming releases, we&#8217;ll also have configurable retry policies based on error type and other factors. Just like we already have for command handlers (e.g. to retry 3 times with exponential backoff for Optimistic Concurrency error).</p><p>There&#8217;s no easy answer to when to stop and when to skip poison messages. Neither choice is universally correct. A financial system might need all-or-nothing semantics. A social media feed can tolerate inconsistency between views.</p><p>That&#8217;s also why you can freely group processors within consumers. Best if they share similar resiliency and desired throughput characteristics. If they&#8217;re very different, you can always spin up another consumer for the same source and process it differently.</p><h2>Checkpointing processing</h2><p>In Emmett, <strong>processors own their checkpoints</strong>. Each processor independently tracks the last message it processed. The consumer doesn&#8217;t maintain any checkpoint state.</p><p>When a consumer starts up, it asks all registered processors for their last processed position and starts polling from the earliest one.</p><p>It has several benefits:</p><ul><li><p><strong>Independent progress</strong>: Processors can move at different speeds. If your MongoDB projector is fast and your analytics processor can get slow at times, they each track their own progress. 
The slow one doesn&#8217;t hold back the fast one.</p></li><li><p><strong>Isolated failures</strong>: If one processor&#8217;s checkpoint storage fails, only that processor is affected. Others continue working.</p></li><li><p><strong>Easy replay</strong>: To rebuild a single projection, you just reset that processor&#8217;s checkpoint. No need to coordinate with other processors or manage a global position.</p></li><li><p><strong>Flexibility</strong>: Processors can store checkpoints wherever makes sense - in the same database as their read model, in a separate checkpoint table, or anywhere else.</p></li><li><p><strong>Capability to redistribute the load.</strong> As mentioned in the previous points, if you observe that one of the processors is slower or demands more resources, you can freely deploy it separately in a different consumer, and it&#8217;ll start where it left off.</p></li></ul><p>The tradeoff? When a consumer restarts, it might poll events that most processors have already seen. If one processor is significantly behind, all processors receive those events again (they just skip them based on their checkpoints). This is why you should group processors by their typical processing pace - don&#8217;t put a real-time dashboard projector and a monthly analytics batch processor on the same consumer.</p><h2>Backpressure: When Processors Can&#8217;t Keep Up</h2><p>Backpressure occurs when processors can&#8217;t process messages fast enough for the consumer to deliver them. This is a real operational concern that needs explicit handling.</p><p><a href="https://github.com/event-driven-io/emmett">Emmett</a> doesn&#8217;t support it at the moment, but here&#8217;s what I&#8217;m thinking about it.</p><p>There are several strategies, each with tradeoffs:</p><p><strong>1. 
Ignore backpressure</strong>: The consumer keeps polling and pushing regardless of processor state.</p><ul><li><p>Pro: Simple, maximum throughput when processors can keep up</p></li><li><p>Con: Memory grows unbounded, possible OOM, cascading failures</p></li></ul><p><strong>2. Stop on any slowdown</strong>: If any processor signals it&#8217;s overwhelmed, stop polling.</p><ul><li><p>Pro: Safe, no resource exhaustion</p></li><li><p>Con: The slowest processor determines overall throughput</p></li></ul><p><strong>3. Force synchronised pace</strong>: All processors must process each batch before the next is fetched.</p><ul><li><p>Pro: All processors stay in sync, predictable memory usage</p></li><li><p>Con: The slowest processor becomes the bottleneck for all</p></li></ul><p><strong>4. Slow down ingress</strong>: Adaptively reduce polling rate based on processor feedback.</p><ul><li><p>Pro: Balances throughput and stability</p></li><li><p>Con: More complex, needs tuning</p></li></ul><p><strong>5. Rolling buffer</strong>: Buffer messages up to a limit, retry delivery to slow processors.</p><ul><li><p>Pro: Absorbs temporary slowdowns, maximises throughput</p></li><li><p>Con: Needs memory limits, complex failure handling</p></li></ul><p>Different systems need different strategies. Real-time dashboards might use strategy 1 (drop messages rather than lag). Financial transactions might use strategy 3 (consistency over throughput). Event forwarding to Kafka might use strategy 5 (buffer through temporary network issues).</p><p>I&#8217;m leaning toward making this configurable per consumer, with sensible defaults. The default would be a bounded buffer with adaptive polling slowdown.</p><h2>Scaling: Current State and Future Plans</h2><p>For now, the big benefit of having dumb consumers is that you can scale them horizontally. Of course, this works for offset-based solutions like event stores and streaming tools like Kafka. 
It may not always work for systems that remove the message once it&#8217;s handled. Still, current consumers are using only event stores as sources; Kafka will likely come next.</p><p>You can group processors into consumers by source, reducing the number of polling jobs (one consumer polls/subscribes to one source).</p><p>I already mentioned batching, which should also increase the throughput.</p><p>Running multiple instances of the same processor causes conflicts. Both process the same events, update the same read models, and corrupt the state. <a href="https://github.com/event-driven-io/emmett">Emmett</a> already has the basic capability to do <a href="https://www.architecture-weekly.com/p/distributed-locking-a-practical-guide">distributed locking</a>, but it&#8217;s not fully plugged in yet. This will come in future releases.</p><p>For now, checkpointing can detect whether a newer checkpoint is already stored (which can suggest another processor is running) and stop processing.</p><p>The recommended approach is to run the consumer as a separate service from the API. Then you can scale it separately. You can also set replicas=1 for the specific consumer to ensure a single instance.</p><h2>Rebuilding Projections</h2><p>Event Sourcing enables rebuilding read models from events. Bug in projection? Fix code, rebuild. New read model? 
Populate from history.</p><p>With processor-owned checkpoints, you can either rebuild the read model from scratch by:</p><ol><li><p>Stopping the processor.</p></li><li><p>Deleting the read model data.</p></li><li><p>Resetting the checkpoint to the beginning.</p></li><li><p>Restarting processing.</p></li></ol><p>Or do a blue-green rebuild by:</p><ol><li><p>Creating a new version of your storage (with Pongo, it&#8217;s just adding a suffix or prefix to your collection name).</p></li><li><p>Starting a consumer from the beginning.</p></li><li><p>Checking if the read models are close enough, and stopping the old processor.</p></li><li><p>Starting processing.</p></li></ol><p>In <a href="https://github.com/event-driven-io/emmett">Emmett</a>, you even have some syntactic sugar on top of consumers and processors to make this easier:</p><pre><code><code>import { rebuildPostgreSQLProjections } from '@event-driven-io/emmett-postgresql';

const rebuilder = rebuildPostgreSQLProjections({
  connectionString,
  projection: shoppingCartsSummaryProjectionV2
});

await rebuilder.start();</code></code></pre><p>This will spin up a new consumer; other consumers and processors continue normally, with their checkpoints unaffected. You can specify the position from which you want to start, and also whether to truncate the target storage.</p><p>We&#8217;ll need more metrics like gap detection and distributed locking to make it more plug-and-play.</p><h2>Wrapping Up</h2><p>The consumer/processor architecture in <a href="https://github.com/event-driven-io/emmett">Emmett</a> is about making event processing concerns explicit and separable:</p><p><strong>Consumers</strong> handle delivery - getting events from sources to processors. They&#8217;re simple by design. When delivery breaks, you know where to look.</p><p><strong>Processors</strong> handle processing - doing useful things with events. They own their checkpoints, track their own progress, and handle their own failures.</p><p>This separation gives you:</p><ul><li><p>Flexibility to mix and match consumers and processors,</p></li><li><p>Independent scaling of different processing workloads,</p></li><li><p>Isolated failure domains,</p></li><li><p>Easy projection rebuilds,</p></li><li><p>Testability at multiple levels.</p></li></ul><p>The design makes tradeoffs explicit:</p><ul><li><p>Partial progress over all-or-nothing (configurable soon),</p></li><li><p>Processor-owned checkpoints over global tracking,</p></li><li><p>Simplicity in consumers, complexity in processors,</p></li><li><p>Eventual consistency for async operations.</p></li></ul><p>There&#8217;s still more to build: distributed locking, partitioning, better backpressure handling. Still, I believe the foundation is there, and I know real applications are using it already.</p><p>I hope that this is good food for thought, even if you&#8217;re not using <a href="https://github.com/event-driven-io/emmett">Emmett</a>. I&#8217;m curious about your thoughts and feedback. 
I&#8217;ll try to tackle those cases in more detail in dedicated articles.</p><p>If you have questions, feedback, or would like to help me speed up the planned stuff, come chat in the <a href="https://discord.gg/fTpqUTMmVa">Emmett Discord</a>. We have a small, but welcoming and awesome community.</p><p>Cheers!</p><p>Oskar</p><p>p.s. <strong>Ukraine is still under brutal Russian invasion. A lot of Ukrainian people are hurt, without shelter and need help.</strong> You can help in various ways, for instance, directly helping refugees, spreading awareness, putting pressure on your local government or companies. You can also support Ukraine by donating e.g. to <a href="https://www.icrc.org/en/donate/ukraine">Red Cross</a>, <a href="https://savelife.in.ua/en/donate/">Ukraine humanitarian organisation</a> or <a href="https://www.gofundme.com/f/help-to-save-the-lives-of-civilians-in-a-war-zone">donate Ambulances for Ukraine</a>.</p>]]></content:encoded></item><item><title><![CDATA[Requeuing Roulette in Event-Driven Architecture and Messaging]]></title><description><![CDATA[I&#8217;m always saying that there&#8217;s a thin line between good and bad practice, and this thin line is named &#8220;Context&#8221;. That&#8217;s also true for the (anti-)pattern I&#8217;m calling &#8220;Requeuing Roulette&#8221;. What&#8217;s the Requeuing Roulette? As the name suggests, this technique involves putting a message back into the queue. It also (correctly) suggests that we&#8217;re hoping for the best. And sometimes we may even get lucky. 
Let's discuss its pros and cons!]]></description><link>https://www.architecture-weekly.com/p/requeuing-roulette-in-event-driven</link><guid isPermaLink="false">https://www.architecture-weekly.com/p/requeuing-roulette-in-event-driven</guid><dc:creator><![CDATA[Oskar Dudycz]]></dc:creator><pubDate>Mon, 17 Nov 2025 18:35:13 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/f5c36609-89cf-4823-91a1-3b28c57f2eae_626x391.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Welcome to the new week!</p><p>I&#8217;m always saying that there&#8217;s a thin line between good and bad practice, and this thin line is named &#8220;Context&#8221;.</p><p><strong>That&#8217;s also true for the (anti-)pattern I&#8217;m calling </strong><em><strong>&#8220;Requeuing Roulette&#8221;</strong></em>. Let&#8217;s discuss it today, continuing the &#8220;race condition series&#8221;:</p><ul><li><p><a href="https://event-driven.io/en/dealing_with_race_conditions_in_eda_using_read_models/">Dealing with Race Conditions in Event-Driven Architecture with Read Models</a>,</p></li><li><p><a href="https://event-driven.io/en/strict_ordering_in_event_handling/">Handling Events Coming in an Unknown Order</a>.</p></li></ul><p>What&#8217;s the Requeuing Roulette? As the name suggests, this <em>technique</em> involves putting a message back into the queue. It also (correctly) suggests that we&#8217;re hoping for the best. And sometimes we may even get lucky.</p><p>The basic primitive for a messaging system is a queue. The producer puts messages into the queue, and the consumer gets them on the other end. 
If everything goes well, the consumer receives them in the order the producer put them (thus, a queue, like a queue in a shop).</p><p>If the consumer is not available, the messaging system will try to deliver messages and handle retries for us.</p><p>We discussed it in detail in:</p><ul><li><p><a href="https://www.architecture-weekly.com/p/architecture-weekly-190-queuing-backpressure">Queuing, Backpressure, Single Writer and other useful patterns for managing concurrency</a></p></li><li><p><a href="https://www.architecture-weekly.com/p/the-order-of-things-why-you-cant">Ordering, Grouping and Consistency in Messaging systems</a></p></li><li><p><a href="https://www.architecture-weekly.com/p/the-order-of-things-why-you-cant">The Order of Things: Why You Can&#8217;t Have Both Speed and Ordering in Distributed Systems</a></p></li><li><p><a href="https://www.architecture-weekly.com/p/dealing-with-eventual-consistency">Dealing with Eventual Consistency, and Causal Consistency using Predictable Identifiers</a></p></li></ul><p>Ordering of processing works if we have a single consumer for a single queue. If we have more than one consumer, we lose the ordering guarantee. Why would we want to have more than one consumer? Obviously, to speed up processing. If messages in the queue are not causally correlated, then we can process them in parallel.</p><p>What does the smartass &#8220;causally correlated&#8221; even mean? 
For instance:</p><ul><li><p>depositing money into a bank account is causally correlated to opening it, as we can&#8217;t deposit money if we don&#8217;t open it.</p></li><li><p>depositing money into a bank account is NOT causally correlated to other deposits, as we can deposit as much money as we have (of course, ignoring weird regulations),</p></li><li><p>money withdrawal is causally correlated to depositing money and other withdrawals, as we need to check the balance, and they may impact it.</p></li><li><p>Withdrawals and deposits are only causally correlated if they happen on the same bank account; other bank account operations can happen at any time.</p></li></ul><p>You get the idea, aye?</p><p>So if we set up a queue to process money transfer events, then it could look as follows:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!i9BO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc271ef58-6806-4fa6-8d18-aa01e1556bd8_800x356.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!i9BO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc271ef58-6806-4fa6-8d18-aa01e1556bd8_800x356.png 424w, https://substackcdn.com/image/fetch/$s_!i9BO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc271ef58-6806-4fa6-8d18-aa01e1556bd8_800x356.png 848w, https://substackcdn.com/image/fetch/$s_!i9BO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc271ef58-6806-4fa6-8d18-aa01e1556bd8_800x356.png 1272w, 
https://substackcdn.com/image/fetch/$s_!i9BO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc271ef58-6806-4fa6-8d18-aa01e1556bd8_800x356.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!i9BO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc271ef58-6806-4fa6-8d18-aa01e1556bd8_800x356.png" width="800" height="356" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c271ef58-6806-4fa6-8d18-aa01e1556bd8_800x356.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:356,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;queue1&quot;,&quot;title&quot;:&quot;queue1&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="queue1" title="queue1" srcset="https://substackcdn.com/image/fetch/$s_!i9BO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc271ef58-6806-4fa6-8d18-aa01e1556bd8_800x356.png 424w, https://substackcdn.com/image/fetch/$s_!i9BO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc271ef58-6806-4fa6-8d18-aa01e1556bd8_800x356.png 848w, https://substackcdn.com/image/fetch/$s_!i9BO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc271ef58-6806-4fa6-8d18-aa01e1556bd8_800x356.png 1272w, 
https://substackcdn.com/image/fetch/$s_!i9BO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc271ef58-6806-4fa6-8d18-aa01e1556bd8_800x356.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>I assumed that we&#8217;re following the advice from the <a href="https://event-driven.io/en/strict_ordering_in_event_handling/">previous article</a>. Besides the event type and payload, we&#8217;d also pass the record revision, which represents the logical order of events. It comes from the number incremented with each change. 
Assuming we&#8217;re publishing events after each successful business logic handling, it should be gapless.</p><p>You may notice that our queue actually has multiple timelines for each causally correlated message sequence. If we simplify our considerations and assume that all events from a certain account are causally correlated, then we could visualise them as:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8J4J!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c4b06df-fce2-45e6-b474-5596260bd0e8_800x356.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8J4J!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c4b06df-fce2-45e6-b474-5596260bd0e8_800x356.png 424w, https://substackcdn.com/image/fetch/$s_!8J4J!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c4b06df-fce2-45e6-b474-5596260bd0e8_800x356.png 848w, https://substackcdn.com/image/fetch/$s_!8J4J!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c4b06df-fce2-45e6-b474-5596260bd0e8_800x356.png 1272w, https://substackcdn.com/image/fetch/$s_!8J4J!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c4b06df-fce2-45e6-b474-5596260bd0e8_800x356.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8J4J!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c4b06df-fce2-45e6-b474-5596260bd0e8_800x356.png" width="800" height="356" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4c4b06df-fce2-45e6-b474-5596260bd0e8_800x356.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:356,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;queue2&quot;,&quot;title&quot;:&quot;queue2&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="queue2" title="queue2" srcset="https://substackcdn.com/image/fetch/$s_!8J4J!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c4b06df-fce2-45e6-b474-5596260bd0e8_800x356.png 424w, https://substackcdn.com/image/fetch/$s_!8J4J!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c4b06df-fce2-45e6-b474-5596260bd0e8_800x356.png 848w, https://substackcdn.com/image/fetch/$s_!8J4J!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c4b06df-fce2-45e6-b474-5596260bd0e8_800x356.png 1272w, https://substackcdn.com/image/fetch/$s_!8J4J!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c4b06df-fce2-45e6-b474-5596260bd0e8_800x356.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><strong>It&#8217;s fine</strong> to process events that are <strong>not</strong> causally correlated in parallel, as that increases throughput without trading off correctness:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!st5c!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21f3cf36-cc54-4a18-b4f4-4e5869a808b3_800x345.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!st5c!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21f3cf36-cc54-4a18-b4f4-4e5869a808b3_800x345.png 424w, 
https://substackcdn.com/image/fetch/$s_!st5c!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21f3cf36-cc54-4a18-b4f4-4e5869a808b3_800x345.png 848w, https://substackcdn.com/image/fetch/$s_!st5c!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21f3cf36-cc54-4a18-b4f4-4e5869a808b3_800x345.png 1272w, https://substackcdn.com/image/fetch/$s_!st5c!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21f3cf36-cc54-4a18-b4f4-4e5869a808b3_800x345.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!st5c!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21f3cf36-cc54-4a18-b4f4-4e5869a808b3_800x345.png" width="800" height="345" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/21f3cf36-cc54-4a18-b4f4-4e5869a808b3_800x345.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:345,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;queue3&quot;,&quot;title&quot;:&quot;queue3&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="queue3" title="queue3" srcset="https://substackcdn.com/image/fetch/$s_!st5c!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21f3cf36-cc54-4a18-b4f4-4e5869a808b3_800x345.png 424w, 
https://substackcdn.com/image/fetch/$s_!st5c!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21f3cf36-cc54-4a18-b4f4-4e5869a808b3_800x345.png 848w, https://substackcdn.com/image/fetch/$s_!st5c!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21f3cf36-cc54-4a18-b4f4-4e5869a808b3_800x345.png 1272w, https://substackcdn.com/image/fetch/$s_!st5c!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21f3cf36-cc54-4a18-b4f4-4e5869a808b3_800x345.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>But <strong>it&#8217;s not fine</strong> to process messages from the same timeline in parallel, as we may get race conditions when the consumer processing the earlier message is slower than the one processing the later message.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lx-j!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F740c4d66-fe14-47b2-8d7a-8762475a23c4_800x321.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lx-j!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F740c4d66-fe14-47b2-8d7a-8762475a23c4_800x321.png 424w, https://substackcdn.com/image/fetch/$s_!lx-j!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F740c4d66-fe14-47b2-8d7a-8762475a23c4_800x321.png 848w, https://substackcdn.com/image/fetch/$s_!lx-j!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F740c4d66-fe14-47b2-8d7a-8762475a23c4_800x321.png 1272w, https://substackcdn.com/image/fetch/$s_!lx-j!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F740c4d66-fe14-47b2-8d7a-8762475a23c4_800x321.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lx-j!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F740c4d66-fe14-47b2-8d7a-8762475a23c4_800x321.png" width="800" height="321" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/740c4d66-fe14-47b2-8d7a-8762475a23c4_800x321.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:321,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;queue4&quot;,&quot;title&quot;:&quot;queue4&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="queue4" title="queue4" srcset="https://substackcdn.com/image/fetch/$s_!lx-j!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F740c4d66-fe14-47b2-8d7a-8762475a23c4_800x321.png 424w, https://substackcdn.com/image/fetch/$s_!lx-j!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F740c4d66-fe14-47b2-8d7a-8762475a23c4_800x321.png 848w, https://substackcdn.com/image/fetch/$s_!lx-j!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F740c4d66-fe14-47b2-8d7a-8762475a23c4_800x321.png 1272w, https://substackcdn.com/image/fetch/$s_!lx-j!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F740c4d66-fe14-47b2-8d7a-8762475a23c4_800x321.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><strong>We actually have two orders in messaging systems:</strong></p><ol><li><p><strong>Queue Order:</strong> The order in which messages are published to the queue.</p></li><li><p><strong>Processing Order:</strong> The order in which messages are actually consumed and processed.</p></li></ol><p>We already learned that we can detect out-of-order processing by:</p><ul><li><p><a href="https://event-driven.io/en/dealing_with_race_conditions_in_eda_using_read_models/">business rules</a>, or</p></li><li><p><a href="https://event-driven.io/en/strict_ordering_in_event_handling/">detecting the gap between the last processed revision and the event revision</a>.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!icIa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18b52576-663b-4215-a48e-d88ad859f584_800x245.png" 
data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!icIa!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18b52576-663b-4215-a48e-d88ad859f584_800x245.png 424w, https://substackcdn.com/image/fetch/$s_!icIa!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18b52576-663b-4215-a48e-d88ad859f584_800x245.png 848w, https://substackcdn.com/image/fetch/$s_!icIa!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18b52576-663b-4215-a48e-d88ad859f584_800x245.png 1272w, https://substackcdn.com/image/fetch/$s_!icIa!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18b52576-663b-4215-a48e-d88ad859f584_800x245.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!icIa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18b52576-663b-4215-a48e-d88ad859f584_800x245.png" width="800" height="245" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/18b52576-663b-4215-a48e-d88ad859f584_800x245.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:245,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;queue5&quot;,&quot;title&quot;:&quot;queue5&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="queue5" title="queue5" 
srcset="https://substackcdn.com/image/fetch/$s_!icIa!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18b52576-663b-4215-a48e-d88ad859f584_800x245.png 424w, https://substackcdn.com/image/fetch/$s_!icIa!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18b52576-663b-4215-a48e-d88ad859f584_800x245.png 848w, https://substackcdn.com/image/fetch/$s_!icIa!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18b52576-663b-4215-a48e-d88ad859f584_800x245.png 1272w, https://substackcdn.com/image/fetch/$s_!icIa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18b52576-663b-4215-a48e-d88ad859f584_800x245.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>We can also apply techniques like the <em><a href="https://event-driven.io/en/dealing_with_race_conditions_in_eda_using_read_models/">Phantom record</a></em>, where we store data as it arrives and take the next steps only when defined conditions are met. And that&#8217;s also what I&#8217;d recommend in general, but&#8230;</p><p>But I promised to talk today about the <em>Requeuing Roulette</em> (anti)pattern, aye? So let&#8217;s do it!</p><p>If we&#8217;re using tools like RabbitMQ, SQS and other classical messaging tooling (so not you, Kafka! You&#8217;re a streaming or log solution!), then we can put the message back in the queue.</p><p><a href="https://www.rabbitmq.com/docs/semantics#ordering">The RabbitMQ message ordering documentation states</a>:</p><blockquote><p>Messages published in one channel, passing through one exchange and one queue and one outgoing channel will be received in the same order that they were sent. RabbitMQ offers stronger guarantees since release 2.7.0.</p><p>Messages can be returned to the queue using AMQP methods that feature a requeue parameter (basic.recover, basic.reject and basic.nack), or due to a channel closing while holding unacknowledged messages. Any of these scenarios caused messages to be requeued at the back of the queue for RabbitMQ releases earlier than 2.7.0. From RabbitMQ release 2.7.0, messages are always held in the queue in publication order, even in the presence of requeueing or channel closure.</p><p>With release 2.7.0 and later it is still possible for individual consumers to observe messages out of order if the queue has multiple subscribers. 
This is due to the actions of other subscribers who may requeue messages. From the perspective of the queue the messages are always held in the publication order.</p></blockquote><p>The last paragraph seems promising, as it suggests that a requeued message will be put back in its original position, ahead of the messages published after it.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Tk2z!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54438ead-3cbb-4d00-be91-4e2c2519a5c8_800x291.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Tk2z!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54438ead-3cbb-4d00-be91-4e2c2519a5c8_800x291.png 424w, https://substackcdn.com/image/fetch/$s_!Tk2z!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54438ead-3cbb-4d00-be91-4e2c2519a5c8_800x291.png 848w, https://substackcdn.com/image/fetch/$s_!Tk2z!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54438ead-3cbb-4d00-be91-4e2c2519a5c8_800x291.png 1272w, https://substackcdn.com/image/fetch/$s_!Tk2z!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54438ead-3cbb-4d00-be91-4e2c2519a5c8_800x291.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Tk2z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54438ead-3cbb-4d00-be91-4e2c2519a5c8_800x291.png" width="800" height="291" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/54438ead-3cbb-4d00-be91-4e2c2519a5c8_800x291.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:291,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;queue6&quot;,&quot;title&quot;:&quot;queue6&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="queue6" title="queue6" srcset="https://substackcdn.com/image/fetch/$s_!Tk2z!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54438ead-3cbb-4d00-be91-4e2c2519a5c8_800x291.png 424w, https://substackcdn.com/image/fetch/$s_!Tk2z!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54438ead-3cbb-4d00-be91-4e2c2519a5c8_800x291.png 848w, https://substackcdn.com/image/fetch/$s_!Tk2z!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54438ead-3cbb-4d00-be91-4e2c2519a5c8_800x291.png 1272w, https://substackcdn.com/image/fetch/$s_!Tk2z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54438ead-3cbb-4d00-be91-4e2c2519a5c8_800x291.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Unfortunately, it&#8217;s only a best effort. <a href="https://www.rabbitmq.com/docs/nack">Another place in the documentation states</a>:</p><blockquote><p>When a message is requeued, it will be placed to its original position in its queue, if possible. 
If not (due to concurrent deliveries and acknowledgements from other consumers when multiple consumers share a queue), the message will be requeued to a position closer to queue head.</p></blockquote><p>So in the worst case, it can even end up like that:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!P619!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e8f3197-ca8c-4b8b-8e2b-5b6646deda35_800x336.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!P619!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e8f3197-ca8c-4b8b-8e2b-5b6646deda35_800x336.png 424w, https://substackcdn.com/image/fetch/$s_!P619!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e8f3197-ca8c-4b8b-8e2b-5b6646deda35_800x336.png 848w, https://substackcdn.com/image/fetch/$s_!P619!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e8f3197-ca8c-4b8b-8e2b-5b6646deda35_800x336.png 1272w, https://substackcdn.com/image/fetch/$s_!P619!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e8f3197-ca8c-4b8b-8e2b-5b6646deda35_800x336.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!P619!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e8f3197-ca8c-4b8b-8e2b-5b6646deda35_800x336.png" width="800" height="336" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4e8f3197-ca8c-4b8b-8e2b-5b6646deda35_800x336.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:336,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;queue5&quot;,&quot;title&quot;:&quot;queue5&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="queue5" title="queue5" srcset="https://substackcdn.com/image/fetch/$s_!P619!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e8f3197-ca8c-4b8b-8e2b-5b6646deda35_800x336.png 424w, https://substackcdn.com/image/fetch/$s_!P619!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e8f3197-ca8c-4b8b-8e2b-5b6646deda35_800x336.png 848w, https://substackcdn.com/image/fetch/$s_!P619!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e8f3197-ca8c-4b8b-8e2b-5b6646deda35_800x336.png 1272w, https://substackcdn.com/image/fetch/$s_!P619!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e8f3197-ca8c-4b8b-8e2b-5b6646deda35_800x336.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Then the fun begins. Now we have two messages that we already know would be handled out of order, so we need to requeue them, hoping they land after the message we need to process next. What if we have multiple messages like that? What if our consumer randomly fails when processing the message with revision 12, and we need to requeue it again? How lucky will we get at Requeuing Roulette?</p><p>As you can see, the more correlated our messages are and the more we&#8217;d like to parallelise them, the more likely we are to face Requeuing Roulette. 
Typically, we put messages into the same queue so we can process them in order.</p><h2>When is Requeuing Roulette helpful?</h2><p>It can be useful if:</p><ul><li><p>We want to get the best parallelism for message processing, and best-effort ordering is good enough for us,</p></li><li><p>Our messages are, most of the time, not causally correlated, or&#8230;</p></li><li><p>Events for the same records/processes are not published quickly one after another, so we can safely retry a message before the next one arrives,</p></li><li><p>Our consumers are stable and don&#8217;t fail too often.</p></li></ul><p>As you see, those assumptions can be fragile, the classic <em>famous last words</em>.</p><p>Of course, we can use one of the techniques like:</p><ul><li><p>RabbitMQ routing key, correlation id,</p></li><li><p>AWS SQS message group id, visibility timeout,</p></li><li><p>Azure Service Bus sessions,</p></li><li><p>etc.; see <a href="https://www.architecture-weekly.com/p/the-order-of-things-why-you-cant">Ordering, Grouping and Consistency in Messaging systems</a> for details.</p></li></ul><p>Shifting this trade-off towards either the ordering guarantee or parallelism may reduce the risk to an acceptable level, but you need to be aware that an unexpected traffic spike or a different message distribution than you expected can bring it back.</p><h2>Dangers of Requeuing Roulette</h2><p>Even when order doesn&#8217;t matter, requeueing has a hidden cost that becomes visible under load.</p><p>Suppose you reject the message with requeue set to true. In that case, it can be redelivered to your consumer almost instantly, resulting in a very high workload, since your consumer will reject it again.</p><p>Let&#8217;s say you have a message that fails because a downstream service is down. You requeue it. It immediately returns to the consumer (maybe even the same one), fails again, and is requeued. This can happen hundreds of times per second. 
It can also swamp the slow consumer, preventing it from even recovering.</p><p>In the worst case, your CPU can be spent processing and requeueing the same 10 messages over and over, while thousands of processable messages sit behind them in the queue (because RabbitMQ tries to put requeued messages back at or near their original position, ahead of newer messages).</p><h2>What about Kafka?</h2><p>Well, in Kafka, this issue doesn&#8217;t exist, as messages with the same record key go to the same partition, maintaining order within that partition while allowing parallel processing across partitions.</p><p>So Kafka, for the win? Hold your horses!</p><p>Only one consumer from the consumer group can handle a specific partition. So, within a single partition, parallelisation isn&#8217;t possible. If we map the RabbitMQ queue to Kafka&#8217;s partition, then the conclusion can be that Kafka solved it by removing this feature.</p><p>Also, when we consume a message from a classical messaging system (like RabbitMQ), it is removed from the queue. In a streaming solution like Kafka or Pulsar, messages remain in the log until the <a href="https://event-driven.io/en/gdpr_in_event_driven_architecture/#log-compaction">retention policy kicks in and drops old messages from the partition</a>.</p><p>Kafka tracks, per consumer group, the offset of the last committed message in each topic partition. You don&#8217;t need to requeue messages; you can just rewind the offset to an older position when you want to reprocess messages.</p><p>Read more in <a href="https://www.architecture-weekly.com/p/kafka-consumers-under-the-hood-of">Kafka Consumers: Under the Hood of Message Processing</a>.</p><h2>TLDR</h2><p>The &#8220;requeueing roulette&#8221; is a symptom of trying to solve a distributed systems problem with a technical solution.</p><p>The requeueing roulette is seductive because it promises something impossible: maintaining strict order in a distributed, concurrent system without sacrificing throughput. 
It&#8217;s trying to cheat the fundamental trade-offs of distributed systems.</p><p>Still, cheating can take us quite far, but there&#8217;s always a danger that we&#8217;ll be caught and handcuffed.</p><p>If you&#8217;re considering using Requeuing Roulette, then consider the other techniques I described in previous articles. I&#8217;d treat Requeuing Roulette as a temporary solution and a tradeoff.</p><p>In my opinion, if you decide to use it, then the question isn&#8217;t whether you&#8217;ll abandon it, but how much pain you&#8217;ll endure before you do.</p><p>The real skill isn&#8217;t in making requeueing work - it&#8217;s in understanding your actual ordering requirements and choosing the most straightforward solution that meets them. Often, that means accepting that perfect ordering is neither necessary nor worth its cost, especially in the long term.</p><p>Read more in:</p><ul><li><p><a href="https://event-driven.io/en/dealing_with_race_conditions_in_eda_using_read_models/">Dealing with Race Conditions in Event-Driven Architecture with Read Models</a></p></li><li><p><a href="https://www.architecture-weekly.com/p/the-order-of-things-why-you-cant">The Order of Things: Why You Can&#8217;t Have Both Speed and Ordering in Distributed Systems</a></p></li><li><p><a href="https://event-driven.io/en/internal_external_events/">Internal and external events, or how to design event-driven API</a></p></li><li><p><a href="https://event-driven.io/en/simple_trick_for_idempotency_handling_in_elastic_search_readm_model/">Dealing with Eventual Consistency and Idempotency in MongoDB projections</a></p></li><li><p><a href="https://www.architecture-weekly.com/p/architecture-weekly-190-queuing-backpressure">Queuing, Backpressure, Single Writer and other useful patterns for managing concurrency</a></p></li><li><p><a href="https://www.architecture-weekly.com/p/the-order-of-things-why-you-cant">Ordering, Grouping and Consistency in Messaging systems</a></p></li><li><p><a href="https://www.architecture-weekly.com/p/dealing-with-eventual-consistency">Dealing with Eventual Consistency, and Causal Consistency using Predictable Identifiers</a></p></li></ul>
<p>Cheers!</p><p>Oskar</p><p>p.s. <strong>Ukraine is still under brutal Russian invasion. A lot of Ukrainian people are hurt, without shelter and need help.</strong> You can help in various ways, for instance, directly helping refugees, spreading awareness, putting pressure on your local government or companies. You can also support Ukraine by donating e.g. to <a href="https://www.icrc.org/en/donate/ukraine">Red Cross</a>, <a href="https://savelife.in.ua/en/donate/">Ukraine humanitarian organisation</a> or <a href="https://www.gofundme.com/f/help-to-save-the-lives-of-civilians-in-a-war-zone">donate Ambulances for Ukraine</a>.</p>]]></content:encoded></item><item><title><![CDATA[Handling Events Coming in an Unknown Order]]></title><description><![CDATA[In the last article, we discussed the scenario where you know what events you should receive, just not the order. But what if you don't know that? 
That's the question I got from Ben, let's answer that today and explain how using revisions can help you in reaching strict ordering.]]></description><link>https://www.architecture-weekly.com/p/handling-events-coming-in-an-unknown</link><guid isPermaLink="false">https://www.architecture-weekly.com/p/handling-events-coming-in-an-unknown</guid><dc:creator><![CDATA[Oskar Dudycz]]></dc:creator><pubDate>Mon, 03 Nov 2025 20:04:01 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!ppfN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b7fa258-9208-478a-8c61-a48a531b9f11_626x391.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Welcome to the next week!</p><p><strong>After the last article on <a href="https://www.architecture-weekly.com/p/dealing-with-race-conditions-in-event">Dealing with Race Conditions in Event-Driven Architecture with Read Models</a>, I got such a <a href="https://www.architecture-weekly.com/p/dealing-with-race-conditions-in-event/comment/171420356">question from Ben</a>:</strong></p><blockquote><p>You described the scenario where you know what events you should receive, just not the order. But what if you don&#8217;t know that? For example, you get an ItemRemovedFromCart event, but the item doesn&#8217;t exist in your view of the current state of the cart. Is it an invalid event? Or is there an ItemAddedToCart event that hasn&#8217;t come through yet?</p></blockquote><p>That&#8217;s a good question, and good questions usually require more depth to give a precise answer. 
That&#8217;s what we&#8217;re here for!</p><p>Let&#8217;s follow up and discuss how to determine whether we have complete information for our events!</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ppfN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b7fa258-9208-478a-8c61-a48a531b9f11_626x391.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ppfN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b7fa258-9208-478a-8c61-a48a531b9f11_626x391.png 424w, https://substackcdn.com/image/fetch/$s_!ppfN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b7fa258-9208-478a-8c61-a48a531b9f11_626x391.png 848w, https://substackcdn.com/image/fetch/$s_!ppfN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b7fa258-9208-478a-8c61-a48a531b9f11_626x391.png 1272w, https://substackcdn.com/image/fetch/$s_!ppfN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b7fa258-9208-478a-8c61-a48a531b9f11_626x391.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ppfN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b7fa258-9208-478a-8c61-a48a531b9f11_626x391.png" width="626" height="391" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8b7fa258-9208-478a-8c61-a48a531b9f11_626x391.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:391,&quot;width&quot;:626,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:461545,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.architecture-weekly.com/i/177895476?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b7fa258-9208-478a-8c61-a48a531b9f11_626x391.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ppfN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b7fa258-9208-478a-8c61-a48a531b9f11_626x391.png 424w, https://substackcdn.com/image/fetch/$s_!ppfN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b7fa258-9208-478a-8c61-a48a531b9f11_626x391.png 848w, https://substackcdn.com/image/fetch/$s_!ppfN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b7fa258-9208-478a-8c61-a48a531b9f11_626x391.png 1272w, https://substackcdn.com/image/fetch/$s_!ppfN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b7fa258-9208-478a-8c61-a48a531b9f11_626x391.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>What we learned so far</h2><p>Communication in messaging systems works like a department store. It has multiple cash registers and separate queues for each of them. You can only predict who will be handled next within a specific queue: first in, first out. Between queues, you only know that the slowest will be the one you&#8217;re standing in.</p><p>Of course, you could set up a single cash register and a single queue, and everything would be sequential. Such a setup can work for small groceries with few customers. For a supermarket, though, it&#8217;d end up with an extremely long queue.</p><p>You can think about a single module as a single cash register in a department store or a small grocery store. Inside it, you can get strict ordering of processing, but not in relation to the outside world. 
For instance, you may know that on Monday morning, you&#8217;re getting the fresh fruit delivery, and at noon, you&#8217;re getting the dairy delivery. And it&#8217;s typically like that, but from time to time, because the fruit delivery is delayed, you may get your fresh dairy first.</p><p>Is it an issue? Not a huge one, as you just want them to come asap so you have fresh stuff to sell. When would it be an issue? If you were running the Milk Shake Cafe and needed both to make your special strawberry shake recipe.</p><p>In many systems, ordering is not a key concern, especially when we partition the workload. The issue may arise when we need to correlate separate actions.</p><p>In the last article, we showed the payment verification workflow. To make the final decision, we needed to correlate data from the external payment gateway with our own modules, which calculate fraud scores, check limits, and assess risk. Only after receiving data on available merchant limits and the fraud assessment score could we make the final decision. Those pieces of information could return to us at different times and in a different order.</p><p>To resolve it, we gathered and aggregated data as it arrived. Then, after each step, we checked whether we had all the data. If we did, we made the final decision; if not, we stored what we had, assuming that the missing data would arrive at some point.</p><p>If we model that as a workflow, then it&#8217;d look like this:</p><pre><code><code>function decide(
  current: PaymentVerification | null,
  event: PaymentVerificationEvent
):
  | PaymentVerification
  | { document: PaymentVerification; events: VerificationEvent[] } {
  current = current ?? {
    paymentId: event.paymentId,
    initialState,
  }

  switch (event.type) {
    // (...) other event handlers
    case "MerchantLimitsChecked": {
      const updated = {
        ...current,
        merchantLimits: {
          withinLimits: event.withinLimits,
          dailyRemaining: event.dailyRemaining,
          checkedAt: event.checkedAt,
        },
        lastUpdated: event.checkedAt,
      };

      return tryCompleteVerification(updated, event);
    }
    case "FraudScoreCalculated": {
      if (
        current.fraudAssessment &amp;&amp;
        event.calculatedAt &lt;= current.fraudAssessment.assessedAt
      )
        return current;

      const updated = {
        ...current,
        fraudAssessment: {
          score: event.score,
          riskLevel: event.riskLevel,
          assessedAt: event.calculatedAt,
        },
        lastUpdated: event.calculatedAt,
      };

      return tryCompleteVerification(updated, event);
    }
  }
};</code></code></pre><p>And:</p><pre><code><code>function tryCompleteVerification(
  current: PaymentVerification,
  event: PaymentVerificationEvent
):
  | PaymentVerification
  | { document: PaymentVerification; events: VerificationEvent[] } {
  // Ignore if we already made decision
  if (current.decision)
    return current;

  // Check if we now have BOTH critical pieces
  if (!current.fraudAssessment || !current.merchantLimits)
    // Don&#8217;t have both yet - stay in processing
    return {
      ...current,
      status: "processing",
      dataQuality: "processing",
    };

  // Use the timestamp of the event that completed the data set;
  // both handlers store it in lastUpdated, whichever event came last.
  const decision =
    current.fraudAssessment.riskLevel === "high"
      ? {
          approval: "declined",
          reason: "High fraud risk",
          decidedAt: current.lastUpdated,
        }
      : !current.merchantLimits.withinLimits
      ? {
          approval: "declined",
          reason: "Merchant limits exceeded",
          decidedAt: current.lastUpdated,
        }
      : {
          approval: "approved",
          reason: "Verified",
          decidedAt: current.lastUpdated,
        };

  return {
    document: {
      ...current,
      status: decision.approval,
      decision,
    },
    events: [
      {
        type: "PaymentVerificationCompleted",
        data: decision,
      },
    ],
  };
};</code></code></pre><p>This works fine, as we know precisely which steps need to happen, so we know what we&#8217;re waiting for. And that&#8217;s how we made the full loop to Ben&#8217;s question. What if we didn&#8217;t know which steps we&#8217;re waiting for?</p><h2>How to know what we don&#8217;t know?</h2><p>Let&#8217;s have a look at the case brought by Ben: the e-commerce flow. First, we complete the shopping cart by adding and removing items, then we confirm it. The example event flow could look as follows for the online food ordering:</p><pre><code><code>ItemAddedToCart     (cartId: 1, name: Pizza Napoletana) 
ItemAddedToCart     (cartId: 1, name: Pizza Napoletana)
ItemRemovedFromCart (cartId: 1, name: Pizza Napoletana)
CartConfirmed       (cartId: 1, confirmedAt: 2025-11-03 11:44:27)</code></code></pre><p>We see here that someone added the first Pizza, then maybe accidentally added it again, corrected their mistake, and confirmed the order.</p><p>Then, if that was an online ordering system and we had it integrated with the kitchen ordering, then we could get those events in a different order, for instance:</p><pre><code><code>ItemRemovedFromCart (cartId: 1, name: Pizza Napoletana)
CartConfirmed       (cartId: 1, confirmedAt: 2025-11-03 11:44:27)
ItemAddedToCart     (cartId: 1, name: Pizza Napoletana) 
ItemAddedToCart     (cartId: 1, name: Pizza Napoletana)</code></code></pre><p>We see that someone removed one Pizza from their shopping cart, which suggests that some information is missing. When we get a confirmation event, we still know that there&#8217;s more to come, as a confirmed order whose only known change is a removal doesn&#8217;t make sense. The same goes for the information that one pizza was added; when we correlate it with the removal event having the same cart identifier, we still see zero items in the shopping cart. Once we get the next event, we will finally know that we have at least one item in our shopping cart.</p><p>Can we then proceed? Maybe yes and maybe no. For this particular order, it&#8217;d be correct, but what if the real sequence of events was:</p><pre><code><code>ItemAddedToCart     (cartId: 1, name: Pizza Napoletana)
ItemAddedToCart     (cartId: 1, name: Pizza Napoletana)
ItemRemovedFromCart (cartId: 1, name: Pizza Napoletana)
ItemAddedToCart     (cartId: 1, name: Spaghetti Carbonara)
ItemRemovedFromCart (cartId: 1, name: Pizza Napoletana)
CartConfirmed       (cartId: 1, confirmedAt: 2025-11-03 11:44:27)</code></code></pre><p>Also, since messaging systems retry to ensure delivery, how would we know that those &#8220;doubled&#8221; events for adding or removing are actually distinct events and not just retries?</p><p>For instance, in such a delivery case:</p><pre><code><code>ItemAddedToCart     (cartId: 1, name: Pizza Napoletana)
ItemAddedToCart     (cartId: 1, name: Pizza Napoletana)
ItemRemovedFromCart (cartId: 1, name: Pizza Napoletana)
CartConfirmed       (cartId: 1, confirmedAt: 2025-11-03 11:44:27)
ItemAddedToCart     (cartId: 1, name: Spaghetti Carbonara)
ItemRemovedFromCart (cartId: 1, name: Pizza Napoletana)</code></code></pre><p>Let&#8217;s discuss a few strategies to deal with that!</p><h2>External vs Internal events</h2><blockquote><ul><li><p>Doctor, it hurts when I bend my arm this way.</p></li><li><p>Then don&#8217;t bend it this way</p></li></ul></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_cEI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34a13cdc-8181-46d0-9e21-46abf14760fe_800x800.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_cEI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34a13cdc-8181-46d0-9e21-46abf14760fe_800x800.jpeg 424w, https://substackcdn.com/image/fetch/$s_!_cEI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34a13cdc-8181-46d0-9e21-46abf14760fe_800x800.jpeg 848w, https://substackcdn.com/image/fetch/$s_!_cEI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34a13cdc-8181-46d0-9e21-46abf14760fe_800x800.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!_cEI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34a13cdc-8181-46d0-9e21-46abf14760fe_800x800.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_cEI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34a13cdc-8181-46d0-9e21-46abf14760fe_800x800.jpeg" width="454" height="454" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/34a13cdc-8181-46d0-9e21-46abf14760fe_800x800.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:800,&quot;width&quot;:800,&quot;resizeWidth&quot;:454,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;doctor&quot;,&quot;title&quot;:&quot;doctor&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="doctor" title="doctor" srcset="https://substackcdn.com/image/fetch/$s_!_cEI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34a13cdc-8181-46d0-9e21-46abf14760fe_800x800.jpeg 424w, https://substackcdn.com/image/fetch/$s_!_cEI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34a13cdc-8181-46d0-9e21-46abf14760fe_800x800.jpeg 848w, https://substackcdn.com/image/fetch/$s_!_cEI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34a13cdc-8181-46d0-9e21-46abf14760fe_800x800.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!_cEI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34a13cdc-8181-46d0-9e21-46abf14760fe_800x800.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>One of the most common mistakes, and one we usually learn about too late, is not separating our events into internal and external (or private and public). <a href="https://event-driven.io/en/events_should_be_as_small_as_possible/">Internal information can and should be more granular</a>. We need it to be precise in capturing the business context and making our decision.</p><p>Yet, other parts of our system don&#8217;t need to know all of that. Is the kitchen interested in the details of all the changes a procrastinating customer made to their shopping cart? No, they just want the final information on which meal they need to prepare.</p><p>So in our example, we could publish to the outside world just:</p><pre><code><code>CartConfirmed {
    cartId: 1,
    items: [
        { name: Pizza Napoletana }
    ],
    confirmedAt: 2025-11-03 11:44:27
}</code></code></pre><p>This type of event is also called a <strong>Summary Event</strong>. We should not mistake it for <em>the latest state</em>. It&#8217;s still an event because it tells what has happened business-wise. It gathers all the information needed for other modules and summarises the changes. And no more than that. It should still be as small as possible and expose only the information that other modules need. It&#8217;s a contract made between different teams. I wrote about it in detail in <a href="https://event-driven.io/en/internal_external_events/">Internal and external events, or how to design event-driven API</a>.</p><p>We can define the internal event API as:</p><pre><code><code>type ItemAddedToCart = {
  type: 'sc:int:ItemAddedToCart';
  data: {
    cartId: string;
    productItem: ProductItem;
  };
};

type ItemRemovedFromCart = {
  type: 'sc:int:ItemRemovedFromCart';
  data: {
    cartId: string;
    productItem: ProductItem;
  };
};

type CartConfirmed = {
  type: 'sc:int:CartConfirmed';
  data: {
    cartId: string;
    confirmedAt: Date;
  };
};

type ShoppingCartEvent =
  | ItemAddedToCart
  | ItemRemovedFromCart
  | CartConfirmed;
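
// A minimal sketch, not part of the original event API: one possible shape of
// the internal ShoppingCart state and an evolve function that folds internal
// events into it. The ShoppingCart fields and the evolve name are assumptions
// made for illustration. The event parameter is typed structurally so the
// sketch stands on its own; any ShoppingCartEvent is assignable to it.
type ShoppingCart = {
  cartId: string;
  productItems: { productId: string; quantity: number }[];
  confirmedAt?: Date;
};

function evolve(
  state: ShoppingCart | null,
  event: {
    type: string;
    data: {
      cartId: string;
      productItem?: { productId: string; quantity: number };
      confirmedAt?: Date;
    };
  },
): ShoppingCart {
  // The first event for a cart starts from an empty state.
  const cart: ShoppingCart = state ?? {
    cartId: event.data.cartId,
    productItems: [],
  };
  const item = event.data.productItem;

  switch (event.type) {
    case 'sc:int:ItemAddedToCart':
      // Simplification: every added item is kept as a separate entry.
      return item
        ? { ...cart, productItems: [...cart.productItems, item] }
        : cart;
    case 'sc:int:ItemRemovedFromCart': {
      if (!item) return cart;
      // Drop the first entry for that product; ignore unknown products.
      const index = cart.productItems.findIndex(function (existing) {
        return existing.productId === item.productId;
      });
      if (index === -1) return cart;
      return {
        ...cart,
        productItems: cart.productItems.filter(function (_, i) {
          return i !== index;
        }),
      };
    }
    case 'sc:int:CartConfirmed':
      return { ...cart, confirmedAt: event.data.confirmedAt };
    default:
      return cart;
  }
}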

interface ProductItem {
  productId: string;
  quantity: number;
}</code></code></pre><p>and public as:</p><pre><code><code>type CartOpened = {
  type: 'sc:ext:CartOpened';
  data: {
    cartId: string;
    openedAt: Date;
  };
};

type CartConfirmed = {
  type: 'sc:ext:CartConfirmed';
  data: {
    cartId: string;
    productItems: { productId: string; quantity: number }[];
    confirmedAt: Date;
  };
};

type ShoppingCartExternalEvent = CartOpened | CartConfirmed;</code></code></pre><p>As you see, we can have more than one summary event, and they don&#8217;t need to map one-to-one to internal events. Maybe we also have an analytics module that analyses how long it takes the user to make a final decision after adding the first product. Then we may decide to expose an event like CartOpened, hiding the details of the internal flow. We&#8217;re also protecting ourselves by <a href="https://event-driven.io/en/how_to_do_event_versioning/">minimising the need for versioning when the flow changes</a>.</p><p>OK, but how do we map internal events to external ones?</p><p>We can enrich them using a function like this:</p><pre><code><code>import type { ShoppingCartExternalEvent } from './shoppingCart.external';
import type { ShoppingCart, ShoppingCartEvent } from './shoppingCart.internal';

const enrich = (
  event: ShoppingCartEvent,
  state: ShoppingCart | null,
): ShoppingCartExternalEvent | [] =&gt; {
  switch (event.type) {
    case 'sc:int:ItemAddedToCart':
      return state == null
        ? {
            type: 'sc:ext:CartOpened',
            data: {
              cartId: event.data.cartId,
              openedAt: new Date(),
            },
          }
        : [];
    case 'sc:int:CartConfirmed':
      return {
        type: &#8216;sc:ext:CartConfirmed&#8217;,
        data: {
          cartId: event.data.cartId,
          productItems: state?.productItems ?? [],
          confirmedAt: event.data.confirmedAt,
        },
      };
    default:
      return [];
  }
};</code></code></pre><p>We can then subscribe to internal events in our module, load the state (best to build it from events if we&#8217;re using Event Sourcing), and publish enriched events externally.</p><p><strong>If we&#8217;re using messaging, this also means separating queues/topics.</strong> If you&#8217;re using Kafka for both internal and external communication, then you should use a separate topic for each, e.g.:</p><ul><li><p>&#8216;carts:events:int&#8217;</p></li><li><p>&#8216;carts:events:out&#8217;.</p></li></ul><p>Similarly, for RabbitMQ or similar tools, you should have separate queues for internal and external communication.</p><p>This is important, as you can now:</p><ul><li><p>Publish only the messages that other modules need, decreasing the number of issues with ordering,</p></li><li><p>Use enrichment as an anti-corruption layer that shields consumers from your internal process changes,</p></li><li><p>Scale internal and external events differently. Maybe for internal, you don&#8217;t even need a messaging system, maybe <a href="https://event-driven.io/en/outbox_inbox_patterns_and_delivery_guarantees_explained/">outbox or event store subscriptions</a> will be enough? Maybe you could cut costs using AWS SQS for internal communication and AWS Kinesis for cross-module?</p></li><li><p>Define different security and retention policies for those topics.</p></li></ul><p>Sweet, right?</p><h2>It&#8217;s not me, it&#8217;s them</h2><p>Maybe it&#8217;s sweet enough for you, but you may also say:</p><blockquote><p>But Oskar, it&#8217;s not me, it&#8217;s them. If I were responsible for that, I&#8217;d go this way, but I can&#8217;t change how messages are published.</p></blockquote><p>I could handwave it and say I pity you, but well, this can actually happen. 
Let&#8217;s see what else we could do about it.</p><p>The first idea could be: Let&#8217;s add timestamps!</p><p>Let&#8217;s see how it looks for our example:</p><pre><code><code>11:40:10 - ItemAddedToCart     (cartId: 1, name: Pizza Napoletana) 
11:40:10 - ItemAddedToCart     (cartId: 1, name: Pizza Napoletana)
11:42:13 - ItemRemovedFromCart (cartId: 1, name: Pizza Napoletana)
11:43:18 - ItemAddedToCart     (cartId: 1, name: Spaghetti Carbonara)
11:44:23 - ItemRemovedFromCart (cartId: 1, name: Pizza Napoletana)
11:44:27 - CartConfirmed       (cartId: 1)</code></code></pre><p>And the out of order delivery:</p><pre><code><code>11:40:10 - ItemAddedToCart     (cartId: 1, name: Pizza Napoletana) 
11:40:10 - ItemAddedToCart     (cartId: 1, name: Pizza Napoletana)
11:44:23 - ItemRemovedFromCart (cartId: 1, name: Pizza Napoletana)
11:44:27 - CartConfirmed       (cartId: 1, confirmedAt: 2025-11-03 11:44:27)
11:43:18 - ItemAddedToCart     (cartId: 1, name: Spaghetti Carbonara)
11:42:13 - ItemRemovedFromCart (cartId: 1, name: Pizza Napoletana)</code></code></pre><p>Would that help? No. How would we know, based on timestamps alone, that two more events will still arrive after the confirmation? Timestamps only tell us when a certain operation happened. They could help us order the events that were delivered, but they won&#8217;t tell us what we&#8217;re missing. How would we know that there&#8217;s a gap in our knowledge? Within a minute, a user could do nothing, or add or remove a few more items. Also, timestamps might work if the data is coming from the same node, but we don&#8217;t have any guarantees across nodes. Read more about <a href="https://en.wikipedia.org/wiki/Clock_drift">clock drift</a>.</p><p>What we actually need is a logical clock: one that increments after each operation. So something like:</p><pre><code><code>1 - ItemAddedToCart     (cartId: 1, name: Pizza Napoletana) 
2 - ItemAddedToCart     (cartId: 1, name: Pizza Napoletana)
3 - ItemRemovedFromCart (cartId: 1, name: Pizza Napoletana)
4 - ItemAddedToCart     (cartId: 1, name: Spaghetti Carbonara)
5 - ItemRemovedFromCart (cartId: 1, name: Pizza Napoletana)
6 - CartConfirmed       (cartId: 1)</code></code></pre><p>If we had such a clock, then our out-of-order delivery would look as follows:</p><pre><code><code>2 - ItemAddedToCart     (cartId: 1, name: Pizza Napoletana) 
1 - ItemAddedToCart     (cartId: 1, name: Pizza Napoletana)
5 - ItemRemovedFromCart (cartId: 1, name: Pizza Napoletana)
6 - CartConfirmed       (cartId: 1, confirmedAt: 2025-11-03 11:44:27)
4 - ItemAddedToCart     (cartId: 1, name: Spaghetti Carbonara)
3 - ItemRemovedFromCart (cartId: 1, name: Pizza Napoletana)</code></code></pre><p>If this number is monotonic and gapless, then when we get an event with number 3, we know we have complete information only if we&#8217;ve also received events 1 and 2; if not, we&#8217;re missing something.</p><p>We still need to define the completion criteria and determine, from a business perspective, when we can make a decision or proceed to the next step. Here we know that we can start preparing a meal when the shopping cart is confirmed.</p><p>In this case, by the time the confirmation arrives, we&#8217;ve got the events in the following order: 2, 1, 5, 6.</p><p>We know we&#8217;re missing events 3 and 4, so we need to wait for them. Only when we receive them can we proceed. Ok, but how do we do it?</p><p>What if we kept a list of pending events in our data model? Let&#8217;s try that!</p><p>Our kitchen order could look as follows:</p><pre><code><code>type KitchenOrder = {
  orderId: string;
  productItems: ProductItem[];
  status: 'Incomplete' | 'Ready' | 'InPreparation';
};

type KitchenOrderCommand =
  | {
      type: 'AddItem' | 'RemoveItem';
      productId: string;
      quantity: number;
    }
  | {
      type: 'Confirm';
      orderId: string;
    };
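
// A hedged sketch, not part of the original model: `findMissingRevisions` is a
// hypothetical helper showing how gapless revisions reveal what we still wait
// for. It assumes revisions start at 1.
function findMissingRevisions(received: number[]): number[] {
  const seen = new Set(received);
  const highest = Math.max(0, ...received);
  const missing: number[] = [];

  for (let revision = 1; revision &lt;= highest; revision++) {
    if (!seen.has(revision)) missing.push(revision);
  }

  return missing;
}

// For the delivery above (2, 1, 5, 6), this yields [3, 4].
const stillMissing = findMissingRevisions([2, 1, 5, 6]);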

type ProductItem = {
  productId: string;
  quantity: number;
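};</code></code></pre><p>When storing it, we could store it with additional metadata:</p><pre><code><code>// Assumed shape (not defined in the original snippet): a command plus the
// revision of the event it was built from.
type PendingCommand&lt;Command&gt; = Command &amp; {
  metadata: { revision: number };
};

type DocumentWithPendingCommands&lt;State, Command&gt; = State &amp; {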
  metadata: {
    lastProcessedRevision: number;
    pendingCommands: PendingCommand&lt;Command&gt;[];
  };
};</code></code></pre><p>As you see, besides the regular data, we have two other properties: pending commands and the last processed revision.</p><p>You can think about pending commands as your git repository on your local disk. It contains the list of all changes that you&#8217;ll eventually push. The rest of the data is like the remote git repository. It will be updated when you push your changes there. Then, the last processed revision will be updated with the revision of the last applied command.</p><p>The code for that workflow could look as follows:</p><pre><code><code>function handle(
  event: ShoppingCartEvent,
  document: DocumentWithPendingCommands&lt;
    KitchenOrder,
    KitchenOrderCommand
  &gt; | null,
): DocumentWithPendingCommands&lt;KitchenOrder, KitchenOrderCommand&gt; {
  const { metadata, ...data } = document ?? {
    metadata: {
      lastProcessedRevision: 0,
      pendingCommands: [],
    },
  };

  const state: KitchenOrder = {
    orderId: event.data.cartId,
    productItems: [],
    status: 'Incomplete',
    ...data,
  };

  const updated = {
    ...state,
    metadata: {
      lastProcessedRevision: metadata.lastProcessedRevision,
      pendingCommands: [
        ...metadata.pendingCommands,
        mapToPendingCommand(event),
      ],
    },
  };

  return handlePendingCommands(updated, decide);
}</code></code></pre><p>It takes the current state of the kitchen order. If it doesn&#8217;t exist, then we need to set it up with default data. We&#8217;re also appending a pending command that&#8217;s built from an event.</p><pre><code><code>function mapToPendingCommand(
  event: ShoppingCartEvent,
): PendingCommand&lt;KitchenOrderCommand&gt; {
  switch (event.type) {
    case 'sc:int:ItemAddedToCart':
      return {
        type: 'AddItem',
        productId: event.data.productItem.productId,
        quantity: event.data.productItem.quantity,
        metadata: {
          revision: event.metadata?.revision,
        },
      };
    case 'sc:int:ItemRemovedFromCart':
      return {
        type: 'RemoveItem',
        productId: event.data.productItem.productId,
        quantity: event.data.productItem.quantity,
        metadata: {
          revision: event.metadata?.revision,
        },
      };
    case 'sc:int:CartConfirmed':
      return {
        type: 'Confirm',
        orderId: event.data.cartId,
        metadata: {
          revision: event.metadata?.revision,
        },
      };
  }
}</code></code></pre><p>Why not store the regular event here? We could. The benefit is that then we&#8217;d have all the data stored as it came. This could make troubleshooting and correction easier. The downside? We&#8217;re coupling our internal business logic to the event. Also, keeping the whole event payload increases the size of our data. The choice is yours.</p><p>As you see, we&#8217;re taking the revision from the event metadata. We&#8217;ll get to how to fill it on the producer side later. Now, let&#8217;s focus on the workflow.</p><p>Let&#8217;s see what processing pending commands looks like:</p><pre><code><code>function handlePendingCommands&lt;State, Command&gt;(
  document: DocumentWithPendingCommands&lt;State, Command&gt;,
  decide: (command: Command, state: State) =&gt; State,
): DocumentWithPendingCommands&lt;State, Command&gt; {
  const { metadata, ...data } = document;

  const commandsToHandle = getCommandsReadyToHandle(
    metadata.pendingCommands,
    metadata.lastProcessedRevision,
  );

  // Nothing to see here, please disperse
  if (commandsToHandle.length === 0) return document;

  let state = data as State;
  for (const command of commandsToHandle) {
    state = decide(command, state);
  }

  const lastCommand = commandsToHandle[commandsToHandle.length - 1];

  return {
    ...state,
    metadata: {
      lastProcessedRevision: lastCommand.metadata.revision,
      pendingCommands: metadata.pendingCommands.filter(
        (a) =&gt; a.metadata.revision &gt; lastCommand.metadata.revision,
      ),
    },
  };
}</code></code></pre><p>We need to get the commands that are ready to be handled. For instance, if the last processed revision is 0 and the pending list contains commands with revisions 2, 6, 1, 4 and the newly added 3, then that means:</p><ul><li><p>We can process commands 1, 2, 3, 4,</p></li><li><p>Command 6 remains pending, as we&#8217;re missing 5.</p></li></ul><p>The filtering can look as follows:</p><pre><code><code>function getCommandsReadyToHandle&lt;Command&gt;(
  pending: PendingCommand&lt;Command&gt;[],
  lastProcessedRevision: number,
): PendingCommand&lt;Command&gt;[] {
  return (
    [...pending]
      // filter out commands that have already been processed
      .filter((cmd) =&gt; cmd.metadata.revision &gt; lastProcessedRevision)
      // sort by revision to ensure correct order
      .sort((a, b) =&gt; a.metadata.revision - b.metadata.revision)
      // only take commands that are consecutive in terms of revision,
      // starting right after the last processed one
      .reduce&lt;PendingCommand&lt;Command&gt;[]&gt;((acc, command) =&gt; {
        const previousRevision =
          acc[acc.length - 1]?.metadata.revision ?? lastProcessedRevision;

        return command.metadata.revision === previousRevision + 1
          ? [...acc, command]
          : acc;
      }, [])
  );
}</code></code></pre><p>Once we have the commands that are ready to process, we need to run the actual logic for each of them. In our case, it could look like this:</p><pre><code><code>
function decide(
  command: KitchenOrderCommand,
  order: KitchenOrder,
): KitchenOrder {
  switch (command.type) {
    case 'AddItem':
    case 'RemoveItem': {
      if (order.status !== 'Incomplete') return order;

      const updatedItems = new Map(
        order.productItems.map((item) =&gt; [item.productId, item.quantity]),
      );
      const current = updatedItems.get(command.productId) ?? 0;

      const multiplier = command.type === 'AddItem' ? 1 : -1;
      const updated = current + multiplier * command.quantity;

      if (updated &gt; 0) {
        updatedItems.set(command.productId, updated);
      } else {
        updatedItems.delete(command.productId);
      }

      return {
        ...order,
        productItems: Array.from(updatedItems.entries()).map(
          ([productId, quantity]) =&gt; ({
            productId,
            quantity,
          }),
        ),
      };
    }
    case 'Confirm':
      if (order.status !== 'Incomplete') return order;

      return {
        ...order,
        status: 'Ready',
      };
  }
}
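
// Worked example (a hedged sketch, not the article's implementation):
// `traceReleasedRevisions` is a hypothetical helper that applies the same
// "only consecutive revisions" rule to bare numbers, to show which revisions
// get processed after each delivery. It assumes revisions start at 1.
function traceReleasedRevisions(arrivals: number[]): number[][] {
  let lastProcessed = 0;
  let pending: number[] = [];
  const released: number[][] = [];

  for (const revision of arrivals) {
    pending = [...pending, revision].sort((a, b) =&gt; a - b);

    // take revisions only while each one directly follows the previous one
    const ready: number[] = [];
    for (const candidate of pending) {
      if (candidate === lastProcessed + ready.length + 1) ready.push(candidate);
    }

    lastProcessed += ready.length;
    pending = pending.filter((candidate) =&gt; candidate &gt; lastProcessed);
    released.push(ready);
  }

  return released;
}

// Delivery order from the example above: 2, 1, 5, 6, 4, 3
// -> [[], [1, 2], [], [], [], [3, 4, 5, 6]]
const releasedPerDelivery = traceReleasedRevisions([2, 1, 5, 6, 4, 3]);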

type KitchenOrderCommand =
  | {
      type: 'AddItem' | 'RemoveItem';
      productId: string;
      quantity: number;
    }
  | {
      type: 'Confirm';
      orderId: string;
    };</code></code></pre><p>After processing all the ready commands, we&#8217;re returning the document with the updated state, the processed commands removed from the pending list, and the last processed revision set to the revision of the last processed command.</p><p>Not that big a hassle as it may seem, but&#8230;</p><h2>What&#8217;s revision and how to get it?</h2><p>A revision needs to be created on the producer side. The best way is to use Optimistic Concurrency for that. If you don&#8217;t know what optimistic concurrency or locking is, then you should. Check my intros:</p><ul><li><p><a href="https://event-driven.io/en/optimistic_concurrency_for_pessimistic_times/">Optimistic concurrency for pessimistic times</a>,</p></li><li><p><a href="https://event-driven.io/en/how_to_use_etag_header_for_optimistic_concurrency/">How to use ETag header for optimistic concurrency</a>.</p></li></ul><p>Essentially, if you&#8217;re:</p><ul><li><p>using a typical implementation of optimistic concurrency where each record/document change increments its version/revision,</p></li><li><p>publishing an event after each business operation,</p></li></ul><p>then you can take this incremented state revision and pass it in your event metadata. Having that, you&#8217;d know precisely at which revision the event was recorded, and you&#8217;d get gapless, monotonic numbering. It&#8217;d also ensure that messages are produced in a specific order, as operations on a specific record will be sequential. Read also more on how revision can help in <a href="https://event-driven.io/en/dealing_with_eventual_consistency_and_idempotency_in_mongodb_projections/">Dealing with Eventual Consistency and Idempotency in projections</a>.</p><blockquote><p>But Oskar, what if I have more than one record?</p></blockquote><p>Well, then you need to store a revision for each of them. But this won&#8217;t help if you need to correlate data between them, as a revision is monotonic and gapless only within a specific record.</p><p>What about global positions? 
Well, they&#8217;re useful for knowing the order of things, but they won&#8217;t help here, as they&#8217;re monotonic but may have gaps.</p><p>Read more on why in:</p><ul><li><p><a href="https://www.architecture-weekly.com/p/how-does-kafka-know-what-was-the">How does Kafka know what was the last message it processed? Deep dive into Offset Tracking</a>,</p></li><li><p><a href="https://event-driven.io/en/lets_talk_about_positions_in_event_stores/">Let&#8217;s talk about positions in event stores</a>,</p></li><li><p><a href="https://event-driven.io/en/ordering_in_postgres_outbox/">How Postgres sequences issues can impact your messaging guarantees</a>.</p></li></ul><p>Then you&#8217;re back to square one, and the <a href="https://www.architecture-weekly.com/p/dealing-with-race-conditions-in-event">previous article</a>.</p><h2>TLDR</h2><p>Proper modelling in Event-Driven Architecture can spare you a lot of complicated implementation tricks.</p><p>If you:</p><ul><li><p>define essential events for your process,</p></li><li><p>ensure that they have completeness of information,</p></li><li><p>shape contracts and communication between modules, respecting the internal and external split,</p></li></ul><p>then things get easier to handle, and we can define conditions that tell us when to take action.</p><p>Still, sometimes we may:</p><ul><li><p>have strict ordering needs,</p></li><li><p>be using a queue that doesn&#8217;t give us an ordering guarantee,</p></li><li><p>need to adjust to the other teams.</p></li></ul><p>Then, using a revision can be a decent option to solve things in an organised way.</p><p><strong>If you&#8217;re dealing with such issues, I&#8217;m happy to help you through consulting or mentoring. 
<a href="mailto:oskar@event-driven.io">Contact me</a> and we&#8217;ll find a way to unblock you!</strong></p><p>Read also more in:</p><ul><li><p><a href="https://www.architecture-weekly.com/p/the-order-of-things-why-you-cant">The Order of Things: Why You Can&#8217;t Have Both Speed and Ordering in Distributed Systems</a>,</p></li><li><p><a href="https://event-driven.io/en/internal_external_events/">Internal and external events, or how to design event-driven API</a>,</p></li><li><p><a href="https://event-driven.io/en/simple_trick_for_idempotency_handling_in_elastic_search_readm_model/">Dealing with Eventual Consistency and Idempotency in MongoDB projections</a></p></li><li><p><a href="https://event-driven.io/en/saga_process_manager_distributed_transactions/">Saga and Process Manager - distributed processes in practice</a>,</p></li><li><p><a href="https://www.architecture-weekly.com/p/predictable-identifiers-enabling">Predictable Identifiers: Enabling True Module Autonomy in Distributed Systems</a></p></li><li><p><a href="https://www.architecture-weekly.com/p/dealing-with-eventual-consistency">Dealing with Eventual Consistency, and Causal Consistency using Predictable Identifiers</a>,</p></li><li><p><a href="https://event-driven.io/en/event_driven_distributed_processes_by_example/">Event-driven distributed processes by example</a>,</p></li><li><p><a href="https://www.architecture-weekly.com/p/workflow-engine-design-proposal-tell">Workflow Engine design proposal, tell me your thoughts</a>,</p></li><li><p><a href="https://event-driven.io/en/how_to_have_fun_with_typescript_and_workflow/">How TypeScript can help in modelling business workflows</a>,</p></li><li><p><a href="https://event-driven.io/en/how_to_update_past_data_in_event_sourcing/">Oops I did it again, or how to update past data in Event Sourcing</a>,</p></li><li><p><a href="https://event-driven.io/en/event_transformations_and_loosely_coupling/">Event transformations, a tool to keep our processes loosely 
coupled</a>,</p></li><li><p><a href="https://event-driven.io/en/testing_asynchronous_processes_with_a_little_help_from_dotnet_channels/">Testing asynchronous processes with a little help from .NET Channels</a>.</p></li></ul><p>Cheers!</p><p>Oskar</p><p>p.s. <strong>Ukraine is still under brutal Russian invasion. A lot of Ukrainian people are hurt, without shelter and need help.</strong> You can help in various ways, for instance, directly helping refugees, spreading awareness, putting pressure on your local government or companies. You can also support Ukraine by donating e.g. to <a href="https://www.icrc.org/pl/donate/ukraine">Red Cross</a>, <a href="https://savelife.in.ua/pl/donate/">Ukraine humanitarian organisation</a> or <a href="https://www.gofundme.com/f/help-to-save-the-lives-of-civilians-in-a-war-zone">donate Ambulances for Ukraine</a>.</p>]]></content:encoded></item><item><title><![CDATA[Dealing with Race Conditions in Event-Driven Architecture with Read Models]]></title><description><![CDATA[My events came out of order! What should I do? Are you familiar with the term "phantom record" and its benefits? No? Let me explain it to you today. Let's discuss how to embrace the chaos and learn to deal with it.]]></description><link>https://www.architecture-weekly.com/p/dealing-with-race-conditions-in-event</link><guid isPermaLink="false">https://www.architecture-weekly.com/p/dealing-with-race-conditions-in-event</guid><dc:creator><![CDATA[Oskar Dudycz]]></dc:creator><pubDate>Mon, 20 Oct 2025 18:02:36 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Nf89!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e700ce6-5b15-42dd-8766-50d324343a24_626x391.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote><p>My events came out of order! What should I do?!</p></blockquote><p><strong>Are you familiar with the term &#8220;phantom record&#8221; and its benefits? No? 
Let me explain it to you today.</strong></p><p><strong>Everyone has a plan until they get punched in the mouth.</strong> Design, architecture, and modelling are important, but it&#8217;s the actual code that reaches production. That&#8217;s also the place where we see all those nasty issues that we haven&#8217;t foreseen, like: race conditions, eventual inconsistency, idemNotency, etc.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Nf89!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e700ce6-5b15-42dd-8766-50d324343a24_626x391.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Nf89!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e700ce6-5b15-42dd-8766-50d324343a24_626x391.png 424w, https://substackcdn.com/image/fetch/$s_!Nf89!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e700ce6-5b15-42dd-8766-50d324343a24_626x391.png 848w, https://substackcdn.com/image/fetch/$s_!Nf89!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e700ce6-5b15-42dd-8766-50d324343a24_626x391.png 1272w, https://substackcdn.com/image/fetch/$s_!Nf89!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e700ce6-5b15-42dd-8766-50d324343a24_626x391.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Nf89!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e700ce6-5b15-42dd-8766-50d324343a24_626x391.png" width="626" height="391" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8e700ce6-5b15-42dd-8766-50d324343a24_626x391.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:391,&quot;width&quot;:626,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:609610,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.architecture-weekly.com/i/176638730?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e700ce6-5b15-42dd-8766-50d324343a24_626x391.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Nf89!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e700ce6-5b15-42dd-8766-50d324343a24_626x391.png 424w, https://substackcdn.com/image/fetch/$s_!Nf89!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e700ce6-5b15-42dd-8766-50d324343a24_626x391.png 848w, https://substackcdn.com/image/fetch/$s_!Nf89!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e700ce6-5b15-42dd-8766-50d324343a24_626x391.png 1272w, https://substackcdn.com/image/fetch/$s_!Nf89!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e700ce6-5b15-42dd-8766-50d324343a24_626x391.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>We realise that our BBC Architecture is just a Box-Box Cylinder on paper; in reality, things get messy. We go from a 2D layout to a 3D or even a 4D view.</p><p>We see that our business processes are not so linear and predictable as we draw them. Of course, we have somewhere in the back of our mind that subprocesses can go in parallel, but somehow our tests go down the happy path. Then bugs and incidents come in.</p><p>We learn the hard way that real processes have delays; they run in parallel. For instance, when you process a payment, fraud checking, risk assessment, and merchant validation happen simultaneously. That&#8217;s efficiency, not a flaw.</p><p>And the sad part is that it&#8217;s our role to provide this efficiency. 
Yelling at event-driven clouds won&#8217;t help.</p><h2>Race Conditions in EDA</h2><p>When we distribute our systems, we&#8217;re getting better isolation of failure, we can deploy changes at a different pace, but&#8230; But we also get communication going at various paces. We can no longer get a unified view of the business actions in the order of appearance. Now each service has its own linearity.</p><p>To integrate services predictably, we&#8217;re using messaging tools. They provide us with durability, retries when the recipient is unavailable, and ensure that information flows and data are delivered effectively. Still, they&#8217;re not magical creatures; they have their limits. We can&#8217;t cheat physics.</p><p>Between one service and the other, we&#8217;re putting in a queue. That&#8217;s the place where we can achieve an ordering guarantee. Typically, one queue, one consumer means that we can achieve ordering. Still, not all tools are giving us that.</p><p>Out of order issues may happen when the RabbitMQ queue has multiple consumers racing for messages. When we add consumers, we get better throughput, but we&#8217;re risking ordering issues.</p><p>Out of order issues may also happen when you&#8217;re using tools like SQS or Google PubSub, which only guarantee best-effort ordering.</p><p>Or when your outbox pattern deletes processed messages and loses sequence.</p><p>Or when network delays shuffle carefully ordered streams.</p><p>It&#8217;s safe to say that you won&#8217;t get any ordering guarantee between different queues. 
Since queues represent communication flows between modules, we should not assume strict ordering in cross-module communication.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!t2kZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9c09f5e-cb28-4127-b453-dbebc84d195d_1444x786.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!t2kZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9c09f5e-cb28-4127-b453-dbebc84d195d_1444x786.png 424w, https://substackcdn.com/image/fetch/$s_!t2kZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9c09f5e-cb28-4127-b453-dbebc84d195d_1444x786.png 848w, https://substackcdn.com/image/fetch/$s_!t2kZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9c09f5e-cb28-4127-b453-dbebc84d195d_1444x786.png 1272w, https://substackcdn.com/image/fetch/$s_!t2kZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9c09f5e-cb28-4127-b453-dbebc84d195d_1444x786.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!t2kZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9c09f5e-cb28-4127-b453-dbebc84d195d_1444x786.png" width="1444" height="786" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f9c09f5e-cb28-4127-b453-dbebc84d195d_1444x786.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:786,&quot;width&quot;:1444,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:118622,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.architecture-weekly.com/i/176638730?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9c09f5e-cb28-4127-b453-dbebc84d195d_1444x786.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!t2kZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9c09f5e-cb28-4127-b453-dbebc84d195d_1444x786.png 424w, https://substackcdn.com/image/fetch/$s_!t2kZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9c09f5e-cb28-4127-b453-dbebc84d195d_1444x786.png 848w, https://substackcdn.com/image/fetch/$s_!t2kZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9c09f5e-cb28-4127-b453-dbebc84d195d_1444x786.png 1272w, https://substackcdn.com/image/fetch/$s_!t2kZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9c09f5e-cb28-4127-b453-dbebc84d195d_1444x786.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2>Events vs Rumours?</h2><p>Continuing with the payment verification process example: Fraud module publishes to one queue. Risk assessment for another. The payment gateway to a third. Messages arrive at your module in the order your system received them, not the order they were created. Fraud detection service flags a high-risk payment at 10:00:01. The payment initiation event from the gateway, created at 10:00:00, arrives at 10:00:02. The fraud score arrives before the payment exists in your system. You might think sorting by timestamp solves this. It doesn&#8217;t. Clock skew between services means timestamps lie. Each service node has its own time. 
It may be close to the others, but in a high-throughput system the skew can be significant enough that sorting by timestamp causes more issues than it solves.</p><p>But, well, that&#8217;s the same as getting information from the outside world. Multiple friends can send you news; you receive it at different paces and read it in a different order. You may get obsolete news first and newer news later, or the other way round. You&#8217;re only sure of the order in which you received those messages; to get the truth, you need to correlate that information and do fact-checking based on some rules (e.g. which source is more reliable, newer, etc.). Then you can deduce the actual information.</p><p>We&#8217;re saying that in the event-driven world, events are facts. But that&#8217;s just half the truth.</p><p>Those events that we store/publish are facts for us, or at least represent the current state of our knowledge.</p><p><strong>The events from external systems are rumours at best. We need to interpret them to make them (our) facts.</strong></p><p>Ok, but what can we do? Sit and cry?</p><p>Of course, we can do more. Today we&#8217;ll discuss a simple technique with Read Models that can take you far enough.</p><p>Let&#8217;s say that you&#8217;re using Event Sourcing in your system. You take all the messages coming from other systems, and you store them in your event store.</p><p>Event Sourcing assumes you can rebuild state by replaying events in sequence. When <em>FraudScoreCalculated</em> arrives before <em>PaymentInitiated</em>, the payment doesn&#8217;t exist. You can&#8217;t apply a fraud score to nothing. You realise that when your carefully designed domain model throws exceptions. This should never happen, aye?</p><p>This isn&#8217;t specific to Event Sourcing. Even if you don&#8217;t use it but just update read models directly from events, you have the same problem. 
Your read model update handler for <em>FraudScoreCalculated</em> looks for a payment document to update. No document exists. The update fails.</p><p>Such issues are a recurring theme in my consulting, as they are a common concern for my customers. My advice is usually: Don&#8217;t try to fight them.</p><p>Store data as it arrives and &#8220;denoise&#8221; on your side. Interpret them and save your own &#8220;facts&#8221;. Acknowledge the need for a <a href="https://event-driven.io/en/internal_external_events/">split between internal and external events</a>. Use an Anti-Corruption Layer (ACL) pattern to protect from external chaos.</p><h2>Read models as Anti-Corruption Layer</h2><p>Read models can be used as such an ACL. A read model document can have a partial state, with optional fields that are filled as events arrive. You can process each event, updating the fields it can, ignoring those it can&#8217;t, and making decisions based on the available data.</p><p>Let&#8217;s explore how to apply this in practice using our payment orchestration scenario. Let&#8217;s say we&#8217;re gathering information about the payment verification process, displaying its state, and performing follow-up operations once the process is complete.</p><p>When we trigger payment, we need to correlate data from the external payment gateway with our own modules, which calculate fraud scores, check limits, and assess risk.</p><p>We could define events using TypeScript, as:</p><pre><code><code>type PaymentVerificationEvent =
  | PaymentInitiated
  | FraudScoreCalculated
  | RiskAssessmentCompleted
  | MerchantLimitsChecked
  | PaymentCompleted
  | PaymentDeclined;</code></code></pre><p>With each event type defined as:</p><pre><code><code>type PaymentInitiated = {
  type: &#8220;PaymentInitiated&#8221;;
  data: {
    paymentId: string;
    amount: number;
    currency: string;
    gatewayId: string;
    initiatedAt: Date;
  };
};

type FraudScoreCalculated = {
  type: &#8220;FraudScoreCalculated&#8221;;
  data: {
    paymentId: string;
    score: number;
    riskLevel: &#8220;low&#8221; | &#8220;medium&#8221; | &#8220;high&#8221;;
    calculatedAt: Date;
  };
};

type RiskAssessmentCompleted = {
  type: &#8220;RiskAssessmentCompleted&#8221;;
  data: {
    paymentId: string;
    riskScore: number;
    factors: string[];
    assessedAt: Date;
  };
};

type MerchantLimitsChecked = {
  type: &#8220;MerchantLimitsChecked&#8221;;
  data: {
    paymentId: string;
    withinLimits: boolean;
    dailyRemaining: number;
    checkedAt: Date;
  };
};

type PaymentCompleted = {
  type: &#8220;PaymentCompleted&#8221;;
  data: { paymentId: string; completedAt: Date };
};

type PaymentDeclined = {
  type: &#8220;PaymentDeclined&#8221;;
  data: { paymentId: string; reason: string; declinedAt: Date };
};</code></code></pre><p>Let&#8217;s say the fraud system flagged the payment as high-risk before it even existed in your system. Approval happened before the risk assessment was completed. The example race condition can look as follows:</p><pre><code><code>10:15:32.123 - FraudScoreCalculated (score: 85, high risk)
10:15:32.145 - PaymentInitiated (amount: $500)
10:15:32.167 - PaymentCompleted (approved by automated system)
10:15:32.201 - RiskAssessmentCompleted (risk: medium)
10:15:32.234 - MerchantLimitsChecked (within limits)</code></code></pre><p>If we assume that things go as drawn on the whiteboard, and our handlers can&#8217;t process events for payments that don&#8217;t exist yet, then we&#8217;ll be the ones who get punched in the mouth by reality: the handlers will fail to find documents to update.</p><p>The first step is to embrace that we have a problem. Actually, it&#8217;s not a problem, but rather a challenge - a scenario we just need to support.</p><p>We can start by defining a document that represents a payment&#8217;s verification state. The document should include optional fields since you can&#8217;t predict which event will arrive first. The projection function processes each event and updates this document, creating it if necessary.</p><p>It can look as follows:</p><pre><code><code>type PaymentVerification = {
  paymentId: string;
  payment?: Payment;
  fraudAssessment?: FraudAssessment;
  riskEvaluation?: RiskEvaluation;
  merchantLimits?: MerchantLimits;
  decision?: Decision;
  status: &#8220;unknown&#8221; | &#8220;processing&#8221; | &#8220;approved&#8221; | &#8220;declined&#8221;;
  completionPercentage: number;
  lastUpdated: Date;
  dataQuality: &#8220;partial&#8221; | &#8220;sufficient&#8221; | &#8220;complete&#8221;;
};
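
// A hedged sketch, not part of the original model: one possible way to
// derive completionPercentage and dataQuality from whichever optional
// sections have arrived so far. The helper names and the thresholds
// (all four sections = "complete", fraud assessment plus merchant
// limits = "sufficient") are assumptions made for illustration.
function isPresent(value: unknown): boolean {
  return value !== undefined;
}

function describeCompleteness(doc: {
  payment?: unknown;
  fraudAssessment?: unknown;
  riskEvaluation?: unknown;
  merchantLimits?: unknown;
}): {
  completionPercentage: number;
  dataQuality: "partial" | "sufficient" | "complete";
} {
  const sections = [
    doc.payment,
    doc.fraudAssessment,
    doc.riskEvaluation,
    doc.merchantLimits,
  ];
  // Count the sections that some event has already filled in
  const filled = sections.filter(isPresent).length;
  const completionPercentage = Math.round((filled / sections.length) * 100);

  if (filled === sections.length)
    return { completionPercentage, dataQuality: "complete" };

  // Fraud assessment and merchant limits are treated as the critical pieces
  if ([doc.fraudAssessment, doc.merchantLimits].every(isPresent))
    return { completionPercentage, dataQuality: "sufficient" };

  return { completionPercentage, dataQuality: "partial" };
}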

type Payment = {
  amount: number;
  currency: string;
  gatewayId: string;
  initiatedAt: Date;
};

type FraudAssessment = {
  score: number;
  riskLevel: &#8220;low&#8221; | &#8220;medium&#8221; | &#8220;high&#8221;;
  assessedAt: Date;
};

type RiskEvaluation = {
  score: number;
  factors: string[];
  assessedAt: Date;
};

type MerchantLimits = {
  withinLimits: boolean;
  dailyRemaining: number;
  checkedAt: Date;
};

type Decision = {
  approval: &#8220;approved&#8221; | &#8220;declined&#8221;;
  reason: string;
  decidedAt: Date;
};</code></code></pre><p>Besides the optional data that we&#8217;ll gradually fill as events arrive, we have some mandatory fields. The obvious one is <em>paymentId</em>: we need it to correlate incoming data from events. If you have a look at them, all of them carry it. Thanks to that, we know which payment we are verifying. If it were missing, we wouldn&#8217;t be able to correlate incoming data, and then we&#8217;d indeed be doomed.</p><p>Of course, sometimes things get harder. We don&#8217;t always have a single, certain id; sometimes we get other data, like an external id, an idempotence key, a correlation id, whatever id. Still, we need some field, or a combination of fields, that tells us which document to update for a given event.</p><p>How do we update our read model? We need a function that takes the current state (or null if it doesn&#8217;t exist) and the event we apply on top of it, returning the new state.</p><p>It can look as follows:</p><pre><code><code>function evolve(
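  current: PaymentVerification | null,
  event: PaymentVerificationEvent
): PaymentVerification | null {
  // No document yet: start from the initial state, keyed by the event&#8217;s paymentId
  current = current ?? {
    ...initialState,
    paymentId: event.data.paymentId,
  };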

  switch (event.type) {
    case &#8220;PaymentInitiated&#8221;: 
      return onPaymentInitiated(current, event);
    case &#8220;FraudScoreCalculated&#8221;: 
      return onFraudScoreCalculated(current, event);
    case &#8220;RiskAssessmentCompleted&#8221;: 
      return onRiskAssessmentCompleted(current, event);
    case &#8220;MerchantLimitsChecked&#8221;: 
      return onMerchantLimitsChecked(current, event);
    case &#8220;PaymentCompleted&#8221;: 
      return onPaymentCompleted(current, event);
    case &#8220;PaymentDeclined&#8221;: 
      return onPaymentDeclined(current, event);
  }
}

const initialState: PaymentVerification = { 
  paymentId: undefined!, 
  status: &#8216;unknown&#8217;, 
  completionPercentage: 0, 
  lastUpdated: new Date(), 
  dataQuality: &#8216;partial&#8217; 
};
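
// A hedged sketch for events that lack a plain paymentId: pick the first
// identifier that is present. The externalId and correlationId field names
// and the helper itself are assumptions for illustration; they are not part
// of the events defined above or of Emmett. It only works if all events
// about one payment resolve to the same value; mixing id kinds would need
// a lookup from one id to the other.
function resolveDocumentId(event: {
  data: { paymentId?: string; externalId?: string; correlationId?: string };
}): string {
  const id =
    event.data.paymentId ?? event.data.externalId ?? event.data.correlationId;

  // Without any correlation data we cannot tell which document to update
  if (id === undefined)
    throw new Error("Cannot correlate event: no known identifier found");

  return id;
}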

// Using Emmett it could be defined as
export const paymentVerificationProjection = pongoSingleStreamProjection({
  // collection to where we store projection
  collectionName: &#8216;paymentVerification&#8217;,
  getDocumentId: (event) =&gt; event.data.paymentId,
  evolve,
  canHandle: [
    &#8216;PaymentInitiated&#8217;,
    &#8216;FraudScoreCalculated&#8217;,
    &#8216;RiskAssessmentCompleted&#8217;,
    &#8216;MerchantLimitsChecked&#8217;,
    &#8216;PaymentCompleted&#8217;,
    &#8216;PaymentDeclined&#8217;,
  ],
  // use this state when there&#8217;s no document in collection of certain id
  initialState: () =&gt; initialState,
});</code></code></pre><p>And yes, even if the data is not in the state we expect, we can always create the <em><strong>Phantom Record</strong></em> that represents the best state we can build under the conditions we have.</p><p>What if fraud scoring arrives first?</p><pre><code><code>function onFraudScoreCalculated(
  current: PaymentVerification,
  { data: event }: FraudScoreCalculated): PaymentVerification {
  // Ignore event if we&#8217;re already ahead of it
  if (current.fraudAssessment &amp;&amp; event.calculatedAt &lt;= current.fraudAssessment.assessedAt)
    return current;

  const updated = {
    ...current,
    fraudAssessment: {
      score: event.score,
      riskLevel: event.riskLevel,
      assessedAt: event.calculatedAt,
    },
    lastUpdated: event.calculatedAt,
  }

  // A high-risk score declines the payment right away
  if (event.riskLevel === &#8216;high&#8217;)
    return {
      ...updated,
      status: &#8216;declined&#8217;,
      decision: {
        approval: &#8216;declined&#8217;,
        reason: `High fraud risk detected: score ${event.score}`,
        decidedAt: event.calculatedAt,
      },
    };

  return updated;
}</code></code></pre><p>I told you that you should not use a timestamp, but I just did it myself. We cannot always be prim and proper. When doing ACL, sometimes we can achieve just good enough results. If the event doesn&#8217;t have any <a href="https://martinfowler.com/articles/patterns-of-distributed-systems/lamport-clock.html">logical timestamp</a>, then we need to live with what we have. Here, we&#8217;re assuming that <em>FraudScoreCalculated</em> always comes from the same source, so its timestamps have a chance of being consistent with each other. With that, we can use them to check whether we have already applied this information, or a newer version of it. If the incoming event is older, we just don&#8217;t make any changes. Sometimes you&#8217;ve got to do what you&#8217;ve got to do.</p><p>Now, let&#8217;s have a look at payment initiation when fraud scoring might have already completed:</p><pre><code><code>function onPaymentInitiated(
  current: PaymentVerification,
  { data: event }: PaymentInitiated): PaymentVerification {
  // Ignore if we already handled this event
  if (current.payment)
    return current;

  return {
    ...current,
    payment: {
      amount: event.amount,
      currency: event.currency,
      gatewayId: event.gatewayId,
      initiatedAt: event.initiatedAt,
    },
    lastUpdated: event.initiatedAt,
  };
};</code></code></pre><p>If fraud scoring was completed first, <em>current</em> already has fraud data. The handler merges payment details into the current state.</p><p>We can also mark that in the field that I called <em>dataQuality</em>. It may not be the perfect name, but I used it to indicate that we can have a field that lets us decide whether the data is as expected, or whether we hit reality. Based on that, we can decide how to display it, or whether to display it at all. Hence the name &#8220;phantom record&#8221;. It may look like a phantom, as it&#8217;s missing some data, but it&#8217;s the best data we have.</p><p>The completion event gets the same treatment. If a high fraud score has already been recorded, fraud wins over the approval:</p><pre><code><code>function onPaymentCompleted(
  current: PaymentVerification,
  { data: event }: PaymentCompleted): PaymentVerification {
  // Ignore if we already made decision
  if (current.decision)
    return current;

  // PaymentCompleted carries only completedAt, so that is our decision time.
  // A high fraud score recorded earlier overrides the completion.
  const decision: Decision =
    current.fraudAssessment?.riskLevel === &#8216;high&#8217;
      ? {
        approval: &#8216;declined&#8217;,
        reason: `Approval attempted but overridden by fraud (score: ${current.fraudAssessment.score})`,
        decidedAt: event.completedAt,
      } : {
        approval: &#8216;approved&#8217;,
        reason: &#8216;Payment completed&#8217;,
        decidedAt: event.completedAt,
      };

  return {
    ...current,
    status: decision.approval,
    decision,
    lastUpdated: event.completedAt,
  };
}</code></code></pre><h2>Waiting for Dependencies</h2><p>Some decisions need multiple pieces. The <em>MerchantLimitsChecked</em> handler waits for both fraud assessment and merchant limits. If we receive merchant limits first and don&#8217;t have a fraud assessment yet, we simply store the information; the same applies if it goes the other way. When we eventually get both, we can make the final update, e.g. about the final payment approval or decline.</p><pre><code><code>function onMerchantLimitsChecked(
  current: PaymentVerification,
  { data: event }: MerchantLimitsChecked): PaymentVerification {
  // Ignore if we already handled this event
  if (current.merchantLimits)
    return current;

  const updated = {
    ...current,
    merchantLimits: {
      withinLimits: event.withinLimits,
      dailyRemaining: event.dailyRemaining,
      checkedAt: event.checkedAt,
    },
    lastUpdated: event.checkedAt,
  };

  // Check if we now have BOTH critical pieces
  if (!updated.fraudAssessment || !updated.merchantLimits)
    // Don&#8217;t have both yet - stay in processing
    return {
      ...updated,
      status: &#8220;processing&#8221;,
      dataQuality: &#8220;partial&#8221;,
    };

  // Both present - can make final decision
  const decision =
    updated.fraudAssessment.riskLevel === &#8220;high&#8221;
      ? {
          approval: &#8220;declined&#8221;,
          reason: &#8220;High fraud risk&#8221;,
          decidedAt: event.checkedAt,
        }
      : !updated.merchantLimits.withinLimits
      ? {
          approval: &#8220;declined&#8221;,
          reason: &#8220;Merchant limits exceeded&#8221;,
          decidedAt: event.checkedAt,
        }
      : {
          approval: &#8220;approved&#8221;,
          reason: &#8220;Verified&#8221;,
          decidedAt: event.checkedAt,
        };
        
  return {
    ...updated,
    status: decision.approval,
    decision,
  };
}</code></code></pre><p>And now we&#8217;re reaching the grey area, where this can become an anti-pattern. I even made this mistake when writing this article. I wrote initially, <em>&#8220;we can make the final decision&#8221;</em>, but then changed it to <em>&#8220;we can make the final update&#8221;</em>. Why?</p><h2>Projections, ACL responsibility and Workflows</h2><p>I&#8217;m sure you&#8217;ve used or seen Anti-Corruption Layers. They&#8217;re usually places where all the weird logic lands. We want to protect our core logic, pushing all hacks to one, hidden place. This strategy makes some sense; that&#8217;s why it&#8217;s called Anti-Corruption Layer. Yet&#8230;</p><p>Yet, we should not push it to the limits, as we&#8217;ll end up with an unmaintainable beast. With the last example, we&#8217;ve reached or even passed the limits of what projection should be responsible for. The projection should just interpret upcoming information and store the result. It should not make decisions. Business logic is responsible for making decisions.</p><p>What we actually ended up with in the last step is a form of <a href="https://www.architecture-weekly.com/p/workflow-engine-design-proposal-tell">process manager, or workflow</a>, so a state machine that listens to events, gets its current state and makes further decisions. And what&#8217;s the best way to inform the outside world about making new decisions? Well, producing a new event.</p><p>We could define an event as:</p><pre><code><code>type PaymentVerificationCompleted = {
  type: &#8220;PaymentVerificationCompleted&#8221;;
  data: { 
    approval: &#8220;approved&#8221; | &#8220;declined&#8221;;
    reason: string;
    decidedAt: Date; 
  };
}</code></code></pre><p>As we&#8217;re making decisions, let&#8217;s be frank: rename the <em>evolve</em> function to <em>decide</em>, and produce a new event once we&#8217;ve distilled a decision from the messy outside world.</p><p>It could look like this:</p><pre><code><code>function decide(
  current: PaymentVerification | null,
  event: PaymentVerificationEvent
):
  | PaymentVerification
  | { document: PaymentVerification; events: PaymentVerificationCompleted[] } {
  // No document yet: start from the initial state, keyed by the event&#8217;s paymentId
  current = current ?? {
    ...initialState,
    paymentId: event.data.paymentId,
  };

  switch (event.type) {
    // (...) other event handlers
    case &#8220;MerchantLimitsChecked&#8221;: {
      const updated = {
        ...current,
        merchantLimits: {
          withinLimits: event.data.withinLimits,
          dailyRemaining: event.data.dailyRemaining,
          checkedAt: event.data.checkedAt,
        },
        lastUpdated: event.data.checkedAt,
      };

      return tryCompleteVerification(updated, event);
    }
    case &#8220;FraudScoreCalculated&#8221;: {
      // Ignore event if we&#8217;re already ahead of it
      if (
        current.fraudAssessment &amp;&amp;
        event.data.calculatedAt &lt;= current.fraudAssessment.assessedAt
      )
        return current;

      const updated = {
        ...current,
        fraudAssessment: {
          score: event.data.score,
          riskLevel: event.data.riskLevel,
          assessedAt: event.data.calculatedAt,
        },
        lastUpdated: event.data.calculatedAt,
      };

      return tryCompleteVerification(updated, event);
    }
  }
};</code></code></pre><p>We update the state and try to complete verification once we have gathered both merchant limits and fraud assessment. This can be encapsulated in a dedicated function, as the logic is the same no matter which event came first:</p><pre><code><code>function tryCompleteVerification(
  current: PaymentVerification,
  event: PaymentVerificationEvent
):
  | PaymentVerification
  | { document: PaymentVerification; events: PaymentVerificationCompleted[] } {
  // Ignore if we already made decision
  if (current.decision)
    return current;

  // Check if we now have BOTH critical pieces
  if (!current.fraudAssessment || !current.merchantLimits)
    // Don&#8217;t have both yet - stay in processing
    return {
      ...current,
      status: &#8220;processing&#8221;,
      dataQuality: &#8220;partial&#8221;,
    };

  // The completing event can be of either type, so we take the decision time
  // from lastUpdated, which the caller set to that event&#8217;s timestamp.
  const decidedAt = current.lastUpdated;

  const decision: Decision =
    current.fraudAssessment.riskLevel === &#8220;high&#8221;
      ? {
          approval: &#8220;declined&#8221;,
          reason: &#8220;High fraud risk&#8221;,
          decidedAt,
        }
      : !current.merchantLimits.withinLimits
      ? {
          approval: &#8220;declined&#8221;,
          reason: &#8220;Merchant limits exceeded&#8221;,
          decidedAt,
        }
      : {
          approval: &#8220;approved&#8221;,
          reason: &#8220;Verified&#8221;,
          decidedAt,
        };

  return {
    document: {
      ...current,
      status: decision.approval,
      decision,
    },
    events: [
      {
        type: &#8220;PaymentVerificationCompleted&#8221;,
        data: decision,
      },
    ],
  };
};</code></code></pre><p>We&#8217;re returning not only the new state, but also the new event. It can be published to the local module queue or just stored inside the event store.</p><p><strong>Essentially, we&#8217;re making a chaotic outside world linear based on the order of our observations.</strong> We can&#8217;t change the outside world, but we can at least know why and where we&#8217;ve made our decisions.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ltoi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcea52e77-bfa8-4f39-87f3-acc1dd13956d_1211x485.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ltoi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcea52e77-bfa8-4f39-87f3-acc1dd13956d_1211x485.png 424w, https://substackcdn.com/image/fetch/$s_!ltoi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcea52e77-bfa8-4f39-87f3-acc1dd13956d_1211x485.png 848w, https://substackcdn.com/image/fetch/$s_!ltoi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcea52e77-bfa8-4f39-87f3-acc1dd13956d_1211x485.png 1272w, https://substackcdn.com/image/fetch/$s_!ltoi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcea52e77-bfa8-4f39-87f3-acc1dd13956d_1211x485.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!ltoi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcea52e77-bfa8-4f39-87f3-acc1dd13956d_1211x485.png" width="1211" height="485" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cea52e77-bfa8-4f39-87f3-acc1dd13956d_1211x485.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:485,&quot;width&quot;:1211,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:81781,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.architecture-weekly.com/i/176638730?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcea52e77-bfa8-4f39-87f3-acc1dd13956d_1211x485.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ltoi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcea52e77-bfa8-4f39-87f3-acc1dd13956d_1211x485.png 424w, https://substackcdn.com/image/fetch/$s_!ltoi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcea52e77-bfa8-4f39-87f3-acc1dd13956d_1211x485.png 848w, https://substackcdn.com/image/fetch/$s_!ltoi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcea52e77-bfa8-4f39-87f3-acc1dd13956d_1211x485.png 1272w, https://substackcdn.com/image/fetch/$s_!ltoi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcea52e77-bfa8-4f39-87f3-acc1dd13956d_1211x485.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>This is actually the same pattern I showed during the webinar on modelling and implementing distributed processes:</p><div id="youtube2-uURh5ziGfb8" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;uURh5ziGfb8&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/uURh5ziGfb8?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" 
width="728" height="409"></iframe></div></div><p>We aggregate data until we have enough of it to make the next decision. Is it perfect? Nah, but sometimes the best you can do is good enough.</p><h2>When This Approach Works (And When It Doesn&#8217;t)</h2><p>This approach works when:</p><p><strong>Your business logic can handle partial data</strong>: Payment approval can often proceed with fraud score and limits check, even if risk assessment is pending. E-commerce fulfilment can start when payment is confirmed, even if the recommendation engine results are still processing.</p><p><strong>Eventual consistency is acceptable</strong>: The payment dashboard might show &#8220;processing&#8221; for a few hundred milliseconds while verification completes. Most business users can tolerate brief inconsistency in exchange for system responsiveness.</p><p><strong>Decisions can be corrected or refined</strong>: A payment approved with partial data can be flagged for additional review when complete risk data arrives. An order can be expedited or delayed based on a complete customer analysis.</p><p><strong>You have clear business rules for conflicts</strong>: When fraud assessment contradicts approval, fraud wins. When risk assessment arrives late, it updates data quality but doesn&#8217;t reverse committed decisions.</p><p>This approach struggles when:</p><p><strong>Perfect consistency is required</strong>: Financial accounting, legal compliance, and safety-critical systems often can&#8217;t tolerate any inconsistency, even briefly.</p><p><strong>Business logic requires complete data</strong>: Some decisions genuinely can&#8217;t be made with partial information. 
Credit limit increases might require a complete financial analysis before any approval.</p><p><strong>Rollback costs are high</strong>: If reversing a decision is expensive or impossible, you need stronger ordering guarantees before committing.</p><p><strong>Users can&#8217;t tolerate uncertainty</strong>: Some interfaces need to show definitive status immediately, not &#8220;processing&#8221; or &#8220;partial data available.&#8221;</p><h2>Conclusion: Embracing the Chaos</h2><p><strong>The world is chaotic, and we can&#8217;t stop the chaos, but we can stop fighting the chaos.</strong> External events are rumours about what happened in other systems. Your read model can store the information you derive from those rumours. The evolve function processes whatever arrives, in whatever order, building state incrementally, making your local interpretation.</p><p>Undeniably, it&#8217;s a workaround of some sort. But it&#8217;s also an acknowledgement that distributed systems don&#8217;t guarantee order across boundaries. When you can&#8217;t control the topology, you adapt on your side. Store data as it arrives. Denoise in your projections. Create clean internal events for downstream systems.</p><p>Sometimes the proper solution is fixing your message topology. Route related events to the same partition. Use predictable identifiers for correlation. But when external systems constrain your options, when organisational boundaries limit your control, when legacy integrations force your hand - you&#8217;ve got to do what you&#8217;ve got to do.</p><p>Build your local models and live with partial state. Process events in any order. Make decisions with available data. That&#8217;s how you build reliable systems on unreliable foundations.</p><p><strong>If you&#8217;re dealing with such issues, I&#8217;m happy to help you through consulting or mentoring. 
<a href="mailto:oskar@event-driven.io">Contact me</a> and we&#8217;ll find a way to unblock you!</strong></p><p>Read also more in:</p><ul><li><p><a href="https://www.architecture-weekly.com/p/the-order-of-things-why-you-cant">The Order of Things: Why You Can&#8217;t Have Both Speed and Ordering in Distributed Systems</a>,</p></li><li><p><a href="https://event-driven.io/en/internal_external_events/">Internal and external events, or how to design event-driven API</a>,</p></li><li><p><a href="https://event-driven.io/en/simple_trick_for_idempotency_handling_in_elastic_search_readm_model/">Dealing with Eventual Consistency and Idempotency in MongoDB projections</a></p></li><li><p><a href="https://event-driven.io/en/saga_process_manager_distributed_transactions/">Saga and Process Manager - distributed processes in practice</a>,</p></li><li><p><a href="https://www.architecture-weekly.com/p/predictable-identifiers-enabling">Predictable Identifiers: Enabling True Module Autonomy in Distributed Systems</a></p></li><li><p><a href="https://www.architecture-weekly.com/p/dealing-with-eventual-consistency">Dealing with Eventual Consistency, and Causal Consistency using Predictable Identifiers</a>,</p></li><li><p><a href="https://event-driven.io/en/event_driven_distributed_processes_by_example/">Event-driven distributed processes by example</a>,</p></li><li><p><a href="https://www.architecture-weekly.com/p/workflow-engine-design-proposal-tell">Workflow Engine design proposal, tell me your thoughts</a>,</p></li><li><p><a href="https://event-driven.io/en/how_to_have_fun_with_typescript_and_workflow/">How TypeScript can help in modelling business workflows</a>,</p></li><li><p><a href="https://event-driven.io/en/how_to_update_past_data_in_event_sourcing/">Oops I did it again, or how to update past data in Event Sourcing</a>,</p></li><li><p><a href="https://event-driven.io/en/event_transformations_and_loosely_coupling/">Event transformations, a tool to keep our processes loosely 
coupled</a>,</p></li><li><p><a href="https://event-driven.io/en/testing_asynchronous_processes_with_a_little_help_from_dotnet_channels/">Testing asynchronous processes with a little help from .NET Channels</a>.</p></li></ul><p>Cheers!</p><p>Oskar</p><p>p.s. <strong>Ukraine is still under brutal Russian invasion. A lot of Ukrainian people are hurt, without shelter and need help.</strong> You can help in various ways, for instance, directly helping refugees, spreading awareness, putting pressure on your local government or companies. You can also support Ukraine by donating e.g. to <a href="https://www.icrc.org/pl/donate/ukraine">Red Cross</a>, <a href="https://savelife.in.ua/pl/donate/">Ukraine humanitarian organisation</a> or <a href="https://www.gofundme.com/f/help-to-save-the-lives-of-civilians-in-a-war-zone">donate Ambulances for Ukraine</a>.</p>]]></content:encoded></item><item><title><![CDATA[On Messaging and Distributed Systems with Ian Cooper]]></title><description><![CDATA[Why do we still need to learn messaging? Why isn&#8217;t it a commodity yet? How to shape boundaries, and how data on the inside and data on the outside can help in that. We discussed this and more with Ian Cooper, also learning how he started his journey and his approach to learning and teaching.]]></description><link>https://www.architecture-weekly.com/p/on-messaging-and-distributed-systems</link><guid isPermaLink="false">https://www.architecture-weekly.com/p/on-messaging-and-distributed-systems</guid><dc:creator><![CDATA[Oskar Dudycz]]></dc:creator><pubDate>Mon, 13 Oct 2025 05:01:59 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/0NYwN_p2pFI" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Welcome to the new week!</p><p>New week, new video. Last week, you got <a href="https://www.architecture-weekly.com/p/new-recording-on-event-modelling">my talk about event-driven modelling anti-patterns</a>. 
A significant portion of it was to explain the difference between logical design and technical one. </p><p>Event-Driven Architecture helps us to design interactions in our system. We prefer async communication, informing others of what has happened by publishing events. Yet, if we model our interactions only with events, then we&#8217;d either end up with a passive-aggressive environment or a room filled with shouting people. That&#8217;s not great.</p><p>Messaging patterns help us to design our message flow optimally.</p><p>One of the best people in the messaging space is Ian Cooper. I have a lot of respect for his work and knowledge. He always does his homework. </p><p>Last week, I had the pleasure of recording an interview with him, thanks to an invitation from <a href="https://www.avanscoperta.it/en/">Avanscoperta</a>. </p><div id="youtube2-0NYwN_p2pFI" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;0NYwN_p2pFI&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/0NYwN_p2pFI?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>We discussed Ian&#8217;s perspective on:</p><ul><li><p>Why do we still need to learn messaging? 
Why isn&#8217;t it a commodity yet?</p></li><li><p>how we both started with messaging,</p></li><li><p>how Ian is learning and teaching others about distributed systems and messaging,</p></li><li><p>how to shape boundaries, and how data on the inside and data on the outside can help in that,</p></li><li><p>testing strategies,</p></li><li><p>and more.</p></li></ul><p>The trigger for discussion was Ian&#8217;s upcoming workshop, but if you&#8217;re looking for a lighthearted exchange of ideas on those topics, I&#8217;m sure you'll enjoy it.</p><p>You can also check it on:</p><ul><li><p><a href="https://open.spotify.com/episode/2O0C9QoefQzyaLs5v86HZq?si=1h1UxO7qTAmXTgJX1CUt1A&amp;nd=1&amp;dlsi=0b0c2b91f4f34af9">Spotify</a>,</p></li><li><p><a href="https://podcasts.apple.com/it/podcast/distributed-systems-messaging-patterns-with-ian-cooper/id1509214035?i=1000730424403">Apple Podcasts</a>.</p></li></ul><p>Please share your thoughts with me on how you liked it!</p><p>Cheers!</p><p>Oskar</p><p>p.s. <strong>Ukraine is still under brutal Russian invasion. A lot of Ukrainian people are hurt, without shelter and need help.</strong> You can help in various ways, for instance, directly helping refugees, spreading awareness, and putting pressure on your local government or companies. You can also support Ukraine by donating e.g. to <a href="https://www.icrc.org/pl/donate/ukraine">Red Cross</a>, <a href="https://savelife.in.ua/pl/donate/">Ukraine humanitarian organisation</a> or <a href="https://www.gofundme.com/f/help-to-save-the-lives-of-civilians-in-a-war-zone">donate Ambulances for Ukraine</a>.</p>]]></content:encoded></item><item><title><![CDATA[New Recording on Event modelling anti-patterns from DDDEU]]></title><description><![CDATA[Have you heard about Passive Aggressive Events or CRUD sourcing? Or maybe about the Clickbait event? If you haven't, check the recording of my talk, as knowing only best practices is one side of the coin. 
Knowing what NOT to do can be even more important. You'll learn the specifics of event-driven modelling, starting with the worst practices and learning why and how to avoid them. You'll hear the story of Rakso, who ended up with a distributed monolith, and understand how ignoring modelling practices led him to that pitiful place. Thanks to that, you'll see what to avoid in your case, and how. You'll also find out that there's a thin line between bad and good practices: context.]]></description><link>https://www.architecture-weekly.com/p/new-recording-on-event-modelling</link><guid isPermaLink="false">https://www.architecture-weekly.com/p/new-recording-on-event-modelling</guid><dc:creator><![CDATA[Oskar Dudycz]]></dc:creator><pubDate>Mon, 06 Oct 2025 09:01:30 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/Lf1MZlpbkGA" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Welcome to the new week!</p><p><strong>I&#8217;m always saying that with Event-Driven Architecture, the modelling effort pays back. 
</strong></p><p><strong>That goes both ways: the less effort we put in modelling, the more it&#8217;ll hurt later.</strong></p><p>There&#8217;s a wide range of issues you may be facing, from overfocusing on the state instead of tracking behaviour, to asking others more often than allowing them to tell you what happened, and ending with race conditions and other unpleasant scenarios.</p><p>I packed as many such cases as I could into my talk, and the recording from this year&#8217;s DDD Europe just arrived:</p><div id="youtube2-Lf1MZlpbkGA" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;Lf1MZlpbkGA&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/Lf1MZlpbkGA?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>During the session, I explained the specifics of event modelling (yes, no capital letter, and double l), starting with bad practices and explaining why and how to avoid them. </p><p>I told the story about the project that aimed to modernise legacy software into the event-driven world. In theory, artificial, but in practice, none of the examples were made up. Either I made those mistakes on my own, or I saw them in my projects or helped to fix them for my clients.</p><p>I tried to make it both entertaining and educational, bitter and sweet. It is not easy when you&#8217;re not a native speaker. There&#8217;s a thin line between being funny and being silly.</p><p><strong>There&#8217;s also a thin line between bad and good practices. 
And its name is: </strong><em><strong>context</strong></em><strong>.</strong></p><p>Also, as wild as some of those cases may seem, I can assure you that I either:</p><ul><li><p>made those mistakes myself,</p></li><li><p>saw them in my projects,</p></li><li><p>saw them in my clients&#8217; projects.</p></li></ul><p>Check it out. Why always learn from your own mistakes? Learn from mine.</p><p><strong>The talk also summarised my article series about anti-patterns in event modelling. Here&#8217;s the full list:</strong></p><ul><li><p><a href="https://event-driven.io/en/state-obsession/">State Obsession</a>,</p></li><li><p><a href="https://event-driven.io/en/property-sourcing/">Property Sourcing</a>,</p></li><li><p><a href="https://event-driven.io/en/i_will_just_add_one_more_field/">I&#8217;ll just add one more field</a>,</p></li><li><p><a href="https://event-driven.io/en/clickbait_event/">Clickbait event</a>,</p></li><li><p><a href="https://event-driven.io/en/one_or_more_event_that_is_the_question/">Should you record multiple events from business logic?</a>,</p></li><li><p><a href="https://event-driven.io/en/on_putting_stream_id_in_event_data/">Stream ids, event types prefixes and other event data you might not want to slice off</a>.</p></li></ul><p><strong>Check also more general considerations:</strong></p><ul><li><p><a href="https://event-driven.io/en/events_should_be_as_small_as_possible/">Events should be as small as possible, right?</a>,</p></li><li><p><a href="https://event-driven.io/en/whats_the_difference_between_event_and_command/">What&#8217;s the difference between a command and an event?</a>,</p></li><li><p><a href="https://event-driven.io/en/internal_external_events/">Internal and external events, or how to design event-driven API</a>,</p></li><li><p><a href="https://event-driven.io/en/event_streaming_is_not_event_sourcing/">Event Streaming is not Event Sourcing!</a>,</p></li><li><p><a 
href="https://event-driven.io/en/dont_let_event_driven_architecture_buzzwords_fool_you/">Don&#8217;t let Event-Driven Architecture buzzwords fool you</a>,</p></li><li><p><a href="https://event-driven.io/en/how_to_design_software_architecture_pragmatically/">How to design software architecture pragmatically</a>,</p></li><li><p><a href="https://event-driven.io/en/gdpr_in_event_driven_architecture/">How to deal with privacy and GDPR in Event-Driven systems</a>.</p></li></ul><p><strong>And hey, I&#8217;m also doing consulting and mentoring. If you&#8217;re struggling with your projects/organisation with such cases, I&#8217;m happy to help. Feel free to reach out to me through the <a href="mailto:oskar@event-driven.io">email</a>.</strong> </p><p>Cheers!</p><p>Oskar</p><p>p.s. <strong>Ukraine is still under brutal Russian invasion. A lot of Ukrainian people are hurt, without shelter and need help.</strong> You can help in various ways, for instance, directly helping refugees, spreading awareness, and putting pressure on your local government or companies. You can also support Ukraine by donating e.g. to <a href="https://www.icrc.org/pl/donate/ukraine">Red Cross</a>, <a href="https://savelife.in.ua/pl/donate/">Ukraine humanitarian </a>organisation or <a href="https://www.gofundme.com/f/help-to-save-the-lives-of-civilians-in-a-war-zone">donate Ambulances for Ukraine</a>.</p><p></p>]]></content:encoded></item><item><title><![CDATA[Sneaky Code Bites Back]]></title><description><![CDATA[I was adding SQLite support to Pongo when I realised that I'm talking to myself, and what I heard was insane. I was trying to understand and recap what I just built, an extremely clever and sneaky solution to handle multiple databases. I realised I was wrong, and that sneaky code will soon bite me back. 
Why?]]></description><link>https://www.architecture-weekly.com/p/sneaky-code-bites-back</link><guid isPermaLink="false">https://www.architecture-weekly.com/p/sneaky-code-bites-back</guid><dc:creator><![CDATA[Oskar Dudycz]]></dc:creator><pubDate>Mon, 22 Sep 2025 12:05:28 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!ps3C!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2c15624-c7ec-4bc5-b4de-0fc4851db328_780x438.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Welcome to the new week!</p><p>Sneaky code bites back, I realised that again recently.</p><p><strong>I realised I'd been writing terrible code when I couldn't explain it to myself.</strong></p><p>The code worked. It was adding SQLite support to <a href="https://github.com/event-driven-io/pongo">Pongo</a>. But when I tried to describe how I was implementing multi-database support, I heard myself saying:</p><blockquote><p><em>"First it checks if the Promise is cached, then it creates a proxy that defers to the real implementation which doesn't exist yet, but will be loaded dynamically when..."</em></p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ps3C!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2c15624-c7ec-4bc5-b4de-0fc4851db328_780x438.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ps3C!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2c15624-c7ec-4bc5-b4de-0fc4851db328_780x438.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!ps3C!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2c15624-c7ec-4bc5-b4de-0fc4851db328_780x438.jpeg 848w, https://substackcdn.com/image/fetch/$s_!ps3C!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2c15624-c7ec-4bc5-b4de-0fc4851db328_780x438.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!ps3C!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2c15624-c7ec-4bc5-b4de-0fc4851db328_780x438.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ps3C!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2c15624-c7ec-4bc5-b4de-0fc4851db328_780x438.jpeg" width="780" height="438" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f2c15624-c7ec-4bc5-b4de-0fc4851db328_780x438.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:438,&quot;width&quot;:780,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Nicolas Cage's Most Outlandish Roles Ever&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Nicolas Cage's Most Outlandish Roles Ever" title="Nicolas Cage's Most Outlandish Roles Ever" srcset="https://substackcdn.com/image/fetch/$s_!ps3C!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2c15624-c7ec-4bc5-b4de-0fc4851db328_780x438.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!ps3C!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2c15624-c7ec-4bc5-b4de-0fc4851db328_780x438.jpeg 848w, https://substackcdn.com/image/fetch/$s_!ps3C!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2c15624-c7ec-4bc5-b4de-0fc4851db328_780x438.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!ps3C!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2c15624-c7ec-4bc5-b4de-0fc4851db328_780x438.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Yes, I sounded insane.</p><h2><strong>What I Was Building</strong></h2><p>Pongo is my attempt to bring the familiar MongoDB API to relational databases, using them as document databases.</p><p>If you've used MongoDB, you know the API: <code>collection.find()</code>, <code>collection.insertOne()</code>, <code>collection.updateOne()</code>. Pongo gives you that same API, but your data lives in <a href="https://www.architecture-weekly.com/p/postgresql-jsonb-powerful-storage">PostgreSQL using JSONB columns</a>.</p><p>Why would anyone want this? Teams often know MongoDB's API but need PostgreSQL's guarantees: real ACID transactions, mature replication, and existing infrastructure. Or they have PostgreSQL but want document storage without learning PostgreSQL's JSON query syntax, which looks like this:</p><pre><code>SELECT data FROM users

WHERE jsonb_path_exists(data, '$.address.history[*] ? (@.street == "Elm St")')</code></pre><p>With Pongo, the same query becomes:</p><pre><code>users.find({ "address.history": { $elemMatch: { street: "Elm St" } } })</code></pre><p>Same result, familiar syntax. I'd built something similar before with Marten for .NET, so I knew the approach worked.</p><p>It seemed that people appreciated it. And when someone appreciates something, they usually want more&#8230;</p><p>Users started requesting support for other databases, e.g. SQLite for edge deployments, DuckDB for analytics, Spanner for distributed systems, etc. Made sense. The MongoDB API could work on any database that supports JSON.</p><h2><strong>The Challenge</strong></h2><p>Adding multi-database support meant dealing with drivers. Database drivers are how your application talks to databases. They're the code handling connection protocols, type conversions, and query preparation. Each database needs its own driver. </p><p>The straightforward approach: make developers explicitly import the driver they need:</p><pre><code><strong>import { pongoClient } from '@event-driven-io/pongo';</strong>

<strong>import { databaseDriver } from '@event-driven-io/pongo/pg';</strong>

<strong>const client = pongoClient({</strong>
 <strong>driver: databaseDriver,</strong>

 connectionString: 'postgresql://localhost:5432/mydb'
<strong>});</strong></code></pre><p>This bothered me. The connection string already says "<code>postgresql</code>".</p><p>Why make developers say it twice?</p><p>I wanted automatic driver selection. Parse the connection string, load the right driver, only when needed.</p><h2><strong>The Sneaky Solution</strong></h2><p>I built a deferred loading pattern in Dumbo, my database abstraction layer. Dumbo sits under Pongo, handling connection pools, transactions, and SQL generation. Both Pongo and my event sourcing library, <a href="https://event-driven-io.github.io/emmett/getting-started.html">Emmett</a>, use it.</p><p>Here's the pattern:</p><pre><code>export const <strong>createDeferredConnectionPool = &lt;Connector, ConnectionType&gt;(
  connector: Connector,
  importPool: () =&gt; Promise&lt;ConnectionPool&lt;Connection&lt;Connector&gt;&gt;&gt;,
): ConnectionPool&lt;ConnectionType&gt;</strong> =&gt; {
  <strong>let poolPromise: Promise&lt;ConnectionPool&gt; | null = null;</strong>
  
  <strong>const getPool = async () =&gt; {
    if (poolPromise) return poolPromise;
    return (poolPromise = importPool());
  };</strong>
  
  <strong>return createConnectionPool({</strong>
    connector,
    <strong>execute: createDeferredExecutor(connector, async () =&gt; {</strong>
      const pool = await getPool();
      return pool.execute;
    }),
    <strong>close: async () =&gt; {</strong>
      <strong>if (!poolPromise) return;
      const pool = await poolPromise;
      await pool.close();</strong>
    },
    // ... every method proxied through getPool()
  });
};</code></pre><p>Let me explain what's happening here, because it's the core of why my design went wrong.</p><p>The connection pool appears to exist immediately, but it's actually a proxy. Every method (<code>execute</code>, <code>close</code>, <code>transaction</code>, etc.) is wrapped to first call <code>getPool()</code>. The first time <code>getPool()</code> runs, it imports the real driver and creates the real connection pool. Then&#8212;and this is crucial&#8212;it caches the Promise itself, not the result.</p><p>In JavaScript, a Promise is an object. Once created, it has an identity. You can await the same Promise multiple times: the underlying operation runs only once, and every subsequent await receives the same settled result:</p><pre><code><code>const work = doSomethingAsync();
const a = await work;
const b = await work; // Same Promise object, same eventual result
</code></code></pre><p>This is different from calling the async function twice:</p><pre><code><code>const a = await doSomethingAsync(); // Creates Promise #1
const b = await doSomethingAsync(); // Creates Promise #2, does work again
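</code></code></pre><p>For contrast, here is a minimal sketch of keeping the Promise itself and handing it to every caller. The names <code>loadDriver</code>, <code>getDriver</code> and <code>cached</code> are hypothetical stand-ins rather than Dumbo's real code, and the snippet assumes an ES module with top-level <code>await</code>:</p><pre><code><code>// Hypothetical stand-in for an expensive async operation, e.g. a dynamic import.
let calls = 0;
const loadDriver = async (): Promise&lt;string&gt; =&gt; {
  calls++;
  return 'driver';
};

// Cache the Promise itself, so concurrent callers share one in-flight operation.
let cached: Promise&lt;string&gt; | null = null;
const getDriver = (): Promise&lt;string&gt; =&gt; (cached ??= loadDriver());

const [first, second] = await Promise.all([getDriver(), getDriver()]);
console.log(first === second, calls); // true 1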
</code></code></pre><p><strong>By caching the Promise, multiple parts of the code can request the connection pool simultaneously, but the import only happens once.</strong> Everyone waits for the same Promise to resolve. This pattern is called Promise memoisation.</p><p>In C# terms, it's like storing a <code>Task&lt;T&gt;</code> and having multiple threads await it. In Python, it's like storing an <code>asyncio.Task</code> and awaiting it from several coroutines. The concept exists across languages&#8212;cache the operation, not just the result.</p><p>This worked in Dumbo. The API stayed clean. Drivers are loaded on demand. You pay the loading cost only once.</p><p>Then I tried extending this pattern to Pongo itself.</p><h2><strong>The Cascade Effect</strong></h2><p>To support automatic driver selection in Pongo, I'd need to defer everything.</p><p>Currently, when you create a Pongo client, it knows its driver:</p><pre><code><code>const client = pongoClient({ connectionString });
const db = client.db('myapp');
const users = db.collection('users');
</code></code></pre><p>Each level&#8212;client, database, collection&#8212;is a concrete object with real methods.</p><p><strong>With automatic driver selection, none of these could be real until first use.</strong> The client wouldn't know which driver to use until it parsed the connection string. However, loading that driver happens asynchronously, as dynamic module loading in JavaScript can only happen with <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/import#import_a_module_for_its_side_effects_only">async imports</a>. So the client also needs to become a proxy, just like the Dumbo connection pool. The database becomes a proxy. The collection becomes a proxy.</p><p><strong>Here's what a deferred collection would look like:</strong></p><pre><code>class <strong>DeferredPongoCollection&lt;T&gt; implements PongoCollection&lt;T&gt; {</strong>
  <strong>private realCollectionPromise: Promise&lt;PongoCollection&lt;T&gt;&gt; | null = null;</strong>
  
  <strong>constructor(</strong>
    private name: string,
    <strong>private getDb: () =&gt; Promise&lt;PongoDb&gt;</strong>
  ) {}

  <strong>private async getRealCollection(): Promise&lt;PongoCollection&lt;T&gt;&gt; {
    if (!this.realCollectionPromise) {
      this.realCollectionPromise = this.getDb()
        .then(db =&gt; db.collection&lt;T&gt;(this.name));
    }
    return this.realCollectionPromise;
  }</strong>

  <strong>async find(filter?: Filter&lt;T&gt;, options?: FindOptions): Promise&lt;T[]&gt; {</strong>
    <strong>const collection = await this.getRealCollection();</strong>
    return collection.find(filter, options);
  }

  <strong>async findOne(filter: Filter&lt;T&gt;, options?: FindOptions): Promise&lt;T | null&gt; {
    const collection = await this.getRealCollection();</strong>
    return collection.findOne(filter, options);
  }

  <strong>async insertOne(doc: T): Promise&lt;InsertOneResult&gt; {
    const collection = await this.getRealCollection();</strong>
    return collection.insertOne(doc);
  }

  <strong>async insertMany(docs: T[]): Promise&lt;InsertManyResult&gt; {
    const collection = await this.getRealCollection();</strong>
    return collection.insertMany(docs);
  }

  <strong>async updateOne(filter: Filter&lt;T&gt;, update: Update&lt;T&gt;): Promise&lt;UpdateResult&gt; {
    const collection = await this.getRealCollection();</strong>
    return collection.updateOne(filter, update);
  }

  <strong>async updateMany(filter: Filter&lt;T&gt;, update: Update&lt;T&gt;): Promise&lt;UpdateResult&gt; {
    const collection = await this.getRealCollection();</strong>
    return collection.updateMany(filter, update);
  }

  // ... and 20+ more methods
}</code></pre><p>Every single method becomes a proxy that waits for the real collection. But the collection needs a database, which needs a client, which needs a driver. Three levels of deferred proxies.</p><p>Now add TypeScript generics that need to flow through all these layers. Add error handling&#8212;what if the import fails? Add specialised MongoDB types like <code>AggregationCursor</code>, <code>ChangeStream</code>, <code>BulkWrite</code>. Each needs its own deferred proxy class.</p><p>The driver would register itself through a side effect:</p><pre><code>// In @event-driven-io/pongo/pg
<strong>import { registerDriver } from '@event-driven-io/pongo';</strong>
<strong>registerDriver('postgresql', () =&gt; import('./postgresqlDriver'));</strong></code></pre><p>Then you could add just the import anywhere in the app:</p><pre><code>import '@event-driven-io/pongo/pg';</code></pre><p>And the magic registration would happen, and as a user, you wouldn&#8217;t need to know about all the stuff I just wrote, right? RIGHT?!</p><p>Right, as long as you remember this import. If you forget it, the registration doesn't happen. The error only surfaces at runtime when you try to connect.</p><h2><strong>The Failure</strong></h2><p>When I was deep into this design, I remembered a similar thing I'd wanted to do in the past.</p><p>Years ago, I nearly shipped distributed transactions using <a href="https://en.wikipedia.org/wiki/Microsoft_Distributed_Transaction_Coordinator">MSDTC (Microsoft's distributed transaction coordinator)</a>. The code worked perfectly in testing. But when I imagined production failures, I realised I couldn't debug them. The complexity was hidden, not gone.</p><p>This was the same mistake.</p><p>Consider what happens when something goes wrong. A user reports: </p><blockquote><p>"I'm getting an error when inserting a document."</p></blockquote><p>Where's the actual problem? Let's trace through the layers:</p><ol><li><p>User calls <code>collection.insertOne(doc)</code></p></li><li><p>Deferred collection waits for real collection via <code>getRealCollection()</code></p></li><li><p>Which waits for a deferred database via <code>getDb()</code></p></li><li><p>Which waits for a deferred client to resolve</p></li><li><p>Which waits for the driver to be imported</p></li><li><p>Which needs the connection string to be parsed</p></li><li><p>Which needs the driver to be registered</p></li><li><p>Which depends on a side-effect import that may not have happened</p></li></ol><p>The stack trace shows <code>insertOne</code> at the top. 
The actual error&#8212;maybe a typo in the connection string, maybe a missing driver import, maybe a network issue&#8212;is buried under seven layers of Promise resolution.</p><p>But there's a worse problem: Promise memoisation means errors get cached forever.</p><p>If the first connection attempt fails, <code>poolPromise</code> holds a rejected Promise. Every subsequent request awaits the same rejected Promise. You can't retry without restarting the process. This is a fundamental property of Promises&#8212;once settled (resolved or rejected), they never change state.</p><p>In production, this means that one failed connection attempt during startup breaks everything until a restart is performed. A temporary network glitch becomes a permanent failure.</p><h2><strong>The Nightmare</strong></h2><p>Let's say you're debugging this in production. You add logging:</p><pre><code><code>async insertOne(doc: T): Promise&lt;InsertOneResult&gt; {
  console.log('insertOne called');
  const collection = await this.getRealCollection();
  console.log('Got real collection');
  return collection.insertOne(doc);
}
</code></code></pre><p>The logs show "insertOne called" but not "Got real collection". Where's the problem? Could be:</p><ul><li><p>Driver not registered (missing import)</p></li><li><p>Connection string malformed</p></li><li><p>Network unreachable</p></li><li><p>Import failed (syntax error in driver)</p></li><li><p>Previous connection attempt failed (cached rejection)</p></li></ul><p>You need to understand the entire deferred loading chain to debug a simple insert.</p><p>Now imagine explaining this to a contributor who wants to add a DuckDB driver. </p><blockquote><p><em>"First, create your driver. Then register it via side effect. Understand Promise memoization because failures get cached. Know that everything is proxied through three levels of deferreds. When debugging, remember the stack trace lies about where errors originate..."</em></p></blockquote><h2><strong>The Alternative</strong></h2><p>I chose explicit driver injection instead:</p><pre><code>import { pongoClient } from '@event-driven-io/pongo';
<strong>import { databaseDriver } from '@event-driven-io/pongo/pg';</strong>

const <strong>client = pongoClient({</strong>
  <strong>driver: databaseDriver,</strong>
  connectionString: 'postgresql://localhost:5432/mydb'
});</code></pre><p>One extra import. That's the entire cost.</p><p>The benefits:</p><ul><li><p>TypeScript catches missing drivers at compile time: "Property 'driver' is missing"</p></li><li><p>Stack traces point to real problems</p></li><li><p>Failed connections can retry</p></li><li><p>New developers understand it immediately</p></li><li><p>Each driver can expose database-specific features</p></li></ul><p>The code is boring. No magic. No proxies. No deferred loading. It works or it doesn't. When it fails, you know why and where.</p><h2><strong>The Pattern</strong></h2><p>This isn't really about Pongo or database drivers. It's about a pattern we all fall into.</p><p>We see repetition and think "I can abstract this." </p><p>We see explicit configuration and think: <em>"I can infer this"</em>. We see upfront costs and think: <em>"I can defer this"</em>.</p><p>Sometimes we just don&#8217;t trust our colleagues, thinking that they&#8217;re too incompetent and we need to take precautionary steps.</p><p>Sometimes we're right. Most often we&#8217;re not.</p><p><strong>Too often, we're optimising the wrong metric. I was optimizing for fewer imports, not simpler code. I was hiding essential configuration, not accidental complexity.</strong></p><p>A minimal API with complex implementation isn't simple&#8212;it's a lie. The complexity doesn't disappear. It moves from the visible surface to hidden internals, where it's harder to understand, debug, and modify.</p><p>I've seen this pattern everywhere:</p><p><strong>ORMs with query optimization</strong>: They generate "optimal" SQL you can't debug. When queries are slow, you're reverse-engineering what the ORM decided to do. The abstraction that was supposed to help becomes a barrier to fixing problems.</p><p><strong>Dependency injection with auto-wiring</strong>: The container magically figures out what to inject where. Works great until it doesn't. 
Then you're debugging why the container chose the wrong implementation, and the magic configuration system has no good debugging tools.</p><p><strong>Build tools with convention over configuration</strong>: They discover your project structure automatically. Perfect until your project doesn't fit the convention. Then you're fighting the tool's assumptions, wishing for explicit configuration.</p><p><strong>Test frameworks with magical setup</strong>: They auto-mock dependencies, auto-setup databases, auto-configure everything. The tests become harder to understand than the code they're testing. When tests fail, you debug the framework, not your code.</p><p>The pattern is always the same. Good intentions. Hidden complexity. Eventual regret.</p><h2><strong>When Magic Becomes a Prison</strong></h2><p>Magic in code should be like magic in a good movie&#8212;the audience shouldn't see how it works. But there's a crucial difference: movie audiences never need to modify the magic trick.</p><p>Code has maintainers. They need to understand the magic, extend it, debug it. When the magic is at the surface&#8212;the API level&#8212;every maintainer becomes a magician's apprentice.</p><p>The deferred loading pattern would have created a codebase where:</p><ul><li><p>Reading code means understanding proxies and Promise memoization</p></li><li><p>Adding features means maintaining the deferred chain</p></li><li><p>Debugging means tracing through invisible layers</p></li><li><p>Onboarding means learning the entire magical system</p></li></ul><p>It creates a priesthood of maintainers who understand the magic. Everyone else fears the code. They make small changes and hope nothing breaks. 
They don't refactor because they might disturb the magic.</p><p>Let's be honest about what I was trading:</p><p><strong>The cost of explicit drivers:</strong></p><ul><li><p>One extra import line</p></li><li><p>Developers must know which database they're using</p></li></ul><p><strong>The cost of automatic driver selection:</strong></p><ul><li><p>Three levels of proxy objects</p></li><li><p>Promise memoization complexity</p></li><li><p>Cached failures requiring restarts</p></li><li><p>Incomprehensible stack traces</p></li><li><p>Side-effect registration</p></li><li><p>Impossible debugging</p></li><li><p>Steep learning curve for contributors</p></li></ul><p>Was saving one import worth all that complexity? Obviously not. But I didn't see it until I started implementing it.</p><p>This is why I need to write code to understand ideas. My brain can't model all the implications abstractly. I need to "sweat out" designs in real code. Sometimes that means building something just to realize it's wrong.</p><p>The key is recognizing it before shipping.</p><h2><strong>The Lesson</strong></h2><p>Every time I've tried to be sneaky, it's bitten me. The distributed transactions that worked until they didn't. The code generators that saved typing but destroyed readability. The clever abstractions that became maintenance nightmares.</p><p><strong>The problems sneaky code solves are almost always smaller than the problems it creates.</strong> But we don't see that at design time. We see the elegance of the solution, not the pain of living with it.</p><p>I wanted to make Pongo simple. I was making its setup simplistic and obscure. Simple means easy to understand and maintain. Simplistic means oversimplified to the point of being inadequate for real-world use.</p><p>Now I'm removing the clever patterns. Each driver will be explicit. Each connection direct. Each error immediate and clear.</p><p>The new code is boring. Import a driver. Pass it to the client. 
It connects or fails with a clear error.</p><p>Boring code is debuggable. Explicit code is maintainable. Simple code&#8212;truly simple, not simplistic&#8212;is code that works and keeps working.</p><p>I got too clever. We all do sometimes. The key is catching it before it escapes into production, where it becomes someone else's nightmare.</p><p>If you want to see &#8220;The Removal&#8221;, check <a href="https://github.com/event-driven-io/Pongo/pull/126">this Pull Request</a> and related follow-ups. I&#8217;m replacing the clever code with boring code. It's less exciting than it sounds, and that's the point.</p><p>And hey, there&#8217;s also another lesson: we all make mistakes and fall down the rabbit hole. At least I do.</p><p>Cheers!</p><p>Oskar</p><p>p.s. <strong>Ukraine is still under brutal Russian invasion. A lot of Ukrainian people are hurt, without shelter, and in need of help.</strong> You can help in various ways, for instance, by directly helping refugees, spreading awareness, and putting pressure on your local government or companies. You can also support Ukraine by donating, e.g. to the <a href="https://www.icrc.org/pl/donate/ukraine">Red Cross</a> or the <a href="https://savelife.in.ua/pl/donate/">Ukraine humanitarian</a> organisation, or by <a href="https://www.gofundme.com/f/help-to-save-the-lives-of-civilians-in-a-war-zone">donating Ambulances for Ukraine</a>.</p>]]></content:encoded></item></channel></rss>