Architecture Weekly #166 - 12th February 2024

Feb 12, 2024

Welcome to the new week!

I want to introduce you to Emmett! Finally, I gathered the patterns around Event Sourcing and CQRS I used last year and grouped them into a Node.js package.

Check the documentation:

Emmett - a Node.js library taking your event-driven applications back to the future!

Why Emmett? Because I'd like to take your event-driven tooling back to the future!

The goal is to reduce boilerplate, build on the strengths of event-driven design and help you avoid common mistakes. Currently, it provides abstractions for patterns like business logic (e.g. Decider), business workflows and command handling.
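If you haven't met the Decider pattern yet, here is a minimal sketch of the idea in plain TypeScript. Treat it as an illustration only: the shopping cart example, the `Decider` type and the `handle` helper are my assumptions for this sketch, not Emmett's actual API.

```typescript
// Hypothetical shopping cart example of the Decider pattern.
// All names below are illustrative assumptions, not Emmett's API.
type CartEvent =
  | { type: 'ItemAdded'; productId: string }
  | { type: 'CartConfirmed' };

type CartCommand =
  | { type: 'AddItem'; productId: string }
  | { type: 'ConfirmCart' };

type Cart = { status: 'Open' | 'Confirmed'; productIds: string[] };

// A decider groups three pure functions: decide, evolve and initialState.
type Decider<State, Command, Event> = {
  decide: (command: Command, state: State) => Event[];
  evolve: (state: State, event: Event) => State;
  initialState: () => State;
};

const cartDecider: Decider<Cart, CartCommand, CartEvent> = {
  // Business logic: validate the command against the current state
  // and return the events describing what happened.
  decide: (command, state) => {
    if (state.status === 'Confirmed') {
      throw new Error('Cart is already confirmed');
    }
    switch (command.type) {
      case 'AddItem':
        return [{ type: 'ItemAdded', productId: command.productId }];
      case 'ConfirmCart':
        if (state.productIds.length === 0) {
          throw new Error('Cannot confirm an empty cart');
        }
        return [{ type: 'CartConfirmed' }];
    }
  },
  // State transition: apply a single event to the state.
  evolve: (state, event) => {
    switch (event.type) {
      case 'ItemAdded':
        return { ...state, productIds: [...state.productIds, event.productId] };
      case 'CartConfirmed':
        return { ...state, status: 'Confirmed' };
    }
  },
  initialState: () => ({ status: 'Open', productIds: [] }),
};

// Command handling: rebuild the state from past events, then decide.
const handle = <State, Command, Event>(
  decider: Decider<State, Command, Event>,
  history: Event[],
  command: Command,
): Event[] => {
  const state = history.reduce(decider.evolve, decider.initialState());
  return decider.decide(command, state);
};
```

Because all three functions are pure, you can test the business logic without any storage: pass the past events and a command to `handle`, then check the events it returns.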

More will come later. You can also read more on my blog, where I expand on the motivation behind it:

Oskar Dudycz - Announcing Emmett! Take your event-driven applications back to the future!

One thing that is too often missed in event-driven design is data governance. We take it too lightly. That may work well in some environments, but for bigger enterprises, it’s too optimistic if we want to use event-driven tools as a communication backbone.

Wim Debreuck explained in his talk why and how we can and should think more responsibly about defining our event model. He showed how to put our events in a broader context, using Kafka as an example. Event Streaming is not the same as Event Sourcing, but both come from event-driven tooling and share many patterns, so we can learn from each other. It’s essential to think about what should happen with the events we store, as storing them is just the beginning of the journey.

Wim Debreuck - Event Driven Architecture & Governance in action

Check also:

Martin Schimak - Tackling Complex Event Flows

There was an interesting decision around data governance. The Dutch agency SIDN (Stichting Internet Domeinregistratie Nederland) decided to outsource part of its '.NL' domain registry services to Amazon Web Services (AWS). Initially, that went unnoticed, but it then ignited controversy, highlighting tensions between commercial interests and the collective security, stability, and sustainability of the Dutch Internet (and the Internet as a whole). In theory, the World Wide Web is a global thing, but the reality is that we still have borders, even if we don’t see them clearly on the Internet.

It’s an interesting case study of data sovereignty and, potentially, of security and privacy around data. The article discusses whether they should really have lower priority than cutting costs or agencies' ambitions to enter the SaaS market. The decision raises concerns about European digital sovereignty and the long-term implications of entrusting essential services to major US cloud providers (or others). It has the potential to conflict with national interests and with preserving digital autonomy.

The case is ongoing. The Dutch Parliament is now questioning the decision, calling for stronger oversight, and reevaluating the criteria for managing such critical infrastructure. Read more in:

Tech Policy Press - The Dangers of Moving Key Internet Governance Functions to Amazon’s Cloud: The Case of the Netherlands

On another topic, Microsoft released two interesting case studies:

The first article focuses on using OpenAI services from your systems efficiently. Surprisingly, it’s not about AI per se; the techniques can be applied to integrations with any other external services. They share their recommendations on how to do “smart load balancing”. Why is it smart?

What makes this solution different than others is that it is aware of the "Retry-After" and 429 errors and intelligently sends traffic to other OpenAI backends that are not currently throttling. You can even have a priority order in your backends, so the highest priority are the ones being consumed first while they are not throttling. When throttling kicks in, it will fallback to lower priority backends while your highest ones are waiting to recover.
Another important feature: there is no time interval between attempts to call different backends. Many of other OpenAI load balancers out there configure a waiting internal (often exponential). While this is a good idea doing at the client side, making a server-side load balancer to wait is not a good practice because you hold your client and consume more server and network capacity during this waiting time. Retries on the server-side should be immediate and to a different endpoint.

This pattern should be applied carefully, respecting the specific integration characteristics, but can be useful. It’s nice that they also provided the repository with the code showing how to do it in practice.
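To make the quoted idea more tangible, here is a rough sketch of priority-based failover that respects "Retry-After". It is not the code from Microsoft's repository: the `Backend` shape, the `pickBackend` and `sendWithFailover` functions and the one-second fallback when no "Retry-After" value is given are all my assumptions.

```typescript
// Hypothetical sketch, not the code from Microsoft's repository.
// A lower priority number means the backend is preferred.
type Backend = { url: string; priority: number; throttledUntil: number };

type BackendResponse = { status: number; retryAfterSeconds?: number };

// The actual HTTP call is injected, so the routing logic stays testable.
type Send = (backendUrl: string) => Promise<BackendResponse>;

// Pick the most preferred backend that is not currently throttling.
const pickBackend = (backends: Backend[], now: number): Backend | undefined =>
  backends
    .filter((backend) => backend.throttledUntil <= now)
    .sort((a, b) => a.priority - b.priority)[0];

const sendWithFailover = async (
  backends: Backend[],
  send: Send,
  now: () => number = Date.now,
): Promise<BackendResponse> => {
  // Each attempt either succeeds or marks one backend as throttled,
  // so the number of attempts is capped by the number of backends.
  for (let attempt = 0; attempt < backends.length; attempt++) {
    const backend = pickBackend(backends, now());
    if (!backend) break;

    const response = await send(backend.url);
    if (response.status !== 429) return response;

    // Remember how long this backend asked us to stay away (assumed
    // fallback: 1 second), then retry immediately on another backend.
    backend.throttledUntil = now() + (response.retryAfterSeconds ?? 1) * 1000;
  }
  // Every backend is throttling: let the client decide when to retry.
  return { status: 429 };
};
```

Note that there's no waiting between attempts on the server side. Throttled backends are skipped until their "Retry-After" period passes, and any backoff is left to the client.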

The second link explains how the Experimentation Platform at Microsoft (ExP) does A/B testing. They show how they organise their infrastructure to check different scenarios and features for various users. It’s a nice case study of using and configuring tools like a reverse proxy to make the testing environment safe and predictable.

In a similar spirit, there’s an article from Luc van Donkersgoed on the trade-offs between single tenancy and multi-tenancy. He wrote that switching from single-tenant to multi-tenant architecture for the Event-Broker e-Commerce (EBE) platform is like moving from private houses to a shared apartment building.

Initially, EBE's single-tenant setup offered simplicity and security, with each user having their own resources. However, it made growth and scalability challenging due to limited resources and slower updates. The transition to multi-tenancy aims to share resources efficiently, like utilities in an apartment building, to support growth and improve update speeds. This brings new challenges in managing shared spaces securely. The shift is a trade-off between the need for individual security and the benefits of shared efficiency.

Luc van Donkersgoed - The single-tenancy to multi-tenancy spectrum

Check also a really decent write-up by George Ball, patiently explaining the relationship between throughput and latency. It also covers scaling considerations:

George Ball - Achieving High Throughput Without Sacrificing Latency

Last but not least, have a read of:

Paul Graham - Life is Short

Keep in mind that no one but you and your family will remember your overtime hours.

Check also other links!

Cheers

Oskar

p.s. I invite you to join the paid version of Architecture Weekly. It already contains the exclusive Discord channel for subscribers (and my GitHub sponsors), monthly webinars, etc. It is a vibrant space for knowledge sharing. Don’t wait to be a part of it!

p.s.2. Ukraine is still under brutal Russian invasion. A lot of Ukrainian people are hurt, without shelter and in need of help. You can help in various ways, for instance, directly helping refugees, spreading awareness, and putting pressure on your local government or companies. You can also support Ukraine by donating, e.g. to the Ukraine humanitarian organisation, Ambulances for Ukraine or Red Cross.

Architecture

DevOps

Gregor Hohpe - Application architecture as code

Testing

clumsy - Makes your network condition on Windows significantly worse, but in a controlled and interactive manner

Azure

Go

Mat Ryer - How I write HTTP services in Go after 13 years

Java

Devoxx Belgium - Ask the Java Architects By Sharat Chander, Alan Bateman, Stuart Marks, Viktor Klang, Brian Goetz

.NET

Node.js

Security

Trivia

Paul Graham - Life is Short
