Distributed systems

Tag: distributed-systems

  • Compensation is all around us (28 November 2023)
    In a message-based system, we might feel a lack of control, especially when in need of compensating changes spread across the system. Fear not! Real life deals with compensation every day! And it's better than rolling back a transaction or deleting some data in the database.
  • The power of timeouts to compensate for failures and other tales (18 October 2023)
    There are scenarios in which a chatty relationship between services seems the only option, with the results of coupling quickly becoming our best friend. Not all hope is lost; we can try asking different questions to untangle the knot.
  • What's an Outbox and why do we need it? Hint: it's about data integrity (7 February 2023)
    Distributed systems are ugly beasts sometimes. They hide subtle tricks that can lead to data loss and system corruption. The Outbox pattern helps address a couple of them.
  • The pitfalls of request/response over messaging (19 January 2023)
    Request/response is everywhere. It serves us very well and is a neat solution in many scenarios. It comes with a few pitfalls in distributed systems and needs to be handled with care.
  • What is the deal with security and distributed systems? (1 November 2022)
    Security is a crucial topic for any architect. We cannot implement it as a second thought. We must consider its implications from day one. Distributed systems are no different. However, it might be a little more involved.
  • Autonomy probably doesn't mean what you think it means (5 September 2022)
    There seems to be some misunderstanding around the word 'autonomous' when used in the context of distributed systems. Unfortunately, there is no single meaning; it depends on the context and the observer's point of view. It might not mean what you think.
  • Distributed systems evolution: topology changes (25 July 2022)
    Evolving distributed systems architecture is challenging. It's not only a matter of evolving message contracts or processes state. Surprisingly, deployments can play a role in creating more challenges.
  • Distributed systems evolution: processes state (12 July 2022)
    Evolving distributed systems architecture is challenging. Addressing message evolution is one aspect. Another one is evolving existing processes and their persisted status.
  • Distributed systems evolution: message contracts (4 July 2022)
    Evolving distributed systems architecture is challenging. If the system is message-based, the first challenge comes from evolving message contracts.
  • Distributed systems evolution challenges (11 June 2022)
    What are the challenges posed by evolving distributed systems architecture? In this short series of articles, we'll understand the critical factors we should be keeping an eye on and how to address them.
  • Do we need to debug distributed systems? (23 May 2022)
    We're humans. We are designed to apply previous experience and knowledge to new problems. When faced with distributed systems, we want to debug them. Do we need that? And, can we debug distributed systems?
  • We need insights, not data (19 April 2022)
    Gauges and graphs attract software engineers like honey attracts bees. We spend hours implementing distributed logging solutions or monitoring systems, and still we have a hard time understanding what's going on.
  • Where we're going, we don't need service discovery (12 March 2022)
    Too many times technology is used to solve problems that, to begin with, should not be considered problems. Service discovery, on many occasions, is a solution in search of a problem.
  • AsyncAPI, a specification for defining asynchronous APIs (23 February 2022)
    Distributed systems governance is a hot topic. At first, it might feel overwhelming. It's important to understand what we need to govern and which tools can help.
  • Is it complex? Break it down! (3 January 2022)
    Sometimes, we choose technology based on its perceived complexity or heaviness. We focus our decisions on technical solutions and, rather than looking deeper at the problems, we stick with what we know. Are we making the right choices?
  • Isn't A supposed to come before B? On message ordering in distributed systems. (20 October 2021)
    We are used to lists, sequences, and procedural approaches. We are constantly under the impression that what we do is ordered. That's not the case. Why are we trying to replicate a non-existent ordering in software architectures?
  • Update me, please (3 August 2021)
    We're so used to notifications that we probably never stopped to think about how to implement them. It might be trivial at first glance. However, in a distributed system, we might face more challenges requiring techniques we don't expect when implementing a notifications infrastructure.
  • Don't keep a saga in both camps (28 July 2021)
    When it comes to distributed systems, autonomy is a guiding star, and coupling is the villain trying to sneak in at every step. Orchestration is a particularly subtle form of coupling, usually detected when it's too late. However, the root cause is somewhere else.
  • Own the cache! (15 July 2021)
    Caches are everywhere and power the internet. When it comes to distributed systems, they are an essential tool in our tool belt. However, special care needs to be put into defining who owns the cache.
  • I'll be back (8 February 2021)
    Time from the perspective of systems design has many nuances and complexities. There are clock drift issues and design issues related to modeling the passage of time. Shall we model the passage of time as a clock does?
  • Do not trust the user mental model: Model behaviors, not data (2 February 2021)
    When designing systems, we often stress how important it is to model the system following the user's mental model. In many cases, it works. However, it's not necessarily always the right choice.
  • Transactions? None for me, thanks (30 January 2021)
    Queues are designed for reliability. I personally stress a lot about designing message processing to be as transactional as possible. Is there a use case for unreliable message processing?
  • Ooops, can I try again, please? (21 January 2021)
    When systems fail, we can retry the whole process and be successful. However, there are scenarios in which retrying a subset of the process might be a better choice. Not all failures are born equal.
  • Ehi! What's up? Feedback to users' requests in distributed systems (12 January 2021)
    The system's design proceeds at full speed; all of a sudden, a thunderbolt hits us: how do we go about providing feedback to users? Request handling is asynchronous, and thus results are eventually consistent. What technique can we use to preserve the user context to get back to them with results?