
There's something about event-driven architectures that captures our engineering minds and makes our imaginations run wild. At face value it seems like the Grand Unifying Theory of software engineering, a Grand Unifying Architecture. Everything is perfectly decoupled and only reads and writes to an event bus. Perfect uniformity: everything just reads and writes events. Perfect open-closed semantics: I can add new functionality without touching any existing components, just by having it read from the event bus.
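In the small, that ideal looks something like the sketch below: one bus, and every new capability is just another subscriber. This is only a minimal illustration; the EventBus, the OrderPlaced event, and the handlers are hypothetical names, not anything from a real system.

    // Minimal sketch of the "everything reads and writes the bus" ideal.
    // EventBus, OrderPlaced and the handlers are hypothetical names.
    import java.util.*;
    import java.util.function.Consumer;

    class EventBus {
        private final Map<Class<?>, List<Consumer<Object>>> subscribers = new HashMap<>();

        <T> void subscribe(Class<T> type, Consumer<T> handler) {
            subscribers.computeIfAbsent(type, k -> new ArrayList<>())
                       .add(e -> handler.accept(type.cast(e)));
        }

        void publish(Object event) {
            subscribers.getOrDefault(event.getClass(), List.of())
                       .forEach(h -> h.accept(event));
        }
    }

    record OrderPlaced(String orderId) {}

    class BusDemo {
        public static void main(String[] args) {
            EventBus bus = new EventBus();
            // "Open-closed": billing and shipping are added as subscribers
            // without touching anything that already publishes OrderPlaced.
            bus.subscribe(OrderPlaced.class, e -> System.out.println("bill " + e.orderId()));
            bus.subscribe(OrderPlaced.class, e -> System.out.println("ship " + e.orderId()));
            bus.publish(new OrderPlaced("42"));
        }
    }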

However, like the grand unifying theories of physics so far, it just doesn't match reality. I have yet to witness a system that fully embraces event driven architecture that isn't a complete nightmare to read, understand, debug and modify. Yet we seem unable to shake the idea that it is a panacea. It captures the minds of each new generation of engineers and gets implemented in the technology of the era. For me it was applying the observable pattern to everything in Java. Now it's setting up a bunch of microservices that only communicate through Kafka. Maybe I'm just not a true Scotsman but this pattern has anecdotally never worked well in anything I've had to work with. I would think very hard before applying it as a pattern in my code let alone as the driving force of my architecture. That's just my experience and 2 cents.



In my experience it's a bit the opposite; the domains I've worked in have always had events, whether given that name or not, and the systems for handling these events are usually coded much more optimistically and naively at first, before people break down and start migrating parts of the system to run in event-driven style for performance and reliability reasons. People who are aware of event-driven architecture tend to make a smoother transition and create systems that make sense and are easier to work with. People who get there accidentally, forced every inch by successive performance and reliability bugs, end up with a hodge-podge that in retrospect is a poorly designed event-driven system.

But that's just what I've seen in my experience. I've seen damage from people being ignorant of event-driven architecture or being in denial about how their systems are evolving; you've seen damage from people being overeager to use it. Probably in your shoes I would have seen the same things you have.

One thing I've given up on seeing is the content of the linked article.


In my experience, much more damage has been done by people who implement EDA and fancy cloud architecture in general. Something like 95% of applications are simple CRUD with a minuscule amount of custom domain logic. Doing anything beyond a sanely laid out monolith for these is resume padding and potentially project killing. I think people just don't want to admit that their job is validation, presentation, and shuffling data in and out of a database.


Yeah, but very quickly in my experience the simple REST-based CRUD service starts waking up every 15 minutes to post a batch of rows from the database to a different REST-based CRUD service (or to itself if you have a monolith), and when that gets slow it gets scaled by increasing the wake-up frequency, and then someone starts writing batching logic and trying to scale it horizontally. Then external services start getting mixed in (and, if you're doing B2B, customer services) and you have to deal with their performance and reliability issues. Or maybe not; maybe you're Reddit and can get huge while still being essentially a CRUD app with a couple of UI front ends.


I think this is the key: EDA should be used to communicate between different applications. If it is used within an application it points to over-engineering.


you can try reading https://www.fullhn.com/ while it's on the front page


The thing is -- computers ARE event driven at the low levels.

Problem is most systems are designed to completely hide the event-driven nature of things.

The "nightmare to read, understand, debug and modify" is happily abstracted away and we have a nice safe environment to work with.

... and it then becomes much harder to support parallelism, error handling and responsiveness.

I think there may be a more nuanced solution.


Not really.

Interrupts are really a polling mechanism: the CPU checks an interrupt line at convenient times and alters its control flow.

Most signaling depends on timing: based on what signals have been seen so far, some other signals must appear within a certain time frame. The new signals are blindly sampled.

For instance, if some device asserts a bus line to indicate that it's writing some data, then the data to be written is expected to appear in some time window. The data lines will be sampled regardless of whether the device actually puts out that data. There is no "device has done it" event, and even the original write-indication signal is basically polled.
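To make the "interrupts are really polling" point concrete, here's a toy model (not any real hardware interface, just illustrative names): the "CPU" loop samples an interrupt line at points convenient to it, rather than having control flow pushed into it by the device.

    // Toy model: the "CPU" samples an interrupt line between "instructions";
    // the "device" only sets a flag. Names are illustrative, not a real API.
    import java.util.concurrent.atomic.AtomicBoolean;

    class PolledInterrupt {
        static final AtomicBoolean irqLine = new AtomicBoolean(false);

        public static void main(String[] args) {
            new Thread(() -> irqLine.set(true)).start(); // a "device" asserts the line

            for (int pc = 0; pc < 1_000_000; pc++) {
                // ... execute one "instruction" ...
                if (irqLine.getAndSet(false)) {          // line sampled, not pushed
                    System.out.println("service interrupt at pc=" + pc);
                }
            }
        }
    }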

If we look for a pure cause-and-effect relationship, it's hard to find. A device whose operation is altered by a signal is as much the cause for what happens as is the device which originates that signal.


I'm curious here. In my mind the computer is basically imperative. It processes a series of instructions. It then happens to process multiple such streams at once, but it's still essentially serial.

How is it event driven?


It's event driven in the manner of stimulus/response. A computer may process a series of instructions, but absent a loop it then finishes and does nothing without being told to process another series of instructions. Even within a listening loop it is essentially looking for "something to happen" and if nothing happens it NOPs, or does some basic house cleaning regarding the loop which is essentially the same. It is only when a stimulus occurs that the computer responds with a not-loop series of instructions.
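As a rough sketch of that stimulus/response loop (names made up, nothing authoritative): the loop waits briefly for a stimulus, does its housekeeping NOP when nothing arrives, and only runs the "not-loop" code when something actually happens.

    // Sketch of the listening loop described above; names are illustrative.
    import java.util.concurrent.*;

    class ListeningLoop {
        public static void main(String[] args) throws InterruptedException {
            BlockingQueue<String> stimuli = new LinkedBlockingQueue<>();
            new Thread(() -> stimuli.offer("key-press")).start();

            while (true) {
                String stimulus = stimuli.poll(100, TimeUnit.MILLISECONDS);
                if (stimulus == null) {
                    continue; // nothing happened: NOP / basic housekeeping
                }
                System.out.println("respond to " + stimulus); // the not-loop response
            }
        }
    }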


I suspect they’re confusing “event driven” with “interrupt driven”.


Yes, interrupts and events are distinct.

Would it be clearer to say sequential vs asynchronous?

Or to say that interrupts are one form of event, but not all events involve interrupts.


Most computers have more than one core, and even additional CPUs like a GPU. Also, things on the bus usually have their own processors. So things are happening at the same time; they might tick at whatever rate the bus allows, but they communicate through events like interrupts. As far as I understand it.


I/O, exceptions, system calls... all of these (usually) involve stopping the sequential flow of your program and doing something else.


If you do it at a small scale, where a team of a handful of engineers is responsible for following/maintaining/debugging countless distinct systems that are only linked through Kafka, it's insane.

If you have hundreds or thousands of engineers across more teams than you can count, and each team is responsible for a few event producers or consumers and has the time and resources to become experts in their own systems and their interfaces/boundaries, it's an amazing way to scale your organization. It really does work.


I understand what you are saying here. But... I am having trouble determining the implication.

It seems like you are saying that an EDA is only useful when no one has to understand the system in its entirety? That can't possibly be useful, can it?

The implication is that an EDA is inappropriate if I have a system with 1000 services and 5 engineers, but amazing if I have 500 engineers? Isn't the purpose of a technical architecture to optimize for the former (assuming engineering is a cost-center)?

The former (small-scale) version you describe is really just an optimization for the latter (large-scale) version you describe a la "We have simplified our systems so much that we no longer need as many engineers to maintain it".

In a way, your comment serves to substantiate the point of its parent in that it asserts that choosing an EDA requires more engineers per service.

I must be misunderstanding your point.


> Isn't the purpose of a technical architecture to optimize for the former (assuming engineering is a cost-center)?

The companies that developed these things are employing hundreds or thousands of engineers. So, yeah, they are optimizing for engineering costs when you have thousands of engineers. They are building systems that cannot be fully understood by a single person. If you don't have thousands of engineers, their solutions to their problem won't necessarily be your solution to your problem.

Complex beasts like microservices and event-driven architecture are about enabling huge engineering departments to make changes to huge systems without constantly stepping on each other's toes.

Ignoring headcount, they are actually very inefficient, because systems designed like this necessarily have tons of duplicated effort across the organization. Those inefficiencies are made up for by reducing the amount of communication required between thousands of people and hundreds of teams to change and maintain the system. But if your department isn't huge, then the communication overhead is less than the duplicated effort, which tends to make these designs a poor choice.

These sorts of things usually do not scale down. If you have 1,000 services and 5 engineers there's a 99.999% chance you're doing it wrong. 5 engineers is not even enough employees to manage, monitor, and build the requisite tooling to make efficient use of a system distributed across 1,000 services, much less build and understand the system as a whole.


No, if you can have 1000 services maintained by 5 eng, you're doing fine.

The problem organizations hit is this: you have 2-3 huge services maintained by 5 engineers. The company grows, the services grow. 5 eng become 10 eng become 100. It's still the same 2 or 3 services, but much, much bigger. You get to a point where people are getting in each other's way, and every additional person you add provides less value than the people before them. They have to work within the boundaries of the system and coordinate with other people.

It's all in the same programming language, so you are limited to one pool of potential hires. Tech debt affects everyone. Deploys affect everyone. Downtime affects everyone.

So you split it to solve some of these. Then split it again. And again. And again. You keep splitting until someone can reasonably wrap their head around a single atomic piece, from implementation to deployment. The piece can be rewritten from scratch at any time without impacting any other piece, in any language, on any infrastructure, with any framework, to optimize for the people working on it.

That's great, but that piece still has to be part of the whole. These event buses are an elegant way to keep them together without adding complexity at the individual piece level.

Yeah, it means people might not understand the whole system anymore, but do they need to? It's like how people are organized. The CEO doesn't know the intricacies of the HR system. They know it at a macro level but defer to the HR manager for the details. Same thing here, but with software.


Can confirm that message buses aren't just buzzwords and are literally the only feasible way to combine business systems in the most complex of enterprises. I have seen a messaging domain with over 1000 different event types to deal with. Gigabytes of messages per hour. 24/7/365. Tens of thousands of different system interfaces all talking together using synchronous RPC semantics without experiencing any errors for days at a time. Systems spread across the planet. It's the perfect solution if you have the resources to maintain it.


Event driven architecture is like GOTOs but with data instead of execution flow. Throw your data to the wind, someone somewhere will handle it (I hope).


In general I tend to agree. However, the real issue is one of correctly defining the granularity of events that the system should be aware of.

After all, even a completely synchronous webapp is event driven; it just so happens to only care about a single event: a request.
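For a concrete (if contrived) version of that claim, here is a whole "webapp" whose only event is a request, using the JDK's built-in com.sun.net.httpserver; the route and response are made up for illustration.

    // A synchronous webapp as a one-event system: the only "event" is a request,
    // and the handler below is its single subscriber.
    import com.sun.net.httpserver.HttpServer;
    import java.net.InetSocketAddress;

    class OneEventApp {
        public static void main(String[] args) throws Exception {
            HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
            server.createContext("/", exchange -> {      // "on request" handler
                byte[] body = "ok".getBytes();
                exchange.sendResponseHeaders(200, body.length);
                exchange.getResponseBody().write(body);
                exchange.close();
            });
            server.start();
        }
    }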

The problems come about when code that could all be synchronous becomes unnecessarily complicated and spread across multiple microservices.

Microservices are generally a solution to people scaling issues, not technical ones. If you have significantly more services than teams, you are probably doing something wrong.


My experience is similar - the problem comes when an event is triggered: where did it come from? Oh, it was from this event, which was from this event, and so on.

Where events shine, in my experience, is in GUIs - the onClick event then runs some code. So my rule of thumb is only one level of events, and it comes from an external source - user, network, etc.
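Something like this, in plain Swing: one level of events, originating from an external source (the user), handled by ordinary synchronous code that doesn't fan out into further internal events.

    // The GUI case: a single click event from the user, one handler, no cascade.
    import javax.swing.*;

    class OneLevelOfEvents {
        public static void main(String[] args) {
            SwingUtilities.invokeLater(() -> {
                JFrame frame = new JFrame("demo");
                JButton button = new JButton("Save");
                // One event from an external source; the handler is ordinary code
                // and does not publish further internal events.
                button.addActionListener(e -> System.out.println("saving..."));
                frame.add(button);
                frame.pack();
                frame.setVisible(true);
            });
        }
    }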


Ohh my god, this. My current project has a highly overengineered (with C++ and templates, no less) event system that persists despite being one of the biggest complaints of our customers. We continue to think the event system is "just one more" small tweak away from solving all of our problems.


I think the level of abstraction applied to "event" is key in determining whether the architecture is successful (or even appropriate).

If you attempt to implement it at a granular level (got a click on a button!) then it turns into a giant mess, because you're essentially reinventing the logical flow and event loop of programming via a message bus, which is a pretty terrible idea. If instead you operate at the business-logic level of abstraction, it can result in a much more coherent and easily extended system, especially since that level also sees the most change from executives or operations. At that level it's also easier to integrate with third-party services.
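A rough illustration of that granularity difference (all names hypothetical): the UI-level event stays local, while a business-level event is the kind of thing worth publishing for other parts of the system to react to.

    // Granularity contrast; names are made up for illustration.
    // A UI-level event belongs in the local event loop, not on a message bus:
    record SaveButtonClicked(long timestampMillis) {}

    // A business-level event is a better unit to publish across components:
    record InvoiceApproved(String invoiceId, String approverId) {}

    class Billing {
        // Another team can extend the system by reacting to the business event
        // without the publisher knowing about it.
        void on(InvoiceApproved event) {
            System.out.println("schedule payment for " + event.invoiceId());
        }
    }

    class GranularityDemo {
        public static void main(String[] args) {
            new Billing().on(new InvoiceApproved("INV-7", "alice"));
        }
    }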

It is certainly not a cure for all that ails you, despite the claims of some snake-oil salesmen to the contrary.


I couldn't agree more. It's like the pot of gold at the end of a rainbow.


> Everything is perfectly decoupled

And every damned stack trace in the debugger goes a couple of frames up into a "receive_event" function, and then the trail goes cold.


It works extremely well for ux development ;-)



