This is a good read, but I'm wondering:

The Microservice architecture pattern significantly impacts the relationship between the application and the database. Rather than sharing a single database schema with other services, each service has its own database schema.

Is this a necessary prerequisite? One of the problems I'm dealing with now (and have been in the past) is the tyranny of multiple data stores. At any reasonable scale, this quickly leads to a lack of consistency, no matter how hard you try.

It feels like most of the gain in a microservices architecture is from functional decomposition of code, with limited benefit from discarding the 'Canonical Schema' of SOA. I'd be interested to hear others' experiences with this, though.



The huge benefit that we see in our architecture (which I would call service-oriented, and not necessarily 'microservices') is data separation.

Each of our services is a separate Django app, and the database name is <consistent prefix>_<app name>. Originally, this meant we had 5-6 database schemas named <something>_friend, <something>_invite, <something>_news, etc., all on one database.

What ended up happening was that some services rapidly outgrew the capacity of a single database server. Our 'news' service, for example, handles chat, private messages, and so on, and thus grows nonlinearly with community growth, unlike services that grow linearly (such as our 'identity' service). As a result, the 'news' database had to move to its own server. Thanks to this database schema separation, however, this was a trivial task: dump the schema, restore the schema, change the DB host in the Django config, and you're done.
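For anyone curious what that looks like on the Django side, here's a rough sketch of such a setup; the prefix, app names, hosts, and router module below are all made up for illustration:

    # settings.py -- one database alias per service schema
    # (names and hosts are hypothetical, not from the post above)
    DATABASES = {
        'default': {},  # intentionally empty: every query goes through a service alias
        'myapp_identity': {
            'ENGINE': 'django.db.backends.mysql',
            'NAME': 'myapp_identity',
            'HOST': 'db1.internal',
        },
        'myapp_news': {
            'ENGINE': 'django.db.backends.mysql',
            'NAME': 'myapp_news',
            'HOST': 'db2.internal',  # after the move, this is the only line that changed
        },
    }
    DATABASE_ROUTERS = ['myapp.routers.ServiceRouter']

    # routers.py -- pin each app to its own database so the ORM can't
    # accidentally cross service boundaries
    class ServiceRouter:
        def db_for_read(self, model, **hints):
            return 'myapp_%s' % model._meta.app_label

        db_for_write = db_for_read

        def allow_relation(self, obj1, obj2, **hints):
            # no cross-service joins or foreign keys
            return obj1._meta.app_label == obj2._meta.app_label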

If we had our data intermingled in the same schema, it would have been far, far harder to do this.

Fundamentally, your 'microservices' style architecture should be designed in such a way that you could take any of your services, tar up the code, and e-mail it to someone else, and they could use it in their architecture. For obvious reasons this isn't actually feasible (e.g. service interdependencies), but conceptually you should be able to draw firm, hard lines down your stack showing where each service starts and ends; this includes frontend services (nginx/haproxy/varnish/whatever configs), code (including interface definitions/client libraries), data persistence (database schemas, MongoDB collections, etc.), and caching (Redis/Memcached/etc. instances).
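To illustrate one of those hard lines: a thin client library can be the only sanctioned entry point into a service, so consumers never touch its database or cache directly. A hypothetical sketch (service name, URL, and response shape are all invented):

    # friend_client.py -- the only sanctioned way into the 'friend' service
    import requests

    FRIEND_SERVICE_URL = 'http://friend.internal:8000'

    def get_friends(user_id, timeout=2.0):
        """Fetch a user's friend list via the service's HTTP API."""
        resp = requests.get('%s/v1/users/%d/friends' % (FRIEND_SERVICE_URL, user_id),
                            timeout=timeout)
        resp.raise_for_status()
        return resp.json()['friends']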

The more interdependencies you have, the more problems you'll encounter down the road. If you intermingle MySQL data then any maintenance is downtime, any slowdown slows everything, any tuning is across your entire dataset, etc.


It's a requirement in the sense that sharing a datastore would break the abstraction. Each service should be independent from the others, which necessitates a separate data store.

Consistency should be maintained at the application level if you want to build a robust service, because doing it in the database leads to a single point of failure (the database).
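One common way to do that at the application level is a compensating action: each service commits to its own store, and the caller undoes earlier steps if a later one fails. A minimal runnable sketch; every name here is an invented stand-in for a real service client:

    class PaymentError(Exception):
        pass

    def create_order(user_id, items):        # service A commits to its own store
        return {'id': 42, 'total': sum(items.values())}

    def cancel_order(order_id):              # compensating action on service A
        print('order %s cancelled' % order_id)

    def charge_payment(user_id, amount):     # service B commits to its own store
        raise PaymentError('card declined')  # simulate a failure

    def place_order(user_id, items):
        order = create_order(user_id, items)
        try:
            charge_payment(user_id, order['total'])
        except PaymentError:
            # there is no shared database transaction to lean on, so the
            # application restores consistency itself by undoing step one
            cancel_order(order['id'])
            raise
        return order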


> Is this a necessary prerequisite?

For something to really be using a microservice architecture? Yes.

Of course, real-world systems don't have to use pure architectural styles, though it's worth understanding why a named architectural style combines certain features before deciding to use some but not others.

> One of the problems I'm dealing with now (and have been in the past) is the tyranny of multiple data stores. At any reasonable scale, this quickly leads to a lack of consistency, no matter how hard you try.

Honestly, I think that if you have real inconsistency (rather than differences in data of similar form but different semantic meaning) across microservices with separate data stores, it means you have designed your services improperly, such that they have overlapping responsibilities.


I don't think it's a hard prerequisite (there really aren't too many of those; microservices can be built however you want them to be), but I think it's a good rule to follow.

If consistency is a necessary concern, and you have tightly coupled data, it's not a terrible idea to make the services a little bigger.

But also, if you have a service that depends on multiple other services to do its work, I don't think it's so bad to get used to using the APIs of those other services (rather than trying to access their databases directly), despite the introduced latency overhead.
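And that latency overhead can at least be reduced by fanning the upstream calls out concurrently. A small sketch; the hostnames and paths are hypothetical:

    import requests
    from concurrent.futures import ThreadPoolExecutor

    def get_profile_page(user_id):
        # fan out to two upstream services in parallel instead of
        # reading their databases directly
        with ThreadPoolExecutor(max_workers=2) as pool:
            friends = pool.submit(requests.get,
                                  'http://friend.internal/v1/users/%d/friends' % user_id,
                                  timeout=2.0)
            feed = pool.submit(requests.get,
                               'http://news.internal/v1/users/%d/feed' % user_id,
                               timeout=2.0)
            return {'friends': friends.result().json(),
                    'feed': feed.result().json()}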


If you have multiple "microservices", all operating on the same data store, it is difficult to guarantee the separation of concerns.

Conceptually, though, I don't think it is a requirement.


Separation of concerns or no, the problem with a single central datastore is that it will eventually become the bottleneck as you scale. This is really difficult to fix, not just technically but politically - as a central datastore grows, everything and everyone starts taking dependencies on it: reports, homegrown tools, documented troubleshooting strategies, etc. They become sacred cows of an organization.

Not only will there be resistance to the idea of splitting out that datastore, but major investment will be required to do it - implementing all of that disconnected messaging stuff you're going to need, reworking applications/services to communicate that way, and handling eventual consistency - which is a tough sell when the app works "perfectly fine" except for that scaling problem.
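For what it's worth, that "disconnected messaging stuff" usually ends up as some variant of the outbox pattern: commit the state change and the outgoing event atomically in the service's own database, and let a relay publish pending events afterwards. A toy, runnable sketch (sqlite stands in for the service's database; the schema and event names are invented):

    import json, sqlite3

    db = sqlite3.connect(':memory:')
    db.execute('CREATE TABLE accounts (id INTEGER PRIMARY KEY, email TEXT)')
    db.execute('CREATE TABLE outbox (id INTEGER PRIMARY KEY, payload TEXT, sent INTEGER DEFAULT 0)')
    db.execute("INSERT INTO accounts VALUES (1, 'old@example.com')")

    def change_email(account_id, new_email):
        # the state change and the event commit in ONE local transaction;
        # no distributed transaction against other services' stores
        with db:
            db.execute('UPDATE accounts SET email = ? WHERE id = ?', (new_email, account_id))
            db.execute('INSERT INTO outbox (payload) VALUES (?)',
                       (json.dumps({'event': 'email_changed', 'account_id': account_id}),))

    def relay_outbox(publish):
        # runs periodically: push unsent events to the bus, then mark them sent;
        # consumers update their own stores, giving eventual consistency
        rows = db.execute('SELECT id, payload FROM outbox WHERE sent = 0').fetchall()
        for row_id, payload in rows:
            publish(payload)  # e.g. hand off to RabbitMQ/Kafka
            db.execute('UPDATE outbox SET sent = 1 WHERE id = ?', (row_id,))
        db.commit()

    change_email(1, 'new@example.com')
    relay_outbox(print)  # toy 'bus': just prints the event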


You could have separate schemas.



