
In my experience, failure occurs more frequently when you use more and more systems in more complex ways, e.g. HAProxy for load balancing, with Consul for service discovery and Consul template for configuration. Each of these is a single point of failure as they are all required for the system to work.

If you define a single point of failure as any single machine whose failure takes the system down with it, then RabbitMQ is not a single point of failure.

I am not sure how domain driven design helps solve this.



> HAProxy for load balancing, with Consul for service discovery and Consul template for configuration. Each of these is a single point of failure as they are all required for the system to work.

Not necessarily. I don't know anything about Consul, but if you use something like ZooKeeper to discover services and write those into an HAProxy config, include a failsafe in whatever rewrites the HAProxy config on ZK updates, such that if the delta is "too large" it refuses to rewrite the config.

Then if ZK becomes unavailable, what you lose is the ability to easily _make changes_ to what's in the service list. If your service instances come and go relatively infrequently, this might be fine while the ZK fire gets put out.
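
A minimal sketch of that failsafe in Python, assuming the config writer keeps the last known-good backend set in memory; the names (maybe_rewrite_config, MAX_DELTA_FRACTION) and the 50% threshold are illustrative, not from any real tool:

    # Failsafe: refuse to rewrite the HAProxy config when the proposed
    # backend set differs too much from the last known-good one, e.g.
    # because a ZK outage briefly reported an empty service list.

    MAX_DELTA_FRACTION = 0.5  # refuse if more than half the backends change at once

    def delta_fraction(current: set[str], proposed: set[str]) -> float:
        """Fraction of the current backend set that the update adds or removes."""
        if not current:
            return 0.0  # first write: nothing to protect yet
        changed = current.symmetric_difference(proposed)
        return len(changed) / len(current)

    def maybe_rewrite_config(current: set[str], proposed: set[str]) -> bool:
        """Return True if the HAProxy config should be rewritten with `proposed`."""
        if delta_fraction(current, proposed) > MAX_DELTA_FRACTION:
            # Likely an SD outage or a bad watch event, not a real topology
            # change: keep serving the last known-good backend list.
            print("refusing to rewrite HAProxy config: delta too large")
            return False
        return True

    # Example: ZK briefly reports an empty service list during an outage.
    current = {"10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"}
    proposed = set()
    assert maybe_rewrite_config(current, proposed) is False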


Service instances in a continuous deployment environment are coming and going all day. If your service discovery and config break, then everything stops; nothing can be developed or deployed until the broken pieces are fixed.


If SD or config mgmt dies, you can't deploy new stuff, but the existing services continue to work. When your message bus dies, everything dies. It's a fundamentally different failure mode.



