
In my experience, failure occurs more frequently when you use more and more systems in more complex ways, e.g. HAProxy for load balancing, with Consul for service discovery and Consul template for configuration. Each of these is a single point of failure as they are all required for the system to work.

If you define a single point of failure as any single machine whose failure takes the system down with it, then RabbitMQ is not a single point of failure.

I am not sure how domain driven design helps solve this.



> HAProxy for load balancing, with Consul for service discovery and Consul template for configuration. Each of these is a single point of failure as they are all required for the system to work.

Not necessarily. I don't know anything about Consul, but if you use something like ZooKeeper to discover services and write those into an HAProxy config, include a failsafe in whatever rewrites the HAProxy config on ZK updates, such that if the delta is "too large" it refuses to rewrite the config.

Then if ZK becomes unavailable, what you lose is the ability to easily _make changes_ to what's in the service list. If your service instances come and go relatively infrequently, this might be fine while the ZK fire gets put out.
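
A minimal sketch of that failsafe in Python, assuming the config writer keeps the last known-good backend set in memory; the names (maybe_rewrite_config, MAX_DELTA_FRACTION) and the 50% threshold are illustrative, not from any real tool:

    # Failsafe: refuse to rewrite the HAProxy config when the proposed
    # backend set differs too much from the last known-good one, e.g.
    # because a ZK outage briefly reported an empty service list.

    MAX_DELTA_FRACTION = 0.5  # refuse if more than half the backends change at once

    def delta_fraction(current: set[str], proposed: set[str]) -> float:
        """Fraction of the current backend set that the update adds or removes."""
        if not current:
            return 0.0  # first write: nothing to protect yet
        changed = current.symmetric_difference(proposed)
        return len(changed) / len(current)

    def maybe_rewrite_config(current: set[str], proposed: set[str]) -> bool:
        """Return True if the HAProxy config should be rewritten with `proposed`."""
        if delta_fraction(current, proposed) > MAX_DELTA_FRACTION:
            # Likely an SD outage or a bad watch event, not a real topology
            # change: keep serving the last known-good backend list.
            print("refusing to rewrite HAProxy config: delta too large")
            return False
        return True

    # Example: ZK briefly reports an empty service list during an outage.
    current = {"10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"}
    proposed = set()
    assert maybe_rewrite_config(current, proposed) is False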


Service instances in a continuous deployment environment are coming and going all day. If your service discovery and config break, then everything stops; nothing can be developed or deployed until the broken pieces are fixed.


If SD or config mgmt dies, you can't deploy new stuff, but the existing services continue to work. When your message bus dies, everything dies. It's a fundamentally different failure mode.



