Heh, this reminds me of when the AWS control plane went down... we had a custom autoscaling config that would query for the number of running instances and scale accordingly... but when the AWS API died, that query kept coming back with zero running instances...
So our system thought none were running and kept launching more....
These were SPOT instances and thus only cost like $0.10 per hour...
But we launched like 2500 instances which all needed to slurp down their DB and config - so it overloaded all other control plane systems...
We had to reboot the entire system. Which took forever.
The only good thing was that this happened at 11am, so all team members were online and available... and then AWS refunded all costs.
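For anyone building something similar, here's a rough sketch of the guard we were missing. It is not our actual config; `plan_launches`, `InstanceCountUnavailable`, `last_known` and `max_per_cycle` are all made-up names for illustration. The idea is just that "the API failed" and "zero instances are running" have to be different answers, and that a single cycle should never be allowed to launch an unbounded number of instances.

```python
from typing import Callable, Optional


class InstanceCountUnavailable(Exception):
    """Hypothetical error: the cloud API could not give a trustworthy count."""


def plan_launches(
    desired: int,
    get_running_count: Callable[[], int],
    last_known: Optional[int],
    max_per_cycle: int = 10,
) -> int:
    """Return how many instances to launch in this scaling cycle."""
    try:
        running = get_running_count()
    except InstanceCountUnavailable:
        # API is down: hold steady instead of assuming the fleet is empty.
        return 0

    # A fleet that was non-empty last cycle and reads as exactly zero now
    # is suspicious. Assumption: skipping one cycle is cheaper than a
    # runaway launch; a real system needs a way to confirm a true outage.
    if running == 0 and last_known:
        return 0

    deficit = max(desired - running, 0)
    # Hard cap so one bad reading cannot launch thousands of instances.
    return min(deficit, max_per_cycle)
```

With something like this, a dead API means "do nothing this cycle", and even a bogus reading can only cost you `max_per_cycle` instances at a time instead of 2500.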
---
The other fun time was when a newbie dev checked AWS creds into git. He had created our 201st repo (we had only paid for 200), and since that next repo wasn't covered by the paid plan, it was public by default. Bots slurped it up ASAP and used the AWS creds to launch bitcoin mining bots in every single region around the globe. Like 1700 instances.
The thing that sucked about that was it happened at like 3am and we had to rally on that one pretty fast. AWS still refunded all costs...
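The kind of check that would have caught this before the push looks roughly like the sketch below. It's not what we ran; `find_access_key_ids` is a made-up helper. It only looks for strings shaped like AWS access key IDs (the `AKIA` and `ASIA` prefixes followed by 16 uppercase letters or digits) and will not catch the secret key itself, so treat it as a tripwire, not a real secret scanner.

```python
import re
from typing import List

# Pattern for strings shaped like AWS access key IDs:
# "AKIA" (long-term) or "ASIA" (temporary) plus 16 uppercase letters/digits.
ACCESS_KEY_ID = re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")


def find_access_key_ids(text: str) -> List[str]:
    """Return every substring of `text` that looks like an access key ID."""
    return ACCESS_KEY_ID.findall(text)
```

You'd run it over the staged diff in a pre-commit hook and refuse the commit if the list is non-empty, e.g. `find_access_key_ids("aws_access_key_id = AKIAIOSFODNN7EXAMPLE")` flags the example key from the AWS docs.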