[HN] Total data loss after botched GitOps and failed backups
(firefish.social)
Ouch.
The Kubernetes ecosystem is full of tools and add-ons that solve particular problems (often by exploiting the dynamic nature of K8s), but each one brings additional complexity, and that complexity adds up over time until it's very hard to reason intuitively about the consequences of a change.
I personally prefer my IaC with a manual review & approval step. The more you automate, the more the testing complexity & cost (including the need for additional dev/test environments) and, of course, the risk increase.
It's a shame that the backup/restore testing didn't work in this case, though. These kinds of TIFUs are better with a happy-ish ending.