Code, Deploy, Monitor.

Making Infra Boring, So Developers Can Be Brilliant


Hola, it’s me Owen πŸ‘‹
I’m just someone who enjoys building things, breaking them (in dev only, I promise), and learning along the way. This blog is a space for thoughts, stories, experiments, and everything in between β€” mostly tech, sometimes life. Welcome aboard!

Disaster Recovery

I designed and executed a Disaster Recovery (DR) scenario to ensure business continuity in the event of failure in our primary infrastructure hosted on Alicloud. The objective was to validate that all critical services could be seamlessly switched over to our secondary environment in Google Cloud Platform (GCP).

As part of the simulation, we deliberately shut down the entire Alicloud Kubernetes cluster for one full day. During this period, the cluster running in GCP took over all workloads, allowing services to operate normally without user disruption. The DR setup included syncing manifests, secrets, images, and persistent data, as well as configuring DNS failover and redeploying applications with minimal downtime.

This exercise proved the reliability of our multi-cloud strategy and reinforced operational readiness for unexpected outages in the primary environment.