This change transforms the cluster from single-master to a fully redundant
3-node control-plane setup for high availability.
Changes:
- Updated inventory/hosts.ini to promote cm4-02 and cm4-03 to the master group
* Added k3s_server_init flag to distinguish primary (true) from joining (false) masters
* Reduced worker nodes from 3 to 1 (cm4-04)
* Added comments explaining the HA setup
- Modified roles/k3s-server/tasks/main.yml for multi-master support
* Separated primary master initialization from additional master joining
* Primary master (k3s_server_init=true) initializes cluster and generates token
* Additional masters (k3s_server_init=false) join using primary's token
* Joining is sequenced after primary initialization to keep the cluster stable
* Common tasks (kubeconfig setup) run on all masters
- Updated site.yml for proper master deployment sequencing
* Primary master deploys first and initializes cluster
* Additional masters deploy serially (one at a time) for stability
* Serial deployment prevents etcd consensus issues during joining
* Workers join only after all masters are ready
* Test deployments run on primary master only
- Added comprehensive "High Availability - Multi-Master Cluster" section to README
* Explains benefits of multi-master setup
* Documents how to promote nodes to master
* Includes monitoring and failover procedures
* Shows how to recover from failed masters
* Explains demoting masters back to workers
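The sequencing described for site.yml can be modeled in a few lines. This is a minimal sketch for illustration only, not the playbook itself: the `Host` class, the `deployment_order` helper, and the "node" group name for workers are hypothetical; only the host names and the k3s_server_init flag come from this change.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    name: str
    group: str                     # "master" or "node" (worker group name is assumed)
    k3s_server_init: bool = False  # True only on the primary master


def deployment_order(hosts):
    """Return hosts as: primary master, joining masters, then workers."""
    primaries = [h for h in hosts if h.group == "master" and h.k3s_server_init]
    if len(primaries) != 1:
        raise ValueError(f"expected exactly one primary master, found {len(primaries)}")
    # Joining masters keep their inventory order and are handled one at a time.
    joiners = [h for h in hosts if h.group == "master" and not h.k3s_server_init]
    workers = [h for h in hosts if h.group != "master"]
    return primaries + joiners + workers


# Inventory deliberately listed out of order to show the sorting effect.
INVENTORY = [
    Host("cm4-04", "node"),
    Host("cm4-02", "master"),
    Host("cm4-01", "master", k3s_server_init=True),
    Host("cm4-03", "master"),
]

if __name__ == "__main__":
    for step, host in enumerate(deployment_order(INVENTORY), start=1):
        print(step, host.name, host.group)
```

Rejecting an inventory with zero or several primaries mirrors the intent of the flag: exactly one master initializes the cluster and every other master joins it.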
Benefits:
✓ No single point of failure in the control plane
✓ Automatic etcd clustering across 3 nodes
✓ Masters can be updated one at a time with no control-plane downtime
✓ Faster cluster recovery from node failures
✓ API server load can be spread across the three masters
✓ Works seamlessly with MikroTik VIP for external access
Deployment Flow:
1. cm4-01 initializes cluster (becomes primary master)
2. cm4-02 joins as control-plane node
3. cm4-03 joins as control-plane node
4. cm4-04 joins as worker node
5. The three masters form the etcd cluster (quorum of 2 out of 3); the worker does not join etcd
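For reference, the quorum arithmetic behind the 3-master choice can be checked with a short sketch. It assumes the standard majority rule used by etcd (quorum = floor(n/2) + 1); the function names are illustrative and not part of this repository.

```python
def quorum(members: int) -> int:
    """Smallest majority of an etcd cluster with `members` voting members."""
    if members < 1:
        raise ValueError("cluster needs at least one member")
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """Number of members that can fail while a majority remains."""
    return members - quorum(members)


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"{n} master(s): quorum={quorum(n)}, tolerated failures={fault_tolerance(n)}")
```

With one master no failure is tolerated, and two masters tolerate no more than one does, which is why the control plane moves directly from 1 to 3 nodes.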
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>