- Introduce Traefik ACME configuration using the Cloudflare DNS-01 challenge
- Deploy Vaultwarden password manager with IP allowlist protection
- Add middleware for security headers, compression, and rate limiting
- Update IngressRoute resources to use the new ACME resolver
- Add troubleshooting steps for certificate and TLS issues
- Include test application deployment and verification commands
- Obtain the token from the primary master via slurp and base64 decode
- Derive k3s_url from the primary master's ansible_host
- Use the decoded content as k3s_token
- Update the success message quoting
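
The token handling above can be sketched in Python for illustration. This is not the playbook code: it only mirrors what a slurp result looks like (a base64 `content` field) and how the URL might be built. The sample token, the host address, and port 6443 are assumptions.

```python
import base64


def decode_slurped_token(slurp_result: dict) -> str:
    """Decode the base64 'content' field of an Ansible slurp result."""
    raw = base64.b64decode(slurp_result["content"])
    # Token files usually end with a newline, so strip surrounding whitespace.
    return raw.decode("utf-8").strip()


def build_k3s_url(ansible_host: str, port: int = 6443) -> str:
    """Build the server URL from the primary master's ansible_host.

    Port 6443 is assumed here; adjust if the API server listens elsewhere.
    """
    return f"https://{ansible_host}:{port}"


# Hypothetical slurp result and host address, for illustration only.
sample = {
    "content": base64.b64encode(b"K10abc::server:secret\n").decode("ascii"),
    "encoding": "base64",
}
k3s_token = decode_slurped_token(sample)
k3s_url = build_k3s_url("192.168.1.11")
```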
- Switch compute-blade-agent deployment from workers to all nodes
(control-plane and workers)
- Use /usr/bin/compute-blade-agent instead of /usr/local/bin
- Update verification scripts to reference /usr/bin/compute-blade-agent
- Update docs to refer to all nodes across Deployment Guide, Checklist,
and Getting Started
- Change site.yml to install on all hosts instead of just workers
- Align example commands to the all-nodes workflow
- Refactor MikroTik API parsing to extract name and poe-out reliably, and
add debug logs
- Update boot config on CM4: ensure the [cm4] section exists, set
enable_uart=1, and apply the uart5 overlay
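
A minimal sketch of an idempotent boot config edit, written in Python rather than as the actual Ansible tasks. It assumes the overlay is enabled with a `dtoverlay=uart5` line and that the file uses `[section]` filters; the function name is hypothetical.

```python
def ensure_cm4_uart(config_text: str) -> str:
    """Return config text whose [cm4] section enables the UART and uart5 overlay."""
    lines = config_text.splitlines()

    # Find the [cm4] section header, or append one if it is missing.
    start = next((i for i, l in enumerate(lines) if l.strip() == "[cm4]"), None)
    if start is None:
        if lines and lines[-1].strip():
            lines.append("")
        lines.append("[cm4]")
        start = len(lines) - 1

    # The section ends at the next [header] or at end of file.
    end = next(
        (i for i in range(start + 1, len(lines)) if lines[i].strip().startswith("[")),
        len(lines),
    )
    section = lines[start + 1:end]

    # Replace any existing enable_uart setting with enable_uart=1.
    section = [l for l in section if not l.strip().startswith("enable_uart=")]
    section.insert(0, "enable_uart=1")

    # Add the overlay line only if it is not already present (assumed syntax).
    if not any(l.strip() == "dtoverlay=uart5" for l in section):
        section.insert(1, "dtoverlay=uart5")

    lines[start + 1:end] = section
    return "\n".join(lines) + "\n"
```

Running the function a second time on its own output leaves the text unchanged, which is the property the boot config task needs.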
This change transforms the cluster from single-master to a fully redundant
3-node control-plane setup for high availability.
Changes:
- Updated inventory/hosts.ini to promote cm4-02 and cm4-03 to the master group
* Added k3s_server_init flag to distinguish primary (true) from joining (false) masters
* Reduced worker nodes from 3 to 1 (cm4-04)
* Added clear comments explaining the HA setup
- Modified roles/k3s-server/tasks/main.yml for multi-master support
* Separated primary master initialization from additional master joining
* Primary master (k3s_server_init=true) initializes cluster and generates token
* Additional masters (k3s_server_init=false) join using primary's token
* Proper sequencing ensures cluster stability during joining
* Common tasks (kubeconfig setup) run on all masters
- Updated site.yml for proper master deployment sequencing
* Primary master deploys first and initializes cluster
* Additional masters deploy serially (one at a time) for stability
* Serial deployment avoids etcd membership conflicts while masters join
* Workers join only after all masters are ready
* Test deployments run on primary master only
- Added comprehensive "High Availability - Multi-Master Cluster" section to README
* Explains benefits of multi-master setup
* Documents how to promote nodes to master
* Includes monitoring and failover procedures
* Shows how to recover from failed masters
* Explains demoting masters back to workers
Benefits:
✓ No single point of failure in control-plane
✓ Automatic etcd clustering across 3 nodes
✓ Masters can be updated one at a time with zero control-plane downtime
✓ Faster cluster recovery from node failures
✓ API server load spread across three control-plane nodes
✓ Works seamlessly with MikroTik VIP for external access
Deployment Flow:
1. cm4-01 initializes cluster (becomes primary master)
2. cm4-02 joins as control-plane node
3. cm4-03 joins as control-plane node
4. cm4-04 joins as worker node
5. The three control-plane nodes form the etcd cluster (quorum of 2 of 3);
the worker cm4-04 does not run etcd
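
The quorum mentioned in the flow above follows standard majority arithmetic, sketched here in Python. The function names are illustrative and not part of the playbooks.

```python
def etcd_quorum(members: int) -> int:
    """Smallest majority of an etcd cluster with the given member count."""
    return members // 2 + 1


def failures_tolerated(members: int) -> int:
    """Number of members that can fail while a majority remains."""
    return members - etcd_quorum(members)


# Compare one, two, and three control-plane nodes.
for n in (1, 2, 3):
    print(n, etcd_quorum(n), failures_tolerated(n))
# 1 1 0
# 2 2 0
# 3 2 1
```

Two masters tolerate no more failures than one, which is why the setup moves directly to three.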
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
- Remove 'when: not k3s_binary.stat.exists' condition from k3s-server and
k3s-agent installation tasks to allow in-place upgrades of K3s versions
- Update task names to reflect both install and upgrade functionality
- Detect changes by inspecting installer stdout so Ansible reports
changed/ok accurately
Add InfluxDB v2 native dashboard alongside Grafana dashboard:
- Create influxdb/rpi-cluster-dashboard-v2.json for InfluxDB 2.8 compatibility
- Update Grafana dashboard datasource UID from 'influx' to 'influxdb'
- Remove unused disk usage and network traffic panels per user request
Update worker node discovery in compute-blade-agent verification script:
- Fix pattern matching to work with cm4-* node naming convention
- Add support for pi-worker and cb-0* patterns as fallbacks
- Parse the [worker] section of the inventory correctly
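
The worker discovery above can be approximated in Python as follows. The real verification script may be structured differently; the sample inventory, the host addresses, and the exact glob forms (for example `pi-worker*`) are assumptions.

```python
from fnmatch import fnmatch

# Primary naming convention first, then the fallback patterns (assumed globs).
WORKER_PATTERNS = ("cm4-*", "pi-worker*", "cb-0*")


def worker_hosts(inventory_text: str, section: str = "worker") -> list:
    """Return host names listed under [section] that match a known pattern."""
    hosts = []
    in_section = False
    for raw in inventory_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        if line.startswith("["):
            in_section = line == f"[{section}]"
            continue
        if in_section:
            name = line.split()[0]  # drop inline vars such as ansible_host=...
            if any(fnmatch(name, pattern) for pattern in WORKER_PATTERNS):
                hosts.append(name)
    return hosts


# Hypothetical inventory, for illustration only.
sample_inventory = """
[master]
cm4-01 ansible_host=192.168.1.11

[worker]
# single remaining worker
cm4-04 ansible_host=192.168.1.14
"""
workers = worker_hosts(sample_inventory)
```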
Update inventory version documentation:
- Add comment explaining how to use 'latest' for auto-updates
- Set version to v1.35.0+k3s1 (updated from v1.34.2+k3s1)
- Add guidance on version format for users
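
As an illustration of the version format described above, a small Python check could look like this. The regular expression is an assumption based on the two versions named in this commit and does not cover pre-release tags; the function name is hypothetical.

```python
import re

# Matches versions such as v1.35.0+k3s1 (assumed format, no pre-release tags).
K3S_VERSION_RE = re.compile(r"^v\d+\.\d+\.\d+\+k3s\d+$")


def is_valid_k3s_version(value: str) -> bool:
    """Accept 'latest' or a pinned version in the vX.Y.Z+k3sN form."""
    return value == "latest" or bool(K3S_VERSION_RE.match(value))
```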
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>