Update documentation
+48 -33
@@ -68,36 +68,36 @@ bash scripts/verify-compute-blade-agent.sh
- [ ] Service status shows "Running"
- [ ] Config file exists at `/etc/compute-blade-agent/config.yaml`

### 3. Manual Verification on a Master Node

```bash
# Connect to any master (cm4-01, cm4-02, or cm4-03)
ssh pi@192.168.30.101
kubectl get nodes
```

- [ ] All 3 masters show as "Ready"
- [ ] Worker node (cm4-04) shows as "Ready"

### 4. Check Etcd Quorum

```bash
ssh pi@192.168.30.101
sudo /var/lib/rancher/k3s/data/*/bin/etcdctl member list
```

- [ ] All 3 etcd members show as active
- [ ] Cluster has quorum (2/3 minimum for failover)
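The 2/3 rule above is plain majority voting: an etcd cluster of n members needs ⌊n/2⌋+1 votes, so 3 members tolerate exactly one failure. A minimal arithmetic sketch (the `quorum`/`tolerance` helper names are invented here for illustration, not part of etcd):

```shell
# quorum: votes needed for a majority in an n-member etcd cluster.
# tolerance: how many members can fail while quorum still holds.
quorum()    { echo $(( $1 / 2 + 1 )); }
tolerance() { echo $(( ($1 - 1) / 2 )); }

echo "3 members: quorum=$(quorum 3), tolerates $(tolerance 3) failure(s)"
echo "5 members: quorum=$(quorum 5), tolerates $(tolerance 5) failure(s)"
```

This is also why 2 masters are no better than 1 for availability: a 2-member cluster needs both votes for quorum.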
### 5. Verify Kubeconfig

```bash
export KUBECONFIG=$(pwd)/kubeconfig
kubectl config get-contexts
```

- [ ] Shows contexts: cm4-01, cm4-02, cm4-03, and default
- [ ] All contexts point to correct control-plane nodes

## Optional: Kubernetes Monitoring Setup
@@ -159,15 +159,20 @@ enable_compute_blade_agent=true # or false
### Per-Node Configuration

Note: cm4-02 and cm4-03 are now **master nodes**, not workers. To enable/disable compute-blade-agent on specific nodes, edit `inventory/hosts.ini`:

```ini
[master]
cm4-01 ansible_host=192.168.30.101 ansible_user=pi k3s_server_init=true enable_compute_blade_agent=false
cm4-02 ansible_host=192.168.30.102 ansible_user=pi k3s_server_init=false enable_compute_blade_agent=false
cm4-03 ansible_host=192.168.30.103 ansible_user=pi k3s_server_init=false enable_compute_blade_agent=false

[worker]
cm4-04 ansible_host=192.168.30.104 ansible_user=pi enable_compute_blade_agent=true
```

- [ ] Per-node settings configured as needed
- [ ] Master nodes typically don't need compute-blade-agent
- [ ] Saved inventory file
- [ ] Re-run playbook if changes made
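Before re-running the playbook, you can sanity-check which hosts will actually get the agent by filtering the inventory. A sketch (the `/tmp/hosts.ini` sample is inlined so the snippet is self-contained; on a real run you would point `awk` at `inventory/hosts.ini`):

```shell
cat > /tmp/hosts.ini <<'EOF'
[master]
cm4-01 ansible_host=192.168.30.101 enable_compute_blade_agent=false
[worker]
cm4-04 ansible_host=192.168.30.104 enable_compute_blade_agent=true
EOF

# Print the hostnames that have the agent enabled.
awk '/enable_compute_blade_agent=true/ { print $1 }' /tmp/hosts.ini
```

For the sample above this prints only `cm4-04`.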
@@ -214,26 +219,36 @@ ansible worker -m shell -a "systemctl status compute-blade-agent" --become
- [ ] All workers show active status

## HA Cluster Maintenance

### Testing Failover

Your 3-node HA cluster can handle one master going down (maintains 2/3 quorum):

```bash
# Reboot one master while monitoring cluster
ssh pi@192.168.30.101
sudo reboot

# From another terminal, watch cluster status
watch kubectl get nodes
```

- [ ] Cluster remains operational with 2/3 masters
- [ ] Pods continue running
- [ ] Can still kubectl from cm4-02 or cm4-03 context
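For a scripted failover test it is handier to poll until the API server answers again than to eyeball `watch` output. A sketch with a generic retry helper (`retry_until` is a name invented here, not part of k3s or kubectl):

```shell
# Run a command repeatedly until it succeeds or the timeout (seconds) expires.
retry_until() {
  local timeout=$1; shift
  local start
  start=$(date +%s)
  until "$@"; do
    if [ $(( $(date +%s) - start )) -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
  done
}

# On the cluster you might wait for the rebooted master like this:
# retry_until 120 kubectl get nodes >/dev/null 2>&1 && echo "API server is back"
retry_until 5 true && echo "command succeeded"
```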
## Uninstall (if needed)

### Uninstall K3s from All Nodes

```bash
ansible all -m shell -a "bash /usr/local/bin/k3s-uninstall.sh" --become
ansible worker -m shell -a "bash /usr/local/bin/k3s-agent-uninstall.sh" --become
```

- [ ] All K3s services stopped
- [ ] Cluster data cleaned up

### Disable in Future Deployments
+33 -19
@@ -18,9 +18,9 @@ cat inventory/hosts.ini
Verify:

- Master nodes are correct (cm4-01, cm4-02, cm4-03)
- Worker node IP is correct (cm4-04)
- `enable_compute_blade_agent=true` is set (optional for masters)

### Step 2: Test Connectivity
@@ -46,17 +46,22 @@ This will:
**Total time**: ~30-45 minutes

### Step 4: Verify Cluster

```bash
export KUBECONFIG=$(pwd)/kubeconfig
kubectl get nodes
```

You should see all 4 nodes ready (3 masters + 1 worker):

```bash
NAME     STATUS   ROLES                       AGE   VERSION
cm4-01   Ready    control-plane,etcd,master   5m    v1.35.0+k3s1
cm4-02   Ready    control-plane,etcd          3m    v1.35.0+k3s1
cm4-03   Ready    control-plane,etcd          3m    v1.35.0+k3s1
cm4-04   Ready    <none>                      3m    v1.35.0+k3s1
```
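To assert that state from a script rather than reading the table, count the `Ready` rows. A self-contained sketch with the sample output inlined (on the cluster you would pipe `kubectl get nodes` in instead):

```shell
nodes='NAME     STATUS   ROLES                       AGE   VERSION
cm4-01   Ready    control-plane,etcd,master   5m    v1.35.0+k3s1
cm4-02   Ready    control-plane,etcd          3m    v1.35.0+k3s1
cm4-03   Ready    control-plane,etcd          3m    v1.35.0+k3s1
cm4-04   Ready    <none>                      3m    v1.35.0+k3s1'

# Count rows whose STATUS column is exactly "Ready" (header row skipped).
ready=$(echo "$nodes" | awk 'NR > 1 && $2 == "Ready"' | wc -l)
echo "ready nodes: $ready"
```

For the sample above the count is 4.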
## Configuration
@@ -215,22 +220,31 @@ sudo systemctl status compute-blade-agent
## Common Tasks

### Check Cluster Status

```bash
export KUBECONFIG=$(pwd)/kubeconfig
kubectl get nodes
kubectl get pods --all-namespaces
```

### Access Any Master Node

```bash
# Access cm4-01
ssh pi@192.168.30.101

# Or access cm4-02 (backup master)
ssh pi@192.168.30.102

# Or access cm4-03 (backup master)
ssh pi@192.168.30.103
```
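The same check can be scripted across all three masters instead of three manual sessions. A sketch (the `ssh` line is commented out so the loop is safe to run off-network; `systemctl is-active k3s` is an assumed example of what you would check):

```shell
masters="192.168.30.101 192.168.30.102 192.168.30.103"
for ip in $masters; do
  echo "checking $ip"
  # ssh "pi@$ip" 'sudo systemctl is-active k3s'
done
```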
### Deploy Only to Specific Nodes

```bash
ansible-playbook site.yml --tags compute-blade-agent --limit cm4-04
```

### Disable Agent for Next Deployment
@@ -257,12 +271,12 @@ ansible worker -m shell -a "bash /usr/local/bin/k3s-uninstall-compute-blade-agen
ansible all -m shell -a "bash /usr/local/bin/k3s-uninstall.sh" --become
```

## Documentation

- **README.md** - Full guide with all configuration options
- **DEPLOYMENT_CHECKLIST.md** - Step-by-step checklist
- **COMPUTE_BLADE_AGENT.md** - Quick reference for agent deployment
- **MIKROTIK-VIP-SETUP-CUSTOM.md** - Virtual IP failover configuration

## File Locations
@@ -8,16 +8,18 @@ Customized setup guide for your MikroTik RouterOS configuration.
Uplink Network: 192.168.1.0/24 (br-uplink - WAN/External)
LAB Network: 192.168.30.0/24 (br-lab - K3s Cluster)

K3s Nodes (3-node HA Cluster):
cm4-01: 192.168.30.101 (Master/Control-Plane)
cm4-02: 192.168.30.102 (Master/Control-Plane)
cm4-03: 192.168.30.103 (Master/Control-Plane)
cm4-04: 192.168.30.104 (Worker)

Virtual IP to Create:
192.168.30.100/24 (on br-lab bridge - HAProxy or MikroTik failover)
```

**⚠️ Important Note**: The basic NAT rules below route to cm4-01 only. To achieve true failover in your 3-node HA cluster, activate the health check script (Step 8) so traffic automatically routes to another master if cm4-01 goes down.

## Step 1: Add Virtual IP Address on MikroTik

Since your K3s nodes are on the `br-lab` bridge, add the VIP there:
@@ -183,9 +185,9 @@ curl http://test.zlor.fi
curl -k https://test.zlor.fi
```

## Step 8: Add Health Check Script (Recommended for HA)

**For automatic failover with your 3-node HA cluster**, create a health check script that monitors the master node and updates NAT rules if it goes down. This ensures traffic automatically routes to cm4-02 or cm4-03 if cm4-01 fails.

### Create Health Check Script
@@ -237,6 +239,8 @@ For automatic failover, create a health check script that monitors the master no
comment="Monitor K3s cluster and update VIP routes"
```

**Status**: This scheduler runs every 30 seconds and automatically switches the VIP NAT rules to an available master if cm4-01 becomes unreachable.

### View Health Check Logs

```mikrotik
@@ -247,14 +251,33 @@ For automatic failover, create a health check script that monitors the master no
## Verification Checklist

- [ ] VIP address (192.168.30.100) added to br-lab
- [ ] NAT rules for port 80 and 443 created (routed to cm4-01)
- [ ] Firewall rules allow traffic to VIP
- [ ] Ping 192.168.30.100 succeeds
- [ ] curl http://192.168.30.100 returns nginx page
- [ ] DNS A record added: test.zlor.fi → 192.168.30.100
- [ ] curl http://test.zlor.fi works
- [ ] **Health check script created** (recommended for HA failover)
- [ ] **Health check scheduled** (recommended for HA failover)
- [ ] Test failover by checking the health check scheduler status

## Testing Failover (HA Cluster)

If you've enabled the health check script, you can test automatic failover:

```bash
# From your machine, start monitoring
watch -n 5 'curl -v http://192.168.30.100 2>&1 | grep "200 OK\|Connected"'

# In another terminal, SSH to cm4-01 and reboot it
ssh pi@192.168.30.101
sudo reboot

# Watch the curl output - after ~30 seconds, it should reconnect
# This means the health check script switched traffic to cm4-02 or cm4-03
```

**Expected result**: Traffic stays online during the reboot, apart from the ~30 second switchover window.
## Troubleshooting
@@ -368,16 +391,27 @@ Your VIP is now configured on MikroTik:
```
External Traffic
        ↓
192.168.30.100:80/443 (VIP on br-lab)
        ↓
NAT Rule Routes to 192.168.30.101:80/443 (cm4-01 Master)
        ↓
If Health Check Enabled:
- Routes to cm4-02 if cm4-01 down (checked every 30 seconds)
- Routes to cm4-03 if both cm4-01 and cm4-02 down
        ↓
Ingress → K3s Service → Pods
```

**DNS**: `test.zlor.fi → 192.168.30.100`

**Status**:

- ✅ Single IP for entire cluster
- ✅ Automatic failover (with health check script)
- ✅ 3-node HA masters provide etcd quorum

**Next Steps**:

1. Enable health check script (Step 8) for automatic failover
2. Test failover by rebooting cm4-01 and monitoring connectivity
3. Your cluster now has true high availability!
@@ -42,19 +42,19 @@ Edit `inventory/hosts.ini` and add your Raspberry Pi nodes:
```ini
[master]
cm4-01 ansible_host=192.168.30.101 ansible_user=pi k3s_server_init=true
cm4-02 ansible_host=192.168.30.102 ansible_user=pi k3s_server_init=false
cm4-03 ansible_host=192.168.30.103 ansible_user=pi k3s_server_init=false

[worker]
cm4-04 ansible_host=192.168.30.104 ansible_user=pi
```

### 2. Configure Variables

In `inventory/hosts.ini`, you can customize:

- `k3s_version`: K3s version to install (default: v1.35.0+k3s1)
- `extra_server_args`: Additional arguments for k3s server
- `extra_agent_args`: Additional arguments for k3s agent
- `extra_packages`: List of additional packages to install on all nodes
@@ -304,20 +304,21 @@ kubectl get nodes
You should see all your nodes in Ready state:

```bash
NAME     STATUS   ROLES                       AGE   VERSION
cm4-01   Ready    control-plane,etcd,master   5m    v1.35.0+k3s1
cm4-02   Ready    control-plane,etcd          3m    v1.35.0+k3s1
cm4-03   Ready    control-plane,etcd          3m    v1.35.0+k3s1
cm4-04   Ready    <none>                      3m    v1.35.0+k3s1
```

## Accessing the Cluster

### From Master Node

SSH into a master node and use kubectl:

```bash
ssh pi@192.168.30.101
kubectl get nodes
```
@@ -461,8 +462,11 @@ nginx-test-7d8f4c9b6d-xr5wp 1/1 Running 0 1m pi-worker-2
Add a cluster node IP to /etc/hosts:

```bash
# Replace with any master or worker node IP
192.168.30.101 nginx-test.local nginx.pi.local
192.168.30.102 nginx-test.local nginx.pi.local
192.168.30.103 nginx-test.local nginx.pi.local
192.168.30.104 nginx-test.local nginx.pi.local
```
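Appending to /etc/hosts blindly duplicates entries on re-runs; a `grep` guard keeps it idempotent. A sketch using a temp file so it is safe to execute anywhere (swap in `/etc/hosts` with sudo on a real machine):

```shell
hosts=/tmp/hosts.demo
entry="192.168.30.101 nginx-test.local nginx.pi.local"

: > "$hosts"                                              # start from an empty demo file
grep -qF "$entry" "$hosts" || echo "$entry" >> "$hosts"
grep -qF "$entry" "$hosts" || echo "$entry" >> "$hosts"   # second run is a no-op

grep -c "nginx-test.local" "$hosts"                       # prints 1: added exactly once
```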
Then access via browser:
@@ -473,8 +477,9 @@ Then access via browser:
Or test with curl:

```bash
# Test with any cluster node IP (master or worker)
curl -H "Host: nginx-test.local" http://192.168.30.101
curl -H "Host: nginx-test.local" http://192.168.30.102
```

### Scale the Deployment
@@ -624,7 +629,7 @@ ansible-playbook site.yml --tags k3s-server --limit <failed-master>
### Demoting a Master to Worker

To remove a master from the control plane and make it a worker (note: this reduces HA from 3 masters to 2):

1. Edit `inventory/hosts.ini`:
@@ -638,6 +643,8 @@ To remove a master from control-plane and make it a worker:
cm4-04 ansible_host=192.168.30.104 ansible_user=pi
```

**Warning**: This leaves the cluster with only 2 master nodes. A 2-member etcd cluster needs both members for quorum, so if either master fails the cluster loses quorum entirely.

2. Drain the node:

```bash
@@ -690,7 +697,7 @@ To update to a specific k3s version:
```ini
[k3s_cluster:vars]
k3s_version=v1.36.0+k3s1
```

1. Run the k3s playbook to update all nodes:
@@ -711,7 +718,7 @@ For more control, you can manually update k3s on individual nodes:
ssh pi@<node-ip>

# Download and install specific version
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.36.0+k3s1 sh -

# Restart k3s
sudo systemctl restart k3s # On master
@@ -775,7 +782,7 @@ If an update causes issues, you can rollback to a previous version:
```bash
# Update inventory with previous version
# [k3s_cluster:vars]
# k3s_version=v1.35.0+k3s1

# Re-run the playbook
ansible-playbook site.yml --tags k3s-server,k3s-agent
@@ -814,7 +821,7 @@ ansible-playbook reboot.yml --limit master
### Reboot a Specific Node

```bash
ansible-playbook reboot.yml --limit cm4-04
```

## Troubleshooting
@@ -1001,26 +1008,33 @@ ansible-playbook site.yml --tags compute-blade-agent
## External DNS Configuration

To use external domains (like `test.zlor.fi`) with your k3s cluster ingress, you need to configure DNS. Your cluster uses a Virtual IP (192.168.30.100) via MikroTik for high availability.

### Step 1: Configure DNS Server Records

On your DNS server, add **A records** pointing to your k3s cluster nodes:

#### Option A: Virtual IP (VIP) via MikroTik - Recommended for HA

Use your MikroTik router's Virtual IP (192.168.30.100) for high availability:

```dns
test.zlor.fi A 192.168.30.100
```

**Pros:**

- Single IP for entire cluster
- Hardware-based failover (more reliable)
- Better performance
- No additional software needed
- Automatically routes to available masters

See [MIKROTIK-VIP-SETUP-CUSTOM.md](MIKROTIK-VIP-SETUP-CUSTOM.md) for detailed setup instructions.

#### Option B: Multiple Records (Load Balanced)

If your DNS supports multiple A records, point to all cluster nodes:

```dns
test.zlor.fi A 192.168.30.101
@@ -1029,32 +1043,19 @@ test.zlor.fi A 192.168.30.103
test.zlor.fi A 192.168.30.104
```

**Pros:** Load balanced, automatic failover
**Cons:** Requires DNS server support for multiple A records
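Round-robin means successive lookups rotate through the record set, so clients spread across the nodes. A sketch of the client-side effect (pure shell, no resolver involved; the record list matches the cluster above):

```shell
records="192.168.30.101 192.168.30.102 192.168.30.103 192.168.30.104"
i=0
for req in 1 2 3 4 5; do
  set -- $records            # reload the record list
  shift $(( i % 4 ))         # rotate: start one record later each time
  echo "request $req -> $1"
  i=$(( i + 1 ))
done
```

Request 5 wraps back around to 192.168.30.101.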
#### Option C: Single Master Node (No Failover)

For simple setups without redundancy:

```dns
test.zlor.fi A 192.168.30.101
```

**Pros:** Simple, works with any DNS server
**Cons:** No failover if that node is down (not recommended for HA clusters)

### Step 2: Configure Cluster Nodes for External DNS