Add Virtual IP (VIP) solution for single DNS record with failover

Create vip-setup.yml Ansible playbook for Keepalived-based VIP configuration
- Automatic failover between cluster nodes using VRRP protocol
- Health checks for API server availability
- Single IP address can be used in DNS instead of multiple A records
- Master node holds VIP by default, workers act as backups

Update README.md with comprehensive VIP documentation:
- Add three DNS options (single record, multiple records, VIP)
- Detailed VIP installation and verification steps
- Monitoring and failover testing procedures
- Troubleshooting guide for common VIP issues
- Instructions for disabling VIP if needed

Benefits:
- Single DNS A record pointing to VIP (192.168.30.100)
- Automatic failover with no manual intervention
- Redundancy across all nodes (any node can take over the VIP)
- Transparent to applications

Fix markdown linting issues:
- Add proper blank lines around lists and code blocks
- Use consistent ordered list numbering (all 1.)
- Remove duplicate/extra blank lines
- Ensure proper spacing around headings

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-08 16:44:06 +01:00
parent 0434039b80
commit 6049509c5d
2 changed files with 388 additions and 3 deletions

README.md

@@ -850,13 +850,20 @@ To use external domains (like `test.zlor.fi`) with your k3s cluster ingress, you
On your DNS server, add **A records** pointing to your k3s cluster nodes:
#### Option A: Single Record (Master Node Only) - Simplest
If your DNS only allows one A record:
```dns
test.zlor.fi A 192.168.30.101
```
**Pros:** Simple, works with any DNS server
**Cons:** No failover if master node is down
#### Option B: Multiple Records (Load Balanced) - Best Redundancy
If your DNS supports multiple A records:
```dns
test.zlor.fi A 192.168.30.101
@@ -865,7 +872,23 @@ test.zlor.fi A 192.168.30.103
test.zlor.fi A 192.168.30.104
```
DNS clients will distribute requests across all nodes (round-robin).
**Pros:** Load balanced, automatic failover
**Cons:** Requires DNS server support for multiple A records
#### Option C: Virtual IP (VIP) with Keepalived - Best of Both Worlds
If your DNS only allows one A record but you want redundancy:
```dns
test.zlor.fi A 192.168.30.100
```
Set up a virtual IP that automatically floats between nodes. See "Virtual IP Setup" below for detailed instructions.
**Pros:** Single DNS record, automatic failover
**Cons:** Requires additional setup with Keepalived; only the node holding the VIP serves traffic at any one time
### Step 2: Configure Cluster Nodes for External DNS
@@ -1089,6 +1112,209 @@ spec:
kubectl apply -f manifests/nginx-test-deployment.yaml
```
## Virtual IP Setup (Option C)
If your DNS server only allows a single A record but you want high availability across all nodes, use a Virtual IP (VIP) with Keepalived.
### How It Works
- A virtual IP (192.168.30.100) floats between cluster nodes using VRRP protocol
- The master node holds the VIP by default
- If the master fails, a worker node automatically takes over
- All traffic reaches the cluster through a single IP address
- Clients experience automatic failover with minimal downtime
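The takeover decision is VRRP's, but you can check which node currently holds the VIP by looking for it in `ip addr` output. A minimal sketch (the `holds_vip` helper and the sample output are illustrative):

```shell
#!/usr/bin/env bash
# Sketch: does this node currently hold the VIP?
VIP="192.168.30.100"

holds_vip() {
  # $1: output of `ip addr show` on the node in question
  grep -q "inet ${VIP}/" <<<"$1"
}

# Example with captured output (on a real node: out="$(ip addr show eth0)")
out="    inet 192.168.30.100/24 scope global eth0"
holds_vip "$out" && echo "this node holds the VIP"
```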
### Prerequisites
- All nodes must be on the same network segment
- Network must support ARP protocol (standard on most networks)
- No other services should use 192.168.30.100
### Installation
#### Step 1: Update Your VIP Address
Edit `vip-setup.yml` and change the VIP to an unused IP on your network:
```yaml
vars:
vip_address: "192.168.30.100" # Change this to your desired VIP
vip_interface: "eth0" # Change if your interface is different
```
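Note that the playbook's pre_tasks override `vip_interface` with whatever interface carries the default route, using `ip route | grep default | awk '{print $5}'`. A quick sketch of that extraction, run here against a sample route line rather than a live node:

```shell
#!/usr/bin/env bash
# How the playbook derives the interface name from the default route.
# On a real node: route_line="$(ip route | grep default | head -1)"
route_line="default via 192.168.30.1 dev eth0 proto dhcp metric 100"
awk '{print $5}' <<<"$route_line"   # prints: eth0
```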
#### Step 2: Run the VIP Setup Playbook
```bash
ansible-playbook vip-setup.yml
```
This will:
- Install Keepalived on all nodes
- Configure VRRP with master on cm4-01 and backup on workers
- Set up health checks for automatic failover
- Enable the virtual IP
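For reference, the master-node `/etc/keepalived/keepalived.conf` the playbook renders looks roughly like this (shown with the default VIP and `eth0`; see `vip-setup.yml` for the exact template). The worker variant differs only in `state BACKUP` and `priority 50`, so the master's higher priority wins the VRRP election as long as its health check passes:

```
global_defs {
    router_id K3S_MASTER
    script_user root
    enable_script_security
}

vrrp_script check_apiserver {
    script "/usr/local/bin/check_apiserver.sh"
    interval 3
    weight -60    # drop below worker priority (50) on check failure
    fall 5
    rise 2
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    virtual_ipaddress {
        192.168.30.100/24
    }
    track_script {
        check_apiserver
    }
}
```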
#### Step 3: Verify VIP is Active
Check that the VIP is assigned to the master node:
```bash
# From your local machine
ping 192.168.30.100
# From any cluster node
ssh pi@192.168.30.101
ip addr show
# Look for your VIP address in the output
```
#### Step 4: Update DNS Records
Now you can use just one A record pointing to the VIP:
```dns
test.zlor.fi A 192.168.30.100
```
#### Step 5: Ingress (No Changes Required)
For reference, the existing ingress manifest looks like this:
```yaml
spec:
rules:
- host: test.zlor.fi
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: nginx-test
port:
number: 80
```
It needs no change for the VIP: the ingress routes by hostname, so traffic reaches the cluster through any node IP, including the VIP.
### Monitoring the VIP
Check VIP status and failover behavior:
```bash
# View Keepalived status
ssh pi@192.168.30.101
systemctl status keepalived
# Watch VIP transitions (open in separate terminal)
watch 'ip addr show | grep 192.168.30.100'
# View Keepalived logs
sudo journalctl -u keepalived -f
# Check health check script
sudo cat /usr/local/bin/check_apiserver.sh
```
### Testing Failover
To test automatic failover:
1. Note which node has the VIP:
```bash
for ip in 192.168.30.{101..104}; do
echo "=== $ip ==="
ssh pi@$ip "ip addr show | grep 192.168.30.100" 2>/dev/null || echo "Not on this node"
done
```
1. SSH into the node holding the VIP and stop keepalived:
```bash
ssh pi@192.168.30.101 # or whichever node has the VIP
sudo systemctl stop keepalived
```
1. Watch the VIP migrate to another node:
```bash
# From another terminal, watch the migration
ping 192.168.30.100 -c 5
# Connection may drop briefly, then resume on new node
```
1. Restart keepalived on the original node:
```bash
sudo systemctl start keepalived
```
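To put a number on the failover gap, the ping summary from step 3 can be parsed for lost packets; at ping's default 1-second interval, each lost reply is roughly a second of downtime. A small sketch (`count_lost` is an illustrative helper, fed a sample summary line):

```shell
#!/usr/bin/env bash
# Count lost pings in `ping` summary output to estimate failover downtime.
count_lost() {
  # $1: full output of `ping -c N <VIP>`
  awk -F'[ ,]+' '/packets transmitted/ {print $1 - $4}' <<<"$1"
}

# Example with captured output (on a real run: out="$(ping -c 5 192.168.30.100)")
out="5 packets transmitted, 3 received, 40% packet loss, time 4005ms"
count_lost "$out"   # prints: 2
```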
### Troubleshooting VIP
#### VIP is not appearing on any node
Check if Keepalived is running:
```bash
ssh pi@192.168.30.101
sudo systemctl status keepalived
sudo journalctl -u keepalived -n 20
```
Verify the interface name:
```bash
ip route | grep default # Should show your interface name
```
Update `vip_interface` in `vip-setup.yml` if needed and re-run.
#### VIP keeps switching between nodes
This indicates the health check is failing. Verify:
```bash
# Check if API server is responding
curl -k https://127.0.0.1:6443/healthz
# Check the health check script
cat /usr/local/bin/check_apiserver.sh
sudo bash /usr/local/bin/check_apiserver.sh
```
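If the checks fail because a hung API server makes `curl` stall rather than return an error, a variant of the health check with a hard timeout may help (a sketch, not the script the playbook installs; the `CHECK_URL` override is illustrative):

```shell
#!/bin/bash
# Health-check variant with an explicit timeout: a hung API server then
# fails the check within 2 seconds instead of stalling the VRRP track script.
URL="${CHECK_URL:-https://127.0.0.1:6443/healthz}"
curl -sfk --max-time 2 "$URL" > /dev/null 2>&1
```

Swap it into `/usr/local/bin/check_apiserver.sh` on each node if you adopt it.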
#### DNS resolves but connections time out
Verify all nodes have the VIP configured:
```bash
for ip in 192.168.30.{101..104}; do
echo "=== $ip ==="
ssh pi@$ip "ip addr show | grep 192.168.30.100"
done
```
Test direct connectivity to the VIP from each node:
```bash
ssh pi@192.168.30.101
curl -H "Host: test.zlor.fi" http://192.168.30.100
```
### Disabling VIP
If you no longer need the VIP:
```bash
# Stop Keepalived on all nodes
ansible all -m systemd -a "name=keepalived state=stopped enabled=no" --become
# Remove configuration
ansible all -m file -a "path=/etc/keepalived/keepalived.conf state=absent" --become
```
## Uninstall
To completely remove k3s from all nodes:

vip-setup.yml (new file)

@@ -0,0 +1,159 @@
---
- name: Configure Virtual IP (VIP) with Keepalived for k3s cluster
hosts: all
become: yes
vars:
vip_address: "192.168.30.100" # Change this to your desired VIP
vip_netmask: "255.255.255.0"
vip_interface: "eth0" # Change if your interface is different
cluster_nodes:
- 192.168.30.101
- 192.168.30.102
- 192.168.30.103
- 192.168.30.104
pre_tasks:
- name: Determine interface name
shell: |
ip route | grep default | awk '{print $5}' | head -1
register: default_interface
changed_when: false
- name: Set interface fact
set_fact:
vip_interface: "{{ default_interface.stdout }}"
tasks:
- name: Install Keepalived
apt:
name: keepalived
state: present
update_cache: yes
- name: Configure Keepalived on master node
copy:
content: |
global_defs {
router_id K3S_MASTER
script_user root
enable_script_security
}
vrrp_script check_apiserver {
script "/usr/local/bin/check_apiserver.sh"
interval 3
weight -60    # drop below worker priority (50) on check failure
fall 5
rise 2
}
vrrp_instance VI_1 {
state MASTER
interface {{ vip_interface }}
virtual_router_id 51
priority 100
advert_int 1
virtual_ipaddress {
{{ vip_address }}/24
}
track_script {
check_apiserver
}
}
dest: /etc/keepalived/keepalived.conf
owner: root
group: root
mode: '0644'
notify: Restart Keepalived
when: inventory_hostname in groups['master']
- name: Configure Keepalived on worker nodes
copy:
content: |
global_defs {
router_id K3S_WORKER
script_user root
enable_script_security
}
vrrp_script check_apiserver {
script "/usr/local/bin/check_apiserver.sh"
interval 3
weight -60    # keep templates consistent; drops a failing backup out of contention
fall 5
rise 2
}
vrrp_instance VI_1 {
state BACKUP
interface {{ vip_interface }}
virtual_router_id 51
priority 50
advert_int 1
virtual_ipaddress {
{{ vip_address }}/24
}
track_script {
check_apiserver
}
}
dest: /etc/keepalived/keepalived.conf
owner: root
group: root
mode: '0644'
notify: Restart Keepalived
when: inventory_hostname not in groups['master']
- name: Create API server health check script
copy:
content: |
#!/bin/bash
# Check if the k3s API server is responding (-k: k3s serves a self-signed certificate)
curl -sfk --max-time 2 https://127.0.0.1:6443/healthz > /dev/null 2>&1
exit $?
dest: /usr/local/bin/check_apiserver.sh
owner: root
group: root
mode: '0755'
- name: Enable and start Keepalived
systemd:
name: keepalived
enabled: yes
state: started
daemon_reload: yes
- name: Verify VIP is assigned
shell: |
ip addr show {{ vip_interface }} | grep {{ vip_address }}
register: vip_check
retries: 3
delay: 2
until: vip_check is succeeded
changed_when: false
when: inventory_hostname in groups['master']
- name: Display VIP configuration
debug:
msg: |
Virtual IP configured successfully!
VIP Address: {{ vip_address }}
Interface: {{ vip_interface }}
Use this IP for your DNS records:
test.zlor.fi A {{ vip_address }}
The VIP will automatically failover to a worker node
if the master node becomes unavailable.
handlers:
- name: Restart Keepalived
systemd:
name: keepalived
state: restarted
daemon_reload: yes