Add Virtual IP (VIP) solution for single DNS record with failover

Create vip-setup.yml Ansible playbook for Keepalived-based VIP configuration
- Automatic failover between cluster nodes using VRRP protocol
- Health checks for API server availability
- Single IP address can be used in DNS instead of multiple A records
- Master node holds VIP by default, workers act as backups

Update README.md with comprehensive VIP documentation:
- Add three DNS options (single record, multiple records, VIP)
- Detailed VIP installation and verification steps
- Monitoring and failover testing procedures
- Troubleshooting guide for common VIP issues
- Instructions for disabling VIP if needed

Benefits:
- Single DNS A record pointing to VIP (192.168.30.100)
- Automatic failover with no manual intervention
- Redundancy across all nodes (any node can take over the VIP)
- Transparent to applications

Fix markdown linting issues:
- Add proper blank lines around lists and code blocks
- Use consistent ordered list numbering (all 1.)
- Remove duplicate/extra blank lines
- Ensure proper spacing around headings

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-08 16:44:06 +01:00
parent 0434039b80
commit 6049509c5d
2 changed files with 388 additions and 3 deletions

README.md

@@ -850,13 +850,20 @@ To use external domains (like `test.zlor.fi`) with your k3s cluster ingress, you
On your DNS server, add **A records** pointing to your k3s cluster nodes:
#### Option A: Single Record (Master Node Only) - Simplest
If your DNS only allows one A record:
```dns
test.zlor.fi A 192.168.30.101
```
**Pros:** Simple, works with any DNS server
**Cons:** No failover if master node is down
#### Option B: Multiple Records (Load Balanced) - Best Redundancy
If your DNS supports multiple A records:
```dns
test.zlor.fi A 192.168.30.101
@@ -865,7 +872,23 @@ test.zlor.fi A 192.168.30.103
test.zlor.fi A 192.168.30.104
```
DNS clients will distribute requests across all nodes (round-robin).
**Pros:** Load balanced, automatic failover
**Cons:** Requires DNS server support for multiple A records
#### Option C: Virtual IP (VIP) with Keepalived - Best of Both Worlds
If your DNS only allows one A record but you want redundancy:
```dns
test.zlor.fi A 192.168.30.100
```
Set up a virtual IP that automatically floats between nodes. See "Virtual IP Setup" below for detailed instructions.
**Pros:** Single DNS record, automatic failover
**Cons:** Requires additional setup with Keepalived; only the node holding the VIP serves traffic at any one time
### Step 2: Configure Cluster Nodes for External DNS
@@ -1089,6 +1112,209 @@ spec:
kubectl apply -f manifests/nginx-test-deployment.yaml
```
## Virtual IP Setup (Option C)
If your DNS server only allows a single A record but you want high availability across all nodes, use a Virtual IP (VIP) with Keepalived.
### How It Works
- A virtual IP (192.168.30.100) floats between cluster nodes using VRRP protocol
- The master node holds the VIP by default
- If the master fails, a worker node automatically takes over
- All traffic reaches the cluster through a single IP address
- Clients experience automatic failover with minimal downtime
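The takeover decision is VRRP's, but you can check which node currently holds the VIP by looking for it in `ip addr` output. A minimal sketch (the `holds_vip` helper and the sample output are illustrative):

```shell
#!/usr/bin/env bash
# Sketch: does this node currently hold the VIP?
VIP="192.168.30.100"

holds_vip() {
  # $1: output of `ip addr show` on the node in question
  grep -q "inet ${VIP}/" <<<"$1"
}

# Example with captured output (on a real node: out="$(ip addr show eth0)")
out="    inet 192.168.30.100/24 scope global eth0"
holds_vip "$out" && echo "this node holds the VIP"
```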
### Prerequisites
- All nodes must be on the same network segment
- Network must support ARP protocol (standard on most networks)
- No other services should use 192.168.30.100
### Installation
#### Step 1: Update Your VIP Address
Edit `vip-setup.yml` and change the VIP to an unused IP on your network:
```yaml
vars:
vip_address: "192.168.30.100" # Change this to your desired VIP
vip_interface: "eth0" # Change if your interface is different
```
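Note that the playbook's pre_tasks override `vip_interface` with whatever interface carries the default route, using `ip route | grep default | awk '{print $5}'`. A quick sketch of that extraction, run here against a sample route line rather than a live node:

```shell
#!/usr/bin/env bash
# How the playbook derives the interface name from the default route.
# On a real node: route_line="$(ip route | grep default | head -1)"
route_line="default via 192.168.30.1 dev eth0 proto dhcp metric 100"
awk '{print $5}' <<<"$route_line"   # prints: eth0
```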
#### Step 2: Run the VIP Setup Playbook
```bash
ansible-playbook vip-setup.yml
```
This will:
- Install Keepalived on all nodes
- Configure VRRP with master on cm4-01 and backup on workers
- Set up health checks for automatic failover
- Enable the virtual IP
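For reference, the master-node `/etc/keepalived/keepalived.conf` the playbook renders looks roughly like this (shown with the default VIP and `eth0`; see `vip-setup.yml` for the exact template). The worker variant differs only in `state BACKUP` and `priority 50`, so the master's higher priority wins the VRRP election as long as its health check passes:

```
global_defs {
    router_id K3S_MASTER
    script_user root
    enable_script_security
}

vrrp_script check_apiserver {
    script "/usr/local/bin/check_apiserver.sh"
    interval 3
    weight -60    # drop below worker priority (50) on check failure
    fall 5
    rise 2
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    virtual_ipaddress {
        192.168.30.100/24
    }
    track_script {
        check_apiserver
    }
}
```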
#### Step 3: Verify VIP is Active
Check that the VIP is assigned to the master node:
```bash
# From your local machine
ping 192.168.30.100
# From any cluster node
ssh pi@192.168.30.101
ip addr show
# Look for your VIP address in the output
```
#### Step 4: Update DNS Records
Now you can use just one A record pointing to the VIP:
```dns
test.zlor.fi A 192.168.30.100
```
#### Step 5: Ingress (No Changes Required)
For reference, the existing ingress manifest looks like this:
```yaml
spec:
rules:
- host: test.zlor.fi
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: nginx-test
port:
number: 80
```
It needs no change for the VIP: the ingress routes by hostname, so traffic reaches the cluster through any node IP, including the VIP.
### Monitoring the VIP
Check VIP status and failover behavior:
```bash
# View Keepalived status
ssh pi@192.168.30.101
systemctl status keepalived
# Watch VIP transitions (open in separate terminal)
watch 'ip addr show | grep 192.168.30.100'
# View Keepalived logs
sudo journalctl -u keepalived -f
# Check health check script
sudo cat /usr/local/bin/check_apiserver.sh
```
### Testing Failover
To test automatic failover:
1. Note which node has the VIP:
```bash
for ip in 192.168.30.{101..104}; do
echo "=== $ip ==="
ssh pi@$ip "ip addr show | grep 192.168.30.100" 2>/dev/null || echo "Not on this node"
done
```
1. SSH into the node holding the VIP and stop keepalived:
```bash
ssh pi@192.168.30.101 # or whichever node has the VIP
sudo systemctl stop keepalived
```
1. Watch the VIP migrate to another node:
```bash
# From another terminal, watch the migration
ping 192.168.30.100 -c 5
# Connection may drop briefly, then resume on new node
```
1. Restart keepalived on the original node:
```bash
sudo systemctl start keepalived
```
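To put a number on the failover gap, the ping summary from step 3 can be parsed for lost packets; at ping's default 1-second interval, each lost reply is roughly a second of downtime. A small sketch (`count_lost` is an illustrative helper, fed a sample summary line):

```shell
#!/usr/bin/env bash
# Count lost pings in `ping` summary output to estimate failover downtime.
count_lost() {
  # $1: full output of `ping -c N <VIP>`
  awk -F'[ ,]+' '/packets transmitted/ {print $1 - $4}' <<<"$1"
}

# Example with captured output (on a real run: out="$(ping -c 5 192.168.30.100)")
out="5 packets transmitted, 3 received, 40% packet loss, time 4005ms"
count_lost "$out"   # prints: 2
```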
### Troubleshooting VIP
#### VIP is not appearing on any node
Check if Keepalived is running:
```bash
ssh pi@192.168.30.101
sudo systemctl status keepalived
sudo journalctl -u keepalived -n 20
```
Verify the interface name:
```bash
ip route | grep default # Should show your interface name
```
Update `vip_interface` in `vip-setup.yml` if needed and re-run.
#### VIP keeps switching between nodes
This indicates the health check is failing. Verify:
```bash
# Check if API server is responding
curl -k https://127.0.0.1:6443/healthz
# Check the health check script
cat /usr/local/bin/check_apiserver.sh
sudo bash /usr/local/bin/check_apiserver.sh
```
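If the checks fail because a hung API server makes `curl` stall rather than return an error, a variant of the health check with a hard timeout may help (a sketch, not the script the playbook installs; the `CHECK_URL` override is illustrative):

```shell
#!/bin/bash
# Health-check variant with an explicit timeout: a hung API server then
# fails the check within 2 seconds instead of stalling the VRRP track script.
URL="${CHECK_URL:-https://127.0.0.1:6443/healthz}"
curl -sfk --max-time 2 "$URL" > /dev/null 2>&1
```

Swap it into `/usr/local/bin/check_apiserver.sh` on each node if you adopt it.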
#### DNS resolves but connections time out
Verify all nodes have the VIP configured:
```bash
for ip in 192.168.30.{101..104}; do
echo "=== $ip ==="
ssh pi@$ip "ip addr show | grep 192.168.30.100"
done
```
Test direct connectivity to the VIP from each node:
```bash
ssh pi@192.168.30.101
curl -H "Host: test.zlor.fi" http://192.168.30.100
```
### Disabling VIP
If you no longer need the VIP:
```bash
# Stop Keepalived on all nodes
ansible all -m systemd -a "name=keepalived state=stopped enabled=no" --become
# Remove configuration
ansible all -m file -a "path=/etc/keepalived/keepalived.conf state=absent" --become
```
## Uninstall
To completely remove k3s from all nodes:

vip-setup.yml (new file)

@@ -0,0 +1,159 @@
---
- name: Configure Virtual IP (VIP) with Keepalived for k3s cluster
hosts: all
become: yes
vars:
vip_address: "192.168.30.100" # Change this to your desired VIP
vip_netmask: "255.255.255.0"
vip_interface: "eth0" # Change if your interface is different
cluster_nodes:
- 192.168.30.101
- 192.168.30.102
- 192.168.30.103
- 192.168.30.104
pre_tasks:
- name: Determine interface name
shell: |
ip route | grep default | awk '{print $5}' | head -1
register: default_interface
changed_when: false
- name: Set interface fact
set_fact:
vip_interface: "{{ default_interface.stdout }}"
tasks:
- name: Install Keepalived
apt:
name: keepalived
state: present
update_cache: yes
- name: Configure Keepalived on master node
copy:
content: |
global_defs {
router_id K3S_MASTER
script_user root
enable_script_security
}
vrrp_script check_apiserver {
script "/usr/local/bin/check_apiserver.sh"
interval 3
weight -60    # drop below worker priority (50) on check failure
fall 5
rise 2
}
vrrp_instance VI_1 {
state MASTER
interface {{ vip_interface }}
virtual_router_id 51
priority 100
advert_int 1
virtual_ipaddress {
{{ vip_address }}/24
}
track_script {
check_apiserver
}
}
dest: /etc/keepalived/keepalived.conf
owner: root
group: root
mode: '0644'
notify: Restart Keepalived
when: inventory_hostname in groups['master']
- name: Configure Keepalived on worker nodes
copy:
content: |
global_defs {
router_id K3S_WORKER
script_user root
enable_script_security
}
vrrp_script check_apiserver {
script "/usr/local/bin/check_apiserver.sh"
interval 3
weight -60    # keep templates consistent; drops a failing backup out of contention
fall 5
rise 2
}
vrrp_instance VI_1 {
state BACKUP
interface {{ vip_interface }}
virtual_router_id 51
priority 50
advert_int 1
virtual_ipaddress {
{{ vip_address }}/24
}
track_script {
check_apiserver
}
}
dest: /etc/keepalived/keepalived.conf
owner: root
group: root
mode: '0644'
notify: Restart Keepalived
when: inventory_hostname not in groups['master']
- name: Create API server health check script
copy:
content: |
#!/bin/bash
# Check if the k3s API server is responding (-k: k3s serves a self-signed certificate)
curl -sfk --max-time 2 https://127.0.0.1:6443/healthz > /dev/null 2>&1
exit $?
dest: /usr/local/bin/check_apiserver.sh
owner: root
group: root
mode: '0755'
- name: Enable and start Keepalived
systemd:
name: keepalived
enabled: yes
state: started
daemon_reload: yes
- name: Verify VIP is assigned
shell: |
ip addr show {{ vip_interface }} | grep {{ vip_address }}
register: vip_check
retries: 3
delay: 2
until: vip_check is succeeded
changed_when: false
when: inventory_hostname in groups['master']
- name: Display VIP configuration
debug:
msg: |
Virtual IP configured successfully!
VIP Address: {{ vip_address }}
Interface: {{ vip_interface }}
Use this IP for your DNS records:
test.zlor.fi A {{ vip_address }}
The VIP will automatically failover to a worker node
if the master node becomes unavailable.
handlers:
- name: Restart Keepalived
systemd:
name: keepalived
state: restarted
daemon_reload: yes