k3s-ansible/COMPUTE_BLADE_AGENT.md
Michael Skrynski fd7c9239b5 Update docs and roles for agent on all nodes
- Switch compute-blade-agent deployment from workers to all nodes
  (control-plane and workers)
- Use /usr/bin/compute-blade-agent instead of /usr/local/bin
- Update verification scripts to reference /usr/bin/compute-blade-agent
- Update docs to refer to all nodes across Deployment Guide, Checklist,
  and Getting Started
- Change site.yml to install on all hosts instead of just workers
- Align example commands to the all-nodes workflow
2026-01-12 08:54:41 +01:00

Compute Blade Agent Deployment Guide

Quick reference for deploying and managing the Compute Blade Agent on all nodes in your k3s-ansible cluster (control-plane and worker nodes).

Quick Start

Deploy Everything

ansible-playbook site.yml

Deploy Only Compute Blade Agent

ansible-playbook site.yml --tags compute-blade-agent

Skip Compute Blade Agent

ansible-playbook site.yml --skip-tags compute-blade-agent
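To roll the agent out to a subset of nodes, the tag can be combined with Ansible's standard `--limit` option. The following is a minimal sketch; the function name `deploy_agent_to` is hypothetical, and it assumes you run it from the repository root with the host names used in your inventory.

```shell
# Deploy only the compute-blade-agent role to the hosts given as arguments.
# deploy_agent_to is an illustrative helper name, not part of the repository.
deploy_agent_to() {
  local hosts
  # Join the host names with commas, which --limit accepts for multiple hosts.
  hosts=$(IFS=,; echo "$*")
  ansible-playbook site.yml --tags compute-blade-agent --limit "$hosts"
}
```

For example, `deploy_agent_to cm4-02 cm4-03` runs the tagged tasks against those two hosts only.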

Configuration

Enable/Disable Globally

Edit inventory/hosts.ini:

[k3s_cluster:vars]
# Set to false to disable
enable_compute_blade_agent=true

Enable/Disable Per-Node

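Set the variable on individual host lines in inventory/hosts.ini. A host-level value overrides the global setting; hosts without one (cm4-04 below) inherit the global value: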

[worker]
cm4-02 ansible_host=192.168.30.102 ansible_user=pi enable_compute_blade_agent=true
cm4-03 ansible_host=192.168.30.103 ansible_user=pi enable_compute_blade_agent=false
cm4-04 ansible_host=192.168.30.104 ansible_user=pi
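To confirm which value a host actually ends up with after group and host variables are merged, `ansible-inventory --host` can print the resolved variables. This is a sketch under the assumption that it runs from the repository root; the helper name `show_agent_flag` is hypothetical, and values defined in an INI inventory may be printed as quoted strings.

```shell
# Print the effective enable_compute_blade_agent value for each host given.
# show_agent_flag is an illustrative helper name, not part of the repository.
show_agent_flag() {
  local host
  for host in "$@"; do
    printf '%s: ' "$host"
    # ansible-inventory --host prints the merged variables of one host as JSON;
    # grep -o keeps only the key/value pair we care about.
    ansible-inventory -i inventory/hosts.ini --host "$host" \
      | grep -o '"enable_compute_blade_agent": *[^,}]*' \
      || echo "(not set)"
  done
}
```

Calling `show_agent_flag cm4-02 cm4-03 cm4-04` prints one line per host, which makes an unintended override easy to spot before running the playbook.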

Verification

Check Service Status

ssh pi@<node-ip>
sudo systemctl status compute-blade-agent

View Logs

ssh pi@<node-ip>
sudo journalctl -u compute-blade-agent -f

Check Installation

ssh pi@<node-ip>
/usr/bin/compute-blade-agent --version
ls -la /etc/compute-blade-agent/
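Because the agent now runs on every node, it can be convenient to check all of them in one pass instead of logging in to each. The sketch below assumes key-based SSH access as user `pi`, as in the commands above; the function name `check_agents` is hypothetical.

```shell
# Report whether compute-blade-agent is active on each node given as an argument.
# check_agents is an illustrative helper name, not part of the repository.
check_agents() {
  local rc=0 node state
  for node in "$@"; do
    # systemctl is-active prints the unit state and exits non-zero
    # unless the state is "active"; no sudo is needed for this query.
    if state=$(ssh "pi@$node" systemctl is-active compute-blade-agent); then
      echo "$node: $state"
    else
      # An empty state means ssh itself failed before systemctl could answer.
      echo "$node: ${state:-unreachable}"
      rc=1
    fi
  done
  return "$rc"
}
```

For example, `check_agents 192.168.30.102 192.168.30.103` prints one status line per node and returns non-zero if any node is not active.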

File Locations

  • Binary: /usr/bin/compute-blade-agent
  • Config: /etc/compute-blade-agent/config.yaml
  • Systemd Service: /etc/systemd/system/compute-blade-agent.service
  • Logs: journalctl -u compute-blade-agent

Environment Variables

Configure the agent through BLADE_-prefixed environment variables, for example when running the binary by hand:

export BLADE_CONFIG_PATH=/etc/compute-blade-agent/config.yaml
/usr/bin/compute-blade-agent
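An `export` in an interactive shell only affects processes started from that shell; the systemd-managed service does not see it. One way to pass the same variable to the service is a systemd drop-in, sketched below. The helper name `write_blade_env_dropin` and the drop-in file name `override.conf` are illustrative choices, and the sketch assumes the unit is named `compute-blade-agent.service` as listed under File Locations.

```shell
# Write a systemd drop-in that sets BLADE_CONFIG_PATH for the service.
# $1 is the systemd unit directory; it defaults to /etc/systemd/system.
# write_blade_env_dropin is an illustrative helper name, not part of the repository.
write_blade_env_dropin() {
  local unit_dir="${1:-/etc/systemd/system}"
  local dropin_dir="$unit_dir/compute-blade-agent.service.d"
  mkdir -p "$dropin_dir"
  # Environment= in a [Service] section is applied when systemd starts the unit.
  cat > "$dropin_dir/override.conf" <<'EOF'
[Service]
Environment=BLADE_CONFIG_PATH=/etc/compute-blade-agent/config.yaml
EOF
}
```

Run the function in a root shell on the node, then apply it with `systemctl daemon-reload` followed by `systemctl restart compute-blade-agent`.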

Monitoring

Optional Kubernetes Resources

Deploy monitoring components:

kubectl apply -f manifests/compute-blade-agent-daemonset.yaml

This creates:

  • Namespace: compute-blade-agent
  • ConfigMap: compute-blade-agent-config
  • DaemonSet: compute-blade-agent-exporter
  • Service: compute-blade-agent-metrics
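After applying the manifest, a quick check that the listed objects exist and that the DaemonSet has finished rolling out might look like the sketch below. The function name `verify_blade_monitoring` and the 120-second timeout are illustrative assumptions; the resource names are taken from the list above.

```shell
# Check that the objects created by the manifest exist and the DaemonSet is ready.
# verify_blade_monitoring is an illustrative helper name, not part of the repository.
verify_blade_monitoring() {
  local ns=compute-blade-agent
  # kubectl get fails if any of the named objects is missing.
  kubectl -n "$ns" get \
    configmap/compute-blade-agent-config \
    daemonset/compute-blade-agent-exporter \
    service/compute-blade-agent-metrics || return 1
  # rollout status waits until all DaemonSet pods are ready or the timeout expires.
  kubectl -n "$ns" rollout status daemonset/compute-blade-agent-exporter --timeout=120s
}
```

A non-zero return value means either an object is missing or the rollout did not complete within the timeout.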

Prometheus Integration

To enable Prometheus scraping, uncomment the ServiceMonitor in the manifest (the ServiceMonitor kind requires the Prometheus Operator CRDs to be installed in the cluster):

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: compute-blade-agent
  namespace: compute-blade-agent
spec:
  selector:
    matchLabels:
      app: compute-blade-agent
  endpoints:
    - port: metrics
      interval: 30s
      path: /metrics

Troubleshooting

Service Not Running

# Check service status
sudo systemctl status compute-blade-agent

# View error logs
sudo journalctl -u compute-blade-agent -n 50

# Try restarting
sudo systemctl restart compute-blade-agent

Hardware Not Detected

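  1. Verify that the hardware is physically connected
  2. Check the logs for hardware errors: sudo journalctl -u compute-blade-agent -n 50
  3. Confirm that the systemd service started correctly: sudo systemctl status compute-blade-agent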

Reinstall

# SSH to node
ssh pi@<node-ip>

# Check if uninstall script exists
ls -la /usr/local/bin/*compute-blade*

# Run uninstall if it exists
sudo bash /usr/local/bin/k3s-uninstall-compute-blade-agent.sh 2>/dev/null || true

# Re-run Ansible (from the control machine, not the node)
ansible-playbook site.yml --tags compute-blade-agent

Uninstall

From Single Node

ssh pi@<node-ip>
sudo bash /usr/local/bin/k3s-uninstall-compute-blade-agent.sh

From All Nodes

ansible k3s_cluster -m shell -a "bash /usr/local/bin/k3s-uninstall-compute-blade-agent.sh" --become

Features

  • Hardware Monitoring: Temperature, fan speed, button events
  • Automatic Scaling: Critical mode on overheat (max fan + red LED)
  • Identification: LED blinking for locating blades in racks
  • Metrics Export: Prometheus-compatible metrics
  • CLI Tool: bladectl for local/remote interaction

Files in This Repository

  • roles/compute-blade-agent/tasks/main.yml - Installation role
  • manifests/compute-blade-agent-daemonset.yaml - Kubernetes monitoring resources
  • site.yml - Playbook that runs the role
  • inventory/hosts.ini - Configuration variables
  • README.md - Full documentation

Additional Resources