Compute Blade Agent Deployment Guide
Quick reference for deploying and managing the Compute Blade Agent on all nodes in your k3s-ansible cluster (control-plane and worker nodes).
Quick Start
Deploy Everything
ansible-playbook site.yml
Deploy Only Compute Blade Agent
ansible-playbook site.yml --tags compute-blade-agent
Skip Compute Blade Agent
ansible-playbook site.yml --skip-tags compute-blade-agent
Configuration
Enable/Disable Globally
Edit inventory/hosts.ini:
[k3s_cluster:vars]
# Set to false to disable
enable_compute_blade_agent=true
Enable/Disable Per-Node
Edit inventory/hosts.ini:
[worker]
cm4-02 ansible_host=192.168.30.102 ansible_user=pi enable_compute_blade_agent=true
cm4-03 ansible_host=192.168.30.103 ansible_user=pi enable_compute_blade_agent=false
cm4-04 ansible_host=192.168.30.104 ansible_user=pi
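Ansible gives host-level variables precedence over group variables, so a host that sets nothing (cm4-04 here) inherits the [k3s_cluster:vars] value, while a host line that sets the flag overrides it. To audit these overrides, the sketch below lists every host line that sets enable_compute_blade_agent explicitly. The function name list_agent_overrides is made up for this example, and it assumes the flat INI layout shown here (one host per line, space-separated key=value pairs).

```shell
# list_agent_overrides INVENTORY
# Prints "<host>: <value>" for each host line that sets enable_compute_blade_agent.
# Hypothetical helper; assumes a flat INI inventory like the one in this guide.
list_agent_overrides() {
    awk '
        /^[ \t]*[#;]/ { next }   # skip comment lines
        /^[ \t]*\[/   { next }   # skip section headers such as [worker]
        {
            # Start at field 2 so that "key=value" lines in :vars sections are ignored
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^enable_compute_blade_agent=/) {
                    split($i, kv, "=")
                    print $1 ": " kv[2]
                }
            }
        }
    ' "$1"
}

# Example:
#   list_agent_overrides inventory/hosts.ini
```

Hosts that do not appear in the output fall back to the global setting.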
Verification
Check Service Status
ssh pi@<node-ip>
sudo systemctl status compute-blade-agent
View Logs
ssh pi@<node-ip>
sudo journalctl -u compute-blade-agent -f
Check Installation
ssh pi@<node-ip>
/usr/bin/compute-blade-agent --version
ls -la /etc/compute-blade-agent/
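Because the agent runs on every node, checking hosts one at a time gets tedious. The loop below is a minimal sketch that asks each node whether the service is active. It assumes the pi user from the inventory examples and key-based SSH access. check_agent_on_nodes and the SSH_CMD override are hypothetical names introduced here, not part of the repository.

```shell
# check_agent_on_nodes HOST...
# Prints one status line per host; returns non-zero if any host is not active.
# SSH_CMD (hypothetical, defaults to "ssh") can be overridden for a dry run.
check_agent_on_nodes() {
    failed=0
    for node in "$@"; do
        # "systemctl is-active --quiet" exits 0 only when the unit is active.
        # An unreachable host also lands in the "NOT active" branch.
        if ${SSH_CMD:-ssh} "pi@${node}" systemctl is-active --quiet compute-blade-agent; then
            echo "${node}: active"
        else
            echo "${node}: NOT active"
            failed=1
        fi
    done
    return "$failed"
}

# Example, using the addresses from the inventory above:
#   check_agent_on_nodes 192.168.30.102 192.168.30.103 192.168.30.104
```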
File Locations
- Binary: /usr/bin/compute-blade-agent
- Config: /etc/compute-blade-agent/config.yaml
- Systemd Service: /etc/systemd/system/compute-blade-agent.service
- Logs: journalctl -u compute-blade-agent
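The three on-disk paths above can be checked in one pass. This is a small sketch rather than the repository's own verification script. check_blade_files is a hypothetical helper, and its optional root-prefix argument exists only so the function can be tried against a scratch directory.

```shell
# check_blade_files [ROOT]
# Reports whether each documented file exists; returns the number of missing files.
# ROOT is an optional path prefix (empty by default), useful for testing.
check_blade_files() {
    root="${1:-}"
    missing=0
    for path in \
        /usr/bin/compute-blade-agent \
        /etc/compute-blade-agent/config.yaml \
        /etc/systemd/system/compute-blade-agent.service
    do
        if [ -e "${root}${path}" ]; then
            echo "ok: ${path}"
        else
            echo "missing: ${path}"
            missing=$((missing + 1))
        fi
    done
    return "$missing"
}
```

Run it without arguments on a node to check the real paths.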
Environment Variables
Configure via BLADE_ prefixed environment variables:
export BLADE_CONFIG_PATH=/etc/compute-blade-agent/config.yaml
/usr/bin/compute-blade-agent
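An export in an interactive shell only affects an agent started by hand from that shell, as in the two lines above. A unit started by systemd does not inherit it. One common way to pass variables to the service is a systemd drop-in, and the sketch below writes such a file. write_blade_env_dropin is a hypothetical helper, and this guide does not say whether the unit installed by the role already sets its own environment, so inspect the result with systemctl cat compute-blade-agent before relying on it.

```shell
# write_blade_env_dropin DROPIN_DIR NAME=VALUE...
# Writes DROPIN_DIR/override.conf with one Environment= line per NAME=VALUE pair.
# Hypothetical helper; overwrites any existing override.conf in that directory.
write_blade_env_dropin() {
    dropin_dir="$1"
    shift
    mkdir -p "$dropin_dir" || return 1
    {
        printf '[Service]\n'
        for pair in "$@"; do
            printf 'Environment="%s"\n' "$pair"
        done
    } > "${dropin_dir}/override.conf"
}

# Example (as root on a node), followed by a reload and restart:
#   write_blade_env_dropin /etc/systemd/system/compute-blade-agent.service.d \
#       BLADE_CONFIG_PATH=/etc/compute-blade-agent/config.yaml
#   systemctl daemon-reload
#   systemctl restart compute-blade-agent
```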
Monitoring
Optional Kubernetes Resources
Deploy monitoring components:
kubectl apply -f manifests/compute-blade-agent-daemonset.yaml
This creates:
- Namespace: compute-blade-agent
- ConfigMap: compute-blade-agent-config
- DaemonSet: compute-blade-agent-exporter
- Service: compute-blade-agent-metrics
Prometheus Integration
To enable Prometheus scraping, uncomment the ServiceMonitor in the manifest:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: compute-blade-agent
  namespace: compute-blade-agent
spec:
  selector:
    matchLabels:
      app: compute-blade-agent
  endpoints:
    - port: metrics
      interval: 30s
      path: /metrics
Troubleshooting
Service Not Running
# Check service status
sudo systemctl status compute-blade-agent
# View error logs
sudo journalctl -u compute-blade-agent -n 50
# Try restarting
sudo systemctl restart compute-blade-agent
Hardware Not Detected
- Verify hardware is connected
- Check logs for hardware errors
- Ensure systemd service started correctly
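For the log check in the list above, a crude keyword filter can narrow a long journal down to suspicious lines. This is only a sketch: the agent's actual log format is not documented in this guide, so the keyword list (error, fail, fatal, panic) is a guess, and filter_agent_errors is a hypothetical name.

```shell
# filter_agent_errors
# Reads log lines on stdin and prints those containing common failure keywords.
filter_agent_errors() {
    # -i: case-insensitive, -E: extended regex.
    # "|| true" keeps a run with no matches from looking like a failure.
    grep -iE 'error|fail|fatal|panic' || true
}

# Typical use on a node:
#   sudo journalctl -u compute-blade-agent -n 200 --no-pager | filter_agent_errors
```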
Reinstall
# SSH to node
ssh pi@<node-ip>
# Check if uninstall script exists
ls -la /usr/local/bin/*compute-blade*
# Run uninstall if it exists
sudo bash /usr/local/bin/k3s-uninstall-compute-blade-agent.sh 2>/dev/null || true
# Re-run Ansible
ansible-playbook site.yml --tags compute-blade-agent
Uninstall
From Single Node
ssh pi@<node-ip>
sudo bash /usr/local/bin/k3s-uninstall-compute-blade-agent.sh
From All Nodes
ansible k3s_cluster -m shell -a "bash /usr/local/bin/k3s-uninstall-compute-blade-agent.sh" --become
Features
- Hardware Monitoring: Temperature, fan speed, button events
- Automatic Scaling: Critical mode on overheat (max fan + red LED)
- Identification: LED blinking for locating blades in racks
- Metrics Export: Prometheus-compatible metrics
- CLI Tool: bladectl for local/remote interaction
Files in This Repository
- roles/compute-blade-agent/tasks/main.yml - Installation role
- manifests/compute-blade-agent-daemonset.yaml - Kubernetes monitoring resources
- site.yml - Playbook that runs the role
- inventory/hosts.ini - Configuration variables
- README.md - Full documentation