Cedric Kienzler 70541d86ba feat(agent)!: add support for mTLS authentication in gRPC server (#54)
2025-05-12 00:00:55 +02:00

compute-blade-agent

⚠️ Beta Release: This software is currently in beta, and both configurations and APIs may undergo breaking changes. It is not yet 100% feature complete, but it functions as intended.

Quick Start

Install the agent by downloading and running the installer script:

curl -L -o /tmp/compute-blade-agent-installer.sh https://raw.githubusercontent.com/uptime-industries/compute-blade-agent/main/hack/autoinstall.sh
chmod +x /tmp/compute-blade-agent-installer.sh
/tmp/compute-blade-agent-installer.sh

Components

compute-blade-agent: Hardware Interaction & Monitoring

The agent runs as a system service and monitors various hardware states and events:

  • Reacts to button presses and SoC temperature.
  • Automatically enters critical mode (fan 100%, red LED) when overheating.
  • Exposes system metrics via a Prometheus endpoint (/metrics).
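
A quick way to check that the metrics endpoint is reachable could look like the sketch below. It assumes the agent listens on the metrics address :9666 used in the configuration examples and that curl is installed; the helper names metrics_url and fetch_metrics are illustrative and not part of the project.

```shell
# Build the metrics URL. Host and port are assumptions (defaults: localhost, 9666).
metrics_url() {
  printf 'http://%s:%s/metrics\n' "${1:-localhost}" "${2:-9666}"
}

# Fetch the first lines of the Prometheus text output from the agent.
# Not called here, so sourcing this snippet makes no network request.
fetch_metrics() {
  curl -fsS "$(metrics_url "$@")" | head -n 20
}
```

Calling fetch_metrics with no arguments queries the local agent; fetch_metrics blade-pi1.local would query a remote blade, provided its metrics port is reachable from your machine.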

The identify function can be triggered via bladectl or a physical button press. It makes the edge LED blink to help you locate a blade in a rack.

bladectl: User Command-Line Tool

bladectl is a CLI utility for remote or local interaction with the running agent. Example use cases:

bladectl set identify --wait    # Blink LED until button is pressed
bladectl set identify --confirm # Cancel identification
bladectl unset identify         # Cancel identification (alternative)

fanunit.uf2: Smart Fan Unit Firmware

This firmware runs on the fan unit microcontroller and:

  • Controls fan speed via UART commands from blade agents.
  • Reports RPM and airflow temperature back to the blade.
  • Forwards button events (1x = left blade, 2x = right blade).
  • Uses EMC2101 for optional advanced features like airflow-based fan control.

To install it, download fanunit.uf2 and follow the firmware upgrade instructions here.

Installation

Install the agent by downloading and running the installer script:

curl -L -o /tmp/compute-blade-agent-installer.sh https://raw.githubusercontent.com/uptime-industries/compute-blade-agent/main/hack/autoinstall.sh
chmod +x /tmp/compute-blade-agent-installer.sh
/tmp/compute-blade-agent-installer.sh

Note: bladectl requires root privileges when used locally, due to restricted access to the Unix socket (/tmp/compute-blade-agent.sock).
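
If you script around bladectl, a small wrapper can decide whether sudo is needed. This is only a sketch: it assumes the socket path from the note above, and the function names can_use_socket and run_bladectl are hypothetical.

```shell
# Return 0 if the path is a Unix socket the current user can read and write.
can_use_socket() {
  sock="${1:-/tmp/compute-blade-agent.sock}"
  [ -S "$sock" ] && [ -r "$sock" ] && [ -w "$sock" ]
}

# Run bladectl directly when the socket is accessible, otherwise via sudo.
run_bladectl() {
  if can_use_socket; then
    bladectl "$@"
  else
    sudo bladectl "$@"
  fi
}
```

For example, run_bladectl set identify --wait would fall back to sudo on a default installation.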

Configuration

The default configuration file is located at:

/etc/compute-blade-agent/config.yaml

You can also override any config option via environment variables using the BLADE_ prefix.

Examples

YAML:

listen:
  metrics: ":9666"

Environment variable override:

BLADE_LISTEN_METRICS=":1234"
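
Judging from this single example, the variable name appears to be the YAML key path, upper-cased, with dots replaced by underscores and the BLADE_ prefix added. The helper below sketches that mapping; it is an assumption inferred from the example, and blade_env_name is a hypothetical name, not a tool shipped with the agent.

```shell
# Hypothetical helper: derive the override variable name from a dotted config key.
# Assumes the naming rule "BLADE_" + upper-cased key path with "." replaced by "_".
blade_env_name() {
  printf 'BLADE_%s\n' "$(printf '%s' "$1" | tr '.' '_' | tr '[:lower:]' '[:upper:]')"
}

blade_env_name listen.metrics   # prints BLADE_LISTEN_METRICS
```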

Common Overrides

Variable                                          Description
BLADE_STEALTH_MODE=false                          Enable/disable stealth mode
BLADE_FAN_SPEED_PERCENT=80                        Set static fan speed
BLADE_CRITICAL_TEMPERATURE_THRESHOLD=60           Set critical temp threshold (°C)
BLADE_HAL_RPM_REPORTING_STANDARD_FAN_UNIT=false   Disable RPM monitoring for lower CPU use
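
One way to make such overrides persistent is a systemd drop-in file. The sketch below assumes the agent runs as the systemd unit compute-blade-agent.service (this README restarts it with systemctl) and that the standard drop-in directory applies on your system; write_blade_override is a hypothetical helper and the two values are only examples.

```shell
# Write a drop-in that sets environment overrides for the agent service.
# The default directory is an assumption based on the unit name used in this README.
write_blade_override() {
  dropin_dir="${1:-/etc/systemd/system/compute-blade-agent.service.d}"
  mkdir -p "$dropin_dir" || return 1
  cat > "$dropin_dir/override.conf" <<'EOF'
[Service]
Environment=BLADE_STEALTH_MODE=false
Environment=BLADE_CRITICAL_TEMPERATURE_THRESHOLD=60
EOF
}
```

After running write_blade_override as root, systemctl daemon-reload followed by systemctl restart compute-blade-agent should make the service pick up the new values.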

Exposing the gRPC API for Remote Access

To allow secure remote use of bladectl over the network:

1. Update your config (/etc/compute-blade-agent/config.yaml):

listen:
  metrics: ":9666"
  grpc: ":8081"
  authenticated: true
  mode: tcp

2. Restart the agent:

systemctl restart compute-blade-agent

This will:

  • Generate new mTLS server and client certificates in /etc/compute-blade-agent/*.pem
  • Write a new bladectl config to: ~/.config/bladectl/config.yaml with the client certificates in place
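
To confirm that the certificates were generated and carry the host names you expect, they can be inspected with openssl. This is a sketch under assumptions: the exact file names under /etc/compute-blade-agent are not listed here, so the helpers simply scan every .pem file and skip those that are not certificates (for example private keys). The names list_pem_files and inspect_certs are hypothetical, and the -ext option requires OpenSSL 1.1.1 or newer.

```shell
# List the .pem files in a directory, one per line (defaults to the agent's config dir).
list_pem_files() {
  dir="${1:-/etc/compute-blade-agent}"
  for f in "$dir"/*.pem; do
    # An unmatched glob stays literal, so only print paths that exist.
    [ -e "$f" ] && printf '%s\n' "$f"
  done
  return 0
}

# Print subject, expiry date and subject alternative names for each certificate.
inspect_certs() {
  list_pem_files "$1" | while IFS= read -r f; do
    if openssl x509 -in "$f" -noout >/dev/null 2>&1; then
      printf '== %s\n' "$f"
      openssl x509 -in "$f" -noout -subject -enddate -ext subjectAltName
    fi
  done
}
```

Running inspect_certs as root on the blade shows which names the server certificate is valid for, which helps when a remote connection fails certificate verification.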

Using bladectl from your local machine

  1. Copy the config from the blade:
scp root@blade-pi1:~/.config/bladectl/config.yaml ~/.config/bladectl/config.yaml
  2. Fix the server address to point to the blade:
yq e '(.blades[] | select(.name == "blade-pi1") | .blade.server) = "blade-pi1.local:8081"' -i ~/.config/bladectl/config.yaml

Your bladectl tool can now securely talk to the remote agent via gRPC over mTLS.