Version v1.7 of the documentation is no longer actively maintained. The site that you are currently viewing is an archived snapshot. For up-to-date documentation, see the latest version.

Watchdog Timers

Using hardware watchdogs to workaround hardware/software lockups.

Talos Linux now supports hardware watchdog timers configuration. Hardware watchdog timers allow to reset (reboot) the system if the software stack becomes unresponsive. Please consult your hardware/VM documentation for the availability of the hardware watchdog timers.

Configuration

To discover the available watchdog devices, run:

$ talosctl ls /sys/class/watchdog/
NODE         NAME
172.20.0.2   .
172.20.0.2   watchdog0
172.20.0.2   watchdog1

The implementation of the watchdog device can be queried with:

$ talosctl read /sys/class/watchdog/watchdog0/identity
i6300ESB timer

To enable the watchdog timer, patch the machine configuration with the following:

# watchdog.yaml
apiVersion: v1alpha1
kind: WatchdogTimerConfig
device: /dev/watchdog0
timeout: 5m
talosctl patch mc -p @watchdog.yaml

Talos Linux will set up the watchdog time with a 5-minute timeout, and it will keep resetting the timer to prevent the system from rebooting. If the software becomes unresponsive, the watchdog timer will expire, and the system will be reset by the watchdog hardware.

Inspection

To inspect the watchdog timer configuration, run:

$ talosctl get watchdogtimerconfig
NODE         NAMESPACE   TYPE                  ID      VERSION   DEVICE           TIMEOUT
172.20.0.2   runtime     WatchdogTimerConfig   timer   1         /dev/watchdog0   5m0s

To inspect the watchdog timer status, run:

$ talosctl get watchdogtimerstatus
NODE         NAMESPACE   TYPE                  ID      VERSION   DEVICE           TIMEOUT
172.20.0.2   runtime     WatchdogTimerStatus   timer   1         /dev/watchdog0   5m0s

Current status of the watchdog timer can also be inspected via Linux sysfs:

$ talosctl read /sys/class/watchdog/watchdog0/state
active
Last modified April 11, 2024: docs: add what's new for v1.7 (d7c3a0735)