1 - Advanced Networking

How to configure advanced networking options on Talos Linux.

Static Addressing

Static addressing is comprised of specifying addresses, routes ( remember to add your default gateway ), and interface. Most likely you’ll also want to define the nameservers so you have properly functioning DNS.

machine:
  network:
    hostname: talos
    nameservers:
      - 10.0.0.1
    interfaces:
      - interface: eth0
        addresses:
          - 10.0.0.201/8
        mtu: 8765
        routes:
          - network: 0.0.0.0/0
            gateway: 10.0.0.1
      - interface: eth1
        ignore: true
  time:
    servers:
      - time.cloudflare.com

Additional Addresses for an Interface

In some environments you may need to set additional addresses on an interface. In the following example, we set two additional addresses on the loopback interface.

machine:
  network:
    interfaces:
      - interface: lo
        addresses:
          - 192.168.0.21/24
          - 10.2.2.2/24

Bonding

The following example shows how to create a bonded interface.

machine:
  network:
    interfaces:
      - interface: bond0
        dhcp: true
        bond:
          mode: 802.3ad
          lacpRate: fast
          xmitHashPolicy: layer3+4
          miimon: 100
          updelay: 200
          downdelay: 200
          interfaces:
            - eth0
            - eth1

Setting Up a Bridge

The following example shows how to set up a bridge between two interfaces with an assigned static address.

machine:
  network:
    interfaces:
      - interface: br0
        addresses:
          - 192.168.0.42/24
        bridge:
          stp:
            enabled: true
          interfaces:
              - eth0
              - eth1

VLANs

To setup vlans on a specific device use an array of VLANs to add. The master device may be configured without addressing by setting dhcp to false.

machine:
  network:
    interfaces:
      - interface: eth0
        dhcp: false
        vlans:
          - vlanId: 100
            addresses:
              - "192.168.2.10/28"
            routes:
              - network: 0.0.0.0/0
                gateway: 192.168.2.1

2 - Air-gapped Environments

Setting up Talos Linux to work in environments with no internet access.

In this guide we will create a Talos cluster running in an air-gapped environment with all the required images being pulled from an internal registry. We will use the QEMU provisioner available in talosctl to create a local cluster, but the same approach could be used to deploy Talos in bigger air-gapped networks.

Requirements

The follow are requirements for this guide:

  • Docker 18.03 or greater
  • Requirements for the Talos QEMU cluster

Identifying Images

In air-gapped environments, access to the public Internet is restricted, so Talos can’t pull images from public Docker registries (docker.io, ghcr.io, etc.) We need to identify the images required to install and run Talos. The same strategy can be used for images required by custom workloads running on the cluster.

The talosctl image default command provides a list of default images used by the Talos cluster (with default configuration settings). To print the list of images, run:

talosctl image default

This list contains images required by a default deployment of Talos. There might be additional images required for the workloads running on this cluster, and those should be added to this list.

Preparing the Internal Registry

As access to the public registries is restricted, we have to run an internal Docker registry. In this guide, we will launch the registry on the same machine using Docker:

$ docker run -d -p 6000:5000 --restart always --name registry-airgapped registry:2
1bf09802bee1476bc463d972c686f90a64640d87dacce1ac8485585de69c91a5

This registry will be accepting connections on port 6000 on the host IPs. The registry is empty by default, so we have fill it with the images required by Talos.

First, we pull all the images to our local Docker daemon:

$ for image in `talosctl image default`; do docker pull $image; done
v0.15.1: Pulling from coreos/flannel
Digest: sha256:9a296fbb67790659adc3701e287adde3c59803b7fcefe354f1fc482840cdb3d9
...

All images are now stored in the Docker daemon store:

$ docker images
REPOSITORY                               TAG                                        IMAGE ID       CREATED         SIZE
gcr.io/etcd-development/etcd             v3.5.3                                     604d4f022632   6 days ago      181MB
ghcr.io/siderolabs/install-cni           v1.0.0-2-gc5d3ab0                          4729e54f794d   6 days ago      76MB
...

Now we need to re-tag them so that we can push them to our local registry. We are going to replace the first component of the image name (before the first slash) with our registry endpoint 127.0.0.1:6000:

$ for image in `talosctl image default`; do \
    docker tag $image `echo $image | sed -E 's#^[^/]+/#127.0.0.1:6000/#'`; \
  done

As the next step, we push images to the internal registry:

$ for image in `talosctl image default`; do \
    docker push `echo $image | sed -E 's#^[^/]+/#127.0.0.1:6000/#'`; \
  done

We can now verify that the images are pushed to the registry:

$ curl http://127.0.0.1:6000/v2/_catalog
{"repositories":["coredns/coredns","coreos/flannel","etcd-development/etcd","kube-apiserver","kube-controller-manager","kube-proxy","kube-scheduler","pause","siderolabs/install-cni","siderolabs/installer","siderolabs/kubelet"]}

Note: images in the registry don’t have the registry endpoint prefix anymore.

Launching Talos in an Air-gapped Environment

For Talos to use the internal registry, we use the registry mirror feature to redirect all image pull requests to the internal registry. This means that the registry endpoint (as the first component of the image reference) gets ignored, and all pull requests are sent directly to the specified endpoint.

We are going to use a QEMU-based Talos cluster for this guide, but the same approach works with Docker-based clusters as well. As QEMU-based clusters go through the Talos install process, they can be used better to model a real air-gapped environment.

Identify all registry prefixes from talosctl image default, for example:

  • docker.io
  • gcr.io
  • ghcr.io
  • registry.k8s.io

The talosctl cluster create command provides conveniences for common configuration options. The only required flag for this guide is --registry-mirror <endpoint>=http://10.5.0.1:6000 which redirects every pull request to the internal registry, this flag needs to be repeated for each of the identified registry prefixes above. The endpoint being used is 10.5.0.1, as this is the default bridge interface address which will be routable from the QEMU VMs (127.0.0.1 IP will be pointing to the VM itself).

$ sudo --preserve-env=HOME talosctl cluster create --provisioner=qemu --install-image=ghcr.io/siderolabs/installer:v1.9.2 \
  --registry-mirror docker.io=http://10.5.0.1:6000 \
  --registry-mirror gcr.io=http://10.5.0.1:6000 \
  --registry-mirror ghcr.io=http://10.5.0.1:6000 \
  --registry-mirror registry.k8s.io=http://10.5.0.1:6000 \
validating CIDR and reserving IPs
generating PKI and tokens
creating state directory in "/home/user/.talos/clusters/talos-default"
creating network talos-default
creating load balancer
creating dhcpd
creating master nodes
creating worker nodes
waiting for API
...

Note: --install-image should match the image which was copied into the internal registry in the previous step.

You can be verify that the cluster is air-gapped by inspecting the registry logs: docker logs -f registry-airgapped.

Closing Notes

Running in an air-gapped environment might require additional configuration changes, for example using custom settings for DNS and NTP servers.

When scaling this guide to the bare-metal environment, following Talos config snippet could be used as an equivalent of the --registry-mirror flag above:

machine:
  ...
  registries:
      mirrors:
        docker.io:
          endpoints:
          - http://10.5.0.1:6000/
        gcr.io:
          endpoints:
          - http://10.5.0.1:6000/
        ghcr.io:
          endpoints:
          - http://10.5.0.1:6000/
        registry.k8s.io:
          endpoints:
          - http://10.5.0.1:6000/
...

Other implementations of Docker registry can be used in place of the Docker registry image used above to run the registry. If required, auth can be configured for the internal registry (and custom TLS certificates if needed).

Please see pull-through cache guide for an example using Harbor container registry with Talos.

3 - Building Custom Talos Images

How to build a custom Talos image from source.

There might be several reasons to build Talos images from source:

Checkout Talos Source

git clone https://github.com/siderolabs/talos.git

If building for a specific release, checkout the corresponding tag:

git checkout v1.9.2

Set up the Build Environment

See Developing Talos for details on setting up the buildkit builder.

Architectures

By default, Talos builds for linux/amd64, but you can customize that by passing PLATFORM variable to make:

make <target> PLATFORM=linux/arm64 # build for arm64 only
make <target> PLATFORM=linux/arm64,linux/amd64 # build for arm64 and amd64, container images will be multi-arch

Custom PKGS

When customizing Linux kernel, the source for the siderolabs/pkgs repository can be overridden with:

  • if you built and pushed only a custom kernel package, the reference can be overridden with PKG_KERNEL variable: make <target> PKG_KERNEL=<registry>/<username>/kernel:<tag>
  • if any other single package was customized, the reference can be overridden with PKG_<pkg> (e.g. PKG_IPTABLES) variable: make <target> PKG_<pkg>=<registry>/<username>/<pkg>:<tag>
  • if the full pkgs repository was built and pushed, the references can be overridden with PKGS_PREFIX and PKGS variables: make <target> PKGS_PREFIX=<registry>/<username> PKGS=<tag>

Customizations

Some of the build parameters can be customized by passing environment variables to make, e.g. GOAMD64=v1 can be used to build Talos images compatible with old AMD64 CPUs:

make <target> GOAMD64=v1

Building Kernel and Initramfs

The most basic boot assets can be built with:

make kernel initramfs

Build result will be stored as _out/vmlinuz-<arch> and _out/initramfs-<arch>.xz.

Building Container Images

Talos container images should be pushed to the registry as the result of the build process.

The default settings are:

  • IMAGE_REGISTRY is set to ghcr.io
  • USERNAME is set to the siderolabs (or value of environment variable USERNAME if it is set)

The image can be pushed to any registry you have access to, but the access credentials should be stored in ~/.docker/config.json file (e.g. with docker login).

Building and pushing the image can be done with:

make installer PUSH=true IMAGE_REGISTRY=docker.io USERNAME=<username> # ghcr.io/siderolabs/installer
make imager PUSH=true IMAGE_REGISTRY=docker.io USERNAME=<username> # ghcr.io/siderolabs/imager

The local registry running on 127.0.0.1:5005 can be used as well to avoid pushing/pulling over the network:

make installer PUSH=true REGISTRY=127.0.0.1:5005

When building imager container, by default Talos will include the boot assets for both amd64 and arm64 architectures, if building only for single architecture, specify INSTALLER_ARCH variable:

make imager INSTALLER_ARCH=targetarch PLATFORM=linux/amd64

Building ISO

The ISO image is built with the help of imager container image, by default ghcr.io/siderolabs/imager will be used with the matching tag:

make iso

The ISO image will be stored as _out/talos-<arch>.iso.

If ISO image should be built with the custom imager image, it can be specified with IMAGE_REGISTRY/USERNAME variables:

make iso IMAGE_REGISTRY=docker.io USERNAME=<username>

Building Disk Images

The disk image is built with the help of imager container image, by default ghcr.io/siderolabs/imager will be used with the matching tag:

make image-metal

Available disk images are encoded in the image-% target, e.g. make image-aws. Same as with ISO image, the custom imager image can be specified with IMAGE_REGISTRY/USERNAME variables.

4 - CA Rotation

How to rotate Talos and Kubernetes API root certificate authorities.

In general, you almost never need to rotate the root CA certificate and key for the Talos API and Kubernetes API. Talos sets up root certificate authorities with the lifetime of 10 years, and all Talos and Kubernetes API certificates are issued by these root CAs. So the rotation of the root CA is only needed if:

  • you suspect that the private key has been compromised;
  • you want to revoke access to the cluster for a leaked talosconfig or kubeconfig;
  • once in 10 years.

Overview

There are some details which make Talos and Kubernetes API root CA rotation a bit different, but the general flow is the same:

  • generate new CA certificate and key;
  • add new CA certificate as ‘accepted’, so new certificates will be accepted as valid;
  • swap issuing CA to the new one, old CA as accepted;
  • refresh all certificates in the cluster;
  • remove old CA from ‘accepted’.

At the end of the flow, old CA is completely removed from the cluster, so all certificates issued by it will be considered invalid.

Both rotation flows are described in detail below.

Talos API

Automated Talos API CA Rotation

Talos API CA rotation doesn’t interrupt connections within the cluster, and it doesn’t require a reboot of the nodes.

Run the following command in dry-run mode to see the steps which will be taken:

$ talosctl -n <CONTROLPLANE> rotate-ca --dry-run=true --talos=true --kubernetes=false
> Starting Talos API PKI rotation, dry-run mode true...
> Using config context: "talos-default"
> Using Talos API endpoints: ["172.20.0.2"]
> Cluster topology:
  - control plane nodes: ["172.20.0.2"]
  - worker nodes: ["172.20.0.3"]
> Current Talos CA:
...

No changes will be done to the cluster in dry-run mode, so you can safely run it to see the steps.

Before proceeding, make sure that you can capture the output of talosctl command, as it will contain the new CA certificate and key. Record a list of Talos API users to make sure they can all be updated with new talosconfig.

Run the following command to rotate the Talos API CA:

$ talosctl -n <CONTROLPLANE> rotate-ca --dry-run=false --talos=true --kubernetes=false
> Starting Talos API PKI rotation, dry-run mode false...
> Using config context: "talos-default-268"
> Using Talos API endpoints: ["172.20.0.2"]
> Cluster topology:
  - control plane nodes: ["172.20.0.2"]
  - worker nodes: ["172.20.0.3"]
> Current Talos CA:
...
> New Talos CA:
...
> Generating new talosconfig:
context: talos-default
contexts:
    talos-default:
        ....
> Verifying connectivity with existing PKI:
  - 172.20.0.2: OK (version v1.9.2)
  - 172.20.0.3: OK (version v1.9.2)
> Adding new Talos CA as accepted...
  - 172.20.0.2: OK
  - 172.20.0.3: OK
> Verifying connectivity with new client cert, but old server CA:
2024/04/17 21:26:07 retrying error: rpc error: code = Unavailable desc = connection error: desc = "error reading server preface: remote error: tls: unknown certificate authority"
  - 172.20.0.2: OK (version v1.9.2)
  - 172.20.0.3: OK (version v1.9.2)
> Making new Talos CA the issuing CA, old Talos CA the accepted CA...
  - 172.20.0.2: OK
  - 172.20.0.3: OK
> Verifying connectivity with new PKI:
2024/04/17 21:26:08 retrying error: rpc error: code = Unavailable desc = connection error: desc = "transport: authentication handshake failed: tls: failed to verify certificate: x509: certificate signed by unknown authority (possibly because of \"x509: Ed25519 verification failure\" while trying to verify candidate authority certificate \"talos\")"
  - 172.20.0.2: OK (version v1.9.2)
  - 172.20.0.3: OK (version v1.9.2)
> Removing old Talos CA from the accepted CAs...
  - 172.20.0.2: OK
  - 172.20.0.3: OK
> Verifying connectivity with new PKI:
  - 172.20.0.2: OK (version v1.9.2)
  - 172.20.0.3: OK (version v1.9.2)
> Writing new talosconfig to "talosconfig"

Once the rotation is done, stash the new Talos CA, update secrets.yaml (if using that for machine configuration generation) with new CA key and certificate.

The new client talosconfig is written to the current directory as talosconfig. You can merge it to the default location with talosctl config merge ./talosconfig.

If other client access talosconfig files needs to be generated, use talosctl config new with new talosconfig.

Note: if using Talos API access from Kubernetes feature, pods might need to be restarted manually to pick up new talosconfig.

Manual Steps for Talos API CA Rotation

  1. Generate new Talos CA (e.g. use talosctl gen secrets and use Talos CA).
  2. Patch machine configuration on all nodes updating .machine.acceptedCAs with new CA certificate.
  3. Generate talosconfig with client certificate generated with new CA, but still using old CA as server CA, verify connectivity, Talos should accept new client certificate.
  4. Patch machine configuration on all nodes updating .machine.ca with new CA certificate and key, and keeping old CA certificate in .machine.acceptedCAs (on worker nodes .machine.ca doesn’t have the key).
  5. Generate talosconfig with both client certificate and server CA using new CA PKI, verify connectivity.
  6. Remove old CA certificate from .machine.acceptedCAs on all nodes.
  7. Verify connectivity.

Kubernetes API

Automated Kubernetes API CA Rotation

The automated process only rotates Kubernetes API CA, used by the kube-apiserver, kubelet, etc. Other Kubernetes secrets might need to be rotated manually as required. Kubernetes pods might need to be restarted to handle changes, and communication within the cluster might be disrupted during the rotation process.

Run the following command in dry-run mode to see the steps which will be taken:

$ talosctl -n <CONTROLPLANE> rotate-ca --dry-run=true --talos=false --kubernetes=true
> Starting Kubernetes API PKI rotation, dry-run mode true...
> Cluster topology:
  - control plane nodes: ["172.20.0.2"]
  - worker nodes: ["172.20.0.3"]
> Building current Kubernetes client...
> Current Kubernetes CA:
...

Before proceeding, make sure that you can capture the output of talosctl command, as it will contain the new CA certificate and key. As Talos API access will not be disrupted, the changes can be reverted back if needed by reverting machine configuration.

Run the following command to rotate the Kubernetes API CA:

$ talosctl -n <CONTROLPLANE> rotate-ca --dry-run=false --talos=false --kubernetes=true
> Starting Kubernetes API PKI rotation, dry-run mode false...
> Cluster topology:
  - control plane nodes: ["172.20.0.2"]
  - worker nodes: ["172.20.0.3"]
> Building current Kubernetes client...
> Current Kubernetes CA:
...
> New Kubernetes CA:
...
> Verifying connectivity with existing PKI...
 - OK (2 nodes ready)
> Adding new Kubernetes CA as accepted...
  - 172.20.0.2: OK
  - 172.20.0.3: OK
> Making new Kubernetes CA the issuing CA, old Kubernetes CA the accepted CA...
  - 172.20.0.2: OK
  - 172.20.0.3: OK
> Building new Kubernetes client...
> Verifying connectivity with new PKI...
2024/04/17 21:45:52 retrying error: Get "https://172.20.0.1:6443/api/v1/nodes": EOF
 - OK (2 nodes ready)
> Removing old Kubernetes CA from the accepted CAs...
  - 172.20.0.2: OK
  - 172.20.0.3: OK
> Verifying connectivity with new PKI...
 - OK (2 nodes ready)
> Kubernetes CA rotation done, new 'kubeconfig' can be fetched with `talosctl kubeconfig`.

At the end of the process, Kubernetes control plane components will be restarted to pick up CA certificate changes. Each node kubelet will re-join the cluster with new client certficiate.

New kubeconfig can be fetched with talosctl kubeconfig command from the cluster.

Kubernetes pods might need to be restarted manually to pick up changes to the Kubernetes API CA.

Manual Steps for Kubernetes API CA Rotation

Steps are similar to the Talos API CA rotation, but use:

  • .cluster.acceptedCAs in place of .machine.acceptedCAs;
  • .cluster.ca in place of .machine.ca;
  • kubeconfig in place of talosconfig.

5 - Cgroups Resource Analysis

How to use talosctl cgroups to monitor resource usage on the node.

Talos provides a way to monitor resource usage of the control groups on the machine. This feature is useful to understand how much resources are being used by the containers and processes running on the machine.

Talos creates several system cgroups:

  • init (contains machined PID 1)
  • system (contains system services, and extension services)
  • podruntime (contains CRI containerd, kubelet, etcd)

Kubelet creates a tree of cgroups for each pod, and each container in the pod, starting with kubepods as the root group.

Talos Linux might set some default limits for the cgroups, and these are not configurable at the moment. Kubelet is configured by default to reserve some amount of RAM and CPU for system processes to prevent the system from becoming unresponsive under extreme resource pressure.

Note: this feature is only available in cgroupsv2 mode which is Talos default.

The talosctl cgroups command provides a way to monitor the resource usage of the cgroups on the machine, it has a set of presets which are described below.

Presets

cpu

$ talosctl cgroups --preset=cpu
NAME                                                                          CpuWeight   CpuNice   CpuMax            CpuUser        User/%    CpuSystem      System/%   Throttled
.                                                                              unset       unset    []                 7m42.43755s   -         8m51.855608s   -                    0s
├──init                                                                           79           1    [   max 100000]     35.061148s     7.58%     41.027589s     7.71%              0s
├──kubepods                                                                       77           1    [   max 100000]   3m29.902395s    45.39%   4m41.033592s    52.84%              0s
│   ├──besteffort                                                                  1          19    [   max 100000]      1.297303s     0.62%      960.152ms     0.34%              0s
│   │   └──kube-system/kube-proxy-6r5bz                                            1          19    [   max 100000]      1.297441s   100.01%      960.014ms    99.99%              0s
│   │       ├──kube-proxy                                                          1          19    [   max 100000]      1.289143s    99.36%      958.587ms    99.85%              0s
│   │       └──sandbox                                                             1          19    [   max 100000]        9.724ms     0.75%             0s     0.00%              0s
│   └──burstable                                                                  14           9    [   max 100000]   3m28.653931s    99.41%   4m40.024231s    99.64%              0s
│       ├──kube-system/kube-apiserver-talos-default-controlplane-1                 8          11    [   max 100000]   2m22.458603s    68.28%   2m22.983949s    51.06%              0s
│       │   ├──kube-apiserver                                                      8          11    [   max 100000]   2m22.440159s    99.99%   2m22.976538s    99.99%              0s
│       │   └──sandbox                                                             1          19    [   max 100000]       14.774ms     0.01%       11.081ms     0.01%              0s
│       ├──kube-system/kube-controller-manager-talos-default-controlplane-1        2          18    [   max 100000]     17.314271s     8.30%      3.014955s     1.08%              0s
│       │   ├──kube-controller-manager                                             2          18    [   max 100000]     17.303941s    99.94%      3.001934s    99.57%              0s
│       │   └──sandbox                                                             1          19    [   max 100000]       11.675ms     0.07%       11.675ms     0.39%              0s
│       ├──kube-system/kube-flannel-jzx6m                                          4          14    [   max 100000]     38.986678s    18.68%   1m47.717143s    38.47%              0s
│       │   ├──kube-flannel                                                        4          14    [   max 100000]     38.962703s    99.94%   1m47.690508s    99.98%              0s
│       │   └──sandbox                                                             1          19    [   max 100000]       14.228ms     0.04%        7.114ms     0.01%              0s
│       └──kube-system/kube-scheduler-talos-default-controlplane-1                 1          19    [   max 100000]     20.103563s     9.63%     16.099219s     5.75%              0s
│           ├──kube-scheduler                                                      1          19    [   max 100000]     20.092317s    99.94%     16.086603s    99.92%              0s
│           └──sandbox                                                             1          19    [   max 100000]        11.93ms     0.06%        11.93ms     0.07%              0s
├──podruntime                                                                     79           1    [   max 100000]   4m59.707084s    64.81%    5m4.010222s    57.16%              0s
│   ├──etcd                                                                       79           1    [   max 100000]   2m38.215322s    52.79%    3m7.812204s    61.78%              0s
│   ├──kubelet                                                                    39           4    [   max 100000]   1m29.026444s    29.70%   1m23.112332s    27.34%              0s
│   └──runtime                                                                    39           4    [   max 100000]     48.501668s    16.18%     37.049334s    12.19%              0s
└──system                                                                         59           2    [   max 100000]     32.395345s     7.01%     12.176964s     2.29%              0s
    ├──apid                                                                       20           7    [   max 100000]      1.261381s     3.89%      756.827ms     6.22%              0s
    ├──dashboard                                                                   8          11    [   max 100000]     22.231337s    68.63%      5.328927s    43.76%              0s
    ├──runtime                                                                    20           7    [   max 100000]      7.282253s    22.48%      5.924559s    48.65%              0s
    ├──trustd                                                                     10          10    [   max 100000]      1.254353s     3.87%      220.698ms     1.81%              0s
    └──udevd                                                                      10          10    [   max 100000]       78.726ms     0.24%      233.244ms     1.92%              0s

In the CPU view, the following columns are displayed:

  • CpuWeight: the CPU weight of the cgroup (relative, controls the CPU shares/bandwidth)
  • CpuNice: the CPU nice value (direct translation of the CpuWeight to the nice value)
  • CpuMax: the maximum CPU time allowed for the cgroup
  • CpuUser: the total CPU time consumed by the cgroup and its children in user mode
  • User/%: the percentage of CPU time consumed by the cgroup and its children in user mode relative to the parent cgroup
  • CpuSystem: the total CPU time consumed by the cgroup and its children in system mode
  • System/%: the percentage of CPU time consumed by the cgroup and its children in system mode relative to the parent cgroup
  • Throttled: the total time the cgroup has been throttled on CPU

cpuset

$ talosctl cgroups --preset=cpuset
NAME                                                                          CpuSet         CpuSet(Eff)    Mems           Mems(Eff)
.                                                                                                     0-1                             0
├──init                                                                                               0-1                             0
├──kubepods                                                                                           0-1                             0
│   ├──besteffort                                                                                     0-1                             0
│   │   └──kube-system/kube-proxy-6r5bz                                                               0-1                             0
│   │       ├──kube-proxy                                                                             0-1                             0
│   │       └──sandbox                                                                                0-1                             0
│   └──burstable                                                                                      0-1                             0
│       ├──kube-system/kube-apiserver-talos-default-controlplane-1                                    0-1                             0
│       │   ├──kube-apiserver                                                                         0-1                             0
│       │   └──sandbox                                                                                0-1                             0
│       ├──kube-system/kube-controller-manager-talos-default-controlplane-1                           0-1                             0
│       │   ├──kube-controller-manager                                                                0-1                             0
│       │   └──sandbox                                                                                0-1                             0
│       ├──kube-system/kube-flannel-jzx6m                                                             0-1                             0
│       │   ├──kube-flannel                                                                           0-1                             0
│       │   └──sandbox                                                                                0-1                             0
│       └──kube-system/kube-scheduler-talos-default-controlplane-1                                    0-1                             0
│           ├──kube-scheduler                                                                         0-1                             0
│           └──sandbox                                                                                0-1                             0
├──podruntime                                                                                         0-1                             0
│   ├──etcd                                                                                           0-1                             0
│   ├──kubelet                                                                                        0-1                             0
│   └──runtime                                                                                        0-1                             0
└──system                                                                                             0-1                             0
    ├──apid                                                                                           0-1                             0
    ├──dashboard                                                                                      0-1                             0
    ├──runtime                                                                                        0-1                             0
    ├──trustd                                                                                         0-1                             0
    └──udevd                                                                                          0-1                             0

This preset shows information about the CPU and memory sets of the cgroups, it is mostly useful with kubelet CPU manager.

  • CpuSet: the CPU set of the cgroup
  • CpuSet(Eff): the effective CPU set of the cgroup
  • Mems: the memory set of the cgroup (NUMA nodes)
  • Mems(Eff): the effective memory set of the cgroup

io

$ talosctl cgroups --preset=io
NAME                                                                          Bytes Read/Written                         ios Read/Write             PressAvg10   PressAvg60   PressTotal
.                                                                             loop0: 94 MiB/0 B vda: 700 MiB/803 MiB                                  0.12         0.37       2m12.512921s
├──init                                                                       loop0: 231 KiB/0 B vda: 4.9 MiB/4.3 MiB    loop0: 6/0 vda: 206/37       0.00         0.00          232.446ms
├──kubepods                                                                   vda: 282 MiB/16 MiB                        vda: 3195/3172               0.00         0.00          383.858ms
│   ├──besteffort                                                             vda: 58 MiB/0 B                            vda: 678/0                   0.00         0.00           86.833ms
│   │   └──kube-system/kube-proxy-6r5bz                                       vda: 58 MiB/0 B                            vda: 678/0                   0.00         0.00           86.833ms
│   │       ├──kube-proxy                                                     vda: 58 MiB/0 B                            vda: 670/0                   0.00         0.00           86.554ms
│   │       └──sandbox                                                        vda: 692 KiB/0 B                           vda: 8/0                     0.00         0.00              467µs
│   └──burstable                                                              vda: 224 MiB/16 MiB                        vda: 2517/3172               0.00         0.00          308.616ms
│       ├──kube-system/kube-apiserver-talos-default-controlplane-1            vda: 76 MiB/16 MiB                         vda: 870/3171                0.00         0.00          151.677ms
│       │   ├──kube-apiserver                                                 vda: 76 MiB/16 MiB                         vda: 870/3171                0.00         0.00          156.375ms
│       │   └──sandbox                                                                                                                                0.00         0.00                 0s
│       ├──kube-system/kube-controller-manager-talos-default-controlplane-1   vda: 62 MiB/0 B                            vda: 670/0                   0.00         0.00           95.432ms
│       │   ├──kube-controller-manager                                        vda: 62 MiB/0 B                            vda: 670/0                   0.00         0.00          100.197ms
│       │   └──sandbox                                                                                                                                0.00         0.00                 0s
│       ├──kube-system/kube-flannel-jzx6m                                     vda: 36 MiB/4.0 KiB                        vda: 419/1                   0.00         0.00           64.203ms
│       │   ├──kube-flannel                                                   vda: 35 MiB/0 B                            vda: 399/0                   0.00         0.00            55.26ms
│       │   └──sandbox                                                                                                                                0.00         0.00                 0s
│       └──kube-system/kube-scheduler-talos-default-controlplane-1            vda: 50 MiB/0 B                            vda: 558/0                   0.00         0.00           64.331ms
│           ├──kube-scheduler                                                 vda: 50 MiB/0 B                            vda: 558/0                   0.00         0.00           62.821ms
│           └──sandbox                                                                                                                                0.00         0.00                 0s
├──podruntime                                                                 vda: 379 MiB/764 MiB                       vda: 3802/287674             0.39         0.39       2m13.409399s
│   ├──etcd                                                                   vda: 308 MiB/759 MiB                       vda: 2598/286420             0.50         0.41       2m15.407179s
│   ├──kubelet                                                                vda: 69 MiB/62 KiB                         vda: 834/13                  0.00         0.00          122.371ms
│   └──runtime                                                                vda: 76 KiB/3.9 MiB                        vda: 19/1030                 0.00         0.00          164.984ms
└──system                                                                     loop0: 18 MiB/0 B vda: 3.2 MiB/0 B         loop0: 590/0 vda: 116/0      0.00         0.00          153.609ms
    ├──apid                                                                   loop0: 1.9 MiB/0 B                         loop0: 103/0                 0.00         0.00            3.345ms
    ├──dashboard                                                              loop0: 16 MiB/0 B                          loop0: 487/0                 0.00         0.00           11.596ms
    ├──runtime                                                                                                                                        0.00         0.00           28.957ms
    ├──trustd                                                                                                                                         0.00         0.00                 0s
    └──udevd                                                                  vda: 3.2 MiB/0 B                           vda: 116/0                   0.00         0.00          135.586ms

In the IO (input/output) view, the following columns are displayed:

  • Bytes Read/Written: the total number of bytes read and written by the cgroup and its children, per each blockdevice
  • ios Read/Write: the total number of I/O operations read and written by the cgroup and its children, per each blockdevice
  • PressAvg10: the average IO pressure of the cgroup and its children over the last 10 seconds
  • PressAvg60: the average IO pressure of the cgroup and its children over the last 60 seconds
  • PressTotal: the total IO pressure of the cgroup and its children (see PSI for more information)

memory

$ talosctl cgroups --preset=memory
NAME                                                                          MemCurrent   MemPeak    MemLow     Peak/Low   MemHigh    MemMin     Current/Min   MemMax
.                                                                                unset        unset      unset    unset%       unset      unset    unset%          unset
├──init                                                                        133 MiB      133 MiB    192 MiB    69.18%         max     96 MiB   138.35%            max
├──kubepods                                                                    494 MiB      505 MiB        0 B      max%         max        0 B      max%        1.4 GiB
│   ├──besteffort                                                               70 MiB       74 MiB        0 B      max%         max        0 B      max%            max
│   │   └──kube-system/kube-proxy-6r5bz                                         70 MiB       74 MiB        0 B      max%         max        0 B      max%            max
│   │       ├──kube-proxy                                                       69 MiB       73 MiB        0 B      max%         max        0 B      max%            max
│   │       └──sandbox                                                         872 KiB      2.2 MiB        0 B      max%         max        0 B      max%            max
│   └──burstable                                                               424 MiB      435 MiB        0 B      max%         max        0 B      max%            max
│       ├──kube-system/kube-apiserver-talos-default-controlplane-1             233 MiB      242 MiB        0 B      max%         max        0 B      max%            max
│       │   ├──kube-apiserver                                                  232 MiB      242 MiB        0 B      max%         max        0 B      max%            max
│       │   └──sandbox                                                         208 KiB      3.3 MiB        0 B      max%         max        0 B      max%            max
│       ├──kube-system/kube-controller-manager-talos-default-controlplane-1     78 MiB       80 MiB        0 B      max%         max        0 B      max%            max
│       │   ├──kube-controller-manager                                          78 MiB       80 MiB        0 B      max%         max        0 B      max%            max
│       │   └──sandbox                                                         212 KiB      3.3 MiB        0 B      max%         max        0 B      max%            max
│       ├──kube-system/kube-flannel-jzx6m                                       48 MiB       50 MiB        0 B      max%         max        0 B      max%            max
│       │   ├──kube-flannel                                                     46 MiB       48 MiB        0 B      max%         max        0 B      max%            max
│       │   └──sandbox                                                         216 KiB      3.1 MiB        0 B      max%         max        0 B      max%            max
│       └──kube-system/kube-scheduler-talos-default-controlplane-1              66 MiB       67 MiB        0 B      max%         max        0 B      max%            max
│           ├──kube-scheduler                                                   66 MiB       67 MiB        0 B      max%         max        0 B      max%            max
│           └──sandbox                                                         208 KiB      3.4 MiB        0 B      max%         max        0 B      max%            max
├──podruntime                                                                  549 MiB      647 MiB        0 B      max%         max        0 B      max%            max
│   ├──etcd                                                                    382 MiB      482 MiB    256 MiB   188.33%         max        0 B      max%            max
│   ├──kubelet                                                                 103 MiB      104 MiB    192 MiB    54.31%         max     96 MiB   107.57%            max
│   └──runtime                                                                  64 MiB       71 MiB    392 MiB    18.02%         max    196 MiB    32.61%            max
└──system                                                                      229 MiB      232 MiB    192 MiB   120.99%         max     96 MiB   239.00%            max
    ├──apid                                                                     26 MiB       28 MiB     32 MiB    88.72%         max     16 MiB   159.23%         40 MiB
    ├──dashboard                                                               113 MiB      113 MiB        0 B      max%         max        0 B      max%        196 MiB
    ├──runtime                                                                  74 MiB       77 MiB     96 MiB    79.89%         max     48 MiB   154.57%            max
    ├──trustd                                                                   10 MiB       11 MiB     16 MiB    69.85%         max    8.0 MiB   127.78%         24 MiB
    └──udevd                                                                   6.8 MiB       14 MiB     16 MiB    86.87%         max    8.0 MiB    84.67%            max

In the memory view, the following columns are displayed:

  • MemCurrent: the current memory usage of the cgroup and its children
  • MemPeak: the peak memory usage of the cgroup and its children
  • MemLow: the low memory reservation of the cgroup
  • Peak/Low: the ratio of the peak memory usage to the low memory reservation
  • MemHigh: the high memory limit of the cgroup
  • MemMin: the minimum memory reservation of the cgroup
  • Current/Min: the ratio of the current memory usage to the minimum memory reservation
  • MemMax: the maximum memory limit of the cgroup

swap

$ talosctl cgroups --preset=swap
NAME                                                                          SwapCurrent   SwapPeak   SwapHigh   SwapMax
.                                                                                unset         unset      unset      unset
├──init                                                                            0 B           0 B        max        max
├──kubepods                                                                        0 B           0 B        max        max
│   ├──besteffort                                                                  0 B           0 B        max        max
│   │   └──kube-system/kube-proxy-6r5bz                                            0 B           0 B        max        max
│   │       ├──kube-proxy                                                          0 B           0 B        max        0 B
│   │       └──sandbox                                                             0 B           0 B        max        max
│   └──burstable                                                                   0 B           0 B        max        max
│       ├──kube-system/kube-apiserver-talos-default-controlplane-1                 0 B           0 B        max        max
│       │   ├──kube-apiserver                                                      0 B           0 B        max        0 B
│       │   └──sandbox                                                             0 B           0 B        max        max
│       ├──kube-system/kube-controller-manager-talos-default-controlplane-1        0 B           0 B        max        max
│       │   ├──kube-controller-manager                                             0 B           0 B        max        0 B
│       │   └──sandbox                                                             0 B           0 B        max        max
│       ├──kube-system/kube-flannel-jzx6m                                          0 B           0 B        max        max
│       │   ├──kube-flannel                                                        0 B           0 B        max        0 B
│       │   └──sandbox                                                             0 B           0 B        max        max
│       └──kube-system/kube-scheduler-talos-default-controlplane-1                 0 B           0 B        max        max
│           ├──kube-scheduler                                                      0 B           0 B        max        0 B
│           └──sandbox                                                             0 B           0 B        max        max
├──podruntime                                                                      0 B           0 B        max        max
│   ├──etcd                                                                        0 B           0 B        max        max
│   ├──kubelet                                                                     0 B           0 B        max        max
│   └──runtime                                                                     0 B           0 B        max        max
└──system                                                                          0 B           0 B        max        max
    ├──apid                                                                        0 B           0 B        max        max
    ├──dashboard                                                                   0 B           0 B        max        max
    ├──runtime                                                                     0 B           0 B        max        max
    ├──trustd                                                                      0 B           0 B        max        max
    └──udevd                                                                       0 B           0 B        max        max

In the swap view, the following columns are displayed:

  • SwapCurrent: the current swap usage of the cgroup and its children
  • SwapPeak: the peak swap usage of the cgroup and its children
  • SwapHigh: the high swap limit of the cgroup
  • SwapMax: the maximum swap limit of the cgroup

Custom Schemas

The talosctl cgroups command allows you to define custom schemas to display the cgroups information in a specific way. The schema is defined in a YAML file with the following structure:

columns:
  - name: Bytes Read/Written
    template: '{{ range $disk, $v := .IOStat }}{{ if $v }}{{ $disk }}: {{ $v.rbytes.HumanizeIBytes }}/{{ $v.wbytes.HumanizeIBytes }} {{ end }}{{ end }}'
  - name: ios Read/Write
    template: '{{ if .Parent }}{{ range $disk, $v := .IOStat }}{{ $disk }}: {{ $v.rios }}/{{ $v.wios }} {{ end }}{{ end }}'
  - name: PressAvg10
    template: '{{ .IOPressure.some.avg10 | printf "%6s" }}'
  - name: PressAvg60
    template: '{{ .IOPressure.some.avg60 | printf "%6s" }}'
  - name: PressTotal
    template: '{{ .IOPressure.some.total.UsecToDuration | printf "%12s" }}'

The schema file can be passed to the talosctl cgroups command with the --schema-file flag:

talosctl cgroups --schema-file=schema.yaml

In the schema, for each column, you can define a name and a template which is a Go template that will be executed with the cgroups data. In the template, there’s a . variable that contains the cgroups data, and .Parent variable which is a parent cgroup (if available). Each cgroup node contains information parsed from the cgroup filesystem, with field names matching the filenames adjusted for Go naming conventions, e.g. io.stat becomes .IOStat in the template.

The schemas for the presets above can be found in the source code.

6 - Customizing the Kernel

Guide on how to customize the kernel used by Talos Linux.

Talos Linux configures the kernel to allow loading only cryptographically signed modules. The signing key is generated during the build process, it is unique to each build, and it is not available to the user. The public key is embedded in the kernel, and it is used to verify the signature of the modules. So if you want to use a custom kernel module, you will need to build your own kernel, and all required kernel modules in order to get the signature in sync with the kernel.

Overview

In order to build a custom kernel (or a custom kernel module), the following steps are required:

  • build a new Linux kernel and modules, push the artifacts to a registry
  • build a new Talos base artifacts: kernel and initramfs image
  • produce a new Talos boot artifact (ISO, installer image, disk image, etc.)

We will go through each step in detail.

Building a Custom Kernel

First, you might need to prepare the build environment, follow the Building Custom Images guide.

Checkout the siderolabs/pkgs repository:

git clone https://github.com/siderolabs/pkgs.git
cd pkgs
git checkout release-1.9

The kernel configuration is located in the files kernel/build/config-ARCH files. It can be modified using the text editor, or by using the Linux kernel menuconfig tool:

make kernel-menuconfig

The kernel configuration can be cleaned up by running:

make kernel-olddefconfig

Both commands will output the new configuration to the kernel/build/config-ARCH files.

Once ready, build the kernel any out-of-tree modules (if required, e.g. zfs) and push the artifacts to a registry:

make kernel REGISTRY=127.0.0.1:5005 PUSH=true

By default, this command will compile and push the kernel both for amd64 and arm64 architectures, but you can specify a single architecture by overriding a variable PLATFORM:

make kernel REGISTRY=127.0.0.1:5005 PUSH=true PLATFORM=linux/amd64

This will create a container image 127.0.0.1:5005/siderolabs/kernel:$TAG with the kernel and modules.

Building Talos Base Artifacts

Follow the Building Custom Images guide to set up the Talos source code checkout.

If some new kernel modules were introduced, adjust the list of the default modules compiled into the Talos initramfs by editing the file hack/modules-ARCH.txt.

Try building base Talos artifacts:

make kernel initramfs PKG_KERNEL=127.0.0.1:5005/siderolabs/kernel:$TAG PLATFORM=linux/amd64

This should create a new image of the kernel and initramfs in _out/vmlinuz-amd64 and _out/initramfs-amd64.xz respectively.

Note: if building for arm64, replace amd64 with arm64 in the commands above.

As a final step, produce the new imager container image which can generate Talos boot assets:

make imager PKG_KERNEL=127.0.0.1:5005/siderolabs/kernel:$TAG PLATFORM=linux/amd64 INSTALLER_ARCH=targetarch

Note: if you built the kernel for both amd64 and arm64, a multi-arch imager container can be built as well by specifying INSTALLER_ARCH=all and PLATFORM=linux/amd64,linux/arm64.

Building Talos Boot Assets

Follow the Boot Assets guide to build Talos boot assets you might need to boot Talos: ISO, installer image, etc. Replace the reference to the imager in guide with the reference to the imager container built above.

Note: if you update the imager container, don’t forget to docker pull it, as docker caches pulled images and won’t pull the updated image automatically.

7 - Developing Talos

Learn how to set up a development environment for local testing and hacking on Talos itself!

This guide outlines steps and tricks to develop Talos operating systems and related components. The guide assumes Linux operating system on the development host. Some steps might work under Mac OS X, but using Linux is highly advised.

Prepare

Check out the Talos repository.

Try running make help to see available make commands. You would need Docker and buildx installed on the host.

Note: Usually it is better to install up to date Docker from Docker apt repositories, e.g. Ubuntu instructions.

If buildx plugin is not available with OS docker packages, it can be installed as a plugin from GitHub releases.

Set up a builder with access to the host network:

 docker buildx create --driver docker-container  --driver-opt network=host --name local1 --buildkitd-flags '--allow-insecure-entitlement security.insecure' --use

Note: network=host allows buildx builder to access host network, so that it can push to a local container registry (see below).

Make sure the following steps work:

  • make talosctl
  • make initramfs kernel

Set up a local docker registry:

docker run -d -p 5005:5000 \
    --restart always \
    --name local registry:2

Try to build and push to local registry an installer image:

make installer IMAGE_REGISTRY=127.0.0.1:5005 PUSH=true

Record the image name output in the step above.

Note: it is also possible to force a stable image tag by using TAG variable: make installer IMAGE_REGISTRY=127.0.0.1:5005 TAG=v1.0.0-alpha.1 PUSH=true.

Running Talos cluster

Set up local caching docker registries (this speeds up Talos cluster boot a lot), script is in the Talos repo:

bash hack/start-registry-proxies.sh

Start your local cluster with:

sudo --preserve-env=HOME _out/talosctl-linux-amd64 cluster create \
    --provisioner=qemu \
    --cidr=172.20.0.0/24 \
    --registry-mirror docker.io=http://172.20.0.1:5000 \
    --registry-mirror registry.k8s.io=http://172.20.0.1:5001  \
    --registry-mirror gcr.io=http://172.20.0.1:5003 \
    --registry-mirror ghcr.io=http://172.20.0.1:5004 \
    --registry-mirror 127.0.0.1:5005=http://172.20.0.1:5005 \
    --install-image=127.0.0.1:5005/siderolabs/installer:<RECORDED HASH from the build step> \
    --controlplanes 3 \
    --workers 2 \
    --with-bootloader=false
  • --provisioner selects QEMU vs. default Docker
  • custom --cidr to make QEMU cluster use different network than default Docker setup (optional)
  • --registry-mirror uses the caching proxies set up above to speed up boot time a lot, last one adds your local registry (installer image was pushed to it)
  • --install-image is the image you built with make installer above
  • --controlplanes & --workers configure cluster size, choose to match your resources; 3 controlplanes give you HA control plane; 1 controlplane is enough, never do 2 controlplanes
  • --with-bootloader=false disables boot from disk (Talos will always boot from _out/vmlinuz-amd64 and _out/initramfs-amd64.xz). This speeds up development cycle a lot - no need to rebuild installer and perform install, rebooting is enough to get new code.

Note: as boot loader is not used, it’s not necessary to rebuild installer each time (old image is fine), but sometimes it’s needed (when configuration changes are done and old installer doesn’t validate the config).

talosctl cluster create derives Talos machine configuration version from the install image tag, so sometimes early in the development cycle (when new minor tag is not released yet), machine config version can be overridden with --talos-version=v1.9.

If the --with-bootloader=false flag is not enabled, for Talos cluster to pick up new changes to the code (in initramfs), it will require a Talos upgrade (so new installer should be built). With --with-bootloader=false flag, Talos always boots from initramfs in _out/ directory, so simple reboot is enough to pick up new code changes.

If the installation flow needs to be tested, --with-bootloader=false shouldn’t be used.

Console Logs

Watching console logs is easy with tail:

tail -F ~/.talos/clusters/talos-default/talos-default-*.log

Interacting with Talos

Once talosctl cluster create finishes successfully, talosconfig and kubeconfig will be set up automatically to point to your cluster.

Start playing with talosctl:

talosctl -n 172.20.0.2 version
talosctl -n 172.20.0.3,172.20.0.4 dashboard
talosctl -n 172.20.0.4 get members

Same with kubectl:

kubectl get nodes -o wide

You can deploy some Kubernetes workloads to the cluster.

You can edit machine config on the fly with talosctl edit mc --immediate, config patches can be applied via --config-patch flags, also many features have specific flags in talosctl cluster create.

Quick Reboot

To reboot whole cluster quickly (e.g. to pick up a change made in the code):

for socket in ~/.talos/clusters/talos-default/talos-default-*.monitor; do echo "q" | sudo socat - unix-connect:$socket; done

Sending q to a single socket allows to reboot a single node.

Note: This command performs immediate reboot (as if the machine was powered down and immediately powered back up), for normal Talos reboot use talosctl reboot.

Development Cycle

Fast development cycle:

  • bring up a cluster
  • make code changes
  • rebuild initramfs with make initramfs
  • reboot a node to pick new initramfs
  • verify code changes
  • more code changes…

Some aspects of Talos development require to enable bootloader (when working on installer itself), in that case quick development cycle is no longer possible, and cluster should be destroyed and recreated each time.

Running Integration Tests

If integration tests were changed (or when running them for the first time), first rebuild the integration test binary:

rm -f  _out/integration-test-linux-amd64; make _out/integration-test-linux-amd64

Running short tests against QEMU provisioned cluster:

_out/integration-test-linux-amd64 \
    -talos.provisioner=qemu \
    -test.v \
    -test.short \
    -talos.talosctlpath=$PWD/_out/talosctl-linux-amd64

Whole test suite can be run removing -test.short flag.

Specfic tests can be run with -test.run=TestIntegration/api.ResetSuite.

Build Flavors

make <something> WITH_RACE=1 enables Go race detector, Talos runs slower and uses more memory, but memory races are detected.

make <something> WITH_DEBUG=1 enables Go profiling and other debug features, useful for local development.

make initramfs WITH_DEBUG_SHELL=true adds bash and minimal utilities for debugging purposes. Combine with --with-debug-shell flag when creating cluster to obtain shell access. This is uncommonly used as in this case the bash shell will run in place of machined.

Destroying Cluster

sudo --preserve-env=HOME ../talos/_out/talosctl-linux-amd64 cluster destroy --provisioner=qemu

This command stops QEMU and helper processes, tears down bridged network on the host, and cleans up cluster state in ~/.talos/clusters.

Note: if the host machine is rebooted, QEMU instances and helpers processes won’t be started back. In that case it’s required to clean up files in ~/.talos/clusters/<cluster-name> directory manually.

Optional

Set up cross-build environment with:

docker run --rm --privileged multiarch/qemu-user-static --reset -p yes

Note: the static qemu binaries which come with Ubuntu 21.10 seem to be broken.

Unit tests

Unit tests can be run in buildx with make unit-tests, on Ubuntu systems some tests using loop devices will fail because Ubuntu uses low-index loop devices for snaps.

Most of the unit-tests can be run standalone as well, with regular go test, or using IDE integration:

go test -v ./internal/pkg/circular/

This provides much faster feedback loop, but some tests require either elevated privileges (running as root) or additional binaries available only in Talos rootfs (containerd tests).

Running tests as root can be done with -exec flag to go test, but this is risky, as test code has root access and can potentially make undesired changes:

go test -exec sudo  -v ./internal/app/machined/pkg/controllers/network/...

Go Profiling

Build initramfs with debug enabled: make initramfs WITH_DEBUG=1.

Launch Talos cluster with bootloader disabled, and use go tool pprof to capture the profile and show the output in your browser:

go tool pprof http://172.20.0.2:9982/debug/pprof/heap

The IP address 172.20.0.2 is the address of the Talos node, and port :9982 depends on the Go application to profile:

  • 9981: apid
  • 9982: machined
  • 9983: trustd

Testing Air-gapped Environments

There is a hidden talosctl debug air-gapped command which launches two components:

  • HTTP proxy capable of proxying HTTP and HTTPS requests
  • HTTPS server with a self-signed certificate

The command also writes down Talos machine configuration patch to enable the HTTP proxy and add a self-signed certificate to the list of trusted certificates:

$ talosctl debug air-gapped --advertised-address 172.20.0.1
2022/08/04 16:43:14 writing config patch to air-gapped-patch.yaml
2022/08/04 16:43:14 starting HTTP proxy on :8002
2022/08/04 16:43:14 starting HTTPS server with self-signed cert on :8001

The --advertised-address should match the bridge IP of the Talos node.

Generated machine configuration patch looks like:

machine:
    files:
        - content: |
            -----BEGIN CERTIFICATE-----
            MIIBijCCAS+gAwIBAgIBATAKBggqhkjOPQQDAjAUMRIwEAYDVQQKEwlUZXN0IE9u
            bHkwHhcNMjIwODA0MTI0MzE0WhcNMjIwODA1MTI0MzE0WjAUMRIwEAYDVQQKEwlU
            ZXN0IE9ubHkwWTATBgcqhkjOPQIBBggqhkjOPQMBBwNCAAQfOJdaOFSOI1I+EeP1
            RlMpsDZJaXjFdoo5zYM5VYs3UkLyTAXAmdTi7JodydgLhty0pwLEWG4NUQAEvip6
            EmzTo3IwcDAOBgNVHQ8BAf8EBAMCBaAwHQYDVR0lBBYwFAYIKwYBBQUHAwEGCCsG
            AQUFBwMCMA8GA1UdEwEB/wQFMAMBAf8wHQYDVR0OBBYEFCwxL+BjG0pDwaH8QgKW
            Ex0J2mVXMA8GA1UdEQQIMAaHBKwUAAEwCgYIKoZIzj0EAwIDSQAwRgIhAJoW0z0D
            JwpjFcgCmj4zT1SbBFhRBUX64PHJpAE8J+LgAiEAvfozZG8Or6hL21+Xuf1x9oh4
            /4Hx3jozbSjgDyHOLk4=
            -----END CERTIFICATE-----            
          permissions: 0o644
          path: /etc/ssl/certs/ca-certificates
          op: append
    env:
        http_proxy: http://172.20.0.1:8002
        https_proxy: http://172.20.0.1:8002
        no_proxy: 172.20.0.1/24
cluster:
    extraManifests:
        - https://172.20.0.1:8001/debug.yaml

The first section appends a self-signed certificate of the HTTPS server to the list of trusted certificates, followed by the HTTP proxy setup (in-cluster traffic is excluded from the proxy). The last section adds an extra Kubernetes manifest hosted on the HTTPS server.

The machine configuration patch can now be used to launch a test Talos cluster:

talosctl cluster create ... --config-patch @air-gapped-patch.yaml

The following lines should appear in the output of the talosctl debug air-gapped command:

  • CONNECT discovery.talos.dev:443: the HTTP proxy is used to talk to the discovery service
  • http: TLS handshake error from 172.20.0.2:53512: remote error: tls: bad certificate: an expected error on Talos side, as self-signed cert is not written yet to the file
  • GET /debug.yaml: Talos successfully fetches the extra manifest successfully

There might be more output depending on the registry caches being used or not.

Running Upgrade Integration Tests

Talos has a separate set of provision upgrade tests, which create a cluster on older versions of Talos, perform an upgrade, and verify that the cluster is still functional.

Build the test binary:

rm -f  _out/integration-test-provision-linux-amd64; make _out/integration-test-provision-linux-amd64

Prepare the test artifacts for the upgrade test:

make release-artifacts

Build and push an installer image for the development version of Talos:

make installer IMAGE_REGISTRY=127.0.0.1:5005 PUSH=true

Run the tests (the tests will create the cluster on the older version of Talos, perform an upgrade, and verify that the cluster is still functional):

sudo --preserve-env=HOME _out/integration-test-provision-linux-amd64 \
    -test.v \
    -talos.talosctlpath _out/talosctl-linux-amd64 \
    -talos.provision.target-installer-registry=127.0.0.1:5005 \
    -talos.provision.registry-mirror 127.0.0.1:5005=http://172.20.0.1:5005,docker.io=http://172.20.0.1:5000,registry.k8s.io=http://172.20.0.1:5001,quay.io=http://172.20.0.1:5002,gcr.io=http://172.20.0.1:5003,ghcr.io=http://172.20.0.1:5004 \
    -talos.provision.cidr 172.20.0.0/24

8 - Disaster Recovery

Procedure for snapshotting etcd database and recovering from catastrophic control plane failure.

etcd database backs Kubernetes control plane state, so if the etcd service is unavailable, the Kubernetes control plane goes down, and the cluster is not recoverable until etcd is recovered. etcd builds around the consensus protocol Raft, so highly-available control plane clusters can tolerate the loss of nodes so long as more than half of the members are running and reachable. For a three control plane node Talos cluster, this means that the cluster tolerates a failure of any single node, but losing more than one node at the same time leads to complete loss of service. Because of that, it is important to take routine backups of etcd state to have a snapshot to recover the cluster from in case of catastrophic failure.

Backup

Snapshotting etcd Database

Create a consistent snapshot of etcd database with talosctl etcd snapshot command:

$ talosctl -n <IP> etcd snapshot db.snapshot
etcd snapshot saved to "db.snapshot" (2015264 bytes)
snapshot info: hash c25fd181, revision 4193, total keys 1287, total size 3035136

Note: filename db.snapshot is arbitrary.

This database snapshot can be taken on any healthy control plane node (with IP address <IP> in the example above), as all etcd instances contain exactly same data. It is recommended to configure etcd snapshots to be created on some schedule to allow point-in-time recovery using the latest snapshot.

Disaster Database Snapshot

If the etcd cluster is not healthy (for example, if quorum has already been lost), the talosctl etcd snapshot command might fail. In that case, copy the database snapshot directly from the control plane node:

talosctl -n <IP> cp /var/lib/etcd/member/snap/db .

This snapshot might not be fully consistent (if the etcd process is running), but it allows for disaster recovery when latest regular snapshot is not available.

Machine Configuration

Machine configuration might be required to recover the node after hardware failure. Backup Talos node machine configuration with the command:

talosctl -n IP get mc v1alpha1 -o yaml | yq eval '.spec' -

Recovery

Before starting a disaster recovery procedure, make sure that etcd cluster can’t be recovered:

  • get etcd cluster member list on all healthy control plane nodes with talosctl -n IP etcd members command and compare across all members.
  • query etcd health across control plane nodes with talosctl -n IP service etcd.

If the quorum can be restored, restoring quorum might be a better strategy than performing full disaster recovery procedure.

Latest Etcd Snapshot

Get hold of the latest etcd database snapshot. If a snapshot is not fresh enough, create a database snapshot (see above), even if the etcd cluster is unhealthy.

Init Node

Make sure that there are no control plane nodes with machine type init:

$ talosctl -n <IP1>,<IP2>,... get machinetype
NODE         NAMESPACE   TYPE          ID             VERSION   TYPE
172.20.0.2   config      MachineType   machine-type   2         controlplane
172.20.0.4   config      MachineType   machine-type   2         controlplane
172.20.0.3   config      MachineType   machine-type   2         controlplane

Init node type is deprecated, and are incompatible with etcd recovery procedure. init node can be converted to controlplane type with talosctl edit mc --mode=staged command followed by node reboot with talosctl reboot command.

Preparing Control Plane Nodes

If some control plane nodes experienced hardware failure, replace them with new nodes.

Use machine configuration backup to re-create the nodes with the same secret material and control plane settings to allow workers to join the recovered control plane.

If a control plane node is up but etcd isn’t, wipe the node’s EPHEMERAL partition to remove the etcd data directory (make sure a database snapshot is taken before doing this):

talosctl -n <IP> reset --graceful=false --reboot --system-labels-to-wipe=EPHEMERAL

At this point, all control plane nodes should boot up, and etcd service should be in the Preparing state.

The Kubernetes control plane endpoint should be pointed to the new control plane nodes if there were changes to the node addresses.

Recovering from the Backup

Make sure all etcd service instances are in Preparing state:

$ talosctl -n <IP> service etcd
NODE     172.20.0.2
ID       etcd
STATE    Preparing
HEALTH   ?
EVENTS   [Preparing]: Running pre state (17s ago)
         [Waiting]: Waiting for service "cri" to be "up", time sync (18s ago)
         [Waiting]: Waiting for service "cri" to be "up", service "networkd" to be "up", time sync (20s ago)

Execute the bootstrap command against any control plane node passing the path to the etcd database snapshot:

$ talosctl -n <IP> bootstrap --recover-from=./db.snapshot
recovering from snapshot "./db.snapshot": hash c25fd181, revision 4193, total keys 1287, total size 3035136

Note: if database snapshot was copied out directly from the etcd data directory using talosctl cp, add flag --recover-skip-hash-check to skip integrity check on restore.

Talos node should print matching information in the kernel log:

recovering etcd from snapshot: hash c25fd181, revision 4193, total keys 1287, total size 3035136
{"level":"info","msg":"restoring snapshot","path":"/var/lib/etcd.snapshot","wal-dir":"/var/lib/etcd/member/wal","data-dir":"/var/lib/etcd","snap-dir":"/var/li}
{"level":"info","msg":"restored last compact revision","meta-bucket-name":"meta","meta-bucket-name-key":"finishedCompactRev","restored-compact-revision":3360}
{"level":"info","msg":"added member","cluster-id":"a3390e43eb5274e2","local-member-id":"0","added-peer-id":"eb4f6f534361855e","added-peer-peer-urls":["https:/}
{"level":"info","msg":"restored snapshot","path":"/var/lib/etcd.snapshot","wal-dir":"/var/lib/etcd/member/wal","data-dir":"/var/lib/etcd","snap-dir":"/var/lib/etcd/member/snap"}

Now etcd service should become healthy on the bootstrap node, Kubernetes control plane components should start and control plane endpoint should become available. Remaining control plane nodes join etcd cluster once control plane endpoint is up.

Single Control Plane Node Cluster

This guide applies to the single control plane clusters as well. In fact, it is much more important to take regular snapshots of the etcd database in single control plane node case, as loss of the control plane node might render the whole cluster irrecoverable without a backup.

9 - Egress Domains

Allowing outbound access for installing Talos

For some more constrained environments, it is important to whitelist only specific domains for outbound internet access. These rules will need to be updated to allow for certain domains if the user wishes to still install and bootstrap Talos from public sources. That said, users should also note that all of the following components can be mirrored locally with an internal registry, as well as a self-hosted discovery service and image factory.

The following list of egress domains was tested using a Fortinet FortiGate Next-Generation Firewall to confirm that Talos was installed, bootstrapped, and Kubernetes was fully up and running. The FortiGate allows for passing in wildcard domains and will handle resolution of those domains to defined IPs automatically. All traffic is HTTPS over port 443.

Discovery Service:

  • discovery.talos.dev

Image Factory:

  • factory.talos.dev
  • *.azurefd.net (Azure Front Door for serving cached assets)

Google Container Registry / Google Artifact Registry (GCR/GAR):

  • gcr.io
  • storage.googleapis.com (backing blob storage for images)
  • *.pkg.dev (backing blob storage for images)

Github Container Registry (GHCR)

  • ghcr.io
  • *.githubusercontent.com (backing blob storage for images)

Kubernetes Registry (k8s.io)

  • registry.k8s.io
  • *.s3.dualstack.us-east-1.amazonaws.com (backing blob storage for images)

Note: In this testing, DNS and NTP servers were updated to use those services that are built-in to the FortiGate. These may also need to be allowed if the user cannot make use of internal services. Additionally,these rules only cover that which is required for Talos to be fully installed and running. There may be other domains like docker.io that must be allowed for non-default CNIs or workload container images.

10 - etcd Maintenance

Operational instructions for etcd database.

etcd database backs Kubernetes control plane state, so etcd health is critical for Kubernetes availability.

Space Quota

etcd default database space quota is set to 2 GiB by default. If the database size exceeds the quota, etcd will stop operations until the issue is resolved.

This condition can be checked with talosctl etcd alarm list command:

$ talosctl -n <IP> etcd alarm list
NODE         MEMBER             ALARM
172.20.0.2   a49c021e76e707db   NOSPACE

If the Kubernetes database contains lots of resources, space quota can be increased to match the actual usage. The recommended maximum size is 8 GiB.

To increase the space quota, edit the etcd section in the machine configuration:

cluster:
  etcd:
    extraArgs:
      quota-backend-bytes: 4294967296 # 4 GiB

Once the node is rebooted with the new configuration, use talosctl etcd alarm disarm to clear the NOSPACE alarm.

Defragmentation

etcd database can become fragmented over time if there are lots of writes and deletes. Kubernetes API server performs automatic compaction of the etcd database, which marks deleted space as free and ready to be reused. However, the space is not actually freed until the database is defragmented.

If the database is heavily fragmented (in use/db size ratio is less than 0.5), defragmentation might increase the performance. If the database runs over the space quota (see above), but the actual in use database size is small, defragmentation is required to bring the on-disk database size below the limit.

Current database size can be checked with talosctl etcd status command:

$ talosctl -n <CP1>,<CP2>,<CP3> etcd status
NODE         MEMBER             DB SIZE   IN USE            LEADER             RAFT INDEX   RAFT TERM   RAFT APPLIED INDEX   LEARNER   ERRORS
172.20.0.3   ecebb05b59a776f1   21 MB     6.0 MB (29.08%)   ecebb05b59a776f1   53391        4           53391                false
172.20.0.2   a49c021e76e707db   17 MB     4.5 MB (26.10%)   ecebb05b59a776f1   53391        4           53391                false
172.20.0.4   eb47fb33e59bf0e2   20 MB     5.9 MB (28.96%)   ecebb05b59a776f1   53391        4           53391                false

If any of the nodes are over database size quota, alarms will be printed in the ERRORS column.

To defragment the database, run talosctl etcd defrag command:

talosctl -n <CP1> etcd defrag

Note: defragmentation is a resource-intensive operation, so it is recommended to run it on a single node at a time. Defragmentation to a live member blocks the system from reading and writing data while rebuilding its state.

Once the defragmentation is complete, the database size will match closely to the in use size:

$ talosctl -n <CP1> etcd status
NODE         MEMBER             DB SIZE   IN USE             LEADER             RAFT INDEX   RAFT TERM   RAFT APPLIED INDEX   LEARNER   ERRORS
172.20.0.2   a49c021e76e707db   4.5 MB    4.5 MB (100.00%)   ecebb05b59a776f1   56065        4           56065                false

Snapshotting

Regular backups of etcd database should be performed to ensure that the cluster can be restored in case of a failure. This procedure is described in the disaster recovery guide.

11 - Extension Services

Use extension services in Talos Linux.

Talos provides a way to run additional system services early in the Talos boot process. Extension services should be included into the Talos root filesystem (e.g. using system extensions). Extension services run as privileged containers with ephemeral root filesystem located in the Talos root filesystem.

Extension services can be used to use extend core features of Talos in a way that is not possible via static pods or Kubernetes DaemonSets.

Potential extension services use-cases:

  • storage: Open iSCSI, software RAID, etc.
  • networking: BGP FRR, etc.
  • platform integration: VMWare open VM tools, etc.

Configuration

Talos on boot scans directory /usr/local/etc/containers for *.yaml files describing the extension services to run. Format of the extension service config:

name: hello-world
container:
  entrypoint: ./hello-world
  environment:
    - XDG_RUNTIME_DIR=/run
  args:
     - -f
  mounts:
     - # OCI Mount Spec
depends:
   - configuration: true
   - service: cri
   - path: /run/machined/machined.sock
   - network:
       - addresses
       - connectivity
       - hostname
       - etcfiles
   - time: true
restart: never|always|untilSuccess
logToConsole: true|false

name

Field name sets the service name, valid names are [a-z0-9-_]+. The service container root filesystem path is derived from the name: /usr/local/lib/containers/<name>. The extension service will be registered as a Talos service under an ext-<name> identifier.

container

  • entrypoint defines the container entrypoint relative to the container root filesystem (/usr/local/lib/containers/<name>)
  • environmentFile (deprecated) defines the path to a file containing environment variables, the service waits for the file to exist before starting. Use ExtensionServiceConfig instead.
  • environment defines the container environment variables.
  • args defines the additional arguments to pass to the entrypoint
  • mounts defines the volumes to be mounted into the container root

container.mounts

The section mounts uses the standard OCI spec:

- source: /var/log/audit
  destination: /var/log/audit
  type: bind
  options:
    - rshared
    - bind
    - ro

All requested directories will be mounted into the extension service container mount namespace. If the source directory doesn’t exist in the host filesystem, it will be created (only for writable paths in the Talos root filesystem).

container.security

The section security follows this example:

maskedPaths:
  - "/should/be/masked"
readonlyPaths:
  - "/path/that/should/be/readonly"
  - "/another/readonly/path"
writeableRootfs: true
writeableSysfs: true
rootfsPropagation: shared
  • The rootfs is readonly by default unless writeableRootfs: true is set.
  • The sysfs is readonly by default unless writeableSysfs: true is set.
  • Masked paths if not set defaults to containerd defaults. Masked paths will be mounted to /dev/null. To set empty masked paths use:
container:
  security:
    maskedPaths: []
  • Read Only paths if not set defaults to containerd defaults. Read-only paths will be mounted to /dev/null. To set empty read only paths use:
container:
  security:
    readonlyPaths: []
  • Rootfs propagation is not set by default (container mounts are private).

depends

The depends section describes extension service start dependencies: the service will not be started until all dependencies are met.

Available dependencies:

  • service: <name>: wait for the service <name> to be running and healthy
  • path: <path>: wait for the <path> to exist
  • network: [addresses, connectivity, hostname, etcfiles]: wait for the specified network readiness checks to succeed
  • time: true: wait for the NTP time sync
  • configuration: true: wait for ExtensionServiceConfig resource with a name matching the extension name to be available. The mounts specified in the ExtensionServiceConfig will be added as extra mounts to the extension service.

restart

Field restart defines the service restart policy, it allows to either configure an always running service or a one-shot service:

  • always: restart service always
  • never: start service only once and never restart
  • untilSuccess: restart failing service, stop restarting on successful run

logToConsole

Field logToConsole defines whether the service logs should also be written to the console, i.e., to kernel log buffer (or to the container logs in container mode).

This feature is particularly useful for debugging extensions that operate in maintenance mode or early in the boot process when service logs cannot be accessed yet.

Example

Example layout of the Talos root filesystem contents for the extension service:

/
└── usr
    └── local
        ├── etc
        │   └── containers
        │       └── hello-world.yaml
        └── lib
            └── containers
                └── hello-world
                    ├── hello
                    └── config.ini

Talos discovers the extension service configuration in /usr/local/etc/containers/hello-world.yaml:

name: hello-world
container:
  entrypoint: ./hello
  args:
    - --config
    - config.ini
depends:
  - network:
    - addresses
restart: always

Talos starts the container for the extension service with container root filesystem at /usr/local/lib/containers/hello-world:

/
├── hello
└── config.ini

Extension service is registered as ext-hello-world in talosctl services:

$ talosctl service ext-hello-world
NODE     172.20.0.5
ID       ext-hello-world
STATE    Running
HEALTH   ?
EVENTS   [Running]: Started task ext-hello-world (PID 1100) for container ext-hello-world (2m47s ago)
         [Preparing]: Creating service runner (2m47s ago)
         [Preparing]: Running pre state (2m47s ago)
         [Waiting]: Waiting for service "containerd" to be "up" (2m48s ago)
         [Waiting]: Waiting for service "containerd" to be "up", network (2m49s ago)

An extension service can be started, restarted and stopped using talosctl service ext-hello-world start|restart|stop. Use talosctl logs ext-hello-world to get the logs of the service.

Complete example of the extension service can be found in the extensions repository.

12 - Install KubeVirt on Talos

This is a guide on how to get started with KubeVirt on Talos

KubeVirt allows you to run virtual machines on Kubernetes. It runs with QEMU and KVM to provide a seamless virtual machine experience and can be mixed with containerized workloads. This guide explains on how to install KubeVirt on Talos.

Prerequisites

For KubeVirt and Talos to work you have to enable certain configurations in the BIOS and configure Talos properly for it to work.

Enable virtualization in your BIOS

On many new PCs and servers, virtualization is enabled by default. Please consult your manufacturer on how to enable this in the BIOS. You can also run KubeVirt from within a virtual machine. For that to work you have to enable Nested Virtualization. This can also be done in the BIOS.

Configure your network interface in bridge mode (optional)

When you want to leverage Multus to give your virtual machines direct access to your node network, your bridge needs to be configured properly. This can be done by setting your network interface in bridge mode. You can look up the network interface name by using the following command:

$ talosctl get links -n 10.99.101.9
NODE          NAMESPACE   TYPE         ID             VERSION   TYPE       KIND     HW ADDR                                           OPER STATE   LINK STATE
10.99.101.9   network     LinkStatus   bond0          1         ether      bond     52:62:01:53:5b:a7                                 down         false
10.99.101.9   network     LinkStatus   br0            3         ether      bridge   bc:24:11:a1:98:fc                                 up           true
10.99.101.9   network     LinkStatus   cni0           9         ether      bridge   1e:5e:99:8f:1e:19                                 up           true
10.99.101.9   network     LinkStatus   dummy0         1         ether      dummy    62:1c:3e:d5:72:11                                 down         false
10.99.101.9   network     LinkStatus   eth0           5         ether               bc:24:11:a1:98:fc

In this case, this network interface is called eth0. Now you can configure your bridge properly. This can be done in the machine config of your node:

machine:
      interfaces:
      - interface: br0
        addresses:
          - 10.99.101.9/24
        bridge:
          stp:
            enabled: true
          interfaces:
              - eth0 # This must be changed to your matching interface name
        routes:
            - network: 0.0.0.0/0 # The route's network (destination).
              gateway: 10.99.101.254 # The route's gateway (if empty, creates link scope route).
              metric: 1024 # The optional metric for the route.

Install the local-path-provisioner

When we are using KubeVirt, we are also installing the CDI (containerized data importer) operator. For this to work properly, we have to install the local-path-provisioner. This CNI can be used to write scratch space when importing images with the CDI.

You can install the local-path-provisioner by following this guide.

Configure storage

If you would like to use features such as LiveMigration shared storage is neccesary. You can either choose to install a CSI that connects to NFS or you can install Longhorn, for example. For more information on how to install Longhorn on Talos you can follow this link.

To install the NFS-CSI driver, you can follow This guide.

After the installation of the NFS-CSI driver is done, you can create a storage class for the NFS CSI driver to work:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-csi
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: nfs.csi.k8s.io
parameters:
  server: 10.99.102.253
  share: /mnt/data/nfs/kubernetes_csi
reclaimPolicy: Delete
volumeBindingMode: Immediate
mountOptions:
  - nfsvers=3
  - nolock

Note that this is just an example. Make sure to set the nolock option. If not, the nfs-csi storageclass won’t work, because talos doesn’t have a rpc.statd daemon running.

Install virtctl

virtctl is needed for communication between the CLI and the KubeVirt api server.

You can install the virtctl client directly by running:

export VERSION=$(curl https://storage.googleapis.com/kubevirt-prow/release/kubevirt/kubevirt/stable.txt)
wget https://github.com/kubevirt/kubevirt/releases/download/${VERSION}/virtctl-${VERSION}-linux-amd64

Or you can use krew to integrate it nicely in kubectl:

kubectl krew install virt

Installing KubeVirt

After the neccesary preperations are done, you can now install KubeVirt. This can either be done through the Operator Lifecycle Manager or by just simply applying a YAML file. We will keep this simple and do the following:

# Point at latest release
export RELEASE=$(curl https://storage.googleapis.com/kubevirt-prow/release/kubevirt/kubevirt/stable.txt)
# Deploy the KubeVirt operator
kubectl apply -f https://github.com/kubevirt/kubevirt/releases/download/${RELEASE}/kubevirt-operator.yaml

After the operator is installed, it is time to apply the Custom Resource (CR) for the operator to fully deploy KubeVirt.

---
apiVersion: kubevirt.io/v1
kind: KubeVirt
metadata:
  name: kubevirt
  namespace: kubevirt
spec:
  configuration:
    developerConfiguration:
      featureGates:
        - LiveMigration
        - NetworkBindingPlugins
    smbios:
      sku: "TalosCloud"
      version: "v0.1.0"
      manufacturer: "Talos Virtualization"
      product: "talosvm"
      family: "ccio"
  workloadUpdateStrategy:
    workloadUpdateMethods:
    - LiveMigrate # enable if you have deployed either Longhorn or NFS-CSI for shared storage.

KubeVirt configuration options

In this yaml file we specified certain configurations:

featureGates

KubeVirt has a set of features that are not mature enough to be enabled by default. As such, they are protected by a Kubernetes concept called feature gates. More information about the feature gates can be found in the KubeVirt documentation.

In this example we enable:

  • LiveMigration – For live migration of virtual machines to other nodes
  • NetworkBindingPlugins – This is needed for Multus to work.

smbios

Here we configure a specific smbios configuration. This can be useful when you want to give your virtual machines a own sku, manufacturer name etc.

workloadUpdateStrategy

If this is configured, virtual machines will be live migrated to other nodes when KubeVirt is updated.

Installing CDI

The CDI (containerized data importer) is needed to import virtual disk images in your KubeVirt cluster. The CDI can do the following:

  • Import images of type:
    • qcow2
    • raw
    • iso
  • Import disks from either:
    • http/https
    • uploaded through virtctl
    • Container registry
    • Another PVC

You can either import these images by creating a DataVolume CR or by integrating this in your VirtualMachine CR.

When applying either the DataVolume CR or the VirtualMachine CR with a dataVolumeTemplates, the CDI kicks in and will do the following:

  • creates a PVC with the requirements from either the DataVolume or the dataVolumeTemplates
  • starts a pod
  • writes temporary scratch space to local disk
  • downloads the image
  • extracts it to the temporary scratch space
  • copies the image to the PVC

Installing the CDI is very simple:

# Point to latest release
export TAG=$(curl -s -w %{redirect_url} \
https://github.com/kubevirt/containerized-data-importer/releases/latest)

export VERSION=$(echo ${TAG##*/})

# install operator
kubectl create -f \
https://github.com/kubevirt/containerized-data-importer/releases/download/$VERSION/cdi-operator.yaml

After that, you can apply a CDI CR for the CDI operator to fully deploy CDI:

apiVersion: cdi.kubevirt.io/v1beta1
kind: CDI
metadata:
  name: cdi
spec:
  config:
    scratchSpaceStorageClass: local-path
    podResourceRequirements:
      requests:
        cpu: "100m"
        memory: "60M"
      limits:
        cpu: "750m"
        memory: "2Gi"

This CR has some special settings that are needed for CDI to work properly:

scratchSpaceStorageClass

This is the storage class that we installed earlier with the local-path-provisioner. This is needed for the CDI to write scratch space to local disk before importing the image

podResourceRequirements

In many cases the default resource requests and limits are not sufficient for the importer pod to import the image. This will result in a crash of the importer pod.

After applying this yaml file, the CDI operator is ready.

Creating your first virtual machine

Now it is time to create your first virtual machine in KubeVirt. Below we will describe two examples:

  • A virtual machine with the default CNI
  • A virtual machine with Multus

Basic virtual machine example with default CNI

---
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: fedora-vm
spec:
  running: false
  template:
    metadata:
      labels:
        kubevirt.io/vm: fedora-vm
      annotations:
        kubevirt.io/allow-pod-bridge-network-live-migration: "true"

    spec:
      evictionStrategy: LiveMigrate
      domain:
        cpu:
          cores: 2
        resources:
          requests:
            memory: 4G
        devices:
          disks:
            - name: fedora-vm-pvc
              disk:
                bus: virtio
            - name: cloudinitdisk
              disk:
                bus: virtio
          interfaces:
          - name: podnet
            masquerade: {}
      networks:
        - name: podnet
          pod: {}
      volumes:
        - name: fedora-vm-pvc
          persistentVolumeClaim:
            claimName: fedora-vm-pvc
        - name: cloudinitdisk
          cloudInitNoCloud:
            networkData: |
              network:
                version: 1
                config:
                  - type: physical
                    name: eth0
                    subnets:
                      - type: dhcp              
            userData: |-
              #cloud-config
              users:
                - name: cloud-user
                  ssh_authorized_keys:
                    - ssh-rsa ....
                  sudo: ['ALL=(ALL) NOPASSWD:ALL']
                  groups: sudo
                  shell: /bin/bash
              runcmd:
                - "sudo touch /root/installed"
                - "sudo dnf update"
                - "sudo dnf install httpd fastfetch -y"
                - "sudo systemctl daemon-reload"
                - "sudo systemctl enable httpd"
                - "sudo systemctl start --no-block httpd"              

  dataVolumeTemplates:
  - metadata:
      name: fedora-vm-pvc
    spec:
      storage:
        resources:
          requests:
            storage: 35Gi
        accessModes:
          - ReadWriteMany
        storageClassName: "nfs-csi"
      source:
        http:
          url: "https://fedora.mirror.wearetriple.com/linux/releases/40/Cloud/x86_64/images/Fedora-Cloud-Base-Generic.x86_64-40-1.14.qcow2"

In this examples we install a basic Fedora 40 virtual machine and install a webserver.

After applying this YAML, the CDI will import the image and create a Datavolume. You can monitor this process by running:

kubectl get dv -w

After the DataVolume is created, you can start the virtual machine:

kubectl virt start fedora-vm

By starting the virtual machine, KubeVirt will create a instance of that VirtualMachine called VirtualMachineInstance:

kubectl get virtualmachineinstance
NAME        AGE   PHASE     IP            NODENAME   READY
fedora-vm   13s   Running   10.244.4.92   kube1      True

You can view the console of the virtual machine by running:

kubectl virt console fedora-vm

or by running:

kubectl virt vnc fedora-vm

with the console command it will open a terminal to the virtual machine. With the vnc command, it will open vncviewer. Note that a vncviewer needs to installed for it to work.

Now you can create a Service object to expose the virtual machine to the outside. In this example we will use MetalLB as a LoadBalancer.

apiVersion: v1
kind: Service
metadata:
  labels:
    kubevirt.io/vm: fedora-vm
  name: fedora-vm
spec:
  ipFamilyPolicy: PreferDualStack
  externalTrafficPolicy: Local
  ports:
  - name: ssh
    port: 22
    protocol: TCP
    targetPort: 22
  - name: httpd
    port: 80
    protocol: TCP
    targetPort: 80
  selector:
    kubevirt.io/vm: fedora-vm
  type: LoadBalancer
$ kubectl get svc
NAME             TYPE           CLUSTER-IP     EXTERNAL-IP                        PORT(S)                     AGE
fedora-vm        LoadBalancer   10.96.14.253   10.99.50.1                         22:31149/TCP,80:31445/TCP   2s

And we can reach the server with either ssh or http:

$ nc -zv 10.99.50.1 22
Ncat: Version 7.92 ( https://nmap.org/ncat )
Ncat: Connected to 10.99.50.1:22.
Ncat: 0 bytes sent, 0 bytes received in 0.01 seconds.

$ nc -zv 10.99.50.1 80
Ncat: Version 7.92 ( https://nmap.org/ncat )
Ncat: Connected to 10.99.50.1:80.
Ncat: 0 bytes sent, 0 bytes received in 0.01 seconds.

Basic virtual machine example with Multus

---
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: fedora-vm
spec:
  running: false
  template:
    metadata:
      labels:
        kubevirt.io/vm: fedora-vm
      annotations:
        kubevirt.io/allow-pod-bridge-network-live-migration: "true"

    spec:
      evictionStrategy: LiveMigrate
      domain:
        cpu:
          cores: 2
        resources:
          requests:
            memory: 4G
        devices:
          disks:
            - name: fedora-vm-pvc
              disk:
                bus: virtio
            - name: cloudinitdisk
              disk:
                bus: virtio
          interfaces:
          - name: external
            bridge: {} # We use the bridge interface.
      networks:
        - name: external
          multus:
            networkName: namespace/networkattachmentdefinition # This is the NetworkAttachmentDefinition. See multus docs for more info.
      volumes:
        - name: fedora-vm-pvc
          persistentVolumeClaim:
            claimName: fedora-vm-pvc
        - name: cloudinitdisk
          cloudInitNoCloud:
            networkData: |
              network:
                version: 1
                config:
                  - type: physical
                    name: eth0
                    subnets:
                      - type: dhcp              
            userData: |-
              #cloud-config
              users:
                - name: cloud-user
                  ssh_authorized_keys:
                    - ssh-rsa ....
                  sudo: ['ALL=(ALL) NOPASSWD:ALL']
                  groups: sudo
                  shell: /bin/bash
              runcmd:
                - "sudo touch /root/installed"
                - "sudo dnf update"
                - "sudo dnf install httpd fastfetch -y"
                - "sudo systemctl daemon-reload"
                - "sudo systemctl enable httpd"
                - "sudo systemctl start --no-block httpd"              

  dataVolumeTemplates:
  - metadata:
      name: fedora-vm-pvc
    spec:
      storage:
        resources:
          requests:
            storage: 35Gi
        accessModes:
          - ReadWriteMany
        storageClassName: "nfs-csi"
      source:
        http:
          url: "https://fedora.mirror.wearetriple.com/linux/releases/40/Cloud/x86_64/images/Fedora-Cloud-Base-Generic.x86_64-40-1.14.qcow2"

In this example we will create a virtual machine that is bound to the bridge interface with the help of Multus. You can start the machine with kubectl virt start fedora-vm. After that you can look up the ip address of the virtual machine with

kubectl get vmi -owide

NAME        AGE    PHASE     IP            NODENAME   READY   LIVE-MIGRATABLE   PAUSED
fedora-vm   6d9h   Running   10.99.101.53   kube1      True    True

Other forms of management

There is a project called KubeVirt-Manager for managing virtual machines with KubeVirt through a nice web interface. You can also choose to deploy virtual machines with ArgoCD of Flux.

Documentation

KubeVirt has a huge documentation page where you can check out everything on running virtual machines with KubeVirt. The documentation can be found here.

13 - Machine Configuration OAuth2 Authentication

How to authenticate Talos machine configuration download (talos.config=) on metal platform using OAuth.

Talos Linux when running on the metal platform can be configured to authenticate the machine configuration download using OAuth2 device flow. The machine configuration is fetched from the URL specified with talos.config kernel argument, and by default this HTTP request is not authenticated. When the OAuth2 authentication is enabled, Talos will authenticate the request using OAuth device flow first, and then pass the token to the machine configuration download endpoint.

Prerequisites

Obtain the following information:

  • OAuth client ID (mandatory)
  • OAuth client secret (optional)
  • OAuth device endpoint
  • OAuth token endpoint
  • OAuth scopes, audience (optional)
  • OAuth client secret (optional)
  • extra Talos variables to send to the device auth endpoint (optional)

Configuration

Set the following kernel parameters on the initial Talos boot to enable the OAuth flow:

  • talos.config set to the URL of the machine configuration endpoint (which will be authenticated using OAuth)
  • talos.config.oauth.client_id set to the OAuth client ID (required)
  • talos.config.oauth.client_secret set to the OAuth client secret (optional)
  • talos.config.oauth.scope set to the OAuth scopes (optional, repeat the parameter for multiple scopes)
  • talos.config.oauth.audience set to the OAuth audience (optional)
  • talos.config.oauth.device_auth_url set to the OAuth device endpoint (if not set defaults to talos.config URL with the path /device/code)
  • talos.config.oauth.token_url set to the OAuth token endpoint (if not set defaults to talos.config URL with the path /token)
  • talos.config.oauth.extra_variable set to the extra Talos variables to send to the device auth endpoint (optional, repeat the parameter for multiple variables)

The list of variables supported by the talos.config.oauth.extra_variable parameter is same as the list of variables supported by the talos.config parameter.

Flow

On the initial Talos boot, when machine configuration is not available, Talos will print the following messages:

[talos] downloading config {"component": "controller-runtime", "controller": "config.AcquireController", "platform": "metal"}
[talos] waiting for network to be ready
[talos] [OAuth] starting the authentication device flow with the following settings:
[talos] [OAuth]  - client ID: "<REDACTED>"
[talos] [OAuth]  - device auth URL: "https://oauth2.googleapis.com/device/code"
[talos] [OAuth]  - token URL: "https://oauth2.googleapis.com/token"
[talos] [OAuth]  - extra variables: ["uuid" "mac"]
[talos] waiting for variables: [uuid mac]
[talos] waiting for variables: [mac]
[talos] [OAuth] please visit the URL https://www.google.com/device and enter the code <REDACTED>
[talos] [OAuth] waiting for the device to be authorized (expires at 14:46:55)...

If the OAuth service provides the complete verification URL, the QR code to scan is also printed to the console:

[talos] [OAuth] or scan the following QR code:
█████████████████████████████████
█████████████████████████████████
████ ▄▄▄▄▄ ██▄▀▀    ▀█ ▄▄▄▄▄ ████
████ █   █ █▄  ▀▄██▄██ █   █ ████
████ █▄▄▄█ ██▀▄██▄  ▀█ █▄▄▄█ ████
████▄▄▄▄▄▄▄█ ▀ █ ▀ █▄█▄▄▄▄▄▄▄████
████   ▀ ▄▄ ▄█  ██▄█   ███▄█▀████
████▀█▄  ▄▄▀▄▄█▀█▄██ ▄▀▄██▄ ▄████
████▄██▀█▄▄▄███▀ ▀█▄▄  ██ █▄ ████
████▄▀▄▄▄ ▄███ ▄ ▀ ▀▀▄▀▄▀█▄ ▄████
████▄█████▄█  █ ██ ▀ ▄▄▄  █▀▀████
████ ▄▄▄▄▄ █ █ ▀█▄█▄ █▄█  █▄ ████
████ █   █ █▄ ▄▀ ▀█▀▄▄▄   ▀█▄████
████ █▄▄▄█ █ ██▄ ▀  ▀███ ▀█▀▄████
████▄▄▄▄▄▄▄█▄▄█▄██▄▄▄▄█▄███▄▄████
█████████████████████████████████

Once the authentication flow is complete on the OAuth provider side, Talos will print the following message:

[talos] [OAuth] device authorized
[talos] fetching machine config from: "http://example.com/config.yaml"
[talos] machine config loaded successfully {"component": "controller-runtime", "controller": "config.AcquireController", "sources": ["metal"]}

14 - Metal Network Configuration

How to use META-based network configuration on Talos metal platform.

Note: This is an advanced feature which requires deep understanding of Talos and Linux network configuration.

Talos Linux when running on a cloud platform (e.g. AWS or Azure), uses the platform-provided metadata server to provide initial network configuration to the node. When running on bare-metal, there is no metadata server, so there are several options to provide initial network configuration (before machine configuration is acquired):

  • use automatic network configuration via DHCP (Talos default)
  • use initial boot kernel command line parameters to configure networking
  • use automatic network configuration via DHCP just enough to fetch machine configuration and then use machine configuration to set desired advanced configuration.

If DHCP option is available, it is by far the easiest way to configure networking. The initial boot kernel command line parameters are not very flexible, and they are not persisted after initial Talos installation.

Talos starting with version 1.4.0 offers a new option to configure networking on bare-metal: META-based network configuration.

Note: META-based network configuration is only available on Talos Linux metal platform.

Talos dashboard provides a way to configure META-based network configuration for a machine using the console, but it doesn’t support all kinds of network configuration.

Network Configuration Format

Talos META-based network configuration is a YAML file with the following format:

addresses:
    - address: 147.75.61.43/31
      linkName: bond0
      family: inet4
      scope: global
      flags: permanent
      layer: platform
    - address: 2604:1380:45f2:6c00::1/127
      linkName: bond0
      family: inet6
      scope: global
      flags: permanent
      layer: platform
    - address: 10.68.182.1/31
      linkName: bond0
      family: inet4
      scope: global
      flags: permanent
      layer: platform
links:
    - name: eth0
      up: true
      masterName: bond0
      slaveIndex: 0
      layer: platform
    - name: eth1
      up: true
      masterName: bond0
      slaveIndex: 1
      layer: platform
    - name: bond0
      logical: true
      up: true
      mtu: 0
      kind: bond
      type: ether
      bondMaster:
        mode: 802.3ad
        xmitHashPolicy: layer3+4
        lacpRate: slow
        arpValidate: none
        arpAllTargets: any
        primaryReselect: always
        failOverMac: 0
        miimon: 100
        updelay: 200
        downdelay: 200
        resendIgmp: 1
        lpInterval: 1
        packetsPerSlave: 1
        numPeerNotif: 1
        tlbLogicalLb: 1
        adActorSysPrio: 65535
      layer: platform
routes:
    - family: inet4
      gateway: 147.75.61.42
      outLinkName: bond0
      table: main
      priority: 1024
      scope: global
      type: unicast
      protocol: static
      layer: platform
    - family: inet6
      gateway: '2604:1380:45f2:6c00::'
      outLinkName: bond0
      table: main
      priority: 2048
      scope: global
      type: unicast
      protocol: static
      layer: platform
    - family: inet4
      dst: 10.0.0.0/8
      gateway: 10.68.182.0
      outLinkName: bond0
      table: main
      scope: global
      type: unicast
      protocol: static
      layer: platform
hostnames:
    - hostname: ci-blue-worker-amd64-2
      layer: platform
resolvers: []
timeServers: []

Every section is optional, so you can configure only the parts you need. The format of each section matches the respective network *Spec resource .spec part, e.g the addresses: section matches the .spec of AddressSpec resource:

# talosctl get addressspecs bond0/10.68.182.1/31 -o yaml | yq .spec
address: 10.68.182.1/31
linkName: bond0
family: inet4
scope: global
flags: permanent
layer: platform

So one way to prepare the network configuration file is to boot Talos Linux, apply necessary network configuration using Talos machine configuration, and grab the resulting resources from the running Talos instance.

In this guide we will briefly cover the most common examples of the network configuration.

Addresses

The addresses configured are usually routable IP addresses assigned to the machine, so the scope: should be set to global and flags: to permanent. Additionally, family: should be set to either inet4 or init6 depending on the address family.

The linkName: property should match the name of the link the address is assigned to, it might be a physical link, e.g. en9sp0, or the name of a logical link, e.g. bond0, created in the links: section.

Example, IPv4 address:

addresses:
    - address: 147.75.61.43/31
      linkName: bond0
      family: inet4
      scope: global
      flags: permanent
      layer: platform

Example, IPv6 address:

addresses:
    - address: 2604:1380:45f2:6c00::1/127
      linkName: bond0
      family: inet6
      scope: global
      flags: permanent
      layer: platform

For physical network interfaces (links), the most usual configuration is to bring the link up:

links:
    - name: en9sp0
      up: true
      layer: platform

This will bring the link up, and it will also disable Talos auto-configuration (disables running DHCP on the link).

Another common case is to set a custom MTU:

links:
    - name: en9sp0
      up: true
      mtu: 9000
      layer: platform

The order of the links in the links: section is not important.

Bonds

For bonded links, there should be a link resource for the bond itself, and a link resource for each enslaved link:

links:
    - name: bond0
      logical: true
      up: true
      kind: bond
      type: ether
      bondMaster:
        mode: 802.3ad
        xmitHashPolicy: layer3+4
        lacpRate: slow
        arpValidate: none
        arpAllTargets: any
        primaryReselect: always
        failOverMac: 0
        miimon: 100
        updelay: 200
        downdelay: 200
        resendIgmp: 1
        lpInterval: 1
        packetsPerSlave: 1
        numPeerNotif: 1
        tlbLogicalLb: 1
        adActorSysPrio: 65535
      layer: platform
    - name: eth0
      up: true
      masterName: bond0
      slaveIndex: 0
      layer: platform
    - name: eth1
      up: true
      masterName: bond0
      slaveIndex: 1
      layer: platform

The name of the bond can be anything supported by Linux kernel, but the following properties are important:

  • logical: true - this is a logical link, not a physical one
  • kind: bond - this is a bonded link
  • type: ether - this is an Ethernet link
  • bondMaster: - defines bond configuration, please see Linux documentation on the available options

For each enslaved link, the following properties are important:

  • masterName: bond0 - the name of the bond this link is enslaved to
  • slaveIndex: 0 - the index of the enslaved link, starting from 0, controls the order of bond slaves

VLANs

VLANs are logical links which have a parent link, and a VLAN ID and protocol:

links:
    - name: bond0.35
      logical: true
      up: true
      kind: vlan
      type: ether
      parentName: bond0
      vlan:
        vlanID: 35
        vlanProtocol: 802.1ad

The name of the VLAN link can be anything supported by Linux kernel, but the following properties are important:

  • logical: true - this is a logical link, not a physical one
  • kind: vlan - this is a VLAN link
  • type: ether - this is an Ethernet link
  • parentName: bond0 - the name of the parent link
  • vlan: - defines VLAN configuration: vlanID and vlanProtocol

Routes

For route configuration, most of the time table: main, scope: global, type: unicast and protocol: static are used.

The route most important fields are:

  • dst: defines the destination network, if left empty means “default gateway”
  • gateway: defines the gateway address
  • priority: defines the route priority (metric), lower values are preferred for the same dst: network
  • outLinkName: defines the name of the link the route is associated with
  • src: sets the source address for the route (optional)

Additionally, family: should be set to either inet4 or init6 depending on the address family.

Example, IPv6 default gateway:

routes:
    - family: inet6
      gateway: '2604:1380:45f2:6c00::'
      outLinkName: bond0
      table: main
      priority: 2048
      scope: global
      type: unicast
      protocol: static
      layer: platform

Example, IPv4 route to 10/8 via 10.68.182.0 gateway:

routes:
    - family: inet4
      dst: 10.0.0.0/8
      gateway: 10.68.182.0
      outLinkName: bond0
      table: main
      scope: global
      type: unicast
      protocol: static
      layer: platform

Hostnames

Even though the section supports multiple hostnames, only a single one should be used:

hostnames:
    - hostname: host
      domainname: some.org
      layer: platform

The domainname: is optional.

If the hostname is not set, Talos will use default generated hostname.

Resolvers

The resolvers: section is used to configure DNS resolvers, only single entry should be used:

resolvers:
    - dnsServers:
        - 8.8.8.8
        - 1.1.1.1
      layer: platform

If the dnsServers: is not set, Talos will use default DNS servers.

Time Servers

The timeServers: section is used to configure NTP time servers, only single entry should be used:

timeServers:
    - timeServers:
        - 169.254.169.254
      layer: platform

If the timeServers: is not set, Talos will use default NTP servers.

Supplying META Network Configuration

Once the network configuration YAML document is ready, it can be supplied to Talos in one of the following ways:

  • for a running Talos machine, using Talos API (requires already established network connectivity)
  • for Talos disk images, it can be embedded into the image
  • for ISO/PXE boot methods, it can be supplied via kernel command line parameters as an environment variable

The metal network configuration is stored in Talos META partition under the key 0xa (decimal 10).

In this guide we will assume that the prepared network configuration is stored in the file network.yaml.

Note: as JSON is a subset of YAML, the network configuration can be also supplied as a JSON document.

Supplying Network Configuration to a Running Talos Machine

Use the talosctl to write a network configuration to a running Talos machine:

talosctl meta write 0xa "$(cat network.yaml)"

Supplying Network Configuration to a Talos Disk Image

Following the boot assets guide, create a disk image passing the network configuration as a --meta flag:

docker run --rm -t -v $PWD/_out:/out -v /dev:/dev --privileged ghcr.io/siderolabs/imager:v1.9.2 metal --meta "0xa=$(cat network.yaml)"

Supplying Network Configuration to a Talos ISO/PXE Boot

As there is no META partition created yet before Talos Linux is installed, META values can be set as an environment variable INSTALLER_META_BASE64 passed to the initial boot of Talos. The supplied value will be used immediately, and also it will be written to the META partition once Talos is installed.

When using imager to create the ISO, the INSTALLER_META_BASE64 environment variable will be automatically generated from the --meta flag:

$ docker run --rm -t -v $PWD/_out:/out ghcr.io/siderolabs/imager:v1.9.2 iso --meta "0xa=$(cat network.yaml)"
...
kernel command line: ... talos.environment=INSTALLER_META_BASE64=MHhhPWZvbw==

When PXE booting, the value of INSTALLER_META_BASE64 should be set manually:

echo -n "0xa=$(cat network.yaml)" | gzip -9 | base64

The resulting base64 string should be passed as an environment variable INSTALLER_META_BASE64 to the initial boot of Talos: talos.environment=INSTALLER_META_BASE64=<base64-encoded value>.

Getting Current META Network Configuration

Talos exports META keys as resources:

# talosctl get meta 0x0a -o yaml
...
spec:
    value: '{"addresses": ...}'

15 - Migrating from Kubeadm

Migrating Kubeadm-based clusters to Talos.

It is possible to migrate Talos from a cluster that is created using kubeadm to Talos.

High-level steps are the following:

  1. Collect CA certificates and a bootstrap token from a control plane node.
  2. Create a Talos machine config with the CA certificates with the ones you collected.
  3. Update control plane endpoint in the machine config to point to the existing control plane (i.e. your load balancer address).
  4. Boot a new Talos machine and apply the machine config.
  5. Verify that the new control plane node is ready.
  6. Remove one of the old control plane nodes.
  7. Repeat the same steps for all control plane nodes.
  8. Verify that all control plane nodes are ready.
  9. Repeat the same steps for all worker nodes, using the machine config generated for the workers.

Remarks on kube-apiserver load balancer

While migrating to Talos, you need to make sure that your kube-apiserver load balancer is in place and keeps pointing to the correct set of control plane nodes.

This process depends on your load balancer setup.

If you are using an LB that is external to the control plane nodes (e.g. cloud provider LB, F5 BIG-IP, etc.), you need to make sure that you update the backend IPs of the load balancer to point to the control plane nodes as you add Talos nodes and remove kubeadm-based ones.

If your load balancing is done on the control plane nodes (e.g. keepalived + haproxy on the control plane nodes), you can do the following:

  1. Add Talos nodes and remove kubeadm-based ones while updating the haproxy backends to point to the newly added nodes except the last kubeadm-based control plane node.
  2. Turn off keepalived to drop the virtual IP used by the kubeadm-based nodes (introduces kube-apiserver downtime).
  3. Set up a virtual-IP based new load balancer on the new set of Talos control plane nodes. Use the previous LB IP as the LB virtual IP.
  4. Verify apiserver connectivity over the Talos-managed virtual IP.
  5. Migrate the last control-plane node.

Prerequisites

  • Admin access to the kubeadm-based cluster
  • Access to the /etc/kubernetes/pki directory (e.g. SSH & root permissions) on the control plane nodes of the kubeadm-based cluster
  • Access to kube-apiserver load-balancer configuration

Step-by-step guide

  1. Download /etc/kubernetes/pki directory from a control plane node of the kubeadm-based cluster.

  2. Create a new join token for the new control plane nodes:

    # inside a control plane node
    kubeadm token create --ttl 0
    
  3. Create Talos secrets from the PKI directory you downloaded on step 1 and the token you generated on step 2:

    talosctl gen secrets --kubernetes-bootstrap-token <TOKEN> --from-kubernetes-pki <PKI_DIR>
    
  4. Create a new Talos config from the secrets:

    talosctl gen config --with-secrets secrets.yaml <CLUSTER_NAME> https://<EXISTING_CLUSTER_LB_IP>
    
  5. Collect the information about the kubeadm-based cluster from the kubeadm configmap:

    kubectl get configmap -n kube-system kubeadm-config -oyaml
    

    Take note of the following information in the ClusterConfiguration:

    • .controlPlaneEndpoint
    • .networking.dnsDomain
    • .networking.podSubnet
    • .networking.serviceSubnet
  6. Replace the following information in the generated controlplane.yaml:

    • .cluster.network.cni.name with none
    • .cluster.network.podSubnets[0] with the value of the networking.podSubnet from the previous step
    • .cluster.network.serviceSubnets[0] with the value of the networking.serviceSubnet from the previous step
    • .cluster.network.dnsDomain with the value of the networking.dnsDomain from the previous step
  7. Go through the rest of controlplane.yaml and worker.yaml to customize them according to your needs, especially :

    • .cluster.secretboxEncryptionSecret should be either removed if you don’t currently use EncryptionConfig on your kube-apiserver or set to the correct value
  8. Make sure that, on your current Kubeadm cluster, the first --service-account-issuer= parameter in /etc/kubernetes/manifests/kube-apiserver.yaml is equal to the value of .cluster.controlPlane.endpoint in controlplane.yaml. If it’s not, add a new --service-account-issuer= parameter with the correct value before your current one in /etc/kubernetes/manifests/kube-apiserver.yaml on all of your control planes nodes, and restart the kube-apiserver containers.

  9. Bring up a Talos node to be the initial Talos control plane node.

  10. Apply the generated controlplane.yaml to the Talos control plane node:

    talosctl --nodes <TALOS_NODE_IP> apply-config --insecure --file controlplane.yaml
    
  11. Wait until the new control plane node joins the cluster and is ready.

    kubectl get node -owide --watch
    
  12. Update your load balancer to point to the new control plane node.

  13. Drain the old control plane node you are replacing:

    kubectl drain <OLD_NODE> --delete-emptydir-data --force --ignore-daemonsets --timeout=10m
    
  14. Remove the old control plane node from the cluster:

    kubectl delete node <OLD_NODE>
    
  15. Destroy the old node:

    # inside the node
    sudo kubeadm reset --force
    
  16. Repeat the same steps, starting from step 7, for all control plane nodes.

  17. Repeat the same steps, starting from step 7, for all worker nodes while applying the worker.yaml instead and skipping the LB step:

    talosctl --nodes <TALOS_NODE_IP> apply-config --insecure --file worker.yaml
    
  18. Your kubeadm kube-proxy configuration may not be compatible with the one generated by Talos, which will make the Talos Kubernetes upgrades impossible (labels may not be the same, and selector.matchLabels is an immutable field). To be sure, export your current kube-proxy daemonset manifest, check the labels, they have to be:

    tier: node
    k8s-app: kube-proxy
    

    If the are not, modify all the labels fields, save the file, delete your current kube-proxy daemonset, and apply the one you modified.

16 - OCI Base Runtime Specification

Adjusting OCI base runtime specification for CRI containers.

Every container initiated by the Container Runtime Interface (CRI) adheres to the OCI runtime specification. While certain aspects of this specification can be modified through Kubernetes pod and container configurations, others remain fixed.

Talos Linux provides the capability to adjust the OCI base runtime specification for all containers managed by the CRI. However, it is important to note that the Kubernetes/CRI plugin may still override some settings, meaning changes to the base runtime specification are not always guaranteed to take effect.

Getting Current OCI Base Runtime Specification

To get the current OCI base runtime specification, you can use the following command (yq -P . is used to pretty-print the output):

$ talosctl read /etc/cri/conf.d/base-spec.json | yq -P .
ociVersion: 1.2.0
process:
  user:
    uid: 0
    gid: 0
  cwd: /
  capabilities:
    bounding:
      - CAP_CHOWN
...

The output might depend on a specific Talos (containerd) version.

Adjusting OCI Base Runtime Specification

To adjust the OCI base runtime specification, the following machine configuration patch can be used:

machine:
  baseRuntimeSpecOverrides:
    process:
      rlimits:
        - type: RLIMIT_NOFILE
          hard: 1024
          soft: 1024

In this example, the number of open files is adjusted to be 1024 for all containers (OCI default is unset, so it inherits the Talos default of 1048576 open files). The contents of the baseRuntimeSpecOverrides field are merged with the current base runtime specification, so only the fields that need to be adjusted should be included.

This configuration change will be applied with a machine reboot, and OCI base runtime specification will only affect new containers created after the change on the node.

17 - Overlays

Overlays

Overlays provide a way to customize Talos Linux boot image. Overlays hook into the Talos install steps and can be used to provide additional boot assets (in the case of single board computers), extra kernel arguments or some custom configuration that is not part of the default Talos installation and specific to a particular overlay.

Overlays v/s Extensions

Overlays are similar to extensions, but they are used to customize the installation process, while extensions are used to customize the root filesystem.

Official Overlays

The list of official overlays can be found in the Overlays GitHub repository.

Using Overlays

Overlays can be used to generate a modified metal image or installer image with the overlay applied.

The process of generating boot assets with overlays is described in the boot assets guide.

Example: Booting a Raspberry Pi 4 with an Overlay

Follow the board specific guide for Raspberry Pi to download or generate the metal disk image and write to an SD card.

Boot the machine with the boot media and apply the machine configuration with the installer image that has the overlay applied.

# Talos machine configuration patch
machine:
  install:
    image: factory.talos.dev/installer/fc1cceeb5711cd263877b6b808fbf4942a8deda65e8804c114a0b5bae252dc50:v1.9.2

Note: The schematic id shown in the above patch is for a vanilla rpi_generic overlay. Replace it with the schematic id of the overlay you want to apply.

Authoring Overlays

An Overlay is a container image with the specific folder structure. Overlays can be built and managed using any tool that produces container images, e.g. docker build.

Sidero Labs maintains a repository of overlays.

Developing An Overlay

Let’s assume that you would like to contribute an overlay for a specific board, e.g. by contributing to the sbc-rockchip repository. Clone the repositry and insepct the existing overlays to understand the structure.

Usually an overlay consist of a few key components:

  • firmware: contains the firmware files required for the board
  • bootloader: contains the bootloader, e.g. u-boot for the board
  • dtb: contains the device tree blobs for the board
  • installer: contains the installer that will be used to install this overlay on the node
  • profile: contains information
  1. For the new overlay, create any needed folders and pkg.yaml files.
  2. If your board introduces a new chipset that is not supported yet, make sure to add the firmware build for it.
  3. Add the necessary u-boot and dtb build steps to the pkg.yaml files.
  4. Proceed to add an installer, which is a small go binary that will be used to install the overlay on the node. Here you need to add the go src/ as well as the pkg.yaml file.
  5. Lastly, add the profile information in the profiles folder.

You are now ready to attempt building the overlay. It’s recommend to push the build to a container registry to test the overlay with the Talos installer.

The default settings are:

  • REGISTRY is set to ghcr.io
  • USERNAME is set to the siderolabs (or value of environment variable USERNAME if it is set)
make sbc-rockchip PUSH=true

If using a custom registry, the REGISTRY and USERNAME variables can be set:

make sbc-rockchip PUSH=true REGISTRY=<registry> USERNAME=<username>

After building the overlay, take note of the pushed image tag, e.g. 664638a, because you will need it for the next step. You can now build a flashable image using the command below.

export TALOS_VERSION=v1.7.6
export USERNAME=octocat
export BOARD=nanopi-r5s
export TAG=664638a

docker run --rm -t -v ./_out:/out -v /dev:/dev --privileged ghcr.io/siderolabs/imager:${TALOS_VERSION} \
    metal --arch arm64 \
    --base-installer-image="ghcr.io/siderolabs/installer:${TALOS_VERSION}" \
    --overlay-name="${BOARD}" \
    --overlay-image="ghcr.io/${USERNAME}/sbc-rockchip:${TAG}" \

–overlay-option

--overlay-option can be used to pass additional options to the overlay installer if they are implemented by the overlay. An example can be seen in the sbc-raspberrypi overlay repository. It supports passing multiple options by repeating the flag or can be read from a yaml document by passing --overlay-option=@<path to file>.

Note: In some cases you need to build a custom imager. In this case, refer to this guide on how to build a custom images using an imager.

Troubleshooting

IMPORTANT: If this does not succeed, have a look at the documentation of the external dependecies you are pulling in and make sure that the pkg.yaml files are correctly configured. In some cases it may be required to update the dependencies to an appropriate version via the Pkgfile.

18 - Proprietary Kernel Modules

Adding a proprietary kernel module to Talos Linux
  1. Patching and building the kernel image

    1. Clone the pkgs repository from Github and check out the revision corresponding to your version of Talos Linux

      git clone https://github.com/talos-systems/pkgs pkgs && cd pkgs
      git checkout v0.8.0
      
    2. Clone the Linux kernel and check out the revision that pkgs uses (this can be found in kernel/kernel-prepare/pkg.yaml and it will be something like the following: https://cdn.kernel.org/pub/linux/kernel/v5.x/linux-x.xx.x.tar.xz)

      git clone https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git && cd linux
      git checkout v5.15
      
    3. Your module will need to be converted to be in-tree. The steps for this are different depending on the complexity of the module to port, but generally it would involve moving the module source code into the drivers tree and creating a new Makefile and Kconfig.

    4. Stage your changes in Git with git add -A.

    5. Run git diff --cached --no-prefix > foobar.patch to generate a patch from your changes.

    6. Copy this patch to kernel/kernel/patches in the pkgs repo.

    7. Add a patch line in the prepare segment of kernel/kernel/pkg.yaml:

      patch -p0 < /pkg/patches/foobar.patch
      
    8. Build the kernel image. Make sure you are logged in to ghcr.io before running this command, and you can change or omit PLATFORM depending on what you want to target.

      make kernel PLATFORM=linux/amd64 USERNAME=your-username PUSH=true
      
    9. Make a note of the image name the make command outputs.

  2. Building the installer image

    1. Copy the following into a new Dockerfile:

      FROM scratch AS customization
      COPY --from=ghcr.io/your-username/kernel:<kernel version> /lib/modules /lib/modules
      
      FROM ghcr.io/siderolabs/installer:<talos version>
      COPY --from=ghcr.io/your-username/kernel:<kernel version> /boot/vmlinuz /usr/install/${TARGETARCH}/vmlinuz
      
    2. Run to build and push the installer:

      INSTALLER_VERSION=<talos version>
      IMAGE_NAME="ghcr.io/your-username/talos-installer:$INSTALLER_VERSION"
      DOCKER_BUILDKIT=0 docker build --build-arg RM="/lib/modules" -t "$IMAGE_NAME" . && docker push "$IMAGE_NAME"
      
  3. Deploying to your cluster

    talosctl upgrade --image ghcr.io/your-username/talos-installer:<talos version> --preserve=true
    

19 - Static Pods

Using Talos Linux to set up static pods in Kubernetes.

Static Pods

Static pods are run directly by the kubelet bypassing the Kubernetes API server checks and validations. Most of the time DaemonSet is a better alternative to static pods, but some workloads need to run before the Kubernetes API server is available or might need to bypass security restrictions imposed by the API server.

See Kubernetes documentation for more information on static pods.

Configuration

Static pod definitions are specified in the Talos machine configuration:

machine:
  pods:
    - apiVersion: v1
       kind: Pod
       metadata:
         name: nginx
       spec:
         containers:
           - name: nginx
             image: nginx

Talos renders static pod definitions to the kubelet using a local HTTP server, kubelet picks up the definition and launches the pod.

Talos accepts changes to the static pod configuration without a reboot.

To see a full list of static pods, use talosctl get staticpods, and to see the status of the static pods (as reported by the kubelet), use talosctl get staticpodstatus.

Usage

Kubelet mirrors pod definition to the API server state, so static pods can be inspected with kubectl get pods, logs can be retrieved with kubectl logs, etc.

$ kubectl get pods
NAME                           READY   STATUS    RESTARTS   AGE
nginx-talos-default-controlplane-2   1/1     Running   0          17s

If the API server is not available, status of the static pod can also be inspected with talosctl containers --kubernetes:

$ talosctl containers --kubernetes
NODE         NAMESPACE   ID                                                                                      IMAGE                                                   PID    STATUS
172.20.0.3   k8s.io      default/nginx-talos-default-controlplane-2                                              registry.k8s.io/pause:3.6                               4886   SANDBOX_READY
172.20.0.3   k8s.io      └─ default/nginx-talos-default-controlplane-2:nginx:4183a7d7a771                        docker.io/library/nginx:latest
...

Logs of static pods can be retrieved with talosctl logs --kubernetes:

$ talosctl logs --kubernetes default/nginx-talos-default-controlplane-2:nginx:4183a7d7a771
172.20.0.3: 2022-02-10T15:26:01.289208227Z stderr F 2022/02/10 15:26:01 [notice] 1#1: using the "epoll" event method
172.20.0.3: 2022-02-10T15:26:01.2892466Z stderr F 2022/02/10 15:26:01 [notice] 1#1: nginx/1.21.6
172.20.0.3: 2022-02-10T15:26:01.28925723Z stderr F 2022/02/10 15:26:01 [notice] 1#1: built by gcc 10.2.1 20210110 (Debian 10.2.1-6)

Troubleshooting

Talos doesn’t perform any validation on the static pod definitions. If the pod isn’t running, use kubelet logs (talosctl logs kubelet) to find the problem:

$ talosctl logs kubelet
172.20.0.2: {"ts":1644505520281.427,"caller":"config/file.go:187","msg":"Could not process manifest file","path":"/etc/kubernetes/manifests/talos-default-nginx-gvisor.yaml","err":"invalid pod: [spec.containers: Required value]"}

Resource Definitions

Static pod definitions are available as StaticPod resources combined with Talos-generated control plane static pods:

$ talosctl get staticpods
NODE         NAMESPACE   TYPE        ID                        VERSION
172.20.0.3   k8s         StaticPod   default-nginx             1
172.20.0.3   k8s         StaticPod   kube-apiserver            1
172.20.0.3   k8s         StaticPod   kube-controller-manager   1
172.20.0.3   k8s         StaticPod   kube-scheduler            1

Talos assigns ID <namespace>-<name> to the static pods specified in the machine configuration.

On control plane nodes status of the running static pods is available in the StaticPodStatus resource:

$ talosctl get staticpodstatus
NODE         NAMESPACE   TYPE              ID                                                           VERSION   READY
172.20.0.3   k8s         StaticPodStatus   default/nginx-talos-default-controlplane-2                         2         True
172.20.0.3   k8s         StaticPodStatus   kube-system/kube-apiserver-talos-default-controlplane-2            2         True
172.20.0.3   k8s         StaticPodStatus   kube-system/kube-controller-manager-talos-default-controlplane-2   3         True
172.20.0.3   k8s         StaticPodStatus   kube-system/kube-scheduler-talos-default-controlplane-2            3         True

20 - Talos API access from Kubernetes

How to access Talos API from within Kubernetes.

In this guide, we will enable the Talos feature to access the Talos API from within Kubernetes.

Enabling the Feature

Edit the machine configuration to enable the feature, specifying the Kubernetes namespaces from which Talos API can be accessed and the allowed Talos API roles.

talosctl -n 172.20.0.2 edit machineconfig

Configure the kubernetesTalosAPIAccess like the following:

spec:
  machine:
    features:
      kubernetesTalosAPIAccess:
        enabled: true
        allowedRoles:
          - os:reader
        allowedKubernetesNamespaces:
          - default

Injecting Talos ServiceAccount into manifests

Create the following manifest file deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: talos-api-access
spec:
  selector:
    matchLabels:
      app: talos-api-access
  template:
    metadata:
      labels:
        app: talos-api-access
    spec:
      containers:
        - name: talos-api-access
          image: alpine:3
          command:
            - sh
            - -c
            - |
              wget -O /usr/local/bin/talosctl https://github.com/siderolabs/talos/releases/download/v1.9.2/talosctl-linux-amd64
              chmod +x /usr/local/bin/talosctl
              while true; talosctl -n 172.20.0.2 version; do sleep 1; done              

Note: make sure that you replace the IP 172.20.0.2 with a valid Talos node IP.

Use talosctl inject serviceaccount command to inject the Talos ServiceAccount into the manifest.

talosctl inject serviceaccount -f deployment.yaml > deployment-injected.yaml

Inspect the generated manifest:

apiVersion: apps/v1
kind: Deployment
metadata:
  creationTimestamp: null
  name: talos-api-access
spec:
  selector:
    matchLabels:
      app: talos-api-access
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: talos-api-access
    spec:
      containers:
      - command:
        - sh
        - -c
        - |
          wget -O /usr/local/bin/talosctl https://github.com/siderolabs/talos/releases/download/v1.9.2/talosctl-linux-amd64
          chmod +x /usr/local/bin/talosctl
          while true; talosctl -n 172.20.0.2 version; do sleep 1; done          
        image: alpine:3
        name: talos-api-access
        resources: {}
        volumeMounts:
        - mountPath: /var/run/secrets/talos.dev
          name: talos-secrets
      tolerations:
      - operator: Exists
      volumes:
      - name: talos-secrets
        secret:
          secretName: talos-api-access-talos-secrets
status: {}
---
apiVersion: talos.dev/v1alpha1
kind: ServiceAccount
metadata:
    name: talos-api-access-talos-secrets
spec:
    roles:
        - os:reader
---

As you can notice, your deployment manifest is now injected with the Talos ServiceAccount.

Testing API Access

Apply the new manifest into default namespace:

kubectl apply -n default -f deployment-injected.yaml

Follow the logs of the pods belong to the deployment:

kubectl logs -n default -f -l app=talos-api-access

You’ll see a repeating output similar to the following:

Client:
    Tag:         <talos version>
    SHA:         ....
    Built:
    Go version:  go1.18.4
    OS/Arch:     linux/amd64
Server:
    NODE:        172.20.0.2
    Tag:         <talos version>
    SHA:         ...
    Built:
    Go version:  go1.18.4
    OS/Arch:     linux/amd64
    Enabled:     RBAC

This means that the pod can talk to Talos API of node 172.20.0.2 successfully.

21 - Verifying Images

Verifying Talos container image signatures.

Sidero Labs signs the container images generated for the Talos release with cosign:

  • ghcr.io/siderolabs/installer (Talos installer)
  • ghcr.io/siderolabs/talos (Talos image for container runtime)
  • ghcr.io/siderolabs/talosctl (talosctl client packaged as a container image)
  • ghcr.io/siderolabs/imager (Talos install image generator)
  • all system extension images

Verifying Container Image Signatures

The cosign tool can be used to verify the signatures of the Talos container images:

$ cosign verify --certificate-identity-regexp '@siderolabs\.com$' --certificate-oidc-issuer https://accounts.google.com ghcr.io/siderolabs/installer:v1.4.0

Verification for ghcr.io/siderolabs/installer:v1.4.0 --
The following checks were performed on each of these signatures:
  - The cosign claims were validated
  - Existence of the claims in the transparency log was verified offline
  - The code-signing certificate was verified using trusted certificate authority certificates

[{"critical":{"identity":{"docker-reference":"ghcr.io/siderolabs/installer"},"image":{"docker-manifest-digest":"sha256:f41795cc88f40eb1bc6b3c638c4a3123f6ef3c90627bfc35c04ebab82581e3ee"},"type":"cosign container image signature"},"optional":{"1.3.6.1.4.1.57264.1.1":"https://accounts.google.com","Bundle":{"SignedEntryTimestamp":"MEQCIERkQpgEnPWnfjUHIWO9QxC9Ute3/xJOc7TO5GUnu59xAiBKcFvrDWHoUYChT0/+gaazTrI+r0/GWSbi+Q+sEQ5AKA==","Payload":{"body":"eyJhcGlWZXJzaW9uIjoiMC4wLjEiLCJraW5kIjoiaGFzaGVkcmVrb3JkIiwic3BlYyI6eyJkYXRhIjp7Imhhc2giOnsiYWxnb3JpdGhtIjoic2hhMjU2IiwidmFsdWUiOiJkYjhjYWUyMDZmODE5MDlmZmI4NjE4ZjRkNjIzM2ZlYmM3NzY5MzliOGUxZmZkMTM1ODA4ZmZjNDgwNjYwNGExIn19LCJzaWduYXR1cmUiOnsiY29udGVudCI6Ik1FVUNJUURQWXhiVG5vSDhJTzBEakRGRE9rNU1HUjRjMXpWMys3YWFjczNHZ2J0TG1RSWdHczN4dVByWUgwQTAvM1BSZmZydDRYNS9nOUtzQVdwdG9JbE9wSDF0NllrPSIsInB1YmxpY0tleSI6eyJjb250ZW50IjoiTFMwdExTMUNSVWRKVGlCRFJWSlVTVVpKUTBGVVJTMHRMUzB0Q2sxSlNVTXhha05EUVd4NVowRjNTVUpCWjBsVlNIbEhaRTFQVEhkV09WbFFSbkJYUVRKb01qSjRVM1ZIZVZGM2QwTm5XVWxMYjFwSmVtb3dSVUYzVFhjS1RucEZWazFDVFVkQk1WVkZRMmhOVFdNeWJHNWpNMUoyWTIxVmRWcEhWakpOVWpSM1NFRlpSRlpSVVVSRmVGWjZZVmRrZW1SSE9YbGFVekZ3WW01U2JBcGpiVEZzV2tkc2FHUkhWWGRJYUdOT1RXcE5kMDVFUlRSTlZHZDZUbXBWTlZkb1kwNU5hazEzVGtSRk5FMVVaekJPYWxVMVYycEJRVTFHYTNkRmQxbElDa3R2V2tsNmFqQkRRVkZaU1V0dldrbDZhakJFUVZGalJGRm5RVVZaUVdKaVkwbDZUVzR3ZERBdlVEZHVUa0pNU0VscU1rbHlORTFQZGpoVVRrVjZUemNLUkVadVRXSldVbGc0TVdWdmExQnVZblJHTVZGMmRWQndTVm95VkV3NFFUUkdSMWw0YldFeGJFTk1kMkk0VEZOVWMzRlBRMEZZYzNkblowWXpUVUUwUndwQk1WVmtSSGRGUWk5M1VVVkJkMGxJWjBSQlZFSm5UbFpJVTFWRlJFUkJTMEpuWjNKQ1owVkdRbEZqUkVGNlFXUkNaMDVXU0ZFMFJVWm5VVlZqYWsweUNrbGpVa1lyTkhOVmRuRk5ia3hsU0ZGMVJIRkdRakZqZDBoM1dVUldVakJxUWtKbmQwWnZRVlV6T1ZCd2VqRlphMFZhWWpWeFRtcHdTMFpYYVhocE5Ga0tXa1E0ZDB0M1dVUldVakJTUVZGSUwwSkRSWGRJTkVWa1dWYzFhMk50VmpWTWJrNTBZVmhLZFdJeldrRmpNbXhyV2xoS2RtSkhSbWxqZVRWcVlqSXdkd3BMVVZsTFMzZFpRa0pCUjBSMmVrRkNRVkZSWW1GSVVqQmpTRTAyVEhrNWFGa3lUblprVnpVd1kzazFibUl5T1c1aVIxVjFXVEk1ZEUxRGMwZERhWE5IQ2tGUlVVSm5OemgzUVZGblJVaFJkMkpoU0ZJd1kwaE5Oa3g1T1doWk1rNTJaRmMxTUdONU5XNWlNamx1WWtkVmRWa3lPWFJOU1VkTFFtZHZja0puUlVVS1FXUmFOVUZuVVVOQ1NIZEZaV2RDTkVGSVdVRXpWREIzWVhOaVNFVlVTbXBIVWpSamJWZGpNMEZ4U2t0WWNtcGxVRXN6TDJnMGNIbG5Remh3TjI4MFFRcEJRVWRJYkdGbVp6Um5RVUZDUVUxQlVucENSa0ZwUVdKSE5tcDZiVUkyUkZCV1dUVXlWR1JhUmtzeGVUSkhZVk5wVW14c1IydHlSRlpRVXpsSmJGTktDblJSU1doQlR6WlZkbnBFYVVOYVFXOXZSU3RLZVdwaFpFdG5hV2xLT1RGS00yb3ZZek5CUTA5clJIcFhOamxaVUUxQmIwZERRM0ZIVTAwME9VSkJUVVFLUVRKblFVMUhWVU5OUVZCSlRUVjJVbVpIY0VGVWNqQTJVR1JDTURjeFpFOXlLMHhFSzFWQ04zbExUVWRMWW10a1UxTnJaMUp5U3l0bGNuZHdVREp6ZGdvd1NGRkdiM2h0WlRkM1NYaEJUM2htWkcxTWRIQnpjazFJZGs5cWFFSmFTMVoxVG14WmRXTkJaMVF4V1VWM1ZuZHNjR2QzYTFWUFdrWjRUemRrUnpONkNtVnZOWFJ3YVdoV1kyTndWMlozUFQwS0xTMHRMUzFGVGtRZ1EwVlNWRWxHU1VOQlZFVXRMUzB0TFFvPSJ9fX19","integratedTime":1681843022,"logIndex":18304044,"logID":"c0d23d6ad406973f9559f3ba2d1ca01f84147d8ffc5b8445c224f98b9591801d"}},"Issuer":"https://accounts.google.com","Subject":"andrey.smirnov@siderolabs.com"}}]

The image should be signed using cosign certificate authority flow by a Sidero Labs employee with and email from siderolabs.com domain.

Reproducible Builds

Talos builds for kernel, initramfs, talosctl, ISO image, and container images are reproducible. So you can verify that the build is the same as the one as provided on GitHub releases page.

See building Talos images for more details.

22 - Watchdog Timers

Using hardware watchdogs to workaround hardware/software lockups.

Talos Linux now supports hardware watchdog timers configuration. Hardware watchdog timers allow to reset (reboot) the system if the software stack becomes unresponsive. Please consult your hardware/VM documentation for the availability of the hardware watchdog timers.

Configuration

To discover the available watchdog devices, run:

$ talosctl ls /sys/class/watchdog/
NODE         NAME
172.20.0.2   .
172.20.0.2   watchdog0
172.20.0.2   watchdog1

The implementation of the watchdog device can be queried with:

$ talosctl read /sys/class/watchdog/watchdog0/identity
i6300ESB timer

To enable the watchdog timer, patch the machine configuration with the following:

# watchdog.yaml
apiVersion: v1alpha1
kind: WatchdogTimerConfig
device: /dev/watchdog0
timeout: 5m
talosctl patch mc -p @watchdog.yaml

Talos Linux will set up the watchdog time with a 5-minute timeout, and it will keep resetting the timer to prevent the system from rebooting. If the software becomes unresponsive, the watchdog timer will expire, and the system will be reset by the watchdog hardware.

Inspection

To inspect the watchdog timer configuration, run:

$ talosctl get watchdogtimerconfig
NODE         NAMESPACE   TYPE                  ID      VERSION   DEVICE           TIMEOUT
172.20.0.2   runtime     WatchdogTimerConfig   timer   1         /dev/watchdog0   5m0s

To inspect the watchdog timer status, run:

$ talosctl get watchdogtimerstatus
NODE         NAMESPACE   TYPE                  ID      VERSION   DEVICE           TIMEOUT
172.20.0.2   runtime     WatchdogTimerStatus   timer   1         /dev/watchdog0   5m0s

Current status of the watchdog timer can also be inspected via Linux sysfs:

$ talosctl read /sys/class/watchdog/watchdog0/state
active