Container runtime security has traditionally been reactive - detecting malicious behavior after it happens. Tools like Falco and traditional runtime detection systems can alert you when someone executes /bin/bash in your production pod, but by then the damage may already be done.

Tetragon takes a different approach: kernel-level enforcement. Built on eBPF, it observes every system call in your cluster and can intercept and terminate unauthorized processes inside the syscall path, before the malicious action completes. No userspace round-trips, no race conditions - just immediate enforcement in the kernel.

Initially created by Isovalent and now a CNCF-backed project, Tetragon provides capabilities similar to AppArmor or SELinux but with a Kubernetes-native API. Instead of wrestling with profile files on individual nodes, you define security policies as TracingPolicy CRDs and let Tetragon handle the kernel-level enforcement.

Prerequisites

Before we dive in, you'll need:

  • A Kubernetes cluster (GKE, EKS, or any cluster with Linux kernel 4.19+)
  • Helm 3.x installed
  • kubectl access to the cluster
  • For observability: The tetra CLI (we'll install this later)

Important note on kernel compatibility: Tetragon uses eBPF, which requires BTF (BPF Type Format) support. Modern GKE and EKS clusters enable this by default. If you're on a custom cluster, verify BTF availability with ls /sys/kernel/btf/vmlinux.
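The check can be scripted; a quick sketch to run on a node (or via a privileged debug pod) that reports both the kernel version and BTF availability:

```shell
# Print kernel version (Tetragon needs 4.19+) and check for BTF support.
KVER=$(uname -r)
echo "kernel: ${KVER}"
if [ -e /sys/kernel/btf/vmlinux ]; then
  echo "BTF: available"
else
  echo "BTF: missing - this kernel will need a BTF file supplied separately"
fi
```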

How Tetragon works: the three-stage workflow

Tetragon's approach is different from traditional "block everything then whitelist" security models. Instead, you:

  1. Deploy Tetragon to your cluster
  2. Observe your application's actual behavior
  3. Enforce policies based on that observed behavior

This "observe-then-enforce" model drastically reduces false positives and ensures your policies match reality, not assumptions.

Let's dive into the implementation.

Deploy Tetragon

Tetragon runs as a DaemonSet, installing an eBPF agent on every node in your cluster. Installation is straightforward with Helm:

helm repo add cilium https://helm.cilium.io
helm repo update
helm install tetragon cilium/tetragon -n kube-system

By default, Tetragon enables TracingPolicy support and auto-detects BTF (BPF Type Format) for your kernel.

Verify the deployment:

kubectl get pods -n kube-system -l app.kubernetes.io/name=tetragon

You should see output like:

NAME             READY   STATUS    RESTARTS   AGE
tetragon-2gqrg   2/2     Running   0          7s
tetragon-45t6z   2/2     Running   0          7s
tetragon-6q2xk   2/2     Running   0          7s

Each node gets its own Tetragon pod that hooks into the kernel.

Install the TracingPolicy CRDs

Before you can create TracingPolicy resources, you need to install the Custom Resource Definitions:

kubectl apply -f https://raw.githubusercontent.com/cilium/tetragon/main/pkg/k8s/apis/cilium.io/client/crds/v1alpha1/cilium.io_tracingpolicies.yaml
kubectl apply -f https://raw.githubusercontent.com/cilium/tetragon/main/pkg/k8s/apis/cilium.io/client/crds/v1alpha1/cilium.io_tracingpoliciesnamespaced.yaml

Verify the CRDs are installed:

kubectl get crd | grep tracingpolicies
tracingpolicies.cilium.io                              2026-04-29T11:57:07Z
tracingpoliciesnamespaced.cilium.io                    2026-04-29T11:57:13Z

Observe your application

Before you block anything, you need to understand what your application actually does. This is where TracingPolicies come in.

Start with process execution monitoring

Create a basic audit policy that logs all process executions without blocking anything:

apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: "audit-all-execs"
spec:
  kprobes:
  - call: "sys_execve"
    syscall: true
    args:
    - index: 0
      type: "string"

Apply it with kubectl apply -f audit-policy.yaml.

This policy hooks the sys_execve syscall, which fires every time a process starts. The index: 0 argument captures the binary path.

Expand observability (Optional)

For more comprehensive monitoring, you can track file access, network connections, privilege escalation, and kernel module loads:

apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: "audit-comprehensive"
spec:
  kprobes:
  - call: "fd_install"          # File operations
    syscall: false
    args:
    - index: 0
      type: "int"
    - index: 1
      type: "file"
  - call: "tcp_connect"         # Outbound connections
    syscall: false
    args:
    - index: 0
      type: "sock"
  - call: "sys_setuid"          # Privilege changes
    syscall: true
    args:
    - index: 0
      type: "int"
  - call: "sys_init_module"     # Kernel module loads
    syscall: true
    args:
    - index: 0
      type: "string"

Note. By default, Tetragon already collects process_exec and process_exit events without any custom TracingPolicy.

While this article focuses primarily on file-based operations and process execution, Tetragon's capabilities extend to network monitoring as well. The tcp_connect hook shown above tracks outbound network connections, which is crucial for detecting lateral movement, data exfiltration, and unauthorized external communication. Network-level observability and enforcement with Tetragon deserves a separate article of its own.

View the events

Tetragon exports events as JSON. You can view them directly:

kubectl logs -n kube-system -l app.kubernetes.io/name=tetragon -c export-stdout -f

But raw JSON is hard to read. Install the tetra CLI for human-readable output:

GOOS=$(go env GOOS)
GOARCH=$(go env GOARCH)
curl -L --remote-name-all https://github.com/cilium/tetragon/releases/latest/download/tetra-${GOOS}-${GOARCH}.tar.gz{,.sha256sum}
sha256sum --check tetra-${GOOS}-${GOARCH}.tar.gz.sha256sum
sudo tar -C /usr/local/bin -xzvf tetra-${GOOS}-${GOARCH}.tar.gz
rm tetra-${GOOS}-${GOARCH}.tar.gz{,.sha256sum}

Now stream logs in compact format:

kubectl logs -n kube-system -l app.kubernetes.io/name=tetragon -c export-stdout -f | tetra getevents -o compact

You'll see output like:

❓ syscall default/nginx-controller-66c7d78958-rg2vr /nginx-ingress-controller __x64_sys_execve
🚀 process default/nginx-controller-66c7d78958-rg2vr /usr/bin/nginx -c /etc/nginx/nginx.conf -s reload
📬 open    default/nginx-controller-66c7d78958-rg2vr /usr/bin/nginx 
📬 open    default/nginx-controller-66c7d78958-rg2vr /usr/bin/nginx /etc/nginx/mime.types
💥 exit    default/nginx-controller-66c7d78958-rg2vr /usr/bin/nginx -c /etc/nginx/nginx.conf -s reload 0

Note. The examples throughout this article show different workload types (nginx, nodejs-app, etc.) to demonstrate how Tetragon works across various applications. The policies are workload-agnostic - the same approach applies regardless of your tech stack.
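The tetra CLI is the easiest reader, but since the export stream is plain JSON, jq works too. The sketch below uses a hypothetical sample event (the process_exec field structure follows Tetragon's JSON export schema); against a live cluster you would pipe the kubectl logs command from above into the same filter:

```shell
# Hypothetical process_exec event, shaped like Tetragon's JSON export.
SAMPLE='{"process_exec":{"process":{"binary":"/usr/bin/node","pod":{"namespace":"default","name":"my-nodejs-app-abc12"}}}}'

# Print the binary path for exec events from the "default" namespace.
echo "$SAMPLE" | jq -r 'select(.process_exec.process.pod.namespace == "default") | .process_exec.process.binary'
# prints: /usr/bin/node
```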

Important: limit scope to avoid log overload

Collecting everything generates gigabytes of logs per hour in busy clusters. Use podSelector to target specific workloads:

spec:
  podSelector:
    matchLabels:
      app: my-critical-app
  kprobes:
  - call: "sys_execve"
    # ... rest of policy

This focuses observability on one application at a time, keeping overhead low.
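Combining the selector with the earlier audit policy, a scoped version might look like this (the app: my-critical-app label is a placeholder for your own workload):

```yaml
apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: "audit-my-critical-app"
spec:
  podSelector:
    matchLabels:
      app: my-critical-app   # placeholder - match your workload's labels
  kprobes:
  - call: "sys_execve"
    syscall: true
    args:
    - index: 0
      type: "string"
```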

Best practice. Run in audit mode for at least 24 hours in staging. Restart your pods during this period to capture startup-only processes (package managers, init scripts, etc.).

Enforce security policies

Now that you know what your application actually does, you can build enforcement policies.

Option A: Manual policy creation

Review your logs and identify the binaries your app legitimately needs. For a Node.js app, you might see only /usr/bin/node and /usr/local/bin/npm.

Create a policy that allows those and blocks everything else:

apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: "lockdown-nodejs-app"
spec:
  podSelector:
    matchLabels:
      app: my-nodejs-app
  kprobes:
  - call: "sys_execve"
    syscall: true
    selectors:
    - matchBinaries:
      - operator: "NotIn"
        values:
          - "/usr/bin/node"
          - "/usr/local/bin/npm"
      matchActions:
      - action: Sigkill

This policy uses the NotIn operator: if the binary is not in the allowed list, Tetragon kills the process.

You can monitor kprobe-based enforcement actions in real-time:

kubectl logs -n kube-system -l app.kubernetes.io/name=tetragon -c export-stdout -f | jq 'select(.process_kprobe.action == "KPROBE_ACTION_SIGKILL")'
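Sigkill is not the only enforcement action. Tetragon also supports Override, which makes the hooked syscall return an error instead of killing the caller - useful when you want the exec attempt to fail gracefully rather than terminate the process. A sketch of such a policy (Override requires syscall: true and kernel support for error injection; argError: -1 makes the call return EPERM):

```yaml
# Sketch: fail unauthorized execve calls instead of killing the process.
apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: "deny-exec-override"
spec:
  podSelector:
    matchLabels:
      app: my-nodejs-app
  kprobes:
  - call: "sys_execve"
    syscall: true
    selectors:
    - matchBinaries:
      - operator: "NotIn"
        values:
        - "/usr/bin/node"
        - "/usr/local/bin/npm"
      matchActions:
      - action: Override
        argError: -1   # returned as EPERM to the caller
```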

For a production workload, policies get more complex. Here's one that locks down a DaemonSet using a reactive enforcement strategy - it lets processes run but kills them during the exit syscall if they weren't whitelisted:

Note. This example uses TracingPolicyNamespaced instead of TracingPolicy. Use the namespaced version when you want policies scoped to a single namespace rather than cluster-wide. This gives teams autonomy to manage their own security policies without cluster-admin privileges.

apiVersion: cilium.io/v1alpha1
kind: TracingPolicyNamespaced
metadata:
  name: daemonset-apparmor-loader
  namespace: apparmor
spec:
  podSelector:
    matchLabels:
      daemon: apparmor-loader
  tracepoints:
  - event: sys_exit
    subsystem: raw_syscalls
    selectors:
    - matchActions:
      - action: Sigkill
      matchBinaries:
      - operator: NotIn
        values:
        - /sbin/apparmor_parser
        - /usr/bin/loader

This policy uses exit-time enforcement — it hooks the sys_exit tracepoint and checks the binary path when processes terminate. If the process wasn't whitelisted, Tetragon delivers SIGKILL. This reactive approach lets you observe full process behavior before enforcement, useful when validating edge cases.

Important note on monitoring enforcement. Tracepoint-based enforcement actions may not always generate visible events in the export-stdout logs. The enforcement happens at the kernel level and works (you'll see processes being killed), but the kill events themselves might not appear in the JSON export stream. This is a known limitation when using tracepoint policies with enforcement actions.

To verify enforcement is working, test it directly:

$ kubectl exec -it apparmor-loader-xxxxx -n apparmor -- bash
error: Internal error occurred: error executing command in container: failed to exec in container: failed to start exec "abc123...xyz789": OCI runtime exec failed: exec failed: unable to start container process: error writing config to pipe: write init-p: broken pipe

The "broken pipe" error confirms that Tetragon killed the bash process before it could even initialize. This is kernel-level enforcement in action.

For kprobe-based policies (like the sys_execve example above), enforcement events are more reliably exported and can be monitored.

Option B: Automated policy generation

Manually reviewing thousands of log lines is tedious. The community-built Tetragon Policy Builder automates this by providing a web interface for policy generation.

Installation:

git clone https://github.com/camptocamp/tetragon-policy-builder.git
cd tetragon-policy-builder
helm install -n kube-system policy-builder helm/tetragon-policy-builder

Access the web UI:

kubectl port-forward -n kube-system deploy/policy-builder-tetragon-policy-builder 5000:5000

The interface groups observed processes by workload, letting you select which binaries to allow and auto-generate enforcement policies:

(Screenshot: the Policy Builder UI, grouping observed processes by workload)

The Policy Builder runs as a deployment that automatically connects to Tetragon's event stream via the gRPC API. It collects events, analyzes process execution patterns, and provides a web UI at http://localhost:5000 where you can:

1. View observed process executions grouped by pod/namespace

2. Select which binaries to allow

3. Generate TracingPolicy YAML with enforcement rules

4. Download the generated policy for kubectl apply

This is how it might look:

(Screenshot: a generated TracingPolicy in the Policy Builder UI)

Testing your enforcement policy

Let's test the runtime blocking. First, apply your enforcement policy:

kubectl apply -f lockdown-policy.yaml

Now try to run an unauthorized command:

kubectl exec -it my-nodejs-app-xxxxx -- /bin/bash

Expected result: command terminated with exit code 137.

Exit code 137 = 128 + 9, the shell's way of reporting death by SIGKILL. Tetragon killed the process in the kernel before /bin/bash could even initialize. If an attacker compromised your app and tried to run reverse shells, crypto miners, or exfiltration tools, they'd hit the same instant block.
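You can reproduce this exit code locally to confirm what 137 encodes, without any cluster involved:

```shell
# A subshell SIGKILLs itself; the parent shell reports 128 + 9 = 137.
sh -c 'kill -9 $$'
echo "exit code: $?"
# prints: exit code: 137
```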

For comparison, let's try an allowed binary:

kubectl exec -it my-nodejs-app-xxxxx -- /usr/bin/node --version

This works because /usr/bin/node is in the whitelist.

Real-world considerations

How Tetragon compares to alternatives:

  • vs. AppArmor/SELinux: Tetragon is Kubernetes-native (no node-level profile management) and works without modifying container images.
  • vs. Falco: Falco detects threats; Tetragon blocks them. You can run both - use Falco for detection/alerting and Tetragon for enforcement.
  • vs. OPA/Gatekeeper: Those prevent bad deployments, Tetragon prevents bad runtime behavior.

Performance impact:

eBPF is remarkably efficient, but enforcement does add overhead:

  • Observation mode: ~1–2% CPU overhead per node
  • Enforcement mode: ~2–5% CPU overhead per node

For me, this is a worthwhile tradeoff. The kernel-level enforcement stops entire classes of attacks that would otherwise succeed even with perfect image scanning and admission control.

Rollout strategy

1. Start in staging: Run audit mode for a week

2. Review patterns: Look for unexpected processes (they might be legitimate!)

3. Test on non-critical workloads: Validate enforcement without risking production

4. Gradually expand: Add enforcement to more workloads as confidence grows

Conclusion

Tetragon brings kernel-level runtime enforcement to Kubernetes without requiring changes to your applications or container images. By following the three-stage workflow - deploy, observe, enforce - you can build security policies that block real attacks while allowing legitimate application behavior.

The observe-first approach is key. Rather than guessing what your app needs and blocking everything else, you let the application tell you what it does, then enforce those boundaries. This dramatically reduces false positives and makes runtime security practical for real-world workloads.

If you're serious about defense-in-depth for Kubernetes, Tetragon is worth evaluating. It's not a replacement for other security layers, but it fills a critical gap: stopping malicious processes at the kernel level, at the moment they try to execute.

Coming next: In an upcoming article, I'll explore Hubble, another eBPF-powered tool from the Cilium project. While Tetragon focuses on runtime security enforcement, Hubble provides deep network observability for Kubernetes - giving you visibility into service dependencies, network flows, and application behavior at the kernel level. Stay tuned!