By default, Tetragon monitors the process lifecycle. Learn more about that in
the dedicated use case.
For more advanced use cases, Tetragon can observe tracepoints and arbitrary
kernel calls via kprobes. For that, Tetragon must be extended and configured
with custom resource objects named TracingPolicy.
It can then generate process_tracepoint and process_kprobe events.
1 - Process lifecycle
Tetragon observes the process lifecycle by default via exec and exit
Tetragon observes process creation and termination with the default
configuration and generates process_exec and process_exit events:
The process_exec events include useful information about the execution of
binaries and related process information. This includes the binary image that
was executed, command-line arguments, the UID context the process was
executed with, the process parent information, the capabilities that a
process had while executed, the process start time, the Kubernetes Pod,
labels and more.
Just as process_exec events show how and when a process started, the
process_exit events indicate how and when a process terminated. The
information in the event includes the binary image that was executed,
command-line arguments, the UID context the process was executed with, process
parent information, process start time, and the status codes and signals on
process exit. Knowing the status code or signal helps explain why a process
exited.
Both of these events include Linux-level metadata (UID, parents, capabilities,
start time, etc.) as well as Kubernetes-level metadata (Kubernetes namespace,
labels, name, etc.). This data connects node-level concepts, such as
processes, with Kubernetes or container environments.
These events enable a full lifecycle view into a process that can aid an
incident investigation. For example, we can determine if a suspicious process
is still running in a particular environment. For concrete examples of such
events, see the next use case on process execution.
1.1 - Process execution
Monitor process lifecycle with process_exec and process_exit
This first use case is monitoring process execution, which can be observed with
the Tetragon process_exec and process_exit JSON events.
These events contain the full lifecycle of processes, from fork/exec to
exit, including metadata such as:
Binary name: Defines the name of an executable file
Parent process: Helps to identify process execution anomalies (e.g., if a nodejs app forks a shell, this is suspicious)
Command-line argument: Defines the program runtime behavior
Current working directory: Helps to identify hidden malware execution from a temporary folder, which is a common pattern used by malware
Kubernetes metadata: Contains pods, labels, and Kubernetes namespaces, which are critical to identify service owners, particularly in multitenant environments
exec_id: A unique process identifier that correlates all recorded activity of a process
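The metadata listed above can be pulled out of a single JSON event with a few lines of code. The following is a minimal sketch, not an official client: the sample event is hypothetical and heavily trimmed, and the exact JSON key names (`process`, `parent`, `binary`, `arguments`, `cwd`, `pod`) are assumptions based on the fields described in this section.

```python
import json

# Hypothetical, trimmed process_exec event; real events carry many more fields.
SAMPLE_EVENT = json.dumps({
    "process_exec": {
        "process": {
            "exec_id": "example-exec-id",
            "binary": "/usr/bin/whoami",
            "arguments": "",
            "cwd": "/",
            "pod": {"namespace": "default", "name": "xwing"},
        },
        "parent": {"exec_id": "example-parent-id", "binary": "/bin/bash"},
    }
})


def summarize_exec(line):
    """Return the lifecycle metadata of a process_exec event, or None."""
    event = json.loads(line)
    exec_event = event.get("process_exec")
    if exec_event is None:
        return None  # some other event type, e.g. process_exit
    process = exec_event.get("process", {})
    pod = process.get("pod", {})
    return {
        "exec_id": process.get("exec_id"),
        "binary": process.get("binary"),
        "arguments": process.get("arguments"),
        "cwd": process.get("cwd"),
        "parent_binary": exec_event.get("parent", {}).get("binary"),
        "pod": f"{pod.get('namespace')}/{pod.get('name')}",
    }


print(summarize_exec(SAMPLE_EVENT))
```

Keying later analysis on `exec_id` rather than on the PID avoids confusing two processes that reuse the same PID.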
As a first step, let’s start monitoring the events from the xwing pod:
Then in another terminal, let’s kubectl exec into the xwing pod and execute
some example commands:
kubectl exec -it xwing -- /bin/bash
whoami
The output in the first terminal should be similar to:
🚀 process default/xwing /bin/bash
🚀 process default/xwing /usr/bin/whoami
💥 exit default/xwing /usr/bin/whoami 0
Here you can see the binary names along with their arguments, the pod info, and
return codes in a compact one-line view of the events.
For more detailed information, use the raw JSON events. You can stop
the Tetragon CLI with Ctrl-C and parse the tetragon.log file by executing:
If we want to monitor the execution of Executable and Linkable Format (ELF) or flat binaries
before they are actually executed, the process-exec-elf-begin tracing policy is a good first choice.
Note
The process-exec-elf-begin tracing policy will not report the
different binary format handlers or scripts being executed, but will report
the final ELF or flat binary, like the shebang handler.
To report those, another tracing policy can be used.
Before going forward, ensure you have deployed our Demo Application to explore
the Security Observability Events, and verify that all pods are up and running:
In addition to the Kubernetes Identity and process metadata, ProcessKprobe
events contain the binary being executed. In the above case they are:
function_name: the kernel function we hook into to read the binary that is being executed.
file_arg: the path being executed. Here it is /bin/busybox, which is the real
binary being executed, since the container in the xwing pod is running busybox:
/usr/bin/id points to /bin/busybox.
Monitor process capabilities and kernel namespace access
Tetragon also provides the ability to check process capabilities and kernel
namespaces access.
This information would help us determine which process or Kubernetes pod has
started or gained access to privileges or host namespaces that it should not
have. This would help us answer questions like:
Which Kubernetes pods are running with CAP_SYS_ADMIN in my cluster?
Which Kubernetes pods have host network or PID namespace access in my
cluster?
Step 1: Enabling Process Credential and Namespace Monitoring
You should observe Tetragon generating events similar to these, indicating the privileged container start:
🚀 process default/privileged-nginx /nginx -g daemon off; 🛑 CAP_SYS_ADMIN
2 - Filename access
Monitor filename access using kprobe hooks
This page shows how you can create a tracing policy to monitor filename access. For general
information about tracing policies, see the tracing policy page.
There are two aspects of the tracing policy: (i) what hooks you can use to monitor specific types of
access, and (ii) how you can filter at the kernel level for only specific events.
Hooks
There are different ways applications can access and modify files, and for this tracing policy we
focus on three different types.
The first is read and write access, which is the most common way that applications access files.
Applications can perform this type of access with a variety of different system
calls: read and write, optimized system calls such as copy_file_range and sendfile, as well
as asynchronous I/O system call families such as the ones provided by aio and io_uring. Instead
of monitoring every system call, we opt to hook into the security_file_permission hook, which is a
common execution point for all the above system calls.
Applications can also access files by mapping them directly into their virtual address space. Since
it is difficult to catch the accesses themselves in this case, our policy will instead monitor the
point when the files are mapped into the application’s virtual memory. To do so, we use the
security_mmap_file hook.
Lastly, there is a family of system calls (e.g., truncate) that allow indirectly modifying the
contents of a file by changing its size. To catch these types of access, we hook into
security_path_truncate.
Filtering
Using the hooks above, you can monitor all accesses in the system. However, this will create a large number
of events, and it is frequently the case that you are only interested in a specific subset
of those events. It is possible to filter the events after their generation, but this induces
unnecessary overhead. Tetragon, using BPF, allows filtering these events directly in the kernel.
For example, the following snippet shows how you can limit the events from the
security_file_permission hook to writes to the /etc/passwd file. For this, you need to specify the
arguments of the function that you are hooking into, as well as their types.
- call: "security_file_permission"
  syscall: false
  args:
  - index: 0
    type: "file" # (struct file *) used for getting the path
  - index: 1
    type: "int" # 0x04 is MAY_READ, 0x02 is MAY_WRITE
  selectors:
  - matchArgs:
    - index: 0
      operator: "Equal"
      values:
      - "/etc/passwd" # filter by filename (/etc/passwd)
    - index: 1
      operator: "Equal"
      values:
      - "2" # filter by type of access (MAY_WRITE)
The previous example uses the Equal operator. Similarly, you can use the Prefix operator to
filter events based on the prefix of a filename.
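To make the matching semantics concrete, here is a small user-space model of a selector. It is only an illustration under stated assumptions: Tetragon evaluates these filters in the kernel with BPF, and the helper names `match_arg` and `selector_matches` are hypothetical. The sketch assumes that all matchArgs entries of a selector must match, and that an entry matches if any of its values matches.

```python
MAY_WRITE = 0x02
MAY_READ = 0x04


def match_arg(operator, actual, values):
    """Model one matchArgs entry: any of the listed values may match."""
    actual = str(actual)  # policy values are written as strings
    if operator == "Equal":
        return any(actual == value for value in values)
    if operator == "Prefix":
        return any(actual.startswith(value) for value in values)
    raise ValueError(f"operator not modelled in this sketch: {operator}")


def selector_matches(match_args, call_args):
    """Model one selector: every matchArgs entry has to match."""
    return all(
        match_arg(entry["operator"], call_args[entry["index"]], entry["values"])
        for entry in match_args
    )


# Same filter as the snippet above: writes to /etc/passwd only.
PASSWD_WRITE = [
    {"index": 0, "operator": "Equal", "values": ["/etc/passwd"]},
    {"index": 1, "operator": "Equal", "values": ["2"]},
]
# A Prefix filter covering everything under /etc/.
ETC_ANY = [{"index": 0, "operator": "Prefix", "values": ["/etc/"]}]

print(selector_matches(PASSWD_WRITE, ["/etc/passwd", MAY_WRITE]))
print(selector_matches(PASSWD_WRITE, ["/etc/passwd", MAY_READ]))
print(selector_matches(ETC_ANY, ["/etc/shadow", MAY_READ]))
```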
Examples
In this example, we monitor if a process inside a Kubernetes workload performs a read or write in
the /etc/ directory. The policy may be extended with additional directories or specific files if
needed.
As a first step, we apply the following policy that uses the three hooks mentioned previously as
well as appropriate filtering:
Note that read and write events are generated only for /etc/ files, based on the BPF in-kernel
filtering specified in the policy. The default CRD additionally filters out events associated with
the pod init process to remove init noise from pod start.
Similarly to the previous example, reviewing the JSON events provides additional data. An example
process_kprobe event observing a write can be:
In addition to the Kubernetes Identity and process metadata from exec events,
process_kprobe events contain the arguments of the observed system call. In
the above case they are:
file_arg.path: the observed file path
int_arg: the type of the operation (2 for a write and 4 for a read)
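The two arguments above are enough to turn an event into a readable line. The sketch below assumes the arguments arrive in an `args` list inside the `process_kprobe` object (an assumption about the JSON layout, since the event itself is not reproduced here); the sample event and the `describe_file_access` helper are hypothetical.

```python
# Access types as described above: 2 for a write and 4 for a read.
ACCESS_TYPES = {2: "write", 4: "read"}

# Hypothetical, trimmed process_kprobe event.
SAMPLE_EVENT = {
    "process_kprobe": {
        "function_name": "security_file_permission",
        "args": [
            {"file_arg": {"path": "/etc/passwd"}},
            {"int_arg": 2},
        ],
    }
}


def describe_file_access(event):
    """Return '<access type> <path>' for a file access event, or None."""
    kprobe = event.get("process_kprobe")
    if kprobe is None:
        return None
    path = None
    access = None
    for arg in kprobe.get("args", []):
        if "file_arg" in arg:
            path = arg["file_arg"].get("path")
        elif "int_arg" in arg:
            access = arg["int_arg"]
    label = ACCESS_TYPES.get(access)
    if label is None:
        label = "unknown({})".format(access)
    return "{} {}".format(label, path)


print(describe_file_access(SAMPLE_EVENT))
```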
To delete the file-access Pod from the interactive bash session, type:
exit
Another example of a similar policy can be found in our examples folder.
Limitations
Note that this policy has certain limitations because it matches on the filename that the
application uses to access the file. If an application accesses the same file via a hard link or a
different bind mount, no event will be generated.
3 - Network observability
Monitor TCP connect using kprobe hooks
To view TCP connect events, apply the example TCP connect TracingPolicy:
On Linux, each process has various associated user and group IDs, capabilities,
secure management flags, keyrings, and LSM security attributes that are used as
part of the security checks when acting on other objects. These are called the
task privileges or process credentials.
Changing the process credentials is a standard operation to perform privileged
actions or to execute commands as another user. The obvious example is
sudo, which allows gaining high privileges and running commands
as root or another user. Another example is services or containers that
gain high privileges during execution to perform restricted operations.
Composition of Linux process credentials
Traditional UNIX credentials
Real User ID
Real Group ID
Effective, Saved and FS User ID
Effective, Saved and FS Group ID
Supplementary groups
Linux Capabilities
Set of permitted capabilities: a limiting superset for the effective
capabilities.
Set of inheritable capabilities: the set that may get passed across
execve(2).
Set of effective capabilities: the set of capabilities a task is actually
allowed to make use of itself.
Set of bounding capabilities: limits the capabilities that may be inherited
across execve(2), especially when a binary is executed that will execute as
UID 0.
Secure management flags (securebits).
These govern the way the UIDs/GIDs and capabilities are manipulated and
inherited over certain operations such as execve(2).
Linux Security Module (LSM)
The LSM framework
provides a mechanism for various security checks to be hooked by new kernel
extensions. Tasks can have extra controls, as part of LSM, on what operations
they are allowed to perform.
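The relationship between the capability sets can be illustrated with plain bitmasks, which is how the kernel exposes them (for example as the hexadecimal CapPrm and CapEff values in /proc/<pid>/status). The following is a sketch only: the capability numbers are the ones defined in the kernel's capability header, while the helper names and the sample masks are hypothetical.

```python
# Bit positions of a few capabilities, as numbered by the Linux kernel.
CAP_BITS = {
    "CAP_CHOWN": 0,
    "CAP_NET_ADMIN": 12,
    "CAP_SYS_ADMIN": 21,
    "CAP_SYSLOG": 34,
}


def has_cap(mask, name):
    """Return True if the capability bit is set in the given mask."""
    return bool((mask >> CAP_BITS[name]) & 1)


def effective_within_permitted(effective, permitted):
    """The permitted set is a limiting superset of the effective set."""
    return effective & ~permitted == 0


# Hypothetical masks: all of capabilities 0..40 permitted, one effective.
PERMITTED = (1 << 41) - 1
EFFECTIVE = 1 << CAP_BITS["CAP_SYS_ADMIN"]

print(has_cap(EFFECTIVE, "CAP_SYS_ADMIN"))
print(effective_within_permitted(EFFECTIVE, PERMITTED))
```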
Tetragon Process Credentials monitoring
Monitoring Linux process credentials is a good practice to identify programs
running with high privileges. Tetragon allows retrieving Linux process credentials
as a process_credentials object.
4.1 - Monitor Process Credentials changes at the System Call layer
Monitor system calls that change Process Credentials
Tetragon can hook into the system calls that directly manipulate the credentials.
This allows us to determine which process is trying to change its credentials
and the new credentials that could be applied by the kernel.
This answers the questions:
Which process or container is trying to change its UIDs/GIDs in my cluster?
Which process or container is trying to change its capabilities in my
cluster?
Before going forward, ensure you have deployed our Demo Application to explore
the Security Observability Events, and verify that all pods are up and running:
In addition to the Kubernetes Identity and process metadata from exec events,
ProcessKprobe events contain the arguments of the observed system call. In the
above case they are:
function_name: the system call, __x64_sys_setuid or __x64_sys_setgid
int_arg: the UID or GID to use. In our case it is 0, which corresponds to
the root user.
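A consumer of these events could flag attempts to switch to root with a check like the one below. This is a hedged sketch: `function_name` and `int_arg` are the fields described above, while the surrounding `process_kprobe` and `args` layout, the sample event, and the `switches_to_root` helper are assumptions made for illustration.

```python
CREDENTIAL_SYSCALLS = {"__x64_sys_setuid", "__x64_sys_setgid"}

# Hypothetical, trimmed process_kprobe event for a setuid(0) call.
SAMPLE_EVENT = {
    "process_kprobe": {
        "function_name": "__x64_sys_setuid",
        "args": [{"int_arg": 0}],
    }
}


def switches_to_root(event):
    """Return True if the event is a setuid or setgid call with ID 0."""
    kprobe = event.get("process_kprobe")
    if not kprobe:
        return False
    if kprobe.get("function_name") not in CREDENTIAL_SYSCALLS:
        return False
    # An absent int_arg yields None, which does not compare equal to 0.
    return any(arg.get("int_arg") == 0 for arg in kprobe.get("args", []))


print(switches_to_root(SAMPLE_EVENT))
```

Note that this only reports the attempt: at the system call layer, the kernel may still refuse the change.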
4.2 - Monitor Process Credentials changes at the Kernel layer
Monitor Process Credentials changes at the kernel layer
Monitoring Process Credentials changes at the kernel layer is also possible.
This allows capturing the new process_credentials that should be applied.
In addition to the Kubernetes Identity and process metadata from exec events, ProcessKprobe events contain the arguments of the observed system call. In the above case they are:
function_name: the kernel commit_creds() function to install new credentials.
process_credentials_arg: the new process_credentials to be installed
on the current process. It includes the UIDs/GIDs, the capabilities and the target user namespace.
Here we can clearly see that the suid binary is being executed by user ID 11 in order to elevate its privileges to user ID 0, including capabilities.
Some pods need to change the host system or kernel parameters in order to
perform administrative tasks. Obvious examples are pods loading a kernel
module to extend the operating system functionality, or pods managing the
network.
However, there are also other cases where a compromised container may want to
load a kernel module to hide its behaviour.
In this respect, monitoring such host system changes helps to identify pods
and containers that affect the host system.
Monitor Linux kernel modules
A kernel module is code that can be loaded into the kernel image at runtime,
without rebooting. These modules, which can be loaded by pods and containers,
can modify the host system. The Monitor Linux kernel modules
guide will assist you in observing such events.
5.1 - Monitor Linux Kernel Modules
Monitor Linux Kernel Modules operations
Monitoring kernel modules helps to identify processes that load kernel modules to add features
to the operating system, to alter host system functionality, or even to hide their behaviour. This
can be used to answer the following questions:
Which process or container is changing the kernel?
Which process or container is loading or unloading kernel modules in the cluster?
Which process or container requested a feature that triggered the kernel to automatically load a module?
Are the loaded kernel modules signed?
Monitor Loading kernel modules
Kubernetes Environments
After deploying Tetragon, use the monitor-kernel-modules tracing policy which generates ProcessKprobe events
to trace kernel module operations.
do_init_module: the function call where the module is finally loaded.
module_arg: the kernel module information, it contains:
name: the name of the kernel module as a string.
tainted: the module taint flags that will be applied to the kernel. In the example above, they indicate that we are loading an out-of-tree, unsigned module, which may compromise the integrity of our system.
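For background, the kernel tracks taint as a bitmask, where each bit has a letter code. The sketch below decodes the two flags mentioned above from such a mask. The bit positions (12 for out-of-tree, 13 for unsigned) follow the kernel's tainted-kernels documentation; how Tetragon encodes the `tainted` field in the event is not shown in this section, so treating it as a raw integer mask here is an assumption, and `decode_taint` is a hypothetical helper.

```python
# Subset of the kernel taint bits relevant to module loading.
TAINT_FLAGS = {
    0: "proprietary module (P)",
    12: "out-of-tree module (O)",
    13: "unsigned module (E)",
}


def decode_taint(mask):
    """Return descriptions of the known module taint bits set in mask."""
    return [
        description
        for bit, description in sorted(TAINT_FLAGS.items())
        if mask & (1 << bit)
    ]


# An out-of-tree, unsigned module sets both bit 12 and bit 13.
print(decode_taint((1 << 12) | (1 << 13)))
```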
Monitor Kernel Modules Signature
Kernels compiled with the CONFIG_MODULE_SIG option will check whether the modules being loaded were cryptographically signed.
This makes it possible to assert that:
If the module being loaded is signed, the kernel has its key and the signature verification succeeded.
The integrity of the system or the kernel was not compromised.
Note Module signing increases security by identifying malicious modules loaded into the kernel. It is also possible to
deny loading such modules if the signature verification fails.
Before going forward, deploy the test-pod into the demo-app namespace, which has its security context set to privileged.
This allows running the demo by mounting an xfs file system inside the test-pod, which requires privileges
and will also trigger an automatic xfs module loading operation.
In addition to the process metadata from exec events, ProcessKprobe events contain the arguments of the observed call. In the above case they are:
security_kernel_module_request: the kernel security hook where modules are loaded on-demand.
string_arg: the name of the kernel module. When modules are automatically loaded, for security reasons,
the kernel prefixes the module with the name of the subsystem that requested it. In our case, it’s requested
by the file system subsystem, hence the name is fs-xfs.
2. Kernel calls modprobe to load the kernel module
The kernel will then call user space modprobe to load the kernel module.
The ProcessExec event where modprobe tries to load the xfs module.
Note Here modprobe is started in the initial Linux host namespaces, outside of the container namespaces. When kernel
modules are loaded on demand, the kernel spawns a user space modprobe process that finds and loads the appropriate
module from the host file system. This is done on behalf of the container, and since the process originates from the
kernel, the Linux namespaces it inherits, including the file system, are those of the host.
3. Reading the kernel module from the file system
modprobe will read the passed xfs kernel module from the host file system.
This ProcessKprobe event contains the module argument.
find_module_sections: the function call where the kernel parses the module sections.
module_arg: the kernel module information, it contains:
name: the name of the kernel module as a string.
signature_ok: a boolean value. If it is true, the module signature was successfully verified by the kernel. If it is false
or missing, the signature verification was not performed or failed. In both of these cases, the integrity of the system is considered compromised. This field depends on the kernel being compiled with the CONFIG_MODULE_SIG option.
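The rule for signature_ok, where only an explicit true counts as verified, can be sketched as follows. The `module_arg`, `name`, and `signature_ok` fields are the ones described above; the enclosing `process_kprobe` and `args` layout, the sample events, and the `module_signature_verified` helper are assumptions made for illustration.

```python
def module_signature_verified(event):
    """Return True only if the event carries a module_arg with signature_ok set to true.

    A false or missing signature_ok is treated as "not verified".
    """
    kprobe = event.get("process_kprobe", {})
    for arg in kprobe.get("args", []):
        module = arg.get("module_arg")
        if module is not None:
            return module.get("signature_ok") is True
    return False  # no module argument in this event


# Hypothetical, trimmed events.
SIGNED = {
    "process_kprobe": {
        "function_name": "find_module_sections",
        "args": [{"module_arg": {"name": "xfs", "signature_ok": True}}],
    }
}
UNSIGNED = {
    "process_kprobe": {
        "function_name": "find_module_sections",
        "args": [{"module_arg": {"name": "example_module"}}],
    }
}

print(module_signature_verified(SIGNED), module_signature_verified(UNSIGNED))
```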
Monitor Unloading of kernel modules
The same monitor-kernel-modules tracing policy also allows monitoring the unloading of kernel modules.
The following ProcessKprobe event will be generated:
Note Some kernel module rootkits hide themselves by deleting their
entries from the kernel internal module lists while continuing to run in the background.
Monitoring module load operations allows detecting such cases.
Tetragon is able to observe various security events and even enforce security
policies.
The Record Linux Capabilities Usage guide
shows how to monitor and record Capabilities checks
conducted by the kernel on behalf of applications during privileged operations. This can be used to inspect
and produce security profiles for pods and containers.
6.1 - Record Linux Capabilities Usage
Record a capability profile of pods and containers
When the kernel needs to perform a privileged operation on behalf of a process, it checks
the Capabilities of the process
and issues a verdict to allow or deny the operation.
Tetragon is able to record these checks performed by the kernel. This can be used to answer
the following questions:
What is the capabilities profile of pods or containers running in the cluster?
In addition to the Kubernetes Identity and process metadata from exec events, ProcessKprobe events contain the arguments of the observed system call. In the above case they are:
function_name: the cap_capable kernel function.
user_ns_arg: the user namespace where the capability is required.
level: the nesting level of the user namespace. Here it is zero, which indicates the initial user namespace.
uid: the user ID of the owner of the user namespace.
gid: the group ID of the owner of the user namespace.
ns: details about the namespace. is_host indicates that the target user namespace where the capability is required is the host namespace.
capability_arg: the capability required to perform the operation. In this example, reading the kernel ring buffer.
value: the integer number of the required capability.
name: the name of the required capability. Here it is CAP_SYSLOG.
return: indicates via the int_arg whether the capability check succeeded or failed. 0 means it succeeded and the access was granted, while -1 means it failed and the operation was denied.
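Aggregating these events per pod gives a simple capability profile. The sketch below is one possible way to do it, not Tetragon functionality: it relies on the `function_name`, `capability_arg.name`, and `return.int_arg` fields described above, assumes they sit inside a `process_kprobe` object with an `args` list and a `process.pod` entry, and uses hypothetical sample events.

```python
from collections import defaultdict


def capability_profile(events):
    """Group capability checks per pod into granted and denied sets."""
    profile = defaultdict(lambda: {"granted": set(), "denied": set()})
    for event in events:
        kprobe = event.get("process_kprobe")
        if not kprobe or kprobe.get("function_name") != "cap_capable":
            continue
        pod = kprobe.get("process", {}).get("pod", {})
        key = f"{pod.get('namespace')}/{pod.get('name')}"
        names = [
            arg["capability_arg"]["name"]
            for arg in kprobe.get("args", [])
            if "capability_arg" in arg
        ]
        # A return value of 0 means the check succeeded; anything else is a denial.
        granted = kprobe.get("return", {}).get("int_arg") == 0
        profile[key]["granted" if granted else "denied"].update(names)
    return dict(profile)


def make_event(pod_name, capability, result):
    """Build a hypothetical, trimmed cap_capable event for the example."""
    return {
        "process_kprobe": {
            "process": {"pod": {"namespace": "default", "name": pod_name}},
            "function_name": "cap_capable",
            "args": [{"capability_arg": {"name": capability}}],
            "return": {"int_arg": result},
        }
    }


SAMPLE_EVENTS = [
    make_event("test-pod", "CAP_SYSLOG", 0),
    make_event("test-pod", "CAP_SYS_ADMIN", -1),
]

print(capability_profile(SAMPLE_EVENTS))
```

The granted set is a starting point for a least-privilege security context, while the denied set shows operations the workload attempted without holding the capability.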