There are many file integrity monitoring solutions available today. But when it comes to monitoring file events in real time in production environments, it is hard to find a solution that combines broad coverage of file events with low overhead. There are two main approaches: user space solutions, which come with a huge overhead, and solutions based on kernel modules, which are hard to operate. In this blog post we will advocate for a third approach, based on eBPF. eBPF is a kernel feature that allows developers to monitor kernel-level activity without needing to add kernel modules or patches.
What is eBPF and how does it compare to existing approaches?
eBPF is a relatively recent addition to the Linux kernel. BPF was originally introduced to speed up packet filtering by pushing most of the processing into kernel space. BPF was then extended into eBPF, which makes it possible to hook almost anywhere into the kernel. While kernel modules are usually the go-to solution to modify or monitor the kernel, they introduce a significant safety risk that most organizations will not want to take. eBPF programs, on the other hand, run in a dedicated virtual machine within the kernel and are checked by a verifier before they are loaded, so that they cannot cause a kernel panic.
Just as BPF sped up packet processing, eBPF can make File Integrity Monitoring (FIM) software faster, lower its overhead, and improve it overall by pushing logic into kernel space.
Most of the existing FIM solutions struggle with at least one of the following points:
- performance and overhead
- lack of process and container context
- signal to noise ratio
For example, a solution based on inotify lacks process and container context. auditd is notorious for its kernel overhead; it is also noisy, and its output is hard to process. Perf and other kernel tracing solutions struggle with filtering as well, and therefore with performance at scale. By using eBPF maps and the process context provided by the eBPF helper functions, it is possible to implement a powerful filtering mechanism inside the kernel, so that a user space program is no longer needed to filter out noise. For example, a service like sshd is expected to open a file such as /home/myuser/.ssh/authorized_keys. However, you’d want to be alerted immediately if this file was opened by a web server. Filtering in kernel space also means less context switching and, therefore, less kernel overhead.
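The sshd example above boils down to a lookup against a policy table. The following is a minimal user space sketch of that decision logic, not an eBPF program: the table `ALLOWED_OPENERS`, the function `should_report`, and the process name `nginx` are hypothetical, and paths are used as keys only to keep the illustration short.

```python
# Hypothetical policy table: monitored path -> process names allowed to open it.
# In an eBPF program, an eBPF map consulted in kernel space would play this role.
ALLOWED_OPENERS = {
    "/home/myuser/.ssh/authorized_keys": {"sshd"},
}


def should_report(path, comm):
    """Return True when an open event should be forwarded as an alert."""
    allowed = ALLOWED_OPENERS.get(path)
    if allowed is None:
        return False  # the path is not monitored: drop the event early
    return comm not in allowed  # monitored path opened by an unexpected process


# (path, process name) pairs standing in for open events seen by the kernel.
events = [
    ("/home/myuser/.ssh/authorized_keys", "sshd"),
    ("/home/myuser/.ssh/authorized_keys", "nginx"),
    ("/var/log/syslog", "nginx"),
]
alerts = [event for event in events if should_report(*event)]
```

Only the second event survives the filter. Everything else is discarded before it would have to cross into user space, which is where the reduction in noise and context switching comes from.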
With the rapid adoption of eBPF for multiple monitoring use cases, new features are constantly being added to the Linux kernel. For example, KRSI (Kernel Runtime Security Instrumentation) is an ongoing effort at Google, which will allow eBPF programs to be dynamically attached to LSM (Linux Security Modules) hook points and to implement MAC (Mandatory Access Control) policies. For our FIM use case, this means enforcing file access control based on user-defined and context-aware policies.
That being said, developing a FIM solution in eBPF brings quite a few challenges.
Why is it hard?
The paths that are given to the kernel through syscalls may be relative to the current directory or contain the “..” pattern. Resolving such a path is a complex process, handled by the kernel before it does the actual work on the files. This means that we cannot simply hook on system calls; we also need to hook symbols deeper in the kernel. These symbols may be added, renamed, or even removed from one kernel version to another. This can become very challenging if you want to support a wide range of kernels (CentOS/RHEL 7 uses a kernel based on 3.10, which was released in 2013).
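To illustrate why the raw syscall argument is not enough, here is a small user space sketch in which three different path strings all end up naming the same file. It uses `os.path.realpath` as a stand-in for the kernel's path resolution; the directory layout under a temporary root is made up for the demonstration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    root = os.path.realpath(root)  # normalise in case the temp dir is a symlink
    os.makedirs(os.path.join(root, "etc"))
    os.makedirs(os.path.join(root, "home"))
    target = os.path.join(root, "etc", "shadow")
    with open(target, "w") as f:
        f.write("demo\n")

    previous_cwd = os.getcwd()
    os.chdir(os.path.join(root, "home"))
    try:
        # Three strings a process could pass to open(); none of them match
        # each other byte for byte.
        candidates = [
            "../etc/shadow",
            os.path.join(root, "home", "..", "etc", "shadow"),
            target,
        ]
        resolved = {os.path.realpath(p) for p in candidates}
    finally:
        os.chdir(previous_cwd)
```

After resolution, `resolved` contains a single entry. A hook placed at the syscall boundary only sees the unresolved strings, which is why the monitoring has to attach further down, where the kernel has already done this work.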
In addition to that, the configuration options used to compile the kernel also vary. For instance, on Red Hat kernels, CONFIG_SECURITY_PATH is not enabled, which rules out the use of the security_path_* function family.
Compiler optimizations, such as function inlining (which is pretty common for small functions), can also make a symbol unreliable even though it appears in /proc/kallsyms: the compiler can choose to inline only some of the calls to a function, depending on the size of the calling function.
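A first sanity check before attaching to a symbol is to see whether the running kernel exports it at all. The sketch below parses text in the /proc/kallsyms format (address, type, name, optional module). The helper `parse_kallsyms` and the sample listing are illustrative assumptions; the sample was written by hand and not captured from a real system.

```python
def parse_kallsyms(text):
    """Return {symbol_name: symbol_type} from /proc/kallsyms-formatted text."""
    symbols = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        _address, sym_type, name = fields[:3]  # a 4th field, if any, is the module
        symbols[name] = sym_type
    return symbols


# Hand-written sample. Addresses read as zeros for unprivileged users.
SAMPLE = """\
0000000000000000 T vfs_open
0000000000000000 T security_inode_setattr
0000000000000000 t do_dentry_open
0000000000000000 t demo_helper\t[demo_module]
"""

symbols = parse_kallsyms(SAMPLE)
wanted = ["vfs_open", "security_path_chmod", "security_inode_setattr"]
missing = [name for name in wanted if name not in symbols]
```

On a real host, the same function could be fed the contents of /proc/kallsyms. As the paragraph above points out, a symbol being listed is necessary but not sufficient: some of its call sites may still have been inlined, so a probe on it can miss events.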
Hard links and mounts:
To identify files, mere mortals use paths, such as /etc/shadow, but the kernel uses inodes. With hard links, the very same file can be accessed using different paths. If we want to fully monitor a file, we need to monitor all the hard links to this file. It would be simple if we could list all the paths that identify the same file, but unfortunately this is not the case: we can only know the number of hard links that exist for a file.
The filesystem hierarchy is composed of folders and files. Mount points allow replacing folders or files with another hierarchy, which can be a disk or a folder from somewhere else in the hierarchy; the latter case is called a bind mount. Let’s consider this example: you are monitoring the file /etc/shadow. If the /etc folder is bind mounted to a different location, for instance /tmp/etc, the same file can be accessed from both locations.
Considering the inodes instead of the paths in eBPF programs may help to solve both problems.
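The idea of keying on inodes rather than paths can be tried out from user space. This is a minimal sketch using `os.stat`; the helper `file_key` and the file names are made up for the demonstration, and only the hard link case is shown because creating a bind mount requires elevated privileges.

```python
import os
import tempfile


def file_key(path):
    """Identify a file by (device, inode) instead of by its path."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


with tempfile.TemporaryDirectory() as root:
    original = os.path.join(root, "shadow")
    alias = os.path.join(root, "shadow-link")
    with open(original, "w") as f:
        f.write("demo\n")
    os.link(original, alias)  # create a second hard link to the same inode

    watched = {file_key(original)}  # watch list keyed by inode, not by path
    alias_is_watched = file_key(alias) in watched
    link_count = os.stat(original).st_nlink  # how many links exist, not where
```

The alias is matched even though its path was never registered, and `st_nlink` reports that two links exist without revealing where the other one lives. A file reached through a bind mount should report the same device and inode pair as well, so the same key is expected to cover that case.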
The code handling the syscalls is highly optimized. Even though eBPF bytecode is turned into native code at runtime, and only a limited number of instructions are allowed, placing kprobes can still hurt the performance of the system. Therefore, one needs to take extreme care with what is done inside the eBPF programs, for example by limiting the number of eBPF maps that are accessed in the program.
Thousands of syscalls can be issued in a second, so passing all these events to user space through a perf ring buffer for processing may not be an option: if the user space reader is too slow, the ring buffer will fill up, and events will be dropped. This is a problem for a security solution.
Implementing some kind of filtering in-kernel is a must-have to achieve good performance.
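The effect of a slow reader can be seen in a toy model. The sketch below is not the perf ring buffer; it is a bounded queue with made-up numbers (capacity, burst size, read rate), and the function `simulate` and its `keep` predicate are hypothetical names standing in for the producer, the reader, and the in-kernel filter.

```python
from collections import deque


def simulate(events, capacity, burst, reads_per_tick, keep=None):
    """Feed events into a bounded buffer in bursts; a slow reader drains it."""
    buffer = deque()
    delivered, dropped = [], 0
    for start in range(0, len(events), burst):
        for event in events[start:start + burst]:
            if keep is not None and not keep(event):
                continue  # filtered out before it ever reaches the buffer
            if len(buffer) >= capacity:
                dropped += 1  # buffer full: the event is lost
            else:
                buffer.append(event)
        for _ in range(min(reads_per_tick, len(buffer))):
            delivered.append(buffer.popleft())
    delivered.extend(buffer)  # the reader catches up once the bursts stop
    return delivered, dropped


# 1000 open events; one in fifty touches the monitored file.
events = [("open", "/etc/shadow" if i % 50 == 0 else "/tmp/scratch")
          for i in range(1000)]


def is_monitored(event):
    return event[1] == "/etc/shadow"


raw_delivered, raw_dropped = simulate(events, capacity=64, burst=100, reads_per_tick=10)
filtered_delivered, filtered_dropped = simulate(
    events, capacity=64, burst=100, reads_per_tick=10, keep=is_monitored)
```

With these parameters the unfiltered run loses events, including some that touch the monitored file, while the filtered run delivers all twenty relevant events with no drops. The reader is just as slow in both runs; the only difference is how much reaches the buffer.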
We have seen that the eBPF approach can address most of the problems that legacy solutions encounter. The eBPF constraints ensure safety while limiting the in-kernel overhead. Of course, there is still the challenge of collecting and analyzing the information coming from the kernel, but we have seen that even naive in-kernel pre-filtering can help in this area. Finally, being able to run code that collects the events and context information in a safe way is something that only eBPF-based solutions can provide. For more information on how such a solution is implemented, you can check out the File Integrity Monitoring feature implemented in the Datadog Agent.