Towards improving code integrity on Linux
This article presents a recipe for improving the security posture of Azure Linux instances that host container workloads, and for preventing attacks on their code integrity.
Managing deployments at Scale
When operating at large scale in Azure, we used image-based deployments to limit the number of bespoke instance configurations in the fleet, especially at the OS level. The primary components of a Linux-based virtual machine image are the kernel, the root filesystem, and several additional filesystems created on separate disk partitions. We established the provenance and trust of all binaries deployed in the fleet by requiring Secure Boot, which is enabled through Azure Trusted Launch. We also implemented signature validation of OCI containers through containerd.
Protecting the Root Filesystem
We implemented DM-Verity protection on the root filesystem, which makes it read-only at runtime. DM-Verity also makes offline attacks tamper-evident: if the filesystem contents are modified while the system is down, the affected blocks fail verification when they are next read. We protected the DM-Verity root hash by placing it on the kernel command line and enclosing that command line, together with the kernel and the initramfs, in a signed Unified Kernel Image (UKI). Secure Boot verifies the UKI signature, which in effect verifies the enclosed kernel, the kernel command line (including the DM-Verity root hash), and the contents of the initramfs.
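To illustrate why a single root hash is enough to protect an entire filesystem, here is a minimal sketch of a Merkle hash tree in Python. It is a toy model only: the block size, the zero padding, and the absence of a salt and superblock are simplifying assumptions, so the value it produces will not match what veritysetup computes for a real device. The function name toy_root_hash is hypothetical.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for this toy model


def _split_into_blocks(data: bytes) -> list:
    """Cut data into fixed-size blocks, zero-padding the last one."""
    blocks = [
        data[i:i + BLOCK_SIZE].ljust(BLOCK_SIZE, b"\0")
        for i in range(0, len(data), BLOCK_SIZE)
    ]
    return blocks or [b"\0" * BLOCK_SIZE]


def _next_level(blocks: list) -> list:
    """Hash every block and pack the digests into new hash blocks."""
    digests = b"".join(hashlib.sha256(b).digest() for b in blocks)
    return _split_into_blocks(digests)


def toy_root_hash(data: bytes) -> str:
    """Reduce the data to one block, level by level, and hash that block."""
    blocks = _split_into_blocks(data)
    while len(blocks) > 1:
        blocks = _next_level(blocks)
    return hashlib.sha256(blocks[0]).hexdigest()
```

Flipping a single byte anywhere in the input changes the digest of its block, which changes every hash block above it and therefore the root. This is the property that lets a root hash carried in a signed UKI vouch for the whole root filesystem.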
Runtime Code Integrity
We enabled Integrity Policy Enforcement (IPE), which operates as a Linux Security Module (LSM) within the Linux kernel. IPE leverages the immutable security properties of system components when making security decisions. It sits in the kernel's file access control path and checks each file's provenance: whether the file is protected by DM-Verity or FS-Verity, or whether it originates from the initramfs (which is verified by the bootloader). It then grants or denies access according to a configurable policy. Because it ties provenance to DM-Verity and FS-Verity, IPE also protects against offline attacks on the system.
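The decision logic described above can be sketched as a small rule evaluator. The policy text below uses property names from the upstream IPE documentation (boot_verified, dmverity_signature, fsverity_signature), but the policy itself, the parser, and the evaluate function are illustrative assumptions, not the kernel's implementation. In particular, falling back to DENY when no default is declared is a choice made for this sketch.

```python
POLICY_TEXT = """\
policy_name=toy_policy policy_version=0.0.1
DEFAULT action=DENY
op=EXECUTE boot_verified=TRUE action=ALLOW
op=EXECUTE dmverity_signature=TRUE action=ALLOW
op=EXECUTE fsverity_signature=TRUE action=ALLOW
"""


def _pairs(tokens):
    """Turn ['k=v', ...] into a dict."""
    return dict(tok.split("=", 1) for tok in tokens)


def parse_policy(text):
    """Parse the toy policy into (header, defaults, rules)."""
    lines = [ln.strip() for ln in text.splitlines()]
    lines = [ln for ln in lines if ln and not ln.startswith("#")]
    header = _pairs(lines[0].split())
    defaults, rules = {}, []
    for ln in lines[1:]:
        tokens = ln.split()
        if tokens[0] == "DEFAULT":
            kv = _pairs(tokens[1:])
            # A DEFAULT line without op= is stored under the key None.
            defaults[kv.get("op")] = kv["action"]
        else:
            kv = _pairs(tokens)
            op, action = kv.pop("op"), kv.pop("action")
            rules.append((op, kv, action))  # kv now holds only properties
    return header, defaults, rules


def evaluate(policy, op, file_props):
    """Return the action of the first matching rule, else the default."""
    _, defaults, rules = policy
    for rule_op, wanted, action in rules:
        matches = all(file_props.get(k, "FALSE") == v for k, v in wanted.items())
        if rule_op == op and matches:
            return action
    # Per-operation default first, then the global default.
    return defaults.get(op, defaults.get(None, "DENY"))
```

With this policy, a binary on a signed DM-Verity volume, described as {"dmverity_signature": "TRUE"}, is allowed to execute, while a file with no verified provenance falls through to the default and is denied.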
IPE Policy
The IPE policy is loaded and managed through securityfs (the pseudo-filesystem used by security modules) and is expected to be signed by a key in the kernel’s trusted keyring.
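As a rough sketch of what signing and loading a policy might look like, the helpers below build the signing command and the list of securityfs writes. The openssl flags and the securityfs paths follow the upstream IPE administrator guide as best we recall it, so they should be checked against the documentation for the kernel version in use. The function names and the file names in the usage note are hypothetical. Applying the plan requires root privileges on an IPE-enabled kernel, so the sketch keeps the planning step separate from the step that touches the system.

```python
from pathlib import Path

# Assumed securityfs mount point and IPE directory.
IPE_ROOT = Path("/sys/kernel/security/ipe")


def sign_command(policy_path, cert_path, key_path, out_path):
    """Build the argv that wraps the policy text in a signed PKCS#7 blob."""
    return [
        "openssl", "smime", "-sign",
        "-in", policy_path,
        "-signer", cert_path,
        "-inkey", key_path,
        "-noattr", "-nodetach", "-nosmimecap",
        "-outform", "der",
        "-out", out_path,
    ]


def deploy_plan(policy_name, signed_blob, ipe_root=IPE_ROOT):
    """List the (path, payload) writes that load and then activate a policy."""
    ipe_root = Path(ipe_root)
    return [
        (ipe_root / "new_policy", signed_blob),
        (ipe_root / "policies" / policy_name / "active", b"1"),
    ]


def apply_plan(plan):
    """Perform the writes. Needs root and an IPE-enabled kernel."""
    for path, payload in plan:
        path.write_bytes(payload)
```

A caller would pass the result of sign_command("toy.pol", "cert.pem", "key.pem", "toy.pol.p7b") to subprocess.run, read the resulting blob, and hand it to deploy_plan and apply_plan. The kernel rejects the first write if the signature does not chain to a trusted key.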
Protecting OCI container layers using IPE
Our chosen problem space consists of system workloads that run exclusively as OCI containers. We enforce signature validation through containerd for every OCI container admitted to the system. The container layers must also be protected from offline tampering, so we must enforce DM-Verity on the container layers as well. All OCI containers share the Linux kernel of their host system, and with it the single IPE policy, which by default covers every OCI container as well as the host components. Therefore, the IPE LSM can be leveraged to protect code integrity for our OCI container workloads.
Chain of Trust for IPE
For performance reasons, IPE does not verify file contents itself; it relies on the provenance established by other frameworks such as DM-Verity. Secure Boot is necessary for IPE to trust content stored in the initramfs: the initramfs is verified by the bootloader, which in turn is verified by the UEFI firmware. The UEFI DB contains the keys trusted by the cloud provider. When an IPE policy is updated, the new policy is expected to be signed by a key that chains to the kernel’s trusted keyrings.
Trusting 3rd Party Signed out-of-tree kernel modules through IPE
The Machine Owner Key (MOK) provides a way for users to add their keys to the system’s trusted key database to enable the kernel to securely load their modules that are not built by the entity that owned and signed the kernel. However, it is not scalable for a cloud provider to provision users’ keys through MOK on their platform that comprises thousands of hosts and millions of Virtual Machine instances.
Instead, the keys used to sign out-of-tree drivers can be counter-signed by a key that is provisioned in the kernel’s trusted keyring, and the resulting key can be added to the kernel’s secondary keyring. This would allow the kernel to load the out-of-tree drivers in a trustworthy manner.
Finally, consider out-of-tree drivers that reside on a volume protected by DM-Verity, and are therefore trusted by IPE through the Secure Boot chain. We are considering whether the kernel can trust such drivers even if they are not signed by a key rooted in the kernel’s keyring.
SELinux
SELinux is another Linux Security Module that executes as part of the Linux kernel and provides extensive Mandatory Access Control. Although SELinux cannot function while the system is offline, and therefore cannot protect against offline attacks on code integrity, it complements IPE by enforcing access control beyond code integrity, which IPE cannot do. For instance, when SELinux is enforcing, the CAP_MAC_ADMIN capability will be required to change the IPE policy file. Conversely, while IPE can enhance code integrity through provenance to verified and immutable system components, it cannot partition access among the different roles on the system, for example to restrict which roles may read a resource they have no intent to modify.
Lifecycle
Some scenarios are amenable to replacing an instance outright, while other deployments prefer to be updated in place. Either way, software will need to be patched for security issues and bugs. Because we made the root filesystem immutable, one straightforward way to patch it is an A/B update that replaces it with a patched version. This requires a partition layout designed to support such A/B updates.
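The A/B approach can be pictured as two root filesystem slots, only one of which is active at a time. The sketch below is a toy state model under assumed semantics (stage to the standby slot, switch only after a health check passes, otherwise keep the old slot). The class names, the health check callback, and the version strings are hypothetical and do not describe a specific bootloader or update agent.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    name: str
    version: str
    bootable: bool = True


class ABUpdater:
    """Toy model of an A/B root filesystem update."""

    def __init__(self, active: Slot, standby: Slot):
        self.active = active
        self.standby = standby

    def stage(self, new_version: str) -> None:
        """Write the patched image to the standby slot only."""
        self.standby.version = new_version
        self.standby.bootable = True

    def reboot(self, health_check) -> bool:
        """Boot the standby slot; swap roles only if it proves healthy."""
        candidate = self.standby
        if candidate.bootable and health_check(candidate):
            self.active, self.standby = candidate, self.active
            return True
        # Failed boot: mark the slot bad and stay on the current one.
        candidate.bootable = False
        return False
```

Because the active slot is never written during staging, its DM-Verity protection stays intact throughout, and a failed update leaves the system on the previous, known-good image. Each slot's image would need its own root hash carried in the signed boot artifact that selects it.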