Live Patching: A Down-in-the-Trenches View
Live patching is an interesting answer to the question of how to minimize downtime, and sometimes risk, when performing a security update, compared to other options. Typically live patching is only done to operating systems and hypervisors. Everyone should use live patches from their software vendor rather than making their own. But how does live patching work under the hood? How are live patches made? And if you don't have a software vendor, is making your own an impossibly difficult task?
Trampolines and editing a function in place are two different methods of performing a live patch. Trampolines have fewer constraints and can use affordances present for probes or tracing.
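To make the trampoline idea concrete, here is a minimal sketch that assumes an x86-64 target, which this abstract does not itself specify. It only computes the five bytes of a `jmp rel32` that would redirect the start of an old function to its replacement. The function name `encode_trampoline` and the addresses in the usage note are hypothetical, and nothing is written to executable memory. A real live patch framework also has to handle quiescing CPUs, memory permissions, and reverting the change.

```c
#include <assert.h>
#include <stdint.h>

#define JMP_REL32_LEN 5  /* opcode 0xE9 followed by a 32-bit displacement */

/*
 * Build the bytes of an x86-64 "jmp rel32" that, if placed at old_addr,
 * would transfer control to new_addr. Returns 0 on success, or -1 when the
 * target is outside the signed 32-bit displacement range.
 * Hypothetical helper: it only fills a buffer, it patches nothing.
 */
static int encode_trampoline(uint64_t old_addr, uint64_t new_addr,
                             uint8_t out[JMP_REL32_LEN])
{
    assert(out != 0);

    /* The displacement is relative to the instruction after the jmp. */
    int64_t disp = (int64_t)(new_addr - (old_addr + JMP_REL32_LEN));
    if (disp > INT32_MAX || disp < INT32_MIN)
        return -1;

    uint32_t rel = (uint32_t)(int32_t)disp;
    out[0] = 0xE9;                          /* jmp rel32 opcode */
    out[1] = (uint8_t)(rel & 0xFF);         /* displacement, little-endian */
    out[2] = (uint8_t)((rel >> 8) & 0xFF);
    out[3] = (uint8_t)((rel >> 16) & 0xFF);
    out[4] = (uint8_t)((rel >> 24) & 0xFF);
    return 0;
}
```

For example, calling `encode_trampoline(0x1000, 0x2000, buf)` fills `buf` with `E9 FB 0F 00 00`, because the displacement is 0x2000 - 0x1005 = 0xFFB. Editing a function in place has no such fixed-size footprint, which is one reason it carries more constraints.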
Blindly generating a live patch may result in a change that is ineffective or actively corrupts data, even if it can be naively compiled and applied. The ease and practicality of making a live patch depend on when the code being modified is executed, if and how data structures are modified, and how those data structures are used. Some security updates also rely on updating processor microcode. Hooks can be used to check the safety of applying a patch or to edit existing data structures.
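The two roles of hooks mentioned above can be illustrated with a small sketch. Everything in it is hypothetical: `struct subsystem_state`, `FLAG_NEEDS_FIXUP`, `pre_apply_check`, and `post_apply_fixup` are invented for illustration and are not the hook interface of Xen or Linux. It assumes a patch whose new code expects a flag bit to be clear in data that already exists.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state used by the function being patched. */
struct subsystem_state {
    unsigned int in_flight;  /* operations still relying on the old behaviour */
    unsigned int flags;      /* existing field that the new code reinterprets */
};

#define FLAG_NEEDS_FIXUP 0x1u  /* hypothetical bit the patched code expects clear */

/* Safety check run before applying: 0 means proceed, nonzero vetoes the patch. */
static int pre_apply_check(const struct subsystem_state *s)
{
    assert(s != NULL);
    return s->in_flight != 0 ? -1 : 0;
}

/* Fixup run after applying: bring existing data in line with the new code. */
static void post_apply_fixup(struct subsystem_state *states, size_t n)
{
    assert(states != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        states[i].flags &= ~FLAG_NEEDS_FIXUP;
}
```

The design point is that the check and the fixup are separate steps: refusing to apply costs nothing, while a fixup that runs against data still in use by old code is exactly the kind of change that can corrupt state.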
Live patch tools, at least for Xen, work by comparing the object code for each function before and after the patch. Using the same compiler and compiler options when making a live patch as were used for the original executable is safest. A function's object code can also change for reasons unrelated to changes in its source code. Comparing the pre- and post-patch assembly is a useful way to review these changes and also serves as a cross-check on the safety and correctness of the intended update. Some specific examples and mitigations will be discussed.
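The comparison step can be sketched in a few lines. This is a simplified illustration, not how any particular tool is implemented: `struct func_code`, `func_changed`, and `changed_functions` are hypothetical names, and the sketch assumes both builds list the same functions in the same order. It also ignores relocations, symbols, and sections, which real tools must deal with.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record of one function's machine code taken from one build. */
struct func_code {
    const char *name;
    const unsigned char *bytes;
    size_t len;
};

/* Return 1 if the two builds of a function differ, 0 if they are identical. */
static int func_changed(const struct func_code *pre,
                        const struct func_code *post)
{
    if (pre->len != post->len)
        return 1;
    return memcmp(pre->bytes, post->bytes, pre->len) != 0;
}

/*
 * Collect the names of functions whose code differs between builds.
 * Assumes pre[] and post[] hold the same functions in the same order.
 * out[] must have room for n entries. Returns the number of names stored.
 */
static size_t changed_functions(const struct func_code *pre,
                                const struct func_code *post,
                                size_t n, const char **out)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        assert(strcmp(pre[i].name, post[i].name) == 0);
        if (func_changed(&pre[i], &post[i]))
            out[count++] = pre[i].name;
    }
    return count;
}
```

A function that appears in the output without a matching source change is the signal to go and read the assembly: a different inlining decision or a changed constant can alter object code that the source diff never touched.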
Sometimes bugs in live patching code mean a live patch works on one processor family but not another. Live patches can also be stacked on top of each other. A live patch may exceed a payload size limit, or it may modify functions already modified by an earlier live patch. Care must be taken to avoid race conditions or invalid state if a logical patch cannot be applied as a single operation.
Some familiarity with C and with what assembly and machine code are will be helpful for understanding this presentation. Live patching in the Xen hypervisor, which is simpler software than Linux, will be the primary example, though Linux will be discussed at a high level.
This presentation is from the perspective of someone who generated live patches as one task among many and who sometimes opted not to live patch a change even when a live patch was hypothetically possible. It will not exhaustively cover all failure cases or methodology. Any use of information from this presentation is at your own risk. The presenter will not be responsible for any crashes, corrupted data, or devices set on fire from applying custom-built live patches.