Rambles around computer science

Diverting trains of thought, wasting precious time

Tue, 08 Apr 2014

Dynamic linking and security

The thoughts in this post were provoked after reading Tim Brown's very interesting Breaking the Links article.

Dynamic linkers are notable for privilege escalation bugs. The reason is their interaction with the setuid mechanism, and indeed any mechanism that associates privileges with an executable. Unix's original model where executables are trusted in their entirety is fundamentally flawed on modern platforms that have shared libraries, where executables usually link in other code, some of which can be supplied by the user. Rather than getting rid of the now-flawed setuid mechanism, currently dynamic linkers instead impose a raft of ad-hoc restrictions, lashed together in the hope of closing off any paths by which user-supplied code can get into a setuid process. They must also balance this against another goal: to avoid the unwanted side-effect of ruling out some perfectly trustworthy compositions. Unfortunately, these ad-hoc measures invariably fail on both counts.

What does setuid mean? It means that the invoking user has access to any behaviour allowed by the setuid program, as executing with the program owner's effective uid. Attackers escalate their privileges by introducing unanticipated code which widens that set of behaviours. Can we take a better approach? One naive idea would be to construct the process as normal, and then check that it includes only trusted code; at that point, we decide whether it runs with elevated privileges or not. (A wart is that we also have to account for code loading after the start of execution; let's ignore that for now.)

Does this break anything? Certainly it will do. I might run a job that spawns a process tree in which some subprocess is setuid. Normally I can run the tree with some other library LD_PRELOADed, expecting that although my library won't get preloaded into the setuid process, that process will still run with elevated privileges. Under our proposed new model, if we do the preloading then discover that the preloaded library is not trustworthy, we will run it with lower privileges, and likely break the process tree (assuming the process really needed to be setuid).

This is a feature interaction, and what we need is a policy for resolving the interaction. Current Unices have the policy that “setuid clobbers LD_PRELOAD”. The alternative we just considered is that “LD_PRELOAD clobbers setuid”. Neither of these seems adequate. Perhaps instead we can evolve things towards a more subtle mechanism that can avoid the interaction in the first place, perhaps by selecting among untrusted and trusted libraries. For example, if there are multiple available versions of some library, we might use the more trustworthy one instead of the one that a fixed set of name lookup rules guides us towards. In general, we can see this as resolving ambiguity among a set of depended-on library specifications in a way that maximises value (both trust and functionality).

Doing so requires a way to designate what code is trusted, not just what executables are trusted. We also need a sense of what alternative ways there are of satisfying “the same” link requirement. I have been using the example of LD_PRELOAD so far, but on ELF platforms, link requirements (beyond the executable) are specified as either a PRELOAD or (more often) a NEEDED, a.k.a the DT_NEEDED header of ELF's .dynamic section.

To find alternative implementations of “the same” requirement, we can mine the redundancy inherent in RUNPATH, LD_LIBRARY_PATH and perhaps the multiple notions of ORIGIN that can created by hard-linking. Each of these might provide multiple candidate libraries. Setting up a fake ORIGIN is a trick familiar to crackers, but we can turn it around by enumerating all possible ORIGINs of a given shared object and considering all the libraries we find there. (Sadly this requires a scan over all directories in the filesystem, and in the common case will yield only one library. But this approach will defeat link-based attacks, since even after hard-linking, we will still find the original library, and any sensible trust heuristic will select it in preference.) The ABI tag matching (modified by LD_ASSUME_KERNEL) is another instance of how the linker will look for libraries in particular places satisfying particular properties, in a way that is currently very rigid but could be generalised into a search/optimisation problem where paths supplied by developers, administrators and users are used as hints and bootstrapping input, rather than definitive instructions.

This approach brings two further problems. Firstly, what's to prevent us from choosing a probable-looking binary that is semantically broken (with respect to our use of it)? We can argue that all binaries with the same soname should be interchangeable, but in practice there will be difficulties. And matching by soname might be too restrictive anyway. Secondly, like any search- or rule-based system, our approach has a “delocalising” effect, lessening the administrator's explicit control and making the linker's behaviour more complex to configure and debug.

Another subtlety is that trust in the part is not the same as trust in the whole. Even if we refine Unix's notion of trustedness down to libraries rather than just executables, some exploits can work by combining trusted code in untrusted ways. The case of another linker exploit, CVE-2010-3856, is one instance of this: the library is sane enough that it could easily be deemed trusted, but we can construct a very specific context in which it is anything but. (This context is: use it as a linker-auditing library to a setuid binary, causing its constructor to be run with elevated EUID, hence allowing a temporary file exploit that would not emerge in “normal” contexts where the constructor did not have elevated privileges.) This is a classic “confused deputy” situation.

Confused deputies are always a good argument for yet finer-grained models of privilege, such as capabilities. So it's not clear whether we would get much security value from search-based link-time composition, relative to plumbing a better model more deeply into our operating system.

[/research] permanent link contact

validate this page