Rust for Morello

Traditionally, a pointer is simply a number containing the address of a location in memory. CHERI capabilities augment this by adding several pieces of metadata, including bounds, permissions, and a validity tag.

This causes a problem for ports of Rust.

[Figure: diagram of the structure of a capability in memory]

Rust defines a type, usize, which is an integer type the same size as a pointer. Because capabilities significantly change the representation and behavior of pointers, any Rust port to a CHERI-enabled architecture will have to change the definition of usize. In existing code, usize serves several purposes:

  1. array indexing (data[4])
  2. representing sizes of objects (std::mem::size_of::<T>())
  3. storing addresses for more complex pointer arithmetic (pointer as usize & !1)
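
The three uses above can be sketched in ordinary Rust as follows. This is a minimal illustration for a conventional (non-CHERI) target; the function names nth, size_of_u64 and clear_low_bit are hypothetical and exist only for this example.

```rust
use std::mem::size_of;

/// Use 1: `usize` as an array index.
fn nth(data: &[u32], index: usize) -> u32 {
    data[index]
}

/// Use 2: `usize` as the size of an object in bytes.
fn size_of_u64() -> usize {
    size_of::<u64>()
}

/// Use 3: `usize` as an address, here with the low bit masked off.
fn clear_low_bit<T>(pointer: *const T) -> usize {
    (pointer as usize) & !1
}

fn main() {
    let data = [10u32, 20, 30, 40, 50];
    println!("data[4] = {}", nth(&data, 4));
    println!("size of u64 = {} bytes", size_of_u64());
    println!("masked address = {:#x}", clear_low_bit(data.as_ptr()));
}
```

Only the third use depends on usize being related to a pointer at all; the first two only need an integer wide enough to count bytes or elements.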

There are two reasonable approaches to solving this:

  1. make usize 128 bits wide to match the in-memory size of a capability
  2. make usize 64 bits wide to match the range of addresses a capability can reference

Solution 1 was explored by Nicholas Sim as part of a master's project, with the conclusion that it led to compromised performance and was not ideal. In use cases 1 and 2 it doubles the amount of storage used for no gain, increasing memory overhead and wasting processor time. In use case 3 it allows “round trip” pointer casts (pointer as usize as *const T) to work, which is convenient, but may result in strange behavior in any code that assumes that a pointer cast to usize contains only an address.

Solution 2 breaks the assumption that round trip casts can be performed safely (the resulting capability will be invalid), but has the advantage of avoiding all of the pitfalls of solution 1 (performance impact and broken assumptions in existing code).
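
To make the round-trip issue concrete, the sketch below shows two ways of clearing the low bit of a pointer's address. The function names are hypothetical. On a conventional target both return the same address. Under solution 2, the first would be expected to produce an invalid capability, because the pointer is rebuilt from a bare integer. The second only offsets the original pointer, so it is a plausible way to keep the metadata intact; whether it does so in every case on a given CHERI port is an assumption, not something this example verifies.

```rust
/// Clears the low address bit by round-tripping through `usize`.
/// The result is built from an integer, not from the original pointer.
fn clear_low_bit_via_usize(pointer: *const u8) -> *const u8 {
    ((pointer as usize) & !1) as *const u8
}

/// Clears the low address bit by offsetting the original pointer.
/// `usize` is only used to compute how far to move.
fn clear_low_bit_keeping_pointer(pointer: *const u8) -> *const u8 {
    pointer.wrapping_sub((pointer as usize) & 1)
}

fn main() {
    // A `u16` array is at least 2-byte aligned, so `base` has an even address.
    let buffer = [0u16; 2];
    let base = buffer.as_ptr() as *const u8;
    let odd = base.wrapping_add(1);

    // Both approaches agree on the resulting address.
    assert_eq!(clear_low_bit_via_usize(odd), base);
    assert_eq!(clear_low_bit_keeping_pointer(odd), base);
    println!("both approaches yield {:p}", base);
}
```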

Our fork of the Rust compiler implements solution 2. Pointer types (*const T, *mut T) are 128 bits wide and represented using CHERI capabilities. usize is a 64-bit-wide unsigned integer type. Casting a capability to usize, &data as *const _ as usize, will get the address of the memory being pointed to, and discard any metadata. Casting a usize to a capability, 0xdead_beef as *const T, will produce an invalid capability (the validity tag will be unset, so dereferencing it will trigger an exception).
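
The casting behavior described above can be modeled in plain software. The following is a toy model only: ToyCapability, its fields and its methods are hypothetical, and the layout is not the real Morello capability encoding. It simply shows an address surviving a round trip while the metadata and validity tag do not.

```rust
/// Toy software model of a capability. NOT the real Morello encoding.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ToyCapability {
    address: u64, // where the capability points
    base: u64,    // lowest address it may access (metadata)
    length: u64,  // size of the accessible region (metadata)
    tag: bool,    // validity tag
}

impl ToyCapability {
    /// Models `pointer as usize`: keep the address, discard the metadata.
    fn to_address(self) -> u64 {
        self.address
    }

    /// Models `address as *const T`: address only, validity tag unset.
    fn from_address(address: u64) -> Self {
        ToyCapability { address, base: 0, length: 0, tag: false }
    }

    /// Models a dereference: fails unless the tag is set and the
    /// address lies within the bounds.
    fn check_access(self) -> Result<u64, &'static str> {
        if !self.tag {
            return Err("tag violation");
        }
        if self.address < self.base || self.address - self.base >= self.length {
            return Err("bounds violation");
        }
        Ok(self.address)
    }
}

fn main() {
    let valid = ToyCapability { address: 0x1008, base: 0x1000, length: 0x100, tag: true };
    assert_eq!(valid.check_access(), Ok(0x1008));

    // Round trip: the address is preserved, the validity is not.
    let round_tripped = ToyCapability::from_address(valid.to_address());
    assert_eq!(round_tripped.address, valid.address);
    assert_eq!(round_tripped.check_access(), Err("tag violation"));
    println!("round-tripped capability: {:?}", round_tripped);
}
```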

Discussion about how best to solve this problem in Rust proper is ongoing. In August 2023 we advanced a “pre-RFC” proposing our approach as a solution, which led to some interesting discussion, especially on Zulip. This issue has been discussed on a number of previous occasions (the pre-RFC includes a list of these discussions).

One recurring source of concern is the potential for breakage of existing Rust programs. Likely sources of bugs identified so far include:

To help address some of this concern, we organised an experiment using the Rust community's Crater tool. Crater compiles a very large number of Rust projects (about 380,000) from around the community using a given build of the Rust compiler. Our experiment used this to apply an existing lint, fuzzy_provenance_casts, which detects casts from usize to pointer types that could later result in exceptions. We found that fewer than 1% of the projects tested had the potential to generate CHERI exceptions due to casting. The full analysis is available on GitHub.