OxCaml logo Jane Street logo

Adding null to types

This document describes the way we can add a null value to types, thus granting a non-allocating version of option. This extension builds on the layouts system described in the main document for unboxed types. Before reading this document, you may wish to read up through the layouts section of the main document.

Adding null

Now that layouts provide a way to segment the world of types, we can leverage this feature to provide an option type – dubbed or_null – that requires no allocation. This section describes how it all works.

The value_or_null layout

The key observation powering or_null is that no ordinary OCaml value equals the word 0. A valid pointer will always be non-null, and an immediate (represented as a tagged integer) will always have its bottom bit set. Thus, all are different from 0. Therefore, we can safely update the garbage collector not to traverse null pointers.

We thus want t or_null to be just like t, except now with a meaning for 0. There is still a small problem, though: we cannot ever have t or_null or_null, as that gives two meanings for 0. We thus say that the argument to or_null must be a non-null type; that is, it has room for 0. However, t or_null is a maybe-null type; that is, it uses 0 in its representation.

We call the legacy OCaml non-null layout value and the new maybe-null layout value_or_null.

Here is the layout structure:

              any
               |
         value_or_null
               |
             value

Note that value_or_null is above value. This is because anything that can be done with a maybe-null type can be done with a non-null one, but non-null types have an extra capability (they can be the argument to or_null); this is thus the usual case where a subtype has an extra capability over the supertype.

The layout value_or_null is concrete: we can compile a function that manipulates value_or_nulls.

Along with the kind-system changes above, the following type will be in the initial environment:

type ('a : value) or_null : value_or_null

The with-kinds feature allows 'a or_null to mode-cross the same way that 'a does – this permits, e.g. int or_null to still be subject to optimizations around immediates.

The or_null type has constructors This and Null (usable for both construction and pattern-matching), subject to the usual type-based disambiguation.

Defaults

In the diagram above, the value_or_null layout includes types that do contain null, whereas the value layout is the one that includes all of today’s types, and is the default for abstract types and type variables. Why?

Whatever default we choose for abstract types will also determine the default for type variables due to backwards compatibility. Choosing non-null would mean that all pre-existing OCaml types are valid parameters for 'a or_null, whereas choosing maybe-null allows or_null to be a type argument.

Most types in OCaml programs lack arguments, and those that do are typically container types like 'a list, which we can modify preemptively. Thus, we default abstract types to non-null to reduce the annotation burden on programmers.

Arrays

The float array optimization demands that all types in OCaml are separable: either all elements of that type are float values, or none of them are. But float or_null is not separable: as possible values, it has all floating-point numbers, plus the null pointer. Therefore, float or_null arrays must be forbidden.

This can be done by either declaring float to be with-null or by requiring array elements to be non-null. Since non-null is the default, the former would break legacy code, so we choose the latter. We introduce the any_non_null layout for array elements.

Examples

A safe, non-allocating hd

(* in list.ml *)
let hd' = function
  | [] -> Null
  | x :: _ -> This x
;;

(* in list.mli *)
val hd' : 'a t -> 'a or_null
(* ['a] has layout [value] by default*)

(* somewhere else *)
let f xs default = match List.hd' xs with
  | Null -> default
  | This x -> x
;;
(* we'll infer [f : 'a list -> 'a -> 'a] *)

Note that 'a list can be modified to hold or_null values, and the specialized hd' function will still work.