Compile-time checking of non-allocating functions
This page describes support for [@zero_alloc] attributes on functions. These
annotations are checked statically at compile time. If the check passes, it guarantees
that at runtime there will be no allocations on any execution of the function, including
in the callees.
For example, the check succeeds on add and fails on add64:
let[@zero_alloc] add b x y = if b then x + y else x
let[@zero_alloc] add64 b (x:int64) (y:int64) = if b then Int64.add x y else x
The check forbids allocations on the OCaml heap and other events that may trigger garbage
collection. Local allocations are allowed. For example, the check succeeds on local_pair
but fails on pair:
let[@zero_alloc] local_pair x y = stack_ (x,y)
let[@zero_alloc] pair x y = (x,y)
The check is conservative in the sense that it may fail even if the function never allocates. In particular, all indirect calls are considered to be allocating. For example, the check fails on the following:
let[@zero_alloc] rec iter f l =
match l with
| [] -> ()
| hd::tl -> f hd; iter f tl
A caller of iter can be shown to be non-allocating, for example
the check succeeds when sum is compiled with -O3 and iter is inlined:
let[@zero_alloc opt] sum l =
let c @ local = ref 0 in
iter (fun n -> c := add (no_overflow !c n) !c n) l;
!c
Note the use of opt payload to mark this function as zero_alloc in optimized builds
only.
There is no need to annotate all callees of a zero_alloc function for the check to pass,
and in fact the callees might not pass the check, as the above example of iter shows.
The compiler analyzes all callees to determine their allocation behavior, but only checks
that functions annotated with @zero_alloc do not allocate. Summary of allocation
behavior of all functions is stored in .cmx file and used to analyze callers from other
compilation units.
How to enable the check?
The check is enabled by default for [@zero_alloc] attributes without opt payload.
These annotation should only be used on functions that pass the check in all builds,
even unoptimized builds.
For optimized builds, -zero-alloc-check all compiler flag enables
checking of [@zero_alloc opt] annotations in addition to the default ones.
Relaxed vs strict annotation
Our goal is to make the analysis precise enough to be useful in practice. Consider for
example the following function from core_kernel/iobuf/src/iobuf_expert.ml:
let[@zero_alloc] reinitialize_of_bigstring t ~pos ~len buf =
let str_len = Bigstring.length buf in
if pos < 0 || pos > str_len
then
raise_s
[%message
"Expert.reinitialize_of_bigstring got invalid pos" (pos : int) (str_len : int)];
let max_len = str_len - pos in
if len < 0 || len > max_len
then
raise_s
[%message
"Expert.reinitialize_of_bigstring got invalid len" (len : int) (max_len : int)];
let lo = pos in
let hi = pos + len in
unsafe_reinitialize t ~lo_min:lo ~lo ~hi ~hi_max:hi buf
;;
This function allocates on the error paths. However, such allocations rarely matter for
hot code performance. Therefore, by default, the check uses a “relaxed” meaning of
zero_alloc that ignores allocations on paths that lead to an exceptional return from the
function. The check succeeds on this example.
If we change the annotation of reinitialize_of_bigstring from [@zero_alloc] to
[@zero_alloc strict], the check will fail with the following error:
File "core_kernel/iobuf/src/iobuf_expert.ml", line 1948, characters 7-17:
Error: Annotation check for zero_alloc strict failed on function Iobuf_expert.reinitialize_of_bigstring (camlIobuf_expert__reinitialize_of_bigstring_9401)
The “strict” meaning ensures no allocation on any paths.
Treatment of raise_notrace vs raise with a backtrace
The relaxed meaning of @zero_alloc only applies to exceptions raised with a
backtrace. Our reasoning is that raise_notrace can be used for normal control flow,
whereas raise with a backtrace is expensive enough to only be performed on error paths
that can be ignored in zero_alloc code. If the code is compiled without -g flag or
with backtraces disabled, all raise are treated as raise_notrace. In that case, the
check will fail on all “relaxed” annotations that are not also “strict”.
Annotation assume
Sometimes, the “relaxed” meaning above is not sufficiently permissive.
For example, consider a function that allocates a designated value for
error, rather than raising an exception. The check of “relaxed”
annotation will fail. If we don’t care about allocation in the error
case, we can annotate the function that produces the error value with
[@zero_alloc assume]. For example, the check succeeds for test:
let[@cold][@zero_alloc assume] log_error msg n = print_string msg; Error n
let check ~limit l =
match l with
| None -> log_error "empty" 0
| Some n -> if n > limit then log_error "over" n else Ok ()
let[@zero_alloc] test ~limit ~default l =
match check ~limit l with
| Ok _ -> l
| Error n -> default
What if assume is not enough?
A common pattern is to perform error handling in an exception handler and return an error code instead of reraising. For example:
let[@zero_alloc] g = ...
let[@cold][@zero_alloc assume] handle_exn exn =
print_s [%message "something went wrong" ~_:(exn:exn)];
Answer.Option.none
let[@zero_alloc] f () =
match g () with
| exception exn -> handle_exn exn
| answer -> Answer.Option.some (answer * 2)
The check of f fails in this example, even when the check of g succeeds. The check of
g ignores allocations in g on a path that returns from g by raising an exception,
but f catches the exception and returns normally. As a result, these allocations in g
appear on a path to a normal return from f and the check of f fails. It is not enough
to annotate handle_exn with [@zero_alloc assume] because it still permits
handle_exn to return normally to f and continue executing
within the “zero alloc” scope of f.
To address it, handle_exn can be annotated with [@zero_alloc assume error]. The
intended meaning of assume error is to exit the scope of “zero_alloc” code due to an
error case in which we do not care about allocations. With assume error, the check
ignores all allocations on paths that go through handle_exn and treats
handle_exn as if it gets “stuck” (does not return normally and does not raise).
Note that allocations after a call to handle_exn are also ignored, in addition to
allocation leading up to the call and within handle_exn itself.
Where can I use “assume”?
It is possible to place [@zero_alloc ..] annotation on function definitions and
let-bindings of functions (as in the example above).
It is also possible to annotate function applications
with [@zero_alloc assume ..]. In the above example, instead of annotating
log_error definition, we can annotate specific call sites:
let check ~limit l =
match l with
| None -> log_error "empty" 0
| Some n -> if n > limit then (log_error[@zero_alloc assume]) "over" n else Ok ()
It is also possible to annotate applications
of external functions with [@zero_alloc assume ..].
Functions that are not cold and may be inlined can also be annotated with
[@zero_alloc assume ..] (this was not the case in early versions of the checker).
It is not yet possible to annotate arbitrary sub-expressions or code blocks.
assume_unless_opt vs opt vs assume
A function annotated with [@zero_alloc assume] is never checked.
A function annotated with [@zero_alloc opt] is not checked in a non-optimized build.
This is inconvenient because the signature of such a function cannot be annotated with
[@zero_alloc] and its callers will often fail the check in a non-optimized build too,
and would also need to be annotated with [zero_alloc opt] and therefore defer their
checking to optimized builds.
The payload assume_unless_opt can be used to ensure that a function is zero_alloc in
an optimized build and allow its callers to treat it as non-allocating even in
non-optimized builds. This allows for those callers to be checked as much as possible in
non-optimized builds, to detect as early as possible if an allocation is introduced
elsewhere.
By default, [@zero_alloc assume_unless_opt] is treated as
[@zero_alloc assume] on function applications and implementations.
If -zero-alloc-check all compiler flag is used, then [@zero_alloc assume_unless_opt]
is treated as [@zero_alloc] on function implementation,
and ignored on application.
Use in signatures
Signature items may be annotated zero_alloc, as in:
val[@zero_alloc] f : int -> int
Such annotations have two effects:
- In modules implementing such a signature, the corresponding function is checked for
zero_alloc. The implementation of the function need not be annotated explicitly withzero_alloc(this was not the case in early versions of the checker). - The
zero_allocanalysis off’s callers can use the factfis zero alloc. This does not rely on inlining, sofis reliably known to be zero alloc even when compiling an application of it in a different library.
Legal payloads and module typing
The payload of a such an annotation may contain the strict and opt
directives (as well as a new arity directive, more on which below). If opt
is present, the information will only be available to clients in optimized
builds. Implementing modules must annotate the corresponding function with a
zero_alloc attribute that is at least as strong, so, for example, this
typechecks:
module M : sig
val f[@zero_alloc] : int -> int
end = struct
let f[@zero_alloc strict] x = x
end
However, if the signature had strict and the struct did not, this would be
rejected by the typechecker.
It is legal to satisfy a signature’s zero_alloc requirement with a zero_alloc
assume annotation in the implementation.
Arity
The compiler will infer the arity of a declaration with a [@zero_alloc]
attribute by counting the number of arrows in the type. Applications of this
function will only be considered zero_alloc when fully applied. For
example, if a library declares a function f:
val[@zero_alloc] f : 'a -> 'a -> 'a
Then, in code that uses this library, the zero_alloc analysis will know that
applications of f to two arguments do not allocate. Applications of f to any
other number of arguments will still be considered allocating (unless
cross-module inlining is available and the compiler can see from f’s
definition that it also does not allocate in those cases).
Sometimes it is necessary to override the arity inferred by the compiler here.
For example, the compiler will reject this signature because the [@zero_alloc]
attribute is present on a declaration without any arrows in its type:
type t = int -> int
val[@zero_alloc] f : t
This behavior can be overridden by specifying the arity manually:
type t = int -> int
val[@zero_alloc arity 1] f : t
There are also cases where arrows are present but is still desirable to override the arity, as here:
module M : sig
val[@zero_alloc arity 1] f : int -> int -> int
end = struct
let[@zero_alloc] f x =
if x = 42 then fun y -> (y,y) else fun y -> (y,y+1)
end
Attribute [@@noalloc] vs [@zero_alloc]
The check of [@zero_alloc] annotations on OCaml functions relies on [@@noalloc]
annotations of any external declarations they refer to in order to determine the
allocation behavior of the corresponding external functions. Without [@@noalloc], the
check conservatively assumes that external code allocates. The check treats [@@noalloc]
attribute using the “strict” meaning, as if the external declaration is annotated with
[@zero_alloc assume strict].
There is currently no way to annotate external declarations with
[@zero_alloc assume] for the “relaxed” meaning of the check. A
workaround is to annotate the call sites with [@zero_alloc assume].
An alternative workaround is to define a wrapper OCaml function for
the external and annotate the wrapper with [@zero_alloc assume][@inline always]
to make sure it is always inlined and has no
impact on the generate code.
Note that the presence of [@@noalloc] affects the generated code whereas [@zero_alloc]
is only used for checking and error reporting at compile time.
Interaction with safepoints in OCaml 5
Currently, the check ignores poll instructions (otherwise all loops would be considered
allocating when poll insertion is enabled). Therefore, [@zero_alloc] is not sufficient
to ensure the absence of context switches.
Interaction with inlining and other optimizations
The check is very sensitive to the optimization level and other compilation flags. This is by design and not expected to change. To give the most accurate results, the check for allocations is performed late in the compilation process, after all the optimizations such as inlining, specialization, static allocation and unboxing. The result of the check can sometimes be surprising and counter-intuitive at the source code level.
Dynamic vs static checking
Expect_test_helpers_cre.require_no_allocation counts the exact
number of words allocated in a particular program execution at
runtime, whereas [@zero_alloc] check considers all possible
executions without ever running the program and may over-approximate
the number of allocations. The two approaches are complementary.
We recommend using both.
What is the overhead?
When the check is enabled, the overhead on build time is low. It is just one additional lightweight pass on each compilation unit.
The size of .cmx files can increase slightly. Preliminary measurements on the compiler
distribution itself showed less than 0.5% increase with flambda2. Current
format is naive and can be optimized if this becomes a bottleneck.
There is no runtime overhead!
There is no change to generated code as a result of enabling the check or adding
[@zero_alloc] annotations on functions, except the described below.
Dead code elimination
The presence of zero_alloc annotation can prevent dead code elimination.
Consider for example the function foo:
let[@zero_alloc strict] foo t =
if debug then log "foo" t [%sexp_of: _ t];
bar t
We expect this function to pass the check when debug is statically set to false, and
fail otherwise.
If foo is not exposed in the interface and gets inlined into all call sites, the optimizer
can remove the definition of foo. The presence of zero_alloc annotation delayes the elimination of foo so that the compiler can perform the check after all other optimizations.
The presence of the function may prevent some other
optimizations from triggering.
How do I figure out why the check fails on my function?
When the check fails, the compiler reports all allocations and other expressions such as indirect calls that lead to the failure of the check. The source locations of the offending allocations is shown in the error messages, including inlining information.
For example, compiling this (contrived) example
let[@inline never] bar x = [x+1;x+x]
let[@zero_alloc] foo b x f =
let y =
match bar x with
| [c] -> f c
| [a;_] ->
[%probe "test" (print_int x)];
(a, a)
| _ -> (x,x)
in
Dep.inline_always y
fails with an error message that looks like this:
File "test.ml", line 3, characters 5-15:
3 | let[@zero_alloc] foo b x f =
^^^^^^^^^^
Error: Annotation check for zero_alloc failed on function Test.foo (camlTest__foo_3_6_code)
File "test.ml", line 5, characters 10-15:
5 | match bar x with
^^^^^
Error: called function may allocate (direct call camlTest__bar_2_5_code)
File "test.ml", line 6, characters 13-16:
6 | | [c] -> f c
^^^
Error: called function may allocate (indirect call)
File "test.ml", line 8, characters 6-35:
8 | [%probe "test" (print_int x)];
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error: expression may allocate
(probe test handler camlTest__probe_handler_test_4_7_code)
File "test.ml", line 9, characters 6-12:
9 | (a, a)
^^^^^^
Error: allocation of 24 bytes
File "test.ml", line 10, characters 11-16:
10 | | _ -> (x,x)
^^^^^
File "test.ml", line 12, characters 2-21:
12 | Dep.inline_always y
^^^^^^^^^^^^^^^^^^^
Error: called function may allocate (external call to caml_make_vect) (test.ml:12,2--21;dep.ml:2,2--38)
If we remove [@inline never] annotation from bar, the compiler can optimize some but
not all allocations in foo. The check will fail and show the remaining allocations::
File "test.ml", line 3, characters 5-15:
3 | let[@zero_alloc] foo b x f =
^^^^^^^^^^
Error: Annotation check for zero_alloc failed on function Test.foo (camlTest__foo_3_6_code)
File "test.ml", line 8, characters 6-35:
8 | [%probe "test" (print_int x)];
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error: expression may allocate
(probe test handler camlTest__probe_handler_test_4_7_code)
File "test.ml", line 9, characters 6-12:
9 | (a, a)
^^^^^^
Error: allocation of 24 bytes
File "test.ml", line 12, characters 2-21:
12 | Dep.inline_always y
^^^^^^^^^^^^^^^^^^^
Error: called function may allocate (external call to caml_make_vect) (test.ml:12,2--21;dep.ml:2,2--38)
Compilation flag -zero-alloc-checker-details-cutoff n controls how many allocations are listed
in these error messages:
- positive
n: print at mostnwitnesses nis0: do not print any witnesses- negative
n: print all witnesses (default)
The default setting is convenient when adding annotations or editing annotated code. The other settings have lower memory overhead and are convenient for testing the analysis.
The flag can be passed in the usual way via dune settings. For example,
to turn off detailed messages in a particular library, add (ocamlopt_flags (:standard
-zero-alloc-checker-details-cutoff 0)) to the library stanza in dune file.
When the check fails, it is a compile error (not a warning) and compilation stops. The error message will show all functions that failed the check in the same compilation unit (not just the first function that failed the check).
Other ways to control the analysis
The following settings are mostly for testing and debugging of the analysis.
The check can be controlled by the compiler flag -zero-alloc-check with the following
arguments:
default: check attributes withoutoptpayload (default)check all: covers bothoptand default payloads and is intended to be used for optimized buildsnone: disable the checkopt: only check attributes withoptpayload and is intended for debugging
The flag can be passed to dune build as usual for the entire project or on a
per-library basis. For example, to disable regular and opt checks:
(ocamlopt_flags (:standard -zero-alloc-check none)
The check can also be controled on a per-file basis using the top-level attribute
[@@@zero_alloc check], [@@@zero_alloc check_none], [@@@zero_alloc check_all],
[@@@zero_alloc check_opt].
If information about dependencies is not available (i.e., a
build without cross module / library inlining or with -opaque), the dependencies will be
conservatively assumed to allocate and the check may fail.
Details of the analysis can be printed using -dzero-alloc compiler flag for debugging.
The analysis can be disabled using -disable-zero-alloc-checker flag.
This disables computation of function summaries (unlike
-zero-alloc-check none that disables checking of annotations). With
-disable-zero-alloc-checker flag, the checker will conservatively
assume that all functions may allocate, so if there are any
annotations, compilation will fail unless -zero-alloc-check none is
also passed.
Compilation flag -disable-precise-zero-alloc-checker
disables computation of summaries only for recursive functions
and functions whose definition appears in the generated code before
the definitions of all their dependencies. This is an escape hatch
for situations of unexpectedly high overhead of the analysis on
specific files, for example due to precise analysis of recursive
functions. It will cause all uses of these functions to be
conservatively treated and likely fail the check.
-zero-alloc-checker-join compilation flag provides a more fine-grained
control of precision.
Annotation [@@@zero_alloc all] and [@zero_alloc ignore]
Some files consist mostly of non-allocating code. To reduce the annotation burden, we
provide a top-level annotation [@@@zero_alloc all]. It requires that all function in the
compilation unit are not allocating. It is equivalent to adding [@zero_alloc]
annotation to all functions in the file.
To opt out individual function, use
[@zero_alloc ignore]. For example:
[@@@zero_alloc all]
let[@zero_alloc ignore] make x y = (x,y)
let fst (x,y) = x
let snd (x,y) = y
Note that [@@@zero_alloc all] does not cover the top-level effects (and any other module
initialization code that ends up in the compiler-generated entry function).
For example, the following file passes the check:
[@@@zero_alloc all]
let () = Printf.printf "Hello world!"
but this one fails:
[@@@zero_alloc all]
let[@inline never] test () = Printf.printf "Hello world!"
let () = test ()
More precisely, [@@@zero_alloc all] applies to all function symbols
emitted for the compilation unit, except the entry function (and
some compiler-generated wrapper functions). As such, it will apply to
inner functions but may not apply to some top level functions
optimized away by the compiler as discussed above.
[@@@zero_alloc all] and [@zero_alloc ignore] are somewhat more experimental than
other annotations. Their scope is global within a compilation unit is global and
they should appear at the top of the file before any other (otherwise
their effect is not well-defined).
Annotation [@@@zero_alloc all] vs [@@@zero_alloc check]
[@@@zero_alloc check] turns on the checking of zero_alloc annotations on functions,
whereas [@@@zero_alloc all] requires that all functions in the file are zero_alloc.
Syntactic limitation
The [@zero_alloc] annotation can only appear on functions, not values.
For example, if we add [@zero_alloc] attribute to recvfrom function defined in
lib/zero_udp/src/zero_udp.ml as follows:
let[@zero_alloc] recvfrom = Zero_unix.recvfrom
the compiler prints a warning:
File "lib/zero_udp/src/zero_udp.ml", line 42, characters 5-15:
Error (warning 53 [misplaced-attribute]): the "zero_alloc" attribute cannot appear in this context
A workaround is to add arguments explicitly:
let[@zero_alloc] recvfrom a b = Zero_unix.recvfrom a b
This limitation is not specific to [@alloc]. Other function-level attributes like
[@inline] and [@specialise] have the same limitation.
Limitations
Summary of known limitations
- We currently do not allow
[@zero_alloc]annotations on external function declarations. Applications of external functions can be annotated with[@zero_alloc assume]. - Annotations cannot be placed on arbitrary sub-expressions.
- Annotations on functions that would otherwise be optimized away may inhibit optimizations. The presence of annotations currently prevents the functions from being dead code eliminated, which in turn may prevent other code from being optimized and also increases the size of generated code.
- Functions annotated with
[@zero_alloc assume]are not analyzed at all. As a result, the summary of such a function can be more conservative than if the function was actually analyzed. For example, if the function does not allocate at all on any path, but assumed to bezero_allocusing the relaxed semantics. - Inlining functions annotated with
assumemay cause thezero_alloccheck of the caller to fail when the inlined body containstry..withand the exception handler may return not via araise. A workaround is to annotate the exception handler with[@zero_alloc assume error]. - Conservative handling of higher order function arguments, for example in
List.iter. Another common example where the check fails isMap.find_exnandHashtbl.memfrombaseif the underlying comparator and hash functions are not annotated with[@zero_alloc]in the signature. - The check may fail on non-allocating functions that manipulate arrays due to flat float optimization, but unfortunately, the error message is not helpful.
- Missing scoping of top-level
[@@@zero_alloc ...]annotations. - A function that does not allocate may fail the check if the analysis does not have access to summaries of functions in other libraries and thus any calls to such functions are conservatively considered as allocating.
- Syntactic limitation of
[@zero_alloc ..]attribute: can only appear on functions, not values. - Safepoints are considered non-allocating.