Compile-time checking of non-allocating functions
This page describes support for [@zero_alloc]
attributes on functions. These
annotations are checked statically at compile time. If the check passes, it guarantees
that at runtime there will be no allocations on any execution of the function, including
in the callees.
For example, the check succeeds on add
and fails on add64
:
let[@zero_alloc] add b x y = if b then x + y else x
let[@zero_alloc] add64 b (x:int64) (y:int64) = if b then Int64.add x y else x
The check forbids allocations on the OCaml heap and other events that may trigger garbage
collection. Local allocations are allowed. For example, the check succeeds on local_pair
but fails on pair
:
let[@zero_alloc] local_pair x y = stack_ (x,y)
let[@zero_alloc] pair x y = (x,y)
The check is conservative in the sense that it may fail even if the function never allocates. In particular, all indirect calls are considered to be allocating. For example, the check fails on the following:
let[@zero_alloc] rec iter f l =
match l with
| [] -> ()
| hd::tl -> f hd; iter f tl
A caller of iter
can be shown to be non-allocating, for example
the check succeeds when sum
is compiled with -O3
and iter
is inlined:
let[@zero_alloc opt] sum l =
let local_ c = ref 0 in
iter (fun n -> c := add (no_overflow !c n) !c n) l;
!c
Note the use of opt
payload to mark this function as zero_alloc
in optimized builds
only.
There is no need to annotate all callees of a zero_alloc
function for the check to pass,
and in fact the callees might not pass the check, as the above example of iter
shows.
The compiler analyzes all callees to determine their allocation behavior, but only checks
that functions annotated with @zero_alloc
do not allocate. Summary of allocation
behavior of all functions is stored in .cmx
file and used to analyze callers from other
compilation units.
How to enable the check?
The check is enabled by default for [@zero_alloc]
attributes without opt
payload.
These annotation should only be used on functions that pass the check in all builds,
even unoptimized builds.
For optimized builds, -zero-alloc-check all
compiler flag enables
checking of [@zero_alloc opt]
annotations in addition to the default ones.
Relaxed vs strict annotation
Our goal is to make the analysis precise enough to be useful in practice. Consider for
example the following function from core_kernel/iobuf/src/iobuf_expert.ml
:
let[@zero_alloc] reinitialize_of_bigstring t ~pos ~len buf =
let str_len = Bigstring.length buf in
if pos < 0 || pos > str_len
then
raise_s
[%message
"Expert.reinitialize_of_bigstring got invalid pos" (pos : int) (str_len : int)];
let max_len = str_len - pos in
if len < 0 || len > max_len
then
raise_s
[%message
"Expert.reinitialize_of_bigstring got invalid len" (len : int) (max_len : int)];
let lo = pos in
let hi = pos + len in
unsafe_reinitialize t ~lo_min:lo ~lo ~hi ~hi_max:hi buf
;;
This function allocates on the error paths. However, such allocations rarely matter for
hot code performance. Therefore, by default, the check uses a “relaxed” meaning of
zero_alloc
that ignores allocations on paths that lead to an exceptional return from the
function. The check succeeds on this example.
If we change the annotation of reinitialize_of_bigstring
from [@zero_alloc]
to
[@zero_alloc strict]
, the check will fail with the following error:
File "core_kernel/iobuf/src/iobuf_expert.ml", line 1948, characters 7-17:
Error: Annotation check for zero_alloc strict failed on function Iobuf_expert.reinitialize_of_bigstring (camlIobuf_expert__reinitialize_of_bigstring_9401)
The “strict” meaning ensures no allocation on any paths.
Treatment of raise_notrace
vs raise
with a backtrace
The relaxed meaning of @zero_alloc
only applies to exceptions raised with a
backtrace. Our reasoning is that raise_notrace
can be used for normal control flow,
whereas raise
with a backtrace is expensive enough to only be performed on error paths
that can be ignored in zero_alloc
code. If the code is compiled without -g
flag or
with backtraces disabled, all raise
are treated as raise_notrace
. In that case, the
check will fail on all “relaxed” annotations that are not also “strict”.
Annotation assume
Sometimes, the “relaxed” meaning above is not sufficiently permissive.
For example, consider a function that allocates a designated value for
error, rather than raising an exception. The check of “relaxed”
annotation will fail. If we don’t care about allocation in the error
case, we can annotate the function that produces the error value with
[@zero_alloc assume]
. For example, the check succeeds for test
:
let[@cold][@zero_alloc assume] log_error msg n = print_string msg; Error n
let check ~limit l =
match l with
| None -> log_error "empty" 0
| Some n -> if n > limit then log_error "over" n else Ok ()
let[@zero_alloc] test ~limit ~default l =
match check ~limit l with
| Ok _ -> l
| Error n -> default
What if assume
is not enough?
A common pattern is to perform error handling in an exception handler and return an error code instead of reraising. For example:
let[@zero_alloc] g = ...
let[@cold][@zero_alloc assume] handle_exn exn =
print_s [%message "something went wrong" ~_:(exn:exn)];
Answer.Option.none
let[@zero_alloc] f () =
match g () with
| exception exn -> handle_exn exn
| answer -> Answer.Option.some (answer * 2)
The check of f
fails in this example, even when the check of g
succeeds. The check of
g
ignores allocations in g
on a path that returns from g
by raising an exception,
but f
catches the exception and returns normally. As a result, these allocations in g
appear on a path to a normal return from f
and the check of f
fails. It is not enough
to annotate handle_exn
with [@zero_alloc assume]
because it still permits
handle_exn
to return normally to f
and continue executing
within the “zero alloc” scope of f
.
To address it, handle_exn
can be annotated with [@zero_alloc assume error]
. The
intended meaning of assume error
is to exit the scope of “zero_alloc” code due to an
error case in which we do not care about allocations. With assume error
, the check
ignores all allocations on paths that go through handle_exn
and treats
handle_exn
as if it gets “stuck” (does not return normally and does not raise).
Note that allocations after a call to handle_exn
are also ignored, in addition to
allocation leading up to the call and within handle_exn
itself.
Where can I use “assume”?
It is possible to place [@zero_alloc ..]
annotation on function definitions and
let-bindings of functions (as in the example above).
It is also possible to annotate function applications
with [@zero_alloc assume ..]
. In the above example, instead of annotating
log_error
definition, we can annotate specific call sites:
let check ~limit l =
match l with
| None -> log_error "empty" 0
| Some n -> if n > limit then (log_error[@zero_alloc assume]) "over" n else Ok ()
It is also possible to annotate applications
of external
functions with [@zero_alloc assume ..]
.
Functions that are not cold
and may be inlined can also be annotated with
[@zero_alloc assume ..]
(this was not the case in early versions of the checker).
It is not yet possible to annotate arbitrary sub-expressions or code blocks.
assume_unless_opt
vs opt
vs assume
A function annotated with [@zero_alloc assume]
is never checked.
A function annotated with [@zero_alloc opt]
is not checked in a non-optimized build.
This is inconvenient because the signature of such a function cannot be annotated with
[@zero_alloc]
and its callers will often fail the check in a non-optimized build too,
and would also need to be annotated with [zero_alloc opt]
and therefore defer their
checking to optimized builds.
The payload assume_unless_opt
can be used to ensure that a function is zero_alloc
in
an optimized build and allow its callers to treat it as non-allocating even in
non-optimized builds. This allows for those callers to be checked as much as possible in
non-optimized builds, to detect as early as possible if an allocation is introduced
elsewhere.
By default, [@zero_alloc assume_unless_opt]
is treated as
[@zero_alloc assume]
on function applications and implementations.
If -zero-alloc-check all
compiler flag is used, then [@zero_alloc assume_unless_opt]
is treated as [@zero_alloc]
on function implementation,
and ignored on application.
Use in signatures
Signature items may be annotated zero_alloc
, as in:
val[@zero_alloc] f : int -> int
Such annotations have two effects:
- In modules implementing such a signature, the corresponding function is checked for
zero_alloc
. The implementation of the function need not be annotated explicitly withzero_alloc
(this was not the case in early versions of the checker). - The
zero_alloc
analysis off
’s callers can use the factf
is zero alloc. This does not rely on inlining, sof
is reliably known to be zero alloc even when compiling an application of it in a different library.
Legal payloads and module typing
The payload of a such an annotation may contain the strict
and opt
directives (as well as a new arity
directive, more on which below). If opt
is present, the information will only be available to clients in optimized
builds. Implementing modules must annotate the corresponding function with a
zero_alloc
attribute that is at least as strong, so, for example, this
typechecks:
module M : sig
val f[@zero_alloc] : int -> int
end = struct
let f[@zero_alloc strict] x = x
end
However, if the signature had strict
and the struct did not, this would be
rejected by the typechecker.
It is legal to satisfy a signature’s zero_alloc
requirement with a zero_alloc
assume
annotation in the implementation.
Arity
The compiler will infer the arity of a declaration with a [@zero_alloc]
attribute by counting the number of arrows in the type. Applications of this
function will only be considered zero_alloc
when fully applied. For
example, if a library declares a function f
:
val[@zero_alloc] f : 'a -> 'a -> 'a
Then, in code that uses this library, the zero_alloc
analysis will know that
applications of f
to two arguments do not allocate. Applications of f
to any
other number of arguments will still be considered allocating (unless
cross-module inlining is available and the compiler can see from f
’s
definition that it also does not allocate in those cases).
Sometimes it is necessary to override the arity inferred by the compiler here.
For example, the compiler will reject this signature because the [@zero_alloc]
attribute is present on a declaration without any arrows in its type:
type t = int -> int
val[@zero_alloc] f : t
This behavior can be overridden by specifying the arity manually:
type t = int -> int
val[@zero_alloc arity 1] f : t
There are also cases where arrows are present but is still desirable to override the arity, as here:
module M : sig
val[@zero_alloc arity 1] f : int -> int -> int
end = struct
let[@zero_alloc] f x =
if x = 42 then fun y -> (y,y) else fun y -> (y,y+1)
end
Attribute [@@noalloc]
vs [@zero_alloc]
The check of [@zero_alloc]
annotations on OCaml functions relies on [@@noalloc]
annotations of any external
declarations they refer to in order to determine the
allocation behavior of the corresponding external functions. Without [@@noalloc]
, the
check conservatively assumes that external code allocates. The check treats [@@noalloc]
attribute using the “strict” meaning, as if the external declaration is annotated with
[@zero_alloc assume strict]
.
There is currently no way to annotate external declarations with
[@zero_alloc assume]
for the “relaxed” meaning of the check. A
workaround is to annotate the call sites with [@zero_alloc assume]
.
An alternative workaround is to define a wrapper OCaml function for
the external
and annotate the wrapper with [@zero_alloc assume][@inline always]
to make sure it is always inlined and has no
impact on the generate code.
Note that the presence of [@@noalloc]
affects the generated code whereas [@zero_alloc]
is only used for checking and error reporting at compile time.
Interaction with safepoints in OCaml 5
Currently, the check ignores poll instructions (otherwise all loops would be considered
allocating when poll insertion is enabled). Therefore, [@zero_alloc]
is not sufficient
to ensure the absence of context switches.
Interaction with inlining and other optimizations
The check is very sensitive to the optimization level and other compilation flags. This is by design and not expected to change. To give the most accurate results, the check for allocations is performed late in the compilation process, after all the optimizations such as inlining, specialization, static allocation and unboxing. The result of the check can sometimes be surprising and counter-intuitive at the source code level.
Dynamic vs static checking
Expect_test_helpers_cre.require_no_allocation
counts the exact
number of words allocated in a particular program execution at
runtime, whereas [@zero_alloc]
check considers all possible
executions without ever running the program and may over-approximate
the number of allocations. The two approaches are complementary.
We recommend using both.
What is the overhead?
When the check is enabled, the overhead on build time is low. It is just one additional lightweight pass on each compilation unit.
The size of .cmx
files can increase slightly. Preliminary measurements on the compiler
distribution itself showed less than 0.5%
increase with flambda2. Current
format is naive and can be optimized if this becomes a bottleneck.
There is no runtime overhead!
There is no change to generated code as a result of enabling the check or adding
[@zero_alloc]
annotations on functions, except the described below.
Dead code elimination
The presence of zero_alloc
annotation can prevent dead code elimination.
Consider for example the function foo
:
let[@zero_alloc strict] foo t =
if debug then log "foo" t [%sexp_of: _ t];
bar t
We expect this function to pass the check when debug
is statically set to false
, and
fail otherwise.
If foo
is not exposed in the interface and gets inlined into all call sites, the optimizer
can remove the definition of foo
. The presence of zero_alloc
annotation delayes the elimination of foo
so that the compiler can perform the check after all other optimizations.
The presence of the function may prevent some other
optimizations from triggering.
How do I figure out why the check fails on my function?
When the check fails, the compiler reports all allocations and other expressions such as indirect calls that lead to the failure of the check. The source locations of the offending allocations is shown in the error messages, including inlining information.
For example, compiling this (contrived) example
let[@inline never] bar x = [x+1;x+x]
let[@zero_alloc] foo b x f =
let y =
match bar x with
| [c] -> f c
| [a;_] ->
[%probe "test" (print_int x)];
(a, a)
| _ -> (x,x)
in
Dep.inline_always y
fails with an error message that looks like this:
File "test.ml", line 3, characters 5-15:
3 | let[@zero_alloc] foo b x f =
^^^^^^^^^^
Error: Annotation check for zero_alloc failed on function Test.foo (camlTest__foo_3_6_code)
File "test.ml", line 5, characters 10-15:
5 | match bar x with
^^^^^
Error: called function may allocate (direct call camlTest__bar_2_5_code)
File "test.ml", line 6, characters 13-16:
6 | | [c] -> f c
^^^
Error: called function may allocate (indirect call)
File "test.ml", line 8, characters 6-35:
8 | [%probe "test" (print_int x)];
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error: expression may allocate
(probe test handler camlTest__probe_handler_test_4_7_code)
File "test.ml", line 9, characters 6-12:
9 | (a, a)
^^^^^^
Error: allocation of 24 bytes
File "test.ml", line 10, characters 11-16:
10 | | _ -> (x,x)
^^^^^
File "test.ml", line 12, characters 2-21:
12 | Dep.inline_always y
^^^^^^^^^^^^^^^^^^^
Error: called function may allocate (external call to caml_make_vect) (test.ml:12,2--21;dep.ml:2,2--38)
If we remove [@inline never]
annotation from bar
, the compiler can optimize some but
not all allocations in foo
. The check will fail and show the remaining allocations::
File "test.ml", line 3, characters 5-15:
3 | let[@zero_alloc] foo b x f =
^^^^^^^^^^
Error: Annotation check for zero_alloc failed on function Test.foo (camlTest__foo_3_6_code)
File "test.ml", line 8, characters 6-35:
8 | [%probe "test" (print_int x)];
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error: expression may allocate
(probe test handler camlTest__probe_handler_test_4_7_code)
File "test.ml", line 9, characters 6-12:
9 | (a, a)
^^^^^^
Error: allocation of 24 bytes
File "test.ml", line 12, characters 2-21:
12 | Dep.inline_always y
^^^^^^^^^^^^^^^^^^^
Error: called function may allocate (external call to caml_make_vect) (test.ml:12,2--21;dep.ml:2,2--38)
Compilation flag -zero-alloc-checker-details-cutoff n
controls how many allocations are listed
in these error messages:
- positive
n
: print at mostn
witnesses n
is0
: do not print any witnesses- negative
n
: print all witnesses (default)
The default setting is convenient when adding annotations or editing annotated code. The other settings have lower memory overhead and are convenient for testing the analysis.
The flag can be passed in the usual way via dune
settings. For example,
to turn off detailed messages in a particular library, add (ocamlopt_flags (:standard
-zero-alloc-checker-details-cutoff 0))
to the library
stanza in dune
file.
When the check fails, it is a compile error (not a warning) and compilation stops. The error message will show all functions that failed the check in the same compilation unit (not just the first function that failed the check).
Other ways to control the analysis
The following settings are mostly for testing and debugging of the analysis.
The check can be controlled by the compiler flag -zero-alloc-check
with the following
arguments:
default
: check attributes withoutopt
payload (default)check all
: covers bothopt
and default payloads and is intended to be used for optimized buildsnone
: disable the checkopt
: only check attributes withopt
payload and is intended for debugging
The flag can be passed to dune
build as usual for the entire project or on a
per-library basis. For example, to disable regular and opt
checks:
(ocamlopt_flags (:standard -zero-alloc-check none)
The check can also be controled on a per-file basis using the top-level attribute
[@@@zero_alloc check]
, [@@@zero_alloc check_none]
, [@@@zero_alloc check_all]
,
[@@@zero_alloc check_opt]
.
If information about dependencies is not available (i.e., a
build without cross module / library inlining or with -opaque
), the dependencies will be
conservatively assumed to allocate and the check may fail.
Details of the analysis can be printed using -dzero-alloc
compiler flag for debugging.
The analysis can be disabled using -disable-zero-alloc-checker
flag.
This disables computation of function summaries (unlike
-zero-alloc-check none
that disables checking of annotations). With
-disable-zero-alloc-checker
flag, the checker will conservatively
assume that all functions may allocate, so if there are any
annotations, compilation will fail unless -zero-alloc-check none
is
also passed.
Compilation flag -disable-precise-zero-alloc-checker
disables computation of summaries only for recursive functions
and functions whose definition appears in the generated code before
the definitions of all their dependencies. This is an escape hatch
for situations of unexpectedly high overhead of the analysis on
specific files, for example due to precise analysis of recursive
functions. It will cause all uses of these functions to be
conservatively treated and likely fail the check.
-zero-alloc-checker-join
compilation flag provides a more fine-grained
control of precision.
Annotation [@@@zero_alloc all]
and [@zero_alloc ignore]
Some files consist mostly of non-allocating code. To reduce the annotation burden, we
provide a top-level annotation [@@@zero_alloc all]
. It requires that all function in the
compilation unit are not allocating. It is equivalent to adding [@zero_alloc]
annotation to all functions in the file.
To opt out individual function, use
[@zero_alloc ignore]
. For example:
[@@@zero_alloc all]
let[@zero_alloc ignore] make x y = (x,y)
let fst (x,y) = x
let snd (x,y) = y
Note that [@@@zero_alloc all]
does not cover the top-level effects (and any other module
initialization code that ends up in the compiler-generated entry
function).
For example, the following file passes the check:
[@@@zero_alloc all]
let () = Printf.printf "Hello world!"
but this one fails:
[@@@zero_alloc all]
let[@inline never] test () = Printf.printf "Hello world!"
let () = test ()
More precisely, [@@@zero_alloc all]
applies to all function symbols
emitted for the compilation unit, except the entry
function (and
some compiler-generated wrapper functions). As such, it will apply to
inner functions but may not apply to some top level functions
optimized away by the compiler as discussed above.
[@@@zero_alloc all]
and [@zero_alloc ignore]
are somewhat more experimental than
other annotations. Their scope is global within a compilation unit is global and
they should appear at the top of the file before any other (otherwise
their effect is not well-defined).
Annotation [@@@zero_alloc all]
vs [@@@zero_alloc check]
[@@@zero_alloc check]
turns on the checking of zero_alloc
annotations on functions,
whereas [@@@zero_alloc all]
requires that all functions in the file are zero_alloc
.
Syntactic limitation
The [@zero_alloc]
annotation can only appear on functions, not values.
For example, if we add [@zero_alloc]
attribute to recvfrom
function defined in
lib/zero_udp/src/zero_udp.ml
as follows:
let[@zero_alloc] recvfrom = Zero_unix.recvfrom
the compiler prints a warning:
File "lib/zero_udp/src/zero_udp.ml", line 42, characters 5-15:
Error (warning 53 [misplaced-attribute]): the "zero_alloc" attribute cannot appear in this context
A workaround is to add arguments explicitly:
let[@zero_alloc] recvfrom a b = Zero_unix.recvfrom a b
This limitation is not specific to [@alloc]
. Other function-level attributes like
[@inline]
and [@specialise]
have the same limitation.
Limitations
Summary of known limitations
- We currently do not allow
[@zero_alloc]
annotations on external function declarations. Applications of external functions can be annotated with[@zero_alloc assume]
. - Annotations cannot be placed on arbitrary sub-expressions.
- Annotations on functions that would otherwise be optimized away may inhibit optimizations. The presence of annotations currently prevents the functions from being dead code eliminated, which in turn may prevent other code from being optimized and also increases the size of generated code.
- Functions annotated with
[@zero_alloc assume]
are not analyzed at all. As a result, the summary of such a function can be more conservative than if the function was actually analyzed. For example, if the function does not allocate at all on any path, but assumed to bezero_alloc
using the relaxed semantics. - Inlining functions annotated with
assume
may cause thezero_alloc
check of the caller to fail when the inlined body containstry..with
and the exception handler may return not via araise
. A workaround is to annotate the exception handler with[@zero_alloc assume error]
. - Conservative handling of higher order function arguments, for example in
List.iter
. Another common example where the check fails isMap.find_exn
andHashtbl.mem
frombase
if the underlying comparator and hash functions are not annotated with[@zero_alloc]
in the signature. - The check may fail on non-allocating functions that manipulate arrays due to flat float optimization, but unfortunately, the error message is not helpful.
- Missing scoping of top-level
[@@@zero_alloc ...]
annotations. - A function that does not allocate may fail the check if the analysis does not have access to summaries of functions in other libraries and thus any calls to such functions are conservatively considered as allocating.
- Syntactic limitation of
[@zero_alloc ..]
attribute: can only appear on functions, not values. - Safepoints are considered non-allocating.