Lwt introduction/tutorial Part 2 of 2
In order to understand the advanced features of Lwt, it is necessary to understand the internals of Lwt: how promises are created, stored, resolved, etc. This page delves into those details. It assumes familiarity with the basic concepts of Lwt covered in Part 1.
To be more precise, this part of the tutorial gives an acceptably precise working model of Lwt, but not an exact one. In particular, we do not mention any parts that are not necessary for the understanding of the user-facing features.
A simple model of a promise
To start with, consider that a promise is simply a cell for a value.
The cell can take different values that map almost exactly on the
state type. The actual type of promises in our
only-slightly-simplified model is as follows:
type 'a internal_state =
| Fulfilled of 'a
| Rejected of exn
| Pending of 'a pending
type 'a t = { mutable state : internal_state }
Basically, internally, Lwt maintains its promise similar to a
state cell, but with added information for the pending
promises. We give more details on the added information on a by-need
basis below. For now suffice to say that pending contains
the information that Lwt needs in order to handle all that can happen to
a still-to-be-resolved promise.
We can already think of some of the simpler primitives as code over this simplified model:
let return v = { state = Fulfilled v }
let fail exc = { state = Rejected exc }
let state p = match p.state with
| Fulfilled v -> Return v
| Rejected exc -> Fail exc
| Pending _ -> Sleep
But we cannot write much more at this point. So we refine the model
and expand on the definition of pending.
Callbacks
Lwt.on_any: 'a t -> ('a -> unit) -> (exn -> unit) -> unit
If p is pending, it attaches the callbacks to
p; consequently, when p resolves, either of
hf or hr is called depending if the promise is
fulfilled or rejected. If p is already resolved,
on_any p hf hr calls either hf or
hr immediately.
Internal representation
This is achieved by attaching callbacks to pending promises. When Lwt
modifies the internal state of a promise from Pending to
either Fulfilled or Rejected, Lwt also calls
all of the attached callbacks. As a first approximation, we simply
declare:
type 'a pending = {
mutable when_fulfils: ('a -> unit) list;
mutable when_rejects: (exc -> unit) list;
..
}
And Lwt calls the callbacks when it changes the state of the promise.
For example, here is the implementation of task and
wakeup in our simplified model.
type 'a u = 'a t
let task () =
let p = { state = Pending .. } in
(p, p)
let wakeup p v =
match p.state with
| Pending { when_fulfils; } ->
p.state <- Fulfilled v;
List.iter (fun f -> f v) when_fulfils
| _ -> raise (Invalid_argument "..")
let wakeup_exn p exc =
match p.state with
| Pending { when_rejects; } ->
p.state <- Rejected exc;
List.iter (fun f -> f exc) when_rejects
| _ -> raise (Invalid_argument "..")
The internals are a little bit more complicated than that, in particular with respect to error management. But the general idea is as above.
Other explicit callbacks
There are variants of on_any that attach different
callbacks. Their semantics can be inferred from their name and type
signature, but we list them here anyway.
Lwt.on_success: 'a t -> ('a -> unit) -> unit
on_success p h calls h when p is
fulfilled. If p is already fulfilled, it calls
h immediately. Nothing happens if p is already
rejected, is eventually rejected, or is never resolved.
Lwt.on_failure: 'a t -> (exc -> unit) -> unit
on_failure p h calls h when p is
rejected. If p is already rejected, it calls h
immediately. Nothing happens if p is already fulfilled, is
eventually fulfilled, or is never resolved.
Lwt.on_termination: 'a t -> (unit -> unit) -> unit
on_termination p h calls h when p
is resolved. If p is already resolved, it calls
h immediately. Nothing happens if p is never
resolved.
Implicit callbacks
Lwt also uses callbacks to implement promise combinators such as
bind. To give an idea of how, here is an implementation in
the simplified model. The details are different, notably with respect to
error management and some optimisations, but the gist is similar: return
a pending promise that is resolved by callbacks attached to other
promises.
let bind p f =
(* create the final promise p'' *)
let p'', r = task () in
(* attach callbacks to the original promise p *)
on_any p
(fun v ->
(* call `f` to get the intermediate promise p' *)
let p' = f v in
(* attach callbacks to p' in order to resolve p'' *)
on_any p'
(fun v -> wakeup r v)
(fun exc -> wakeup_exn r exc))
(fun exc -> wakeup_exn r exc);
(* return the final promise *)
p''
There are other uses of callbacks (in join,
choose, etc.), but studying them does not provide more
insight into the inner workings of Lwt. One of the things to take from
this model, one of the things that this model is accurate enough to
describe, is that Lwt does not maintain a full list of all the existing
promises in a data-structure.
For example, in the bind above, the final promise
p'' is created, then a reference to it is kept by the
callbacks attached to p, and then it is returned. Lwt does
not keep p'' in a list/table/map of promises. There are
some exceptions to this which we mention below.
Cancellation
Lwt.Canceled: exn
The Canceled exception is used by Lwt to handle promise
cancellation. The Canceled exception is treated differently
than others in several of the Lwt primitives. For example,
wakeup will fail if the associated promise is already
resolved, except if it is rejected with Canceled. As a
result, the following code does not raise an exception – but it would
with a different exception:
let _, r = task () ;;
wakeup_exn r Canceled ;;
wakeup r 0 ;;
Lwt.cancel: 'a t -> unit
cancel p causes p to be rejected with
Canceled. Except that sometimes it does not do that, and
sometimes it does much more than that.
Cancelable, not cancelable
Some promises are cancelable (calling cancel on them
causes them to be rejected with Canceled) others are not
(calling cancel on them is a no-op). For example, consider
the difference between task and wait.
Lwt.wait: unit -> 'a t * 'a u
wait () is similar to task () except that the
returned promise is not cancelable. Calling cancel on the
promise created by wait has no effect.
There is no way to test whether a promise is cancelable – besides actually canceling it and observing a change of state.
The semantics of simple cancellation
The specific semantics of cancellation (detailed below) are inherited from a way of thinking about threads rather than promises. Consider the following code with uninteresting details elided:
let p =
let* .. = .. in
let* .. = .. in
let .. = .. in
let .. = .. in
let* .. = .. in
let* .. = .. in
match .. with
| .. -> return ..
| .. ->
let* .. = .. in
let .. = .. in
let* .. = .. in
return ..
The control-flow within p is relatively simple: it
mostly goes from line to line traversing bind operators and there is a
single branching point. The control-flow is simple enough that it is
easy to think about p as a “thread”: a series of
instructions that execute in relative independence to the rest of the
program. Or, slightly more but not entirely accurate, you can think of
p as a promise and the expression that p is
bound to as a thread which eventually resolves p. Either
way, the cancel function works well on the “thread”
p: it interrupts the thread at whatever point of its
execution it currently is.
The correct, thread-less, description is that cancel p
finds whichever of the intermediate promise of p is
currently pending and rejects it with Canceled.
Set up for cancellation propagation
In order to find the pending intermediate promise of p,
Lwt stores additional information in pending promises. Specifically, it
stores a list of promises to propagate the cancellation to.
type cancelation = Self | Others of 'a t list
type pending = {
..
cancelation: cancelation;
}
The promise combinators simply set the cancelation field
of pending when they create the promise. For example,
task and wait work as follows.
let task () =
let p =
{ state =
Pending {
..;
cancelation = Self; }
}
in
(p, p)
let wait () =
let p =
{ state =
Pending {
..;
cancelation = Others []; }
}
in
(p, p)
And bind actually performs more legwork than previously
shown.
let bind p f =
match p.state with
| Fulfilled v -> f v
| Rejected exc -> Rejected exc
| Pending pending ->
let p'' =
{ state =
Pending {
..;
cancelation = Others [ p ]; }
}
in
.. (* set up callbacks to resolve p'' *)
p''
And more complex combinators such as choose and
join include multiple promises in their Others
cancellation field.
Propagating cancellation
Using this additional information, Lwt can find the pending promises that a given promise depends on.
let rec cancel p =
match p.state with
| Fulfilled _ -> ()
| Rejected _ -> ()
| Pending { cancelation = Self } ->
wakeup_exn p Canceled
| Pending { cancelation = Others ps } ->
List.iter cancel ps
This is a mock implementation on the simplified model. In particular, the cancellation works in two separate phases: collect all promises that the cancellation affects, and then trigger all the cancellations. This is to avoid subtle bugs where some cancellation has side-effects that modify the state of some of the not yet traversed promises.
Cancellation-focused combinators
As mentioned, the semantics of cancel are easy to grasp
for simple examples such as promises daisy-chained in a simple flow
control. On the other hand, the semantics of cancel on
complex constructions can be difficult to understand. In particular,
cancelling a promise p may cancel other promises that share
a common ancestor with p.
In order to avoid unwanted side-effects, Lwt provides a few
combinators to tweak the behaviour of the cancellation mechanism.
Specifically, these combinators introduce new promises with carefully
crafted cancelation fields that cause the propagation of
cancellation to behave in the desired way.
Lwt.protected: 'a t -> 'a t
protected p is a promise pp that is similar to
p: it has the same state and it resolves similarly.
However, when the cancellation propagation described above reaches
pp, then pp is rejected with
Canceled but nothing happens to p. In our
simplified model, this is achieved by setting pp’s
cancelation field to Self.
Lwt.no_cancel: 'a t -> 'a t
no_cancel p is a promise pp that is similar to
p: it has the same state and it resolves similarly.
However, when the cancellation propagation described above reaches
pp, then nothing happens. In our simplified model, this is
achieved by setting pp’s cancelation field to
Others [].
Pausing and scheduling
As seen previously, Lwt’s scheduling is eager. Specifically, when
given an already resolved promise and a function that returns an already
resolved promise, bind returns and already resolved
promise. As a result, the user must call pause (or
yield) in order to force the Lwt scheduler to give
opportunities to other promises to progress towards resolution.
To achieve this, Lwt stores the pending promises created with
pause into a global variable. By itself, this promise can
make no progress. However, when the scheduler’s main-loop is running, it
resolves paused promises at each iterations.
let paused_promises = ref []
let pause () =
let p = { state = Pending { .. } } in
paused_promises := p :: !paused_promises;
p
let rec run p =
match p.state with
| Fulfilled v -> v
| Rejected exc -> raise exc
| Pending _ ->
let all_paused = !paused_promises in
paused_promises := [];
List.iter (fun p -> wakeup p ()) all_paused;
run p
Note, however, that different environment have different ways to deal
with paused promises. For example, in Unix the scheduler only kicks in
when a call to Lwt_main.run is on-going. By contrast, in
js_of_ocaml the scheduler is always running. For
compatibility with other environment, you cannot rely on
pause to hold your promises pending until you call
Lwt_main.run.
Also note that the mock scheduler above is only for building a mental model. The specifics differ, but the core idea is there.
OS interactions
The last part of Lwt that this tutorial covers is interaction with
the operating system. Whilst it is possible to call functions from
OCaml’s Unix module, such functions are oblivious to Lwt
and may cause issues. Most notably, these functions may block the
execution of the program.
There are multiple ways to interact with the OS without blocking. For
the most common OS interactions (e.g., creating files, reading from
pipes, accepting incoming connections, etc.), Lwt comes with the
sub-library lwt.unix which includes the module
Lwt_unix which contains Lwt-aware wrappers for the
Unix module.
Lwt_unix.sleep: float -> unit t
The promise sleep t is a pending promise that becomes
resolved after t seconds have elapsed. Note that the
resolution of the promise only happens if the scheduler is running –
i.e., if there is an ongoing call to Lwt_main.run. If the
scheduler is not running when the timer expires, then the promise is
only resolved the next time the scheduler is started.
Another way to call blocking functions is to wrap them in
Lwt_preemptive.detach (provided by lwt.unix)
or, equivalently, in Mwt.run (provided by
mwt). This solution relies on OCaml’s Thread
module to isolate the blocking calls in a different execution context.
It is mostly useful when making calls to a third-party library that does
not provide Lwt-aware system primitives.
Blocking system calls are maintained in their own global variable. The scheduler checks these calls at each iteration, causing the process to actually sleep if none of the calls are done and no paused promises can be found.
Part 3
The third and final part of this 2-part introduction/tutorial collects some miscellaneous remarks of little practical use.