Lwt introduction/tutorial Part 2 of 2
In order to understand the advanced features of Lwt, it is necessary to understand the internals of Lwt: how promises are created, stored, resolved, etc. This page delves into those details. It assumes familiarity with the basic concepts of Lwt covered in Part 1.
To be more precise, this part of the tutorial gives an acceptably precise working model of Lwt, but not an exact one. In particular, we do not mention any parts that are not necessary for the understanding of the user-facing features.
A simple model of a promise
To start with, consider that a promise is simply a cell for a value.
The cell can take different values that map almost exactly on the
state
type. The actual type of promises in our
only-slightly-simplified model is as follows:
type 'a internal_state =
| Fulfilled of 'a
| Rejected of exn
| Pending of 'a pending
type 'a t = { mutable state : internal_state }
Basically, internally, Lwt maintains its promise similar to a
state
cell, but with added information for the pending
promises. We give more details on the added information on a by-need
basis below. For now suffice to say that pending
contains
the information that Lwt needs in order to handle all that can happen to
a still-to-be-resolved promise.
We can already think of some of the simpler primitives as code over this simplified model:
let return v = { state = Fulfilled v }
let fail exc = { state = Rejected exc }
let state p = match p.state with
| Fulfilled v -> Return v
| Rejected exc -> Fail exc
| Pending _ -> Sleep
But we cannot write much more at this point. So we refine the model
and expand on the definition of pending
.
Callbacks
Lwt.on_any: 'a t -> ('a -> unit) -> (exn -> unit) -> unit
If p
is pending, it attaches the callbacks to
p
; consequently, when p
resolves, either of
hf
or hr
is called depending if the promise is
fulfilled or rejected. If p
is already resolved,
on_any p hf hr
calls either hf
or
hr
immediately.
Internal representation
This is achieved by attaching callbacks to pending promises. When Lwt
modifies the internal state of a promise from Pending
to
either Fulfilled
or Rejected
, Lwt also calls
all of the attached callbacks. As a first approximation, we simply
declare:
type 'a pending = {
mutable when_fulfils: ('a -> unit) list;
mutable when_rejects: (exc -> unit) list;
..
}
And Lwt calls the callbacks when it changes the state of the promise.
For example, here is the implementation of task
and
wakeup
in our simplified model.
type 'a u = 'a t
let task () =
let p = { state = Pending .. } in
(p, p)
let wakeup p v =
match p.state with
| Pending { when_fulfils; } ->
p.state <- Fulfilled v;
List.iter (fun f -> f v) when_fulfils
| _ -> raise (Invalid_argument "..")
let wakeup_exn p exc =
match p.state with
| Pending { when_rejects; } ->
p.state <- Rejected exc;
List.iter (fun f -> f exc) when_rejects
| _ -> raise (Invalid_argument "..")
The internals are a little bit more complicated than that, in particular with respect to error management. But the general idea is as above.
Other explicit callbacks
There are variants of on_any
that attach different
callbacks. Their semantics can be inferred from their name and type
signature, but we list them here anyway.
Lwt.on_success: 'a t -> ('a -> unit) -> unit
on_success p h
calls h
when p
is
fulfilled. If p
is already fulfilled, it calls
h
immediately. Nothing happens if p
is already
rejected, is eventually rejected, or is never resolved.
Lwt.on_failure: 'a t -> (exc -> unit) -> unit
on_failure p h
calls h
when p
is
rejected. If p
is already rejected, it calls h
immediately. Nothing happens if p
is already fulfilled, is
eventually fulfilled, or is never resolved.
Lwt.on_termination: 'a t -> (unit -> unit) -> unit
on_termination p h
calls h
when p
is resolved. If p
is already resolved, it calls
h
immediately. Nothing happens if p
is never
resolved.
Implicit callbacks
Lwt also uses callbacks to implement promise combinators such as
bind
. To give an idea of how, here is an implementation in
the simplified model. The details are different, notably with respect to
error management and some optimisations, but the gist is similar: return
a pending promise that is resolved by callbacks attached to other
promises.
let bind p f =
(* create the final promise p'' *)
let p'', r = task () in
(* attach callbacks to the original promise p *)
on_any p
(fun v ->
(* call `f` to get the intermediate promise p' *)
let p' = f v in
(* attach callbacks to p' in order to resolve p'' *)
on_any p'
(fun v -> wakeup r v)
(fun exc -> wakeup_exn r exc))
(fun exc -> wakeup_exn r exc);
(* return the final promise *)
p''
There are other uses of callbacks (in join
,
choose
, etc.), but studying them does not provide more
insight into the inner workings of Lwt. One of the things to take from
this model, one of the things that this model is accurate enough to
describe, is that Lwt does not maintain a full list of all the existing
promises in a data-structure.
For example, in the bind
above, the final promise
p''
is created, then a reference to it is kept by the
callbacks attached to p
, and then it is returned. Lwt does
not keep p''
in a list/table/map of promises. There are
some exceptions to this which we mention below.
Cancellation
Lwt.Canceled: exn
The Canceled
exception is used by Lwt to handle promise
cancellation. The Canceled
exception is treated differently
than others in several of the Lwt primitives. For example,
wakeup
will fail if the associated promise is already
resolved, except if it is rejected with Canceled
. As a
result, the following code does not raise an exception – but it would
with a different exception:
let _, r = task () ;;
wakeup_exn r Canceled ;;
wakeup r 0 ;;
Lwt.cancel: 'a t -> unit
cancel p
causes p
to be rejected with
Canceled
. Except that sometimes it does not do that, and
sometimes it does much more than that.
Cancelable, not cancelable
Some promises are cancelable (calling cancel
on them
causes them to be rejected with Canceled
) others are not
(calling cancel
on them is a no-op). For example, consider
the difference between task
and wait
.
Lwt.wait: unit -> 'a t * 'a u
wait ()
is similar to task ()
except that the
returned promise is not cancelable. Calling cancel
on the
promise created by wait
has no effect.
There is no way to test whether a promise is cancelable – besides actually canceling it and observing a change of state.
The semantics of simple cancellation
The specific semantics of cancellation (detailed below) are inherited from a way of thinking about threads rather than promises. Consider the following code with uninteresting details elided:
let p =
let* .. = .. in
let* .. = .. in
let .. = .. in
let .. = .. in
let* .. = .. in
let* .. = .. in
match .. with
| .. -> return ..
| .. ->
let* .. = .. in
let .. = .. in
let* .. = .. in
return ..
The control-flow within p
is relatively simple: it
mostly goes from line to line traversing bind operators and there is a
single branching point. The control-flow is simple enough that it is
easy to think about p
as a “thread”: a series of
instructions that execute in relative independence to the rest of the
program. Or, slightly more but not entirely accurate, you can think of
p
as a promise and the expression that p
is
bound to as a thread which eventually resolves p
. Either
way, the cancel
function works well on the “thread”
p
: it interrupts the thread at whatever point of its
execution it currently is.
The correct, thread-less, description is that cancel p
finds whichever of the intermediate promise of p
is
currently pending and rejects it with Canceled
.
Set up for cancellation propagation
In order to find the pending intermediate promise of p
,
Lwt stores additional information in pending promises. Specifically, it
stores a list of promises to propagate the cancellation to.
type cancelation = Self | Others of 'a t list
type pending = {
..
cancelation: cancelation;
}
The promise combinators simply set the cancelation
field
of pending
when they create the promise. For example,
task
and wait
work as follows.
let task () =
let p =
{ state =
Pending {
..;
cancelation = Self; }
}
in
(p, p)
let wait () =
let p =
{ state =
Pending {
..;
cancelation = Others []; }
}
in
(p, p)
And bind
actually performs more legwork than previously
shown.
let bind p f =
match p.state with
| Fulfilled v -> f v
| Rejected exc -> Rejected exc
| Pending pending ->
let p'' =
{ state =
Pending {
..;
cancelation = Others [ p ]; }
}
in
.. (* set up callbacks to resolve p'' *)
p''
And more complex combinators such as choose
and
join
include multiple promises in their Others
cancellation field.
Propagating cancellation
Using this additional information, Lwt can find the pending promises that a given promise depends on.
let rec cancel p =
match p.state with
| Fulfilled _ -> ()
| Rejected _ -> ()
| Pending { cancelation = Self } ->
wakeup_exn p Canceled
| Pending { cancelation = Others ps } ->
List.iter cancel ps
This is a mock implementation on the simplified model. In particular, the cancellation works in two separate phases: collect all promises that the cancellation affects, and then trigger all the cancellations. This is to avoid subtle bugs where some cancellation has side-effects that modify the state of some of the not yet traversed promises.
Cancellation-focused combinators
As mentioned, the semantics of cancel
are easy to grasp
for simple examples such as promises daisy-chained in a simple flow
control. On the other hand, the semantics of cancel
on
complex constructions can be difficult to understand. In particular,
cancelling a promise p
may cancel other promises that share
a common ancestor with p
.
In order to avoid unwanted side-effects, Lwt provides a few
combinators to tweak the behaviour of the cancellation mechanism.
Specifically, these combinators introduce new promises with carefully
crafted cancelation
fields that cause the propagation of
cancellation to behave in the desired way.
Lwt.protected: 'a t -> 'a t
protected p
is a promise pp
that is similar to
p
: it has the same state and it resolves similarly.
However, when the cancellation propagation described above reaches
pp
, then pp
is rejected with
Canceled
but nothing happens to p
. In our
simplified model, this is achieved by setting pp
’s
cancelation
field to Self
.
Lwt.no_cancel: 'a t -> 'a t
no_cancel p
is a promise pp
that is similar to
p
: it has the same state and it resolves similarly.
However, when the cancellation propagation described above reaches
pp
, then nothing happens. In our simplified model, this is
achieved by setting pp
’s cancelation
field to
Others []
.
Pausing and scheduling
As seen previously, Lwt’s scheduling is eager. Specifically, when
given an already resolved promise and a function that returns an already
resolved promise, bind
returns and already resolved
promise. As a result, the user must call pause
(or
yield
) in order to force the Lwt scheduler to give
opportunities to other promises to progress towards resolution.
To achieve this, Lwt stores the pending promises created with
pause
into a global variable. By itself, this promise can
make no progress. However, when the scheduler’s main-loop is running, it
resolves paused promises at each iterations.
let paused_promises = ref []
let pause () =
let p = { state = Pending { .. } } in
paused_promises := p :: !paused_promises;
p
let rec run p =
match p.state with
| Fulfilled v -> v
| Rejected exc -> raise exc
| Pending _ ->
let all_paused = !paused_promises in
paused_promises := [];
List.iter (fun p -> wakeup p ()) all_paused;
run p
Note, however, that different environment have different ways to deal
with paused promises. For example, in Unix the scheduler only kicks in
when a call to Lwt_main.run
is on-going. By contrast, in
js_of_ocaml
the scheduler is always running. For
compatibility with other environment, you cannot rely on
pause
to hold your promises pending until you call
Lwt_main.run
.
Also note that the mock scheduler above is only for building a mental model. The specifics differ, but the core idea is there.
OS interactions
The last part of Lwt that this tutorial covers is interaction with
the operating system. Whilst it is possible to call functions from
OCaml’s Unix
module, such functions are oblivious to Lwt
and may cause issues. Most notably, these functions may block the
execution of the program.
There are multiple ways to interact with the OS without blocking. For
the most common OS interactions (e.g., creating files, reading from
pipes, accepting incoming connections, etc.), Lwt comes with the
sub-library lwt.unix
which includes the module
Lwt_unix
which contains Lwt-aware wrappers for the
Unix
module.
Lwt_unix.sleep: float -> unit t
The promise sleep t
is a pending promise that becomes
resolved after t
seconds have elapsed. Note that the
resolution of the promise only happens if the scheduler is running –
i.e., if there is an ongoing call to Lwt_main.run
. If the
scheduler is not running when the timer expires, then the promise is
only resolved the next time the scheduler is started.
Another way to call blocking functions is to wrap them in
Lwt_preemptive.detach
(provided by lwt.unix
)
or, equivalently, in Mwt.run
(provided by
mwt
). This solution relies on OCaml’s Thread
module to isolate the blocking calls in a different execution context.
It is mostly useful when making calls to a third-party library that does
not provide Lwt-aware system primitives.
Blocking system calls are maintained in their own global variable. The scheduler checks these calls at each iteration, causing the process to actually sleep if none of the calls are done and no paused promises can be found.
Part 3
The third and final part of this 2-part introduction/tutorial collects some miscellaneous remarks of little practical use.