What happens when a linked process dies
Summary
Previously we looked at different ways we can kill (or attempt to kill) a process and what happens in each case. Now let’s see what happens when a linked process dies. The tl;dr is in the table below.
Trapping exits? | Reason for linked process exit | Exit message received? | Exits? |
---|---|---|---|
no | :normal | no | no |
no | any reason other than :normal | no | yes |
yes | any reason including :normal | yes | no |
Note that this behaviour describes what happens when the exiting linked process is _not_ the parent process. We look at that case in a subsequent post.
This page can be downloaded as a LiveBook for execution. The raw markdown is here, or you can follow the previous instructions for cloning the blog and executing the LiveBook pages.
When the final value of a snippet is significant I have matched against it to make the output clearer when reading the blog rather than executing the page. eg
true = Linky.alive_after_wait_for_death?(l1)
A GenServer for the experiments
defmodule Linky do
  use GenServer

  def start do
    GenServer.start(__MODULE__, {})
  end

  def init(_), do: {:ok, %{}}

  def link(server, other) do
    :ok = GenServer.call(server, {:link, other})
  end

  def trap_exits(server) do
    :ok = GenServer.call(server, :trap_exits)
  end

  # Polls for up to `count` milliseconds, returning false as soon as the process is dead.
  def alive_after_wait_for_death?(pid, count \\ 50)
  def alive_after_wait_for_death?(pid, 0), do: Process.alive?(pid)

  def alive_after_wait_for_death?(pid, count) do
    case Process.alive?(pid) do
      true ->
        :timer.sleep(1)
        alive_after_wait_for_death?(pid, count - 1)

      _ ->
        false
    end
  end

  def handle_call({:link, other}, _, s) do
    Process.link(other)
    {:reply, :ok, s}
  end

  def handle_call(:trap_exits, _, s) do
    Process.flag(:trap_exit, true)
    {:reply, :ok, s}
  end

  # Exit messages from linked processes arrive here when trapping exits.
  def handle_info(event, s) do
    IO.inspect({self(), event}, label: :handle_info)
    {:noreply, s}
  end

  def terminate(reason, _s) do
    IO.inspect({self(), reason}, label: :terminate)
  end
end
The GenServer above is used in the illustrative code that follows.
Linked process exits with :normal reason
I’ve read in more than one place that linking two processes ties their lifecycles together, so that when one dies the other pops off too. This is not the full story.
{:ok, l1} = Linky.start()
{:ok, l2} = Linky.start()
Linky.link(l1, l2)
GenServer.stop(l2, :normal)
true = Linky.alive_after_wait_for_death?(l1)
When we execute the above code, we discover that when a linked process exits with a :normal reason, the other process does not die.
{:ok, l1} = Linky.start()
{:ok, l2} = Linky.start()
Linky.link(l1, l2)
Linky.trap_exits(l1)
GenServer.stop(l2, :normal)
true = Linky.alive_after_wait_for_death?(l1)
We still get a message from a linked process’s :normal exit (eg {:EXIT, #PID<0.136.0>, :normal}) when we are trapping exits. In fact, with one exception, given linked processes l1 and l2, l1 experiences l2’s exit exactly as if l2 had called Process.exit/2 on l1 with its exit reason.
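The same comparison can be made without a GenServer. The following is a minimal sketch, assuming it is run as a standalone script or pasted into IEx; the variable names and the :some_reason atom are made up for illustration. It traps exits in the current process, then collects the reason from a linked process that finishes normally and the reason from an explicit Process.exit/2 call made by another process.

```elixir
# Trap exits in the current process so exit signals arrive as messages.
Process.flag(:trap_exit, true)

# spawn_link/1 links the new process to the caller. The function returns
# straight away, so the process exits with the reason :normal.
child = spawn_link(fn -> :ok end)

normal_exit =
  receive do
    {:EXIT, ^child, reason} -> reason
  after
    1_000 -> :no_exit_message
  end

# For comparison: an unlinked process calls Process.exit/2 on us directly.
parent = self()
sender = spawn(fn -> Process.exit(parent, :some_reason) end)

explicit_exit =
  receive do
    {:EXIT, ^sender, reason} -> reason
  after
    1_000 -> :no_exit_message
  end

IO.inspect({normal_exit, explicit_exit}, label: :exit_reasons)
```

In both cases the trapping process sees a message of the same shape, {:EXIT, from_pid, reason}.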
Linked processes that exit by a :kill
Remember that :kill is an untrappable exit when calling Process.exit/2? Its untrappability does not cascade to linked processes.
{:ok, l1} = Linky.start()
{:ok, l2} = Linky.start()
Linky.link(l1, l2)
Linky.trap_exits(l1)
Process.exit(l2, :kill)
true = Linky.alive_after_wait_for_death?(l1)
Interestingly, the reason received by the linked process is :killed, not :kill, and :killed would of course be trappable if sent via Process.exit/2. That is one explanation for :kills not cascading through linked processes.
{:ok, l1} = Linky.start()
{:ok, l2} = Linky.start()
Linky.link(l1, l2)
Linky.trap_exits(l1)
GenServer.stop(l2, :kill)
true = Linky.alive_after_wait_for_death?(l1)
It would be an explanation for :kills not cascading, except it is perfectly possible, as above, for a process to exit with a reason :kill, and have the signal :kill trapped by a linked process. (Possible, but it would be a curious thing to actually do in production code.)
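The two cases can be put side by side with plain processes. This is a sketch under the assumption that it runs as a standalone script or in IEx; the names victim and quitter are invented for the example. The first process is killed from outside with Process.exit/2, while the second exits on its own with the reason :kill.

```elixir
Process.flag(:trap_exit, true)

# Case 1: the linked process is killed from outside with Process.exit/2.
victim = spawn_link(fn -> Process.sleep(:infinity) end)
Process.exit(victim, :kill)

killed_from_outside =
  receive do
    {:EXIT, ^victim, reason} -> reason
  after
    1_000 -> :no_exit_message
  end

# Case 2: the linked process exits by itself with the reason :kill.
quitter = spawn_link(fn -> exit(:kill) end)

exited_with_kill =
  receive do
    {:EXIT, ^quitter, reason} -> reason
  after
    1_000 -> :no_exit_message
  end

IO.inspect({killed_from_outside, exited_with_kill}, label: :reasons)
```

The first reason should come back as :killed and the second as :kill, and both arrive as ordinary messages because the receiving process is trapping exits.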
{:ok, l1} = Linky.start()
{:ok, l2} = Linky.start()
Linky.link(l1, l2)
Process.exit(l2, :kill)
false = Linky.alive_after_wait_for_death?(l1)
Of course, if we are not trapping exits and a linked process is killed, then the other process also dies.
Linked processes exiting with other reasons
{:ok, l1} = Linky.start()
{:ok, l2} = Linky.start()
Linky.link(l1, l2)
GenServer.stop(l2, :whatever)
false = Linky.alive_after_wait_for_death?(l1)
For completeness, the above code shows that if you are not trapping exits then linked processes will exit when the other exits, as long as the reason is not :normal.
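To see the reason that the surviving side actually dies with, a monitor can be used instead of polling Process.alive?/1. This is a small sketch with plain processes, assuming a standalone script or IEx session; the variable names are illustrative only.

```elixir
# A long-lived process that is not trapping exits.
bystander = spawn(fn -> Process.sleep(:infinity) end)

# Monitoring does not affect the process; it only reports how it ends.
ref = Process.monitor(bystander)

# A short-lived process links itself to the bystander and then exits
# with a reason other than :normal.
spawn(fn ->
  Process.link(bystander)
  exit(:whatever)
end)

down_reason =
  receive do
    {:DOWN, ^ref, :process, ^bystander, reason} -> reason
  after
    1_000 -> :still_alive
  end

IO.inspect(down_reason, label: :bystander_exit_reason)
```

The bystander should go down with the same reason, :whatever, that the linked process exited with.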
Linked (non GenServer / OTP) processes that simply exit
{:ok, l1} = Linky.start()
{:ok, l2} = Linky.start()
spawny =
  spawn(fn ->
    receive do
      :bye ->
        IO.puts("Goodbye sweet world 😿")
    end
  end)
Linky.link(l1, spawny)
Linky.link(l2, spawny)
Linky.trap_exits(l1)
send(spawny, :bye)
[true, true] = for pid <- [l1, l2], do: Linky.alive_after_wait_for_death?(pid)
When any process’s function returns normally, the process exits with the :normal reason. Linked processes behave accordingly: they receive a :normal exit notification from the linked process if they are trapping exits; even if they are not trapping exits, they do not exit. This is why it’s safe, for instance, to start a task with Task.start_link/1 or spawn_link/1.
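As a quick illustration of that safety, here is a sketch that starts a linked task from a process that is not trapping exits and lets the task finish. It assumes a standalone script or IEx session, and the :go message is only there so that the monitor is in place before the task ends.

```elixir
# Make sure the current process is not trapping exits.
Process.flag(:trap_exit, false)

# The task waits for :go so that we can attach a monitor before it finishes.
{:ok, task} =
  Task.start_link(fn ->
    receive do
      :go -> :done
    end
  end)

ref = Process.monitor(task)
send(task, :go)

task_exit =
  receive do
    {:DOWN, ^ref, :process, ^task, reason} -> reason
  after
    1_000 -> :task_still_running
  end

# Reaching this line at all shows that the linked caller survived.
caller_alive = Process.alive?(self())
IO.inspect({task_exit, caller_alive}, label: :after_task)
```

The task ends with :normal, and the linked caller simply carries on.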
A silly mistake I made that resulted in a process leak
For reasons, I created a helper process for a LiveView page in the mount/3 callback. I wrote something like
def mount(_params, _session, socket) do
  {:ok, debouncer} = Debouncer.start_link(self(), 500)
  # mount/3 returns an {:ok, socket} tuple
  {:ok, assign(socket, debouncer: debouncer)}
end
You can see the problem here, right? Yes, mount/3 is called twice: once when the initial html is rendered and again when the socket connects, so an extra Debouncer is started. This would be a waste of cpu cycles, but no more, except that the initial rendering call is initiated by a Cowboy process that exits with a :normal reason. The Debouncer process does not exit, but stays orphaned, taking up resources.
def mount(_params, _session, socket) do
  # Only start the helper once the socket is connected.
  if connected?(socket) do
    {:ok, debouncer} = Debouncer.start_link(self(), 500)
    {:ok, assign(socket, debouncer: debouncer)}
  else
    {:ok, socket}
  end
end
The above mount/3 does not leak processes, as the LiveView’s socket process exits with {:shutdown, reason}, eg {:shutdown, :closed}, causing the linked Debouncer to also exit. It would be even safer, and proof against LiveView changes, to trap exits in Debouncer and voluntarily exit with a :stop return on the callback; I may do just that.
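A rough sketch of what that could look like follows. SketchDebouncer is a hypothetical stand-in, not the real Debouncer (whose implementation is not shown in this post), and the debouncing itself is left out. One assumption worth flagging: as far as I understand GenServer, an exit of the process that called start_link is handled by GenServer itself once exits are trapped, so the handle_info clause would mainly be reached for other linked processes. Either way the server stops.

```elixir
defmodule SketchDebouncer do
  # Hypothetical stand-in for the post's Debouncer; only exit handling is shown.
  use GenServer

  def start_link(owner, interval_ms) do
    GenServer.start_link(__MODULE__, {owner, interval_ms})
  end

  @impl true
  def init({owner, interval_ms}) do
    # Turn exit signals from linked processes into messages.
    Process.flag(:trap_exit, true)
    {:ok, %{owner: owner, interval_ms: interval_ms}}
  end

  @impl true
  def handle_info({:EXIT, _from, _reason}, state) do
    # A linked process went away, whatever the reason: stop voluntarily.
    {:stop, :normal, state}
  end

  def handle_info(_other, state), do: {:noreply, state}
end

test_pid = self()

# The owner plays the part of the short-lived process that calls mount/3.
owner =
  spawn(fn ->
    {:ok, pid} = SketchDebouncer.start_link(self(), 500)
    send(test_pid, {:debouncer, pid})

    receive do
      :finish -> :ok
    end
  end)

debouncer =
  receive do
    {:debouncer, pid} -> pid
  after
    1_000 -> raise "debouncer did not start"
  end

ref = Process.monitor(debouncer)

# The owner returns from its function, so it exits with :normal.
send(owner, :finish)

debouncer_exit =
  receive do
    {:DOWN, ^ref, :process, ^debouncer, reason} -> reason
  after
    1_000 -> :still_alive
  end

IO.inspect(debouncer_exit, label: :debouncer_exit)
```

With exits trapped, the helper goes away even when its owner exits with :normal, which is the case that caused the leak.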
I thought about adding this story at the beginning of the post, but I thought it might make it like one of those cooking articles that people complain about - the ones where you have to scroll through paragraphs of prose before getting to the actual recipe. (Also it doesn’t reflect well on me and people might get bored before reading this far.)
This series
Starting out looking at exit signals and OTP process death has turned into a small series of posts, including this one. These are:
- The many and varied ways to kill an OTP Process: investigation of different ways to cause (or fail to cause) a process to exit.
- What happens when a linked process dies: the impact of a process exiting on processes that are linked to it, excluding OTP processes with a parent/child relationship.
- Death, Children, and OTP: the impact on an OTP process when the process that spawned it (its parent) exits, particularly when the child is trapping exits.
Updates
- 2020-06-28: Added check to code to show that a process trapping exits does not die when a linked process dies with :normal, just like when not trapping exits.
- 2021-06-29: included the section linking to posts in this series.
- 2021-06-29: added a note that this post does not look at linked processes with a parent/child relationship.