Supervisors and Workers

Learn how a supervisor monitors its processes and restarts them when they fail.

Elixir doesn’t worry much about code that crashes. Instead, it makes sure the overall application keeps running. This might sound contradictory, but it isn’t. Think of a typical application. If an unhandled error causes an exception to be raised, the application stops. Nothing else gets done until it’s restarted. If it’s a server handling multiple requests, they all might be lost.

The issue here is that one error takes the whole application down. Now, imagine that our application consists of hundreds or thousands of processes, each handling just a small part of a request. If one of those crashes, everything else carries on. We might lose the work it’s doing, but we can design our applications to minimize even that risk. And when that process gets restarted, we’re back running at 100%.

In the Elixir and OTP worlds, supervisors perform all of this process monitoring and restarting.

Supervisor

An Elixir supervisor has just one purpose. It manages one or more processes. As we’ll discuss later, these processes can be workers or other supervisors.

At its simplest, a supervisor is a process that uses the OTP supervisor behavior. It’s given a list of processes to monitor and is told what to do if a process dies and how to prevent restart loops (when a process is restarted, dies, gets restarted, dies, and so on).
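As a rough illustration of those two instructions, the sketch below starts a supervisor with a restart strategy and a restart limit, then crashes its child on purpose. Demo.Worker and Demo.Crash are hypothetical names invented for this example, and the limits (2 restarts in 5 seconds) are arbitrary; only Supervisor.start_link and its :strategy, :max_restarts, and :max_seconds options come from the standard library.

```elixir
# Hypothetical worker used only for this illustration.
defmodule Demo.Worker do
  use Agent

  def start_link(initial) do
    Agent.start_link(fn -> initial end, name: __MODULE__)
  end
end

# Hypothetical helpers for crashing the worker and waiting for its replacement.
defmodule Demo.Crash do
  # Kills the process registered under `name` and waits until it is gone.
  def kill(name) do
    pid = Process.whereis(name)
    ref = Process.monitor(pid)
    Process.exit(pid, :kill)

    receive do
      {:DOWN, ^ref, :process, ^pid, _reason} -> pid
    end
  end

  # Polls until a process other than `old_pid` is registered under `name`.
  def await_restart(name, old_pid) do
    case Process.whereis(name) do
      pid when is_pid(pid) and pid != old_pid ->
        pid

      _ ->
        Process.sleep(10)
        await_restart(name, old_pid)
    end
  end
end

# :one_for_one restarts only the child that died.
# :max_restarts and :max_seconds bound restart loops: more than
# 2 restarts within 5 seconds makes the supervisor give up.
{:ok, sup} =
  Supervisor.start_link(
    [{Demo.Worker, 0}],
    strategy: :one_for_one,
    max_restarts: 2,
    max_seconds: 5
  )

# Unlink so that a supervisor shutdown doesn't take this script down,
# and monitor the supervisor so that we are told when it exits.
Process.unlink(sup)
sup_ref = Process.monitor(sup)

# Two crashes stay within the limit, so the worker comes back each time.
first_pid = Demo.Crash.kill(Demo.Worker)
second_pid = Demo.Crash.await_restart(Demo.Worker, first_pid)
Demo.Crash.kill(Demo.Worker)
Demo.Crash.await_restart(Demo.Worker, second_pid)

# A third crash inside the same window exceeds the limit.
Demo.Crash.kill(Demo.Worker)

sup_exit =
  receive do
    {:DOWN, ^sup_ref, :process, ^sup, reason} -> reason
  after
    1_000 -> :still_running
  end

IO.inspect(sup_exit, label: "supervisor exit reason")
```

With these limits, the first two crashes should be absorbed, and the third should make the supervisor itself stop rather than keep restarting a child that won't stay up.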

To do this, the supervisor uses the Erlang VM’s process-linking and process-monitoring facilities. We can write supervisors as separate modules, but the Elixir style is to include them inline. The easiest way to get started is to create our project with the --sup flag. Let’s do this for our sequence server.

$ mix new --sup sequence 

The only apparent difference is the appearance of the file lib/sequence/application.ex. Let’s have a look inside (we’ve stripped out some comments):

defmodule Sequence.Application do
  @moduledoc false

  use Application

  def start(_type, _args) do
    children = [
      # {Sequence.Worker, arg},
    ]

    opts = [strategy: :one_for_one, name: Sequence.Supervisor]
    Supervisor.start_link(children, opts)
  end
end

Our start function now creates a supervisor for our application. All we need to do is tell it what we want to be supervised. We copy the second version of the Sequence.Server module from the last chapter into the lib/sequence folder. Then, we uncomment and change the line in the children list to reference this module:

defmodule Sequence.Application do
  @moduledoc false

  use Application

  def start(_type, _args) do
    children = [
      {Sequence.Server, 123}
    ]

    opts = [strategy: :one_for_one, name: Sequence.Supervisor]
    Supervisor.start_link(children, opts)
  end
end
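The {module, argument} tuple in the children list is shorthand for a child specification. The sketch below shows roughly what the supervisor derives from it. Demo.Server is a hypothetical stand-in for Sequence.Server, reduced to start_link/1 and init/1. The child_spec/1 function it relies on is generated by use GenServer.

```elixir
# Hypothetical stand-in for Sequence.Server.
defmodule Demo.Server do
  use GenServer

  def start_link(initial) do
    GenServer.start_link(__MODULE__, initial, name: __MODULE__)
  end

  @impl true
  def init(initial), do: {:ok, initial}
end

# `use GenServer` generates child_spec/1. The supervisor calls it with
# the second element of the tuple to learn how to start the child.
spec = Demo.Server.child_spec(123)
IO.inspect(spec, label: "child spec")

# The same entry can be written out by hand as a map.
children = [%{id: Demo.Server, start: {Demo.Server, :start_link, [123]}}]
{:ok, sup} = Supervisor.start_link(children, strategy: :one_for_one)

IO.inspect(Supervisor.count_children(sup), label: "children")
```

The tuple form is usually preferable because it keeps the start details inside the worker module, but the map form makes it easier to see which function and arguments the supervisor will use.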

Let’s look at what’s going to happen:

  1. When our application starts, the start function is called.

  2. It creates a list of child server modules. In our case, there’s just one: Sequence.Server.

  3. Along with the module name, we specify an argument to be passed to the server when we start it.

  4. We call Supervisor.start_link, passing it the list of child specifications and a set of options. This creates a supervisor process.

  5. Now, our supervisor process calls the start_link function for each of its managed children. In our case, this is the function in Sequence.Server. This code is unchanged. It calls GenServer.start_link to create a GenServer process.
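Step 5 relies on Sequence.Server exposing a start_link function that accepts the argument from the child list. The module itself comes from the previous chapter and isn’t reproduced in this lesson, so the following is only a minimal sketch of what it might look like, inferred from the calls used here (start_link, next_number, increment_number). The real module may differ in its details.

```elixir
# A guess at the shape of Sequence.Server, not the chapter's actual source.
defmodule Sequence.Server do
  use GenServer

  # External API

  # Called by the supervisor with the argument from the child list.
  def start_link(current_number) do
    GenServer.start_link(__MODULE__, current_number, name: __MODULE__)
  end

  def next_number do
    GenServer.call(__MODULE__, :next_number)
  end

  def increment_number(delta) do
    GenServer.cast(__MODULE__, {:increment_number, delta})
  end

  # GenServer callbacks

  @impl true
  def init(initial_number) do
    {:ok, initial_number}
  end

  # Replies with the current number and stores the next one.
  @impl true
  def handle_call(:next_number, _from, current_number) do
    {:reply, current_number, current_number + 1}
  end

  # Adds delta to the state; a non-numeric delta raises here.
  @impl true
  def handle_cast({:increment_number, delta}, current_number) do
    {:noreply, current_number + delta}
  end
end

# Outside a Mix project, the same supervision can be set up by hand.
{:ok, _sup} =
  Supervisor.start_link([{Sequence.Server, 123}], strategy: :one_for_one)
```

Registering the process under the module name is what lets the API functions be called without passing a pid around.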

Now, we’re up and running. Let’s try it:

iex> Sequence.Server.increment_number 3 
:ok
iex> Sequence.Server.next_number
126

So far, so good. But the key with a supervisor is that it’s supposed to manage our worker process. If it dies, for example, we want it to be restarted. Let’s try that. If we pass something that isn’t a number to increment_number, the process should die trying to add it to the current number.

iex> Sequence.Server.increment_number "cat"
:ok
iex> 
10:04:05.805 [error] GenServer Sequence.Server terminating
** (ArithmeticError) bad argument in arithmetic expression
    :erlang.+(123, "cat")
    (sequence 0.1.0) lib/sequence/server.ex:31: Sequence.Server.handle_cast/2
    (stdlib 3.15.2) gen_server.erl:695: :gen_server.try_dispatch/4
    (stdlib 3.15.2) gen_server.erl:771: :gen_server.handle_msg/6
    (stdlib 3.15.2) proc_lib.erl:226: :proc_lib.init_p_do_apply/3
Last message: {:"$gen_cast", {:increment_number, "cat"}}
State: 126
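The log shows the crash but not the restart. One way to check that the supervisor has brought the worker back is to compare process ids before and after the crash, and then ask for the next number. The sketch below does this with Demo.Sequence, a hypothetical cut-down stand-in for Sequence.Server, and a polling helper added only for this example. Because the child list passes 123 as the start argument, a restarted server is expected to begin from 123 again.

```elixir
# Hypothetical stand-in for Sequence.Server, reduced to what this check needs.
defmodule Demo.Sequence do
  use GenServer

  def start_link(n), do: GenServer.start_link(__MODULE__, n, name: __MODULE__)
  def next_number, do: GenServer.call(__MODULE__, :next_number)
  def increment_number(delta), do: GenServer.cast(__MODULE__, {:increment_number, delta})

  @impl true
  def init(n), do: {:ok, n}

  @impl true
  def handle_call(:next_number, _from, n), do: {:reply, n, n + 1}

  @impl true
  def handle_cast({:increment_number, delta}, n), do: {:noreply, n + delta}

  # Polls until the supervisor has registered a replacement process.
  def await_restart(old_pid) do
    case Process.whereis(__MODULE__) do
      pid when is_pid(pid) and pid != old_pid ->
        pid

      _ ->
        Process.sleep(10)
        await_restart(old_pid)
    end
  end
end

{:ok, _sup} = Supervisor.start_link([{Demo.Sequence, 123}], strategy: :one_for_one)

old_pid = Process.whereis(Demo.Sequence)
ref = Process.monitor(old_pid)

Demo.Sequence.increment_number(3)
# Adding a string to a number raises, which crashes the server.
Demo.Sequence.increment_number("cat")

# Wait for the old process to go down, then for its replacement.
receive do
  {:DOWN, ^ref, :process, ^old_pid, _reason} -> :ok
end

new_pid = Demo.Sequence.await_restart(old_pid)
restarted_value = Demo.Sequence.next_number()

IO.inspect(new_pid != old_pid, label: "restarted with a new pid")
IO.inspect(restarted_value, label: "next number after restart")
```

The increment made before the crash is lost, because the supervisor restarts the worker from its child specification and knows nothing about the state the old process had built up.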

