13 Running the system

This chapter covers

  - Running a system with Elixir tools
  - Working with OTP releases
  - Analyzing system behavior

You’ve spent a lot of time building a to-do system, and now it’s time to prepare it for production. There are several ways to start a system, but the basic idea is always the same. You have to compile your code as well as your dependencies. Then, you start the BEAM instance and ensure all compiled artifacts are in the load path. Finally, from within the BEAM instance, you need to start your OTP application together with its dependencies. Once the OTP application is started, you can consider your system to be running.

There are various approaches to achieving this, and in this chapter, we’ll focus on two of them. First, we’ll look at how you can use Elixir tools, most notably mix, to start the system. Then, we’ll discuss OTP releases. Finally, I’ll end the chapter and the book by providing some pointers on how to interact with a running system, so you can detect and analyze faults and errors that inevitably happen at run time.

13.1 Running a system with Elixir tools

Regardless of the method you use to start the system, some common principles always hold. Running the system amounts to doing the following:

  1. Compile all modules. Corresponding .beam files must exist somewhere on the disk (as explained in section 2.7). The same holds for the application resource (.app) files of all OTP applications needed to run the system.

  2. Start the BEAM instance, and set up load paths to include all locations from step 1.

  3. Start all required OTP applications.
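
To make steps 2 and 3 concrete, here's a minimal sketch performed from within a running BEAM instance. The :crypto application (shipped with Erlang) stands in for :todo so the snippet is self-contained, and the load path added is illustrative:

```elixir
# Step 2: extend the load path so the runtime can find compiled .beam files.
# The path here is illustrative; in a Mix project it would be something
# like _build/dev/lib/todo/ebin.
Code.append_path(".")

# Step 3: start an OTP application together with all of its dependencies.
# :crypto is used as a stand-in application; the return value lists the
# applications that were started, in order.
{:ok, started} = Application.ensure_all_started(:crypto)
IO.inspect(started)
```

In practice, you rarely perform these steps manually; the tools discussed next do them for you.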

Probably the simplest way to do this is to rely on standard Elixir tools. Doing so is straightforward, and you’re already familiar with some aspects of mix, iex, and elixir command-line tools. So far, you’ve been using iex, which lets you start the system and interact with it. When you invoke iex -S mix, all the steps just mentioned are taken to start the system.

When running in production, you may want to start the system as a background process without the iex shell started. To do this, you need to start the system via the mix and elixir commands.

13.1.1 Using the mix and elixir commands

So far, we’ve been using the iex -S mix command to start the system. It’s also possible to start the system with mix run --no-halt. This command starts the BEAM instance and then starts your OTP application together with its dependencies. The --no-halt option instructs mix to keep the BEAM instance running forever:

$ mix run --no-halt    # starts the system without the iex shell

Starting database worker.
Starting database worker.
Starting database worker.
Starting to-do cache.

Compared to iex -S mix, the important difference is that mix run doesn’t start the interactive shell.

A slightly more elaborate option is to use the elixir command:

$ elixir -S mix run --no-halt
 
Starting database worker.
Starting database worker.
Starting database worker.
Starting to-do cache.

This approach requires a bit more typing, but it allows you to run the system in the background.

By using the -detached Erlang flag, you can start the system in detached mode. The OS process will be detached from the terminal, and there will be no console output. When starting a detached system, it’s also useful to turn the BEAM instance into a node, so you can later interact with it and terminate it when needed:

$ elixir --erl "-detached" --sname todo_system@localhost \
    -S mix run --no-halt

This starts the BEAM instance in the background.

You can check that it’s running by looking at which BEAM nodes exist on your system:

$ epmd -names

epmd: up and running on port 4369 with data:
name todo_system at port 51028

At this point, your system is running, and you can use it—for example, by issuing an HTTP request to manipulate to-do lists.

You can connect to a running BEAM instance and interact with it. It’s possible to establish a remote shell, which is something like a terminal shell session to the running BEAM instance. In particular, with the --remsh option, you can start another node and use it as a shell to the todo_system node:

$ iex --sname debugger@localhost --remsh todo_system@localhost --hidden
iex(todo_system@localhost)1>

In this example, you start the debugger node, but the shell is running in the context of todo_system. Whatever function you call will be invoked on todo_system. This is extremely useful because you can now interact with the running system. BEAM provides all kinds of nice services that allow you to query the system and individual processes, as we’ll discuss a bit later.

Notice that you start the debugger node as hidden. As mentioned in chapter 12, this means the debugger node won’t appear in the results of Node.list (or Node.list([:this, :visible])) on todo_system, so it won’t be considered part of the cluster.

To stop the running system, you can use the System.stop function (https://hexdocs.pm/elixir/System.xhtml#stop/1), which takes down the system in a graceful manner. It shuts down all running applications and then terminates the BEAM instance:

iex(todo_system@localhost)1> System.stop()

The remote shell session is left hanging, and an attempt to run any other command will result in an error:

iex(todo_system@localhost)2>
*** ERROR: Shell process terminated! (^G to start new job) ***

At this point, you can close the shell and verify the running BEAM nodes:

$ epmd -names
epmd: up and running on port 4369 with data:

If you want to stop a node programmatically, you can rely on the distributed features described in chapter 12. Here’s a quick example:

if Node.connect(:todo_system@localhost) == true do
  # invokes System.stop on the remote node
  :rpc.call(:todo_system@localhost, System, :stop, [])
  IO.puts "Node terminated."
else
  IO.puts "Can't connect to a remote node."
end

Here, you connect to a remote node and then rely on :rpc.call/4 to invoke System.stop there.

You can store the code in the stop_node.exs file (the .exs extension is frequently used for Elixir-based scripts). Then, you can run the script from the command line:

$ elixir --sname terminator@localhost stop_node.exs

Running a script starts a separate BEAM instance and interprets the code in that instance. After the script code is executed, the host instance is terminated. Because the script instance needs to connect to a remote node (the one you want to terminate), you need to give it a name to turn the BEAM instance into a proper node.

13.1.2 Running scripts

I haven’t discussed scripts and tools so far, but they’re worth a quick mention. Sometimes, you may want to build a command-line tool that does some processing, produces the results, and then stops. The simplest way to go about that is to write a script.

You can create a plain Elixir file, give it an .exs extension to indicate it’s a script, implement one or more modules, and invoke a function:

defmodule MyTool do
  def run do
    ...
  end

  ...
end

MyTool.run()    # starts the tool

You can then invoke the script with the elixir my_script.exs command. All modules you define will be compiled in memory, and all expressions outside of any module will be interpreted. After everything finishes, the script will terminate. Of course, an Elixir script can run only on a system with correct versions of Erlang and Elixir installed.

External libraries can be added with Mix.install (https://hexdocs.pm/mix/Mix.xhtml#install/2). For example, the following script uses the Jason library to parse the JSON content provided as the command line argument:

Mix.install([{:jason, "~> 1.4"}])    # installs the Jason dependency

input = hd(System.argv())
decoded = Jason.decode!(input)       # uses the Jason library

IO.inspect(decoded)

The list passed to Mix.install follows the same format as the dependency list used in mix.exs.
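
As a quick illustration, dependencies in both places are given as {app_name, version_requirement} tuples (the entries below are examples, not requirements of the to-do system):

```elixir
# The same dependency format works in Mix.install and in the deps list
# of mix.exs: {app_name, version_requirement} tuples.
deps = [
  {:jason, "~> 1.4"},
  {:poolboy, "~> 1.5"}
]

# Each requirement is a valid version requirement string:
for {app, requirement} <- deps do
  true = is_atom(app)
  {:ok, _} = Version.parse_requirement(requirement)
end

IO.puts("all requirements valid")
```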

Let’s try this out. Save the code above to the file named json_decode.exs. Then, execute the script:

$ elixir json_decode.exs '{"some_key": 42}'

Resolving Hex dependencies...
Resolution completed in 0.011s
New:
  jason 1.4.0
* Getting jason (Hex package)
==> jason
Compiling 10 files (.ex)
Generated jason app

%{"some_key" => 42}

The first lines show the dependency installation and compilation; the final line is the script output.

When the script is executed for the first time, Mix installs the dependency, compiles it, and caches the result to the disk. Subsequent executions will use the cached version, so the script will run much more quickly than on the first run.

An .exs script is fine for simpler tools, but it’s not efficient when the code becomes more complex. In this case, it’s best to use a proper Mix project and build a full OTP application.

But because you’re not building a system that runs continuously, you also need to include a runner module in the project—something that does processing and produces output:

defmodule MyTool.Runner do
  def run do
    ...
  end
end

Then, you can start the tool with mix run -e MyTool.Runner.run. This starts the OTP application, invokes the MyTool.Runner.run/0 function, and terminates as soon as the function is finished.

You can also package the entire tool in an escript, a single binary file that embeds all your .beam files, Elixir .beam files, and the start-up code. An escript file is thus a fully compiled, cross-platform script that requires only the presence of Erlang on the running machine. For more details, refer to the mix escript.build documentation (https://hexdocs.pm/mix/Mix.Tasks.Escript.Build.xhtml).
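
For illustration, here's a sketch of what an escript setup might look like. MyTool and MyTool.CLI are hypothetical names; the entry-point module must export a main/1 function:

```elixir
# mix.exs (fragment): point the escript at its entry module
defmodule MyTool.MixProject do
  use Mix.Project

  def project do
    [
      app: :my_tool,
      version: "0.1.0",
      escript: [main_module: MyTool.CLI]
    ]
  end
end

# lib/my_tool/cli.ex: main/1 receives the command-line arguments
defmodule MyTool.CLI do
  def main(args) do
    IO.puts("invoked with: #{inspect(args)}")
  end
end
```

Running mix escript.build would then produce a my_tool binary that can be invoked directly from the command line.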

A somewhat similar but more limited option is an Erlang archive, a zip file containing the compiled binaries. Compared to escripts, the main benefit of archives is that they can be installed globally with the mix archive.install task (https://hexdocs.pm/mix/Mix.Tasks.Archive.Install.xhtml). This makes them perfect to distribute system-wide Mix tasks. A popular example is the phx.new task, which is used to generate a new project powered by the Phoenix web framework. You can read more about building archives at https://hexdocs.pm/mix/Mix.Tasks.Archive.Build.xhtml.

13.1.3 Compiling for production

As mentioned in chapter 11, there’s a construct called the Mix environment—a compile-time identifier that allows you to conditionally define code. The default Mix environment is dev, indicating you’re dealing with development. In contrast, when you run tests with mix test, the code is compiled in the test environment.

You can use the Mix environment to conditionally include code for development- or test-time convenience. For example, you can rely on the Mix.env/0 function to define different versions of a function. Here’s a simple sketch:

defmodule Todo.Database do
  case Mix.env() do
    :dev ->
      def store(key, data) do ... end
 
    :test ->
      def store(key, data) do ... end
 
    _ ->
      def store(key, data) do ... end
  end
end

Notice how you branch on the result of Mix.env/0 at the module level, outside of any functions. This is a compile-time construct, and this code runs during compilation. The final definition of store/2 will depend on the Mix environment you’re using to compile the code. In the dev environment, you might run additional logging and benchmarking, whereas in the test environment, you might use an in-memory storage, such as ETS.
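
Here's a self-contained sketch of this technique. A module attribute stands in for Mix.env() so the snippet can run outside of a Mix project; in a real project, you'd branch on Mix.env() exactly as shown above. The storage logic is a trivial placeholder:

```elixir
defmodule Todo.Database do
  @env :dev  # stand-in for Mix.env(); a compile-time value

  case @env do
    :dev ->
      # dev version: adds extra logging around the real operation
      def store(key, data) do
        IO.puts("storing #{inspect(key)}")
        do_store(key, data)
      end

    _ ->
      # prod/test version: no extra logging
      def store(key, data), do: do_store(key, data)
  end

  # placeholder for the actual storage logic
  defp do_store(key, data), do: {:ok, {key, data}}
end

IO.inspect(Todo.Database.store(:bob, []))
```

Because the case runs at compile time, only one version of store/2 ends up in the compiled module.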

It’s important to understand that Mix.env/0 has meaning only during compilation. You should never rely on it at run time. In any case, your code may contain such conditional definitions, so you should assume your project isn’t completely optimized when compiled in the dev environment.

To start your system in production, you can set the MIX_ENV OS environment variable to the corresponding value:

$ MIX_ENV=prod elixir -S mix run --no-halt

This causes the recompilation of the code and all dependencies. All .beam files are stored in the _build/prod folder, and Mix ensures the BEAM instance loads files from this folder.

Tip It should be obvious from the discussion that the default compiled code (in the dev environment) isn’t optimized. The dev environment allows for better development convenience, but it makes the code perform less efficiently. When you decide to measure how your system behaves under a heavier load, you should always compile everything in the prod environment. Measuring with the dev environment may give you false indications about bottlenecks, and you may spend energy and time optimizing code that isn’t problematic at all in production.

You’ve now seen the basics of starting the system with mix and elixir. This process is straightforward, and it fits nicely into your development flow.

There are some serious downsides, though. First, to start the project with Mix, you need to compile it, which means the system source code must reside on the host machine. You need to fetch all dependencies and compile them as well. Consequently, you’ll need to install all the tools required for compilation on the target host machine. This includes Erlang and Elixir, Hex, and Mix, as well as any other third-party tools that you integrate in your Mix workflow.

Moreover, if you’re running multiple systems on the same machine, it can become increasingly difficult to reconcile the different versions of support tools necessary for different systems. Luckily, there’s a way out, in the form of OTP releases.

13.2 OTP releases

An OTP release is a standalone, compiled, runnable system that consists of the minimum set of OTP applications needed by the system. An OTP release can, optionally, include the minimum set of Erlang runtime binaries, which makes the release completely self-sufficient. A release doesn’t contain artifacts, such as source code, documentation files, or tests.

This approach provides all sorts of benefits. First, you can build the system on your development machine or the build server and ship only binary artifacts. The host machine doesn’t need to have any tools installed. If you embed the minimum Erlang runtime into the release, you don’t even need Elixir and Erlang installed on the production server. Whatever is required to run the system will be part of your release package. In addition, releases simplify some operational tasks, such as connecting to the running system and executing custom Elixir code in the system context. Finally, releases pave the way for systematic online system upgrades (and downgrades), known in Erlang as release handling.

13.2.1 Building a release

To build a release, you need to compile your main OTP application and all of its dependencies. Then, you need to include all the binaries in the release, together with the Erlang runtime. This can be done with the mix release command (https://hexdocs.pm/mix/Mix.Tasks.Release.xhtml).

Let’s see it in action. Go to the to-do folder, and run the release command:

$ mix release
 
* assembling todo-0.1.0 on MIX_ENV=dev
* using config/runtime.exs to configure the release at runtime
 
Release created at _build/dev/rel/todo
 
...

This builds the release in the dev Mix environment. Since a release is meant to run in production, you’ll typically want to build it in the prod environment. You can do this by prefixing the command with MIX_ENV=prod. Alternatively, you can enforce the default environment for the release task in mix.exs.

Listing 13.1 Enforcing the prod environment for the release task (todo_release/mix.exs)

defmodule Todo.MixProject do
  ...
 
  def cli do
    [
      preferred_envs: [release: :prod]
    ]
  end
 
  ...
end

The cli function can be used to configure the default Mix environments for different Mix tasks. The function must return a keyword list with supported options. The :preferred_envs option is a keyword list, where each key is the task name (provided as an atom), and the value is the desired default environment for that task.

With this change in place, you can invoke mix release, which will compile your project in the prod environment and then generate the release:

$ mix release
 
* assembling todo-0.1.0 on MIX_ENV=prod
...

After mix release is done, the release will reside in the _build/prod/rel/todo/ subfolder. We’ll discuss the release’s contents a bit later, but first, let’s see how you can use it.

13.2.2 Using a release

The main tool used to interact with a release is the shell script that resides in _build/prod/rel/todo/bin/todo. You can use it to perform all kinds of tasks, such as these:

  - Starting the system in the foreground or as a background process
  - Opening a remote shell to a running node
  - Attaching to the shell of a running node
  - Stopping a running system

The simplest way to verify that the release works is to start the system in the foreground together with the iex shell:

$ RELEASE_NODE="todo@localhost" _build/prod/rel/todo/bin/todo start_iex
 
Starting database worker.
Starting database worker.
Starting database worker.
Starting to-do cache.
 
iex(todo@localhost)1>

Here, the RELEASE_NODE OS environment variable is set to the desired node name. Without it, Elixir would choose a default value based on the host name. To make the example work on different machines, the node name is hardcoded, with localhost as the host part. Note that this is a short node name. If you want to use long names, you’ll also need to set the RELEASE_DISTRIBUTION OS environment variable to the value name. Refer to the mix release documentation for more details on how to configure the release.
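
For example, to start the node with a long name, you might use something like the following (the IP-based node name is illustrative):

```shell
RELEASE_DISTRIBUTION="name" \
RELEASE_NODE="todo@127.0.0.1" \
  _build/prod/rel/todo/bin/todo start_iex
```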

The release is no longer dependent on your system’s Erlang and Elixir. It’s fully standalone; you can copy the contents of the _build/prod/rel/todo subfolder to another machine where Elixir and Erlang aren’t installed, and it will still work. Of course, because the release contains Erlang runtime binaries, the target machine must be powered by the same OS and architecture.

To start the system as a background process, you can use the daemon command:

$ RELEASE_NODE="todo@localhost" _build/prod/rel/todo/bin/todo daemon

This isn’t the same as a detached process, mentioned earlier. Instead, the system is started via the run_erl tool (https://erlang.org/doc/man/run_erl.xhtml). This tool redirects standard output to a log file residing in the _build/prod/rel/todo/tmp/log folder, which allows you to analyze your system’s console output.

Once the system is running in the background, you can start a remote shell to the node:

$ RELEASE_NODE="todo@localhost" _build/prod/rel/todo/bin/todo remote
 
iex(todo@localhost)1>

At this point, you have an iex shell session running in the context of the production node. Pressing Ctrl-C twice to exit the shell stops the remote shell, but the todo node will still be running.

If the system is running as a background process, and you want to stop it, you can use the stop command:

$ RELEASE_NODE="todo@localhost" _build/prod/rel/todo/bin/todo stop

It’s also possible to attach directly to the shell of the running process. Attaching offers an important benefit: it captures the standard output of the running node. Whatever the running node prints—for example, via IO.puts—is seen in the attached process (which isn’t the case for the remote shell).

Let’s see it in action. First, we’ll start the release in background with iex running. This can be done with the daemon_iex command:

$ RELEASE_NODE="todo@localhost" _build/prod/rel/todo/bin/todo daemon_iex

Now, we can attach to the shell with the to_erl tool:

$ _build/prod/rel/todo/erts-13.0/bin/to_erl _build/prod/rel/todo/tmp/pipe/

iex(todo@localhost)1>

[memory_usage: 70117728, process_count: 230]

Back in chapter 10, you added a job that periodically prints memory usage and process count to the standard output. The output of this job is present when you attach to the shell. Conversely, when running a remote shell, this output won’t be seen.

Be careful when attaching to the shell. Unlike a remote shell, an attached shell runs in the context of the running node. You’re merely attached to the running node via an OS pipe. Consequently, you can only have one attached session at a time. In addition, you might accidentally stop the running node by hitting Ctrl-\. You should press Ctrl-D to detach from the running node, without stopping it.

The todo script can perform various other commands. To get the help, simply invoke _build/prod/rel/todo/bin/todo without any argument. This will print the help to the standard output. Finally, for more details on building a release, take a look at the official Mix documentation at https://hexdocs.pm/mix/Mix.Tasks.Release.xhtml.

13.2.3 Release contents

Let’s spend some time discussing the structure of your release. A fully standalone release consists of the following:

  - The compiled binaries of all required OTP applications
  - The minimum set of Erlang runtime binaries
  - Boot scripts and configuration files
  - A helper shell script for starting, stopping, and interacting with the system

In this case, all these reside somewhere in the _build/prod/rel/todo folder. Let’s take a closer look at some important parts of the release.

Compiled binaries

Compiled versions of all required applications reside in the _build/prod/rel/todo/ lib folder:

$ ls -1 _build/prod/rel/todo/lib
 
asn1-5.1
compiler-8.3
cowboy-2.10.0
cowboy_telemetry-0.4.0
cowlib-2.12.1
crypto-5.2
eex-1.15.0
elixir-1.15.0
iex-1.15.0
kernel-9.0
logger-1.15.0
mime-2.0.3
plug-1.14.2
plug_cowboy-2.6.1
plug_crypto-1.2.5
poolboy-1.5.2
public_key-1.14
ranch-1.8.0
runtime_tools-2.0
sasl-4.2.1
ssl-11.0
stdlib-5.0
telemetry-1.2.1
todo-0.1.0

This list includes all of your runtime dependencies, both direct (specified in mix.exs) and indirect (dependencies of dependencies). In addition, some OTP applications, such as kernel, stdlib, and elixir, are automatically included in the release. These are core OTP applications needed by any Elixir-based system. Finally, the iex application is also included, which makes it possible to run the remote iex shell.

In each of these folders, there is an ebin subfolder, where the compiled binaries reside together with the .app file. Each OTP application folder may also contain the priv folder with additional application-specific files.

Tip If you need to include additional files in the release, the best way to do it is to create a priv folder under your project root. This folder, if it exists, automatically appears in the release under the application folder. When you need to access a file from the priv folder, you can invoke Application.app_dir(:an_app_name, "priv") to find the folder’s absolute path.
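
As a sketch, assuming a file was placed in the priv folder of the :todo application (schema.sql is a hypothetical name), you could locate it as shown in the comment below. The runnable part uses :elixir as a stand-in application, since :todo isn't available in a standalone snippet:

```elixir
# In the to-do system, you might write:
#   path = Path.join(Application.app_dir(:todo, "priv"), "schema.sql")

# Stand-in demonstration with the :elixir application:
priv_dir = Application.app_dir(:elixir, "priv")
IO.puts(priv_dir)
true = String.ends_with?(priv_dir, "priv")
```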

Bundling all required OTP applications makes the release standalone. Because the system includes all required binaries (including the Elixir and Erlang standard libraries), nothing else is required on the target host machine.

You can prove this by looking at the load paths:

$ RELEASE_NODE="todo@localhost" _build/prod/rel/todo/bin/todo start_iex
 
iex(todo@localhost)1> :code.get_path()    # retrieves the list of load paths
 
[~c"ch13/todo_release/_build/prod/rel/todo/lib/../releases/0.1.0/consolidated",
 ~c"ch13/todo_release/_build/prod/rel/todo/lib/kernel-9.0/ebin",
 ~c"ch13/todo_release/_build/prod/rel/todo/lib/stdlib-5.0/ebin",
 ~c"ch13/todo_release/_build/prod/rel/todo/lib/compiler-8.3/ebin",
 ~c"ch13/todo_release/_build/prod/rel/todo/lib/elixir-1.15.0/ebin",
 ~c"ch13/todo_release/_build/prod/rel/todo/lib/sasl-4.2.1/ebin",
 ~c"ch13/todo_release/_build/prod/rel/todo/lib/logger-1.15.0/ebin",
 ~c"ch13/todo_release/_build/prod/rel/todo/lib/crypto-5.2/ebin",
 ~c"ch13/todo_release/_build/prod/rel/todo/lib/cowlib-2.12.1/ebin",
 ~c"ch13/todo_release/_build/prod/rel/todo/lib/asn1-5.1/ebin",
 ~c"ch13/todo_release/_build/prod/rel/todo/lib/public_key-1.14/ebin",
 ~c"ch13/todo_release/_build/prod/rel/todo/lib/ssl-11.0/ebin",
 ~c"ch13/todo_release/_build/prod/rel/todo/lib/ranch-1.8.0/ebin",
 ~c"ch13/todo_release/_build/prod/rel/todo/lib/cowboy-2.10.0/ebin",
 ~c"ch13/todo_release/_build/prod/rel/todo/lib/telemetry-1.2.1/ebin",
 ~c"ch13/todo_release/_build/prod/rel/todo/lib/cowboy_telemetry-0.4.0/ebin",
 ~c"ch13/todo_release/_build/prod/rel/todo/lib/eex-1.15.0/ebin",
 ~c"ch13/todo_release/_build/prod/rel/todo/lib/mime-2.0.3/ebin",
 ~c"ch13/todo_release/_build/prod/rel/todo/lib/plug_crypto-1.2.5/ebin",
 ~c"ch13/todo_release/_build/prod/rel/todo/lib/plug-1.14.2/ebin",
 ~c"ch13/todo_release/_build/prod/rel/todo/lib/plug_cowboy-2.6.1/ebin",
 ~c"ch13/todo_release/_build/prod/rel/todo/lib/poolboy-1.5.2/ebin",
 ~c"ch13/todo_release/_build/prod/rel/todo/lib/runtime_tools-2.0/ebin",
 ~c"ch13/todo_release/_build/prod/rel/todo/lib/todo-0.1.0/ebin",
 ~c"ch13/todo_release/_build/prod/rel/todo/lib/iex-1.15.0/ebin"]


Notice how all the load paths point to the release folder. In contrast, when you start a plain iex -S mix shell and run :code.get_path/0, you’ll see a much longer list of load paths, with some pointing to the build folder and others pointing to the system Elixir and Erlang installation paths. This should convince you that your release is self-contained. The runtime will only look for modules in the release folder.

In addition, the minimum Erlang binaries are included in the release. They reside in _build/prod/rel/todo/erts-X.Y, where X.Y corresponds to the runtime version number (which isn’t related to the Erlang version number). The fact that the Erlang runtime is included makes the release completely standalone. Moreover, it allows you to run multiple systems powered by different Elixir or Erlang versions on the same machine.

Configurations

Configuration files reside in the _build/prod/rel/todo/releases/0.1.0 folder, with 0.1.0 corresponding to the version of your todo application (as provided in mix.exs). The two most relevant files in this folder are vm.args and env.sh.

The vm.args file can be used to provide flags to the Erlang runtime, such as the +P flag, which sets the maximum number of running processes. The env.sh file can be used to set environment variables, such as RELEASE_NODE and RELEASE_DISTRIBUTION, mentioned earlier. For more details on how to provide your own versions of these files, see https://hexdocs.pm/mix/Mix.Tasks.Release.xhtml#module-vm-args-and-env-sh-env-bat.
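
To supply your own versions, you can generate the templates with mix release.init, which creates rel/vm.args.eex and rel/env.sh.eex in your project. Here's a sketch of what such templates might contain (the values are illustrative):

```shell
# rel/env.sh.eex: environment variables set on every start of the release
export RELEASE_DISTRIBUTION=sname
export RELEASE_NODE=todo@localhost

# rel/vm.args.eex would contain Erlang runtime flags instead, e.g.:
#   +P 2000000
```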

13.2.4 Packaging in a Docker container

There are many ways of running the system in production. You could deploy it to a platform as a service (PaaS), such as Heroku, Fly.io, or Gigalixir, or you could run it in a Kubernetes cluster. Yet another option is to run the system as a service under a service manager, such as systemd.

No matter which deployment strategy you choose, you should strive to run the system as an OTP release. In most cases, this means starting the release in the foreground. Therefore, the valid start commands are either start_iex or start.

The former command also starts the iex session. This allows you to attach to the iex shell of the running BEAM node and interact with the production system while capturing the node’s standard output. On the flip side, this approach is risky because you might end up accidentally stopping the node (by pressing Ctrl-C twice).

In contrast, the start command will start the system in foreground but without the iex session. Consequently, you won’t be able to attach to the main iex shell. You can still interact with the running system by establishing a remote iex shell session, but in this case, the node’s standard output isn’t captured.

Specific deployment steps depend on the chosen strategy. There are too many options to cover them all. A good basic introduction to some of the popular choices is given in the deployment guide of the Phoenix web framework (https://hexdocs.pm/phoenix/deployment.xhtml).

As a small example, let’s see how to run the to-do system inside a Docker container. Docker is a popular option chosen by many teams because it helps automate deployments, supports running a production-like version locally, and paves the way for various deployment options, especially in the cloud space. This part assumes you’re somewhat familiar with Docker. If that’s not the case, you can take a look at the official get started guide at https://docs.docker.com/get-started/.

The Docker image for an Elixir project is typically built in two stages. In the first stage, often called build, you need to compile the code and assemble the OTP release. Then, in the second stage, you copy the release over to the final image, which is going to be deployed to the target hosts. The final image doesn’t contain build tools, such as Erlang and Elixir. Such tools are not needed because the OTP release itself contains the minimum set of the required Erlang and Elixir binaries.

To build the Docker image, we need to create the file named Dockerfile in the project root. The following listing presents the first build stage, which produces the OTP release.

Listing 13.2 The build stage (todo_release/Dockerfile)

ARG ELIXIR="1.15.4"
ARG ERLANG="26.0.2"
ARG DEBIAN="bookworm-20230612-slim"
ARG OS="debian-${DEBIAN}"

# Base image
FROM "hexpm/elixir:${ELIXIR}-erlang-${ERLANG}-${OS}" as builder

WORKDIR /todo

# Use the prod Mix environment by default
ENV MIX_ENV="prod"

# Install build tools
RUN mix local.hex --force && mix local.rebar --force

# Copy the required source files
COPY mix.exs mix.lock ./
COPY config config
COPY lib lib

# Fetch prod dependencies
RUN mix deps.get --only prod

# Build the release
RUN mix release

...

The base Docker image used in this example is maintained by the Hex package manager team (https://hub.docker.com/r/hexpm/elixir).

It’s worth noting that, for the sake of brevity, this Dockerfile is deliberately simplistic: it doesn’t take advantage of Docker layer caching. As a result, a change in any source file will trigger a full recompilation of the project, including all the dependencies. For a more refined way of building the image, take a look at the Dockerfile generated by the Phoenix web framework (https://hexdocs.pm/phoenix/releases.xhtml#containers).

Next, let’s move on to build the final image.

Listing 13.3 Building the final image (todo_release/Dockerfile)

ARG DEBIAN="bookworm-20230612-slim"

...

# Base image
FROM debian:${DEBIAN}

WORKDIR "/todo"

RUN apt-get update -y && apt-get install -y openssl locales

# Copy the built release from the build stage
COPY \
  --from=builder \
  --chown=nobody:root \
  /todo/_build/prod/rel/todo ./

RUN sed -i '/en_US.UTF-8/s/^# //g' /etc/locale.gen && locale-gen
ENV LANG="en_US.UTF-8"
ENV LANGUAGE="en_US:en"
ENV LC_ALL="en_US.UTF-8"

# Define the start command
CMD ["/todo/bin/todo", "start_iex"]

The first thing to notice is that the base image is Debian, not Elixir or Erlang. It’s important to use the same base OS as the one used in the builder image. Otherwise, you might experience crashes due to incompatibilities.

To build the final image, you need to copy the OTP release from the build stage, configure the locale, and define the default start command. In this example, the start_iex command is chosen, which makes it possible to attach to the running shell.

At this point, you can build the image:

$ docker build . -t elixir-in-action/todo

Next, you can start the container:

$ docker run             \
    --rm -it             \
    --name todo_system   \
    -p "5454:5454"       \
    elixir-in-action/todo

The -p option publishes the HTTP port (5454) to the host.

You can now interact with the system locally:

$ curl -d "" \
  "http://localhost:5454/add_entry?list=bob&date=2023-12-19&title=Dentist"
OK
 
$ curl "http://localhost:5454/entries?list=bob&date=2023-12-19"
2023-12-19 Dentist

Like the build stage, the production image is deliberately simplistic. In particular, it doesn’t support clustering via distributed Erlang or establishing a remote shell (via the --remsh switch). This can be addressed with some work, but for the sake of brevity, it’s not discussed here. If you want to establish an Erlang cluster from multiple containers, especially if they are running in a Kubernetes cluster, take a look at the libcluster library (https://hexdocs.pm/libcluster/).

This concludes the topic of releases. Once you have your system up and running, it’s useful to see how you can analyze its behavior.

13.3 Analyzing system behavior

Even after the system is built and placed in production, your work isn’t done. Things will occasionally go wrong, and you’ll experience errors. The code also may not be properly optimized, and you may end up consuming too many resources. If you manage to properly implement a fault-tolerant system, it may recover and cope with the errors and increased load. Regardless, you’ll still need to get to the bottom of any issues and fix them.

Given that your system is highly concurrent and distributed, it may not be obvious how you can discover and understand the issues that arise. Proper treatment of this topic could easily fill a separate book—and an excellent free book is available, called Stuff Goes Bad: Erlang in Anger, by Fred Hébert (https://www.erlang-in-anger.com/). This chapter provides a basic introduction to some standard techniques of analyzing complex BEAM systems, but if you plan to run Elixir or Erlang code in production, you should at some point study the topic in more detail, and Stuff Goes Bad is a great place to start.

13.3.1 Debugging

Although it’s not strictly related to the running system, debugging deserves a brief mention. It may come as a surprise that standard step-by-step debugging isn’t a frequently used approach in Erlang (which ships with a GUI-based debugger; see https://www.erlang.org/doc/apps/debugger/debugger_chapter.xhtml). That’s because it’s impossible to do classical debugging of a highly concurrent system, where many things happen simultaneously. Imagine you set a breakpoint in a process. What should happen to other processes when the breakpoint is encountered? Should they continue running, or should they pause as well? Once you step over a line, should all other processes move forward by a single step? How should timeouts be handled? What happens if you’re debugging a distributed system? As you can see, there are many problems with classical debugging, due to the highly concurrent and distributed nature of BEAM-powered systems.

Instead of relying on a debugger, you should adopt more appropriate strategies. The key to understanding a highly concurrent system lies in logging and tracing. Once something goes wrong, you’ll want to have as much information as possible, which will allow you to find the cause of the problems.

The nice thing is that some logging is available out of the box in the form of Elixir’s logger application (https://hexdocs.pm/logger/Logger.xhtml). In particular, whenever an OTP-compliant process crashes (e.g., GenServer), an error is printed, together with a stack trace. The stack trace also contains file and line information, so this should serve as a good starting point for investigating the error.

Sometimes, the failure reason may not be obvious from the stack trace, and you’ll need more data. At development time, a primitive helper tool for this purpose is IO.inspect. Remember that IO.inspect takes an expression, prints its result, and returns it. This means you can surround any part of the code with IO.inspect (or pipe into it via |>) without affecting the behavior of the program. This is a simple technique that can help you quickly determine the cause of the problem, and I use it frequently when a new piece of code goes wrong. Placing IO.inspect to see how values were propagated to the failing location often helps me discover errors. Once I’m done fixing the problem, I remove the IO.inspect calls.
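For instance, an IO.inspect call can be spliced into the middle of a pipeline without changing its result (the data here is purely illustrative):

```elixir
# IO.inspect returns its input unchanged, so it can be inserted
# mid-pipeline. The :label option tags each printout.
result =
  [1, 2, 3]
  |> IO.inspect(label: "input")
  |> Enum.map(&(&1 * 2))
  |> IO.inspect(label: "doubled")
  |> Enum.sum()

# result is 12; the intermediate values were printed along the way
```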

A richer experience can be obtained with the dbg macro (https://hexdocs.pm/elixir/Kernel.xhtml#dbg/2). Similarly to IO.inspect, this macro generates the code that returns its input argument. As a result, any expression can be safely wrapped in dbg, as long as it’s not binding any variables. The dbg macro prints more detailed information, such as intermediate results of the pipe chain.
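For example, piping into dbg prints every intermediate step of the chain (the exact output shape depends on the Elixir version):

```elixir
# dbg prints each intermediate result of the pipeline, then returns
# the final value, so the behavior of the code is unchanged.
[1, 2, 3]
|> Enum.map(&(&1 * 2))
|> Enum.sum()
|> dbg()
```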

Another useful feature is pry, which allows you to temporarily stop execution in the iex shell and inspect the state of the system, such as variables that are in scope. For detailed instructions, refer to the IEx.pry/0 documentation (https://hexdocs.pm/iex/IEx.xhtml#pry/0). An overview of typical debugging techniques is also available on the official Elixir site at https://elixir-lang.org/getting-started/debugging.xhtml.

It goes without saying that automated tests can be of significant assistance. Testing individual parts in isolation can help you quickly discover and fix errors.

It’s also worth mentioning a couple of useful benchmarking and profiling tools. The most primitive one comes in the form of the :timer.tc/1 function (https://erlang.org/doc/man/timer.xhtml#tc-1), which takes a lambda, runs it, and returns its result together with the running time (in microseconds).

In addition, a few profiling tools are shipped with Erlang/OTP: cprof, eprof, and fprof. Elixir includes mix tasks for running these tools: mix profile.cprof, mix profile.eprof, and mix profile.fprof.

Finally, there are various benchmarking libraries available, such as Benchee (https://hexdocs.pm/benchee). I won't explain these in detail, so when you decide to profile, it's best to start with the official documentation of each tool, as well as the Erlang profiling guide at https://www.erlang.org/doc/efficiency_guide/profiling.xhtml.

13.3.2 Logging

Once you’re in production, you shouldn’t rely on IO.inspect or dbg calls anymore. Instead, it’s better to log various pieces of information that may help you understand what went wrong. For this purpose, you can rely on Elixir’s logger application. When you generate your Mix project, this dependency will be included automatically, and you’re encouraged to use logger to log various events. As already mentioned, logger automatically catches various BEAM reports, such as crash errors that happen in processes.

Logging information goes to the console by default. If you start your system as a release, the standard output will be forwarded to the log folder under the root folder of your release, and you'll later be able to find and analyze those errors.

Of course, you can write a custom logger handler, such as one that writes to syslog or sends log reports to a different machine. See the logger documentation for more details (https://hexdocs.pm/logger/Logger.xhtml). The logger application is mostly a wrapper around Erlang’s :logger module, so it’s also worth studying the Erlang logging guide (https://www.erlang.org/doc/apps/kernel/logger_chapter.xhtml).
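A minimal usage sketch (the messages and the :list metadata key are hypothetical):

```elixir
require Logger

# Each call returns :ok. Metadata (such as the hypothetical :list key)
# can be attached to an entry and included in the output via the
# logger configuration.
Logger.info("cache started")
Logger.warning("to-do list not found", list: "bob")
Logger.error("database connection failed")
```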

13.3.3 Interacting with the system

A substantial benefit of the Erlang runtime is that you can connect to the running node and interact with it in various ways. You can send messages to processes and stop or restart different processes (including supervisors) or OTP applications. It’s even possible to force the VM to reload the code for a module.

On top of this, all sorts of built-in functions allow you to gather data about the system and individual processes. For example, you can start a remote shell and use functions such as :erlang.system_info/1 and :erlang.memory/0 to get information about the runtime.

You can also get a list of all processes using Process.list/0 and then query each process in detail with Process.info/1, which returns information such as memory usage and the total number of instructions (known in Erlang as reductions) the process has executed. Such services pave the way for tools that can connect to the running system and present BEAM system information in a GUI.
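For example, the following sketch finds the five processes with the largest memory footprint:

```elixir
# Process.info(pid, :memory) returns {:memory, bytes}, or nil if the
# process has meanwhile terminated, so nil entries are filtered out.
top_five =
  Process.list()
  |> Enum.map(&{&1, Process.info(&1, :memory)})
  |> Enum.reject(fn {_pid, info} -> is_nil(info) end)
  |> Enum.sort_by(fn {_pid, {:memory, bytes}} -> bytes end, :desc)
  |> Enum.take(5)
```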

One example is the observer application, which you’ve seen in chapter 11. Being GUI-based, observer works only when there’s a windowing system in the host OS. On the production server, this usually isn’t the case. But you can start the observer locally and have it gather data from a remote node.

Let’s see this in action. You’ll start your system as a background service and then start another node on which you’ll run the observer application. The observer application will connect to the remote node, collect data from it, and present it in the GUI.

The production system doesn’t need to run the observer application, but it needs to contain the modules that gather data for the remote observer application. These modules are part of the runtime_tools application you need to include in your release. You can easily do this via the :extra_applications option in mix.exs.

Listing 13.4 Including runtime_tools in a release (todo_release/mix.exs)

defmodule Todo.MixProject do
  ...
 
  def application do
    [
      extra_applications: [:logger, :runtime_tools],   
      ...
    ]
  end
 
  ...
end

Includes runtime_tools in the OTP release

The :extra_applications option specifies Elixir and Erlang stock OTP applications you depend on. By default, Elixir’s :logger OTP application is included as a dependency when you generate a new project with the mix tool.

NOTE Notice that :extra_applications serves a different purpose than the deps function in the mix.exs file. With deps, you list third-party dependencies that must be fetched and compiled. In contrast, with :extra_applications, you list Elixir and Erlang stock applications that are already compiled on your disk, as a part of Erlang and Elixir installations. The code of these dependencies doesn't have to be fetched, and nothing needs to be compiled. But you still need to list these dependencies to ensure applications are included in the OTP release.

With this change, runtime_tools is included in your OTP release, and now, you can remotely observe the production system. Let’s see this in action. First, you need to start the to-do system in the background:

$ RELEASE_NODE="todo@localhost" \
  RELEASE_COOKIE="todo" \
  _build/prod/rel/todo/bin/todo daemon

Note that the RELEASE_COOKIE OS environment variable is set to configure the secret node cookie.

Now, start the interactive shell as a named node, and then start the observer application:

$ iex --hidden --sname observer@localhost --cookie todo
 
iex(observer@localhost)1> :observer.start()

Note how you explicitly set the node’s cookie to match the one used in the running system. Also, just as with the earlier remsh example in section 13.1.1, you start the node as hidden. Once the observer is started, you need to select Nodes > todo@localhost from the menu. At this point, observer is presenting the data about the production node.

It’s worth mentioning that observer and runtime_tools are written in plain Erlang and rely on lower-level functions to gather data and present it in various ways. Therefore, you can use other kinds of frontends or even write your own. One example is observer_cli (https://github.com/zhongwencool/observer_cli), an observer-like frontend with a textual interface, which can be used via the command-line interface.

13.3.4 Tracing

It’s also possible to turn on traces related to processes and function calls, relying on services from the :sys (https://www.erlang.org/doc/man/sys.xhtml) and :dbg (https://www.erlang.org/doc/man/dbg.xhtml) modules. The :sys module allows you to trace OTP-compliant processes (e.g., GenServer). Tracing is done on the standard output, so you need to attach to the system (as opposed to establishing a remote shell). Then, you can turn on tracing for a particular process with the help of :sys.trace/2.

Let’s see it in action. Make sure that the node is not running, and then start it in the background with iex started:

$ TODO_SERVER_EXPIRY=600 \
  RELEASE_NODE="todo@localhost" \
  RELEASE_COOKIE="todo" \
  _build/prod/rel/todo/bin/todo daemon_iex

For the purpose of this demo, the todo server expiry time is increased to 10 minutes.

Now, you can attach to the running node and trace the process:

$ _build/prod/rel/todo/erts-13.0/bin/to_erl _build/prod/rel/todo/tmp/pipe/
 
iex(todo@localhost)1> :sys.trace(Todo.Cache.server_process("bob"), true)

This turns on console tracing. Information about process-related events, such as received requests, will be printed to the standard output.

Now, issue an HTTP request for Bob’s list:

$ curl "http://localhost:5454/entries?list=bob&date=2023-12-19"

Back in the attached shell, you should see something like this:

*DBG* {todo_server,<<"bob">>} got call {entries,
  #{'__struct__' => 'Elixir.Date', calendar => 'Elixir.Calendar.ISO',
    day => 19, month => 12, year => 2023}} from <0.983.0>
 
*DBG* {todo_server,<<"bob">>} sent [] to <0.322.0>,
  new state {<<"bob">>, #{'__struct__' => 'Elixir.Todo.List',
    next_id => 1, entries => #{}}}

The output may seem a bit cryptic, but if you look carefully, you can see two trace entries: one for the received call request and another for the reply the server sent. You can also see the full state of the server process. Keep in mind that all terms are printed in Erlang syntax.

Tracing is a powerful tool because it allows you to analyze the behavior of the running system. But be careful because excessive tracing may hurt the system’s performance. If the server process you’re tracing is heavily loaded or has a huge state, BEAM will spend a lot of time doing tracing I/O, which may slow down the entire system.

In any case, once you’ve gathered some knowledge about the process, you should stop tracing it:

iex(todo@localhost)1> :sys.trace(Todo.Cache.server_process("bob"), false)

Other useful services from :sys allow you to get the OTP process state (:sys.get_state/1) and even change it (:sys.replace_state/2). Those functions are meant to be used purely for debugging or hacky manual fixes—you shouldn’t invoke them from your code.
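To illustrate, here's a self-contained sketch using a hypothetical Demo.Counter module (not part of the to-do system), showing how the state of a GenServer can be inspected and patched without adding any code to the server module:

```elixir
# A tiny GenServer used only to demonstrate :sys.get_state/1 and
# :sys.replace_state/2 (hypothetical module).
defmodule Demo.Counter do
  use GenServer

  def start_link(initial), do: GenServer.start_link(__MODULE__, initial)

  @impl GenServer
  def init(initial), do: {:ok, initial}
end

{:ok, pid} = Demo.Counter.start_link(42)

# Peek at the current server state (debugging only).
42 = :sys.get_state(pid)

# Patch the state in place; the lambda maps the old state to the new one.
:sys.replace_state(pid, fn state -> state + 1 end)
43 = :sys.get_state(pid)
```

In the to-do system, the same calls could be issued from a remote shell against the pid returned by Todo.Cache.server_process/1, but again, only as a debugging aid or a hacky manual fix.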

Another useful tracing tool comes with the :erlang.trace/3 function (https://www.erlang.org/doc/man/erlang.xhtml#trace-3), which allows you to subscribe to events in the system such as message passing or function calls.

Additionally, a module called :dbg (https://www.erlang.org/doc/man/dbg.xhtml) simplifies tracing. You can run :dbg directly on the attached console, but it’s also possible to start another node and make it trace the main system. This is the route you’ll take in the next example.

Assuming the to-do node is still running, start another node:

$ iex --sname tracer@localhost --cookie todo --hidden

Now, on the tracer node, start tracing the main todo node, and then specify that you’re interested in all calls to functions from the Todo.Server module:

iex(tracer@localhost)1> :dbg.tracer()                
iex(tracer@localhost)2> :dbg.n(:"todo@localhost")    
iex(tracer@localhost)3> :dbg.p(:all, [:call])        
iex(tracer@localhost)4> :dbg.tp(Todo.Server, [])     

Starts the tracer process

Subscribes only to events from the todo node

Subscribes to function calls in all processes

Sets the trace pattern to all functions from the Todo.Server process

With traces set up, you can make an HTTP request to retrieve Bob’s entries. In the shell of the tracer node, you should see something like the following:

(<12505.1106.0>) call 'Elixir.Todo.Server':whereis(<<"bob">>)
(<12505.1106.0>) call 'Elixir.Todo.Server':child_spec(<<"bob">>)
(<12505.1012.0>) call 'Elixir.Todo.Server':start_link(<<"bob">>)
(<12505.1107.0>) call 'Elixir.Todo.Server':init(<<"bob">>)
(<12505.1107.0>) call 'Elixir.Todo.Server':handle_continue(init, ...)
(<12505.1106.0>) call 'Elixir.Todo.Server':entries(<12505.1107.0>, ...)
(<12505.1107.0>) call 'Elixir.Todo.Server':handle_call({entries, ...})

Each output line shows the caller process, the invoked function, and the input arguments.

Be careful about tracing in production because huge numbers of traces may flood the system. Once you’re finished tracing, invoke :dbg.stop_clear/0 to stop all traces.

This was, admittedly, a brief demo; :dbg has many more options. If you decide to do some tracing, you should look at the :dbg documentation. In addition, you should take a look at the library called Recon (https://github.com/ferd/recon), which provides many useful functions for analyzing a running BEAM node.

We’re now finished exploring Elixir, Erlang, and OTP. This book covered the primary aspects of the Elixir language, basic functional programming idioms, the Erlang concurrency model, and the most frequently used OTP behaviors (GenServer, Supervisor, and Application). In my experience, these are the essential building blocks of Elixir and Erlang systems.

Of course, many topics have been left untreated, so your journey doesn’t stop here. You’ll probably want to look for other knowledge resources, such as other books, blogs, and podcasts. A good starting place to look for further material is the “Learning” page on the official Elixir site (https://elixir-lang.org/learning.xhtml).

Summary