24 Coroutines

We do not need coroutines very often, but when we do, it is an unparalleled feature. Coroutines can literally turn upside-down the relationship between callers and callees, and this flexibility solves what I call the “who-is-the-boss” (or “who-has-the-main-loop”) problem in software architecture. This is a generalization of several seemingly unrelated problems, such as entanglement in event-driven programs, building iterators through generators, and cooperative multithreading. Coroutines solve all these problems in simple and efficient ways.

A coroutine is similar to a thread (in the sense of multithreading): it is a line of execution, with its own stack, its own local variables, and its own instruction pointer; it shares global variables and mostly anything else with other coroutines. The main difference between threads and coroutines is that a multithreaded program runs several threads in parallel, while coroutines are collaborative: at any given time, a program with coroutines is running only one of its coroutines, and this running coroutine suspends its execution only when it explicitly requests to be suspended.

In this chapter we will cover how coroutines work in Lua and how we can use them to solve a diverse set of problems.

Coroutine Basics

Lua packs all its coroutine-related functions in the table coroutine. The function create creates new coroutines. It has a single argument, a function with the code that the coroutine will run (the coroutine body). It returns a value of type "thread", which represents the new coroutine. Often, the argument to create is an anonymous function, like here:

      co = coroutine.create(function () print("hi") end)
      print(type(co))           --> thread

A coroutine can be in one of four states: suspended, running, normal, and dead. We can check the state of a coroutine with the function coroutine.status:

      print(coroutine.status(co))   --> suspended

When we create a coroutine, it starts in the suspended state; a coroutine does not run its body automatically when we create it. The function coroutine.resume (re)starts the execution of a coroutine, changing its state from suspended to running:

      coroutine.resume(co)   --> hi

(If you run this code in interactive mode, you may want to finish the previous line with a semicolon, to suppress the display of the result from resume.) In this first example, the coroutine body simply prints "hi" and terminates, leaving the coroutine in the dead state:

      print(coroutine.status(co))   --> dead

Until now, coroutines look like nothing more than a complicated way to call functions. The real power of coroutines stems from the function yield, which allows a running coroutine to suspend its own execution so that it can be resumed later. Let us see a simple example:

      co = coroutine.create(function ()
             for i = 1, 10 do
               print("co", i)
               coroutine.yield()
             end
           end)

Now, the coroutine body does a loop, printing numbers and yielding after each print. When we resume this coroutine, it starts its execution and runs until the first yield:

      coroutine.resume(co)    --> co   1

If we check its status, we can see that the coroutine is suspended and, therefore, can be resumed:

      print(coroutine.status(co))   --> suspended

From the coroutine’s point of view, all activity that happens while it is suspended is happening inside its call to yield. When we resume the coroutine, this call to yield finally returns and the coroutine continues its execution until the next yield or until its end:

      coroutine.resume(co)    --> co   2
      coroutine.resume(co)    --> co   3
        ...
      coroutine.resume(co)    --> co   10
      coroutine.resume(co)    -- prints nothing

During the last call to resume, the coroutine body finishes the loop and then returns, without printing anything. If we try to resume it again, resume returns false plus an error message:

      print(coroutine.resume(co))
        --> false   cannot resume dead coroutine

Note that resume runs in protected mode, like pcall. Therefore, if there is any error inside a coroutine, Lua will not show the error message, but instead will return it to the resume call.

When a coroutine resumes another, it is not suspended; after all, we cannot resume it. However, it is not running either, because the running coroutine is the other one. So, its own status is what we call the normal state.

A useful facility in Lua is that a pair resume–yield can exchange data. The first resume, which has no corresponding yield waiting for it, passes its extra arguments to the coroutine main function:

      co = coroutine.create(function (a, b, c)
             print("co", a, b, c + 2)
           end)
      coroutine.resume(co, 1, 2, 3)    --> co  1  2  5

A call to coroutine.resume returns, after the true that signals no errors, any arguments passed to the corresponding yield:

      co = coroutine.create(function (a,b)
             coroutine.yield(a + b, a - b)
           end)
      print(coroutine.resume(co, 20, 10))  --> true  30  10

Symmetrically, coroutine.yield returns any extra arguments passed to the corresponding resume:

      co = coroutine.create (function (x)
             print("co1", x)
             print("co2", coroutine.yield())
           end)
      coroutine.resume(co, "hi")     --> co1  hi
      coroutine.resume(co, 4, 5)     --> co2  4  5

Finally, when a coroutine ends, any values returned by its main function go to the corresponding resume:

      co = coroutine.create(function ()
             return 6, 7
           end)
      print(coroutine.resume(co))   --> true  6  7

We seldom use all these facilities in the same coroutine, but all of them have their uses.

Although the general concept of coroutines is well understood, the details vary considerably. So, for those that already know something about coroutines, it is important to clarify these details before we go on. Lua offers what we call asymmetric coroutines. This means that it has a function to suspend the execution of a coroutine and a different function to resume a suspended coroutine. Some other languages offer symmetric coroutines, where there is only one function to transfer control from one coroutine to another.

Some people call asymmetric coroutines semi-coroutines. However, other people use the same term semi-coroutine to denote a restricted implementation of coroutines, where a coroutine can suspend its execution only when it is not calling any function, that is, when it has no pending calls in its control stack. In other words, only the main body of such semi-coroutines can yield. (A generator in Python is an example of this meaning of semi-coroutines.)

Unlike the difference between symmetric and asymmetric coroutines, the difference between coroutines and generators (as presented in Python) is a deep one; generators are simply not powerful enough to implement some of the most interesting constructions that we can write with full coroutines. Lua offers full, asymmetric coroutines. Those that prefer symmetric coroutines can implement them on top of the asymmetric facilities of Lua (see Exercise 24.6).

Who Is the Boss?

One of the most paradigmatic examples of coroutines is the producer–consumer problem. Let us suppose that we have a function that continually produces values (e.g., reading them from a file) and another function that continually consumes these values (e.g., writing them to another file). These two functions could look like this:

      function producer ()
        while true do
          local x = io.read()     -- produce new value
          send(x)                 -- send it to consumer
        end
      end
      
      function consumer ()
        while true do
          local x = receive()        -- receive value from producer
          io.write(x, "\n")          -- consume it
        end
      end

(To simplify this example, both the producer and the consumer run forever. It is not hard to change them to stop when there is no more data to handle.) The problem here is how to match send with receive. It is a typical instance of the “who-has-the-main-loop” problem. Both the producer and the consumer are active, both have their own main loops, and both assume that the other is a callable service. For this particular example, it is easy to change the structure of one of the functions, unrolling its loop and making it a passive agent. However, this change of structure may be far from easy in other real scenarios.

Coroutines provide an ideal tool to match producers and consumers without changing their structure, because a resume–yield pair turns upside-down the typical relationship between the caller and its callee. When a coroutine calls yield, it does not enter into a new function; instead, it returns a pending call (to resume). Similarly, a call to resume does not start a new function, but returns a call to yield. This property is exactly what we need to match a send with a receive in such a way that each one acts as if it were the master and the other the slave. (That is why I called this the “who-is-the-boss” problem.) So, receive resumes the producer, so that it can produce a new value; and send yields the new value back to the consumer:

      function receive ()
        local status, value = coroutine.resume(producer)
        return value
      end
      
      function send (x)
        coroutine.yield(x)
      end

Of course, the producer must now run inside a coroutine:

      producer = coroutine.create(producer)

In this design, the program starts by calling the consumer. When the consumer needs an item, it resumes the producer, which runs until it has an item to give to the consumer, and then stops until the consumer resumes it again. Therefore, we have what we call a consumer-driven design. Another way to write the program is to use a producer-driven design, where the consumer is the coroutine. Although the details seem reversed, the overall idea of both designs is the same.

We can extend this design with filters, which are tasks that sit between the producer and the consumer doing some kind of transformation in the data. A filter is a consumer and a producer at the same time, so it resumes a producer to get new values and yields the transformed values to a consumer. As a trivial example, we can add to our previous code a filter that inserts a line number at the beginning of each line. The code is in Figure 24.1, “Producer–consumer with filters”.

Figure 24.1. Producer–consumer with filters

      function receive (prod)
        local status, value = coroutine.resume(prod)
        return value
      end
      
      function send (x)
        coroutine.yield(x)
      end
      
      function producer ()
        return coroutine.create(function ()
          while true do
            local x = io.read()     -- produce new value
            send(x)
          end
        end)
      end
      
      function filter (prod)
        return coroutine.create(function ()
          for line = 1, math.huge do
            local x = receive(prod)   -- get new value
            x = string.format("%5d %s", line, x)
            send(x)      -- send it to consumer
          end
        end)
      end
      
      function consumer (prod)
        while true do
          local x = receive(prod)    -- get new value
          io.write(x, "\n")          -- consume new value
        end
      end
      
      consumer(filter(producer()))

Its last line simply creates the components it needs, connects them, and starts the final consumer.

If you thought about POSIX pipes after reading the previous example, you are not alone. After all, coroutines are a kind of (non-preemptive) multithreading. With pipes, each task runs in a separate process; with coroutines, each task runs in a separate coroutine. Pipes provide a buffer between the writer (producer) and the reader (consumer) so there is some freedom in their relative speeds. This is important in the context of pipes, because the cost of switching between processes is high. With coroutines, the cost of switching between tasks is much smaller (roughly equivalent to a function call), so the writer and the reader can run hand in hand.

Coroutines as Iterators

We can see loop iterators as a particular example of the producer–consumer pattern: an iterator produces items to be consumed by the loop body. Therefore, it seems appropriate to use coroutines to write iterators. Indeed, coroutines provide a powerful tool for this task. Again, the key feature is their ability to turn inside out the relationship between caller and callee. With this feature, we can write iterators without worrying about how to keep state between successive calls.

To illustrate this kind of use, let us write an iterator to traverse all permutations of a given array. It is not an easy task to write directly such an iterator, but it is not so difficult to write a recursive function that generates all these permutations. The idea is simple: put each array element in the last position, in turn, and recursively generate all permutations of the remaining elements. The code is in Figure 24.2, “A function to generate permutations”.

Figure 24.2. A function to generate permutations

      function permgen (a, n)
        n = n or #a          -- default for 'n' is size of 'a'
        if n <= 1 then       -- nothing to change?
          printResult(a)
        else
          for i = 1, n do
      
            -- put i-th element as the last one
            a[n], a[i] = a[i], a[n]
      
            -- generate all permutations of the other elements
            permgen(a, n - 1)
      
            -- restore i-th element
            a[n], a[i] = a[i], a[n]
          end
        end
      end

To put it to work, we must define an appropriate printResult function and call permgen with proper arguments:

      function printResult (a)
        for i = 1, #a do io.write(a[i], " ") end
        io.write("\n")
      end
      
      permgen ({1,2,3,4})
        --> 2 3 4 1
        --> 3 2 4 1
        --> 3 4 2 1
          ...
        --> 2 1 3 4
        --> 1 2 3 4

After we have the generator ready, it is an automatic task to convert it to an iterator. First, we change printResult to yield:

      function permgen (a, n)
        n = n or #a
        if n <= 1 then
          coroutine.yield(a)
        else
          as before

Then, we define a factory that arranges for the generator to run inside a coroutine and creates the iterator function. The iterator simply resumes the coroutine to produce the next permutation:

      function permutations (a)
        local co = coroutine.create(function () permgen(a) end)
        return function ()   -- iterator
          local code, res = coroutine.resume(co)
          return res
        end
      end

With this machinery in place, it is trivial to iterate over all permutations of an array with a for statement:

      for p in permutations{"a", "b", "c"} do
        printResult(p)
      end
        --> b c a
        --> c b a
        --> c a b
        --> a c b
        --> b a c
        --> a b c

The function permutations uses a common pattern in Lua, which packs a call to resume with its corresponding coroutine inside a function. This pattern is so common that Lua provides a special function for it: coroutine.wrap. Like create, wrap creates a new coroutine. Unlike create, wrap does not return the coroutine itself; instead, it returns a function that, when called, resumes the coroutine. Unlike the original resume, that function does not return an error code as its first result; instead, it raises the error in case of error. Using wrap, we can write permutations as follows:

      function permutations (a)
        return coroutine.wrap(function () permgen(a) end)
      end

Usually, coroutine.wrap is simpler to use than coroutine.create. It gives us exactly what we need from a coroutine: a function to resume it. However, it is also less flexible. There is no way to check the status of a coroutine created with wrap. Moreover, we cannot check for runtime errors.

Event-Driven Programming

It may not be obvious at first sight, but the typical entanglement created by conventional event-driven programming is another consequence of the who-is-the-boss problem.

In a typical event-driven platform, an external entity generates events to our program in a so-called event loop (or run loop). It is clear who is the boss there, and it is not our code. Our program becomes a slave of the event loop, and that makes it a collection of individual event handlers without any apparent connection.

To make things a little more concrete, let us assume that we have an asynchronous I/O library similar to libuv. The library has four functions that concern our small example:

        lib.runloop();
        lib.readline(stream, callback);
        lib.writeline(stream, line, callback);
        lib.stop();

The first function runs the event loop, which will process the incoming events and call the associated callbacks. A typical event-driven program initializes some stuff and then calls this function, which becomes the main loop of the application. The second function instructs the library to read a line of the given stream and, when it is done, to call the given callback function with the result. The third function is similar to the second, but for writing a line. The last function breaks the event loop, usually to finish the program.

Figure 24.3, “An ugly implementation of the asynchronous I/O library” presents an implementation for such a library.

Figure 24.3. An ugly implementation of the asynchronous I/O library

      local cmdQueue = {}     -- queue of pending operations
      
      local lib = {}
      
      function lib.readline (stream, callback)
        local nextCmd = function ()
          callback(stream:read())
        end
        table.insert(cmdQueue, nextCmd)
      end
      
      function lib.writeline (stream, line, callback)
        local nextCmd = function ()
          callback(stream:write(line))
        end
        table.insert(cmdQueue, nextCmd)
      end
      
      function lib.stop ()
        table.insert(cmdQueue, "stop")
      end
      
      function lib.runloop ()
        while true do
          local nextCmd = table.remove(cmdQueue, 1)
          if nextCmd == "stop" then
            break
          else
            nextCmd()       -- perform next operation
          end
        end
      end
      
      return lib

It is a very ugly implementation. Its “event queue” is in fact a list of pending operations that, when performed (synchronously!), will generate the events. Despite its uglyness, it fulfills the previous specification and, therefore, allows us to test the following examples without the need for a real asynchronous library.

Let us now write a trivial program with that library, which reads all lines from its input stream into a table and writes them to the output stream in reverse order. With traditional I/O, the program would be like this:

      local t = {}
      local inp = io.input()                 -- input stream
      local out = io.output()                -- output stream
      
      for line in inp:lines() do
        t[#t + 1] = line
      end
      
      for i = #t, 1, -1 do
        out:write(t[i], "\n")
      end

Now we rewrite that program in an event-driven style on top of the asynchronous I/O library; the result is in Figure 24.4, “Reversing a file in event-driven fashion”.

Figure 24.4. Reversing a file in event-driven fashion

      local lib = require "async-lib"
      
      local t = {}
      local inp = io.input()
      local out = io.output()
      local i
      
      -- write-line handler
      local function putline ()
        i = i - 1
        if i == 0 then       -- no more lines?
          lib.stop()         -- finish the main loop
        else                 -- write line and prepare next one
          lib.writeline(out, t[i] .. "\n", putline)
        end
      end
      
      -- read-line handler
      local function getline (line)
        if line then                      -- not EOF?
          t[#t + 1] = line                -- save line
          lib.readline(inp, getline)      -- read next one
        else                              -- end of file
          i = #t + 1                      -- prepare write loop
          putline()                       -- enter write loop
        end
      end
      
      lib.readline(inp, getline)          -- ask to read first line
      lib.runloop()                       -- run the main loop

As is typical in an event-driven scenario, all our loops are gone, because the main loop is in the library. They got replaced by recursive calls disguised as events. We could improve things by using closures in a continuation-passing style, but we still could not write our own loops; we would have to rewrite them through recursion.

Coroutines allow us to reconcile our loops with the event loop. The key idea is to run our main code as a coroutine that, at each request to the library, sets the callback as a function to resume itself and then yields. Figure 24.5, “Running synchronous code on top of the asynchronous library” uses this idea to implement a library that runs conventional, synchronous code on top of the asynchronous I/O library.

Figure 24.5. Running synchronous code on top of the asynchronous library

      local lib = require "async-lib"
      
      function run (code)
        local co = coroutine.wrap(function ()
          code()
          lib.stop()     -- finish event loop when done
        end)
        co()             -- start coroutine
        lib.runloop()    -- start event loop
      end
      
      function putline (stream, line)
        local co = coroutine.running()     -- calling coroutine
        local callback = (function () coroutine.resume(co) end)
        lib.writeline(stream, line, callback)
        coroutine.yield()
      end
      
      function getline (stream, line)
        local co = coroutine.running()     -- calling coroutine
        local callback = (function (l) coroutine.resume(co, l) end)
        lib.readline(stream, callback)
        local line = coroutine.yield()
        return line
      end

As its name implies, the run function runs the synchronous code, which it takes as a parameter. It first creates a coroutine to run the given code and finish the event loop when it is done. Then, it resumes this coroutine (which will yield at its first I/O call) and then enters the event loop.

The functions getline and putline simulate synchronous I/O. As outlined, both call an appropriate asynchronous function passing as the callback a function that resumes the calling coroutine. (Note the use of the coroutine.running function to access the calling coroutine.) After that, they yield, and the control goes back to the event loop. Once the operation completes, the event loop calls the callback, resuming the coroutine that triggered the operation.

With that library in place, we are ready to run synchronous code on the top of the asynchronous library. As an example, the following fragment implements once more our reverse-lines example:

      run(function ()
        local t = {}
        local inp = io.input()
        local out = io.output()
      
        while true do
          local line = getline(inp)
          if not line then break end
          t[#t + 1] = line
        end
      
        for i = #t, 1, -1 do
          putline(out, t[i] .. "\n")
        end
      end)

The code is equal to the original synchronous one, except that it uses get/putline for I/O and runs inside a call to run. Underneath its synchronous structure, it actually runs in an event-driven fashion, and it is fully compatible with other parts of the program written in a more typical event-driven style.

Exercises

Exercise 24.1: Rewrite the producer–consumer example in the section called “Who Is the Boss?” using a producer-driven design, where the consumer is the coroutine and the producer is the main thread.

Exercise 24.2: Exercise 6.5 asked you to write a function that prints all combinations of the elements in a given array. Use coroutines to transform this function into a generator for combinations, to be used like here:

      for c in combinations({"a", "b", "c"}, 2) do
        printResult(c)
      end

Exercise 24.3: In Figure 24.5, “Running synchronous code on top of the asynchronous library”, both the functions getline and putline create a new closure each time they are called. Use memorization to avoid this waste.

Exercise 24.4: Write a line iterator for the coroutine-based library (Figure 24.5, “Running synchronous code on top of the asynchronous library”), so that you can read the file with a for loop.

Exercise 24.5: Can you use the coroutine-based library (Figure 24.5, “Running synchronous code on top of the asynchronous library”) to run multiple threads concurrently? What would you have to change?

Exercise 24.6: Implement a transfer function in Lua. If you think about resume–yield as similar to call–return, a transfer would be like a goto: it suspends the running coroutine and resumes any other coroutine, given as an argument. (Hint: use a kind of dispatch to control your coroutines. Then, a transfer would yield to the dispatch signaling the next coroutine to run, and the dispatch would resume that next coroutine.)