16 Compilation, Execution, and Errors

Although we refer to Lua as an interpreted language, Lua always precompiles source code to an intermediate form before running it. (This is not a big deal: many interpreted languages do the same.) The presence of a compilation phase may sound out of place in an interpreted language. However, the distinguishing feature of interpreted languages is not that they are not compiled, but that it is possible (and easy) to execute code generated on the fly. We may say that the presence of a function like dofile is what entitles us to call Lua an interpreted language.

In this chapter, we will discuss in more detail the process that Lua uses for running its chunks: what compilation means (and does), how Lua runs the compiled code, and how it handles errors in that process.

Previously, we introduced dofile as a kind of primitive operation to run chunks of Lua code, but dofile is actually an auxiliary function: the function loadfile does the hard work. Like dofile, loadfile loads a Lua chunk from a file, but it does not run the chunk. Instead, it only compiles the chunk and returns the compiled chunk as a function. Moreover, unlike dofile, loadfile does not raise errors, but instead returns error codes. We could define dofile as follows:

      function dofile (filename)
        local f = assert(loadfile(filename))
        return f()
      end

Note the use of assert to raise an error if loadfile fails.

For simple tasks, dofile is handy, because it does the complete job in one call. However, loadfile is more flexible. In case of error, loadfile returns nil plus the error message, which allows us to handle the error in customized ways. Moreover, if we need to run a file several times, we can call loadfile once and call its result several times. This approach is much cheaper than several calls to dofile, because it compiles the file only once. (Compilation is a somewhat expensive operation when compared to other tasks in the language.)

The function load is similar to loadfile, except that it reads its chunk from a string or from a function, not from a file.[16] For instance, consider the next line:

      f = load("i = i + 1")

After this code, f will be a function that executes i = i + 1 when invoked:

      i = 0
      f(); print(i)   --> 1
      f(); print(i)   --> 2

The function load is powerful; we should use it with care. It is also an expensive function (when compared to some alternatives) and can result in incomprehensible code. Before you use it, make sure that there is no simpler way to solve the problem at hand.

If we want to do a quick-and-dirty dostring (i.e., to load and run a chunk), we can call the result from load directly:

      load(s)()

However, if there is any syntax error, load will return nil and the final error message will be something like attempt to call a nil value. For clearer error messages, it is better to use assert:

      assert(load(s))()
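When aborting is not what we want, we can inspect the results of load ourselves. The following is only a minimal sketch: the malformed string s and the wording of the printed message are illustrative choices, not part of any library.

```lua
-- A deliberately malformed chunk (illustrative only).
local s = "x = = 1"

-- load does not raise an error; it returns nil plus a message.
local f, err = load(s)
if f then
  f()                                          -- the chunk compiled: run it
else
  print("could not compile chunk: " .. err)    -- customized handling
end
```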

Usually, it does not make sense to use load on a literal string. For instance, the next two lines are roughly equivalent:

      f = load("i = i + 1")
      
      f = function () i = i + 1 end

However, the second line is much faster, because Lua compiles the function together with its enclosing chunk. In the first line, the call to load involves a separate compilation.

Because load does not compile with lexical scoping, the two lines in the previous example may not be truly equivalent. To see the difference, let us change the example a little:

      i = 32
      local i = 0
      f = load("i = i + 1; print(i)")
      g = function () i = i + 1; print(i) end
      f()             --> 33
      g()             --> 1

The function g manipulates the local i, as expected, but f manipulates a global i, because load always compiles its chunks in the global environment.

The most typical use of load is to run external code (that is, pieces of code that come from outside our program) or dynamically-generated code. For instance, we may want to plot a function defined by the user; the user enters the function code and then we use load to evaluate it. Note that load expects a chunk, that is, statements. If we want to evaluate an expression, we can prefix the expression with return, so that we get a statement that returns the value of the given expression. See the example:

      print "enter your expression:"
      local line = io.read()
      local func = assert(load("return " .. line))
      print("the value of your expression is " .. func())

Because the function returned by load is a regular function, we can call it several times:

      print "enter function to be plotted (with variable 'x'):"
      local line = io.read()
      local f = assert(load("return " .. line))
      for i = 1, 20 do
        x = i   -- global 'x' (to be visible from the chunk)
        print(string.rep("*", f()))
      end

We can also call load with a reader function as its first argument. A reader function can return the chunk in parts; load calls the reader successively until it returns nil, which signals the chunk’s end. As an example, the next call is equivalent to loadfile:

      f = load(io.lines(filename, "*L"))

As we saw in Chapter 7, The External World, the call io.lines(filename, "*L") returns a function that, at each call, returns a new line from the given file. So, load will read the chunk from the file line by line. The following version is similar, but slightly more efficient:

      f = load(io.lines(filename, 1024))

Here, the iterator returned by io.lines reads the file in blocks of 1024 bytes.
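A reader does not have to come from io.lines; any function that returns successive pieces will do. As a rough sketch, the hypothetical helper make_reader below (the name is ours, not a library function) feeds load from a list of string fragments:

```lua
-- Hypothetical helper: returns a reader that hands out the pieces one by one.
local function make_reader (pieces)
  local i = 0
  return function ()
    i = i + 1
    return pieces[i]    -- nil after the last piece signals the chunk's end
  end
end

-- The pieces need not respect statement boundaries; load just joins them.
local reader = make_reader({"local a, b = 10", ", 20; ", "return a ", "+ b"})
local f = assert(load(reader))
print(f())    --> 30
```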

Lua treats any independent chunk as the body of an anonymous variadic function. For instance, load("a = 1") returns the equivalent of the following expression:

      function (...) a = 1 end

Like any other function, chunks can declare local variables:

      f = load("local a = 10; print(a + 20)")
      f()          --> 30

Using these features, we can rewrite our plot example to avoid the use of a global variable x:

      print "enter function to be plotted (with variable 'x'):"
      local line = io.read()
      local f = assert(load("local x = ...; return " .. line))
      for i = 1, 20 do
        print(string.rep("*", f(i)))
      end

In this code, we prepend the declaration "local x = ..." to the chunk to declare x as a local variable. We then call f with an argument i that becomes the value of the vararg expression (...).

The functions load and loadfile never raise errors. In case of any kind of error, they return nil plus an error message:

      print(load("i i"))
        --> nil     [string "i i"]:1: '=' expected near 'i'

Moreover, these functions never have any kind of side effect, that is, they do not change or create variables, do not write to files, etc. They only compile the chunk to an internal representation and return the result as an anonymous function. A common mistake is to assume that loading a chunk defines functions. In Lua, function definitions are assignments; as such, they happen at runtime, not at compile time. For instance, suppose we have a file foo.lua like this:

      -- file 'foo.lua'
      function foo (x)
        print(x)
      end

We then run the command

      f = loadfile("foo.lua")

This command compiles foo but does not define it. To define it, we must run the chunk:

      f = loadfile("foo.lua")
      print(foo)    --> nil
      f()           -- run the chunk
      foo("ok")     --> ok

This behavior may sound strange, but it becomes clear if we rewrite the file without the syntactic sugar:

      -- file 'foo.lua'
      foo = function (x)
        print(x)
      end
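We can observe the same effect without an external file by passing the definition to load as a string. This is only a sketch; the global name greet is hypothetical and is assumed not to exist beforehand:

```lua
-- Compile a chunk that, when run, assigns a function to the global 'greet'.
local chunk = assert(load("function greet (name) return 'hello, ' .. name end"))

print(greet)           --> nil   (compiled, but the assignment has not run yet)
chunk()                -- run the chunk: now the assignment happens
print(greet("Lua"))    --> hello, Lua
```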

In a production-quality program that needs to run external code, we should handle any errors reported when loading a chunk. Moreover, we may want to run the new chunk in a protected environment, to avoid unpleasant side effects. We will discuss environments in detail in Chapter 22, The Environment.

As I mentioned at the beginning of this chapter, Lua precompiles source code before running it. Lua also allows us to distribute code in precompiled form.

The simplest way to produce a precompiled file —also called a binary chunk in Lua jargon— is with the luac program that comes in the standard distribution. For instance, the next call creates a new file prog.lc with a precompiled version of a file prog.lua:

      $ luac -o prog.lc prog.lua

The Lua interpreter can execute this new file just like normal Lua code, performing exactly as it would with the original source:

      $ lua prog.lc

Lua accepts precompiled code almost anywhere it accepts source code. In particular, both loadfile and load accept precompiled code.

We can write a minimal luac directly in Lua:

      p = loadfile(arg[1])
      f = io.open(arg[2], "wb")
      f:write(string.dump(p))
      f:close()

The key function here is string.dump: it receives a Lua function and returns its precompiled code as a string, properly formatted to be loaded back by Lua.
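To see the round trip without touching the file system, we can keep the binary chunk in a string and hand it straight back to load. The chunk text below is just an illustration:

```lua
-- Compile a small chunk; its arguments arrive through the vararg expression.
local original = assert(load("local a, b = ...; return a + b"))

-- Serialize the compiled function, then load it back from the binary string.
local binary = string.dump(original)
local restored = assert(load(binary))

print(type(binary))      --> string
print(restored(2, 3))    --> 5
```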

The luac program offers some other interesting options. In particular, option -l lists the opcodes that the compiler generates for a given chunk. As an example, Figure 16.1, “Example of output from luac -l”, shows the output of luac with option -l on the following one-line file:

      a = x + y - z

(We will not discuss the internals of Lua in this book; if you are interested in more details about those opcodes, a Web search for "lua opcode" should give you relevant material.)

Code in precompiled form is not always smaller than the original, but it loads faster. Another benefit is that it gives some protection against accidental changes in sources. Unlike source code, however, maliciously corrupted binary code can crash the Lua interpreter or even execute user-provided machine code. When running ordinary code, there is nothing to worry about. However, you should avoid running untrusted code in precompiled form. The function load has an option exactly for this task.

Besides its required first argument, load has three more arguments, all of them optional. The second is a name for the chunk, used only in error messages. The fourth argument is an environment, which we will discuss in Chapter 22, The Environment. The third argument is the one we are interested in here; it controls what kinds of chunks can be loaded. If present, this argument must be a string: the string "t" allows only textual (normal) chunks; "b" allows only binary (precompiled) chunks; "bt", the default, allows both formats.
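The following sketch exercises the mode argument on a chunk built in memory. The chunk name "=demo" is an arbitrary choice of ours, and we deliberately do not rely on the exact wording of the error messages:

```lua
-- Build a binary chunk in memory (the chunk name "=demo" is illustrative).
local binary = string.dump(assert(load("return 42")))

-- Mode "t" accepts only text, so the binary chunk is rejected.
local f1, err1 = load(binary, "=demo", "t")
print(f1, err1)           -- nil plus an error message

-- Mode "b" accepts only binary chunks, so plain source is rejected.
local f2, err2 = load("return 42", "=demo", "b")
print(f2, err2)           -- nil plus an error message

-- With a matching mode, the chunk loads and runs as usual.
local f3 = assert(load(binary, "=demo", "b"))
print(f3())               --> 42
```

For untrusted input, passing "t" is therefore the conservative choice.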

Errare humanum est. Therefore, we must handle errors the best way we can. Because Lua is an extension language, frequently embedded in an application, it cannot simply crash or exit when an error happens. Instead, whenever an error occurs, Lua must offer ways to handle it.

Any unexpected condition that Lua encounters raises an error. Errors occur when a program tries to add values that are not numbers, call values that are not functions, index values that are not tables, and so on. (We can modify this behavior using metatables, as we will see later.) We can also explicitly raise an error by calling the function error, with an error message as an argument. Usually, this function is the appropriate way to signal errors in our code:

      print "enter a number:"
      n = io.read("n")
      if not n then error("invalid input") end

This construction of calling error subject to some condition is so common that Lua has a built-in function just for this job, called assert:

      print "enter a number:"
      n = assert(io.read("n"), "invalid input")

The function assert checks whether its first argument is not false and simply returns this argument; if the argument is false, assert raises an error. Its second argument, the message, is optional. Beware, however, that assert is a regular function. As such, Lua always evaluates its arguments before calling the function. If we write something like

      n = io.read()
      assert(tonumber(n), "invalid input: " .. n .. " is not a number")

Lua will always do the concatenation, even when n is a number. It may be wiser to use an explicit test in such cases.
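One possible shape for such an explicit test is sketched below. The helper name parse_number is hypothetical; the point is only that the message is built inside the failing branch, so the concatenation runs only when it is needed:

```lua
-- Hypothetical helper: converts its argument to a number or raises an error.
local function parse_number (n)
  local v = tonumber(n)
  if not v then
    -- the concatenation happens only on this (failing) path
    error("invalid input: " .. tostring(n) .. " is not a number")
  end
  return v
end

print(parse_number("42"))    --> 42
```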

When a function finds an unexpected situation (an exception), it can assume two basic behaviors: it can return an error code (typically nil or false) or it can raise an error, calling error. There are no fixed rules for choosing between these two options, but I use the following guideline: an exception that is easily avoided should raise an error; otherwise, it should return an error code.

For instance, let us consider math.sin. How should it behave when called on a table? Suppose it returns an error code. If we need to check for errors, we would have to write something like this:

      local res = math.sin(x)
      if not res then     -- error?
        -- error-handling code
      end

However, we could as easily check this exception before calling the function:

      if not tonumber(x) then     -- x is not a number?
        -- error-handling code
      end

Frequently we check neither the argument nor the result of a call to sin; if the argument is not a number, it means that probably there is something wrong in our program. In such situations, the simplest and most practical way to handle the exception is to stop the computation and issue an error message.

On the other hand, let us consider io.open, which opens a file. How should it behave when asked to open a file that does not exist? In this case, there is no simple way to check for the exception before calling the function. In many systems, the only way of knowing whether a file exists is by trying to open it. Therefore, if io.open cannot open a file because of an external reason (such as "file does not exist" or "permission denied"), it returns nil, plus a string with the error message. In this way, we have a chance to handle the situation in an appropriate way, for instance by asking the user for another file name:

      local file, msg
      repeat
        print "enter a file name:"
        local name = io.read()
        if not name then return end   -- no input
        file, msg = io.open(name, "r")
        if not file then print(msg) end
      until file

If we do not want to handle such situations, but still want to play safe, we simply use assert to guard the operation:

      file = assert(io.open(name, "r"))
        --> stdin:1: no-file: No such file or directory

This is a typical Lua idiom: if io.open fails, assert will raise an error. Notice how the error message, which is the second result from io.open, goes as the second argument to assert.

For many applications, we do not need to do any error handling in Lua; the application program does this handling. All Lua activities start from a call by the application, usually asking Lua to run a chunk. If there is any error, this call returns an error code, so that the application can take appropriate actions. In the case of the stand-alone interpreter, its main loop just prints the error message and continues showing the prompt and running the given commands.

However, if we want to handle errors inside the Lua code, we should use the function pcall (protected call) to encapsulate our code.

Suppose we want to run a piece of Lua code and to catch any error raised while running that code. Our first step is to encapsulate that piece of code in a function; more often than not, we use an anonymous function for that. Then, we call that function through pcall:

      local ok, msg = pcall(function ()
           -- some code
           if unexpected_condition then error() end
           -- some code
           print(a[i])    -- potential error: 'a' may not be a table
           -- some code
         end)
      
      if ok then    -- no errors while running protected code
        -- regular code
      else   -- protected code raised an error: take appropriate action
        -- error-handling code
      end

The function pcall calls its first argument in protected mode, so that it catches any errors while the function is running. The function pcall never raises any error, no matter what. If there are no errors, pcall returns true, plus any values returned by the call. Otherwise, it returns false, plus the error message.
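A small concrete case may help to fix these two outcomes. The function div below is a made-up example of ours; it returns two values on success and raises an error on a zero divisor:

```lua
-- Hypothetical function: integer quotient and remainder, or an error.
local function div (a, b)
  if b == 0 then error("division by zero") end
  return math.floor(a / b), a % b
end

-- Success: true, followed by all values returned by the call.
print(pcall(div, 7, 2))    --> true    3    1

-- Failure: false, followed by the error message.
local ok, msg = pcall(div, 7, 0)
print(ok)                  --> false
print(msg)                 -- the message, prefixed with position information
```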

Despite its name, the error message does not have to be a string; a better name is error object, because pcall will return any Lua value that we pass to error:

      local status, err = pcall(function () error({code=121}) end)
      print(err.code)  --> 121

These mechanisms provide all we need to do exception handling in Lua. We throw an exception with error and catch it with pcall. The error message identifies the kind of error.
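As a rough illustration of this throw-and-catch pattern, the sketch below uses a table as the error object and dispatches on one of its fields. The function risky and the field names code and value are our own conventions, not anything that Lua prescribes:

```lua
-- Hypothetical function that throws a structured error object.
local function risky (n)
  if n < 0 then
    error({code = "negative", value = n})
  end
  return math.sqrt(n)
end

local recovered = false
local ok, err = pcall(risky, -4)
if not ok then
  if type(err) == "table" and err.code == "negative" then
    -- an error we know how to handle
    recovered = true
    print("recovering from negative input: " .. err.value)
  else
    error(err, 0)    -- unknown error: rethrow it unchanged
  end
end
```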

Although we can use a value of any type as an error object, usually error objects are strings describing what went wrong. When there is an internal error (such as an attempt to index a non-table value), Lua generates the error object, which in that case is always a string; otherwise, the error object is the value passed to the function error. Whenever the object is a string, Lua tries to add some information about the location where the error happened:

      local status, err = pcall(function () error("my error") end)
      print(err)          --> stdin:1: my error

The location information gives the chunk’s name (stdin, in the example) plus the line number (1, in the example).

The function error has an additional second parameter, which gives the level where it should report the error. We use this parameter to blame someone else for the error. For instance, suppose we write a function whose first task is to check whether it was called correctly:

      function foo (str)
        if type(str) ~= "string" then
          error("string expected")
        end
        -- regular code
      end

Then, someone calls this function with a wrong argument:

      foo({x=1})

As it is, Lua points its finger at foo (after all, it was foo that called error) and not at the real culprit, the caller. To correct this problem, we inform error that the error it is reporting occurred at level two in the calling hierarchy (level one is our own function):

      function foo (str)
        if type(str) ~= "string" then
          error("string expected", 2)
        end
        -- regular code
      end
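To compare the two levels side by side, we can catch both errors and print their messages. The names blame_self, blame_caller, and caller in this sketch are hypothetical. Note that the faulty call is made from a Lua function: a C function at level two (such as pcall itself) has no line information for error to report.

```lua
local function blame_self (str)
  if type(str) ~= "string" then
    error("string expected")       -- default level 1: reports this line
  end
end

local function blame_caller (str)
  if type(str) ~= "string" then
    error("string expected", 2)    -- level 2: reports the caller's line
  end
end

local function caller (f)
  f({x = 1})                       -- the faulty call
end

local _, msg1 = pcall(caller, blame_self)
local _, msg2 = pcall(caller, blame_caller)
print(msg1)    -- position inside blame_self
print(msg2)    -- position of the faulty call inside caller
```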

Frequently, when an error happens, we want more debug information than only the location where the error occurred. At least, we want a traceback, showing the complete stack of calls leading to the error. When pcall returns its error message, it destroys part of the stack (the part that goes from it to the error point). Consequently, if we want a traceback, we must build it before pcall returns. To do this, Lua provides the function xpcall. It works like pcall, but its second argument is a message handler function. In case of error, Lua calls this message handler before the stack unwinds, so that it can use the debug library to gather any extra information it wants about the error. Two common message handlers are debug.debug, which gives us a Lua prompt so that we can inspect by ourselves what was going on when the error happened; and debug.traceback, which builds an extended error message with a traceback. The latter is the function that the stand-alone interpreter uses to build its error messages.
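A minimal use of xpcall with debug.traceback as the message handler might look like the following sketch. The function risky is a made-up example that fails by indexing nil:

```lua
-- Hypothetical function that raises an internal error.
local function risky ()
  local t = nil
  return t.field          -- attempt to index a nil value
end

-- debug.traceback runs before the stack unwinds, so it can see all frames.
local ok, trace = xpcall(risky, debug.traceback)
print(ok)                 --> false
print(trace)              -- the error message followed by a stack traceback
```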

Exercise 16.1: Frequently, it is useful to add some prefix to a chunk of code when loading it. (We saw an example previously in this chapter, where we prefixed a return to an expression being loaded.) Write a function loadwithprefix that works like load, except that it adds its extra first argument (a string) as a prefix to the chunk being loaded.

Like the original load, loadwithprefix should accept chunks represented both as strings and as reader functions. Even in the case that the original chunk is a string, loadwithprefix should not actually concatenate the prefix with the chunk. Instead, it should call load with a proper reader function that first returns the prefix and then returns the original chunk.

Exercise 16.2: Write a function multiload that generalizes loadwithprefix by receiving a list of readers, as in the following example:

      f = multiload("local x = 10;",
                    io.lines("temp", "*L"),
                    " print(x)")

In the above example, multiload should load a chunk equivalent to the concatenation of the string "local...", the contents of the temp file, and the string "print(x)". Like loadwithprefix, from the previous exercise, multiload should not actually concatenate anything.

Exercise 16.3: The function stringrep, in Figure 16.2, “String repetition”, uses a binary multiplication algorithm to concatenate n copies of a given string s.

For any fixed n, we can create a specialized version of stringrep by unrolling the loop into a sequence of instructions r = r .. s and s = s .. s. As an example, for n = 5 the unrolling gives us the following function:

      function stringrep_5 (s)
        local r = ""
        r = r .. s
        s = s .. s
        s = s .. s
        r = r .. s
        return r
      end

Write a function that, given n, returns a specialized function stringrep_n. Instead of using a closure, your function should build the text of a Lua function with the proper sequence of instructions (a mix of r = r .. s and s = s .. s) and then use load to produce the final function. Compare the performance of the generic function stringrep (or of a closure using it) with your tailor-made functions.

Exercise 16.4: Can you find any value for f such that the call pcall(pcall, f) returns false as its first result? Why is this relevant?



[16] In Lua 5.1, the function loadstring played the role of load for strings.
