25 Reflection

Reflection is the ability of a program to inspect and modify some aspects of its own execution. Dynamic languages like Lua naturally support several reflective features: environments allow run-time inspection of global variables; functions like type and pairs allow run-time inspection and traversal of unknown data structures; functions like load and require allow a program to add code to itself or update its own code. However, many things are still missing: programs cannot introspect on their local variables, programs cannot trace their execution, functions cannot know their callers, etc. The debug library fills many of these gaps.

The debug library comprises two kinds of functions: introspective functions and hooks. Introspective functions allow us to inspect several aspects of the running program, such as its stack of active functions, current line of execution, and values and names of local variables. Hooks allow us to trace the execution of a program.

Despite its name, the debug library does not give us a debugger for Lua. Nevertheless, it provides all the primitives that we need to write our own debuggers, with varying levels of sophistication.

Unlike the other libraries, we should use the debug library with parsimony. First, some of its functionality is not exactly famous for performance. Second, it breaks some sacred truths of the language, such as that we cannot access a local variable from outside its lexical scope. Although the library is readily available as a standard library, I prefer to require it explicitly in any chunk that uses it.

Introspective Facilities

The main introspective function in the debug library is getinfo. Its first parameter can be a function or a stack level. When we call debug.getinfo(foo) for a function foo, it returns a table with some data about this function. The table can have the following fields:

source:: This field tells where the function was defined. If the function was defined in a string (through a call to load), source is that string. If the function was defined in a file, source is the file name prefixed with an at-sign.
short_src:: This field gives a short version of source (up to 60 characters). It is useful for error messages.
linedefined:: This field gives the number of the first line in the source where the function was defined.
lastlinedefined:: This field gives the number of the last line in the source where the function was defined.
what:: This field tells what this function is. Options are "Lua" if foo is a regular Lua function, "C" if it is a C function, or "main" if it is the main part of a Lua chunk.
name:: This field gives a reasonable name for the function, such as the name of a global variable that stores this function.
namewhat:: This field tells what the previous field means. This field can be "global", "local", "method", "field", or "" (the empty string). The empty string means that Lua did not find a name for the function.
nups:: This is the number of upvalues of that function.
nparams:: This is the number of parameters of that function.
isvararg:: This tells whether the function is variadic (a Boolean).
activelines:: This field is a table representing the set of active lines of the function. An active line is a line with some code, as opposed to empty lines or lines containing only comments. (A typical use of this information is for setting breakpoints. Most debuggers do not allow us to set a breakpoint outside an active line, as it would be unreachable.)
func:: This field has the function itself.

When foo is a C function, Lua does not have much data about it. For such functions, only the fields what, name, namewhat, nups, and func are meaningful.
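As a small illustration of these fields, the following sketch queries a Lua function and a C function. The function sample is hypothetical, defined only for this example, and the fields nparams and isvararg assume Lua 5.2 or newer:

```lua
-- 'sample' is a hypothetical function used only for illustration
function sample (a, b, ...)
  return a
end

local info = debug.getinfo(sample)
print(info.what)             --> Lua
print(info.nparams)          --> 2
print(info.isvararg)         --> true
print(info.nups)             --> 0 (it uses no non-local variables)
print(info.func == sample)   --> true

-- for a C function, only a few fields are meaningful
print(debug.getinfo(print).what)   --> C
```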

When we call debug.getinfo(n) for some number n, we get data about the function active at that stack level. A stack level is a number that refers to a particular function that is active at that moment. The function calling getinfo has level one, the function that called it has level two, and so on. (At level zero, we get data about getinfo itself, a C function.) If n is larger than the number of active functions on the stack, debug.getinfo returns nil. When we query an active function, by calling debug.getinfo with a stack level, the resulting table has two extra fields: currentline, the line where the function is at that moment; and istailcall (a Boolean), true if this function was called by a tail call. (In this case, the real caller of this function is not on the stack anymore.)

The field name is tricky. Remember that, because functions are first-class values in Lua, a function may not have a name, or may have several names. Lua tries to find a name for a function by looking into the code that called the function, to see how it was called. This method works only when we call getinfo with a number, that is, when we ask for information about a particular invocation.
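The next sketch shows this dependence on the call site. The names whoami and alias are hypothetical, and the output shown assumes the standard Lua interpreter with the functions stored in global variables:

```lua
function whoami ()
  local info = debug.getinfo(1, "n")   -- level 1: this invocation
  return info.name, info.namewhat
end

alias = whoami       -- a second name for the same function

print(whoami())      --> whoami  global
print(alias())       --> alias   global

-- without an active invocation, Lua has no call site to inspect
print(debug.getinfo(whoami, "n").name)   --> nil
```

The same function reports two different names, because each name comes from the code that made the call.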

The function getinfo is not efficient. Lua keeps debug information in a form that does not impair program execution; efficient retrieval is a secondary goal here. To achieve better performance, getinfo has an optional second parameter that selects what information to get. In this way, the function does not waste time collecting data that the user does not need. The format of this parameter is a string, where each letter selects a group of fields, according to the following table:

`n`	selects `name` and `namewhat`
`f`	selects `func`
`S`	selects `source`, `short_src`, `what`, `linedefined`, and `lastlinedefined`
`l`	selects `currentline`
`L`	selects `activelines`
`u`	selects `nups`, `nparams`, and `isvararg`
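A minimal sketch of the effect of this parameter: when we ask only for the group S, the fields from the other groups are simply absent from the resulting table.

```lua
local info = debug.getinfo(print, "S")
print(info.what, info.short_src)   --> C       [C]
print(info.func, info.name)        --> nil     nil

-- asking for 'f' gives us the function itself
print(debug.getinfo(print, "f").func == print)   --> true
```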

The following function illustrates the use of debug.getinfo by printing a primitive traceback of the active stack:

      function traceback ()
        for level = 1, math.huge do
          local info = debug.getinfo(level, "Sl")
          if not info then break end
          if info.what == "C" then   -- is a C function?
            print(string.format("%d\tC function", level))
          else   -- a Lua function
            print(string.format("%d\t[%s]:%d", level,
               info.short_src, info.currentline))
          end
        end
      end

It is not difficult to improve this function, by including more data from getinfo. Actually, the debug library offers such an improved version, the function traceback. Unlike our version, debug.traceback does not print its result; instead, it returns a (potentially long) string containing the traceback:

      > print(debug.traceback())
      stack traceback:
              stdin:1: in main chunk
              [C]: in ?

Accessing local variables

We can inspect the local variables of any active function with debug.getlocal. This function has two parameters: the stack level of the function we are querying and a variable index. It returns two values: the name and the current value of the variable. If the variable index is larger than the number of active variables, getlocal returns nil. If the stack level is invalid, it raises an error. (We can use debug.getinfo to check the validity of the stack level.)

Lua numbers local variables in the order that they appear in a function, counting only the variables that are active in the current scope of the function. For instance, consider the following function:

      function foo (a, b)
        local x
        do local c = a - b end
        local a = 1
        while true do
          local name, value = debug.getlocal(1, a)
          if not name then break end
          print(name, value)
          a = a + 1
        end
      end

The call foo(10, 20) will print this:

      a       10
      b       20
      x       nil
      a       4

The variable with index 1 is a (the first parameter), 2 is b, 3 is x, and 4 is the inner a. At the point where getlocal is called, c is already out of scope, while name and value are not yet in scope. (Remember that local variables are only visible after their initialization code.)

Starting with Lua 5.2, negative indices get information about the extra arguments of a variadic function: index -1 refers to the first extra argument. The name of the variable in this case is always "(*vararg)".
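As a sketch of this feature, the following hypothetical function collects its own extra arguments through negative indices (assuming Lua 5.2 or newer). It ignores the names of the variables and keeps only their values:

```lua
function extras (...)
  local values = {}
  for i = 1, math.huge do
    -- index -i refers to the i-th extra argument of this function
    local name, value = debug.getlocal(1, -i)
    if not name then break end    -- no more extra arguments
    values[i] = value
  end
  return values
end

local t = extras(10, 20, 30)
print(#t, t[1], t[3])   --> 3       10      30
```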

We can also change the values of local variables, with debug.setlocal. Its first two parameters are a stack level and a variable index, like in getlocal. Its third parameter is the new value for the variable. It returns the variable name or nil if the variable index is out of scope.
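A minimal sketch, with a hypothetical function bump that changes one of its own local variables:

```lua
function bump ()
  local counter = 1
  -- at this point, 'counter' is local variable 1 at stack level 1
  local name = debug.setlocal(1, 1, 42)
  return name, counter
end

print(bump())   --> counter 42
```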

Accessing non-local variables

The debug library also allows us to access the non-local variables used by a Lua function, with getupvalue. Unlike local variables, the non-local variables referred to by a function exist even when the function is not active (this is what closures are about, after all). Therefore, the first argument for getupvalue is not a stack level, but a function (a closure, more precisely). The second argument is the variable index. Lua numbers non-local variables in the order in which they are first referred to in a function, but this order is not relevant, because a function cannot access two non-local variables with the same name.

We can also update non-local variables, with debug.setupvalue. As you might expect, it has three parameters: a closure, a variable index, and the new value. Like setlocal, it returns the name of the variable, or nil if the variable index is out of range.
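The following sketch exercises both functions on a simple closure. The function newcounter is hypothetical, written only for this example; its inner function refers to a single non-local variable, n:

```lua
function newcounter ()
  local n = 0
  return function ()
    n = n + 1
    return n
  end
end

local c = newcounter()
c(); c()                            -- now n is 2
print(debug.getupvalue(c, 1))       --> n       2
print(debug.setupvalue(c, 1, 10))   --> n
print(c())                          --> 11
```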

Figure 25.1, “Getting the value of a variable” shows how we can access the value of a variable from a calling function, given the variable’s name.

Figure 25.1. Getting the value of a variable

      function getvarvalue (name, level, isenv)
        local value
        local found = false
      
        level = (level or 1) + 1
      
        -- try local variables
        for i = 1, math.huge do
          local n, v = debug.getlocal(level, i)
          if not n then break end
          if n == name then
            value = v
            found = true
          end
        end
        if found then return "local", value end
      
        -- try non-local variables
        local func = debug.getinfo(level, "f").func
        for i = 1, math.huge do
          local n, v = debug.getupvalue(func, i)
          if not n then break end
          if n == name then return "upvalue", v end
        end
      
        if isenv then return "noenv" end   -- avoid loop
      
        -- not found; get value from the environment
        local _, env = getvarvalue("_ENV", level, true)
        if env then
          return "global", env[name]
        else        -- no _ENV available
          return "noenv"
        end
      end

We can use it as follows:

      > local a = 4; print(getvarvalue("a"))   --> local    4
      > a = "xx"; print(getvarvalue("a"))      --> global   xx

The parameter level tells where on the stack the function should look; one (the default) means the immediate caller. The plus one in the code corrects the level to include the call to getvarvalue itself. I will explain the parameter isenv in a moment.

The function first looks for a local variable. If there is more than one local with the given name, it must get the one with the highest index; thus, it must always go through the whole loop. If it cannot find any local variable with that name, then it tries the non-local variables. For that, it gets the calling closure, with debug.getinfo, and then it traverses its non-local variables. Finally, if it cannot find a non-local variable with that name, then it goes for a global variable: it calls itself recursively to access the proper _ENV variable and then looks up the name in that environment.

The parameter isenv avoids a tricky problem. It tells when we are in a recursive call, looking for the variable _ENV to query a global name. A function that uses no global variables may not have an upvalue _ENV. In that case, if we tried to consult _ENV as a global, we would enter a recursive loop, because we would need _ENV to get its own value. So, when isenv is true and the function cannot find a local or an upvalue, it does not try the global variables.
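We can check directly that a function gets an upvalue _ENV only when it uses some global name. The two functions in the next sketch are hypothetical, and the example assumes Lua 5.2 or newer:

```lua
function pure (a) return a + 1 end         -- uses no global names
function usesglobal () return print end    -- refers to the global 'print'

print(debug.getinfo(pure, "u").nups)         --> 0
print(debug.getinfo(usesglobal, "u").nups)   --> 1
print((debug.getupvalue(usesglobal, 1)))     --> _ENV
```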

Accessing other coroutines

All introspective functions from the debug library accept an optional coroutine as their first argument, so that we can inspect the coroutine from the outside. For instance, consider the next example:

      co = coroutine.create(function ()
        local x = 10
        coroutine.yield()
        error("some error")
      end)
      
      coroutine.resume(co)
      print(debug.traceback(co))

The call to traceback will work on the coroutine co, resulting in something like this:

      stack traceback:
              [C]: in function 'yield'
              temp:3: in function <temp:1>

The trace does not go through the call to resume, because the coroutine and the main program run in different stacks.

When a coroutine raises an error, it does not unwind its stack. This means that we can inspect it after the error. Continuing our example, the coroutine hits the error if we resume it again:

      print(coroutine.resume(co))      --> false   temp:4: some error

Now, if we print its traceback, we get something like this:

      stack traceback:
              [C]: in function 'error'
              temp:4: in function <temp:1>

We can also inspect local variables from a coroutine, even after an error:

      print(debug.getlocal(co, 1, 1))     --> x       10

Hooks

The hook mechanism of the debug library allows us to register a function to be called at specific events as a program runs. There are four kinds of events that can trigger a hook:

call events happen every time Lua calls a function;
return events happen every time a function returns;
line events happen when Lua starts executing a new line of code;
count events happen after a given number of instructions. (Instructions here mean internal opcodes, which we visited briefly in the section called “Precompiled Code”.)

Lua calls all hooks with a string argument that describes the event that generated the call: "call" (or "tail call"), "return", "line", or "count". For line events, it also passes a second argument, the new line number. To get more information inside a hook, we have to call debug.getinfo.

To register a hook, we call debug.sethook with two or three arguments: the first argument is the hook function; the second argument is a mask string, which describes the events we want to monitor; and the optional third argument is a number that describes at what frequency we want to get count events. To monitor the call, return, and line events, we add their first letters (c, r, or l) into the mask string. To monitor the count event, we simply supply a counter as the third argument. To turn off hooks, we call sethook with no arguments.
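As a sketch of these steps, the following hypothetical function countevents runs a given function with a hook that only counts how many events of each kind it receives. The exact counts depend on the Lua version and on the code being run, so the example does not show them:

```lua
function countevents (f)
  local events = {}
  local function recorder (event)
    events[event] = (events[event] or 0) + 1
  end
  debug.sethook(recorder, "crl")   -- monitor calls, returns, and lines
  f()
  debug.sethook()                  -- turn off the hook
  return events
end

local events = countevents(function ()
  local x = 1
  x = x + 1
end)
print(events.call, events["return"], events.line)
```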

As a simple example, the following code installs a primitive tracer, which prints each line the interpreter executes:

      debug.sethook(print, "l")

This call simply installs print as the hook function and instructs Lua to call it only at line events. A more elaborate tracer can use getinfo to add the current file name to the trace:

      function trace (event, line)
        local s = debug.getinfo(2).short_src
        print(s .. ":" .. line)
      end
      
      debug.sethook(trace, "l")

A useful function to use with hooks is debug.debug. This simple function gives us a prompt that executes arbitrary Lua commands. It is roughly equivalent to the following code:

      function debug1 ()
        while true do
          io.write("debug> ")
          local line = io.read()
          if line == "cont" then break end
          assert(load(line))()
        end
      end

When the user enters the “command” cont, the function returns. The standard implementation is very simple and runs the commands in the global environment, outside the scope of the code being debugged. Exercise 25.4 discusses a better implementation.

Profiles

Besides debugging, another common application for reflection is profiling, that is, an analysis of the behavior of a program regarding its use of resources. For a timing profile, it is better to use the C interface: the overhead of a Lua call for each hook is too high and may invalidate any measurement. However, for counting profiles, Lua code does a decent job. In this section, we will develop a rudimentary profiler that lists the number of times each function in a program is called during a run.

The main data structures of our program are two tables: one maps functions to their call counters, and the other maps functions to their names. The indices to both tables are the functions themselves.

      local Counters = {}
      local Names = {}

We could retrieve the function names after the profiling, but remember that we get better results if we get the name of a function while it is active, because then Lua can look at the code that is calling the function to find its name.

Now we define the hook function. Its job is to get the function being called, increment the corresponding counter, and collect the function name. The code is in Figure 25.2, “Hook for counting number of calls”.

Figure 25.2. Hook for counting number of calls

      local function hook ()
        local f = debug.getinfo(2, "f").func
        local count = Counters[f]
        if count == nil then    -- first time 'f' is called?
          Counters[f] = 1
          Names[f] = debug.getinfo(2, "Sn")
        else     -- only increment the counter
          Counters[f] = count + 1
        end
      end

The next step is to run the program with that hook. We will assume that the program we want to analyze is in a file and that the user gives this file name as an argument to the profiler, like this:

      % lua profiler main-prog

With this scheme, the profiler can get the file name in arg[1], turn on the hook, and run the file:

      local f = assert(loadfile(arg[1]))
      debug.sethook(hook, "c")  -- turn on the hook for calls
      f()                       -- run the main program
      debug.sethook()           -- turn off the hook

The last step is to show the results. The function getname, in Figure 25.3, “Getting the name of a function”, produces a name for a function.

Figure 25.3. Getting the name of a function

      function getname (func)
        local n = Names[func]
        if n.what == "C" then
          return n.name
        end
        local lc = string.format("[%s]:%d", n.short_src, n.linedefined)
        if n.what ~= "main" and n.namewhat ~= "" then
          return string.format("%s (%s)", lc, n.name)
        else
          return lc
        end
      end

Because function names in Lua are so uncertain, we add to each function its location, given as a pair file:line. If a function has no name, then we use just its location. For a C function, we use only its name (as it has no location). After that definition, we print each function with its counter:

      for func, count in pairs(Counters) do
        print(getname(func), count)
      end

If we apply our profiler to the Markov example that we developed in Chapter 19, Interlude: Markov Chain Algorithm, we get a result like this:

      [markov.lua]:4 884723
      write   10000
      [markov.lua]:0 1
      read    31103
      sub     884722
      ...

This result means that the anonymous function at line 4 (which is the iterator function defined inside allwords) was called 884723 times, write (io.write) was called 10000 times, and so on.

There are several improvements that we can make to this profiler, such as sorting the output, printing better function names, and embellishing the output format. Nevertheless, this basic profiler is already useful as it is.

Sandboxing

In the section called “_ENV and load”, we saw how easy it is to use load to run a Lua chunk in a restricted environment. Because Lua does all communication with the external world through library functions, once we remove these functions, we also remove the possibility of a script to have any effect on the external world. Nevertheless, we are still susceptible to denial of service (DoS) attacks, with a script wasting large amounts of CPU time or memory. Reflection, in the form of debug hooks, provides an interesting approach to curb such attacks.

A first step is to use a count hook to limit the number of instructions that a chunk can execute. Figure 25.4, “A naive sandbox with hooks” shows a program to run a given file in that kind of sandbox.

Figure 25.4. A naive sandbox with hooks

      local debug = require "debug"
      
      -- maximum "steps" that can be performed
      local steplimit = 1000
      
      local count = 0     -- counter for steps
      
      local function step ()
        count = count + 1
        if count > steplimit then
          error("script uses too much CPU")
        end
      end
      
      -- load file
      local f = assert(loadfile(arg[1], "t", {}))
      
      debug.sethook(step, "", 100)    -- set hook
      
      f()    -- run file

The program loads the given file, sets the hook, and runs the file. It sets the hook as a count hook, so that Lua calls the hook every 100 instructions. The hook (the function step) only increments a counter and checks it against a fixed limit. What can possibly go wrong?

Of course, we must restrict the size of the chunks that we load: a huge chunk can exhaust memory only by being loaded. Another problem is that a program can consume huge amounts of memory with surprisingly few instructions, as the next fragment shows:

      local s = "123456789012345"
      for i = 1, 36 do s = s .. s end

With fewer than 150 instructions, this tiny fragment will try to create a string of about one terabyte. Clearly, restricting only steps and program size is not enough.
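We can check that figure without creating the string, by doubling only its length. The helper finalsize is hypothetical, and the output shown assumes Lua 5.3, where these numbers are integers:

```lua
-- length of a string after it is doubled a given number of times
function finalsize (len, doublings)
  for i = 1, doublings do len = len * 2 end
  return len
end

print(finalsize(15, 36))          --> 1030792151040
print(finalsize(15, 36) / 2^40)   --> 0.9375
```

The initial string has 15 bytes, so after 36 doublings it would have 15 × 2^36 bytes, which is 15/16 of 2^40 bytes.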

One improvement is to check and limit memory use in the step function, as we show in Figure 25.5, “Controlling memory use”.

Figure 25.5. Controlling memory use

      -- maximum memory (in KB) that can be used
      local memlimit = 1000
      
      -- maximum "steps" that can be performed
      local steplimit = 1000
      
      local function checkmem ()
        if collectgarbage("count") > memlimit then
          error("script uses too much memory")
        end
      end
      
      local count = 0
      local function step ()
        checkmem()
        count = count + 1
        if count > steplimit then
          error("script uses too much CPU")
        end
      end
      
      -- load the file, set the hook, and run the file, as before

Because memory can grow so fast with so few instructions, we should set a very low limit or call the hook in small steps. More concretely, a program can do a thousandfold increase in the size of a string in 40 instructions. So, either we call the hook with a higher frequency than every 40 steps or we set the memory limit to one thousandth of what we can really afford. I would probably choose both.
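The following sketch combines both precautions in one function. It is only an illustration under some assumptions: the name runlimited and its parameters are hypothetical, the code comes from a string instead of a file, and the memory limit applies to the growth since the start of the run. It also runs the chunk in its own coroutine and installs the hook only there (sethook accepts an optional coroutine as its first argument), so that the hook never fires in the code that controls the sandbox:

```lua
local debug = require "debug"

-- runs 'code' in an empty environment, with limits on the number of
-- hook steps and on memory growth (in KB)
function runlimited (code, steplimit, memlimit)
  local f, err = load(code, "sandbox", "t", {})
  if not f then return false, err end

  local base = collectgarbage("count")   -- memory in use before the run
  local count = 0
  local function step ()
    if collectgarbage("count") - base > memlimit then
      error("script uses too much memory")
    end
    count = count + 1
    if count > steplimit then
      error("script uses too much CPU")
    end
  end

  local co = coroutine.create(f)
  debug.sethook(co, step, "", 10)   -- check every 10 instructions
  return coroutine.resume(co)
end

print(runlimited("while true do end", 1000, 1000))
print(runlimited('local s = "x"; for i = 1, 40 do s = s .. s end',
                 1000, 1000))
```

Both calls return false plus an error message: the first one mentions the CPU limit and the second one the memory limit.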

A subtler problem is the string library. We can call any function from this library as a method on a string. Therefore, we can call these functions even if they are not in the environment; literal strings smuggle them into our sandbox. No function in the string library affects the external world, but they bypass our step counter. (A call to a C function counts as one instruction in Lua.) Some functions in the string library can be quite dangerous for DoS attacks. For instance, the call ("x"):rep(2^30) swallows 1 GB of memory in a single step. As another example, Lua 5.2 takes 13 minutes to run the following call on my new machine:

      s = "01234567890123456789012345678901234567890123456789"
      s:find(".*.*.*.*.*.*.*.*.*x")
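To see how literal strings smuggle the library into a sandbox, consider the next sketch. Both chunks run in the same empty environment; the chunk name "sandboxed" is arbitrary:

```lua
local env = {}    -- an empty environment: no 'string' table inside
local getlib = assert(load('return string', "sandboxed", "t", env))
local usemethod = assert(load('return ("abc"):upper()', "sandboxed", "t", env))

print(getlib())      --> nil
print(usemethod())   --> ABC
```

The first chunk cannot see the table string, but the second one still reaches string.upper through the metatable that all strings share.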

An interesting way to restrict the access to the string library is to use call hooks. Every time a function is called, we check whether it is authorized. Figure 25.6, “Using hooks to bar calls to unauthorized functions” implements this idea.

Figure 25.6. Using hooks to bar calls to unauthorized functions

      local debug = require "debug"
      
      -- maximum "steps" that can be performed
      local steplimit = 1000
      
      local count = 0     -- counter for steps
      
      -- set of authorized functions
      local validfunc = {
        [string.upper] = true,
        [string.lower] = true,
        ...       -- other authorized functions
      }
      
      local function hook (event)
        if event == "call" then
          local info = debug.getinfo(2, "fn")
          if not validfunc[info.func] then
            error("calling bad function: " .. (info.name or "?"))
          end
        end
        count = count + 1
        if count > steplimit then
          error("script uses too much CPU")
        end
      end
      
      -- load chunk
      local f = assert(loadfile(arg[1], "t", {}))
      validfunc[f] = true             -- the chunk itself must be callable
      
      debug.sethook(hook, "c", 100)   -- set hook (call and count events)
      
      f()    -- run chunk

In that code, the table validfunc represents a set with the functions that the program can call. The function hook uses the debug library to access the function being called and then checks whether that function is in the validfunc set.

An important point in any sandbox implementation is what functions we allow inside the sandbox. Sandboxes for data description can restrict all or most functions. Other sandboxes must be more forgiving, maybe offering their own restricted implementations for some functions (e.g., load restricted to small text chunks, file access restricted to a fixed directory, or pattern matching restricted to small subjects).

We should never think in terms of what functions to remove, but what functions to add. For each candidate, we must carefully consider its possible weaknesses, which may be subtle. As a rule of thumb, all functions from the mathematical library are safe. Most functions from the string library are safe; just be careful with resource-consuming ones. The debug and package libraries are off-limits; almost everything there can be dangerous. The functions setmetatable and getmetatable are also tricky: first, they can allow access to otherwise inaccessible values; moreover, they allow the creation of tables with finalizers, where someone can install all sorts of “time bombs” (code that can be executed outside the sandbox, when the table is collected).

Exercises

Exercise 25.1: Adapt getvarvalue (Figure 25.1, “Getting the value of a variable”) to work with different coroutines (like the functions from the debug library).

Exercise 25.2: Write a function setvarvalue similar to getvarvalue (Figure 25.1, “Getting the value of a variable”).

Exercise 25.3: Write a version of getvarvalue (Figure 25.1, “Getting the value of a variable”) that returns a table with all variables that are visible at the calling function. (The returned table should not include environmental variables; instead, it should inherit them from the original environment.)

Exercise 25.4: Write an improved version of debug.debug that runs the given commands as if they were in the lexical scope of the calling function. (Hint: run the commands in an empty environment and use the __index metamethod attached to the function getvarvalue to do all accesses to variables.)

Exercise 25.5: Improve the previous exercise to handle updates, too.

Exercise 25.6: Implement some of the suggested improvements for the basic profiler that we developed in the section called “Profiles”.

Exercise 25.7: Write a library for breakpoints. It should offer at least two functions:

      setbreakpoint(function, line)    --> returns handle
      removebreakpoint(handle)

We specify a breakpoint by a function and a line inside that function. When the program hits a breakpoint, the library should call debug.debug. (Hint: for a basic implementation, use a line hook that checks whether it is in a breakpoint; to improve performance, use a call hook to trace program execution and only turn on the line hook when the program is running the target function.)

Exercise 25.8: One problem with the sandbox in Figure 25.6, “Using hooks to bar calls to unauthorized functions” is that sandboxed code cannot call its own functions. How can you correct this problem?