8 Filling some Gaps

We have already used most of Lua’s syntactical constructions in previous examples, but it is easy to miss some details. For completeness, this chapter closes the first part of the book with more details about them.

By default, variables in Lua are global. All local variables must be declared as such. Unlike global variables, a local variable has its scope limited to the block where it is declared. A block is the body of a control structure, the body of a function, or a chunk (the file or string where the variable is declared):

      x = 10
      local i = 1        -- local to the chunk
      
      while i <= x do
        local x = i * 2  -- local to the while body
        print(x)         --> 2, 4, 6, 8, ...
        i = i + 1
      end
      
      if i > 20 then
        local x          -- local to the "then" body
        x = 20
        print(x + 2)     -- (would print 22 if test succeeded)
      else
        print(x)         --> 10  (the global one)
      end
      
      print(x)           --> 10  (the global one)

Beware that this last example will not work as expected if you enter it in interactive mode. In interactive mode, each line is a chunk by itself (unless it is not a complete command). As soon as you enter the second line of the example (local i = 1), Lua runs it and starts a new chunk in the next line. By then, the local declaration is already out of scope. To solve this problem, we can delimit the whole block explicitly, bracketing it with the keywords do and end. Once you enter the do, the command completes only at the corresponding end, so Lua will not execute each line by itself.
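
As a minimal sketch of this technique, the fragment below wraps a local declaration and the loop that uses it in a single explicit block. The global name total and the loop bound are illustrative assumptions, not part of the earlier example.

```lua
total = 0            -- a global, so it survives across interactive lines
do
  local i = 1        -- local to the 'do' block, not to a single line
  while i <= 3 do
    total = total + i * 2
    i = i + 1
  end
end                  -- scope of 'i' ends here
print(total)         --> 12
```

Typed at the interactive prompt, nothing runs until the final end is entered, so the local i is still in scope when the loop uses it.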

These do blocks are also useful when we need finer control over the scope of some local variables:

      local x1, x2
      do
        local a2 = 2*a
        local d = (b^2 - 4*a*c)^(1/2)
        x1 = (-b + d)/a2
        x2 = (-b - d)/a2
      end                      -- scope of 'a2' and 'd' ends here
      print(x1, x2)            -- 'x1' and 'x2' still in scope

It is good programming style to use local variables whenever possible. Local variables avoid cluttering the global environment with unnecessary names; they also avoid name clashes between different parts of a program. Moreover, the access to local variables is faster than to global ones. Finally, a local variable vanishes as soon as its scope ends, allowing the garbage collector to release its value.

Given that local variables are better than global ones, some people argue that Lua should use local by default. However, local by default has its own set of problems (e.g., issues with accessing non-local variables). A better approach would be no default, that is, all variables should be declared before being used. The Lua distribution comes with a module strict.lua for global-variable checks; it raises an error if we try to assign to a non-existent global inside a function or to use a non-existent global. It is a good habit to use it when developing Lua code.
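
To give an idea of how such checks can work, here is a simplified sketch based on metatables. It is not the code of strict.lua, and it is stricter than the behavior described above, because it rejects every assignment to an undeclared name. The helper name make_strict is hypothetical, and the sketch guards an ordinary table instead of the real global environment, to keep the example self-contained.

```lua
-- NOT the real strict.lua: a simplified illustration of the idea.
-- 'make_strict' is a hypothetical helper name.
local function make_strict (env)
  setmetatable(env, {
    -- called on assignment to a key that is absent from 'env'
    __newindex = function (t, name, value)
      error("assignment to undeclared variable '" .. tostring(name) .. "'", 2)
    end,
    -- called on access to a key that is absent from 'env'
    __index = function (t, name)
      error("use of undeclared variable '" .. tostring(name) .. "'", 2)
    end,
  })
  return env
end

local env = make_strict({})
rawset(env, "count", 0)        -- "declare" a variable, bypassing the checks

env.count = env.count + 1      -- fine: 'count' already exists
print(env.count)               --> 1
print(pcall(function () env.cuont = 5 end))     -- false, plus an error message
print(pcall(function () return env.total end))  -- false, plus an error message
```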

Each local declaration can include an initial assignment, which works the same way as a conventional multiple assignment: extra values are thrown away, extra variables get nil. If a declaration has no initial assignment, it initializes all its variables with nil:

      local a, b = 1, 10
      if a < b then
        print(a)   --> 1
        local a    -- '= nil' is implicit
        print(a)   --> nil
      end          -- ends the block started at 'then'
      print(a, b)  --> 1   10
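
The adjustment rule for initial assignments can be seen in isolation in the following short sketch; the variable names are arbitrary.

```lua
local a, b, c = 1, 2        -- one variable too many: 'c' gets nil
local x, y = 10, 20, 30     -- one value too many: 30 is thrown away
local u, v                  -- no initial assignment: both get nil

print(a, b, c)   --> 1    2    nil
print(x, y)      --> 10   20
print(u, v)      --> nil  nil
```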

A common idiom in Lua is

      local foo = foo

This code creates a local variable, foo, and initializes it with the value of the global variable foo. (The local foo becomes visible only after its declaration.) This idiom is useful to speed up the access to foo. It is also useful when the chunk needs to preserve the original value of foo even if later some other function changes the value of the global foo; in particular, it makes the code resistant to monkey patching. Any piece of code preceded by local print = print will use the original function print even if print is monkey patched to something else.
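
The following sketch illustrates the second use of this idiom. The global function greet and the helper say are hypothetical names introduced only for this example, and the sketch assumes that the chunk runs in the standard global environment, so that _G.greet is the global greet.

```lua
function greet () return "hello" end       -- a global function

local greet = greet                        -- local copy of its current value

local function say () return greet() end   -- refers to the local 'greet'

-- later, some other code replaces the global
_G.greet = function () return "patched" end

print(say())        --> hello
print(_G.greet())   --> patched
```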

Some people think it is a bad practice to use declarations in the middle of a block. Quite the opposite: by declaring a variable only when we need it, we seldom need to declare it without an initial value (and therefore we seldom forget to initialize it). Moreover, we shorten the scope of the variable, which increases readability.

Lua provides a small and conventional set of control structures, with if for conditional execution and while, repeat, and for for iteration. All control structures have a syntax with an explicit terminator: end terminates if, for and while structures; until terminates repeat structures.

The condition expression of a control structure can result in any value. Remember that Lua treats as true all values different from false and nil. (In particular, Lua treats both zero and the empty string as true.)
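
A quick way to check this rule is a small helper that reports how a condition treats a given value; the name truthy is hypothetical.

```lua
-- 'truthy' is a hypothetical helper: it reports whether a value
-- counts as true when used as a condition
local function truthy (v)
  if v then return true else return false end
end

print(truthy(0))      --> true
print(truthy(""))     --> true
print(truthy({}))     --> true
print(truthy(false))  --> false
print(truthy(nil))    --> false
```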

An if statement tests its condition and executes its then-part or its else-part accordingly. The else-part is optional.

      if a < 0 then a = 0 end
      
      if a < b then return a else return b end
      
      if line > MAXLINES then
        showpage()
        line = 0
      end

To write nested ifs we can use elseif. It is similar to an else followed by an if, but it avoids the need for multiple ends:

      if op == "+" then
        r = a + b
      elseif op == "-" then
        r = a - b
      elseif op == "*" then
        r = a*b
      elseif op == "/" then
        r = a/b
      else
        error("invalid operation")
      end

Because Lua has no switch statement, such chains are somewhat common.
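
To make the difference between the two forms concrete, the sketch below writes the same three-way test twice, once with a nested if and once with elseif. Both function names are hypothetical.

```lua
-- an 'else' followed by an 'if': each 'if' needs its own 'end'
local function sign_nested (n)
  if n > 0 then
    return 1
  else
    if n < 0 then
      return -1
    else
      return 0
    end          -- closes the inner 'if'
  end            -- closes the outer 'if'
end

-- the same logic with 'elseif': a single 'end' closes the whole chain
local function sign_chained (n)
  if n > 0 then
    return 1
  elseif n < 0 then
    return -1
  else
    return 0
  end
end

print(sign_nested(-7), sign_chained(-7))   --> -1   -1
```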

As the name implies, a while loop repeats its body while a condition is true. As usual, Lua first tests the while condition; if the condition is false, then the loop ends; otherwise, Lua executes the body of the loop and repeats the process.

      local i = 1
      while a[i] do
        print(a[i])
        i = i + 1
      end

As the name implies, a repeat-until statement repeats its body until its condition is true. This statement does the test after the body, so that it always executes the body at least once.

      -- print the first non-empty input line
      local line
      repeat
        line = io.read()
      until line ~= ""
      print(line)

Unlike in most other languages, in Lua the scope of a local variable declared inside the loop includes the condition:

      -- computes the square root of 'x' using Newton-Raphson method
      local sqr = x / 2
      repeat
        sqr = (sqr + x/sqr) / 2
        local error = math.abs(sqr^2 - x)
      until error < x/10000      -- local 'error' still visible here
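
The same loop can be wrapped in a function to try it out. This is only a sketch: the function name newton_sqrt is hypothetical, it assumes x is positive, and it calls the local err instead of error simply to avoid shadowing the standard function error.

```lua
-- 'newton_sqrt' is a hypothetical name; assumes x > 0
local function newton_sqrt (x)
  local sqr = x / 2
  repeat
    sqr = (sqr + x/sqr) / 2
    local err = math.abs(sqr^2 - x)
  until err < x/10000        -- local 'err' is still visible here
  return sqr
end

print(newton_sqrt(2))        -- an approximation of math.sqrt(2)
```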

The for statement has two variants: the numerical for and the generic for.

A numerical for has the following syntax:

      for var = exp1, exp2, exp3 do
        something
      end

This loop will execute something for each value of var from exp1 to exp2, using exp3 as the step to increment var. This third expression is optional; when absent, Lua assumes one as the step value. If we want a loop without an upper limit, we can use the constant math.huge:

      for i = 1, math.huge do
        if (0.3*i^3 - 20*i^2 - 500 >= 0) then
          print(i)
          break
        end
      end
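
The next sketch shows the effect of the step in a few small loops. The negative step and the empty loop are not discussed above, but they follow the same rule: the loop stops once the control variable has passed the limit in the direction of the step.

```lua
local evens = {}
for i = 2, 10, 2 do evens[#evens + 1] = i end
print(table.concat(evens, " "))       --> 2 4 6 8 10

local countdown = {}
for i = 3, 1, -1 do countdown[#countdown + 1] = i end
print(table.concat(countdown, " "))   --> 3 2 1

local runs = 0
for i = 5, 1 do runs = runs + 1 end   -- start already past the limit
print(runs)                           --> 0
```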

The for loop has some subtleties that you should learn in order to make good use of it. First, all three expressions are evaluated once, before the loop starts. Second, the control variable is a local variable automatically declared by the for statement, and it is visible only inside the loop. A typical mistake is to assume that the variable still exists after the loop ends:

      for i = 1, 10 do print(i) end
      max = i      -- probably wrong!
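
Both points can be checked with a short sketch. The helper limit, which counts how many times it is called, is hypothetical.

```lua
local limit_calls = 0
local function limit ()         -- hypothetical helper: counts its own calls
  limit_calls = limit_calls + 1
  return 3
end

local iterations = 0
for i = 1, limit() do           -- 'limit()' is evaluated only once
  iterations = iterations + 1
end

print(limit_calls)   --> 1
print(iterations)    --> 3
print(i)             --> nil  (a global 'i', assuming none was defined)
```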

If you need the value of the control variable after the loop (usually when you break the loop), you must save its value into another variable:

      -- find a value in a list
      local found = nil
      for i = 1, #a do
        if a[i] < 0 then
          found = i      -- save value of 'i'
          break
        end
      end
      print(found)

Third, you should not change the value of the control variable: the effect of such changes is unpredictable. If you want to end a for loop before its normal termination, use break (as we did in the previous example).

The break and return statements allow us to jump out of a block. The goto statement allows us to jump to almost any point in a function.

We use the break statement to finish a loop. This statement breaks the innermost loop (for, repeat, or while) that contains it; it cannot be used outside a loop. After the break, the program continues running from the point immediately after the broken loop.
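
A small sketch with two nested loops shows that a break leaves only the loop that directly contains it; the table visited is just a way to record what happened.

```lua
local visited = {}
for i = 1, 3 do
  for j = 1, 3 do
    if j == 2 then break end            -- leaves only the loop over 'j'
    visited[#visited + 1] = i .. "," .. j
  end
  -- execution continues here after the break; the loop over 'i' goes on
end
print(table.concat(visited, " "))       --> 1,1 2,1 3,1
```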

A return statement returns the results from a function or simply finishes the function. There is an implicit return at the end of any function, so we do not need to write one for functions that end naturally, without returning any value.

For syntactic reasons, a return can appear only as the last statement of a block: in other words, as the last statement in our chunk or just before an end, an else, or an until. For instance, in the next example, return is the last statement of the then block:

      local i = 1
      while a[i] do
        if a[i] == v then return i end
        i = i + 1
      end

Usually, these are the places where we use a return, because any statement following it would be unreachable. Sometimes, however, it may be useful to write a return in the middle of a block; for instance, we may be debugging a function and want to avoid its execution. In such cases, we can use an explicit do block around the statement:

      function foo ()
        return                --<< SYNTAX ERROR
        -- 'return' is the last statement in the next block
        do return end         -- OK
        other statements
      end
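
A runnable variant of this idea might look as follows; the function process and the table log are hypothetical names.

```lua
local log = {}

local function process ()
  log[#log + 1] = "start"
  do return end                  -- temporary early exit while debugging
  log[#log + 1] = "finish"       -- never reached, but still compiles
end

process()
print(table.concat(log, " "))    --> start
```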

A goto statement jumps the execution of a program to a corresponding label. There has been a long-running debate about goto, with some people arguing even today that it is harmful to programming and should be banned from programming languages. Nonetheless, several current languages offer goto, with good reason. It is a powerful mechanism and, when used with care, can improve the quality of our code.

In Lua, the syntax for a goto statement is quite conventional: it is the reserved word goto followed by the label name, which can be any valid identifier. The syntax for a label is a little more convoluted: it has two colons followed by the label name followed by two more colons, as in ::name::. This convolution is intentional, to highlight labels in a program.

Lua imposes some restrictions on where we can jump with a goto. First, labels follow the usual visibility rules, so we cannot jump into a block (because a label inside a block is not visible outside it). Second, we cannot jump out of a function. (Note that the first rule already excludes the possibility of jumping into a function.) Third, we cannot jump into the scope of a local variable.
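
Lua checks these restrictions when it compiles the code, so one way to observe them is to compile small fragments with load, which returns nil plus an error message when a fragment is rejected. The sketch below assumes Lua 5.2 or later (where goto exists and load accepts strings); all label and variable names are arbitrary.

```lua
-- first rule: a label inside a block is not visible outside it
local into_block = load([[
  goto inner
  do
    ::inner::
  end
]])

-- second rule: a goto cannot leave the function that contains it
local out_of_function = load([[
  ::top::
  local function f ()
    goto top
  end
]])

-- third rule: a goto cannot jump into the scope of a local
local into_scope = load([[
  goto after
  local x = 1
  ::after::
  print(x)
]])

-- jumping out of a block to a visible label is fine
local allowed = load([[
  do
    goto out
  end
  ::out::
]])

print(into_block, out_of_function, into_scope)   --> nil  nil  nil
print(type(allowed))                             --> function
```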

A typical and well-behaved use of a goto is to simulate some construction that you learned from another language but that is absent from Lua, such as continue, multi-level break, multi-level continue, redo, local error handling, etc. A continue statement is simply a goto to a label at the end of a loop block; a redo statement jumps to the beginning of the block:

      while some_condition do
        ::redo::
        if some_other_condition then goto continue
        elseif yet_another_condition then goto redo
        end
        some code
        ::continue::
      end
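
As a sketch of another construction from that list, a multi-level break, the function below leaves two nested loops at once by jumping to a label placed after them. The names find_in_grid and done are hypothetical, and the sketch assumes Lua 5.2 or later.

```lua
-- 'find_in_grid' is a hypothetical example: it returns the row and
-- column of the first occurrence of 'target', or nil if there is none
local function find_in_grid (grid, target)
  local row, col                 -- declared before any goto
  for i = 1, #grid do
    for j = 1, #grid[i] do
      if grid[i][j] == target then
        row, col = i, j
        goto done                -- leaves both loops at once
      end
    end
  end
  ::done::
  return row, col
end

print(find_in_grid({{1, 2}, {3, 4}}, 3))   --> 2   1
```

In this small function a plain return inside the inner loop would do the same job; the goto form becomes more attractive when some code after the loops must run in both cases.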

A useful detail in the specification of Lua is that the scope of a local variable ends on the last non-void statement of the block where the variable is defined; labels are considered void statements. To see the usefulness of this detail, consider the next fragment:

      while some_condition do
        if some_other_condition then goto continue end
        local var = something
        some code
        ::continue::
      end

You may think that this goto jumps into the scope of the variable var. However, the continue label appears after the last non-void statement of the block, and therefore it is not inside the scope of var.
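
A runnable version of this pattern could look like the sketch below, where the local doubled is declared between the goto and the label. The function name is hypothetical, and the sketch assumes Lua 5.2 or later.

```lua
-- 'sum_of_doubled_odds' is a hypothetical example of the pattern above
local function sum_of_doubled_odds (list)
  local total = 0
  for _, v in ipairs(list) do
    if v % 2 == 0 then goto continue end   -- skip even values
    local doubled = v * 2                  -- declared after the goto
    total = total + doubled
    ::continue::                           -- last statement: outside the scope of 'doubled'
  end
  return total
end

print(sum_of_doubled_odds({1, 2, 3, 4, 5}))   --> 18
```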

The goto is also useful for writing state machines. As an example, Figure 8.1, “An example of a state machine with goto” shows a program that checks whether its input has an even number of zeros.

There are better ways to write this specific program, but this technique is useful if we want to translate a finite automaton into Lua code automatically (think about dynamic code generation).
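
The figure itself is not reproduced here. As a rough sketch of the same idea, and not the program from the figure, the function below checks a string instead of the program input, with one labeled block per state. The names has_even_zeros, next_char, even, and odd are assumptions made for this sketch, which requires Lua 5.2 or later.

```lua
-- a sketch only: checks whether 's' has an even number of zeros
local function has_even_zeros (s)
  local pos = 0
  local function next_char ()      -- returns the next character, or nil at the end
    pos = pos + 1
    if pos > #s then return nil end
    return s:sub(pos, pos)
  end

  ::even:: do                      -- state: even number of zeros so far
    local c = next_char()
    if c == nil then return true
    elseif c == "0" then goto odd
    else goto even
    end
  end

  ::odd:: do                       -- state: odd number of zeros so far
    local c = next_char()
    if c == nil then return false
    elseif c == "0" then goto even
    else goto odd
    end
  end
end

print(has_even_zeros("1001"))   --> true
print(has_even_zeros("10"))     --> false
```

Each state is a do block so that its local c goes out of scope before the next label, which keeps every goto from jumping into the scope of a local variable.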

As another example, let us consider a simple maze game. The maze has several rooms, each with up to four doors: north, south, east, and west. At each step, the user enters a movement direction. If there is a door in this direction, the user goes to the corresponding room; otherwise, the program prints a warning. The goal is to go from an initial room to a final room.

This game is a typical state machine, where the current room is the state. We can implement this maze with one block for each room, using a goto to move from one room to another. Figure 8.2, “A maze game” shows how we could write a small maze with four rooms.

For this simple game, you may find that a data-driven program, where you describe the rooms and movements with tables, is a better design. However, if the game has several special situations in each room, then this state-machine design is quite appropriate.

Exercise 8.1: Most languages with a C-like syntax do not offer an elseif construct. Why does Lua need this construct more than those languages?

Exercise 8.2: Describe four different ways to write an unconditional loop in Lua. Which one do you prefer?

Exercise 8.3: Many people argue that repeat-until is seldom used, and therefore it should not be present in a minimalistic language like Lua. What do you think?

Exercise 8.4: As we saw in the section called “Proper Tail Calls”, a tail call is a goto in disguise. Using this idea, reimplement the simple maze game from the section called “break, return, and goto” using tail calls. Each block should become a new function, and each goto becomes a tail call.

Exercise 8.5: Can you explain why Lua has the restriction that a goto cannot jump out of a function? (Hint: how would you implement that feature?)

Exercise 8.6: Assuming that a goto could jump out of a function, explain what the program in Figure 8.3, “A strange (and invalid) use of a goto” would do.

(Try to reason about the label using the same scoping rules used for local variables.)
