Global variables are a necessary evil of most programming languages. On one hand, the use of global variables can easily lead to complex code, entangling apparently unrelated parts of a program. On the other hand, the judicious use of global variables can better express truly global aspects of a program; moreover, global constants are innocuous, but dynamic languages like Lua have no way to distinguish constants from variables. An embedded language like Lua adds another ingredient to this mix: a global variable is a variable that is visible in the whole program, but Lua has no clear concept of a program, having instead pieces of code (chunks) called by the host application.
Lua solves this conundrum by not having global variables, but going to great lengths to pretend it has. In a first approximation, we can think that Lua keeps all its global variables in a regular table, called the global environment. Later in this chapter, we will see that Lua can keep its “global” variables in several environments. For now, we will stick to that first approximation.
The use of a table to store global variables simplifies the internal implementation of Lua, because there is no need for a different data structure for global variables. Another advantage is that we can manipulate this table like any other table. To help such manipulations, Lua stores the global environment itself in the global variable _G. (As a result, _G._G is equal to _G.) For instance, the following code prints the names of all the variables defined in the global environment:
for n in pairs(_G) do print(n) end
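Because the global environment is an ordinary table, the standard table functions work on it too. The following sketch collects the names, sorts them, and prints each one with the type of its value. The helper name globalnames is only an illustration, not part of Lua:

```lua
-- Collect the names of all global variables into a sorted list.
-- 'globalnames' is a hypothetical helper, not part of Lua.
function globalnames ()
  local names = {}
  for n in pairs(_G) do
    names[#names + 1] = tostring(n)   -- keys are normally strings already
  end
  table.sort(names)
  return names
end

-- Print each name together with the type of its current value.
for _, n in ipairs(globalnames()) do
  print(n, type(_G[n]))
end
```

The exact list depends on the Lua version and on what the host application has loaded.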
Usually, assignment is enough for accessing and setting global variables. However, sometimes we need some form of meta-programming, such as when we need to manipulate a global variable whose name is stored in another variable or is somehow computed at run time. To get the value of such a variable, some programmers are tempted to write something like this:
value = load("return " .. varname)()
If varname is x, for example, the concatenation will result in "return x", which when run achieves the desired result. However, this code involves the creation and compilation of a new chunk, which is somewhat expensive. We can accomplish the same effect with the following code, which is more than an order of magnitude more efficient than the previous one:
value = _G[varname]
Because the environment is a regular table, we can simply index it with the desired key (the variable name).
In a similar way, we can assign a value to a global variable whose name is computed dynamically by writing _G[varname] = value. Beware, however: some programmers get a little excited with these facilities and end up writing code like _G["a"] = _G["b"], which is just a complicated way to write a = b.
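As a small sketch of a case where the name really is computed, the next fragment creates three globals whose names are built at run time and then reads one of them back through a name stored in a variable. The names var1, var2, and var3 are arbitrary:

```lua
-- Create globals 'var1', 'var2', and 'var3'; the names are built at run time.
for i = 1, 3 do
  _G["var" .. i] = i * 10
end

print(var1, var2, var3)      --> 10  20  30

-- Read a global whose name is held in another variable.
local varname = "var2"
print(_G[varname])           --> 20
```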
A generalization of the previous problem is to allow fields in the dynamic name, such as "io.read" or "a.b.c.d". If we write _G["io.read"], clearly we will not get the field read from the table io. But we can write a function getfield such that getfield("io.read") returns the expected result. This function is mainly a loop, which starts at _G and evolves field by field:
function getfield (f)
  local v = _G    -- start with the table of globals
  for w in string.gmatch(f, "[%a_][%w_]*") do
    v = v[w]
  end
  return v
end
We rely on gmatch to iterate over all identifiers in f.
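A few calls show how getfield behaves, including what happens when an intermediate table does not exist. The function is repeated here so that the fragment runs on its own; the name no_such_table is just an assumed-absent global:

```lua
-- 'getfield' as defined in the text.
function getfield (f)
  local v = _G    -- start with the table of globals
  for w in string.gmatch(f, "[%a_][%w_]*") do
    v = v[w]
  end
  return v
end

print(getfield("math.pi") == math.pi)         --> true
print(getfield("string.format")("%d", 42))    --> 42

-- A missing intermediate table makes 'v[w]' index a nil value,
-- so the call raises an error; pcall lets us observe that safely.
local ok = pcall(getfield, "no_such_table.field")
print(ok)                                     --> false
```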
The corresponding function to set fields is a little more complex. An assignment like a.b.c.d = v is equivalent to the following code:
local temp = a.b.c
temp.d = v
That is, we must retrieve up to the last name and then handle this last name separately. The function setfield, in Figure 22.1, “The function setfield”, does the task and also creates intermediate tables in a path when they do not exist.
Figure 22.1. The function setfield
function setfield (f, v)
  local t = _G    -- start with the table of globals
  for w, d in string.gmatch(f, "([%a_][%w_]*)(%.?)") do
    if d == "." then    -- not last name?
      t[w] = t[w] or {}    -- create table if absent
      t = t[w]    -- get the table
    else    -- last name
      t[w] = v    -- do the assignment
    end
  end
end
The pattern there captures the field name in the variable w and an optional following dot in the variable d. If a field name is not followed by a dot, then it is the last name.
With the previous functions in place, the next call creates a global table t, another table t.x, and assigns 10 to t.x.y:
setfield("t.x.y", 10)
print(t.x.y) --> 10
print(getfield("t.x.y")) --> 10
Global variables in Lua do not need declarations. Although this behavior is handy for small programs, in larger programs a simple typo can cause bugs that are difficult to find. However, we can change this behavior if we like. Because Lua keeps its global variables in a regular table, we can use metatables to detect when Lua accesses non-existent variables.
A first approach simply detects any access to absent keys in the global table:
setmetatable(_G, {
  __newindex = function (_, n)
    error("attempt to write to undeclared variable " .. n, 2)
  end,
  __index = function (_, n)
    error("attempt to read undeclared variable " .. n, 2)
  end,
})
After this code, any attempt to access a non-existent global variable will trigger an error:
> print(a)
stdin:1: attempt to read undeclared variable a
But how do we declare new variables? One option is to use rawset, which bypasses the metamethod:
function declare (name, initval)
  rawset(_G, name, initval or false)
end
(The or with false ensures that the new global always gets a value different from nil.)
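The following sketch puts the two pieces together. One detail is easy to miss: declare is itself a global, so in this sketch we define it before installing the metatable. The variable names counter, flag, and the misspelled cuonter are only illustrations, and the final setmetatable call is there just to leave the rest of the program unchecked again:

```lua
-- Define 'declare' first: once the checks are installed, creating
-- any new global (including a global function) raises an error.
function declare (name, initval)
  rawset(_G, name, initval or false)
end

setmetatable(_G, {
  __newindex = function (_, n)
    error("attempt to write to undeclared variable " .. n, 2)
  end,
  __index = function (_, n)
    error("attempt to read undeclared variable " .. n, 2)
  end,
})

declare("counter", 0)
counter = counter + 1      -- fine: 'counter' already exists in the table
print(counter)             --> 1

declare("flag")            -- no initial value: stored as false
print(flag)                --> false

-- A typo now raises an error instead of silently creating a global.
local ok, msg = pcall(function () cuonter = 5 end)
print(ok, msg)             -- 'ok' is false; 'msg' holds the error message

setmetatable(_G, nil)      -- remove the checks again
```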
A simpler option is to restrict assignments to new global variables only inside functions, allowing free assignments in the outer level of a chunk.
To check whether an assignment is in the main chunk, we must use the debug library. The call debug.getinfo(2, "S") returns a table whose field what tells whether the function that called the metamethod is a main chunk, a regular Lua function, or a C function. (We will see debug.getinfo in more detail in the section called “Introspective Facilities”.) Using this function, we can rewrite the __newindex metamethod like this:
__newindex = function (t, n, v)
  local w = debug.getinfo(2, "S").what
  if w ~= "main" and w ~= "C" then
    error("attempt to write to undeclared variable " .. n, 2)
  end
  rawset(t, n, v)
end
This new version also accepts assignments from C code, as this kind of code usually knows what it is doing.
If we need to test whether a variable exists, we cannot simply compare it to nil because, if it is nil, the access will raise an error. Instead, we use rawget, which avoids the metamethod:
if rawget(_G, var) == nil then
  -- 'var' is undeclared
  ...
end
As it is, our scheme does not allow global variables with nil values, as they would be automatically considered undeclared. But it is not difficult to correct this problem. All we need is an auxiliary table that keeps the names of declared variables. Whenever a metamethod is called, it checks in this table whether the variable is undeclared. The code can be like the one in Figure 22.2, “Checking global-variable declaration”.
Figure 22.2. Checking global-variable declaration
local declaredNames = {}
setmetatable(_G, {
  __newindex = function (t, n, v)
    if not declaredNames[n] then
      local w = debug.getinfo(2, "S").what
      if w ~= "main" and w ~= "C" then
        error("attempt to write to undeclared variable "..n, 2)
      end
      declaredNames[n] = true
    end
    rawset(t, n, v)    -- do the actual set
  end,
  __index = function (_, n)
    if not declaredNames[n] then
      error("attempt to read undeclared variable "..n, 2)
    else
      return nil
    end
  end,
})
Now, even an assignment like x = nil is enough to declare a global variable.
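To see the difference between a declared variable that holds nil and an undeclared one, we can try the code from Figure 22.2 in a short script. This sketch assumes that it runs as a main chunk (for instance, from a file), because only then are the top-level assignments accepted as declarations. The names ok_x and ok_y are only illustrations:

```lua
local declaredNames = {}

setmetatable(_G, {
  __newindex = function (t, n, v)
    if not declaredNames[n] then
      local w = debug.getinfo(2, "S").what
      if w ~= "main" and w ~= "C" then
        error("attempt to write to undeclared variable " .. n, 2)
      end
      declaredNames[n] = true
    end
    rawset(t, n, v)   -- do the actual set
  end,
  __index = function (_, n)
    if not declaredNames[n] then
      error("attempt to read undeclared variable " .. n, 2)
    else
      return nil
    end
  end,
})

x = nil                                    -- main chunk: declares 'x'
ok_x = pcall(function () return x end)     -- declared, value is nil
ok_y = pcall(function () return y end)     -- 'y' was never declared
print(ok_x, ok_y)                          --> true  false

setmetatable(_G, nil)                      -- remove the checks again
```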
The overhead for both solutions is negligible. With the first solution, the metamethods are never called during normal operation. In the second, they can be called, but only when the program accesses a variable holding a nil.
The Lua distribution comes with a module strict.lua that implements a global-variable check that uses essentially the code in Figure 22.2, “Checking global-variable declaration”. It is a good habit to use it when developing Lua code.
In Lua, global variables do not need to be truly global. As I already hinted, Lua does not even have global variables. That may sound strange at first, as we have been using global variables throughout this text. As I said, Lua goes to great lengths to give the programmer an illusion of global variables. Now we will see how Lua builds this illusion.[19]
First, let us forget about global variables. Instead, we will start with the concept of free names. A free name is a name that is not bound to an explicit declaration, that is, it does not occur inside the scope of a corresponding local variable. For instance, both x and y are free names in the following chunk, but z is not:
local z = 10
x = y + z
Now comes the important part: The Lua compiler translates any free name x in the chunk to _ENV.x. So, the previous chunk is fully equivalent to this one:
local z = 10
_ENV.x = _ENV.y + z
But what is this new _ENV variable? _ENV cannot be a global variable; we just said that Lua has no global variables. Again, the compiler does the trick. I already mentioned that Lua treats any chunk as an anonymous function. Actually, Lua compiles our original chunk as the following code:
local _ENV = some value
return function (...)
  local z = 10
  _ENV.x = _ENV.y + z
end
That is, Lua compiles any chunk in the presence of a predefined upvalue (an external local variable) called _ENV. So, any variable is either local, if it is a bound name, or a field in _ENV, which itself is a local variable (an upvalue).
The initial value for _ENV can be any table. (Actually, it does not need to be a table; more about that later.) Any such table is called an environment. To preserve the illusion of global variables, Lua keeps internally a table that it uses as a global environment. Usually, when we load a chunk, the function load initializes this predefined upvalue with that global environment. So, our original chunk becomes equivalent to this one:
local _ENV = the global environment
return function (...)
  local z = 10
  _ENV.x = _ENV.y + z
end
The result of all these arrangements is that the x field of the global environment gets the value of the y field plus 10.
At first sight, this may seem a rather convoluted way to manipulate global variables. I will not argue that it is the simplest way, but it offers a flexibility that is difficult to achieve with a simpler implementation.
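Before we go on, let us summarize the handling of global variables in Lua:
1. The compiler creates a local variable _ENV outside any chunk that it compiles.
2. The compiler translates any free name var to _ENV.var.
3. The function load (or loadfile) initializes the first upvalue of a chunk with the global environment, which is a regular table kept internally by Lua.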
After all, it is not that complicated. Some people get confused because they try to infer extra magic from these rules. There is no extra magic. In particular, the first two rules are done entirely by the compiler. Except for being predefined by the compiler, _ENV is a plain regular variable. Outside the compiler, the name _ENV has no special meaning at all to Lua.[20] Similarly, the translation from x to _ENV.x is a plain syntactic translation, with no hidden meanings. In particular, after the translation, _ENV will refer to whatever _ENV variable is visible at that point in the code, following the standard visibility rules.
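We can observe these arrangements from inside Lua with the debug library. The sketch below asks for the first upvalue of a freshly loaded chunk and of an ordinary function that uses no free names. It assumes that the chunk was compiled with debug information (the default) and that nobody has changed _G. The function name double is only an illustration:

```lua
-- A loaded chunk has one upvalue, and its name is _ENV.
local f = load("x = 10")
local name, value = debug.getupvalue(f, 1)
print(name)              --> _ENV
print(value == _G)       --> true (load used the global environment)

-- A function with no free names and no external locals
-- does not refer to _ENV at all, so it has no upvalues.
local function double (n) return 2 * n end
print(debug.getupvalue(double, 1) == nil)   --> true
```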
In this section, we will see some ways to explore the flexibility brought by _ENV. Keep in mind that we must run most examples in this section as a single chunk. If we enter code line by line in interactive mode, each line becomes a different chunk and therefore each will have a distinct _ENV variable. To run a piece of code as a single chunk, we can either run it from a file or enclose it in a do-end block.
Because _ENV is a regular variable, we can assign to and access it as any other variable. The assignment _ENV = nil will invalidate any direct access to global variables in the rest of the chunk. This can be useful to control what variables our code uses:
local print, sin = print, math.sin
_ENV = nil
print(13) --> 13
print(sin(13)) --> 0.42016703682664
print(math.cos(13)) -- error!
Any assignment to a free name (a “global variable”) will raise a similar error.
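The next sketch shows both kinds of failure, reading and writing, without aborting the program. It uses a local _ENV inside a function instead of assigning to the chunk's _ENV, so that the rest of the chunk keeps its environment. The names try_without_env, undefined_name, and some_global are only illustrations:

```lua
-- Returns whether a read and a write of a free name succeed
-- when the visible _ENV is nil.
function try_without_env ()
  local pcall = pcall          -- keep what we still need as a local
  local _ENV = nil             -- from here on, free names cannot be resolved
  local ok_read = pcall(function () return undefined_name end)
  local ok_write = pcall(function () some_global = 1 end)
  return ok_read, ok_write
end

print(try_without_env())       --> false  false
```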
We can write the _ENV explicitly to bypass a local declaration:
a = 13 -- global
local a = 12
print(a) --> 12 (local)
print(_ENV.a) --> 13 (global)
We can do the same with _G:
a = 13 -- global
local a = 12
print(a) --> 12 (local)
print(_G.a) --> 13 (global)
Usually, _G and _ENV refer to the same table but, despite that, they are quite different entities. _ENV is a local variable, and all accesses to “global variables” in reality are accesses to it. _G is a global variable with no special status whatsoever. By definition, _ENV always refers to the current environment; _G usually refers to the global environment, provided it is visible and no one changed its value.
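A short sketch makes the distinction concrete. It assumes that the chunk was loaded with the global environment and that _G still has its original value. The function name compare is only an illustration:

```lua
function compare ()
  local _G = _G                    -- keep a reference to the global environment
  local before = (_ENV == _G)      -- the chunk's _ENV is the global environment
  local _ENV = {}                  -- a new current environment for this scope
  local after = (_ENV == _G)       -- now they are different tables
  return before, after
end

print(compare())                   --> true  false
```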
The main use for _ENV is to change the environment used by a piece of code. Once we change the environment, all global accesses will use the new table:
-- change current environment to a new empty table
_ENV = {}
a = 1 -- create a field in _ENV
print(a)
--> stdin:4: attempt to call global 'print' (a nil value)
If the new environment is empty, we have lost all our global variables, including print. So, we should first populate it with some useful values, for instance with the global environment:
a = 15 -- create a global variable
_ENV = {g = _G} -- change current environment
a = 1 -- create a field in _ENV
g.print(_ENV.a, g.a) --> 1 15
Now, when we access the “global” g (which lives in _ENV, not in the global environment) we get the global environment, wherein Lua will find the function print.
We can rewrite the previous example using the name _G instead of g:
a = 15 -- create a global variable
_ENV = {_G = _G} -- change current environment
a = 1 -- create a field in _ENV
_G.print(_ENV.a, _G.a) --> 1 15
The only special status of _G happens when Lua creates the initial global table and makes its field _G point to itself. Lua does not care about the current value of this variable. Nevertheless, it is customary to use this same name whenever we have a reference to the global environment, as we did in the rewritten example.
Another way to populate our new environment is with inheritance:
a = 1
local newgt = {} -- create new environment
setmetatable(newgt, {__index = _G})
_ENV = newgt -- set it
print(a) --> 1
In this code, the new environment inherits both print and a from the global one. However, any assignment goes to the new table. There is no danger of changing a variable in the global environment by mistake, although we still can change them through _G:
-- continuing the previous chunk
a = 10
print(a, _G.a) --> 10 1
_G.a = 20
print(_G.a) --> 20
Being a regular variable, _ENV follows the usual scoping rules. In particular, functions defined inside a chunk access _ENV as they access any other external variable:
_ENV = {_G = _G}
local function foo ()
  _G.print(a) -- compiled as '_ENV._G.print(_ENV.a)'
end
a = 10
foo() --> 10
_ENV = {_G = _G, a = 20}
foo() --> 20
If we define a new local variable called _ENV, references to free names will bind to that new variable:
a = 2
do
  local _ENV = {print = print, a = 14}
  print(a) --> 14
end
print(a) --> 2 (back to the original _ENV)
Therefore, it is not difficult to define a function with a private environment:
function factory (_ENV)
  return function () return a end
end
f1 = factory{a = 6}
f2 = factory{a = 7}
print(f1()) --> 6
print(f2()) --> 7
The factory function creates simple closures that return the value of their “global” a. When the closure is created, its visible _ENV variable is the parameter _ENV of the enclosing factory function; therefore, each closure will use its own external variable (as an upvalue) to access its free names.
Using the usual scoping rules, we can manipulate environments in several other ways. For instance, we may have several functions sharing a common environment, or a function that changes the environment that it shares with other functions.
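As a sketch of both possibilities, the following function creates three closures that share one private environment; one of them replaces that environment for all of them. The names newcounter, inc, get, reset, and count are only illustrations:

```lua
function newcounter ()
  local _ENV = {count = 0}       -- private environment shared by the closures
  local function inc () count = count + 1 end       -- writes _ENV.count
  local function get () return count end            -- reads _ENV.count
  local function reset () _ENV = {count = 0} end    -- swaps the shared environment
  return inc, get, reset
end

local inc, get, reset = newcounter()
inc(); inc()
print(get())        --> 2
reset()
print(get())        --> 0
print(count)        --> nil (the global environment was never touched)
```

Because _ENV is a single upvalue shared by the three closures, the assignment inside reset is seen by inc and get as well.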
In the section called “The Basic Approach for Writing Modules in Lua”, when we discussed how to write modules, I mentioned that one drawback of those methods was that it was all too easy to pollute the global space, for instance by forgetting a local in a private declaration. Environments offer an interesting technique for solving that problem. Once the module main chunk has an exclusive environment, not only do all its functions share this table, but all its global variables also go to this table. We can declare all public functions as global variables and they will go to a separate table automatically. All the module has to do is to assign this table to the _ENV variable. After that, when we declare a function add, it goes to M.add:
local M = {}
_ENV = M
function add (c1, c2)
  return new(c1.r + c2.r, c1.i + c2.i)
end
Moreover, we can call other functions from the same module without any prefix. In the previous code, add gets new from its environment, that is, it calls M.new.
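To try this technique in a single file, we can wrap the module body in a function and use a local _ENV instead of assigning to the chunk's _ENV. That wrapper, the name makecomplex, and the definition of new below are assumptions made only for this sketch; a real module file would be the body of the function:

```lua
-- Hypothetical stand-in for a module file.
function makecomplex ()
  local M = {}
  local _ENV = M            -- a module file would write '_ENV = M' instead

  -- Both functions are "global", so they become fields of M.
  function new (r, i) return {r = r, i = i} end

  function add (c1, c2)
    return new(c1.r + c2.r, c1.i + c2.i)    -- 'new' is found in M
  end

  return M
end

local complex = makecomplex()
local c = complex.add(complex.new(1, 2), complex.new(3, 4))
print(c.r, c.i)             --> 4  6
print(new, add)             --> nil  nil (nothing went to the global environment)
```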
This method offers good support for modules, with little extra work for the programmer. It needs no prefixes at all. There is no difference between calling an exported function and a private one. If the programmer forgets a local, the global namespace is not polluted; instead, a private function simply becomes public.
Nevertheless, currently I still prefer the original basic method. It may need more work, but the resulting code states clearly what it does. To avoid creating a global by mistake, I use the simple method of assigning nil to _ENV. After that, any assignment to a global name will raise an error. This approach has the extra advantage that it works without changes in older versions of Lua. (In Lua 5.1, the assignment to _ENV will not prevent errors, but it will not cause any harm, either.)
To access other modules, we can use one of the methods we discussed in the previous section. For instance, we can declare a local variable that holds the global environment:
local M = {}
local _G = _G
_ENV = nil
We then prefix global names with _G and module names with M.
A more disciplined approach is to declare as locals only the functions we need or, at most, the modules we need:
-- module setup
local M = {}
-- Import Section:
-- declare everything this module needs from outside
local sqrt = math.sqrt
local io = io
-- no more external access after this point
_ENV = nil
This technique demands more work, but it documents the module dependencies better.
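The following sketch shows such a module at work. As a convenience for running it in one file, the module body is wrapped in a function and uses a local _ENV; the names makegeometry, norm, broken, and total are only illustrations:

```lua
-- Hypothetical stand-in for a module file.
function makegeometry ()
  local M = {}

  -- Import Section
  local sqrt = math.sqrt

  -- no more external access after this point
  local _ENV = nil

  function M.norm (x, y)
    return sqrt(x * x + y * y)     -- uses only locals and parameters
  end

  function M.broken ()
    total = 0                      -- forgotten 'local': an error when called
  end

  return M
end

local geometry = makegeometry()
print(geometry.norm(3, 4) == 5)       --> true
print(pcall(geometry.broken))         -- false, followed by the error message
```

The mistake in broken is detected only when the function runs, not when the module is loaded, because the translation to _ENV.total is purely syntactic.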
As I mentioned earlier, load usually initializes the _ENV upvalue of a loaded chunk with the global environment. However, load has an optional fourth parameter that allows us to give a different initial value for _ENV. (The function loadfile has a similar parameter.)
For an initial example, consider that we have a typical configuration file, defining several constants and functions to be used by a program; it can be something like this:
-- file 'config.lua'
width = 200
height = 300
...
We can load it with the following code:
env = {}
loadfile("config.lua", "t", env)()
The whole code in the configuration file will run in the empty environment env, which works as a kind of sandbox. In particular, all definitions will go into this environment. The configuration file has no way to affect anything else, even by mistake. Even malicious code cannot do much damage. It can do a denial of service (DoS) attack, by wasting CPU time and memory, but nothing else.
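The same idea can be tried without a file on disk by giving load a string instead. In this sketch, the helper runconfig, the chunk name "config", and the field area are assumptions made only for illustration:

```lua
-- Run configuration code in a fresh, empty environment and return it.
-- 'runconfig' is a hypothetical helper, not part of Lua.
function runconfig (source)
  local env = {}
  local chunk = assert(load(source, "config", "t", env))
  chunk()
  return env
end

local env = runconfig([[
  width = 200
  height = 300
  area = width * height
]])

print(env.width, env.height, env.area)   --> 200  300  60000
print(width)                             --> nil (no global was created)

-- The environment is empty, so the chunk cannot even reach 'print'.
print(pcall(runconfig, "print('hello')"))  -- false, followed by the error message
```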
Sometimes, we may want to run a chunk several times, each time with a different environment table. In that case, the extra argument to load is not useful. Instead, we have two other options.
The first option is to use the function debug.setupvalue, from the debug library. As its name implies, setupvalue allows us to change any upvalue of a given function. The next fragment illustrates its use:
f = load("b = 10; return a")
env = {a = 20}
debug.setupvalue(f, 1, env)
print(f()) --> 20
print(env.b) --> 10
The first argument in the call to setupvalue is the function, the second is the upvalue index, and the third is the new value for the upvalue. For this kind of use, the second argument is always one: when a function represents a chunk, Lua assures that it has only one upvalue and that this upvalue is _ENV.
A small drawback of this option is its dependence on the debug library. This library breaks some usual assumptions about programs. For instance, debug.setupvalue breaks Lua’s visibility rules, which ensure that we cannot access a local variable from outside its lexical scope.
Another option to run a chunk with several different environments is to twist the chunk a little when loading it. Imagine that we add the following line just before the chunk:
_ENV = ...;
Remember that Lua compiles any chunk as a variadic function. So, that extra line of code will assign to the _ENV variable the first argument passed to the chunk, thereby setting that argument as the environment. The following code snippet illustrates the idea, using the function loadwithprefix that you implemented in Exercise 16.1:
prefix = "_ENV = ...;"
f = loadwithprefix(prefix, io.lines(filename, "*L"))
...
env1 = {}
f(env1)
env2 = {}
f(env2)
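When the chunk is already available as a string, plain concatenation can stand in for loadwithprefix. The sketch below takes that shortcut; the helper name loadwithenvprefix and the variables total and step are only illustrations:

```lua
-- Compile 'code' so that its first argument becomes its environment.
-- 'loadwithenvprefix' is a hypothetical helper built on string concatenation.
function loadwithenvprefix (code)
  return load("_ENV = ...; " .. code)
end

local f = assert(loadwithenvprefix("total = (total or 0) + step; return total"))

local env1 = {step = 1}
local env2 = {step = 10}
print(f(env1))                  --> 1
print(f(env1))                  --> 2  (env1 keeps its own 'total')
print(f(env2))                  --> 10
print(env1.total, env2.total)   --> 2  10
```

The chunk is compiled only once; each call simply starts by pointing _ENV at the table it receives.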
Exercise 22.1: The function getfield that we defined in the beginning of this chapter is too forgiving, as it accepts “fields” like math?sin or string!!!gsub. Rewrite it so that it accepts only single dots as name separators.
Exercise 22.2: Explain in detail what happens in the following program and what it will print.
local foo
do
  local _ENV = _ENV
  function foo () print(X) end
end
X = 13
_ENV = nil
foo()
X = 0
Exercise 22.3: Explain in detail what happens in the following program and what it will print.
local print = print
function foo (_ENV, a)
  print(a + b)
end
foo({b = 14}, 12)
foo({b = 10}, 1)