In the previous chapter, we saw how to extend Lua with new functions written in C. Now, we will see how to extend Lua with new types written in C. We will start with a small example; through the chapter, we will extend it with metamethods and other goodies.
Our running example in this chapter will be a quite simple type: Boolean arrays. The main motivation for this example is that it does not involve complex algorithms, so we can concentrate on API issues. Nevertheless, the example is useful by itself. Of course, we can use tables to implement arrays of Booleans in Lua. But a C implementation, where we store each entry in one single bit, uses less than 3% of the memory used by a table.
Our implementation will need the following definitions:
#include <limits.h> #define BITS_PER_WORD (CHAR_BIT * sizeof(unsigned int)) #define I_WORD(i) ((unsigned int)(i) / BITS_PER_WORD) #define I_BIT(i) (1 << ((unsigned int)(i) % BITS_PER_WORD))
BITS_PER_WORD
is the number of bits in
an unsigned integer.
The macro I_WORD
computes the word that stores
the bit corresponding to a given index,
and I_BIT
computes a mask to access
the correct bit inside this word.
We will represent our arrays with the following structure:
typedef struct BitArray { int size; unsigned int values[1]; /* variable part */ } BitArray;
We declare the array values
with size 1 only as a placeholder,
because C 89 does not allow an array with size 0;
we will set the actual size when we allocate the array.
The next expression computes the total size
for an array with n
elements:
sizeof(BitArray) + I_WORD(n - 1) * sizeof(unsigned int)
(We subtract one from n
because the original structure already
includes space for one element.)
In this first version, we will use explicit calls to set and get values, as in the next example:
a = array.new(1000) for i = 1, 1000 do array.set(a, i, i % 2 == 0) -- a[i] = (i % 2 == 0) end print(array.get(a, 10)) --> true print(array.get(a, 11)) --> false print(array.size(a)) --> 1000
Later we will see how to support both an object-oriented style,
like a:get(i)
,
and a conventional syntax, like a[i]
.
For all versions, the underlying functions are the same,
defined in Figure 31.1, “Manipulating a Boolean array”.
Figure 31.1. Manipulating a Boolean array
static int newarray (lua_State *L) { int i; size_t nbytes; BitArray *a; int n = (int)luaL_checkinteger(L, 1); /* number of bits */ luaL_argcheck(L, n >= 1, 1, "invalid size"); nbytes = sizeof(BitArray) + I_WORD(n - 1)*sizeof(unsigned int); a = (BitArray *)lua_newuserdata(L, nbytes); a->size = n; for (i = 0; i <= I_WORD(n - 1); i++) a->values[i] = 0; /* initialize array */ return 1; /* new userdata is already on the stack */ } static int setarray (lua_State *L) { BitArray *a = (BitArray *)lua_touserdata(L, 1); int index = (int)luaL_checkinteger(L, 2) - 1; luaL_argcheck(L, a != NULL, 1, "'array' expected"); luaL_argcheck(L, 0 <= index && index < a->size, 2, "index out of range"); luaL_checkany(L, 3); if (lua_toboolean(L, 3)) a->values[I_WORD(index)] |= I_BIT(index); /* set bit */ else a->values[I_WORD(index)] &= ~I_BIT(index); /* reset bit */ return 0; } static int getarray (lua_State *L) { BitArray *a = (BitArray *)lua_touserdata(L, 1); int index = (int)luaL_checkinteger(L, 2) - 1; luaL_argcheck(L, a != NULL, 1, "'array' expected"); luaL_argcheck(L, 0 <= index && index < a->size, 2, "index out of range"); lua_pushboolean(L, a->values[I_WORD(index)] & I_BIT(index)); return 1; }
Let us see them, bit by bit.
Our first concern is how to represent a C structure in Lua. Lua provides a basic type specifically for this task, called userdata. A userdata offers a raw memory area, with no predefined operations in Lua, which we can use to store anything.
The function lua_newuserdata
allocates a block of memory
with a given size,
pushes the corresponding userdata on the stack,
and returns the block address:
void *lua_newuserdata (lua_State *L, size_t size);
If for some reason we need to allocate memory by other means, it is very easy to create a userdata with the size of a pointer and to store there a pointer to the real memory block. We will see examples of this technique in Chapter 32, Managing Resources.
Our first function in Figure 31.1, “Manipulating a Boolean array”,
newarray
, uses lua_newuserdata
to create new arrays.
Its code is straightforward.
It checks its sole parameter (the array size, in bits),
computes the array size in bytes,
creates a userdata with the appropriate size,
initializes its fields,
and returns the userdata to Lua.
The next function is setarray
,
which receives three arguments:
the array, the index, and the new value.
It assumes that indices start at one, as usual in Lua.
Because Lua accepts any value for a Boolean,
we use luaL_checkany
for the third parameter:
it ensures only that there is a value (any value)
for this parameter.
If we call setarray
with bad arguments,
we get explanatory error messages,
as in the following examples:
array.set(0, 11, 0) --> stdin:1: bad argument #1 to 'set' ('array' expected) array.set(a, 1) --> stdin:1: bad argument #3 to 'set' (value expected)
The last function in Figure 31.1, “Manipulating a Boolean array” is getarray
,
the function to retrieve an entry.
It is similar to setarray
.
We will also define a function to retrieve the size of an array and some extra code to initialize our library; see Figure 31.2, “Extra code for the Boolean array library”.
Figure 31.2. Extra code for the Boolean array library
static int getsize (lua_State *L) { BitArray *a = (BitArray *)lua_touserdata(L, 1); luaL_argcheck(L, a != NULL, 1, "'array' expected"); lua_pushinteger(L, a->size); return 1; } static const struct luaL_Reg arraylib [] = { {"new", newarray}, {"set", setarray}, {"get", getarray}, {"size", getsize}, {NULL, NULL} }; int luaopen_array (lua_State *L) { luaL_newlib(L, arraylib); return 1; }
Again, we use luaL_newlib
, from the auxiliary library.
It creates a table
and fills it with the pairs name–function specified by the
array arraylib
.
Our current implementation has a major vulnerability.
Suppose the user writes something like
array.set(io.stdin, 1, false)
.
The value of io.stdin
is a userdata
with a pointer to a stream (FILE *
).
Because it is a userdata,
array.set
will gladly accept it as a valid argument;
the probable result will be a memory corruption
(with luck we will get an index-out-of-range error instead).
Such behavior is unacceptable for any Lua library.
No matter how we use a library,
it should neither corrupt C data
nor cause the Lua system to crash.
The usual method to distinguish one type of userdata from another is to create a unique metatable for that type. Every time we create a userdata, we mark it with the corresponding metatable; every time we get a userdata, we check whether it has the right metatable. Because Lua code cannot change the metatable of a userdata, it cannot deceive these checks.
We also need a place to store this new metatable,
so that we can access it to create new userdata and
to check whether a given userdata has the correct type.
As we saw earlier,
there are two options for storing the metatable:
in the registry
or as an upvalue for the functions in the library.
It is customary, in Lua,
to register any new C type into the registry,
using a type name
as the index and the metatable as the value.
As with any other registry index,
we must choose a type name with care, to avoid clashes.
Our example
will use the name "LuaBook.array"
for its new type.
As usual, the auxiliary library offers some functions to help us here. The new auxiliary functions we will use are these:
int luaL_newmetatable (lua_State *L, const char *tname); void luaL_getmetatable (lua_State *L, const char *tname); void *luaL_checkudata (lua_State *L, int index, const char *tname);
The function luaL_newmetatable
creates a new table (to be used as a metatable),
leaves the new table on the top of the stack,
and maps the table to the given name in the registry.
The function luaL_getmetatable
retrieves
the metatable associated with tname
from the registry.
Finally, luaL_checkudata
checks whether the object at the
given stack position is a userdata with a metatable that matches
the given name.
It raises an error if the object is not a userdata
or if it does not have the correct metatable;
otherwise, it returns the userdata address.
Now we can start our modifications. The first step is to change the function that opens the library so that it creates the metatable for arrays:
int luaopen_array (lua_State *L) { luaL_newmetatable(L, "LuaBook.array"); luaL_newlib(L, arraylib); return 1; }
The next step is to change newarray
so that it sets
this metatable in all arrays that it creates:
static int newarray (lua_State *L) {
as before
luaL_getmetatable(L, "LuaBook.array");
lua_setmetatable(L, -2);
return 1; /* new userdata is already on the stack */
}
The function lua_setmetatable
pops a table from the stack
and sets it as the metatable of the object at the given index.
In our case, this object is the new userdata.
Finally, setarray
, getarray
,
and getsize
have to check
whether they have got a valid array as their first argument.
To simplify their tasks, we define the following macro:
#define checkarray(L) \ (BitArray *)luaL_checkudata(L, 1, "LuaBook.array")
Using this macro,
the new definition for getsize
is straightforward:
static int getsize (lua_State *L) { BitArray *a = checkarray(L); lua_pushinteger(L, a->size); return 1; }
Because setarray
and getarray
also share code
to read and check the index as their second argument,
we factor out their common parts in
a new auxiliary function (getparams
).
Figure 31.3. New versions for setarray
/getarray
static unsigned int *getparams (lua_State *L, unsigned int *mask) { BitArray *a = checkarray(L); int index = (int)luaL_checkinteger(L, 2) - 1; luaL_argcheck(L, 0 <= index && index < a->size, 2, "index out of range"); *mask = I_BIT(index); /* mask to access correct bit */ return &a->values[I_WORD(index)]; /* word address */ } static int setarray (lua_State *L) { unsigned int mask; unsigned int *entry = getparams(L, &mask); luaL_checkany(L, 3); if (lua_toboolean(L, 3)) *entry |= mask; else *entry &= ~mask; return 0; } static int getarray (lua_State *L) { unsigned int mask; unsigned int *entry = getparams(L, &mask); lua_pushboolean(L, *entry & mask); return 1; }
With this new function,
setarray
and getarray
are straightforward,
see Figure 31.3, “New versions for setarray
/getarray
”.
Now, if we call them with an invalid userdata,
we will get a proper error message:
a = array.get(io.stdin, 10) --> bad argument #1 to 'get' (LuaBook.array expected, got FILE*)
Our next step is to transform our new type into an object, so that we can operate on its instances using the usual object-oriented syntax, like this:
a = array.new(1000) print(a:size()) --> 1000 a:set(10, true) print(a:get(10)) --> true
Remember that a:size()
is equivalent to a.size(a)
.
Therefore, we have to arrange for the expression a.size
to return our function getsize
.
The key mechanism here is the __index
metamethod.
For tables, Lua calls this metamethod whenever it cannot find
a value for a given key.
For userdata, Lua calls it in every access,
because userdata have no keys at all.
Assume that we run the following code:
do local metaarray = getmetatable(array.new(1)) metaarray.__index = metaarray metaarray.set = array.set metaarray.get = array.get metaarray.size = array.size end
In the first line,
we create an array only to get its metatable,
which we assign to metaarray
.
(We cannot set the metatable of a userdata from Lua,
but we can get it.)
Then we set metaarray.__index
to metaarray
.
When we evaluate a.size
,
Lua cannot find the key "size"
in the object a
,
because the object is a userdata.
Therefore, Lua tries to get this value from
the field __index
of the metatable of a
,
which happens to be metaarray
itself.
But metaarray.size
is array.size
,
so a.size(a)
results in array.size(a)
,
as we wanted.
Of course, we can write the same thing in C.
We can do even better:
now that arrays are objects,
with their own operations,
we do not need to have these operations in the table
array
anymore.
The only function that our library still has to export is new
,
to create new arrays.
All other operations come only as methods.
The C code can register them directly as such.
The operations getsize
, getarray
, and setarray
do not change from our previous approach.
What will change is how we register them.
That is, we have to change the code that opens the library.
First, we need two separate function lists:
one for regular functions and one for methods.
static const struct luaL_Reg arraylib_f [] = { {"new", newarray}, {NULL, NULL} }; static const struct luaL_Reg arraylib_m [] = { {"set", setarray}, {"get", getarray}, {"size", getsize}, {NULL, NULL} };
The new version of the open function luaopen_array
has to create the metatable,
assign it to its own __index
field,
register all the methods there,
and create and fill the array
table:
int luaopen_array (lua_State *L) { luaL_newmetatable(L, "LuaBook.array"); /* create metatable */ lua_pushvalue(L, -1); /* duplicate the metatable */ lua_setfield(L, -2, "__index"); /* mt.__index = mt */ luaL_setfuncs(L, arraylib_m, 0); /* register metamethods */ luaL_newlib(L, arraylib_f); /* create lib table */ return 1; }
Here we use luaL_setfuncs
again,
to set the functions from the list arraylib_m
into the metatable,
which is on the top of the stack.
Then we call luaL_newlib
to create a new table and register the functions
from the list arraylib_f
there.
As a final touch,
we will add a __tostring
method to our new type,
so that print(a)
prints "array"
plus the size of
the array inside parentheses.
The function itself is here:
int array2string (lua_State *L) { BitArray *a = checkarray(L); lua_pushfstring(L, "array(%d)", a->size); return 1; }
The call to lua_pushfstring
formats the string
and leaves it on the top of the stack.
We also have to add array2string
to the list arraylib_m
,
to include it in the metatable of array objects:
static const struct luaL_Reg arraylib_m [] = {
{"__tostring", array2string},
other methods
};
A better alternative to the object-oriented notation
is to use a regular array notation to access our arrays.
Instead of writing a:get(i)
,
we could simply write a[i]
.
For our example, this is easy to do,
because our functions setarray
and getarray
already receive their arguments in the order that they
are given to the corresponding metamethods.
A quick solution is to define these
metamethods directly in Lua:
local metaarray = getmetatable(array.new(1)) metaarray.__index = array.get metaarray.__newindex = array.set metaarray.__len = array.size
(We must run this code on the original implementation for arrays, without the modifications for object-oriented access.) That is all we need to use the standard syntax:
a = array.new(1000) a[10] = true -- 'setarray' print(a[10]) -- 'getarray' --> true print(#a) -- 'getsize' --> 1000
If we prefer, we can register these metamethods in our C code. For this, we again modify our initialization function; see Figure 31.4, “New initialization code for the Bit Array library”.
Figure 31.4. New initialization code for the Bit Array library
static const struct luaL_Reg arraylib_f [] = { {"new", newarray}, {NULL, NULL} }; static const struct luaL_Reg arraylib_m [] = { {"__newindex", setarray}, {"__index", getarray}, {"__len", getsize}, {"__tostring", array2string}, {NULL, NULL} }; int luaopen_array (lua_State *L) { luaL_newmetatable(L, "LuaBook.array"); luaL_setfuncs(L, arraylib_m, 0); luaL_newlib(L, arraylib_f); return 1; }
In this new version,
again we have only one public function, new
.
All other functions are available only as metamethods
for specific operations.
The kind of userdata that we have been using until now is called full userdata. Lua offers another kind of userdata, called light userdata.
A light userdata is a value that represents a C pointer,
that is, a void *
value.
A light userdata is a value,
not an object;
we do not create them
(in the same way that we do not create numbers).
To put a light userdata onto the stack,
we call lua_pushlightuserdata
:
void lua_pushlightuserdata (lua_State *L, void *p);
Despite their common name, light userdata and full userdata are quite different things. Light userdata are not buffers, but bare pointers. They have no metatables. Like numbers, light userdata are not managed by the garbage collector.
Sometimes, people use light userdata as a
cheap alternative to full userdata.
This is not a typical use, however.
First,
light userdata do not have metatables,
so there is no way to know their types.
Second, despite the name, full userdata are inexpensive, too.
They add little overhead compared to a malloc
for the given memory size.
The real use of light userdata comes from equality. As a full userdata is an object, it is only equal to itself. A light userdata, on the other hand, represents a C pointer value. As such, it is equal to any userdata that represents the same pointer. Therefore, we can use light userdata to find C objects inside Lua.
We have already seen a typical use of light userdata,
as keys in the registry (the section called “The registry”).
There, the equality of light userdata was fundamental.
Every time we push the same address with lua_pushlightuserdata
,
we get the same Lua value and,
therefore, the same entry in the registry.
Another typical scenario in Lua is to have Lua objects acting as proxies to corresponding C objects. For instance, the I/O library uses Lua userdata to represent C streams inside Lua. When the action goes from Lua to C, the mapping from the Lua object to the C object is easy. Again using the example of the I/O library, each Lua stream keeps a pointer to its corresponding C stream. However, when the action goes from C to Lua, the mapping can be tricky. As an example, suppose we have some kind of callback in our I/O system (e.g., to tell that there is data to be read). The callback receives the C stream where is should operate. From that, how can we find its corresponding Lua object? Because the C stream is defined by the C standard library, not by us, we cannot store anything there.
Light userdata provide a nice solution for this mapping. We keep a table where the indices are light userdata with the stream addresses, and the values are the full userdata that represent the streams in Lua. In a callback, once we have a stream address, we use it —as a light userdata— as an index into that table to retrieve its corresponding Lua object. (The table should probably have weak values; otherwise, those full userdata would never be collected.)
Exercise 31.1:
Modify the implementation of setarray
so that it accepts only Boolean values.
Exercise 31.2: We can see a Boolean array as a set of integers (the indices with true values in the array). Add to the implementation of Boolean arrays functions to compute the union and intersection of two arrays. These functions should receive two arrays and return a new one, without modifying its parameters.
Exercise 31.3: Extend the previous exercise so that we can use addition to get the union of two arrays and multiplication for the intersection.
Exercise 31.4:
Modify the implementation of the __tostring
metamethod
so that it shows the full contents of the array
in an appropriate way.
Use the buffer facility (the section called “String Manipulation”)
to create the resulting string.
Exercise 31.5: Based on the example for Boolean arrays, implement a small C library for integer arrays.
Personal copy of Eric Taylor <jdslkgjf.iapgjflksfg@yandex.com>