Both the official API and the auxiliary library provide several mechanisms to help us write C functions. In this chapter, we cover the mechanisms for array manipulation, string manipulation, and storing Lua values in C.
An “array”, in Lua, is just a table used in a specific way.
We can manipulate arrays using the same generic functions we use to manipulate tables, namely lua_settable and lua_gettable.
However, the API provides special functions to access and update tables with integer keys:
int lua_geti (lua_State *L, int index, lua_Integer key);
void lua_seti (lua_State *L, int index, lua_Integer key);
Lua versions prior to 5.3 offered only raw versions of these functions, lua_rawgeti and lua_rawseti.
They are similar to lua_geti and lua_seti, but do raw accesses (that is, without invoking metamethods).
When the difference is unimportant (e.g., the table has no metamethods), the raw versions can be slightly faster.
The description of lua_geti
and lua_seti
is a little confusing,
as it involves two indices:
index
refers to where the table is on the stack;
key
refers to where the element is in the table.
The call lua_geti(L, t, key)
is equivalent to the following
sequence when t
is positive
(otherwise, we must compensate for the new item on the stack):
lua_pushnumber(L, key);
lua_gettable(L, t);
The call lua_seti(L, t, key)
(again for t
positive)
is equivalent to this sequence:
lua_pushnumber(L, key);
lua_insert(L, -2);  /* put 'key' below previous value */
lua_settable(L, t);
As a concrete example of the use of these functions,
Figure 30.1, “The function map
in C” implements the function map:
it applies a given function to all elements of an array,
replacing each element by the result of the call.
Figure 30.1. The function map
in C
int l_map (lua_State *L) {
  int i, n;

  /* 1st argument must be a table (t) */
  luaL_checktype(L, 1, LUA_TTABLE);

  /* 2nd argument must be a function (f) */
  luaL_checktype(L, 2, LUA_TFUNCTION);

  n = luaL_len(L, 1);  /* get size of table */

  for (i = 1; i <= n; i++) {
    lua_pushvalue(L, 2);   /* push f */
    lua_geti(L, 1, i);     /* push t[i] */
    lua_call(L, 1, 1);     /* call f(t[i]) */
    lua_seti(L, 1, i);     /* t[i] = result */
  }

  return 0;  /* no results */
}
This example also introduces three new functions: luaL_checktype, luaL_len, and lua_call.
The function luaL_checktype
(from lauxlib.h
)
ensures that a given argument has a given type;
otherwise, it raises an error.
The primitive lua_len
(not used in the example)
is equivalent to the length operator.
Because of metamethods,
this operator may result in any kind of object,
not only numbers;
therefore, lua_len
returns its result on the stack.
The function luaL_len
(the one used in the example, from the auxiliary library)
returns the length as an integer,
raising an error if the coercion is not possible.
The function lua_call
does an unprotected call.
It is similar to lua_pcall
,
but it propagates errors,
instead of returning an error code.
When we are writing the main code in an application,
we should not use lua_call
,
because we want to catch any errors.
When we are writing functions, however,
it is usually a good idea to use lua_call
;
if there is an error,
just leave it to someone who cares about it.
When a C function receives a string argument from Lua, there are only two rules that it must observe: not to pop the string from the stack while using it and never to modify the string.
Things get more demanding when a C function needs to create a string to return to Lua. Now, it is up to the C code to take care of buffer allocation/deallocation, buffer overflows, and other tasks that are difficult in C. So, the Lua API provides some functions to help with these tasks.
The standard API provides support for two
of the most basic string operations:
substring extraction and string concatenation.
To extract a substring,
remember that the basic operation lua_pushlstring
gets the string length as an extra argument.
Therefore,
if we want to pass to Lua a substring
of a string s
ranging from position i
to j
(inclusive),
all we have to do is this:
lua_pushlstring(L, s + i, j - i + 1);
As an example, suppose we want a function
that splits a string according
to a given separator (a single character)
and returns a table with the substrings.
For instance, the call split("hi:ho:there", ":")
should return the table {"hi", "ho", "there"}
.
Figure 30.2, “Splitting a string” presents a simple implementation
for this function.
Figure 30.2. Splitting a string
static int l_split (lua_State *L) {
  const char *s = luaL_checkstring(L, 1);    /* subject */
  const char *sep = luaL_checkstring(L, 2);  /* separator */
  const char *e;
  int i = 1;

  lua_newtable(L);  /* result table */

  /* repeat for each separator */
  while ((e = strchr(s, *sep)) != NULL) {
    lua_pushlstring(L, s, e - s);  /* push substring */
    lua_rawseti(L, -2, i++);       /* insert it in table */
    s = e + 1;                     /* skip separator */
  }

  /* insert last substring */
  lua_pushstring(L, s);
  lua_rawseti(L, -2, i);

  return 1;  /* return the table */
}
It uses no buffers and can handle arbitrarily long strings: Lua takes care of all the memory allocation. (As we created the table, we know it has no metatable; so, we can manipulate it with the raw operations.)
To concatenate strings,
Lua provides a specific function,
called lua_concat
.
It is equivalent to the concatenation operator (..
) in Lua:
it converts numbers to strings and triggers metamethods
when necessary.
Moreover, it can concatenate more than two strings at once.
The call lua_concat(L, n)
will concatenate (and pop) the
top-most n
values on the stack and push the result.
Another helpful function is lua_pushfstring
:
const char *lua_pushfstring (lua_State *L, const char *fmt, ...);
It is somewhat similar to the C function sprintf
,
in that it creates a string according to a format string
and some extra arguments.
Unlike sprintf
, however, we do not need to provide a buffer.
Lua dynamically creates the string for us,
as large as it needs to be.
The function pushes the resulting string on the stack
and returns a pointer to it.
This function accepts the following directives:
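| %s | inserts a zero-terminated string |
| %d | inserts an int |
| %f | inserts a Lua float |
| %p | inserts a pointer |
| %I | inserts a Lua integer |
| %c | inserts an int as a one-byte character |
| %U | inserts a long int as a UTF-8 byte sequence |
| %% | inserts a percent sign |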
It accepts no modifiers, such as width or precision.[30]
Both lua_concat
and lua_pushfstring
are useful when we want to concatenate only a few strings.
However,
if we need to concatenate many strings (or characters) together,
a one-by-one approach can be quite inefficient,
as we saw in the section called “String Buffers”.
Instead,
we can use the buffer facility provided by the auxiliary library.
In its simpler usage,
the buffer facility works with two functions:
one gives us a buffer of any size where we can
compose our string;
the other converts the contents of the buffer into a Lua string.[31]
Figure 30.3, “The function string.upper
” illustrates those functions
with the implementation of string.upper
,
right from the source file lstrlib.c
.
The first step for using a buffer from the auxiliary library is to declare a variable with type luaL_Buffer.
The next step is to call luaL_buffinitsize to get a pointer for a buffer with the given size;
we can then use this buffer freely to create our string.
The last step is to call luaL_pushresultsize to convert the buffer contents into a new Lua string and push that string onto the stack.
The size in this second call is the final size of the string.
Often, as in our example,
this size is equal to the size of the buffer,
but it can be smaller.
If we do not know the exact size of the resulting string,
but have an upper bound,
we can conservatively allocate a larger size.
Note that luaL_pushresultsize
does not get a Lua state as its first argument.
After the initialization,
a buffer keeps a reference to the state,
so we do not need to pass it
when calling other functions that manipulate buffers.
We can also use the auxlib buffers
by adding content to them piecemeal,
without
knowing an upper bound on the size of the result.
The auxiliary library offers several functions
to add things to a buffer:
luaL_addvalue
adds a Lua string that is on the top of the stack;
luaL_addlstring
adds strings with an explicit length;
luaL_addstring
adds zero-terminated strings;
and luaL_addchar
adds single characters.
These functions have the following prototypes:
void luaL_buffinit (lua_State *L, luaL_Buffer *B);
void luaL_addvalue (luaL_Buffer *B);
void luaL_addlstring (luaL_Buffer *B, const char *s, size_t l);
void luaL_addstring (luaL_Buffer *B, const char *s);
void luaL_addchar (luaL_Buffer *B, char c);
void luaL_pushresult (luaL_Buffer *B);
Figure 30.4, “A simplified implementation for table.concat
” illustrates the use of these functions
with a simplified implementation
of the function table.concat
.
Figure 30.4. A simplified implementation for table.concat
static int tconcat (lua_State *L) {
  luaL_Buffer b;
  int i, n;
  luaL_checktype(L, 1, LUA_TTABLE);
  n = luaL_len(L, 1);
  luaL_buffinit(L, &b);
  for (i = 1; i <= n; i++) {
    lua_geti(L, 1, i);   /* get string from table */
    luaL_addvalue(&b);   /* add it to the buffer */
  }
  luaL_pushresult(&b);
  return 1;
}
In that function,
we first call luaL_buffinit
to
initialize the buffer.
We then add elements to the buffer one by one,
in this example using luaL_addvalue
.
Finally, luaL_pushresult
flushes the buffer and leaves the
final string on the top of the stack.
When we use the auxlib buffer,
we have to worry about one detail.
After we initialize a buffer,
it may keep some internal data in the Lua stack.
Therefore, we cannot assume that the stack top will remain
where it was before we started using the buffer.
Moreover, although we can use the stack for other tasks while
using a buffer,
the push/pop count for these uses must be balanced
every time we access the buffer.
The only exception to this rule is luaL_addvalue
,
which assumes that the string to be added to the buffer
is on the top of the stack.
Frequently, C functions need to keep some non-local data,
that is, data that outlive their invocation.
In C, we typically use global (extern
)
or static variables for this need.
When we are programming library functions for Lua, however,
neither works well.
First, we cannot store a generic Lua value in a C variable.
Second, a library that uses such variables will not work
with multiple Lua states.
A better approach is to get some help from Lua. A Lua function has two places to store non-local data: global variables and non-local variables. The C API offers two similar places to store non-local data: the registry and upvalues.
The registry is a global table that can be accessed only by C code.[32] Typically, we use it to store data to be shared among several modules.
The registry is always located at the pseudo-index
LUA_REGISTRYINDEX
.
A pseudo-index is like an index into the stack,
except that its associated value is not on the stack.
Most functions in the Lua API that accept indices as arguments
also accept pseudo-indices
—the exceptions
being those functions that manipulate the stack itself,
such as lua_remove
and lua_insert
.
For instance,
to get a value stored with key "Key"
in the registry,
we can use the following call:
lua_getfield(L, LUA_REGISTRYINDEX, "Key");
The registry is a regular Lua table.
As such, we can index it with any non-nil Lua value.
However, because all C modules share the same registry,
we must choose with care what values we use as keys,
to avoid collisions.
String keys are particularly useful when we want to allow
other independent libraries to access our data,
because all they need to know is the key name.
For those keys, there is no bulletproof method of choosing names,
but there are some good practices,
such as avoiding common names
and prefixing our names with the library name or something like it.
(Prefixes like lua
or lualib
are not good choices.)
We should never use our own numbers as keys in the registry,
because Lua reserves numeric keys for its reference system.
This system comprises a pair of functions in the auxiliary
library that allow us to store values in a table without
worrying about how to create unique keys.
The function luaL_ref
creates new references:
int ref = luaL_ref(L, LUA_REGISTRYINDEX);
The previous call pops a value from the stack, stores it into the registry with a fresh integer key, and returns this key. We call this key a reference.
As the name implies, we use references mainly when we need to store a reference to a Lua value inside a C structure. As we have seen, we should never store pointers to Lua strings outside the C function that retrieved them. Moreover, Lua does not even offer pointers to other objects, such as tables or functions. So, we cannot refer to Lua objects through pointers. Instead, when we need such pointers, we create a reference and store it in C.
To push the value associated with
a reference ref
onto the stack,
we simply write this:
lua_rawgeti(L, LUA_REGISTRYINDEX, ref);
Finally, to release both the value and the reference,
we call luaL_unref
:
luaL_unref(L, LUA_REGISTRYINDEX, ref);
After this call, a new call to luaL_ref
may return
this reference again.
The reference system treats nil as a special case.
Whenever we call luaL_ref
for a nil value,
it does not create a new reference,
but instead returns the constant reference LUA_REFNIL
.
The following call has no effect:
luaL_unref(L, LUA_REGISTRYINDEX, LUA_REFNIL);
The next one pushes a nil, as expected:
lua_rawgeti(L, LUA_REGISTRYINDEX, LUA_REFNIL);
The reference system also defines the constant LUA_NOREF
,
which is an integer different from any valid reference.
It is useful to signal that a value treated as a reference is invalid.
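When we create a Lua state, the registry comes with two predefined references: LUA_RIDX_MAINTHREAD keeps the Lua state itself, which is also its main thread; LUA_RIDX_GLOBALS keeps the global environment.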
Another safe way to create unique keys in the registry is to use as key the address of a static variable in our code:
the C link editor ensures that this key is unique across all loaded libraries.
To use this option,
we need the function lua_pushlightuserdata
,
which pushes on the stack a value representing a C pointer.
The following code shows how to store and retrieve a string from
the registry using this method:
/* variable with a unique address */
static char Key = 'k';

/* store a string */
lua_pushlightuserdata(L, (void *)&Key);  /* push address */
lua_pushstring(L, myStr);                /* push value */
lua_settable(L, LUA_REGISTRYINDEX);      /* registry[&Key] = myStr */

/* retrieve a string */
lua_pushlightuserdata(L, (void *)&Key);  /* push address */
lua_gettable(L, LUA_REGISTRYINDEX);      /* retrieve value */
myStr = lua_tostring(L, -1);             /* convert to string */
We will discuss light userdata in more detail in the section called “Light Userdata”.
To simplify the use of variable addresses as unique keys,
Lua 5.2 introduced two new functions:
lua_rawgetp
and lua_rawsetp
.
They are similar to lua_rawgeti
and lua_rawseti
,
but they use C pointers (translated to light userdata) as keys.
With them,
we can write the previous code like this:
static char Key = 'k';

/* store a string */
lua_pushstring(L, myStr);
lua_rawsetp(L, LUA_REGISTRYINDEX, (void *)&Key);

/* retrieve a string */
lua_rawgetp(L, LUA_REGISTRYINDEX, (void *)&Key);
myStr = lua_tostring(L, -1);
Both functions use raw accesses. As the registry does not have a metatable, a raw access has the same behavior as a regular access, and it is slightly more efficient.
While the registry offers global variables, the upvalue mechanism implements an equivalent of C static variables that are visible only inside a particular function. Every time we create a new C function in Lua, we can associate with it any number of upvalues, each one holding a single Lua value. Later, when we call the function, it has free access to any of its upvalues, using pseudo-indices.
We call this association of a C function with its upvalues a closure. A C closure is a C approximation to a Lua closure. In particular, we can create different closures using the same function code, but with different upvalues.
To see a simple example,
let us create a function newCounter
in C.
(We defined a similar function in Lua
in Chapter 9, Closures.)
This function is a factory:
it returns a new counter function each time it is called,
as in this example:
c1 = newCounter()
print(c1(), c1(), c1())    --> 1    2    3
c2 = newCounter()
print(c2(), c2(), c1())    --> 1    2    4
Although all counters share the same C code, each one keeps its own independent counter. The factory function is like this:
static int counter (lua_State *L);  /* forward declaration */

int newCounter (lua_State *L) {
  lua_pushinteger(L, 0);
  lua_pushcclosure(L, &counter, 1);
  return 1;
}
The key function here is lua_pushcclosure
,
which creates a new closure.
Its second argument is the base function
(counter
, in the example)
and the third is the number of upvalues (1, in the example).
Before creating a new closure,
we must push on the stack the initial values for its upvalues.
In our example,
we push zero as the initial value for the
single upvalue.
As expected,
lua_pushcclosure
leaves the new closure on the stack,
so the closure is ready to be returned
as the result of newCounter
.
Now, let us see the definition of counter
:
static int counter (lua_State *L) {
  int val = lua_tointeger(L, lua_upvalueindex(1));
  lua_pushinteger(L, ++val);             /* new value */
  lua_copy(L, -1, lua_upvalueindex(1));  /* update upvalue */
  return 1;                              /* return new value */
}
Here, the key element is the macro lua_upvalueindex
,
which produces the pseudo-index of an upvalue.
In particular,
the expression lua_upvalueindex(1)
gives the pseudo-index of
the first upvalue of the running function.
Again, this pseudo-index is like any stack index,
except that it does not live on the stack.
So, the call to lua_tointeger
retrieves the current value of the first (and only)
upvalue as an integer.
Then, the function counter
pushes the new value ++val
,
copies it as the new upvalue’s value,
and returns it.
As a more advanced example, we will implement tuples using upvalues. A tuple is a kind of constant structure with anonymous fields; we can retrieve a specific field with a numerical index, or we can retrieve all fields at once. In our implementation, we represent tuples as functions that store their values in their upvalues. When called with a numerical argument, the function returns that specific field. When called without arguments, it returns all its fields. The following code illustrates the use of tuples:
x = tuple.new(10, "hi", {}, 3)
print(x(1))    --> 10
print(x(2))    --> hi
print(x())     --> 10  hi  table: 0x8087878  3
In C, we will represent all tuples by the same function t_tuple
,
presented in Figure 30.5, “An implementation of tuples”.
Figure 30.5. An implementation of tuples
#include "lauxlib.h"

int t_tuple (lua_State *L) {
  lua_Integer op = luaL_optinteger(L, 1, 0);
  if (op == 0) {  /* no arguments? */
    int i;
    /* push each valid upvalue onto the stack */
    for (i = 1; !lua_isnone(L, lua_upvalueindex(i)); i++)
      lua_pushvalue(L, lua_upvalueindex(i));
    return i - 1;  /* number of values */
  }
  else {  /* get field 'op' */
    luaL_argcheck(L, 0 < op && op <= 256, 1,
                     "index out of range");
    if (lua_isnone(L, lua_upvalueindex(op)))
      return 0;  /* no such field */
    lua_pushvalue(L, lua_upvalueindex(op));
    return 1;
  }
}

int t_new (lua_State *L) {
  int top = lua_gettop(L);
  luaL_argcheck(L, top < 256, top, "too many fields");
  lua_pushcclosure(L, t_tuple, top);
  return 1;
}

static const struct luaL_Reg tuplelib [] = {
  {"new", t_new},
  {NULL, NULL}
};

int luaopen_tuple (lua_State *L) {
  luaL_newlib(L, tuplelib);
  return 1;
}
Because we can call a tuple with or without a numeric argument,
t_tuple
uses luaL_optinteger
to get its optional argument.
This function is similar to luaL_checkinteger
,
but it does not complain if the argument is absent;
instead, it returns a given default value
(0, in the example).
The maximum number of upvalues to a C function is 255,
and the maximum index we can use with lua_upvalueindex
is 256.
So, we use luaL_argcheck
to ensure these limits.
When we index a non-existent upvalue,
the result is a pseudo-value whose type is LUA_TNONE
.
(When we access a stack index above the current top,
we also get a pseudo-value with this type LUA_TNONE
.)
Our function t_tuple
uses lua_isnone
to test whether it has a given upvalue.
However, we should never use lua_upvalueindex
with
a negative index or with an index greater than 256
(which is one plus the maximum number of upvalues for a C function),
so we must check for this condition
when the user provides the index.
The function luaL_argcheck
checks a given condition,
raising an error with a nice message if the condition fails:
> t = tuple.new(2, 4, 5) > t(300) --> stdin:1: bad argument #1 to 't' (index out of range)
The third argument to luaL_argcheck
provides the argument number for the error message
(1, in the example),
and the fourth argument provides a complement to the message
("index out of range"
).
The function to create tuples, t_new (also in Figure 30.5, “An implementation of tuples”), is trivial:
because its arguments are already on the stack, it first checks that the number of fields respects the limit for upvalues in a closure
and then calls lua_pushcclosure to create a closure of t_tuple with all its arguments as upvalues.
Finally, the array tuplelib
and the function luaopen_tuple
(also in Figure 30.5, “An implementation of tuples”) are the standard code
to create a library tuple
with that single function new
.
Often, we need to share some values or variables among all functions in a library. Although we can use the registry for that task, we can also use upvalues.
Unlike Lua closures, C closures cannot share upvalues. Each closure has its own independent upvalues. However, we can set the upvalues of different functions to refer to a common table, so that this table becomes a common environment where the functions can share data.
Lua offers a function that eases the task of
sharing an upvalue among all functions of a library.
We have been opening C libraries
with luaL_newlib
.
Lua implements this function as the following macro:
#define luaL_newlib(L,lib) \
  (luaL_newlibtable(L,lib), luaL_setfuncs(L,lib,0))
The macro luaL_newlibtable
just creates a new table
for the library.
(This table has a preallocated size equal to
the number of functions in the given library.)
The function luaL_setfuncs
then adds the functions in the
list lib
to that new table,
which is on the top of the stack.
The third parameter to luaL_setfuncs
is what we are interested in here.
It gives the number of shared upvalues the new functions
in the library will have.
The initial values for these upvalues should be on the stack,
as happens with lua_pushcclosure
.
Therefore,
to create a library where all functions share a common table
as their single upvalue,
we can use the following code:
/* create library table ('lib' is its list of functions) */
luaL_newlibtable(L, lib);
/* create shared upvalue */
lua_newtable(L);
/* add functions in list 'lib' to the new library,
   sharing previous table as upvalue */
luaL_setfuncs(L, lib, 1);
The last call also removes the shared table from the stack, leaving there only the new library.
Exercise 30.1: Implement a filter function in C. It should receive a list and a predicate and return a new list with all elements from the given list that satisfy the predicate:
t = filter({1, 3, 20, -4, 5}, function (x) return x < 5 end)
-- t = {1, 3, -4}
(A predicate is just a function that tests some condition, returning a Boolean.)
Exercise 30.2:
Modify the function l_split
(from Figure 30.2, “Splitting a string”)
so that it can work with strings containing zeros.
(Among other changes,
it should use memchr
instead of strchr
.)
Exercise 30.3:
Reimplement the function transliterate
(Exercise 10.3) in C.
Exercise 30.4:
Implement a library with a
modification of transliterate
so that the transliteration table is not given as an argument,
but instead is kept by the library.
Your library should offer the following functions:
lib.settrans (table)    -- set the transliteration table
lib.gettrans ()         -- get the transliteration table
lib.transliterate(s)    -- transliterate 's' according to the current table
Use the registry to keep the transliteration table.
Exercise 30.5: Repeat the previous exercise using an upvalue to keep the transliteration table.
Exercise 30.6:
Do you think it is a good design to keep
the transliteration table as part of the state
of the library,
instead of being a parameter to transliterate
?