17 Modules and Packages

Usually, Lua does not set policies. Instead, Lua provides mechanisms that are powerful enough for groups of developers to implement the policies that best suit them. However, this approach does not work well for modules. One of the main goals of a module system is to allow different groups to share code. The lack of a common policy impedes this sharing.

Starting in version 5.1, Lua has defined a set of policies for modules and packages (a package being a collection of modules). These policies do not demand any extra facility from the language; programmers can implement them using what we have seen so far. Programmers are free to use different policies. Of course, alternative implementations may lead to programs that cannot use foreign modules and modules that cannot be used by foreign programs.

From the point of view of the user, a module is some code (either in Lua or in C) that can be loaded through the function require and that creates and returns a table. Everything that the module exports, such as functions and constants, it defines inside this table, which works as a kind of namespace.

As an example, all standard libraries are modules. We can use the mathematical library like this:

      local m = require "math"
      print(m.sin(3.14))          --> 0.0015926529164868

However, the stand-alone interpreter preloads all standard libraries with code equivalent to this:

      math = require "math"
      string = require "string"
      ...

This preloading allows us to write the usual notation math.sin, without bothering to require the module math.

An obvious benefit of using tables to implement modules is that we can manipulate modules like any other table and use the whole power of Lua to create extra facilities. In most languages, modules are not first-class values (that is, they cannot be stored in variables, passed as arguments to functions, etc.); those languages need special mechanisms for each extra facility they want to offer for modules. In Lua, we get extra facilities for free.

For instance, there are several ways for a user to call a function from a module. The usual way is this:

      local mod = require "mod"
      mod.foo()

The user can set any local name for the module:

      local m = require "mod"
      m.foo()

She can also provide alternative names for individual functions:

      local m = require "mod"
      local f = m.foo
      f()

She can also import only a specific function:

      local f = require "mod".foo        -- (require("mod")).foo
      f()

The nice thing about these facilities is that they involve no special support from Lua. They use what the language already offers.

Despite its central role in the implementation of modules in Lua, require is a regular function, with no special privileges. To load a module, we simply call it with a single argument, the module name. Remember that, when the single argument to a function is a literal string, the parentheses are optional, and it is customary to omit them in regular uses of require. Nevertheless, the following uses are all correct, too:

      local m = require('math')
      
      local modname = 'math'
      local m = require(modname)

The function require tries to keep to a minimum its assumptions about what a module is. For it, a module is just any code that defines some values, such as functions or tables containing functions. Typically, that code returns a table comprising the module functions. However, because this action is done by the module code, not by require, some modules may choose to return other values or even to have side effects (e.g., by creating global variables).

The first step of require is to check in the table package.loaded whether the module is already loaded. If so, require returns its corresponding value. Therefore, once a module is loaded, other calls requiring the same module simply return the same value, without running any code again.
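We can observe this caching directly. The following is a minimal sketch, assuming a standard interpreter with the standard libraries already loaded:

```lua
-- Requiring an already loaded module returns the cached value.
local s1 = require "string"
local s2 = require "string"

print(s1 == s2)                        --> true
print(s1 == package.loaded["string"])  --> true
print(s1 == string)                    --> true (the preloaded global)
```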

If the module is not loaded yet, require searches for a Lua file with the module name. (This search is guided by the variable package.path, which we will discuss later.) If it finds such a file, it loads it with loadfile. The result is a function that we call a loader. (The loader is a function that, when called, loads the module.)

If require cannot find a Lua file with the module name, it searches for a C library with that name.[17] (In that case, the search is guided by the variable package.cpath.) If it finds a C library, it loads it with the low-level function package.loadlib, looking for a function called luaopen_modname. The loader in this case is the result of loadlib, which is the C function luaopen_modname represented as a Lua function.

No matter whether the module was found in a Lua file or a C library, require now has a loader for it. To finally load the module, require calls the loader with two arguments: the module name and the name of the file where it got the loader. (Most modules just ignore these arguments.) If the loader returns any value, require returns this value and stores it in the package.loaded table, to return the same value in future calls for this same module. If the loader returns no value, and the table entry package.loaded[modname] is still empty, require behaves as if the module returned true. Without this correction, a subsequent call to require would run the module again.

To force require into loading the same module twice, we can erase the library entry from package.loaded:

      package.loaded.modname = nil

The next time the module is required, require will do all its work again.

A common complaint against require is that it cannot pass arguments to the module being loaded. For instance, the mathematical module might have an option for choosing between degrees and radians:

      -- bad code
      local math = require("math", "degree")

The problem here is that one of the main goals of require is to avoid loading a module multiple times. Once a module is loaded, it will be reused by any part of the program that requires it again. There would be a conflict if the same module were required with different parameters. If you really want your module to have parameters, it is better to create an explicit function to set them, as here:

      local mod = require "mod"
      mod.init(0, 0)

If the initialization function returns the module itself, we can write that code like this:

      local mod = require "mod".init(0, 0)

In any case, remember that the module itself is loaded only once; it is up to it to handle conflicting initializations.
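As an illustration, the next sketch shows one possible way for a module to handle conflicting initializations. The module name mod, the fields x and y, and the policy of raising an error are all assumptions made for this example. The module is registered in package.preload (discussed later in this chapter) only so that the sketch runs as a single chunk; normally its body would live in a file mod.lua.

```lua
-- Hypothetical module "mod" with an explicit initialization function.
package.preload["mod"] = function ()
  local M = {}
  local initialized = false

  -- Sets the module parameters and returns the module itself.
  function M.init (x, y)
    if initialized and (M.x ~= x or M.y ~= y) then
      error("mod: conflicting initialization", 2)
    end
    M.x, M.y = x, y
    initialized = true
    return M
  end

  return M
end

local mod = require "mod".init(0, 0)

-- a second initialization with different values is rejected
local ok = pcall(mod.init, 1, 1)
print(ok)                        --> false
```

Other policies are possible, such as silently keeping the first configuration; the point is only that the decision belongs to the module, not to require.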

When searching for a Lua file, the path that guides require is a little different from typical paths. A typical path is a list of directories wherein to search for a given file. However, ISO C (the abstract platform where Lua runs) does not have the concept of directories. Therefore, the path used by require is a list of templates, each of them specifying an alternative way to transform a module name (the argument to require) into a file name. More specifically, each template in the path is a file name containing optional question marks. For each template, require substitutes the module name for each question mark and checks whether there is a file with the resulting name; if not, it goes to the next template. The templates in a path are separated by semicolons, a character seldom used for file names in most operating systems. For instance, consider the following path:

      ?;?.lua;c:\windows\?;/usr/local/lua/?/?.lua

With this path, the call require "sql" will try to open the following Lua files:

      sql
      sql.lua
      c:\windows\sql
      /usr/local/lua/sql/sql.lua

The function require assumes only the semicolon (as the component separator) and the question mark; everything else, including directory separators and file extensions, is defined by the path itself.
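The expansion of templates is easy to reproduce in Lua itself. The following sketch uses a hypothetical helper, candidates, which only builds the file names and does not check whether the files exist:

```lua
-- Expands every template in 'path' for the module 'name' and returns
-- the resulting file names, in order.
local function candidates (name, path)
  local names = {}
  for template in string.gmatch(path, "[^;]+") do
    -- a function as replacement avoids any special meaning of '%'
    names[#names + 1] =
      string.gsub(template, "%?", function () return name end)
  end
  return names
end

local path = [[?;?.lua;c:\windows\?;/usr/local/lua/?/?.lua]]
for _, fname in ipairs(candidates("sql", path)) do
  print(fname)
end
--> sql
--> sql.lua
--> c:\windows\sql
--> /usr/local/lua/sql/sql.lua
```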

The path that require uses to search for Lua files is always the current value of the variable package.path. When the module package is initialized, it sets this variable with the value of the environment variable LUA_PATH_5_3; if this environment variable is undefined, Lua tries the environment variable LUA_PATH. If both are undefined, Lua uses a default path defined at compile time.[18] When using the value of an environment variable, Lua substitutes the default path for any substring ";;". For instance, if we set LUA_PATH_5_3 to "mydir/?.lua;;", the final path will be the template "mydir/?.lua" followed by the default path.

The path used to search for a C library works in exactly the same way, but its value comes from the variable package.cpath, instead of package.path. Similarly, this variable gets its initial value from the environment variables LUA_CPATH_5_3 or LUA_CPATH. A typical value for this path in POSIX is like this:

      ./?.so;/usr/local/lib/lua/5.2/?.so

Note that the path defines the file extension. The previous example uses .so for all templates; in Windows, a typical path would be more like this one:

      .\?.dll;C:\Program Files\Lua502\dll\?.dll

The function package.searchpath encodes all those rules for searching libraries. It takes a module name and a path, and looks for a file following the rules described here. It returns either the name of the first file that exists or nil plus an error message describing all files it unsuccessfully tried to open, as in the next example:

      > path = ".\\?.dll;C:\\Program Files\\Lua502\\dll\\?.dll"
      > print(package.searchpath("X", path))
      nil
              no file '.\X.dll'
              no file 'C:\Program Files\Lua502\dll\X.dll'

As an interesting exercise, in Figure 17.1, “A homemade package.searchpath”, we implement a function similar to package.searchpath.

The first step is to substitute the directory separator, assumed to be a slash in this example, for any dots. (As we will see later, a dot has a special meaning in a module name.) Then the function loops over all components of the path, wherein each component is a maximum expansion of non-semicolon characters. For each component, the function substitutes the module name for the question marks to get the final file name, and then it checks whether there is such a file. If so, the function closes the file and returns its name. Otherwise, it stores the failed name for a possible error message. (Note the use of a string buffer to avoid creating useless long strings.) If no file is found, then it returns nil plus the final error message.
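The steps just described can be sketched as follows. This is a reconstruction written from that description, not necessarily the exact code of the figure; the function name search and the format of the error message are assumptions.

```lua
-- A sketch of a function similar to package.searchpath.
local function search (modname, path)
  -- replace dots by the directory separator (assumed to be a slash)
  modname = string.gsub(modname, "%.", "/")
  local msg = {}    -- buffer for the names that could not be opened
  for c in string.gmatch(path, "[^;]+") do
    -- substitute the module name for each question mark
    local fname = string.gsub(c, "%?", function () return modname end)
    local f = io.open(fname)
    if f then
      f:close()
      return fname
    else
      msg[#msg + 1] = string.format("\n\tno file '%s'", fname)
    end
  end
  return nil, table.concat(msg)   -- not found
end
```

The table msg plays the role of the string buffer: we concatenate the failed names only once, at the end, and only if the search fails.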

In reality, require is a little more complex than we have described. The search for a Lua file and the search for a C library are just two instances of a more general concept of searchers. A searcher is simply a function that takes the module name and returns either a loader for that module or nil if it cannot find one.

The array package.searchers lists the searchers that require uses. When looking for a module, require calls each searcher in the list passing the module name, until one of them finds a loader for the module. If the list ends without a positive response, require raises an error.
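The loop that require performs over the searchers can be approximated in a few lines. The helper find_loader and the module names below are hypothetical, and the sketch leaves out details of the real function, such as collecting the messages from each searcher:

```lua
-- 'package.searchers' is called 'package.loaders' in Lua 5.1.
local searchers = package.searchers or package.loaders

-- Mimics the search loop of require: returns the first loader found.
local function find_loader (name)
  for _, searcher in ipairs(searchers) do
    local loader, extra = searcher(name)
    -- a searcher that fails may return nil or a message string;
    -- only a function counts as a positive response
    if type(loader) == "function" then
      return loader, extra
    end
  end
  error("module '" .. name .. "' not found")
end

package.preload["found_demo"] = function () return {} end
print(type(find_loader("found_demo")))            --> function
print((pcall(find_loader, "no_such_module_xyz"))) --> false
```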

The use of a list to drive the search for a module allows great flexibility to require. For instance, if we want to store modules compressed in zip files, we only need to provide a proper searcher function for that and add it to the list. In its default configuration, the searcher for Lua files and the searcher for C libraries that we described earlier are respectively the second and the third elements in the list. Before them, there is the preload searcher.
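To make the idea of adding a searcher concrete, the next sketch serves modules from an in-memory table of source strings, as a simple stand-in for an archive such as a zip file. The table archive, the module greeter, and the function archive_searcher are all invented for this example; the sketch assumes Lua 5.2 or later, where load accepts strings.

```lua
-- Hypothetical in-memory "archive": module name -> Lua source code.
local archive = {
  greeter = [[
    local M = {}
    function M.greet (who) return "hello, " .. who end
    return M
  ]],
}

-- A searcher returns a loader when it knows the module, nil otherwise.
local function archive_searcher (name)
  local source = archive[name]
  if not source then return nil end
  -- compile the source; the resulting function is the loader
  return assert(load(source, "=archive:" .. name))
end

-- Append the new searcher, so that the standard ones are tried first.
local searchers = package.searchers
searchers[#searchers + 1] = archive_searcher

local greeter = require "greeter"
print(greeter.greet("Lua"))    --> hello, Lua
```

Because the new searcher goes at the end of the list, a regular file greeter.lua in the path would still take precedence.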

The preload searcher allows the definition of an arbitrary function to load a module. It uses a table, called package.preload, to map module names to loader functions. When searching for a module name, this searcher simply looks for the given name in the table. If it finds a function there, it returns this function as the module loader. Otherwise, it returns nil. This searcher provides a generic method to handle some non-conventional situations. For instance, a C library statically linked to Lua can register its luaopen_ function into the preload table, so that it will be called only when (and if) the user requires that module. In this way, the program does not waste resources opening the module if it is not used.
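A small experiment shows both the preload table and the caching done by require. The module name lazy_demo and the counter runs are used only for illustration:

```lua
local runs = 0   -- counts how many times the loader actually runs

package.preload["lazy_demo"] = function (name)
  runs = runs + 1
  return { name = name }
end

print(runs)                      --> 0   (nothing loaded yet)

local a = require "lazy_demo"
local b = require "lazy_demo"
print(runs)                      --> 1   (the loader ran only once)
print(a == b, a.name)            --> true    lazy_demo

-- Erasing the entry in package.loaded forces a new load.
package.loaded["lazy_demo"] = nil
local c = require "lazy_demo"
print(runs)                      --> 2
print(a == c)                    --> false   (a new table was created)
```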

The default content of package.searchers includes a fourth function that is relevant only for submodules. We will discuss it in the section called “Submodules and Packages”.

The simplest way to create a module in Lua is really simple: we create a table, put all functions we want to export inside it, and return this table. Figure 17.2, “A simple module for complex numbers” illustrates this approach.

Note how we define new and inv as private functions simply by declaring them local to the chunk.
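For reference, a module along these lines could look like the following sketch. It is a plausible reconstruction rather than the exact code of the figure. The body of the function is what would go into a file complex.lua; we register it in package.preload only so that the example runs as a single chunk.

```lua
package.preload["complex"] = function ()
  local M = {}        -- the module

  -- creates a new complex number
  local function new (r, i)
    return {r = r, i = i}
  end

  M.new = new         -- add 'new' to the module

  -- constant 'i'
  M.i = new(0, 1)

  function M.add (c1, c2)
    return new(c1.r + c2.r, c1.i + c2.i)
  end

  function M.sub (c1, c2)
    return new(c1.r - c2.r, c1.i - c2.i)
  end

  function M.mul (c1, c2)
    return new(c1.r*c2.r - c1.i*c2.i, c1.r*c2.i + c1.i*c2.r)
  end

  -- private: multiplicative inverse
  local function inv (c)
    local n = c.r^2 + c.i^2
    return new(c.r/n, -c.i/n)
  end

  function M.div (c1, c2)
    return M.mul(c1, inv(c2))
  end

  function M.tostring (c)
    return string.format("(%g,%g)", c.r, c.i)
  end

  return M
end

local cpx = require "complex"
print(cpx.tostring(cpx.mul(cpx.i, cpx.i)))   --> (-1,0)
```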

Some people do not like the final return statement. One way of eliminating it is to assign the module table directly into package.loaded:

      local M = {}
      package.loaded[...] = M
      -- as before, without the return statement

Remember that require calls the loader passing the module name as the first argument. So, the vararg expression ... in the table index results in that name. After this assignment, we do not need to return M at the end of the module: if a module does not return a value, require will return the current value of package.loaded[modname] (if it is not nil). Anyway, I find it clearer to write the final return. If we forget it, any trivial test with the module will detect the error.
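The following sketch exercises this technique. The module name noreturn is hypothetical, and the loader function registered in package.preload plays the role of the module file, so that the example runs as a single chunk:

```lua
package.preload["noreturn"] = function (...)
  local M = {}
  package.loaded[...] = M      -- the first value of '...' is the name

  function M.hello ()
    return "hello"
  end
  -- no final return
end

local m = require "noreturn"
print(m.hello())                         --> hello
print(m == package.loaded["noreturn"])   --> true
```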

Another approach to writing a module is to define all functions as locals and build the returning table at the end, as in Figure 17.3, “Module with export list”.

What are the advantages of this approach? We do not need to prefix each name with M. or something similar; there is an explicit export list; and we define and use exported and internal functions in the same way inside the module. What are the disadvantages? The export list is at the end of the module instead of at the beginning, where it would be more useful as a quick documentation; and the export list is somewhat redundant, as we must write each name twice. (This last disadvantage may become an advantage, as it allows functions to have different names inside and outside the module, but I think programmers seldom do this.)
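A shortened sketch of this style follows, again wrapped in package.preload under the hypothetical name complex_list so that it runs as a single chunk. It is an illustration of the approach, not the exact code of the figure.

```lua
package.preload["complex_list"] = function ()
  local function new (r, i) return {r = r, i = i} end

  -- defines constant 'i'
  local i = new(0, 1)

  local function add (c1, c2)
    return new(c1.r + c2.r, c1.i + c2.i)
  end

  local function mul (c1, c2)
    return new(c1.r*c2.r - c1.i*c2.i, c1.r*c2.i + c1.i*c2.r)
  end

  -- shadows the global 'tostring' only inside this module
  local function tostring (c)
    return string.format("(%g,%g)", c.r, c.i)
  end

  -- the export list: each name appears twice
  return {
    new      = new,
    i        = i,
    add      = add,
    mul      = mul,
    tostring = tostring,
  }
end

local cl = require "complex_list"
print(cl.tostring(cl.add(cl.new(3, 4), cl.i)))   --> (3,5)
```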

Anyway, remember that no matter how we define a module, users should be able to use it in a standard way:

      local cpx = require "complex"
      print(cpx.tostring(cpx.add(cpx.new(3,4), cpx.i)))
        --> (3,5)

Later, we will see how we can use some advanced Lua features, such as metatables and environments, for writing modules. However, except for a nice technique to detect global variables created by mistake, I use only the basic approach in my modules.

Lua allows module names to be hierarchical, using a dot to separate name levels. For instance, a module named mod.sub is a submodule of mod. A package is a complete tree of modules; it is the unit of distribution in Lua.

When we require a module called mod.sub, the function require will query first the table package.loaded and then the table package.preload, using the original module name "mod.sub" as the key. Here, the dot is just a character like any other in the module name.

However, when searching for a file that defines that submodule, require translates the dot into another character, usually the system’s directory separator (e.g., a slash for POSIX or a backslash for Windows). After the translation, require searches for the resulting name like any other name. For instance, assume the slash as the directory separator and the following path:

      ./?.lua;/usr/local/lua/?.lua;/usr/local/lua/?/init.lua

The call require "a.b" will try to open the following files:

      ./a/b.lua
      /usr/local/lua/a/b.lua
      /usr/local/lua/a/b/init.lua
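We can check this translation with package.searchpath, which accepts two optional arguments: the separator to look for in the name and its replacement. The sketch below passes them explicitly, so that the result does not depend on the platform; it assumes Lua 5.2 or later and that none of the candidate files exists.

```lua
local path = "./?.lua;/usr/local/lua/?.lua;/usr/local/lua/?/init.lua"

-- replace each '.' in the module name with '/'
local found, tried = package.searchpath("a.b", path, ".", "/")

print(found)   -- nil, unless one of the files actually exists
print(tried)   -- message listing the file names that were tried
```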

This behavior allows all modules of a package to live in a single directory. For instance, if a package has modules p, p.a, and p.b, their respective files can be p/init.lua, p/a.lua, and p/b.lua, with the directory p within some appropriate directory.

The directory separator used by Lua is configured at compile time and can be any string (remember, Lua knows nothing about directories). For instance, systems without hierarchical directories can use an underscore as the directory separator, so that require "a.b" will search for a file a_b.lua.

Names in C cannot contain dots, so a C library for submodule a.b cannot export a function luaopen_a.b. Here, require translates the dot into another character, an underscore. So, a C library named a.b should name its initialization function luaopen_a_b.
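The name translation itself is a one-liner. The helper openname below is hypothetical and covers only the dot translation described here, not every detail of how require derives the function name:

```lua
-- Builds the name of the open function for a C (sub)module.
local function openname (modname)
  -- the parentheses keep only the first result of gsub
  return "luaopen_" .. (string.gsub(modname, "%.", "_"))
end

print(openname("a.b"))     --> luaopen_a_b
print(openname("a.b.c"))   --> luaopen_a_b_c
```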

As an extra facility, require has one more searcher for loading C submodules. When it cannot find either a Lua file or a C file for a submodule, this last searcher searches the C path again, but this time looking for the package name. For example, if the program requires a submodule a.b.c, this searcher will look for a. If it finds a C library for this name, then require looks into this library for an appropriate open function, luaopen_a_b_c in this example. This facility allows a distribution to put several submodules together, each with its own open function, into a single C library.

From the point of view of Lua, submodules in the same package have no explicit relationship. Requiring a module does not automatically load any of its submodules; similarly, requiring a submodule does not automatically load its parent. Of course, the package implementer is free to create these links if she wants. For instance, a particular module may start by explicitly requiring one or all of its submodules.

Exercise 17.1: Rewrite the implementation of double-ended queues (Figure 14.2, “A double-ended queue”) as a proper module.

Exercise 17.2: Rewrite the implementation of the geometric-region system (the section called “A Taste of Functional Programming”) as a proper module.

Exercise 17.3: What happens in the search for a library if the path has some fixed component (that is, a component without a question mark)? Can this behavior be useful?

Exercise 17.4: Write a searcher that searches for Lua files and C libraries at the same time. For instance, the path used for this searcher could be something like this:

      ./?.lua;./?.so;/usr/lib/lua5.2/?.so;/usr/share/lua5.2/?.lua

(Hint: use package.searchpath to find a proper file and then try to load it, first with loadfile and next with package.loadlib.)



[17] In the section called “C Modules”, we will discuss how to write C libraries.

[18] Since Lua 5.2, the stand-alone interpreter accepts the command-line option -E to prevent the use of those environment variables and force the default.
