“Cyberspace. Unthinkable complexity. Lines of light ranged in the non-space of the mind, clusters and constellations of data. Like city lights, receding . . .”
William Gibson, Neuromancer
Tragically, not every program in the world is written in Rust. There are many critical libraries and interfaces implemented in other languages that we would like to be able to use in our Rust programs. Rust’s foreign function interface (FFI) lets Rust code call functions written in C, and in some cases C++. Since most operating systems offer C interfaces, Rust’s foreign function interface allows immediate access to all sorts of low-level facilities.
In this chapter, we’ll write a program that links with `libgit2`, a C library for working with the Git version control system. First, we’ll show what it’s like to use C functions directly from Rust, using the unsafe features demonstrated in the previous chapter. Then, we’ll show how to construct a safe interface to `libgit2`, taking inspiration from the open source `git2-rs` crate, which does exactly that.
We’ll assume that you’re familiar with C and the mechanics of compiling and linking C programs. Working with C++ is similar. We’ll also assume that you’re somewhat familiar with the Git version control system.
There do exist Rust crates for communicating with many other languages, including Python, JavaScript, Lua, and Java. We don’t have room to cover them here, but ultimately, all these interfaces are built using the C foreign function interface, so this chapter should give you a head start no matter which language you need to work with.
The common denominator of Rust and C is machine language, so in order to anticipate what Rust values look like to C code, or vice versa, you need to consider their machine-level representations. Throughout the book, we’ve made a point of showing how values are actually represented in memory, so you’ve probably noticed that the data worlds of C and Rust have a lot in common: a Rust `usize` and a C `size_t` are identical, for example, and structs are fundamentally the same idea in both languages. To establish a correspondence between Rust and C types, we’ll start with primitives and then work our way up to more complicated types.
Given its primary use as a systems programming language, C has always been surprisingly loose about its types’ representations: an `int` is typically 32 bits long, but could be longer, or as short as 16 bits; a C `char` may be signed or unsigned; and so on. To cope with this variability, Rust’s `std::os::raw` module defines a set of Rust types that are guaranteed to have the same representation as certain C types (Table 23-1). These cover the primitive integer and character types.
C type | Corresponding `std::os::raw` type |
---|---|
`short` | `c_short` |
`int` | `c_int` |
`long` | `c_long` |
`long long` | `c_longlong` |
`unsigned short` | `c_ushort` |
`unsigned`, `unsigned int` | `c_uint` |
`unsigned long` | `c_ulong` |
`unsigned long long` | `c_ulonglong` |
`char` | `c_char` |
`signed char` | `c_schar` |
`unsigned char` | `c_uchar` |
`float` | `c_float` |
`double` | `c_double` |
`void *`, `const void *` | `*mut c_void`, `*const c_void` |
Some notes about Table 23-1:
- Except for `c_void`, all the Rust types here are aliases for some primitive Rust type: `c_char`, for example, is either `i8` or `u8`.
- A Rust `bool` is equivalent to a C or C++ `bool`.
- Rust’s 32-bit `char` type is not the analogue of `wchar_t`, whose width and encoding vary from one implementation to another. C’s `char32_t` type is closer, but its encoding is still not guaranteed to be Unicode.
- Rust’s primitive `usize` and `isize` types have the same representations as C’s `size_t` and `ptrdiff_t`.
- C and C++ pointers and C++ references correspond to Rust’s raw pointer types, `*mut T` and `*const T`.
Technically, the C standard permits implementations to use representations for which Rust has no corresponding type: 36-bit integers, sign-and-magnitude representations for signed values, and so on. In practice, on every platform Rust has been ported to, every common C integer type has a match in Rust.
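To see what these aliases resolve to on a particular machine, a small sketch like the following can help. The helper names `c_type_sizes` and `print_c_type_sizes` are ours, chosen for illustration; only the `std::os::raw` aliases come from the standard library.

```rust
use std::mem::size_of;
use std::os::raw::{c_char, c_int, c_long, c_longlong, c_short};

/// Returns (C type name, size in bytes) pairs for the current target.
/// Illustrative helper, not part of any library.
fn c_type_sizes() -> Vec<(&'static str, usize)> {
    vec![
        ("char", size_of::<c_char>()),
        ("short", size_of::<c_short>()),
        ("int", size_of::<c_int>()),
        ("long", size_of::<c_long>()),
        ("long long", size_of::<c_longlong>()),
    ]
}

/// Prints one line per type; call this from `main` to inspect a target.
fn print_c_type_sizes() {
    for (name, size) in c_type_sizes() {
        println!("{:<10} {} bytes", name, size);
    }
}
```

The sizes reported for `long` in particular differ between common targets (64-bit Linux and 64-bit Windows disagree), which is exactly the variability the aliases are meant to absorb.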
For defining Rust struct types compatible with C structs, you can use the `#[repr(C)]` attribute. Placing `#[repr(C)]` above a struct definition asks Rust to lay out the struct’s fields in memory the same way a C compiler would lay out the analogous C struct type. For example, `libgit2`’s git2/errors.h header file defines the following C struct to provide details about a previously reported error:

    typedef struct {
        char *message;
        int klass;
    } git_error;

You can define a Rust type with an identical representation as follows:

    use std::os::raw::{c_char, c_int};

    #[repr(C)]
    pub struct git_error {
        pub message: *const c_char,
        pub klass: c_int
    }

The `#[repr(C)]` attribute affects only the layout of the struct itself, not the representations of its individual fields, so to match the C struct, each field must use the C-like type as well: `*const c_char` for `char *`, `c_int` for `int`, and so on.
In this particular case, the `#[repr(C)]` attribute probably doesn’t change the layout of `git_error`. There really aren’t too many interesting ways to lay out a pointer and an integer. But whereas C and C++ guarantee that a structure’s members appear in memory in the order they’re declared, each at a distinct address, Rust reorders fields to minimize the overall size of the struct, and zero-sized types take up no space. The `#[repr(C)]` attribute tells Rust to follow C’s rules for the given type.
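The effect of field reordering is easiest to see with fields of mixed sizes. The following sketch uses two made-up structs, `CLayout` and `RustLayout`, with identical fields; it assumes a target where `u32` is 4-byte aligned, which holds on the common desktop and server platforms.

```rust
use std::mem::size_of;

/// Fields kept in declaration order, padded according to C's rules.
#[allow(dead_code)]
#[repr(C)]
struct CLayout {
    a: u8,
    b: u32,
    c: u8,
}

/// Same fields with the default representation: Rust may reorder them.
#[allow(dead_code)]
struct RustLayout {
    a: u8,
    b: u32,
    c: u8,
}

/// Returns (size with #[repr(C)], size with the default representation).
fn layout_sizes() -> (usize, usize) {
    (size_of::<CLayout>(), size_of::<RustLayout>())
}
```

Under the stated assumption, `CLayout` needs padding after `a` and after `c`, for 12 bytes in all. The default representation is free to place the two `u8` fields next to each other, so it is usually smaller, although Rust makes no promise about the exact layout.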
You can also use `#[repr(C)]` to control the representation of C-style enums:

    #[repr(C)]
    #[allow(non_camel_case_types)]
    enum git_error_code {
        GIT_OK = 0,
        GIT_ERROR = -1,
        GIT_ENOTFOUND = -3,
        GIT_EEXISTS = -4,
        ...
    }

Normally, Rust plays all sorts of games when choosing how to represent enums. For example, we mentioned the trick Rust uses to store `Option<&T>` in a single word (if `T` is sized). Without `#[repr(C)]`, Rust would use a single byte to represent the `git_error_code` enum; with `#[repr(C)]`, Rust uses a value the size of a C `int`, just as C would.
You can also ask Rust to give an enum the same representation as some integer type. Starting the preceding definition with `#[repr(i16)]` would give you a 16-bit type with the same representation as the following C++ enum:

    #include <stdint.h>

    enum git_error_code: int16_t {
        GIT_OK = 0,
        GIT_ERROR = -1,
        GIT_ENOTFOUND = -3,
        GIT_EEXISTS = -4,
        ...
    };

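The three representations can be compared directly with `size_of`. In this sketch the enum names `code_c`, `code_i16`, and `code_default` are invented for the comparison, and only three of the error codes are listed.

```rust
use std::mem::size_of;
use std::os::raw::c_int;

#[allow(non_camel_case_types, dead_code)]
#[repr(C)]
enum code_c { GIT_OK = 0, GIT_ERROR = -1, GIT_ENOTFOUND = -3 }

#[allow(non_camel_case_types, dead_code)]
#[repr(i16)]
enum code_i16 { GIT_OK = 0, GIT_ERROR = -1, GIT_ENOTFOUND = -3 }

#[allow(non_camel_case_types, dead_code)]
enum code_default { GIT_OK = 0, GIT_ERROR = -1, GIT_ENOTFOUND = -3 }

/// Returns the sizes of the #[repr(C)], #[repr(i16)], and default enums.
fn enum_sizes() -> (usize, usize, usize) {
    (size_of::<code_c>(), size_of::<code_i16>(), size_of::<code_default>())
}

/// A fieldless enum converts to its integer representation with `as`.
fn enotfound_as_i16() -> i16 {
    code_i16::GIT_ENOTFOUND as i16
}
```

The first size matches `size_of::<c_int>()` and the second is 2. The third is whatever the compiler picks; it is a single byte with current compilers, but that is an implementation choice and not a guarantee.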
As mentioned earlier, `#[repr(C)]` applies to unions as well. Fields of `#[repr(C)]` unions always start at the first bit of the union’s memory, at index 0.
Suppose you have a C struct that uses a union to hold some data and a tag value to indicate which field of the union should be used, similar to a Rust enum.

    enum tag {
        FLOAT = 0,
        INT = 1,
    };

    union number {
        float f;
        int i;
    };

    struct tagged_number {
        enum tag t;
        union number n;
    };

Rust code can interoperate with this structure by applying `#[repr(C)]` to the enum, structure, and union types, and using a `match` expression that selects a union field within a larger struct based on the tag:

    #[repr(C)]
    enum Tag {
        Float = 0,
        Int = 1
    }

    #[repr(C)]
    union FloatOrInt {
        f: f32,
        i: i32,
    }

    #[repr(C)]
    struct Value {
        tag: Tag,
        union: FloatOrInt
    }

    fn is_zero(v: Value) -> bool {
        use self::Tag::*;
        unsafe {
            match v {
                Value { tag: Int, union: FloatOrInt { i: 0 } } => true,
                Value { tag: Float, union: FloatOrInt { f: num } } => (num == 0.0),
                _ => false
            }
        }
    }

Even complex structures can be easily used across the FFI boundary using this kind of technique.
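One way to keep the `unsafe` reads in a single place is to convert the tagged union into an ordinary Rust enum as soon as it crosses the boundary. The sketch below is one possible shape for that; the `Number` enum, the constructor names, and the field name `n` are our own inventions for illustration.

```rust
/// Mirrors the C `enum tag`.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum Tag { Float = 0, Int = 1 }

#[repr(C)]
#[derive(Clone, Copy)]
union FloatOrInt { f: f32, i: i32 }

#[repr(C)]
#[derive(Clone, Copy)]
struct Value { tag: Tag, n: FloatOrInt }

/// A safe Rust view of the same data (hypothetical type).
#[derive(Debug, PartialEq)]
enum Number { Float(f32), Int(i32) }

impl Value {
    fn from_int(i: i32) -> Value {
        Value { tag: Tag::Int, n: FloatOrInt { i } }
    }

    fn from_float(f: f32) -> Value {
        Value { tag: Tag::Float, n: FloatOrInt { f } }
    }

    /// Reads the union field selected by the tag.
    fn to_number(&self) -> Number {
        // Safety: the constructors above always store the field that matches
        // the tag. A Value produced by C code is only as trustworthy as that code.
        match self.tag {
            Tag::Int => Number::Int(unsafe { self.n.i }),
            Tag::Float => Number::Float(unsafe { self.n.f }),
        }
    }
}
```

For example, `Value::from_int(7).to_number()` yields `Number::Int(7)`. Note that this only guards the union access: a tag value from C that is neither 0 nor 1 would already be invalid as a Rust `Tag`, so real bindings often carry the tag as a plain integer and validate it first.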
Passing strings between Rust and C is a little harder. C represents a string as a pointer to an array of characters, terminated by a null character. Rust, on the other hand, stores the length of a string explicitly, either as a field of a `String` or as the second word of a fat reference `&str`. Rust strings are not null-terminated; in fact, they may include null characters in their contents, like any other character.
This means that you can’t borrow a Rust string as a C string: if you pass C code a pointer into a Rust string, it could mistake an embedded null character for the end of the string or run off the end looking for a terminating null that isn’t there. Going the other direction, you may be able to borrow a C string as a Rust `&str`, as long as its contents are well-formed UTF-8.
This situation effectively forces Rust to treat C strings as types entirely distinct from `String` and `&str`. In the `std::ffi` module, the `CString` and `CStr` types represent owned and borrowed null-terminated arrays of bytes. Compared to `String` and `str`, the methods on `CString` and `CStr` are quite limited, restricted to construction and conversion to other types. We’ll show these types in action in the next section.
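The two directions of conversion can be sketched with nothing but the standard library. The wrapper names `to_c_string` and `as_rust_str` are hypothetical; `CString::new` and `CStr::to_str` are the standard methods doing the work.

```rust
use std::ffi::{CStr, CString};

/// Copies a Rust string into an owned, null-terminated C string.
/// Returns None if the text contains an interior null byte.
fn to_c_string(s: &str) -> Option<CString> {
    CString::new(s).ok()
}

/// Borrows a C string as a `&str`, if it holds well-formed UTF-8.
fn as_rust_str(c: &CStr) -> Option<&str> {
    c.to_str().ok()
}
```

Going from Rust to C always produces a new value, because the terminator has to be appended somewhere. Going from C to Rust can be a plain borrow, but only after the bytes have been checked.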
An `extern` block declares functions or variables defined in some other library that the final Rust executable will be linked with. For example, on most platforms, every Rust program is linked against the standard C library, so we can tell Rust about the C library’s `strlen` function like this:

    use std::os::raw::c_char;

    extern {
        fn strlen(s: *const c_char) -> usize;
    }

This gives Rust the function’s name and type, while leaving the definition to be linked in later.
Rust assumes that functions declared inside `extern` blocks use C conventions for passing arguments and accepting return values. They are defined as `unsafe` functions. These are the right choices for `strlen`: it is indeed a C function, and its specification in C requires that you pass it a valid pointer to a properly terminated string, which is a contract that Rust cannot enforce. (Almost any function that takes a raw pointer must be `unsafe`: safe Rust can construct raw pointers from arbitrary integers, and dereferencing such a pointer would be undefined behavior.)
With this `extern` block, we can call `strlen` like any other Rust function, although its type gives it away as a tourist:

    use std::ffi::CString;

    let rust_str = "I'll be back";
    let null_terminated = CString::new(rust_str).unwrap();
    unsafe {
        assert_eq!(strlen(null_terminated.as_ptr()), 12);
    }

The `CString::new` function builds a null-terminated C string. It first checks its argument for embedded null characters, since those cannot be represented in a C string, and returns an error if it finds any (hence the need to `unwrap` the result). Otherwise, it adds a null byte to the end and returns a `CString` owning the resulting characters.
The cost of `CString::new` depends on what type you pass it. It accepts anything that implements `Into<Vec<u8>>`. Passing a `&str` entails an allocation and a copy, as the conversion to `Vec<u8>` builds a heap-allocated copy of the string for the vector to own. But passing a `String` by value simply consumes the string and takes over its buffer, so unless appending the null character forces the buffer to be resized, the conversion requires no copying of text or allocation at all.
`CString` dereferences to `CStr`, whose `as_ptr` method returns a `*const c_char` pointing at the start of the string. This is the type that `strlen` expects. In the example, `strlen` runs down the string, finds the null character that `CString::new` placed there, and returns the length, as a byte count.
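Because a `&CStr` already guarantees a valid, null-terminated buffer, it is a natural parameter type for a safe wrapper around a function like `strlen`. The following is a sketch under two assumptions: the wrapper name `c_strlen` is ours, and the `unsafe extern` spelling requires Rust 1.82 or later (older toolchains accept the plain `extern` form shown earlier).

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

// Declared in the C standard library, which Rust programs link by default
// on most platforms.
unsafe extern "C" {
    fn strlen(s: *const c_char) -> usize;
}

/// Hypothetical safe wrapper: the `&CStr` argument upholds strlen's contract,
/// so callers need no `unsafe` block of their own.
fn c_strlen(s: &CStr) -> usize {
    // Safety: `s.as_ptr()` points to a null-terminated buffer that stays
    // alive for the duration of the call.
    unsafe { strlen(s.as_ptr()) }
}
```

A `CString` can be passed as `&null_terminated`, since it dereferences to `CStr`. The design point is that the unsafe contract is checked once, by the type of the argument, instead of at every call site.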
You can also declare global variables in `extern` blocks. POSIX systems have a global variable named `environ` that holds the values of the process’s environment variables. In C, it’s declared:

    extern char **environ;

In Rust, you would say:

    use std::ffi::CStr;
    use std::os::raw::c_char;

    extern {
        static environ: *mut *mut c_char;
    }

To print the environment’s first element, you could write:

    unsafe {
        if !environ.is_null() && !(*environ).is_null() {
            let var = CStr::from_ptr(*environ);
            println!("first environment variable: {}",
                     var.to_string_lossy())
        }
    }

After making sure `environ` has a first element, the code calls `CStr::from_ptr` to build a `CStr` that borrows it. The `to_string_lossy` method returns a `Cow<str>`: if the C string contains well-formed UTF-8, the `Cow` borrows its content as a `&str`, not including the terminating null byte. Otherwise, `to_string_lossy` makes a copy of the text in the heap, replaces the ill-formed UTF-8 sequences with the official Unicode replacement character, �, and builds an owning `Cow` from that. Either way, the result implements `Display`, so you can print it with the `{}` format parameter.
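The borrowed and owned cases of `to_string_lossy` can be told apart by matching on the `Cow`. This sketch builds its `CStr` values from byte literals instead of reading `environ`, so it runs on any platform; the helper name `needed_repair` is made up for the example.

```rust
use std::borrow::Cow;
use std::ffi::CStr;

/// Returns true if `to_string_lossy` had to allocate a repaired copy,
/// that is, if the C string was not well-formed UTF-8.
fn needed_repair(c: &CStr) -> bool {
    matches!(c.to_string_lossy(), Cow::Owned(_))
}
```

With `CStr::from_bytes_with_nul(b"PATH=/usr/bin\0")` the function returns `false`, since the text is valid UTF-8 and is simply borrowed. With `b"NAME=caf\xE9\0"`, a Latin-1 encoded string, it returns `true`, and the lossy result ends in the replacement character.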
To use functions provided by a particular library, you can place a `#[link]` attribute atop the `extern` block that names the library Rust should link the executable with. For example, here’s a program that calls `libgit2`’s initialization and shutdown methods, but does nothing else:

    use std::os::raw::c_int;

    #[link(name = "git2")]
    extern {
        pub fn git_libgit2_init() -> c_int;
        pub fn git_libgit2_shutdown() -> c_int;
    }

    fn main() {
        unsafe {
            git_libgit2_init();
            git_libgit2_shutdown();
        }
    }

The `extern` block declares the extern functions as before. The `#[link(name = "git2")]` attribute leaves a note in the crate to the effect that, when Rust creates the final executable or shared library, it should link against the `git2` library. Rust uses the system linker to build executables; on Unix, this passes the argument `-lgit2` on the linker command line; on Windows, it passes `git2.LIB`.
`#[link]` attributes work in library crates, too. When you build a program that depends on other crates, Cargo gathers together the link notes from the entire dependency graph and includes them all in the final link.
In this example, if you would like to follow along on your own machine, you’ll need to build `libgit2` for yourself. We used `libgit2` version 0.25.1. To compile `libgit2`, you will need to install the CMake build tool and the Python language; we used CMake version 3.8.0 and Python version 2.7.13.
The full instructions for building `libgit2` are available on its website, but they’re simple enough that we’ll show the essentials here. On Linux, assume you’ve already unzipped the library’s source into the directory /home/jimb/libgit2-0.25.1:

    $ cd /home/jimb/libgit2-0.25.1
    $ mkdir build
    $ cd build
    $ cmake ..
    $ cmake --build .

On Linux, this produces a shared library /home/jimb/libgit2-0.25.1/build/libgit2.so.0.25.1 with the usual nest of symlinks pointing to it, including one named libgit2.so. On macOS, the results are similar, but the library is named libgit2.dylib.
On Windows, things are also straightforward. Assume you’ve unzipped the source into the directory C:\Users\JimB\libgit2-0.25.1. In a Visual Studio command prompt:

    > cd C:\Users\JimB\libgit2-0.25.1
    > mkdir build
    > cd build
    > cmake -A x64 ..
    > cmake --build .

These are the same commands as used on Linux, except that you must request a 64-bit build when you run CMake the first time to match your Rust compiler. (If you have installed the 32-bit Rust toolchain, then you should omit the `-A x64` flag to the first `cmake` command.) This produces an import library git2.LIB and a dynamic-link library git2.DLL, both in the directory C:\Users\JimB\libgit2-0.25.1\build\Debug. (The remaining instructions are shown for Unix, except where Windows is substantially different.)
Create the Rust program in a separate directory:

    $ cd /home/jimb
    $ cargo new --bin git-toy
         Created binary (application) `git-toy` package

Take the code shown earlier and put it in src/main.rs. Naturally, if you try to build this, Rust has no idea where to find the `libgit2` you built:

    $ cd git-toy
    $ cargo run
       Compiling git-toy v0.1.0 (/home/jimb/git-toy)
    error: linking with `cc` failed: exit code: 1
      |
      = note: /usr/bin/ld: error: cannot find -lgit2
              src/main.rs:11: error: undefined reference to 'git_libgit2_init'
              src/main.rs:12: error: undefined reference to 'git_libgit2_shutdown'
              collect2: error: ld returned 1 exit status
    error: aborting due to previous error
    error: could not compile `git-toy`.
    To learn more, run the command again with --verbose.

You can tell Rust where to search for libraries by writing a build script, Rust code that Cargo compiles and runs at build time. Build scripts can do all sorts of things: generate code dynamically, compile C code to be included in the crate, and so on. In this case, all you need is to add a library search path to the executable’s link command. When Cargo runs the build script, it parses the build script’s output for information of this sort, so the build script simply needs to print the right magic to its standard output.
To create your build script, add a file named build.rs in the same directory as the Cargo.toml file, with the following contents:

    fn main() {
        println!(r"cargo:rustc-link-search=native=/home/jimb/libgit2-0.25.1/build");
    }

This is the right path for Linux; on Windows, you would change the path following the text `native=` to `C:\Users\JimB\libgit2-0.25.1\build\Debug`. (We’re cutting some corners to keep this example simple; in a real application, you should avoid using absolute paths in your build script. We cite documentation that shows how to do it right at the end of this section.)
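One common way to avoid a hardcoded path is to read the directory from an environment variable. The following sketch shows the idea; the variable name `LIBGIT2_LIB_DIR` and both function names are assumptions made for this example, not something Cargo or `libgit2` defines. The `cargo:rustc-link-search` and `cargo:rerun-if-env-changed` directives themselves are standard Cargo build-script output.

```rust
use std::env;

/// Builds the Cargo directive that adds `dir` to the native library search path.
fn link_search_directive(dir: &str) -> String {
    format!("cargo:rustc-link-search=native={}", dir)
}

/// What a build.rs `main` could call: read the (hypothetical) LIBGIT2_LIB_DIR
/// variable and pass its value on to Cargo.
fn emit_link_search() {
    // Ask Cargo to rerun the build script whenever the variable changes.
    println!("cargo:rerun-if-env-changed=LIBGIT2_LIB_DIR");
    if let Ok(dir) = env::var("LIBGIT2_LIB_DIR") {
        println!("{}", link_search_directive(&dir));
    }
}
```

With this in place, a build on Linux might be started as `LIBGIT2_LIB_DIR=/home/jimb/libgit2-0.25.1/build cargo run`. If the variable is unset, the script prints no search path and the linker falls back to its default directories.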
Now you can almost run the program. On macOS it may work immediately; on a Linux system you will probably see something like the following:

    $ cargo run
       Compiling git-toy v0.1.0 (/tmp/rustbook-transcript-tests/git-toy)
        Finished dev [unoptimized + debuginfo] target(s)
         Running `target/debug/git-toy`
    target/debug/git-toy: error while loading shared libraries:
    libgit2.so.25: cannot open shared object file: No such file or directory

This means that, although Cargo succeeded in linking the executable against the library, the dynamic loader doesn’t know where to find the shared library at run time. Windows reports this failure by popping up a dialog box. On Linux, you must set the `LD_LIBRARY_PATH` environment variable:

    $ export LD_LIBRARY_PATH=/home/jimb/libgit2-0.25.1/build:$LD_LIBRARY_PATH
    $ cargo run
        Finished dev [unoptimized + debuginfo] target(s) in 0.0 secs
         Running `target/debug/git-toy`

On macOS, you may need to set `DYLD_LIBRARY_PATH` instead.
On Windows, you must set the `PATH` environment variable:

    > set PATH=C:\Users\JimB\libgit2-0.25.1\build\Debug;%PATH%
    > cargo run
        Finished dev [unoptimized + debuginfo] target(s) in 0.0 secs
         Running `target/debug/git-toy`
    >

Naturally, in a deployed application you’d want to avoid having to set environment variables just to find your library’s code. One alternative is to statically link the C library into your crate. This copies the library’s object files into the crate’s .rlib file, alongside the object files and metadata for the crate’s Rust code. The entire collection then participates in the final link.
It is a Cargo convention that a crate that provides access to a C library should be named `LIB-sys`, where `LIB` is the name of the C library. A `-sys` crate should contain nothing but the statically linked library and Rust modules containing `extern` blocks and type definitions. Higher-level interfaces then belong in crates that depend on the `-sys` crate. This allows multiple higher-level crates to depend on the same `-sys` crate, assuming there is a single version of the `-sys` crate that meets everyone’s needs.
For the full details on Cargo’s support for build scripts and linking with system libraries, see the online Cargo documentation. It shows how to avoid absolute paths in build scripts, control compilation flags, use tools like `pkg-config`, and so on. The `git2-rs` crate also provides good examples to emulate; its build script handles some complex situations.
Figuring out how to use `libgit2` properly breaks down into two questions:
- What does it take to use `libgit2` functions in Rust?
- How can we build a safe Rust interface around them?
We’ll take these questions one at a time. In this section, we’ll write a program that’s essentially a single giant `unsafe` block filled with nonidiomatic Rust code, reflecting the clash of type systems and conventions that is inherent in mixing languages. We’ll call this the raw interface. The code will be messy, but it will make plain all the steps that must occur for Rust code to use `libgit2`.
Then, in the next section, we’ll build a safe interface to `libgit2` that puts Rust’s types to use enforcing the rules `libgit2` imposes on its users. Fortunately, `libgit2` is an exceptionally well-designed C library, so the questions that Rust’s safety requirements force us to ask all have pretty good answers, and we can construct an idiomatic Rust interface with no `unsafe` functions.
The program we’ll write is very simple: it takes a path as a command-line argument, opens the Git repository there, and prints out the head commit. But this is enough to illustrate the key strategies for building safe and idiomatic Rust interfaces.
For the raw interface, the program will end up needing a somewhat larger collection of functions and types from `libgit2` than we used before, so it makes sense to move the `extern` block into its own module. We’ll create a file named raw.rs in git-toy/src whose contents are as follows:

    #![allow(non_camel_case_types)]

    use std::os::raw::{c_int, c_char, c_uchar};

    #[link(name = "git2")]
    extern {
        pub fn git_libgit2_init() -> c_int;
        pub fn git_libgit2_shutdown() -> c_int;
        pub fn giterr_last() -> *const git_error;

        pub fn git_repository_open(out: *mut *mut git_repository,
                                   path: *const c_char) -> c_int;
        pub fn git_repository_free(repo: *mut git_repository);

        pub fn git_reference_name_to_id(out: *mut git_oid,
                                        repo: *mut git_repository,
                                        reference: *const c_char) -> c_int;

        pub fn git_commit_lookup(out: *mut *mut git_commit,
                                 repo: *mut git_repository,
                                 id: *const git_oid) -> c_int;

        pub fn git_commit_author(commit: *const git_commit) -> *const git_signature;
        pub fn git_commit_message(commit: *const git_commit) -> *const c_char;
        pub fn git_commit_free(commit: *mut git_commit);
    }

    #[repr(C)] pub struct git_repository { _private: [u8; 0] }
    #[repr(C)] pub struct git_commit { _private: [u8; 0] }

    #[repr(C)]
    pub struct git_error {
        pub message: *const c_char,
        pub klass: c_int
    }

    pub const GIT_OID_RAWSZ: usize = 20;

    #[repr(C)]
    pub struct git_oid {
        pub id: [c_uchar; GIT_OID_RAWSZ]
    }

    pub type git_time_t = i64;

    #[repr(C)]
    pub struct git_time {
        pub time: git_time_t,
        pub offset: c_int
    }

    #[repr(C)]
    pub struct git_signature {
        pub name: *const c_char,
        pub email: *const c_char,
        pub when: git_time
    }

Each item here is modeled on a declaration from `libgit2`’s own header files. For example, libgit2-0.25.1/include/git2/repository.h includes this declaration:

    extern int git_repository_open(git_repository **out, const char *path);

This function tries to open the Git repository at `path`. If all goes well, it creates a `git_repository` object and stores a pointer to it in the location pointed to by `out`. The equivalent Rust declaration is the following:

    pub fn git_repository_open(out: *mut *mut git_repository,
                               path: *const c_char) -> c_int;

The `libgit2` public header files define the `git_repository` type as a typedef for an incomplete struct type:

    typedef struct git_repository git_repository;

Since the details of this type are private to the library, the public headers never define `struct git_repository`, ensuring that the library’s users can never build an instance of this type themselves. One possible analogue to an incomplete struct type in Rust is this:

    #[repr(C)] pub struct git_repository { _private: [u8; 0] }

This is a struct type containing an array with no elements. Since the `_private` field isn’t `pub`, values of this type cannot be constructed outside this module, which is perfect as the reflection of a C type that only `libgit2` should ever construct, and which is manipulated solely through raw pointers.
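A variation on this pattern, described in the Rustonomicon’s chapter on FFI, adds a marker field so that the opaque type is also neither `Send`, `Sync`, nor `Unpin`. The sketch below uses a made-up type name, `opaque_handle`, to show the shape; whether the extra marker is worth it depends on how the C library treats its objects.

```rust
use std::marker::{PhantomData, PhantomPinned};
use std::mem::size_of;

/// Stand-in for an incomplete C struct (hypothetical name).
#[allow(non_camel_case_types)]
#[repr(C)]
pub struct opaque_handle {
    // Zero-length array: no data, and no way to construct a value from
    // outside this module because the field is private.
    _data: [u8; 0],
    // Opts out of Send, Sync, and Unpin, which Rust cannot verify for
    // an object managed by C code.
    _marker: PhantomData<(*mut u8, PhantomPinned)>,
}

/// The type itself occupies no space; only pointers to it are ever used.
fn opaque_size() -> usize {
    size_of::<opaque_handle>()
}
```

Either way, a `*mut opaque_handle` is an ordinary one-word pointer, which is all the C side ever sees.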
Writing large `extern` blocks by hand can be a chore. If you are creating a Rust interface to a complex C library, you may want to try using the `bindgen` crate, which has functions you can use from your build script to parse C header files and generate the corresponding Rust declarations automatically. We don’t have space to show `bindgen` in action here, but `bindgen`’s page on crates.io includes links to its documentation.
Next we’ll rewrite main.rs completely. First, we need to declare the `raw` module:

    mod raw;

According to `libgit2`’s conventions, fallible functions return an integer code that is positive or zero on success, and negative on failure. If an error occurs, the `giterr_last` function will return a pointer to a `git_error` structure providing more details about what went wrong. `libgit2` owns this structure, so we don’t need to free it ourselves, but it could be overwritten by the next library call we make. A proper Rust interface would use `Result`, but in the raw version, we want to use the `libgit2` functions just as they are, so we’ll have to roll our own function for handling errors:

    use std::ffi::CStr;
    use std::os::raw::c_int;

    fn check(activity: &'static str, status: c_int) -> c_int {
        if status < 0 {
            unsafe {
                let error = &*raw::giterr_last();
                println!("error while {}: {} ({})",
                         activity,
                         CStr::from_ptr(error.message).to_string_lossy(),
                         error.klass);
                std::process::exit(1);
            }
        }

        status
    }

We’ll use this function to check the results of `libgit2` calls like this:

    check("initializing library", raw::git_libgit2_init());

This uses the same `CStr` methods used earlier: `from_ptr` to construct the `CStr` from a C string and `to_string_lossy` to turn that into something Rust can print.
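For comparison, here is a rough sketch of what a `Result`-returning version of the status check could look like. It is not the raw interface’s `check` function: the name `status_to_result` is invented, and the `describe` closure stands in for the `giterr_last` lookup so that the example runs without `libgit2`.

```rust
use std::os::raw::c_int;

/// Converts a libgit2-style status code into a `Result`.
/// `describe` is a stand-in for fetching the library's last error message;
/// it is called only when `status` signals failure.
fn status_to_result<F>(status: c_int, describe: F) -> Result<c_int, String>
where
    F: FnOnce() -> String,
{
    if status < 0 {
        Err(format!("{} ({})", describe(), status))
    } else {
        Ok(status)
    }
}
```

Because the message is fetched lazily and copied into an owned `String`, it cannot be invalidated by a later library call, which addresses the overwriting hazard mentioned above.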
Next, we need a function to print out a commit:

    unsafe fn show_commit(commit: *const raw::git_commit) {
        let author = raw::git_commit_author(commit);

        let name = CStr::from_ptr((*author).name).to_string_lossy();
        let email = CStr::from_ptr((*author).email).to_string_lossy();
        println!("{} <{}>\n", name, email);

        let message = raw::git_commit_message(commit);
        println!("{}", CStr::from_ptr(message).to_string_lossy());
    }

Given a pointer to a `git_commit`, `show_commit` calls `git_commit_author` and `git_commit_message` to retrieve the information it needs. These two functions follow a convention that the `libgit2` documentation explains as follows:
If a function returns an object as a return value, that function is a getter and the object’s lifetime is tied to the parent object.
In Rust terms, `author` and `message` are borrowed from `commit`: `show_commit` doesn’t need to free them itself, but it must not hold on to them after `commit` is freed. Since this API uses raw pointers, Rust won’t check their lifetimes for us: if we do accidentally create dangling pointers, we probably won’t find out about it until the program crashes.
The preceding code assumes these fields hold UTF-8 text, which is not always correct. Git permits other encodings as well. Interpreting these strings properly would probably entail using the `encoding` crate. For brevity’s sake, we’ll gloss over those issues here.
Our program’s `main` function reads as follows:

    use std::ffi::CString;
    use std::mem;
    use std::ptr;
    use std::os::raw::c_char;

    fn main() {
        let path = std::env::args().skip(1).next()
            .expect("usage: git-toy PATH");
        let path = CString::new(path)
            .expect("path contains null characters");

        unsafe {
            check("initializing library", raw::git_libgit2_init());

            let mut repo = ptr::null_mut();
            check("opening repository",
                  raw::git_repository_open(&mut repo, path.as_ptr()));

            let c_name = b"HEAD\0".as_ptr() as *const c_char;
            let oid = {
                let mut oid = mem::MaybeUninit::uninit();
                check("looking up HEAD",
                      raw::git_reference_name_to_id(oid.as_mut_ptr(), repo, c_name));
                oid.assume_init()
            };

            let mut commit = ptr::null_mut();
            check("looking up commit",
                  raw::git_commit_lookup(&mut commit, repo, &oid));

            show_commit(commit);

            raw::git_commit_free(commit);

            raw::git_repository_free(repo);

            check("shutting down library", raw::git_libgit2_shutdown());
        }
    }

This starts with code to handle the path argument and initialize the library, all of which we’ve seen before. The first novel code is this:

    let mut repo = ptr::null_mut();
    check("opening repository",
          raw::git_repository_open(&mut repo, path.as_ptr()));

The call to `git_repository_open` tries to open the Git repository at the given path. If it succeeds, it allocates a new `git_repository` object for it and sets `repo` to point to that. Rust implicitly coerces references into raw pointers, so passing `&mut repo` here provides the `*mut *mut git_repository` the call expects.
This shows another `libgit2` convention in use (from the `libgit2` documentation):
Objects which are returned via the first argument as a pointer-to-pointer are owned by the caller and it is responsible for freeing them.
In Rust terms, functions like `git_repository_open` pass ownership of the new value to the caller.
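The pointer-to-pointer convention can be tried out without `libgit2` by writing the "C side" in Rust. Everything in this sketch is a stand-in invented for the example: `Thing` plays the role of a library object, `thing_open` mimics a constructor that returns its result through `out`, and `thing_free` mimics the matching destructor.

```rust
use std::os::raw::c_int;
use std::ptr;

/// Stand-in for a library-owned object (hypothetical).
struct Thing {
    value: i32,
}

/// Mimics a C-style constructor: stores a new object through `out`
/// and returns 0 for success.
unsafe fn thing_open(out: *mut *mut Thing, value: i32) -> c_int {
    unsafe { *out = Box::into_raw(Box::new(Thing { value })) };
    0
}

/// Mimics the matching C-style destructor.
unsafe fn thing_free(thing: *mut Thing) {
    if !thing.is_null() {
        unsafe { drop(Box::from_raw(thing)) };
    }
}

/// Opens a Thing, reads it, and frees it, as the caller of such an API must.
fn open_read_free(value: i32) -> i32 {
    let mut thing: *mut Thing = ptr::null_mut();
    unsafe {
        // `&mut thing` is a `&mut *mut Thing`, which coerces to `*mut *mut Thing`.
        let status = thing_open(&mut thing, value);
        assert_eq!(status, 0);
        let seen = (*thing).value;
        // The caller owns the object, so the caller must free it, exactly once.
        thing_free(thing);
        seen
    }
}
```

Nothing in the types stops a caller from forgetting `thing_free` or calling it twice; that bookkeeping is what a safe wrapper would take over.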
Next, consider the code that looks up the object hash of the repository’s current head commit:

    let oid = {
        let mut oid = mem::MaybeUninit::uninit();
        check("looking up HEAD",
              raw::git_reference_name_to_id(oid.as_mut_ptr(), repo, c_name));
        oid.assume_init()
    };

The `git_oid` type stores an object identifier: a 160-bit hash code that Git uses internally (and throughout its delightful user interface) to identify commits, individual versions of files, and so on. This call to `git_reference_name_to_id` looks up the object identifier of the current `"HEAD"` commit.
In C it’s perfectly normal to initialize a variable by passing a pointer to it to some function that fills in its value; this is how `git_reference_name_to_id` expects to treat its first argument. But Rust won’t let us borrow a reference to an uninitialized variable. We could initialize `oid` with zeros, but this is a waste: any value stored there will simply be overwritten.
It is possible to ask Rust to give us uninitialized memory, but because reading uninitialized memory at any time is instant undefined behavior, Rust provides an abstraction, `MaybeUninit`, to ease its use. `MaybeUninit<T>` tells the compiler to set aside enough memory for your type `T`, but not to touch it until you say that it’s safe to do so. While this memory is owned by the `MaybeUninit`, the compiler will also avoid certain optimizations that could otherwise cause undefined behavior even without any explicit access to the uninitialized memory in your code.
`MaybeUninit` provides a method, `as_mut_ptr()`, that produces a `*mut T` pointing to the potentially uninitialized memory it wraps. By passing that pointer to a foreign function that initializes the memory and then calling the unsafe method `assume_init` on the `MaybeUninit` to produce a fully initialized `T`, you can avoid undefined behavior without the additional overhead that comes from initializing and immediately throwing away a value. `assume_init` is unsafe because calling it on a `MaybeUninit` without being certain that the memory is actually initialized will immediately cause undefined behavior.
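The same sequence of steps can be exercised in isolation with a Rust function playing the part of the foreign initializer. In this sketch, `Oid`, `fill_oid`, and `make_oid` are all hypothetical names; `fill_oid` stands in for a C function that writes its result through an output pointer and returns a status code.

```rust
use std::mem::MaybeUninit;
use std::os::raw::c_int;

/// Stand-in for a plain-data C struct such as an object identifier.
#[repr(C)]
struct Oid {
    id: [u8; 20],
}

/// Mimics a C function that fills in `*out` and returns 0 on success.
unsafe fn fill_oid(out: *mut Oid, byte: u8) -> c_int {
    // `write` stores the value without reading the old, uninitialized contents.
    unsafe { out.write(Oid { id: [byte; 20] }) };
    0
}

/// Obtains an initialized Oid without zeroing the memory first.
fn make_oid(byte: u8) -> Option<Oid> {
    let mut oid = MaybeUninit::<Oid>::uninit();
    let status = unsafe { fill_oid(oid.as_mut_ptr(), byte) };
    if status < 0 {
        // On failure the memory may still be uninitialized: never read it.
        return None;
    }
    // Safety: fill_oid reported success, so it wrote the entire Oid.
    Some(unsafe { oid.assume_init() })
}
```

The check of `status` before `assume_init` matters: the call to `assume_init` is only sound on the path where the initializer is known to have run to completion.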
In this case, it is safe because `git_reference_name_to_id` initializes the memory owned by the `MaybeUninit`. We could use `MaybeUninit` for the `repo` and `commit` variables as well, but since these are just single words, we just go ahead and initialize them to null:

    let mut commit = ptr::null_mut();
    check("looking up commit",
          raw::git_commit_lookup(&mut commit, repo, &oid));

This takes the commit’s object identifier and looks up the actual commit, storing a `git_commit` pointer in `commit` on success.
The remainder of the `main` function should be self-explanatory. It calls the `show_commit` function defined earlier, frees the commit and repository objects, and shuts down the library.
Now we can try out the program on any Git repository ready at hand:

    $ cargo run /home/jimb/rbattle
        Finished dev [unoptimized + debuginfo] target(s) in 0.0 secs
         Running `target/debug/git-toy /home/jimb/rbattle`
    Jim Blandy <jimb@red-bean.com>

    Animate goop a bit.

The raw interface to `libgit2` is a perfect example of an unsafe feature: it certainly can be used correctly (as we do here, so far as we know), but Rust can’t enforce the rules you must follow. Designing a safe API for a library like this is a matter of identifying all these rules and then finding ways to turn any violation of them into a type or borrow-checking error.
Here, then, are `libgit2`’s rules for the features the program uses:
- You must call `git_libgit2_init` before using any other library function. You must not use any library function after calling `git_libgit2_shutdown`.
- All values passed to `libgit2` functions must be fully initialized, except for output parameters.
- When a call fails, output parameters passed to hold the results of the call are left uninitialized, and you must not use their values.
- A `git_commit` object refers to the `git_repository` object it is derived from, so the former must not outlive the latter. (This isn’t spelled out in the `libgit2` documentation; we inferred it from the presence of certain functions in the interface and then verified it by reading the source code.)
- Similarly, a `git_signature` is always borrowed from a given `git_commit`, and the former must not outlive the latter. (The documentation does cover this case.)
- The message associated with a commit and the name and email address of the author are all borrowed from the commit and must not be used after the commit is freed.
- Once a `libgit2` object has been freed, it must never be used again.
As it turns out, you can build a Rust interface to `libgit2` that enforces all of these rules, either through Rust’s type system or by managing details internally.
Before we get started, let’s restructure the project a little bit. We’d like to have a git
module that exports the safe interface, of which the raw interface from the previous program is a private submodule.
The whole source tree will look like this:
git-toy/
├── Cargo.toml
├── build.rs
└── src/
    ├── main.rs
    └── git/
        ├── mod.rs
        └── raw.rs
Following the rules we explained in “Modules in Separate Files”, the source for the git module appears in git/mod.rs, and the source for its git::raw submodule goes in git/raw.rs.
Once again, we’re going to rewrite main.rs completely. It should start with a declaration of the git module:

mod git;

Then, we’ll need to create the git subdirectory and move raw.rs into it:
$ cd /home/jimb/git-toy
$ mkdir src/git
$ mv src/raw.rs src/git/raw.rs

The git module needs to declare its raw submodule. The file src/git/mod.rs must say:

mod raw;

Since it’s not pub, this submodule is not visible to the main program.
In a bit we’ll need to use some functions from the libc crate, so we must add a dependency in Cargo.toml. The full file now reads:

[package]
name = "git-toy"
version = "0.1.0"
authors = ["You <you@example.com>"]
edition = "2018"

[dependencies]
libc = "0.2"

Now that we’ve restructured our modules, let’s consider error handling. Even libgit2’s initialization function can return an error code, so we’ll need to have this sorted out before we can get started. An idiomatic Rust interface needs its own Error type that captures the libgit2 failure code as well as the error message and class from giterr_last. A proper error type must implement the usual Error, Debug, and Display traits. Then, it needs its own Result type that uses this Error type. Here are the necessary definitions in src/git/mod.rs:

use std::error;
use std::fmt;
use std::result;

#[derive(Debug)]
pub struct Error {
    code: i32,
    message: String,
    class: i32
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter) -> result::Result<(), fmt::Error> {
        // Displaying an `Error` simply displays the message from libgit2.
        self.message.fmt(f)
    }
}

impl error::Error for Error { }

pub type Result<T> = result::Result<T, Error>;

To check the result from raw library calls, the module needs a function that turns a libgit2 return code into a Result:

use std::os::raw::c_int;
use std::ffi::CStr;

fn check(code: c_int) -> Result<c_int> {
    if code >= 0 {
        return Ok(code);
    }

    unsafe {
        let error = raw::giterr_last();

        // libgit2 ensures that (*error).message is always non-null and null
        // terminated, so this call is safe.
        let message = CStr::from_ptr((*error).message)
            .to_string_lossy()
            .into_owned();

        Err(Error {
            code: code as i32,
            message,
            class: (*error).klass as i32
        })
    }
}

The main difference between this and the check function from the raw version is that this constructs an Error value instead of printing an error message and exiting immediately.
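To see why returning a Result is more convenient for callers, consider the following self-contained sketch. It does not call libgit2: `last_error_message` is a hypothetical stand-in for reading the message from giterr_last, and `open_then_count` is an invented caller that shows how the ? operator propagates the first failure.

```rust
use std::fmt;

#[derive(Debug, PartialEq)]
pub struct Error {
    pub code: i32,
    pub message: String,
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        // As in the chapter, displaying the error shows only its message.
        self.message.fmt(f)
    }
}

impl std::error::Error for Error {}

/// Hypothetical stand-in for fetching the library's last error message.
fn last_error_message() -> String {
    "mock failure".to_string()
}

/// Turn a C-style return code into a `Result`: negative means failure.
pub fn check(code: i32) -> Result<i32, Error> {
    if code >= 0 {
        return Ok(code);
    }
    Err(Error { code, message: last_error_message() })
}

/// Invented caller: two "library calls" in a row, with `?` returning
/// early on the first failure instead of exiting the process.
pub fn open_then_count(open_code: i32, count_code: i32) -> Result<i32, Error> {
    check(open_code)?;
    check(count_code)
}
```

Because the failure is an ordinary value, the caller decides whether to retry, report, or propagate it.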
Now we’re ready to tackle libgit2 initialization. The safe interface will provide a Repository type that represents an open Git repository, with methods for resolving references, looking up commits, and so on. Continuing in git/mod.rs, here’s the definition of Repository:

/// A Git repository.
pub struct Repository {
    // This must always be a pointer to a live `git_repository` structure.
    // No other `Repository` may point to it.
    raw: *mut raw::git_repository
}

A Repository’s raw field is not public. Since only code in this module can access the raw::git_repository pointer, getting this module right should ensure the pointer is always used correctly.
If the only way to create a Repository is to successfully open a fresh Git repository, that will ensure that each Repository points to a distinct git_repository object:

use std::path::Path;
use std::ptr;

impl Repository {
    pub fn open<P: AsRef<Path>>(path: P) -> Result<Repository> {
        ensure_initialized();

        let path = path_to_cstring(path.as_ref())?;
        let mut repo = ptr::null_mut();
        unsafe {
            check(raw::git_repository_open(&mut repo, path.as_ptr()))?;
        }

        Ok(Repository { raw: repo })
    }
}

Since the only way to do anything with the safe interface is to start with a Repository value, and Repository::open starts with a call to ensure_initialized, we can be confident that ensure_initialized will be called before any libgit2 functions. Its definition is as follows:

fn ensure_initialized() {
    static ONCE: std::sync::Once = std::sync::Once::new();
    ONCE.call_once(|| {
        unsafe {
            check(raw::git_libgit2_init())
                .expect("initializing libgit2 failed");
            assert_eq!(libc::atexit(shutdown), 0);
        }
    });
}

extern fn shutdown() {
    unsafe {
        if let Err(e) = check(raw::git_libgit2_shutdown()) {
            eprintln!("shutting down libgit2 failed: {}", e);
            std::process::abort();
        }
    }
}

The std::sync::Once type helps run initialization code in a thread-safe way. Only the first thread to call ONCE.call_once runs the given closure. Any subsequent calls, by this thread or any other, block until the first has completed and then return immediately, without running the closure again. Once the closure has finished, calling ONCE.call_once is cheap, requiring nothing more than an atomic load of a flag stored in ONCE.
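The run-exactly-once behavior is easy to observe in isolation. The sketch below is independent of libgit2: the `INIT_RUNS` counter and the `init_from_many_threads` helper are invented for illustration, and the closure just bumps the counter where the real code would call the library’s initialization function.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Once;
use std::thread;

// Counts how many times the initialization closure actually ran.
static INIT_RUNS: AtomicUsize = AtomicUsize::new(0);

pub fn ensure_initialized() {
    static ONCE: Once = Once::new();
    ONCE.call_once(|| {
        // Stand-in for the real one-time setup work.
        INIT_RUNS.fetch_add(1, Ordering::SeqCst);
    });
}

/// Call `ensure_initialized` from `n` threads at once and report how many
/// times the closure has run in total.
pub fn init_from_many_threads(n: usize) -> usize {
    let handles: Vec<_> = (0..n)
        .map(|_| thread::spawn(ensure_initialized))
        .collect();
    for handle in handles {
        handle.join().expect("thread panicked");
    }
    INIT_RUNS.load(Ordering::SeqCst)
}
```

However many threads race to call `ensure_initialized`, and however often, the counter should settle at one.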
In the preceding code, the initialization closure calls git_libgit2_init and checks the result. It punts a bit and just uses expect to make sure initialization succeeded, instead of trying to propagate errors back to the caller.
To make sure the program calls git_libgit2_shutdown, the initialization closure uses the C library’s atexit function, which takes a pointer to a function to invoke before the process exits. Rust closures cannot serve as C function pointers: a closure is a value of some anonymous type carrying the values of whatever variables it captures or references to them; a C function pointer is just a pointer. However, Rust fn types work fine, as long as you declare them extern so that Rust knows to use the C calling conventions. The local function shutdown fits the bill and ensures libgit2 gets shut down properly.
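The distinction between closures and extern functions can be sketched without involving atexit. In the following example, `CCallback`, `on_exit`, and `register_and_fire` are all hypothetical names; `register_and_fire` simply invokes the callback immediately, standing in for a C function that would store the pointer and call it later.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Records how many times the callback has been invoked.
pub static CALLS: AtomicUsize = AtomicUsize::new(0);

/// The shape of callback a C API like `atexit` expects: a bare function
/// pointer using the C calling convention, with no captured state.
pub type CCallback = extern "C" fn();

/// An ordinary Rust function declared `extern "C"`. Any state it needs
/// must live in statics, since it cannot capture local variables.
pub extern "C" fn on_exit() {
    CALLS.fetch_add(1, Ordering::SeqCst);
}

/// Hypothetical stand-in for a C registration function. A real one would
/// keep the pointer and call it later; this one calls it right away.
pub fn register_and_fire(callback: CCallback) {
    callback();
}

// By contrast, a capturing closure would be rejected by the type checker:
//
//     let mut count = 0;
//     register_and_fire(|| count += 1); // error: expected fn pointer
```

The function item `on_exit` coerces to `CCallback` automatically when passed as an argument.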
In “Unwinding”, we mentioned that it is undefined behavior for a panic to cross language boundaries. The call from atexit to shutdown is such a boundary, so it is essential that shutdown not panic. This is why shutdown can’t simply use .expect to handle errors reported from raw::git_libgit2_shutdown. Instead, it must report the error and terminate the process itself. POSIX forbids calling exit within an atexit handler, so shutdown calls std::process::abort to terminate the program abruptly.
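Aborting is the approach taken here. When a callback is able to report failure to its C caller, a different technique is sometimes used instead: catching the panic at the boundary with std::panic::catch_unwind and converting it to an error code. The sketch below shows that alternative, not what shutdown does; `risky_cleanup` and `cleanup_callback` are invented names, and the approach only helps when panics unwind rather than abort.

```rust
use std::panic;

/// Invented Rust-side work that may panic.
fn risky_cleanup(fail: bool) {
    if fail {
        panic!("cleanup failed");
    }
}

/// An `extern "C"` entry point that never lets a panic escape into the
/// foreign caller. Returns 0 on success and -1 if the Rust code panicked.
pub extern "C" fn cleanup_callback(fail: bool) -> i32 {
    match panic::catch_unwind(move || risky_cleanup(fail)) {
        Ok(()) => 0,
        Err(_) => -1, // the panic payload is dropped; only the code crosses
    }
}
```

An atexit handler has no return value to carry such a code, which is one reason the chapter’s shutdown function reports the error itself and aborts.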
It might be possible to arrange to call git_libgit2_shutdown sooner, say, when the last Repository value is dropped. But no matter how we arrange things, calling git_libgit2_shutdown must be the safe API’s responsibility. The moment it is called, any extant libgit2 objects become unsafe to use, so a safe API must not expose this function directly.
A Repository’s raw pointer must always point to a live git_repository object. This implies that the only way to close a repository is to drop the Repository value that owns it:

impl Drop for Repository {
    fn drop(&mut self) {
        unsafe {
            raw::git_repository_free(self.raw);
        }
    }
}

By calling git_repository_free only when the sole pointer to the raw::git_repository is about to go away, the Repository type also ensures the pointer will never be used after it’s freed.
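The same ownership discipline can be tried out without a C library. In this sketch, `RawThing`, `raw_thing_new`, and `raw_thing_free` are hypothetical stand-ins for a C object and its allocate and free functions, implemented here with Box so the example is self-contained; the `FREED` counter exists only so the behavior can be checked.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts calls to the stand-in "free" function.
static FREED: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for an opaque C object.
pub struct RawThing {
    value: i32,
}

/// Stand-in for a C constructor returning an owning raw pointer.
fn raw_thing_new(value: i32) -> *mut RawThing {
    Box::into_raw(Box::new(RawThing { value }))
}

/// Stand-in for a C destructor.
///
/// Safety: `ptr` must come from `raw_thing_new` and not be freed already.
unsafe fn raw_thing_free(ptr: *mut RawThing) {
    drop(unsafe { Box::from_raw(ptr) });
    FREED.fetch_add(1, Ordering::SeqCst);
}

/// Safe owner: `raw` is private, always live, and unique to this value.
pub struct Thing {
    raw: *mut RawThing,
}

impl Thing {
    pub fn new(value: i32) -> Thing {
        Thing { raw: raw_thing_new(value) }
    }

    pub fn value(&self) -> i32 {
        // Sound because `raw` is live for as long as `self` exists.
        unsafe { (*self.raw).value }
    }
}

impl Drop for Thing {
    fn drop(&mut self) {
        // The only call to the free function, made exactly once per owner.
        unsafe { raw_thing_free(self.raw) }
    }
}

pub fn freed_count() -> usize {
    FREED.load(Ordering::SeqCst)
}
```

Since `Thing` is neither Clone nor Copy and its field is private, safe code has no way to obtain a second pointer or to use the object after the free.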
The Repository::open method uses a private function called path_to_cstring, which has two definitions, one for Unix-like systems and one for Windows:

use std::ffi::CString;

#[cfg(unix)]
fn path_to_cstring(path: &Path) -> Result<CString> {
    // The `as_bytes` method exists only on Unix-like systems.
    use std::os::unix::ffi::OsStrExt;

    Ok(CString::new(path.as_os_str().as_bytes())?)
}

#[cfg(windows)]
fn path_to_cstring(path: &Path) -> Result<CString> {
    // Try to convert to UTF-8. If this fails, libgit2 can't handle the path
    // anyway.
    match path.to_str() {
        Some(s) => Ok(CString::new(s)?),
        None => {
            let message = format!("Couldn't convert path '{}' to UTF-8",
                                  path.display());
            Err(message.into())
        }
    }
}

The libgit2 interface makes this code a little tricky. On all platforms, libgit2 accepts paths as null-terminated C strings. On Windows, libgit2 assumes these C strings hold well-formed UTF-8 and converts them internally to the 16-bit paths Windows actually requires. This usually works, but it’s not ideal. Windows permits filenames that are not well-formed Unicode and thus cannot be represented in UTF-8. If you have such a file, it’s impossible to pass its name to libgit2.
In Rust, the proper representation of a filesystem path is a std::path::Path, carefully designed to handle any path that can appear on Windows or POSIX. This means that there are Path values on Windows that one cannot pass to libgit2, because they are not well-formed UTF-8. So although path_to_cstring’s behavior is less than ideal, it’s actually the best we can do given libgit2’s interface.
The two path_to_cstring definitions just shown rely on conversions to our Error type: the ? operator attempts such conversions, and the Windows version explicitly calls .into(). These conversions are unremarkable:

impl From<String> for Error {
    fn from(message: String) -> Error {
        Error { code: -1, message, class: 0 }
    }
}

// NulError is what `CString::new` returns if a string
// has embedded zero bytes.
impl From<std::ffi::NulError> for Error {
    fn from(e: std::ffi::NulError) -> Error {
        Error { code: -1, message: e.to_string(), class: 0 }
    }
}

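To make the role of these conversions concrete, here is a small standalone sketch. The `Error` struct is a simplified copy of the one in this chapter, and `to_c_string` is an invented helper; the point is only that ? calls From::from on the NulError that CString::new returns for a string with an embedded zero byte.

```rust
use std::ffi::{CString, NulError};

/// Simplified version of the module's error type.
#[derive(Debug)]
pub struct Error {
    pub code: i32,
    pub message: String,
}

impl From<NulError> for Error {
    fn from(e: NulError) -> Error {
        Error { code: -1, message: e.to_string() }
    }
}

/// Invented helper: the `?` here converts a `NulError` into our `Error`
/// by way of the `From` implementation above.
pub fn to_c_string(s: &str) -> Result<CString, Error> {
    Ok(CString::new(s)?)
}
```

Calling `to_c_string("HEAD")` succeeds, while a string such as `"bad\0name"` yields an `Err` whose code is -1.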
Next, let’s figure out how to resolve a Git reference to an object identifier. Since an object identifier is just a 20-byte hash value, it’s perfectly fine to expose it in the safe API:

/// The identifier of some sort of object stored in the Git object
/// database: a commit, tree, blob, tag, etc. This is a wide hash of the
/// object's contents.
pub struct Oid {
    pub raw: raw::git_oid
}

We’ll add a method to Repository to perform the lookup:

use std::mem;
use std::os::raw::c_char;

impl Repository {
    pub fn reference_name_to_id(&self, name: &str) -> Result<Oid> {
        let name = CString::new(name)?;
        unsafe {
            let oid = {
                let mut oid = mem::MaybeUninit::uninit();
                check(raw::git_reference_name_to_id(
                        oid.as_mut_ptr(), self.raw,
                        name.as_ptr() as *const c_char))?;
                oid.assume_init()
            };
            Ok(Oid { raw: oid })
        }
    }
}

Although oid is left uninitialized when the lookup fails, this function guarantees that its caller can never see the uninitialized value simply by following Rust’s Result idiom: either the caller gets an Ok carrying a properly initialized Oid value, or it gets an Err.
Next, the module needs a way to retrieve commits from the repository. We’ll define a Commit type as follows:

use std::marker::PhantomData;

pub struct Commit<'repo> {
    // This must always be a pointer to a usable `git_commit` structure.
    raw: *mut raw::git_commit,
    _marker: PhantomData<&'repo Repository>
}

As we mentioned earlier, a git_commit object must never outlive the git_repository object it was retrieved from. Rust’s lifetimes let the code capture this rule precisely.
The RefWithFlag example earlier in this chapter used a PhantomData field to tell Rust to treat a type as if it contained a reference with a given lifetime, even though the type apparently contained no such reference. The Commit type needs to do something similar. In this case, the _marker field’s type is PhantomData<&'repo Repository>, indicating that Rust should treat Commit<'repo> as if it held a reference with lifetime 'repo to some Repository.
The method for looking up a commit is as follows:

impl Repository {
    pub fn find_commit(&self, oid: &Oid) -> Result<Commit> {
        let mut commit = ptr::null_mut();
        unsafe {
            check(raw::git_commit_lookup(&mut commit, self.raw, &oid.raw))?;
        }
        Ok(Commit { raw: commit, _marker: PhantomData })
    }
}

How does this relate the Commit’s lifetime to the Repository’s? The signature of find_commit omits the lifetimes of the references involved according to the rules outlined in “Omitting Lifetime Parameters”. If we were to write the lifetimes out, the full signature would read:

fn find_commit<'repo, 'id>(&'repo self, oid: &'id Oid)
    -> Result<Commit<'repo>>

This is exactly what we want: Rust treats the returned Commit as if it borrows something from self, which is the Repository.
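The effect of the PhantomData marker can be explored with a small model that needs no C code. Here `Store` and `Item` are invented types playing the roles of Repository and Commit: an `Item` holds only a raw pointer into the `Store`, yet the marker makes the borrow checker treat it as a borrow.

```rust
use std::marker::PhantomData;

/// Invented owner type, playing the role of `Repository`.
pub struct Store {
    items: Vec<i32>,
}

/// Invented handle type, playing the role of `Commit`. It holds a raw
/// pointer, but the marker ties it to a borrow of the `Store`.
pub struct Item<'store> {
    raw: *const i32,
    _marker: PhantomData<&'store Store>,
}

impl Store {
    pub fn new(items: Vec<i32>) -> Store {
        Store { items }
    }

    /// With lifetimes written out, this returns `Option<Item<'a>>` where
    /// `'a` is the lifetime of the `&'a self` borrow.
    pub fn get(&self, index: usize) -> Option<Item<'_>> {
        self.items.get(index).map(|r| Item {
            raw: r as *const i32,
            _marker: PhantomData,
        })
    }
}

impl<'store> Item<'store> {
    pub fn value(&self) -> i32 {
        // Sound because the `Store` cannot be dropped or mutated while
        // this `Item` exists: the borrow checker sees it as borrowed.
        unsafe { *self.raw }
    }
}

// The following would be rejected at compile time, because `item` is
// treated as borrowing `store`:
//
//     let store = Store::new(vec![1]);
//     let item = store.get(0).unwrap();
//     drop(store);            // error: cannot move out of `store`
//     println!("{}", item.value());
```

Without the `_marker` field, `Item` would have no lifetime parameter at all, and nothing would stop the pointer from dangling.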
When a Commit is dropped, it must free its raw::git_commit:

impl<'repo> Drop for Commit<'repo> {
    fn drop(&mut self) {
        unsafe {
            raw::git_commit_free(self.raw);
        }
    }
}

From a Commit, you can borrow a Signature (a name and email address) and the text of the commit message:

impl<'repo> Commit<'repo> {
    pub fn author(&self) -> Signature {
        unsafe {
            Signature {
                raw: raw::git_commit_author(self.raw),
                _marker: PhantomData
            }
        }
    }

    pub fn message(&self) -> Option<&str> {
        unsafe {
            let message = raw::git_commit_message(self.raw);
            char_ptr_to_str(self, message)
        }
    }
}

Here’s the Signature type:

pub struct Signature<'text> {
    raw: *const raw::git_signature,
    _marker: PhantomData<&'text str>
}

A git_signature object always borrows its text from elsewhere; in particular, signatures returned by git_commit_author borrow their text from the git_commit. So our safe Signature type includes a PhantomData<&'text str> to tell Rust to behave as if it contained a &str with a lifetime of 'text. Just as before, Commit::author properly connects this 'text lifetime of the Signature it returns to that of the Commit without us needing to write a thing. The Commit::message method does the same with the Option<&str> holding the commit message.
A Signature includes methods for retrieving the author’s name and email address:

impl<'text> Signature<'text> {
    /// Return the author's name as a `&str`,
    /// or `None` if it is not well-formed UTF-8.
    pub fn name(&self) -> Option<&str> {
        unsafe {
            char_ptr_to_str(self, (*self.raw).name)
        }
    }

    /// Return the author's email as a `&str`,
    /// or `None` if it is not well-formed UTF-8.
    pub fn email(&self) -> Option<&str> {
        unsafe {
            char_ptr_to_str(self, (*self.raw).email)
        }
    }
}

The preceding methods depend on a private utility function char_ptr_to_str:

/// Try to borrow a `&str` from `ptr`, given that `ptr` may be null or
/// refer to ill-formed UTF-8. Give the result a lifetime as if it were
/// borrowed from `_owner`.
///
/// Safety: if `ptr` is non-null, it must point to a null-terminated C
/// string that is safe to access for at least as long as the lifetime of
/// `_owner`.
unsafe fn char_ptr_to_str<T>(_owner: &T, ptr: *const c_char) -> Option<&str> {
    if ptr.is_null() {
        return None;
    } else {
        CStr::from_ptr(ptr).to_str().ok()
    }
}

The _owner parameter’s value is never used, but its lifetime is. Making the lifetimes in this function’s signature explicit gives us:

fn char_ptr_to_str<'o, T: 'o>(_owner: &'o T, ptr: *const c_char)
    -> Option<&'o str>

The CStr::from_ptr function returns a &CStr whose lifetime is completely unbounded, since it was borrowed from a dereferenced raw pointer. Unbounded lifetimes are almost always inaccurate, so it’s good to constrain them as soon as possible. Including the _owner parameter causes Rust to attribute its lifetime to the return value’s type, so callers can receive a more accurately bounded reference.
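The following sketch exercises the same helper against an owner written in Rust, so it can be run without libgit2. `Record` is an invented type that owns a CString; its methods hand the helper either a pointer into that string or a null pointer, and the returned &str is bounded by the borrow of the `Record`.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;
use std::ptr;

/// Same idea as the chapter's helper, with lifetimes written out.
///
/// Safety: if `ptr` is non-null, it must point to a null-terminated C
/// string that stays valid for at least the lifetime `'o`.
unsafe fn char_ptr_to_str<'o, T: 'o>(_owner: &'o T, ptr: *const c_char)
    -> Option<&'o str>
{
    if ptr.is_null() {
        None
    } else {
        // `from_ptr` yields an unbounded lifetime; the return type of
        // this function pins it down to `'o`.
        unsafe { CStr::from_ptr(ptr) }.to_str().ok()
    }
}

/// Invented owner of some C-style text.
pub struct Record {
    text: CString,
}

impl Record {
    pub fn new(text: &str) -> Record {
        Record { text: CString::new(text).expect("no interior NUL bytes") }
    }

    /// Borrow the text; the result cannot outlive `self`.
    pub fn text(&self) -> Option<&str> {
        unsafe { char_ptr_to_str(self, self.text.as_ptr()) }
    }

    /// Simulate a field the C side left null.
    pub fn missing(&self) -> Option<&str> {
        unsafe { char_ptr_to_str(self, ptr::null()) }
    }
}
```

Because the result is tied to `&self`, code that tried to keep the string after dropping the `Record` would fail to compile rather than read freed memory.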
It is not clear from the libgit2 documentation whether a git_signature’s name and email pointers can be null, despite the documentation for libgit2 being quite good. Your authors dug around in the source code for some time without being able to persuade themselves one way or the other and finally decided that char_ptr_to_str had better be prepared for null pointers just in case. In Rust, this sort of question is answered immediately by the type: if it’s &str, you can count on the string to be there; if it’s Option<&str>, it’s optional.
Finally, we’ve provided safe interfaces for all the functionality we need. The new main function in src/main.rs is slimmed down quite a bit and looks like real Rust code:

fn main() {
    let path = std::env::args_os().skip(1).next()
        .expect("usage: git-toy PATH");

    let repo = git::Repository::open(&path)
        .expect("opening repository");

    let commit_oid = repo.reference_name_to_id("HEAD")
        .expect("looking up 'HEAD' reference");

    let commit = repo.find_commit(&commit_oid)
        .expect("looking up commit");

    let author = commit.author();
    println!("{} <{}>\n",
             author.name().unwrap_or("(none)"),
             author.email().unwrap_or("none"));

    println!("{}", commit.message().unwrap_or("(none)"));
}

In this chapter, we’ve gone from simplistic interfaces that don’t provide many safety guarantees to a safe API wrapping an inherently unsafe API by arranging for any violation of the latter’s contract to be a Rust type error. The result is an interface that Rust can ensure you use correctly. For the most part, the rules we’ve made Rust enforce are the sorts of rules that C and C++ programmers end up imposing on themselves anyway. What makes Rust feel so much stricter than C and C++ is not that the rules are so foreign, but that this enforcement is mechanical and comprehensive.
Rust is not a simple language. Its goal is to span two very different worlds. It’s a modern programming language, safe by design, with conveniences like closures and iterators, yet it aims to put you in control of the raw capabilities of the machine it runs on, with minimal run-time overhead.
The contours of the language are determined by these goals. Rust manages to bridge most of the gap with safe code. Its borrow checker and zero-cost abstractions put you as close to the bare metal as possible without risking undefined behavior. When that’s not enough or when you want to leverage existing C code, unsafe code and the foreign function interface stand ready. But again, the language doesn’t just offer you these unsafe features and wish you luck. The goal is always to use unsafe features to build safe APIs. That’s what we did with libgit2. It’s also what the Rust team has done with Box, Vec, the other collections, channels, and more: the standard library is full of safe abstractions, implemented with some unsafe code behind the scenes.
A language with Rust’s ambitions was, perhaps, not destined to be the simplest of tools. But Rust is safe, fast, concurrent—and effective. Use it to build large, fast, secure, robust systems that take advantage of the full power of the hardware they run on. Use it to make software better.