Chapter 2. Test for Echo

By the time you get this note / We’ll no longer be alive /
We’ll have all gone up in smoke / There’ll be no way to reply

They Might Be Giants, “By the Time You Get This” (2018)

In Chapter 1, you wrote three programs—hello, true, and false—that take no arguments and always produce the same output. In this chapter, I’ll show you how to use arguments from the command line to change the behavior of the program at runtime. The challenge program you’ll write is a clone of echo, which will print its arguments on the command line, optionally terminated with a newline.

In this chapter, you’ll learn how to do the following:

Process command-line arguments with the clap crate
Use Rust types like strings, vectors, slices, and the unit type
Use expressions like match, if, and return
Use Option to represent an optional value
Handle errors using the Result variants of Ok and Err
Understand the difference between stack and heap memory
Test for text that is printed to STDOUT and STDERR

How echo Works

In each chapter, you will be writing a Rust version of an existing command-line tool, so I will begin each chapter by describing how the tool works so that you understand what you’ll be creating. The features I describe are also the substance of the test suite I provide. For this challenge, you will create a Rust version of the echo program, which is blissfully simple. To start, echo will print its arguments to STDOUT:

$ echo Hello
Hello

I’m using the bash shell, which assumes that any number of spaces delimit the arguments, so arguments that have spaces must be enclosed in quotes. In the following command, I’m providing four words as a single argument:

$ echo "Rust has assumed control"
Rust has assumed control

Without the quotes, I’m providing four separate arguments. Note that I can use a varying number of spaces when I provide the arguments, but echo prints them using a single space between each argument:

$ echo Rust  has assumed   control
Rust has assumed control

If I want the spaces to be preserved, I must enclose them in quotes:

$ echo "Rust  has assumed   control"
Rust  has assumed   control

It’s extremely common—but not mandatory—for command-line programs to respond to the flags -h or --help to print a helpful usage statement. If I try that with echo, it will simply print the flag:

$ echo --help
--help

Instead, I can read the manual page for echo by executing man echo. You’ll see that I’m using the BSD version of the program from 2003:

ECHO(1)                   BSD General Commands Manual                  ECHO(1)

NAME
     echo -- write arguments to the standard output

SYNOPSIS
     echo [-n] [string ...]

DESCRIPTION
     The echo utility writes any specified operands, separated by single blank
     (' ') characters and followed by a newline ('\n') character, to the stan-
     dard output.

     The following option is available:

     -n    Do not print the trailing newline character.  This may also be
           achieved by appending '\c' to the end of the string, as is done by
           iBCS2 compatible systems.  Note that this option as well as the
           effect of '\c' are implementation-defined in IEEE Std 1003.1-2001
           (''POSIX.1'') as amended by Cor. 1-2002.  Applications aiming for
           maximum portability are strongly encouraged to use printf(1) to
           suppress the newline character.

     Some shells may provide a builtin echo command which is similar or iden-
     tical to this utility.  Most notably, the builtin echo in sh(1) does not
     accept the -n option.  Consult the builtin(1) manual page.

EXIT STATUS
     The echo utility exits 0 on success, and >0 if an error occurs.

SEE ALSO
     builtin(1), csh(1), printf(1), sh(1)

STANDARDS
     The echo utility conforms to IEEE Std 1003.1-2001 (''POSIX.1'') as
     amended by Cor. 1-2002.

BSD                             April 12, 2003                             BSD

By default, the text that echo prints on the command line is terminated by a newline character. As shown in the preceding manual page, the program has a single -n option to omit the final newline. Depending on the version of echo you have, this may not appear to affect the output. For instance, the BSD version I’m using shows this:

$ echo -n Hello
Hello
$

The BSD echo shows my command prompt, $, on the next line.

The GNU version on Linux shows this:

$ echo -n Hello
Hello$

The GNU echo shows my command prompt immediately after Hello.

Regardless of which version of echo you have, you can use the bash redirect operator > to send STDOUT to a file:

$ echo Hello > hello
$ echo -n Hello > hello-n

The diff tool will display the differences between two files. This output shows that the second file (hello-n) does not have a newline at the end:

$ diff hello hello-n
1c1
< Hello
---
> Hello
\ No newline at end of file

Getting Started

This challenge program will be called echor, for echo plus r for Rust. (I can’t decide if I pronounce this like eh-core or eh-koh-ar.) Change into the directory for your solutions and start a new project using Cargo:

$ cargo new echor
     Created binary (application) `echor` package

Change into the new directory to see a familiar structure:

$ cd echor
$ tree
.
├── Cargo.toml
└── src
    └── main.rs

Use Cargo to run the program:

$ cargo run
Hello, world!

The default program always prints “Hello, world!”

You’ve already seen this source code in Chapter 1, but I’d like to point out a couple more things about the code in src/main.rs:

fn main() {
    println!("Hello, world!");
}



As you saw in Chapter 1, Rust will start the program by executing the main function in src/main.rs.
All functions return a value, and the return type may be indicated with an arrow and the type, such as -> u32 to say the function returns an unsigned 32-bit integer.
The lack of any return type for main implies that the function returns what Rust calls the unit type.
Also, note that the println! macro will automatically append a newline to the output, which is a feature you’ll need to control when the user requests no terminating newline.

Note
The unit type is like an empty value and is signified with a set of empty parentheses: (). The documentation says this “is used when there is no other meaningful value that could be returned.” It’s not quite like a null pointer or undefined value in other languages, a concept first introduced by Tony Hoare (no relation to Rust creator Graydon Hoare), who called the null reference his “billion-dollar mistake.” Since Rust does not (normally) allow you to dereference a null pointer, it must logically be worth at least a billion dollars.
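To make the unit type concrete, here is a minimal sketch. The function names double, greet, and greet_explicit are hypothetical and exist only for this illustration; they are not part of echor.

```rust
/// Returns a value, so the signature names the type after the arrow.
fn double(n: u32) -> u32 {
    n * 2 // no semicolon: the last expression is the return value
}

/// No arrow and no return type, so this function returns the unit type `()`.
fn greet(name: &str) {
    println!("Hello, {}!", name);
}

/// Equivalent to `greet`, with the unit return type spelled out.
fn greet_explicit(name: &str) -> () {
    println!("Hello, {}!", name);
}
```

Writing -> () is legal but rarely done; leaving the return type off, as main does, is the usual style.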











Accessing the Command-Line Arguments

The first order of business is getting the command-line arguments to print.
In Rust you can use std::env::args for this.
In Chapter 1, you used the std::process module to handle external processes.
Here, you’ll use std::env to interact with the environment, which is where the program will find the arguments.
If you look at the documentation for the function, you’ll see it returns something of the type Args:


pub fn args() -> Args


If you go to the link for the Args documentation, you’ll find it is a struct, which is a kind of data structure in Rust.
If you look along the lefthand side of the page, you’ll see things like trait implementations, other related structs, functions, and more.
We’ll explore these ideas later, but for now, just poke around the docs and try to absorb what you see.


Edit src/main.rs to print the arguments.
You can call the function by using the full path followed by an empty set of parentheses:

fn main() {
    println!(std::env::args()); // This will not work
}


Execute the program using cargo run, and you should see the following error:

error: format argument must be a string literal
 --> src/main.rs:2:14
  |
2 |     println!(std::env::args()); // This will not work
  |              ^^^^^^^^^^^^^^^^
  |
help: you might be missing a string literal to format with
  |
2 |     println!("{}", std::env::args()); // This will not work
  |              +++++

error: could not compile `echor` due to previous error

Here is your first spat with the compiler.
It’s saying that you cannot directly print the value that is returned from that function, but it’s also suggesting how to fix the problem.
It wants you to first provide a literal string that has a set of curly braces ({}) that will serve as a placeholder for the printed value, so change the code accordingly:


fn main() {
    println!("{}", std::env::args()); // This will not work either
}


Run the program again and see that you’re not out of the woods, because there is another compiler error.
Note that I omit the “compiling” and other lines to focus on the important output:


$ cargo run
error[E0277]: `Args` doesn't implement `std::fmt::Display`
 --> src/main.rs:2:20
  |
2 |     println!("{}", std::env::args()); // This will not work either
  |                    ^^^^^^^^^^^^^^^^ `Args` cannot be formatted with
  |                                     the default formatter
  |
  = help: the trait `std::fmt::Display` is not implemented for `Args`
  = note: in format strings you may be able to use `{:?}` (or {:#?} for
    pretty-print) instead
  = note: this error originates in the macro `$crate::format_args_nl`
    (in Nightly builds, run with -Z macro-backtrace for more info)

There’s a lot of information in that compiler message.
First off, there’s something about the trait std::fmt::Display not being implemented for Args.
A trait in Rust is a way to define the behavior of an object in an abstract way.
If an object implements the Display trait, then it can be formatted for user-facing output.
Look again at the “Trait Implementations” section of the Args documentation and notice that, indeed, Display is not mentioned there.


The compiler suggests you should use {:?} instead of {} for the placeholder.
This is an instruction to print a Debug version of the structure, which will format the output in a debugging context.
Refer again to the Args documentation to see that Debug is listed under “Trait Implementations.”
Change the code to the following:

fn main() {
    println!("{:?}", std::env::args()); // Success at last!
}


Now the program compiles and prints something vaguely useful:

$ cargo run
Args { inner: ["target/debug/echor"] }
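The difference between the two placeholders can be sketched with a small type of my own. The Greeting struct and the show function are hypothetical names used only for illustration: deriving Debug is what makes {:?} work, and implementing Display by hand is what makes {} work.

```rust
use std::fmt;

/// Hypothetical type used only for this illustration.
#[derive(Debug)] // the derived Debug implementation serves `{:?}`
struct Greeting {
    word: String,
}

// A hand-written Display implementation serves `{}`.
impl fmt::Display for Greeting {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}!", self.word)
    }
}

/// Returns the user-facing and the debugging views of the same value.
fn show(g: &Greeting) -> (String, String) {
    (format!("{}", g), format!("{:?}", g))
}
```

If the Display implementation were removed, format!("{}", g) would fail to compile with the same E0277 error shown earlier for Args.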

If you are unfamiliar with command-line arguments, it’s common for the first value to be the path of the program itself.
It’s not an argument per se, but it is useful information.
Let’s see what happens when I pass some arguments:

$ cargo run Hello world
Args { inner: ["target/debug/echor", "Hello", "world"] }

Huzzah!
It would appear that I’m able to get the arguments to my program.
I passed two arguments, Hello and world, and they showed up as additional values after the binary name.
I know I’ll need to pass the -n flag, so I’ll try that next:

$ cargo run Hello world -n
Args { inner: ["target/debug/echor", "Hello", "world", "-n"] }

It’s also common to place the flag before the values, so let me try that:

$ cargo run -n Hello world
error: Found argument '-n' which wasn't expected, or isn't valid in this context

USAGE:
    cargo run [OPTIONS] [--] [args]...

For more information try --help

That doesn’t work because Cargo thinks the -n argument is for itself, not the program I’m running.
To fix this, I need to separate Cargo’s options using two dashes:

$ cargo run -- -n Hello world
Args { inner: ["target/debug/echor", "-n", "Hello", "world"] }

In the parlance of command-line program parameters, the -n is an optional argument because you can leave it out.
Typically, program options start with one or two dashes.
It’s common to have short names with one dash and a single character, like -h for the help flag, and long names with two dashes and a word, like --help.
You will commonly see these concatenated like -h|--help to indicate one or the other.
The options -n and -h are often called flags because they don’t take a value.
Flags have one meaning when present and the opposite when absent.
In this case, -n says to omit the trailing newline; otherwise, print as normal.


All the other arguments to echo are positional because their position relative to the name of the program (the first element in the arguments) determines their meaning.
Consider the command chmod to change the mode of a file or directory.
It takes two positional arguments, a mode like 755 first and a file or directory name second.
In the case of echo, all the positional arguments are interpreted as the text to print, and they should be printed in the same order they are given.
This is not a bad start, but the arguments to the programs in this book are going to become much more complex.
We will need a more robust method for parsing the program’s arguments.
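To see why a more robust method is appealing, here is a rough sketch of parsing by hand. The parse_args function is hypothetical, and it assumes the first item is the program path, as it is with std::env::args.

```rust
/// Hypothetical helper: splits raw arguments into the `-n` flag and the text.
/// The first item is assumed to be the program path and is skipped.
fn parse_args<I>(args: I) -> (bool, Vec<String>)
where
    I: IntoIterator<Item = String>,
{
    let mut omit_newline = false;
    let mut text = Vec::new();

    // Skip the program path, then sort each argument into flag or text.
    for arg in args.into_iter().skip(1) {
        if arg == "-n" {
            omit_newline = true;
        } else {
            text.push(arg);
        }
    }

    (omit_newline, text)
}
```

In a program, this could be called as parse_args(std::env::args()). Even for one flag, the sketch has gaps: it offers no usage statement, accepts -n in any position, and does not complain about unknown flags or missing text.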
















Adding clap as a Dependency

Although there are various methods and crates for parsing command-line arguments, I will exclusively use the clap (command-line argument parser) crate in this book because it’s fairly simple and extremely effective.
To get started, I need to tell Cargo that I want to download this crate and use it in my project.
I can do this by adding it as a dependency to Cargo.toml, specifying the version:


[package]
name = "echor"
version = "0.1.0"
edition = "2021"

[dependencies]
clap = "2.33"
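
Note
The version “2.33” does not pin exactly that release. Cargo treats a bare version as a caret requirement, so “2.33” accepts any release from 2.33.0 up to, but not including, 3.0.0. I could use just “2” to indicate that I’m fine using the latest version in the major version “2.x” line, or “=2.33.3” to require exactly one version. There are many other ways to indicate the version, and I recommend you read about how to specify dependencies.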


The next time I try to build the program, Cargo will download the clap source code (if needed) and all of its dependencies.
For instance, I can run cargo build to just build the new binary and not run it:

$ cargo build
    Updating crates.io index
   Compiling libc v0.2.104
   Compiling unicode-width v0.1.9
   Compiling vec_map v0.8.2
   Compiling bitflags v1.3.2
   Compiling ansi_term v0.11.0
   Compiling strsim v0.8.0
   Compiling textwrap v0.11.0
   Compiling atty v0.2.14
   Compiling clap v2.33.3
   Compiling echor v0.1.0 (/Users/kyclark/work/cmdline-rust/playground/echor)
    Finished dev [unoptimized + debuginfo] target(s) in 12.66s

You may be curious where these packages went.
Cargo places the downloaded source code into .cargo in your home directory, and the build artifacts go into the target/debug/deps directory of the project.
This brings up an interesting part of building Rust projects: each program you build can use different versions of crates, and each program is built in a separate directory.
If you have ever suffered through using shared modules, as is common with Perl and Python, you’ll appreciate that you don’t have to worry about conflicts where one program requires some old obscure version and another requires the latest bleeding-edge version in GitHub.
Python, of course, offers virtual environments to combat this problem, and other languages have similar solutions.
Still, I find Rust’s approach to be quite comforting.


A consequence of Rust placing the dependencies into target is that this directory is now quite large.
You can use the disk usage command du -shc . to find that the project now weighs in at about 25 MB, and almost all of that lives in target.
If you run cargo help, you will see that the clean command will remove the target directory.
You might do this to reclaim disk space if you aren’t going to work on the project for a while, at the expense of having to recompile again in the future.
















Parsing Command-Line Arguments Using clap

To learn how to use clap to parse the arguments, you need to read the documentation, and I like to use Docs.rs for this.
After consulting the clap docs, I wrote the following version of src/main.rs that creates a new clap::App struct to parse the command-line arguments:


use clap::App; 

fn main() {
    let _matches = App::new("echor") 
        .version("0.1.0") 
        .author("Ken Youens-Clark <kyclark@gmail.com>") 
        .about("Rust echo") 
        .get_matches(); 
}


Import the clap::App struct.

Create a new App with the name echor.

Use semantic version information.

Include your name and email address so people know where to send the money.

This is a short description of the program.

Tell the App to parse the arguments.

Note
In the preceding code, the leading underscore in the variable name _matches is functional. It tells the Rust compiler that I do not intend to use this variable right now. Without the underscore, the compiler would warn about an unused variable.


With this code in place, I can run the echor program with the -h or --help flags to get a usage document.
Note that I didn’t have to define this argument, as clap did this for me:

$ cargo run -- -h
echor 0.1.0 
Ken Youens-Clark <kyclark@gmail.com> 
Rust echo 

USAGE:
    echor

FLAGS:
    -h, --help       Prints help information
    -V, --version    Prints version information


The app name and version number appear here.

Here is the author information.

This is the about text.


In addition to the help flags, I see that clap also automatically handles the flags -V and --version to print the program’s version:

$ cargo run -- --version
echor 0.1.0

Next, I need to define the parameters using clap::Arg.
To do this, I expand src/main.rs with the following code:

use clap::{App, Arg}; 

fn main() {
    let matches = App::new("echor")
        .version("0.1.0")
        .author("Ken Youens-Clark <kyclark@gmail.com>")
        .about("Rust echo")
        .arg(
            Arg::with_name("text") 
                .value_name("TEXT")
                .help("Input text")
                .required(true)
                .min_values(1),
        )
        .arg(
            Arg::with_name("omit_newline") 
                .short("n")
                .help("Do not print newline")
                .takes_value(false),
        )
        .get_matches();

    println!("{:#?}", matches); 
}


Import both the App and Arg structs from the clap crate.

Create a new Arg with the name text. This is a required positional argument that must appear at least once and can be repeated.

Create a new Arg with the name omit_newline. This is a flag that has only the short name -n and takes no value.

Pretty-print the arguments.

Note
Earlier I used {:?} to format the debug view of the arguments. Here I’m using {:#?} to include newlines and indentations to help me read the output. This is called pretty-printing because, well, it’s prettier.
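The two debug formats can be compared on a small value without involving clap. This is only a sketch, and debug_views is a hypothetical helper name.

```rust
/// Returns the compact (`{:?}`) and pretty (`{:#?}`) debug renderings
/// of the same slice of string slices.
fn debug_views(words: &[&str]) -> (String, String) {
    let compact = format!("{:?}", words);
    let pretty = format!("{:#?}", words);
    (compact, pretty)
}
```

For the words Hello and world, the compact view fits on one line, while the pretty view places each element on its own indented line with a trailing comma.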



If you request the usage again, you will see the new parameters:

$ cargo run -- --help
echor 0.1.0
Ken Youens-Clark <kyclark@gmail.com>
Rust echo

USAGE:
    echor [FLAGS] <TEXT>...

FLAGS:
    -h, --help       Prints help information
    -n               Do not print newline 
    -V, --version    Prints version information

ARGS:
    <TEXT>...    Input text 


The -n flag to omit the newline is optional.

The required input text is one or more positional arguments.


Run the program with some arguments and inspect the structure of the arguments:

$ cargo run -- -n Hello world
ArgMatches {
    args: {
        "text": MatchedArg {
            occurs: 2,
            indices: [
                2,
                3,
            ],
            vals: [
                "Hello",
                "world",
            ],
        },
        "omit_newline": MatchedArg {
            occurs: 1,
            indices: [
                1,
            ],
            vals: [],
        },
    },
    subcommand: None,
    usage: Some(
        "USAGE:\n    echor [FLAGS] <TEXT>...",
    ),
}

If you run the program with no arguments, you will get an error indicating that you failed to provide the required arguments:

$ cargo run
error: The following required arguments were not provided:
    <TEXT>...

USAGE:
    echor [FLAGS] <TEXT>...

For more information try --help

This was an error, and so you can inspect the exit value to verify that it’s not zero:

$ echo $?
1

If you try to provide any argument that isn’t defined, it will trigger an error and a nonzero exit value:

$ cargo run -- -x
error: Found argument '-x' which wasn't expected, or isn't valid in this context

USAGE:
    echor [FLAGS] <TEXT>...

For more information try --help

Note
You might wonder how this magical stuff is happening. Why is the program stopping and reporting these errors? If you read the documentation for App::get_matches, you’ll see that “upon a failed parse an error will be displayed to the user and the process will exit with the appropriate error code.”


There’s a subtle thing happening with the error messages.
When you use println!, the output appears on STDOUT, but the usage and error messages are all appearing on STDERR, which you first saw in Chapter 1.
To see this in the bash shell, you can run echor and redirect channel 1 (STDOUT) to a file called out and channel 2 (STDERR) to a file called err:

$ cargo run 1>out 2>err

You should see no output because it was all redirected to the out and err files.
The out file should be empty because there was nothing printed to STDOUT, but the err file should contain the output from Cargo and the error messages from the program:

$ cat err
    Finished dev [unoptimized + debuginfo] target(s) in 0.01s
     Running `target/debug/echor`
error: The following required arguments were not provided:
    <TEXT>...

USAGE:
    echor [FLAGS] <TEXT>...

For more information try --help

So you see that another hallmark of well-behaved command-line programs is to print regular output to STDOUT and error messages to STDERR.
Sometimes errors are severe enough that you should halt the program, but sometimes they should just be noted in the course of running.
For instance, in Chapter 3 you will write a program that processes input files, some of which will intentionally not exist or will be unreadable.
I will show you how to print warnings to STDERR about these files and skip to the next argument without halting.
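As a small sketch of keeping the two channels separate, the following hypothetical function writes regular output to one writer and complaints to another. The name echo_or_complain and the error text are my own inventions, not part of clap or echor.

```rust
use std::io::{self, Write};

/// Hypothetical helper: regular output goes to `out`, complaints go to `err`.
/// Returns the exit code the caller should use.
fn echo_or_complain(
    args: &[&str],
    out: &mut impl Write,
    err: &mut impl Write,
) -> io::Result<i32> {
    if args.is_empty() {
        writeln!(err, "error: no text provided")?;
        return Ok(1); // a nonzero code signals failure
    }
    writeln!(out, "{}", args.join(" "))?;
    Ok(0)
}
```

A real program would pass &mut io::stdout() and &mut io::stderr(). Accepting any Write implementation is a design choice that lets a test pass in Vec<u8> buffers and inspect what landed on each channel.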
















Creating the Program Output

Now that I’m able to parse the program’s arguments, the next step is to use these values to generate the same output as echo.
It’s common to copy the values out of the matches into variables.
To start, I want to extract the text argument.
Because this Arg was defined to accept one or more values, I can use either of these functions that return multiple values:


ArgMatches::values_of

Returns Option<Values>

ArgMatches::values_of_lossy

Returns Option<Vec<String>>



To decide which to use, I have to run down a few rabbit holes to understand the following concepts:

Option

A value that is either None or Some(T), where T is any type like a string or an integer. In the case of ArgMatches::values_of_lossy, the type T will be a vector of strings.


Values

An iterator for getting multiple values out of an argument.


Vec

A vector, which is a contiguous growable array type.


String

A string of characters.



Both of the functions ArgMatches::values_of and ArgMatches::values_of_lossy will return an Option of something.
Since I ultimately want to print the strings, I will use the ArgMatches::values_of_lossy function to get an Option<Vec<String>>.
The Option::unwrap function will take the value out of Some(T) to get at the payload T.
Because the text argument is required by clap, I know it will be impossible to have None; therefore, I can safely call Option::unwrap to get the Vec<String> value:


let text = matches.values_of_lossy("text").unwrap();

Warning
If you call Option::unwrap on a None, it will cause a panic that will crash your program. You should only call unwrap if you are positive the value is the Some variant.
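When I cannot be positive that a value is the Some variant, a match expression handles both cases without risking a panic. The following is a minimal sketch, and join_if_any is a hypothetical helper rather than anything provided by clap.

```rust
/// Hypothetical helper: joins the words if there are any.
fn join_if_any(words: Option<Vec<String>>) -> String {
    match words {
        Some(list) => list.join(" "),
        None => String::new(), // no panic: fall back to an empty string
    }
}
```

Falling back to an empty string is just one possible policy; another program might prefer to report an error when it finds None.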


The omit_newline argument is a bit easier, as it’s either present or not.
The type of this value will be a bool, or Boolean, which is either true or false:

let omit_newline = matches.is_present("omit_newline");


Finally, I want to print the values.
Because text is a vector of strings, I can use Vec::join to join all the strings on a single space into a new string to print.
Inside the echor program, clap will be creating the vector.
To demonstrate how Vec::join works, I’ll show you how to create a vector using the vec! macro:

let text = vec!["Hello", "world"];

Note
The values in Rust vectors must all be of the same type. Dynamic languages often allow lists to mix types like strings and numbers, but Rust will complain about “mismatched types.” Here I want a list of literal strings, which must be enclosed in double quotes. The str type in Rust represents a valid UTF-8 string. I’ll have more to say about UTF in Chapter 4.


Vec::join will insert the given string between all the elements of the vector to create a new string.
I can use println! to print the new string to STDOUT followed by a newline:

println!("{}", text.join(" "));


It’s common practice in Rust documentation to present facts using assert! to say that something is true or assert_eq! to demonstrate that one thing is equivalent to another.
In the following code, I can assert that the result of text.join(" ") is equal to the string "Hello world":


assert_eq!(text.join(" "), "Hello world");
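The join method is defined on slices, and a vector can be borrowed as a slice, which is why calling it on a Vec works. The following sketch uses a hypothetical function, join_words, that accepts a slice so that it works with a vector, an array, or part of either.

```rust
/// Accepts any slice of string slices, whether it was borrowed
/// from a vector, an array, or a range of either.
fn join_words(words: &[&str]) -> String {
    words.join(" ")
}
```

Given let v = vec!["Hello", "world"], all of join_words(&v), join_words(&v[1..]), and join_words(&["Rust"]) are valid calls.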


When the -n flag is present, the output should omit the newline.
I will instead use the print! macro, which does not add a newline, and I will choose to add either a newline or the empty string depending on the value of omit_newline.
You might expect me to write something like this:

fn main() {
    let matches = ...; // Same as before
    let text = matches.values_of_lossy("text").unwrap();
    let omit_newline = matches.is_present("omit_newline");

    let ending = "\n"; 
    if omit_newline {
        ending = ""; // This will not work 
    }
    print!("{}{}", text.join(" "), ending); 
}


Assume a default value of the newline.

Change the value to the empty string if the newline should be omitted.

Use print!, which will not add a newline to the output.


But if I try to run this code, Rust complains that I cannot reassign the value of ending:

$ cargo run -- Hello world
error[E0384]: cannot assign twice to immutable variable `ending`
  --> src/main.rs:27:9
   |
25 |     let ending = "\n";
   |         ------
   |         |
   |         first assignment to `ending`
   |         help: make this binding mutable: `mut ending`
26 |     if omit_newline {
27 |         ending = ""; // This will not work
   |         ^^^^^^^^^^^ cannot assign twice to immutable variable

As you saw in Chapter 1, Rust variables are immutable by default.
The compiler suggests adding mut to make the ending variable mutable to fix this error:


fn main() {
    let matches = ...; // Same as before
    let text = matches.values_of_lossy("text").unwrap();
    let omit_newline = matches.is_present("omit_newline");

    let mut ending = "\n"; 
    if omit_newline {
        ending = "";
    }
    print!("{}{}", text.join(" "), ending);
}


Add mut to make this a mutable value.


There’s a much better way to write this.
In Rust, if is an expression, not a statement as it is in languages like C and Java.1
An expression returns a value, but a statement does not.
Here’s a more Rustic way to write this:


let ending = if omit_newline { "" } else { "\n" };

Note
An if without an else will return the unit type. The same is true for a function without a return type, so the main function in this program returns the unit type.
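Because match is also an expression, the same choice can be written either way. This sketch wraps each form in a hypothetical function, ending_if and ending_match, so the results can be compared.

```rust
/// Picks the line ending with an `if` expression.
fn ending_if(omit_newline: bool) -> &'static str {
    if omit_newline { "" } else { "\n" }
}

/// The same choice written as a `match` expression.
fn ending_match(omit_newline: bool) -> &'static str {
    match omit_newline {
        true => "",
        false => "\n",
    }
}
```

Both arms of each expression must produce the same type, which here is a string slice.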



Since I use ending in only one place, I don’t need to assign it to a variable.
Here is the final way I would write the main function:

fn main() {
    let matches = ...; // Same as before
    let text = matches.values_of_lossy("text").unwrap();
    let omit_newline = matches.is_present("omit_newline");
    print!("{}{}", text.join(" "), if omit_newline { "" } else { "\n" });
}


With these changes, the program appears to work correctly; however, I’m not willing to stake my reputation on this.
I need to, as the Russian saying goes, “Доверяй, но проверяй” (“Trust, but verify”).2
This requires that I write some tests to run my program with various inputs and verify that it produces the same output as the original echo program.
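One way to make such verification easier, sketched here as an assumption rather than as the approach this chapter takes, is to build the output as a string in a separate function. The name format_output is hypothetical.

```rust
/// Hypothetical helper: builds the text that echor would print,
/// joining the arguments with single spaces and adding the
/// newline unless it was suppressed.
fn format_output(text: &[String], omit_newline: bool) -> String {
    format!("{}{}", text.join(" "), if omit_newline { "" } else { "\n" })
}
```

The main function could then call print!("{}", format_output(&text, omit_newline)), and the string could be checked with assert_eq! without running the whole program.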
















Writing Integration Tests

Again, we will use the assert_cmd crate for testing echor.
We’ll also use the predicates crate, as it will make writing some of the tests easier.
Update Cargo.toml with the following:


[package]
name = "echor"
version = "0.1.0"
edition = "2021"

[dependencies]
clap = "2.33"

[dev-dependencies]
assert_cmd = "2"
predicates = "2"

I often write tests that ensure my programs fail when run incorrectly.
For instance, this program ought to fail and print help documentation when provided no arguments.
Create the tests directory, and then start your tests/cli.rs with the following:


use assert_cmd::Command;
use predicates::prelude::*; 

#[test]
fn dies_no_args() {
    let mut cmd = Command::cargo_bin("echor").unwrap();
    cmd.assert() 
        .failure()
        .stderr(predicate::str::contains("USAGE"));
}


Import the predicates crate.

Run the program with no arguments and assert that it fails and prints a usage statement to STDERR.


Note
I often put the word dies somewhere in the test name to make it clear that the program is expected to fail under the given conditions. If I run cargo test dies, then Cargo will run all the tests with names containing the string dies.


Let’s also add a test to ensure the program exits successfully when provided an argument:

#[test]
fn runs() {
    let mut cmd = Command::cargo_bin("echor").unwrap();
    cmd.arg("hello").assert().success(); 
}


Run echor with the argument hello and verify it exits successfully.










Creating the Test Output Files

I can now run cargo test to verify that I have a program that runs, validates user input, and prints usage.
Next, I would like to ensure that the program creates the same output as echo.
To start, I need to capture the output from the original echo for various inputs so that I can compare these to the output from my program.
In the 02_echor directory of the GitHub repository for the book, you’ll find a bash script called mk-outs.sh that I used to generate the output from echo for various arguments.
You can see that, even with such a simple tool, there’s still a decent amount of cyclomatic complexity, which I’m using loosely here to mean the various ways all the parameters can be combined.
I need to check one or more text arguments both with and without the newline option:


$ cat mk-outs.sh
#!/usr/bin/env bash 

OUTDIR="tests/expected" 
[[ ! -d "$OUTDIR" ]] && mkdir -p "$OUTDIR" 

echo "Hello there" > $OUTDIR/hello1.txt 
echo "Hello"  "there" > $OUTDIR/hello2.txt 
echo -n "Hello  there" > $OUTDIR/hello1.n.txt 
echo -n "Hello" "there" > $OUTDIR/hello2.n.txt 


A special comment (aka a shebang) that tells the operating system to use the environment to execute bash for the following code.


Define a variable for the output directory.

Test if the output directory does not exist and create it if needed.

One argument with two words.

Two arguments separated by more than one space.

One argument with two spaces and no newline.

Two arguments with no newline.


If you are working on a Unix platform, you can copy this program to your project directory and run it like so:

$ bash mk-outs.sh

It’s also possible to execute the program directly, but you may need to execute chmod +x mk-outs.sh if you get a permission denied error:

$ ./mk-outs.sh

If this worked, you should now have a tests/expected directory with the following contents:

$ tree tests
tests
├── cli.rs
└── expected
    ├── hello1.n.txt
    ├── hello1.txt
    ├── hello2.n.txt
    └── hello2.txt

1 directory, 5 files

If you are working on a Windows platform, then I recommend you copy the directory and files into your project.















Comparing Program Output

Now that we have some test files, it’s time to compare the output from echor to the output from the original echo.
The first output file was generated with the input Hello there as a single string, and the output was captured into the file tests/expected/hello1.txt.
In the following test, I will run echor with the same argument and compare the output to the contents of that file.
I must add use std::fs to tests/cli.rs to bring in the standard filesystem module.
I replace the runs function with the following:


#[test]
fn hello1() {
    let outfile = "tests/expected/hello1.txt"; 
    let expected = fs::read_to_string(outfile).unwrap(); 
    let mut cmd = Command::cargo_bin("echor").unwrap(); 
    cmd.arg("Hello there").assert().success().stdout(expected); 
}


This is the output from echo generated by mk-outs.sh.

Use fs::read_to_string to read the contents of the file. This returns a Result that might contain a string if all goes well. Use the Result::unwrap method with the assumption that this will work.


Create a Command to run echor in the current crate.

Run the program with the given argument and assert it finishes successfully and that STDOUT is the expected value.

Warning
Using fs::read_to_string is a convenient way to read a file into memory, but it’s also an easy way to crash your program—and possibly your computer—if you happen to read a file that exceeds your available memory. You should only use this function with small files. As Ted Nelson says, “The good news about computers is that they do what you tell them to do. The bad news is that they do what you tell them to do.”
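For larger files, one alternative is to read through a buffer so that only a small part of the file is in memory at a time. The following is a minimal sketch of that idea, not code from echor. The count_lines function is a hypothetical helper I made up for illustration, and it uses match to handle each Result explicitly:

```rust
use std::fs::File;
use std::io::{self, BufRead, BufReader};

// Hypothetical helper: count the lines in a file without
// reading the whole file into a single string.
fn count_lines(path: &str) -> io::Result<usize> {
    // Opening the file may fail, so handle both Result variants.
    let file = match File::open(path) {
        Ok(file) => file,
        Err(e) => return Err(e),
    };

    let mut count = 0;
    // BufReader reads the file in chunks, and lines() yields
    // one line at a time, each wrapped in a Result.
    for line in BufReader::new(file).lines() {
        match line {
            Ok(_) => count += 1,
            Err(e) => return Err(e),
        }
    }
    Ok(count)
}

fn main() {
    match count_lines("tests/expected/hello1.txt") {
        Ok(n) => println!("{} line(s)", n),
        Err(e) => eprintln!("Could not read file: {}", e),
    }
}
```

For the tiny files in tests/expected, fs::read_to_string remains the simpler choice.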



If I run cargo test now, I should see output from two tests (in no particular order):

running 2 tests
test hello1 ... ok
test dies_no_args ... ok


Using the Result Type

I’ve been using the Result::unwrap method in a way that assumes each fallible call will succeed.
For example, in the hello1 function, I assumed that the output file exists and can be opened and read into a string.
During my limited testing, this may be the case, but it’s dangerous to make such assumptions.
I should be more cautious, so I’m going to create a type alias called TestResult.
This will be a specific type of Result that is either an Ok that always contains the unit type or an Err that contains some boxed value that implements the std::error::Error trait:

type TestResult = Result<(), Box<dyn std::error::Error>>;


In the preceding code, Box indicates that the error will live inside a kind of pointer where the memory is dynamically allocated on the heap rather than the stack, and dyn indicates that the method calls on the std::error::Error trait are dynamically dispatched.
That’s really a lot of information, and I don’t blame you if your eyes glazed over.
In short, I’m saying that the Ok part of TestResult will only ever hold the unit type, and the Err part can hold anything that implements the std::error::Error trait.
These concepts are more thoroughly explained in Programming Rust.
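To make this a little more concrete, here is a small sketch, separate from the test suite, in which two functions fail with different concrete error types yet share the same TestResult return type. The names check_number and check_file are hypothetical helpers I invented for the illustration:

```rust
use std::error::Error;
use std::fs;

type TestResult = Result<(), Box<dyn Error>>;

// Hypothetical check: the text must parse as an integer.
fn check_number(text: &str) -> TestResult {
    match text.trim().parse::<i32>() {
        Ok(_) => Ok(()),
        // A std::num::ParseIntError implements Error, so it can be boxed.
        Err(e) => Err(Box::new(e)),
    }
}

// Hypothetical check: the file must be readable as a string.
fn check_file(path: &str) -> TestResult {
    match fs::read_to_string(path) {
        Ok(_) => Ok(()),
        // A std::io::Error is a different type but fits the same Err.
        Err(e) => Err(Box::new(e)),
    }
}

fn main() {
    let results = [
        check_number("42"),
        check_number("forty-two"),
        check_file("no-such-file.txt"),
    ];
    for result in results {
        match result {
            Ok(()) => println!("ok"),
            // Printing calls the Display code of whichever concrete
            // error is inside the Box (dynamic dispatch).
            Err(e) => println!("error: {}", e),
        }
    }
}
```

Because each error lives behind a Box pointer, the Err part always has the same size no matter which concrete error it points to.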

Stack and Heap Memory
Before programming in Rust, I’d only ever considered one amorphous idea of computer memory.
Having studiously avoided languages that required me to allocate and free memory, I was only vaguely aware of the efforts that dynamic languages make to hide these complexities from me.
In Rust, I’ve learned that not all memory is accessed in the same way.
First there is the stack, where items of known sizes are accessed in a particular order.
The classic analogy is to a stack of cafeteria trays where new items go on top and are taken back off the top in last-in, first-out (LIFO) order.
Items on the stack have a fixed, known size, making it possible for Rust to set aside a particular chunk of memory and find it quickly.


The other type of memory is the heap, where the sizes of the values may change over time.
For instance, the documentation for the Vec (vector) type describes this structure as a “contiguous growable array type.”
Growable is the key word here, as the number and sizes of the elements in a vector can change during the lifetime of the program.
Rust makes an initial estimation of the amount of memory it needs for the vector.
If the vector grows beyond the original allocation, Rust will find another chunk of memory to hold the data.
To find the memory where the data lives, Rust stores the memory address on the stack.
This is called a pointer because it points to the actual data, and so is also said to be a reference to the data.
Rust knows how to dereference a Box to find the data.
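The following sketch shows these ideas in a few lines. It is only an illustration: collect_words is a hypothetical helper, and the capacities it prints are an implementation detail that may differ between Rust versions:

```rust
// Hypothetical helper: copy the words into a new, growable vector.
fn collect_words(words: &[&str]) -> Vec<String> {
    // The Vec value itself (pointer, length, capacity) has a fixed
    // size, while its elements live on the heap.
    let mut out = Vec::new();
    for word in words {
        // Pushing may move the elements to a larger heap allocation.
        out.push(word.to_string());
        println!("len = {}, capacity = {}", out.len(), out.capacity());
    }
    out
}

fn main() {
    // An i32 has a fixed, known size and can live on the stack.
    let on_stack: i32 = 42;

    // Box::new places the value on the heap; the variable holds a pointer.
    let on_heap: Box<i32> = Box::new(42);

    // Dereference the Box with * to reach the value on the heap.
    assert_eq!(on_stack, *on_heap);

    let words = collect_words(&["Rust", "has", "assumed", "control"]);
    println!("{:?}", words);
}
```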



Up to this point, my test functions have returned the unit type.
Now they will return a TestResult, changing my test code in some subtle ways.
Previously I used Result::unwrap to unpack Ok values and panic in the event of an Err, causing the test to fail.
In the following code, I replace unwrap with the ? operator to either unpack an Ok value or propagate the Err value to the return type.
That is, this will cause the function to return the Err variant of Result to the caller, which will in turn cause the test to fail.
If all the code in a test function runs successfully, I return Ok containing the unit type to indicate the test passes.
Note that while Rust does have the return keyword to return a value from a function, the idiom is to omit the semicolon from the last expression to implicitly return that result.
Update your tests/cli.rs to the following:


use assert_cmd::Command;
use predicates::prelude::*;
use std::fs;

type TestResult = Result<(), Box<dyn std::error::Error>>;

#[test]
fn dies_no_args() -> TestResult {
    let mut cmd = Command::cargo_bin("echor")?; 
    cmd.assert()
        .failure()
        .stderr(predicate::str::contains("USAGE"));
    Ok(()) 
}

#[test]
fn hello1() -> TestResult {
    let expected = fs::read_to_string("tests/expected/hello1.txt")?;
    let mut cmd = Command::cargo_bin("echor")?;
    cmd.arg("Hello there").assert().success().stdout(expected);
    Ok(())
}


Use ? instead of Result::unwrap to unpack an Ok value or propagate an Err.

Omit the final semicolon to return this value.
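If the ? operator feels like magic, it may help to see it next to a longhand version. The following sketch assumes nothing beyond the standard library. The two function names are hypothetical, and the match is only a rough approximation of what ? expands to:

```rust
use std::error::Error;
use std::fs;

type TestResult = Result<(), Box<dyn Error>>;

// With ?: unpack the Ok value or return the Err to the caller.
fn read_with_question_mark(path: &str) -> TestResult {
    let contents = fs::read_to_string(path)?;
    println!("read {} bytes", contents.len());
    Ok(())
}

// Roughly the same logic written with match and return.
fn read_with_match(path: &str) -> TestResult {
    let contents = match fs::read_to_string(path) {
        Ok(text) => text,
        // into() converts the std::io::Error into a Box<dyn Error>.
        Err(e) => return Err(e.into()),
    };
    println!("read {} bytes", contents.len());
    Ok(()) // no semicolon, so this is the return value
}

fn main() {
    let path = "tests/expected/hello1.txt";
    if let Err(e) = read_with_question_mark(path) {
        eprintln!("with ?: {}", e);
    }
    if let Err(e) = read_with_match(path) {
        eprintln!("with match: {}", e);
    }
}
```

Both functions should behave the same way for any path. The ? version simply says it with less ceremony.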


The next test passes two arguments, "Hello" and "there", and expects the program to print “Hello there”:

#[test]
fn hello2() -> TestResult {
    let expected = fs::read_to_string("tests/expected/hello2.txt")?;
    let mut cmd = Command::cargo_bin("echor")?;
    cmd.args(vec!["Hello", "there"]) 
        .assert()
        .success()
        .stdout(expected);
    Ok(())
}


Use the Command::args method to pass a vector of arguments rather than a single string value.



I have a total of four files to check, so it behooves me to write a helper function.
I’ll call it run and will pass it the argument strings along with the expected output file.
Rather than use vec! to create a vector for the arguments, I’m going to use a std::slice.
Slices are a bit like vectors in that they represent a list of values, but they cannot be resized after creation:

fn run(args: &[&str], expected_file: &str) -> TestResult { 
    let expected = fs::read_to_string(expected_file)?; 
    Command::cargo_bin("echor")? 
        .args(args)
        .assert()
        .success()
        .stdout(expected);
    Ok(()) 
}


The args will be a slice of &str values, and the expected_file will be a &str. The return value is a TestResult.

Try to read the contents of the expected_file into a string.

Attempt to run echor in the current crate with the given arguments and assert that STDOUT is the expected value.

If all the previous code worked, return Ok containing the unit type.

Note
You will find that Rust has many types of “string” variables. The type str is appropriate here for literal strings in the source code. The & shows that I intend only to borrow the string for a little while. I’ll have more to say about strings, borrowing, and ownership later.
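As a quick illustration of why a slice is a convenient parameter type, the following sketch passes an array, a vector, and part of a vector to the same function. The join_args function is a hypothetical helper and is not part of the test suite:

```rust
// Hypothetical helper that borrows a slice of string slices
// and joins the values on a single space.
fn join_args(args: &[&str]) -> String {
    args.join(" ")
}

fn main() {
    let array = ["Hello", "there"];
    let vector = vec!["Hello", "there"];

    // A reference to a fixed-size array can be used as a slice.
    println!("{}", join_args(&array));

    // A reference to a vector can also be used as a slice.
    println!("{}", join_args(&vector));

    // A range borrows just part of the vector as a slice.
    println!("{}", join_args(&vector[..1]));
}
```

The function only borrows the values, so array and vector are still usable after each call.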



Following is the final contents of tests/cli.rs showing how I use the helper function to run all four tests:

use assert_cmd::Command;
use predicates::prelude::*;
use std::fs;

type TestResult = Result<(), Box<dyn std::error::Error>>;

#[test]
fn dies_no_args() -> TestResult {
    Command::cargo_bin("echor")?
        .assert()
        .failure()
        .stderr(predicate::str::contains("USAGE"));
    Ok(())
}

fn run(args: &[&str], expected_file: &str) -> TestResult {
    let expected = fs::read_to_string(expected_file)?;
    Command::cargo_bin("echor")?
        .args(args)
        .assert()
        .success()
        .stdout(expected);
    Ok(())
}

#[test]
fn hello1() -> TestResult {
    run(&["Hello there"], "tests/expected/hello1.txt") 
}

#[test]
fn hello2() -> TestResult {
    run(&["Hello", "there"], "tests/expected/hello2.txt") 
}

#[test]
fn hello1_no_newline() -> TestResult {
    run(&["Hello  there", "-n"], "tests/expected/hello1.n.txt") 
}

#[test]
fn hello2_no_newline() -> TestResult {
    run(&["-n", "Hello", "there"], "tests/expected/hello2.n.txt") 
}


Run the program with a single string value as input. Note the lack of a terminating semicolon, as this function will return whatever the run function returns.


Run the program with two strings as input.

Run the program with a single string value as input and the -n flag to omit the newline. Note that there are two spaces between the words.

Run the program with two strings as input and the -n flag appearing first.


As you can see, I can write as many functions as I like in tests/cli.rs.
Only those marked with #[test] are run when testing.
If you run cargo test now, you should see five passing tests:

running 5 tests
test dies_no_args ... ok
test hello1 ... ok
test hello1_no_newline ... ok
test hello2_no_newline ... ok
test hello2 ... ok


Summary

Now you have written about 30 lines of Rust code in src/main.rs for the echor program and five tests in tests/cli.rs to verify that your program meets some measure of specification.
Consider what you’ve achieved:




You learned that basic program output is printed to STDOUT and errors should be printed to STDERR.


You’ve written a program that takes the options -h or --help to produce help, -V or --version to show the program’s version, and -n to omit a newline along with one or more positional command-line arguments.


You wrote a program that will print usage documentation when run with the wrong arguments or with the -h|--help flag.


You learned how to print all the positional command-line arguments joined on spaces.


You learned to use the print! macro to omit the trailing newline if the -n flag is present.


You can run integration tests to confirm that your program replicates the output from echo for at least four test cases covering one or two inputs both with and without the trailing newline.


You learned to use several Rust types, including the unit type, strings, vectors, slices, Option, and Result, as well as how to create a type alias for a specific type of Result called a TestResult.


You used a Box to create a smart pointer to heap memory. This required digging a bit into the differences between the stack—where variables have a fixed, known size and are accessed in LIFO order—and the heap—where variables are accessed through a pointer and their sizes may change during program execution.


You learned how to read the entire contents of a file into a string.


You learned how to execute an external command from within a Rust program, check the exit status, and verify the contents of both STDOUT and STDERR.



All this, and you’ve done it while writing in a language that simply will not allow you to make common mistakes that lead to buggy programs or security vulnerabilities.
Feel free to give yourself a little high five or enjoy a slightly evil mwuhaha chuckle as you consider how Rust will help you conquer the world.
Now that I’ve shown you how to organize and write tests and data, I’ll introduce the tests earlier in the next program so I can start using test-driven development, where I write the tests first and then write code to satisfy them.

¹ Python has both an if statement and an if expression.
² “Trust, but verify.” This rhymes in Russian and so sounds cooler than when Reagan used it in the 1980s during nuclear disarmament talks with the USSR.