Now I laugh and make a fortune / Off the same ones that I tortured
They Might Be Giants, “Kiss Me, Son of God” (1988)
In this chapter, you will create a Rust version of the fortune program that will print a randomly selected aphorism, bit of trivia, or interesting ASCII art[1] from a database of text files.
The program gets its name from a fortune cookie, a crisp cookie that contains a small piece of paper printed with a short bit of text that might be a fortune like “You will take a trip soon” or that might be a short joke or saying.
When I was first learning to use a Unix terminal in my undergraduate days,[2] a successful login would often include the output from fortune.
You will learn how to do the following:
Use the Path and PathBuf structs to represent system paths
Parse records of text spanning multiple lines from a file
Use randomness and control it with seeds
Use the OsStr and OsString types to represent filenames
I will start by describing how fortune works so you will have an idea of what your version will need to do.
You may first need to install the program,[3] as it is not installed by default on most systems.
Here’s a bit of the manual page, which you can read with man fortune:
NAME
       fortune - print a random, hopefully interesting, adage

SYNOPSIS
       fortune [-acefilosuw] [-n length] [ -m pattern] [[n%] file/dir/all]

DESCRIPTION
       When fortune is run with no arguments it prints out a random epigram.
       Epigrams are divided into several categories, where each category is
       sub-divided into those which are potentially offensive and those
       which are not.
The original program has many options, but the challenge program will be concerned only with the following:
-m pattern
       Print out all fortunes which match the basic regular expression
       pattern. The syntax of these expressions depends on how your system
       defines re_comp(3) or regcomp(3), but it should nevertheless be
       similar to the syntax used in grep(1).

       The fortunes are output to standard output, while the names of the
       file from which each fortune comes are printed to standard error.
       Either or both can be redirected; if standard output is redirected
       to a file, the result is a valid fortunes database file. If standard
       error is also redirected to this file, the result is still valid,
       but there will be ''bogus'' fortunes, i.e. the filenames themselves,
       in parentheses. This can be useful if you wish to remove the
       gathered matches from their original files, since each
       filename-record will precede the records from the file it names.

-i     Ignore case for -m patterns.
When the fortune program is run with no arguments, it will randomly choose and print some text:
$ fortune
Laughter is the closest distance between two people.
-- Victor Borge
Whence does this text originate? The manual page notes that you can supply one or more files or directories of the text sources. If no files are given, then the program will read from some default location. On my laptop, this is what the manual page says:
FILES
       Note: these are the defaults as defined at compile time.

       /opt/homebrew/Cellar/fortune/9708/share/games/fortunes
               Directory for inoffensive fortunes.
       /opt/homebrew/Cellar/fortune/9708/share/games/fortunes/off
               Directory for offensive fortunes.
I created a few representative files in the 12_fortuner/tests/inputs directory for testing purposes, along with an empty directory:
$ cd 12_fortuner
$ ls tests/inputs/
ascii-art   empty/      jokes       literature  quotes
Use head to look at the structure of a file.
A fortune record can span multiple lines and is terminated with a percent sign (%) on a line by itself:
$ head -n 9 tests/inputs/jokes
Q. What do you call a head of lettuce in a shirt and tie?
A. Collared greens.
%
Q: Why did the gardener quit his job?
A: His celery wasn't high enough.
%
Q. Why did the honeydew couple get married in a church?
A. Their parents told them they cantaloupe.
%
You can tell fortune to read a particular file like tests/inputs/ascii-art, but first you will need to use the program strfile to create index files for randomly selecting the text records.
I have provided a bash script called mk-dat.sh in the 12_fortuner directory that will index the files in the tests/inputs directory.
After running this program, each input file should have a companion file ending in .dat:
$ ls -1 tests/inputs/
ascii-art
ascii-art.dat
empty/
jokes
jokes.dat
literature
literature.dat
quotes
quotes.dat
Now you should be able to run the following command to, for instance, randomly select a bit of ASCII art. You may or may not see a cute frog:
$ fortune tests/inputs/ascii-art .--._.--. ( O O ) / . . \ .`._______.'. /( )\ _/ \ \ / / \_ .~ ` \ \ / / ' ~. { -. \ V / .- } _ _`. \ | | | / .'_ _ >_ _} | | | {_ _< /. - ~ ,_-' .^. `-_, ~ - .\ '-'|/ \|`-`
You can also supply the tests/inputs directory to tell fortune to select a record from any of the files therein:
$ fortune tests/inputs
A classic is something that everyone wants to have read and nobody wants to read.
-- Mark Twain, "The Disappearance of Literature"
If a provided path does not exist, fortune will immediately halt with an error.
Here I’ll use blargh for a nonexistent file:
$ fortune tests/inputs/jokes blargh tests/inputs/ascii-art
blargh: No such file or directory
Oddly, if the input source exists but is not readable, one version of fortune will complain that the file does not exist and produce no further output:
$ touch hammer && chmod 000 hammer
$ fortune hammer
hammer: No such file or directory
Another version explains that the file is not readable and informs the user that no fortunes were available for choosing:
$ fortune hammer
/home/u20/kyclark/hammer: Permission denied
No fortunes found
Using the -m option, I can search for all the text records matching a given string.
The output will include a header printed to STDERR listing the filename that contains the records followed by the records printed to STDOUT.
For instance, here are all the quotes by Yogi Berra:
$ fortune -m 'Yogi Berra' tests/inputs/
(quotes)
%
It's like deja vu all over again.
-- Yogi Berra
%
You can observe a lot just by watching.
-- Yogi Berra
%
If I search for Mark Twain and redirect both STDERR and STDOUT to files, I find that quotes of his are found in the literature and quotes files.
Note that the headers printed to STDERR include only the basename of the file, like literature, and not the full path, like tests/inputs/literature:
$ fortune -m 'Mark Twain' tests/inputs/ 1>out 2>err
$ cat err
(literature)
%
(quotes)
%
Searching is case-sensitive by default, so searching for lowercase yogi berra will return no results.
I must use the -i flag to perform case-insensitive matching:
$ fortune -i -m 'yogi berra' tests/inputs/
(quotes)
%
It's like deja vu all over again.
-- Yogi Berra
%
You can observe a lot just by watching.
-- Yogi Berra
%
While fortune can do a few more things, this is as much as the challenge program will re-create.
The challenge program for this chapter will be called fortuner (pronounced for-chu-ner) for a Rust version of fortune.
You should begin with cargo new fortuner, and then add the following dependencies to your Cargo.toml:
[dependencies]
clap = "2.33"
rand = "0.8"
walkdir = "2"
regex = "1"

[dev-dependencies]
assert_cmd = "2"
predicates = "2"
Copy the book’s 12_fortuner/tests directory into your project.
Run cargo test to build the program and run the tests, all of which should fail.
Update your src/main.rs to the following:
fn main() {
    if let Err(e) = fortuner::get_args().and_then(fortuner::run) {
        eprintln!("{}", e);
        std::process::exit(1);
    }
}
Start your src/lib.rs with the following code to define the program’s arguments:
use clap::{App, Arg};
use std::error::Error;
use regex::{Regex, RegexBuilder};

type MyResult<T> = Result<T, Box<dyn Error>>;

#[derive(Debug)]
pub struct Config {
    sources: Vec<String>,
    pattern: Option<Regex>,
    seed: Option<u64>,
}
The sources argument is a list of files or directories.
The pattern to filter fortunes is an optional regular expression.
The seed is an optional u64 value to control random selections.
As in Chapter 9, I use the -i|--insensitive flag with RegexBuilder, so you’ll note that my Config does not have a place for this flag.
You can start your get_args with the following:
pub fn get_args() -> MyResult<Config> {
    let matches = App::new("fortuner")
        .version("0.1.0")
        .author("Ken Youens-Clark <kyclark@gmail.com>")
        .about("Rust fortune")
        // What goes here?
        .get_matches();

    Ok(Config {
        sources: ...,
        seed: ...,
        pattern: ...,
    })
}
I suggest you start your run by printing the config:
pub fn run(config: Config) -> MyResult<()> {
    println!("{:#?}", config);
    Ok(())
}
Your program should be able to print a usage statement like the following:
$ cargo run -- -h
fortuner 0.1.0
Ken Youens-Clark <kyclark@gmail.com>
Rust fortune

USAGE:
    fortuner [FLAGS] [OPTIONS] <FILE>...

FLAGS:
    -h, --help           Prints help information
    -i, --insensitive    Case-insensitive pattern matching
    -V, --version        Prints version information

OPTIONS:
    -m, --pattern <PATTERN>    Pattern
    -s, --seed <SEED>          Random seed

ARGS:
    <FILE>...    Input files or directories
Unlike the original fortune, the challenge program will require one or more input files or directories.
When run with no arguments, it should halt and print the usage:
$ cargo run
error: The following required arguments were not provided:
    <FILE>...

USAGE:
    fortuner [FLAGS] [OPTIONS] <FILE>...
Verify that the arguments are parsed correctly:
$ cargo run -- ./tests/inputs -m 'Yogi Berra' -s 1
Config {
    sources: [
        "./tests/inputs",
    ],
    pattern: Some(
        Yogi Berra,
    ),
    seed: Some(
        1,
    ),
}
An invalid regular expression should be rejected at this point. As noted in Chapter 9, for instance, a lone asterisk is not a valid regex:
$ cargo run -- ./tests/inputs -m "*"
Invalid --pattern "*"
Likewise, any value for the --seed that cannot be parsed as a u64 should also be rejected:
$ cargo run -- ./tests/inputs -s blargh
"blargh" not a valid integer
This means you will once again need some way to parse and validate a command-line argument as an integer.
You’ve written functions like this in several previous chapters, but parse_positive_int from Chapter 4 is probably most similar to what you need.
In this case, however, 0 is an acceptable value.
You might start with this:
fn parse_u64(val: &str) -> MyResult<u64> {
    unimplemented!();
}
Add the following unit test to src/lib.rs:
#[cfg(test)]
mod tests {
    use super::parse_u64;

    #[test]
    fn test_parse_u64() {
        let res = parse_u64("a");
        assert!(res.is_err());
        assert_eq!(res.unwrap_err().to_string(), "\"a\" not a valid integer");

        let res = parse_u64("0");
        assert!(res.is_ok());
        assert_eq!(res.unwrap(), 0);

        let res = parse_u64("4");
        assert!(res.is_ok());
        assert_eq!(res.unwrap(), 4);
    }
}
Stop here and get your code working to this point. Be sure your program can pass cargo test parse_u64.
Here is how I wrote the parse_u64 function:
fn parse_u64(val: &str) -> MyResult<u64> {
    val.parse()
        .map_err(|_| format!("\"{}\" not a valid integer", val).into())
}
Parse the value as a u64, which Rust infers from the return type.
In the event of an error, create a useful error message using the given value.
Following is how I define the arguments in my get_args:
pub fn get_args() -> MyResult<Config> {
    let matches = App::new("fortuner")
        .version("0.1.0")
        .author("Ken Youens-Clark <kyclark@gmail.com>")
        .about("Rust fortune")
        .arg(
            Arg::with_name("sources")
                .value_name("FILE")
                .multiple(true)
                .required(true)
                .help("Input files or directories"),
        )
        .arg(
            Arg::with_name("pattern")
                .value_name("PATTERN")
                .short("m")
                .long("pattern")
                .help("Pattern"),
        )
        .arg(
            Arg::with_name("insensitive")
                .short("i")
                .long("insensitive")
                .help("Case-insensitive pattern matching")
                .takes_value(false),
        )
        .arg(
            Arg::with_name("seed")
                .value_name("SEED")
                .short("s")
                .long("seed")
                .help("Random seed"),
        )
        .get_matches();
I use the --insensitive flag with regex::RegexBuilder to create a regular expression that might be case-insensitive before returning the Config:
    let pattern = matches
        .value_of("pattern")
        .map(|val| {
            RegexBuilder::new(val)
                .case_insensitive(matches.is_present("insensitive"))
                .build()
                .map_err(|_| format!("Invalid --pattern \"{}\"", val))
        })
        .transpose()?;
Use Option::map to handle Some(val).
The RegexBuilder::case_insensitive method will cause the regex to disregard case in comparisons when the insensitive flag is present.
The RegexBuilder::build method will compile the regex.
If build returns an error, use Result::map_err to create an error message stating that the given pattern is invalid.
The result of Option::map will be an Option<Result>, and Option::transpose will turn this into a Result<Option>. Use ? to fail on an invalid regex.
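The flip from Option<Result> to Result<Option> can be seen in isolation with a small sketch that uses only the standard library. The helper name parse_optional is hypothetical and not part of the chapter's code; it stands in for any optional argument that needs validation:

```rust
use std::num::ParseIntError;

// Hypothetical helper: parse an optional string into an optional u64.
fn parse_optional(val: Option<&str>) -> Result<Option<u64>, ParseIntError> {
    // Option::map yields Option<Result<u64, _>>; transpose flips it into
    // Result<Option<u64>, _> so that a caller can apply `?` to the result.
    val.map(|v| v.parse::<u64>()).transpose()
}
```

A missing value becomes Ok(None) rather than an error, a parsable value becomes Ok(Some(n)), and only an unparsable value produces an Err.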
Finally, I return the Config:
    Ok(Config {
        sources: matches.values_of_lossy("sources").unwrap(),
        seed: matches.value_of("seed").map(parse_u64).transpose()?,
        pattern,
    })
}
You are free to write your solution however you see fit so long as it passes the integration tests.
This is a rather complicated program, so I’m going to break it into many small, testable functions to help you arrive at a solution.
If you want to follow my lead, then the next order of business is finding the input files from the given sources, which might be filenames or directories.
When a source is a directory, all the files in the directory will be used.
To read the fortune files, the fortune program requires the *.dat files created by strfile.
These are binary files that contain data for randomly accessing the records.
The challenge program will not use these and so should skip them, if present.
If you ran the mk-dat.sh program, you can either remove the *.dat files from tests/inputs or include logic in your program to skip them.
I decided to write a function to find all the files in a list of paths provided by the user.
While I could return the files as strings, I want to introduce you to a couple of useful structs Rust has for representing paths.
The first is Path, which, according to the documentation, “supports a number of operations for inspecting a path, including breaking the path into its components (separated by / on Unix and by either / or \ on Windows), extracting the file name, determining whether the path is absolute, and so on.”
That sounds really useful, so you might think my function should return the results as Path objects, but the documentation notes: “This is an unsized type, meaning that it must always be used behind a pointer like & or Box. For an owned version of this type, see PathBuf.”
This leads us to PathBuf, the second useful struct for representing paths.
Just as String is an owned, modifiable version of &str, PathBuf is an owned, modifiable version of Path.
Returning a Path from my function would lead to compiler errors, as my code would be trying to reference dropped values, but there will be no such problem returning a PathBuf.
You are not required to use either of these structs, but they will make your program portable across operating systems and will save you a lot of work that’s been done to parse paths correctly.
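The borrowed/owned split can be illustrated with a minimal sketch. The helpers in_dir and basename are hypothetical names chosen for this example; they are not part of the chapter's solution:

```rust
use std::path::{Path, PathBuf};

// Hypothetical helper: build an owned path by joining a directory and a name.
// Returning PathBuf works because the caller takes ownership of the value.
fn in_dir(dir: &str, name: &str) -> PathBuf {
    Path::new(dir).join(name)
}

// Hypothetical helper: borrow a path and report its final component, if any.
// to_string_lossy replaces any invalid UTF-8 rather than failing.
fn basename(path: &Path) -> Option<String> {
    path.file_name().map(|n| n.to_string_lossy().into_owned())
}
```

Calling basename on the result of in_dir("tests/inputs", "jokes") should give the filename jokes without the leading directories, which is the same kind of value the original fortune prints in its search headers.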
Following is the signature of my find_files function, which you are welcome to use.
Be sure to add use std::path::PathBuf to your imports:
fn find_files(paths: &[String]) -> MyResult<Vec<PathBuf>> {
    unimplemented!();
}
Here is a unit test called test_find_files that you can add to your tests module:
#[cfg(test)]
mod tests {
    use super::{find_files, parse_u64};

    #[test]
    fn test_parse_u64() {} // Same as before

    #[test]
    fn test_find_files() {
        // Verify that the function finds a file known to exist
        let res = find_files(&["./tests/inputs/jokes".to_string()]);
        assert!(res.is_ok());

        let files = res.unwrap();
        assert_eq!(files.len(), 1);
        assert_eq!(
            files.get(0).unwrap().to_string_lossy(),
            "./tests/inputs/jokes"
        );

        // Fails to find a bad file
        let res = find_files(&["/path/does/not/exist".to_string()]);
        assert!(res.is_err());

        // Finds all the input files, excludes ".dat"
        let res = find_files(&["./tests/inputs".to_string()]);
        assert!(res.is_ok());

        // Check number and order of files
        let files = res.unwrap();
        assert_eq!(files.len(), 5);
        let first = files.get(0).unwrap().display().to_string();
        assert!(first.contains("ascii-art"));
        let last = files.last().unwrap().display().to_string();
        assert!(last.contains("quotes"));

        // Test for multiple sources, path must be unique and sorted
        let res = find_files(&[
            "./tests/inputs/jokes".to_string(),
            "./tests/inputs/ascii-art".to_string(),
            "./tests/inputs/jokes".to_string(),
        ]);
        assert!(res.is_ok());
        let files = res.unwrap();
        assert_eq!(files.len(), 2);
        if let Some(filename) = files.first().unwrap().file_name() {
            assert_eq!(filename.to_string_lossy(), "ascii-art".to_string())
        }
        if let Some(filename) = files.last().unwrap().file_name() {
            assert_eq!(filename.to_string_lossy(), "jokes".to_string())
        }
    }
}
Add find_files to the imports.
The tests/inputs/empty directory contains the empty, hidden file .gitkeep so that Git will track this directory. If you choose to ignore empty files, you can change the expected number of files from five to four.
Note that the find_files function must return the paths in sorted order.
Different operating systems will return the files in different orders, which will lead to the fortunes being in different orders, leading to difficulties in testing.
You will nip the problem in the bud if you return the files in a consistent, sorted order.
Furthermore, the returned paths should be unique, and you can use a combination of Vec::sort and Vec::dedup for this.
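As a small illustration of that combination, here is a sketch that uses only the standard library. The function name sorted_unique is hypothetical:

```rust
use std::path::PathBuf;

// Hypothetical helper: return the paths sorted and with duplicates removed.
fn sorted_unique(mut paths: Vec<PathBuf>) -> Vec<PathBuf> {
    // Vec::dedup removes only adjacent duplicates, so sort first
    paths.sort();
    paths.dedup();
    paths
}
```

The order of the two calls matters: calling dedup on an unsorted vector would leave behind any repeated values that are not next to each other.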
Stop reading and write the function that will satisfy cargo test find_files.
Next, update your run function to print the found files:
pub fn run(config: Config) -> MyResult<()> {
    let files = find_files(&config.sources)?;
    println!("{:#?}", files);
    Ok(())
}
When given a list of existing, readable files, it should print them in order:
$ cargo run tests/inputs/jokes tests/inputs/ascii-art
[
    "tests/inputs/ascii-art",
    "tests/inputs/jokes",
]
Test your program to see if it will find the files (that don’t end with .dat) in the tests/inputs directory:
$ cargo run tests/inputs/
[
    "tests/inputs/ascii-art",
    "tests/inputs/empty/.gitkeep",
    "tests/inputs/jokes",
    "tests/inputs/literature",
    "tests/inputs/quotes",
]
Previous challenge programs in this book would note unreadable or nonexistent files and move on, but fortune dies immediately when given even one file it can’t use.
Be sure your program does the same if you provide an invalid file, such as the nonexistent blargh:
$ cargo run tests/inputs/jokes blargh tests/inputs/ascii-art
blargh: No such file or directory (os error 2)
Note that my version of find_files tries only to find files and does not try to open them, which means an unreadable file does not trigger a failure at this point:
$ touch hammer && chmod 000 hammer
$ cargo run -- hammer
[
    "hammer",
]
Once you have found the input files, the next step is to read the records of text from them.
I wrote a function that accepts the list of found files and possibly returns a list of the contained fortunes.
When the program is run with the -m option to find all the matching fortunes for a given pattern, I will need both the fortune text and the source filename, so I decided to create a struct called Fortune to contain these.
If you want to use this idea, add the following to src/lib.rs, perhaps just after the Config struct:
#[derive(Debug)]
struct Fortune {
    source: String,
    text: String,
}
The source is the filename containing the record.
The text is the contents of the record up to but not including the terminating percent sign (%).
My read_fortunes function accepts a list of input paths and possibly returns a vector of Fortune structs.
In the event of a problem such as an unreadable file, the function will return an error.
If you would like to write this function, here is the signature you can use:
fn read_fortunes(paths: &[PathBuf]) -> MyResult<Vec<Fortune>> {
    unimplemented!();
}
Following is a test_read_fortunes unit test you can add to the tests module:
#[cfg(test)]
mod tests {
    use super::{find_files, parse_u64, read_fortunes, Fortune};
    use std::path::PathBuf;

    #[test]
    fn test_parse_u64() {} // Same as before

    #[test]
    fn test_find_files() {} // Same as before

    #[test]
    fn test_read_fortunes() {
        // One input file
        let res = read_fortunes(&[PathBuf::from("./tests/inputs/jokes")]);
        assert!(res.is_ok());

        if let Ok(fortunes) = res {
            // Correct number and sorting
            assert_eq!(fortunes.len(), 6);
            assert_eq!(
                fortunes.first().unwrap().text,
                "Q. What do you call a head of lettuce in a shirt and tie?\n\
                A. Collared greens."
            );
            assert_eq!(
                fortunes.last().unwrap().text,
                "Q: What do you call a deer wearing an eye patch?\n\
                A: A bad idea (bad-eye deer)."
            );
        }

        // Multiple input files
        let res = read_fortunes(&[
            PathBuf::from("./tests/inputs/jokes"),
            PathBuf::from("./tests/inputs/quotes"),
        ]);
        assert!(res.is_ok());
        assert_eq!(res.unwrap().len(), 11);
    }
}
Import read_fortunes, Fortune, and PathBuf for testing.
The tests/inputs/jokes file contains an empty fortune that is expected to be removed.
Stop here and implement a version of the function that passes cargo test read_fortunes.
Update run to print, for instance, one of the found records:
pub fn run(config: Config) -> MyResult<()> {
    let files = find_files(&config.sources)?;
    let fortunes = read_fortunes(&files)?;
    println!("{:#?}", fortunes.last());
    Ok(())
}
When passed good input sources, the program should print a fortune like so:
$ cargo run tests/inputs
Some(
    Fortune {
        source: "quotes",
        text: "You can observe a lot just by watching.\n-- Yogi Berra",
    },
)
When provided an unreadable file, such as the previously created hammer file, the program should die with a useful error message:
$ cargo run hammer
hammer: Permission denied (os error 13)
The program will have two possible outputs.
When the user supplies a pattern, the program should print all the fortunes matching the pattern; otherwise, the program should randomly select one fortune to print.
For the latter option, I wrote a pick_fortune function that takes some fortunes and an optional seed and returns an optional string:
fn pick_fortune(fortunes: &[Fortune], seed: Option<u64>) -> Option<String> {
    unimplemented!();
}
My function uses the rand crate to select the fortune using a random number generator (RNG), as described earlier in the chapter.
When there is no seed value, I use rand::thread_rng to create an RNG that is seeded by the system.
When there is a seed value, I use rand::rngs::StdRng::seed_from_u64.
Finally, I use SliceRandom::choose with the RNG to select a fortune.
Following is how you can expand your tests module to include the test_pick_fortune unit test:
#[cfg(test)]
mod tests {
    use super::{
        find_files, parse_u64, pick_fortune, read_fortunes, Fortune,
    };
    use std::path::PathBuf;

    #[test]
    fn test_parse_u64() {} // Same as before

    #[test]
    fn test_find_files() {} // Same as before

    #[test]
    fn test_read_fortunes() {} // Same as before

    #[test]
    fn test_pick_fortune() {
        // Create a slice of fortunes
        let fortunes = &[
            Fortune {
                source: "fortunes".to_string(),
                text: "You cannot achieve the impossible without \
                      attempting the absurd."
                    .to_string(),
            },
            Fortune {
                source: "fortunes".to_string(),
                text: "Assumption is the mother of all screw-ups."
                    .to_string(),
            },
            Fortune {
                source: "fortunes".to_string(),
                text: "Neckties strangle clear thinking.".to_string(),
            },
        ];

        // Pick a fortune with a seed
        assert_eq!(
            pick_fortune(fortunes, Some(1)).unwrap(),
            "Neckties strangle clear thinking.".to_string()
        );
    }
}
Import the pick_fortune function for testing.
Supply a seed in order to verify that the pseudorandom selection is reproducible.
Stop reading and write the function that will pass cargo test pick_fortune.
You can integrate this function into your run like so:
pub fn run(config: Config) -> MyResult<()> {
    let files = find_files(&config.sources)?;
    let fortunes = read_fortunes(&files)?;
    println!("{:#?}", pick_fortune(&fortunes, config.seed));
    Ok(())
}
Run your program with no seed and revel in the ensuing chaos of randomness:
$ cargo run tests/inputs/
Some(
    "Q: Why did the gardener quit his job?\nA: His celery wasn't high enough.",
)
When provided a seed, the program should always select the same fortune:
$ cargo run tests/inputs/ -s 1
Some(
    "You can observe a lot just by watching.\n-- Yogi Berra",
)
The tests I wrote are predicated on the fortunes being in a particular order. I wrote find_files to return the files in sorted order, which means the list of fortunes passed to pick_fortune is ordered first by source filename and then by position inside the file. If you use a different data structure to represent the fortunes or parse them in a different order, then you’ll need to change the tests to reflect your decisions. The key is to find a way to make your pseudorandom choices predictable and testable.
You now have all the pieces for finishing the program.
The last step is to decide whether to print all the fortunes that match a given regular expression or to randomly select one fortune.
You can expand your run function like so:
pub fn run(config: Config) -> MyResult<()> {
    let files = find_files(&config.sources)?;
    let fortunes = read_fortunes(&files)?;
    if let Some(pattern) = config.pattern {
        for fortune in fortunes {
            // Print all the fortunes matching the pattern
        }
    } else {
        // Select and print one fortune
    }
    Ok(())
}
Remember that the program should let the user know when there are no fortunes, such as when using the tests/inputs/empty directory:
$ cargo run tests/inputs/empty
No fortunes found
That should be enough information for you to finish this program using the provided tests. This is a tough problem, but don’t give up.
For the following code, you will need to expand your src/lib.rs with the following imports and definitions:
use clap::{App, Arg};
use rand::prelude::SliceRandom;
use rand::{rngs::StdRng, SeedableRng};
use regex::{Regex, RegexBuilder};
use std::{
    error::Error,
    ffi::OsStr,
    fs::{self, File},
    io::{BufRead, BufReader},
    path::PathBuf,
};
use walkdir::WalkDir;

type MyResult<T> = Result<T, Box<dyn Error>>;

#[derive(Debug)]
pub struct Config {
    sources: Vec<String>,
    pattern: Option<Regex>,
    seed: Option<u64>,
}

#[derive(Debug)]
pub struct Fortune {
    source: String,
    text: String,
}
I’ll show you how I wrote each of the functions I described in the previous section, starting with the find_files function.
You will notice that it filters out files that have the extension .dat using the type OsStr, which is a Rust type for an operating system’s preferred representation of a string that might not be a valid UTF-8 string.
The type OsStr is borrowed, and the owned version is OsString.
These are similar to the Path and PathBuf distinctions.
Both versions encapsulate the complexities of dealing with filenames on both Windows and Unix platforms.
In the following code, you’ll see that I use Path::extension, which returns Option<&OsStr>:
fn find_files(paths: &[String]) -> MyResult<Vec<PathBuf>> {
    let dat = OsStr::new("dat");
    let mut files = vec![];

    for path in paths {
        match fs::metadata(path) {
            Err(e) => return Err(format!("{}: {}", path, e).into()),
            Ok(_) => files.extend(
                WalkDir::new(path)
                    .into_iter()
                    .filter_map(Result::ok)
                    .filter(|e| {
                        e.file_type().is_file()
                            && e.path().extension() != Some(dat)
                    })
                    .map(|e| e.path().into()),
            ),
        }
    }

    files.sort();
    files.dedup();
    Ok(files)
}
Create an OsStr value for the string dat.
Create a mutable vector for the results.
If fs::metadata fails, return a useful error message.
Use Vec::extend to add the results from WalkDir to the results.
Use walkdir::WalkDir to find all the entries from the starting path.
This will ignore any errors for unreadable files or directories, which is the behavior of the original program.
Take only regular files that do not have the .dat extension.
The walkdir::DirEntry::path function returns a Path, so convert it into a PathBuf.
Use Vec::sort to sort the entries in place.
Use Vec::dedup to remove consecutive repeated values.
Return the sorted, unique files.
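The extension comparison in the filter can be tried on its own, without walkdir. The following sketch uses only the standard library, and the name has_extension is hypothetical:

```rust
use std::ffi::OsStr;
use std::path::Path;

// Hypothetical helper: true when the path's extension equals `ext`.
fn has_extension(path: &Path, ext: &str) -> bool {
    // Path::extension returns Option<&OsStr>, so compare against an OsStr
    // value rather than converting the extension to a String first.
    path.extension() == Some(OsStr::new(ext))
}
```

According to the standard library documentation, a filename that begins with a dot and contains no other dots has no extension, so a hidden file such as .gitkeep is not mistaken for a file with the extension gitkeep.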
The files found by the preceding function are the inputs to the read_fortunes function:
fn read_fortunes(paths: &[PathBuf]) -> MyResult<Vec<Fortune>> {
    let mut fortunes = vec![];
    let mut buffer = vec![];

    for path in paths {
        let basename =
            path.file_name().unwrap().to_string_lossy().into_owned();
        let file = File::open(path).map_err(|e| {
            format!("{}: {}", path.to_string_lossy().into_owned(), e)
        })?;

        for line in BufReader::new(file).lines().filter_map(Result::ok) {
            if line == "%" {
                if !buffer.is_empty() {
                    fortunes.push(Fortune {
                        source: basename.clone(),
                        text: buffer.join("\n"),
                    });
                    buffer.clear();
                }
            } else {
                buffer.push(line.to_string());
            }
        }
    }

    Ok(fortunes)
}
Create mutable vectors for the fortunes and a record buffer.
Convert Path::file_name from OsStr to String, using the lossy version in case this is not valid UTF-8. The result is a clone-on-write smart pointer, so use Cow::into_owned to clone the data if it is not already owned.
Open the file or return an error message.
A sole percent sign (%) indicates the end of a record.
If the buffer is not empty, set the text to the buffer lines joined on newlines and then clear the buffer.
Otherwise, add the current line to the buffer.
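The buffering technique can be exercised without touching the filesystem. Here is a minimal sketch that accepts any BufRead, such as a byte slice; the name parse_records and the choice to return plain strings instead of Fortune structs are assumptions made for this example:

```rust
use std::io::BufRead;

// Hypothetical helper: split a line-oriented reader into records that end
// with a lone "%" line. Empty records are skipped.
fn parse_records(reader: impl BufRead) -> Vec<String> {
    let mut records = vec![];
    let mut buffer: Vec<String> = vec![];

    for line in reader.lines().filter_map(Result::ok) {
        if line == "%" {
            // End of a record: keep it only if something was collected
            if !buffer.is_empty() {
                records.push(buffer.join("\n"));
                buffer.clear();
            }
        } else {
            buffer.push(line);
        }
    }

    // Assumption: lines after the final "%" do not form a complete record
    // and are dropped when the function returns.
    records
}
```

Because a byte slice implements BufRead, a call like parse_records("one\ntwo\n%\n".as_bytes()) is enough to try the logic in a unit test.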
Here is how I wrote the pick_fortune function:
fn pick_fortune(fortunes: &[Fortune], seed: Option<u64>) -> Option<String> {
    if let Some(val) = seed {
        let mut rng = StdRng::seed_from_u64(val);
        fortunes.choose(&mut rng).map(|f| f.text.to_string())
    } else {
        let mut rng = rand::thread_rng();
        fortunes.choose(&mut rng).map(|f| f.text.to_string())
    }
}
Check if the user has supplied a seed.
If so, create a PRNG using the provided seed.
Use the PRNG to select one of the fortunes.
Otherwise, use a PRNG seeded by the system.
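To see why a seed makes the choice reproducible, it may help to look at a toy generator written with only the standard library. This sketch uses the SplitMix64 mixing function as a stand-in; it is not the algorithm behind rand::rngs::StdRng, and the names ToyRng and pick are hypothetical:

```rust
// A toy generator (SplitMix64), standing in for a real seeded PRNG.
struct ToyRng(u64);

impl ToyRng {
    // Advance the state and return the next pseudorandom value.
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

// Hypothetical helper: pick one item using a generator seeded with `seed`.
fn pick<'a>(items: &[&'a str], seed: u64) -> Option<&'a str> {
    if items.is_empty() {
        return None;
    }
    let mut rng = ToyRng(seed);
    // The modulo introduces a slight bias, acceptable for an illustration
    let idx = (rng.next_u64() % items.len() as u64) as usize;
    Some(items[idx])
}
```

The generator's output depends only on its starting state, so the same seed and the same list always give the same selection. That property is what lets a test assert on a specific fortune.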
I can bring all these ideas together in my run like so:
pub fn run(config: Config) -> MyResult<()> {
    let files = find_files(&config.sources)?;
    let fortunes = read_fortunes(&files)?;

    if let Some(pattern) = config.pattern {
        let mut prev_source = None;
        for fortune in fortunes
            .iter()
            .filter(|fortune| pattern.is_match(&fortune.text))
        {
            if prev_source.as_ref().map_or(true, |s| s != &fortune.source) {
                eprintln!("({})\n%", fortune.source);
                prev_source = Some(fortune.source.clone());
            }
            println!("{}\n%", fortune.text);
        }
    } else {
        println!(
            "{}",
            pick_fortune(&fortunes, config.seed)
                .or_else(|| Some("No fortunes found".to_string()))
                .unwrap()
        );
    }

    Ok(())
}
Initialize a mutable variable to remember the last fortune source.
Iterate over the found fortunes and filter for those matching the provided regular expression.
Print the source header if the current source is not the same as the previous one seen.
Store the current fortune source.
Print the text of the fortune.
Print a random fortune or a message that states that there are no fortunes to be found.
The fortunes are stored with embedded newlines that may cause the regular expression matching to fail if the sought-after phrase spans multiple lines. This mimics how the original fortune works but may not match the expectations of the user.
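The header logic in the search branch can be checked separately from printing. The sketch below collects the output lines into a vector instead of writing headers to STDERR and records to STDOUT, and it uses a plain substring test as a stand-in for Regex::is_match. The names Record and render_matches are hypothetical:

```rust
// Hypothetical record type mirroring the chapter's Fortune struct.
struct Record {
    source: String,
    text: String,
}

// Hypothetical helper: collect what a search would emit, adding a
// "(source)" header each time the source of a matching record changes.
fn render_matches(records: &[Record], needle: &str) -> Vec<String> {
    let mut out = vec![];
    let mut prev_source: Option<&str> = None;

    for rec in records.iter().filter(|r| r.text.contains(needle)) {
        // Emit the header only when this source differs from the last one
        if prev_source != Some(rec.source.as_str()) {
            out.push(format!("({})\n%", rec.source));
            prev_source = Some(rec.source.as_str());
        }
        out.push(format!("{}\n%", rec.text));
    }
    out
}
```

Borrowing the previous source as a &str avoids the clone used in run, which works here because the records outlive the loop.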
At this point, the program passes all the provided tests. I provided more guidance on this challenge because of the many steps involved in finding and reading files and then printing all the matching records or using a PRNG to randomly select one. I hope you enjoyed that as much as I did.
Read the fortune manual page to learn about other options your program can implement.
For instance, you could add the -n length option to restrict fortunes to those shorter than the given length.
Knowing the lengths of the fortunes would be handy for implementing the -s option, which picks only short fortunes.
As noted in the final solution, the regular expression matching may fail because of the embedded newlines in the fortunes.
Can you find a way around this limitation?
Randomness is a key aspect to many games that you could try to write. Perhaps start with a game where the user must guess a randomly selected number in a range; then you could move on to a more difficult game like “Wheel of Fortune,” where the user guesses letters in a randomly selected word or phrase. Many systems have the file /usr/share/dict/words that contains many thousands of English words; you could use that as a source, or you could create your own input file of words and phrases.
Programs that incorporate randomness are some of my favorites. Random events are very useful for creating games as well as machine learning programs, so it’s important to understand how to control and test randomness. Here’s some of what you learned in this chapter:
The fortune records span multiple lines and use a lone percent sign to indicate the end of the record. You learned to read the lines into a buffer and dump the buffer when the record or file terminator is found.
You can use the rand crate to make pseudorandom choices that can be controlled using a seed value.
The Path (borrowed) and PathBuf (owned) types are useful abstractions for dealing with system paths on both Windows and Unix. They are similar to the &str and String types for dealing with borrowed and owned strings.
The names of files and directories may be invalid UTF-8, so Rust uses the types OsStr (borrowed) and OsString (owned) to represent these strings.
Using abstractions like Path and OsStr makes your Rust code more portable across operating systems.
In the next chapter, you’ll learn to manipulate dates as you create a terminal-based calendar program.
[1] ASCII art is a term for graphics that use only ASCII text values.
[2] This was in the 1990s, which I believe the kids nowadays refer to as “the late 1900s.”
[3] On Ubuntu, sudo apt install fortune-mod; on macOS, brew install fortune.
[4] Robert R. Coveyou, “Random Number Generation Is Too Important to Be Left to Chance,” Studies in Applied Mathematics 3 (1969): 70–111.