The Danger of Blindly Trusting Compiler Help

Rust has a reputation for having good compiler error messages [citation needed].

I generally agree! However, blindly following the hints given by the compiler may sometimes hurt beginners who don't fully understand the language. It doesn't help that due to Rust's excellent formatting of code suggestions [citation needed], the suggestions really seem Correct and Canonical.

Case Study

Consider this problem:

Write a function that takes in a word and a mutable reference to an output string, then appends an asterisk to the output if the entire word is made up of ASCII characters.

Does anyone actually need this exact function? They likely don't! But the pattern of "appending to an output" happens a lot, and that's the main focus here.

Here's how someone who is a beginner to Rust but is familiar with other programming languages might approach the problem:

fn append_asterisk_if_ascii(target: String, result: &mut String) {
    // ...
}

With the following thought process:

My function needs to receive a string, which is a String in Rust, and also a mutable reference to another String, which I know I can do by using &mut.

And here's how they might write the function body:

fn append_asterisk_if_ascii(target: String, result: &mut String) {
    if target.is_ascii() {
        result = result + String::from("*");
    }
}

With the following thought process:

If the target is ASCII, I need to add an asterisk to the result. I read the documentation on std::string::String, and know that I can create the String for the asterisk by using String::from.

And now the beginner is trapped:

error[E0369]: cannot add `String` to `&mut String`
 --> src/main.rs:3:25
  |
3 |         result = result + String::from("*");
  |                  ------ ^ ----------------- String
  |                  |      |
  |                  |      `+` cannot be used to concatenate a `&str` with a `String`
  |                  &mut String
  |
help: create an owned `String` on the left and add a borrow on the right
  |
3 |         result = result.to_owned() + &String::from("*");
  |                        +++++++++++   +

For more information about this error, try `rustc --explain E0369`.
error: could not compile `testing` (bin "testing") due to previous error

Leading to:

fn append_asterisk_if_ascii(target: String, result: &mut String) {
    if target.is_ascii() {
        result = result.to_owned() + &String::from("*");
    }
}

Resulting in:

error[E0308]: mismatched types
 --> src/main.rs:3:18
  |
1 | fn append_asterisk_if_ascii(target: String, result: &mut String) {
  |                                                     ----------- expected due to this parameter type
2 |     if target.is_ascii() {
3 |         result = result.to_owned() + &String::from("*");
  |                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected `&mut String`, found `String`
  |
help: consider dereferencing here to assign to the mutably borrowed value
  |
3 |         *result = result.to_owned() + &String::from("*");
  |         +

For more information about this error, try `rustc --explain E0308`.
error: could not compile `testing` (bin "testing") due to previous error

Leading to:

fn append_asterisk_if_ascii(target: String, result: &mut String) {
    if target.is_ascii() {
        *result = result.to_owned() + &String::from("*");
    }
}

And now the program compiles.
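As a quick sanity check, here is a minimal sketch that exercises the version the compiler steered us to. The function body is the one from the last step; the `main` function and its inputs are made up for illustration.

```rust
// The version arrived at by following the compiler's suggestions.
fn append_asterisk_if_ascii(target: String, result: &mut String) {
    if target.is_ascii() {
        *result = result.to_owned() + &String::from("*");
    }
}

fn main() {
    let mut out = String::new();

    // All ASCII: an asterisk is appended.
    append_asterisk_if_ascii(String::from("hello"), &mut out);
    // Contains non-ASCII characters: the output is left alone.
    append_asterisk_if_ascii(String::from("Grüße"), &mut out);
    // All ASCII again: a second asterisk is appended.
    append_asterisk_if_ascii(String::from("world"), &mut out);

    assert_eq!(out, "**");
    println!("{out}");
}
```

So the function does what the problem statement asks. Whether it does so efficiently is a separate question.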

Space Analysis

People familiar with Rust might have been screaming for the past few paragraphs. Let's first get one improvement out of the way. It is irrelevant to this particular case study, but very relevant in general, and should be fixed:

fn append_asterisk_if_ascii(target: String, result: &mut String) {
    // ...
}

This function signature is overly specific. Since the only thing we need target for is the method .is_ascii, which does not mutate the String, we can avoid taking ownership of the String and use a &str instead, which is an immutable string slice.
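To illustrate why `&str` is the more flexible parameter type, here is a minimal sketch. `is_all_ascii` is a hypothetical helper used only for this example; it is not part of the code in this post.

```rust
// Hypothetical helper: borrows the text instead of taking ownership of it.
fn is_all_ascii(target: &str) -> bool {
    target.is_ascii()
}

fn main() {
    let owned = String::from("hello");

    // A `&String` is coerced to `&str` automatically (deref coercion).
    assert!(is_all_ascii(&owned));
    // A string literal is already a `&'static str`.
    assert!(is_all_ascii("world"));
    // A sub-slice of a larger string works as well.
    assert!(is_all_ascii(&owned[0..2]));
    // Non-ASCII input is rejected regardless of where it came from.
    assert!(!is_all_ascii("Grüße"));

    // `owned` was only borrowed, so it is still usable here.
    println!("{owned}");
}
```

With a `String` parameter, every one of these callers would have had to hand over (or clone) an owned string just to have it inspected.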

In a similar vein, it is tempting to loosen result to &mut str, since more types than String can "provide" a &mut str. That does not work here: a string slice has a fixed length and cannot grow, so appending needs the String itself. result therefore stays a &mut String.

Now to the meat and potatoes:

if target.is_ascii() {
    *result = result.to_owned() + &String::from("*");
}

This code is Not Good for one reason: it makes plenty of unnecessary memory allocations. In fact, it makes 2 extra allocations per call, when in the ideal case it makes 0. The allocations are:

  1. result.to_owned(), which creates a clone of result, which is a String.

  2. String::from("*"), which creates a clone of the &'static str that is "*".

Note that the + does not allocate a new string, but rather reuses the buffer of the LHS, which in this case is result.to_owned().
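The buffer reuse by `+` can be observed directly by comparing heap pointers before and after the addition. This is a minimal sketch; `add_reuses_buffer` is a hypothetical helper, and the left-hand side is given spare capacity up front so that the append fits without growing.

```rust
// Returns true when `lhs + rhs` kept the heap buffer that `lhs` owned.
fn add_reuses_buffer(lhs: String, rhs: &str) -> bool {
    let before = lhs.as_ptr();
    let joined = lhs + rhs; // moves `lhs`, then appends `rhs` into it
    joined.as_ptr() == before
}

fn main() {
    // Spare capacity is reserved up front so the append fits without growing.
    let mut lhs = String::with_capacity(16);
    lhs.push_str("ascii");

    assert!(add_reuses_buffer(lhs, "*"));
}
```

If the left-hand side has no spare capacity, the same buffer still has to grow, which may move it. That is what happens to the fresh clone produced by result.to_owned().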

How bad is it in practice? Let's find out! We'll use the heap profiling crate dhat-rs. Here's the code:

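#[global_allocator]
static ALLOC: dhat::Alloc = dhat::Alloc;

fn append_asterisk_if_ascii(target: &str, result: &mut String) {
    if target.is_ascii() {
        *result = result.to_owned() + &String::from("*");
    }
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Created before the profiler so this allocation is not measured.
    let mut result = String::with_capacity(10);
    let _profiler = dhat::Profiler::builder().testing().build();

    append_asterisk_if_ascii("ascii!", &mut result);

    let stats = dhat::HeapStats::get();
    println!("  Max blocks:\t{}", stats.max_blocks);
    println!("   Max bytes:\t{}", stats.max_bytes);
    println!("Total blocks:\t{}", stats.total_blocks);
    println!(" Total bytes:\t{}", stats.total_bytes);

    Ok(())
}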

A few things are of note here:

  • We ensure the result string has sufficient capacity before the loop to avoid growing the string during the loop. Note that in this case, ensuring capacity does not change the memory used because the function replaces result each call.

  • We create the heap profiler *after* creating the result string to avoid measuring the heap allocation during the creation of result.

Here are the results:

$ cargo run --release
  Max blocks:   2
   Max bytes:   9
Total blocks:   2
 Total bytes:   9

From our analysis earlier we know why the maximum number of blocks is 2. The breakdown for maximum number of bytes is rather complicated, but the TLDR is that the minimum heap allocation size when growing a String is 8 bytes. If you're interested, here's the stack trace:

alloc::raw_vec::finish_grow                                                                                     (core/src/result.rs:0:23)
alloc::raw_vec::RawVec<T,A>::grow_amortized                                                                     (alloc/src/raw_vec.rs:404:19)
alloc::raw_vec::RawVec<T,A>::reserve::do_reserve_and_handle                                                     (alloc/src/raw_vec.rs:289:28)
alloc::raw_vec::RawVec<T,A>::reserve                                                                            (alloc/src/raw_vec.rs:293:13)
alloc::vec::Vec<T,A>::reserve                                                                                   (src/vec/mod.rs:909:18)
alloc::vec::Vec<T,A>::append_elements                                                                           (src/vec/mod.rs:1992:9)
<alloc::vec::Vec<T,A> as alloc::vec::spec_extend::SpecExtend<&T,core::slice::iter::Iter<T>>>::spec_extend       (src/vec/spec_extend.rs:55:23)
alloc::vec::Vec<T,A>::extend_from_slice                                                                         (src/vec/mod.rs:2438:9)
alloc::string::String::push_str                                                                                 (alloc/src/string.rs:903:9)
<alloc::string::String as core::ops::arith::Add<&str>>::add                                                     (alloc/src/string.rs:2264:14)
testing::append_asterisk_if_ascii                                                                               (testing/src/main.rs:6:19)
testing::main                                                                                                   (testing/src/main.rs:10:5)

with the relevant constant being MIN_NON_ZERO_CAP.

The 8 bytes, plus the 1 byte for String::from("*"), makes 9 bytes.
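The two sizes can also be seen without a profiler by asking each string for its capacity. This is a minimal sketch with hypothetical helper names; the exact numbers are implementation details of the standard library and may differ between versions, so the code prints them rather than hard-coding them.

```rust
// Capacity after appending one byte to an empty `String` (the growth path).
fn capacity_after_growth() -> usize {
    let mut s = String::new(); // no heap allocation yet
    s.push_str("*");
    s.capacity()
}

// Capacity of a `String` built directly from a one-byte literal.
fn capacity_from_literal() -> usize {
    String::from("*").capacity()
}

fn main() {
    // The measurements above correspond to 8 and 1 respectively;
    // other standard library versions are free to choose differently.
    println!("grown:   {}", capacity_after_growth());
    println!("literal: {}", capacity_from_literal());
}
```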

Why only 1 byte for String::from("*"), and not the 8-byte minimum? Elementary, my dear duckson. String::from takes a different code path!

<alloc::alloc::Global as core::alloc::Allocator>::allocate    (alloc/src/alloc.rs:241:9)
alloc::raw_vec::RawVec<T,A>::allocate_in                      (alloc/src/raw_vec.rs:184:45)
alloc::raw_vec::RawVec<T,A>::with_capacity_in                 (alloc/src/raw_vec.rs:130:9)
alloc::vec::Vec<T,A>::with_capacity_in                        (src/vec/mod.rs:670:20)
<T as alloc::slice::hack::ConvertVec>::to_vec                 (alloc/src/slice.rs:162:25)
alloc::slice::hack::to_vec                                    (alloc/src/slice.rs:111:9)
alloc::slice::<impl [T]>::to_vec_in                           (alloc/src/slice.rs:441:9)
alloc::slice::<impl [T]>::to_vec                              (alloc/src/slice.rs:416:14)
alloc::slice::<impl alloc::borrow::ToOwned for [T]>::to_owned (alloc/src/slice.rs:823:14)
alloc::str::<impl alloc::borrow::ToOwned for str>::to_owned   (alloc/src/str.rs:209:62)
<alloc::string::String as core::convert::From<&str>>::from    (alloc/src/string.rs:2612:11)
testing::append_asterisk_if_ascii                             (testing/src/main.rs:6:40)
testing::main                                                 (testing/src/main.rs:10:5)

What this code path does exactly is outside my pay grade.

So what does the better version look like? Here it is:

fn append_asterisk_if_ascii(target: &str, result: &mut String) {
    if target.is_ascii() {
        result.push('*');
    }
}

Or this:

fn append_asterisk_if_ascii(target: &str, result: &mut String) {
    if target.is_ascii() {
        *result += "*";
    }
}

Both do a whopping 0 extra allocations, provided result still has enough capacity to fit the new content. This is because the underlying buffer in result is reused, instead of a new string being created to replace it. The effect becomes more pronounced with more iterations of append_asterisk_if_ascii:

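let iterations = 100_000;
let mut result = String::with_capacity(iterations);
let _profiler = dhat::Profiler::builder().testing().build();

for _ in 0..iterations {
    append_asterisk_if_ascii("full ascii!", &mut result);
}

let stats = dhat::HeapStats::get();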

which still results in 0 allocations for the better version, but for the original...

$ cargo run --release
  Max blocks:   3
   Max bytes:   399995
Total blocks:   299992
 Total bytes:   14999949980
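The zero-allocation claim for the better version can also be cross-checked without a profiler: if the buffer is never reallocated, its heap pointer never changes. This is a minimal sketch; `buffer_stayed_put` is a hypothetical helper, and it relies on the capacity being reserved up front as in the loop above.

```rust
fn append_asterisk_if_ascii(target: &str, result: &mut String) {
    if target.is_ascii() {
        result.push('*');
    }
}

// Runs `calls` appends and reports whether the heap buffer stayed in place.
fn buffer_stayed_put(calls: usize) -> bool {
    // One byte is appended per call, so `calls` bytes of capacity suffice.
    let mut result = String::with_capacity(calls);
    let start = result.as_ptr();
    for _ in 0..calls {
        append_asterisk_if_ascii("full ascii!", &mut result);
    }
    result.len() == calls && result.as_ptr() == start
}

fn main() {
    assert!(buffer_stayed_put(100_000));
}
```

A pointer check like this only shows that the buffer was not moved. The dhat numbers remain the more direct evidence.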

Time Analysis

Let's use hyperfine to benchmark two versions of our program!

First, we'll add the following to our Cargo.toml:

[features]
slow = []
fast = []

Then, we can include our two versions of append_asterisk_if_ascii:

#[cfg(feature = "slow")]
fn append_asterisk_if_ascii(target: &str, result: &mut String) {
    if target.is_ascii() {
        *result = result.to_string() + &String::from("*");
    }
}

#[cfg(feature = "fast")]
fn append_asterisk_if_ascii(target: &str, result: &mut String) {
    if target.is_ascii() {
        *result += "*";
    }
}

We'll keep the same number of iterations as before, and run a comparison of the two features:

$ hyperfine --warmup 5 "cargo run --release --features fast" "cargo run --release --features slow"

Benchmark 1: cargo run --release --features fast
  Time (mean ± σ):      40.4 ms ±   0.8 ms    [User: 30.5 ms, System: 9.7 ms]
  Range (min … max):    39.6 ms …  43.8 ms    67 runs

Benchmark 2: cargo run --release --features slow
  Time (mean ± σ):     759.6 ms ±   2.2 ms    [User: 257.7 ms, System: 496.8 ms]
  Range (min … max):   755.8 ms … 762.3 ms    10 runs

Summary
  cargo run --release --features fast ran
   18.81 ± 0.36 times faster than cargo run --release --features slow

18.8 times faster. Cool!
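The hyperfine command above times the whole `cargo run` invocation, so each measurement also includes process and cargo startup. As a rough complement, the two versions could be timed inside one process with std::time::Instant. This is only a sketch: `slow`, `fast`, and `time_appends` are hypothetical names, the iteration count is smaller than the one used above, and a single run like this is far less rigorous than a proper benchmark.

```rust
use std::time::{Duration, Instant};

fn slow(target: &str, result: &mut String) {
    if target.is_ascii() {
        *result = result.to_string() + &String::from("*");
    }
}

fn fast(target: &str, result: &mut String) {
    if target.is_ascii() {
        *result += "*";
    }
}

// Times `iterations` calls of `f`, returning the final string and elapsed time.
fn time_appends(f: fn(&str, &mut String), iterations: usize) -> (String, Duration) {
    let mut result = String::with_capacity(iterations);
    let start = Instant::now();
    for _ in 0..iterations {
        f("full ascii!", &mut result);
    }
    (result, start.elapsed())
}

fn main() {
    let iterations = 20_000; // smaller than the run above, to keep it quick
    let (fast_out, fast_time) = time_appends(fast, iterations);
    let (slow_out, slow_time) = time_appends(slow, iterations);

    assert_eq!(fast_out, slow_out); // same result, different cost
    println!("fast: {fast_time:?}");
    println!("slow: {slow_time:?}");
}
```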

Conclusion

To be clear, I am not saying you should disregard the hints or help messages given by the Rust compiler. However, you should not assume that the help provided is accurate or solves the underlying problem exactly. The diagnostics given by the compiler are usually narrowly focused, and local rather than global.

Unfortunately, this is a tough problem to solve. For people looking to learn Rust, there's no way around taking the time to grok the reason for the language's existence. Tools like Clippy help with writing idiomatic code, but they aren't a panacea either. You just have to write code, possibly bad code, and keep telling yourself there must be a better way!

codeintel::block_2deefab6141ca1e6
fn append_asterisk_if_ascii(target: String, result: &mut String)
target: String
alloc::string
pub struct String {
    vec: Vec<u8>,
}

A UTF-8–encoded, growable string.

String is the most common string type. It has ownership over the contents of the string, stored in a heap-allocated buffer (see Representation). It is closely related to its borrowed counterpart, the primitive [str].

Examples

You can create a String from a literal string with [String::from]:

let hello = String::from("Hello, world!");

You can append a char to a String with the [push] method, and append a [&str] with the [push_str] method:

let mut hello = String::from("Hello, ");

hello.push('w');
hello.push_str("orld!");

If you have a vector of UTF-8 bytes, you can create a String from it with the [from_utf8] method:

// some bytes, in a vector
let sparkle_heart = vec![240, 159, 146, 150];

// We know these bytes are valid, so we'll use `unwrap()`.
let sparkle_heart = String::from_utf8(sparkle_heart).unwrap();

assert_eq!("💖", sparkle_heart);

UTF-8

Strings are always valid UTF-8. If you need a non-UTF-8 string, consider OsString. It is similar, but without the UTF-8 constraint. Because UTF-8 is a variable width encoding, Strings are typically smaller than an array of the same chars:

// `s` is ASCII which represents each `char` as one byte
let s = "hello";
assert_eq!(s.len(), 5);

// A `char` array with the same contents would be longer because
// every `char` is four bytes
let s = ['h', 'e', 'l', 'l', 'o'];
let size: usize = s.into_iter().map(|c| size_of_val(&c)).sum();
assert_eq!(size, 20);

// However, for non-ASCII strings, the difference will be smaller
// and sometimes they are the same
let s = "💖💖💖💖💖";
assert_eq!(s.len(), 20);

let s = ['💖', '💖', '💖', '💖', '💖'];
let size: usize = s.into_iter().map(|c| size_of_val(&c)).sum();
assert_eq!(size, 20);

This raises interesting questions as to how s[i] should work. What should i be here? Several options include byte indices and char indices but, because of UTF-8 encoding, only byte indices would provide constant time indexing. Getting the ith char, for example, is available using [chars]:

let s = "hello";
let third_character = s.chars().nth(2);
assert_eq!(third_character, Some('l'));

let s = "💖💖💖💖💖";
let third_character = s.chars().nth(2);
assert_eq!(third_character, Some('💖'));

Next, what should s[i] return? Because indexing returns a reference to underlying data it could be &u8, &[u8], or something similar. Since we’re only providing one index, &u8 makes the most sense but that might not be what the user expects and can be explicitly achieved with [as_bytes()]:

// The first byte is 104 - the byte value of `'h'`
let s = "hello";
assert_eq!(s.as_bytes()[0], 104);
// or
assert_eq!(s.as_bytes()[0], b'h');

// The first byte is 240 which isn't obviously useful
let s = "💖💖💖💖💖";
assert_eq!(s.as_bytes()[0], 240);

Due to these ambiguities/restrictions, indexing with a usize is simply forbidden:

let s = "hello";

// The following will not compile!
println!("The first letter of s is {}", s[0]);

It is more clear, however, how &s[i..j] should work (that is, indexing with a range). It should accept byte indices (to be constant-time) and return a &str which is UTF-8 encoded. This is also called “string slicing”. Note this will panic if the byte indices provided are not character boundaries - see [is_char_boundary] for more details. See the implementations for [SliceIndex<str>] for more details on string slicing. For a non-panicking version of string slicing, see [get].

The [bytes] and [chars] methods return iterators over the bytes and codepoints of the string, respectively. To iterate over codepoints along with byte indices, use [char_indices].

Deref

String implements [Deref]<Target = [str]>, and so inherits all of [str]’s methods. In addition, this means that you can pass a String to a function which takes a [&str] by using an ampersand (&):

fn takes_str(s: &str) { }

let s = String::from("Hello");

takes_str(&s);

This will create a [&str] from the String and pass it in. This conversion is very inexpensive, and so generally, functions will accept [&str]s as arguments unless they need a String for some specific reason.

In certain cases Rust doesn’t have enough information to make this conversion, known as [Deref] coercion. In the following example a string slice &'a str implements the trait TraitExample, and the function example_func takes anything that implements the trait. In this case Rust would need to make two implicit conversions, which Rust doesn’t have the means to do. For that reason, the following example will not compile.

trait TraitExample {}

impl<'a> TraitExample for &'a str {}

fn example_func<A: TraitExample>(example_arg: A) {}

let example_string = String::from("example_string");
example_func(&example_string);

There are two options that would work instead. The first would be to change the line example_func(&example_string); to example_func(example_string.as_str());, using the method [as_str()] to explicitly extract the string slice containing the string. The second way changes example_func(&example_string); to example_func(&*example_string);. In this case we are dereferencing a String to a [str], then referencing the [str] back to [&str]. The second way is more idiomatic, however both work to do the conversion explicitly rather than relying on the implicit conversion.

Representation

A String is made up of three components: a pointer to some bytes, a length, and a capacity. The pointer points to the internal buffer which String uses to store its data. The length is the number of bytes currently stored in the buffer, and the capacity is the size of the buffer in bytes. As such, the length will always be less than or equal to the capacity.

This buffer is always stored on the heap.

You can look at these with the [as_ptr], [len], and [capacity] methods:

let story = String::from("Once upon a time...");

// Deconstruct the String into parts.
let (ptr, len, capacity) = story.into_raw_parts();

// story has nineteen bytes
assert_eq!(19, len);

// We can re-build a String out of ptr, len, and capacity. This is all
// unsafe because we are responsible for making sure the components are
// valid:
let s = unsafe { String::from_raw_parts(ptr, len, capacity) } ;

assert_eq!(String::from("Once upon a time..."), s);

If a String has enough capacity, adding elements to it will not re-allocate. For example, consider this program:

let mut s = String::new();

println!("{}", s.capacity());

for _ in 0..5 {
    s.push_str("hello");
    println!("{}", s.capacity());
}

This will output the following:

0
8
16
16
32
32

At first, we have no memory allocated at all, but as we append to the string, it increases its capacity appropriately. If we instead use the [with_capacity] method to allocate the correct capacity initially:

let mut s = String::with_capacity(25);

println!("{}", s.capacity());

for _ in 0..5 {
    s.push_str("hello");
    println!("{}", s.capacity());
}

We end up with a different output:

25
25
25
25
25
25

Here, there’s no need to allocate more memory inside the loop.

result: &mut String
codeintel::block_886f78ca3d265088
fn append_asterisk_if_ascii(target: String, result: &mut String)
core::str
pub const fn is_ascii(&self) -> bool

Checks if all characters in this string are within the ASCII range.

An empty string returns true.

Examples

let ascii = "hello!\n";
let non_ascii = "Grüße, Jürgen ❤";

assert!(ascii.is_ascii());
assert!(!non_ascii.is_ascii());
alloc::string::String
fn from(s: &str) -> String

Converts a &str into a String.

The result is allocated on the heap.

codeintel::block_3221b707d07a34c3
fn append_asterisk_if_ascii(target: String, result: &mut String)
alloc::borrow
impl<T> ToOwned for T
fn to_owned(&self) -> T
where
    // Bounds from impl:
    T: Clone,

Creates owned data from borrowed data, usually by cloning.

Examples

Basic usage:

let s: &str = "a";
let ss: String = s.to_owned();

let v: &[i32] = &[1, 2];
let vv: Vec<i32> = v.to_owned();
codeintel::block_e5e324394c1455ec
fn append_asterisk_if_ascii(target: String, result: &mut String)
core::macros::builtin
macro global_allocator

Attribute macro applied to a static to register it as a global allocator.

See also std::alloc::GlobalAlloc.

codeintel::block_cad0a160d4e872ae
static ALLOC: dhat::Alloc = 
extern crate dhat

Warning: This crate is experimental. It relies on implementation techniques that are hard to keep working for 100% of configurations. It may work fine for you, or it may crash, hang, or otherwise do the wrong thing. Its maintenance is not a high priority of the author. Support requests such as issues and pull requests may receive slow responses, or no response at all. Sorry!

This crate provides heap profiling and ad hoc profiling capabilities to Rust programs, similar to those provided by DHAT.

The heap profiling works by using a global allocator that wraps the system allocator, tracks all heap allocations, and on program exit writes data to file so it can be viewed with DHAT’s viewer. This corresponds to DHAT’s --mode=heap mode.

The ad hoc profiling is via a second mode of operation, where ad hoc events can be manually inserted into a Rust program for aggregation and viewing. This corresponds to DHAT’s --mode=ad-hoc mode.

dhat also supports heap usage testing, where you can write tests and then check that they allocated as much heap memory as you expected. This can be useful for performance-sensitive code.

Motivation

DHAT is a powerful heap profiler that comes with Valgrind. This crate is a related but alternative choice for heap profiling Rust programs. DHAT and this crate have the following differences.

  • This crate works on any platform, while DHAT only works on some platforms(Linux, mostly). (Note that DHAT’s viewer is just HTML+JS+CSS and shouldwork in any modern web browser on any platform.)
  • This crate typically causes a smaller slowdown than DHAT.
  • This crate requires some modifications to a program’s source code andrecompilation, while DHAT does not.
  • This crate cannot track memory accesses the way DHAT does, because it doesnot instrument all memory loads and stores.
  • This crate does not provide profiling of copy functions such as memcpyand strcpy, unlike DHAT.
  • The backtraces produced by this crate may be better than those producedby DHAT.
  • DHAT measures a program’s entire execution, but this crate only measureswhat happens within main. It will miss the small number of allocationsthat occur before or after main, within the Rust runtime.
  • This crate enables heap usage testing.

Configuration (profiling and testing)

In your Cargo.toml file, as well as specifying dhat as a dependency, you should (a) enable source line debug info, and (b) create a feature or two that lets you easily switch profiling on and off:

[profile.release]
debug = 1

[features]
dhat-heap = []    # if you are doing heap profiling
dhat-ad-hoc = []  # if you are doing ad hoc profiling

You should only use dhat in release builds. Debug builds are too slow to be useful.

Setup (heap profiling)

For heap profiling, enable the global allocator by adding this code to your program:

#[cfg(feature = "dhat-heap")]
#[global_allocator]
static ALLOC: dhat::Alloc = dhat::Alloc;

Then add the following code to the very start of your main function:

#[cfg(feature = "dhat-heap")]
let _profiler = dhat::Profiler::new_heap();

Then run this command to enable heap profiling during the lifetime of the Profiler instance:

cargo run --features dhat-heap

dhat::Alloc is slower than the normal allocator, so it should only be enabled while profiling.

Setup (ad hoc profiling)

Ad hoc profiling involves manually annotating hot code points and then aggregating the executed annotations in some fashion.

To do this, add the following code to the very start of your main function:

 #[cfg(feature = "dhat-ad-hoc")]
 let _profiler = dhat::Profiler::new_ad_hoc();

Then insert calls like this at points of interest:

#[cfg(feature = "dhat-ad-hoc")]
dhat::ad_hoc_event(100);

Then run this command to enable ad hoc profiling during the lifetime of the Profiler instance:

cargo run --features dhat-ad-hoc

For example, imagine you have a hot function that is called from many call sites. You might want to know how often it is called and which other functions called it the most. In that case, you would add an ad_hoc_event call to that function, and the data collected by this crate and viewed with DHAT’s viewer would show you exactly what you want to know.

The meaning of the integer argument to ad_hoc_event will depend on exactly what you are measuring. If there is no meaningful weight to give to an event, you can just use 1.

Running

For both heap profiling and ad hoc profiling, the program will run more slowly than normal. The exact slowdown is hard to predict because it depends greatly on the program being profiled, but it can be large. (Even more so on Windows, because backtrace gathering can be drastically slower on Windows than on other platforms.)

When the Profiler is dropped at the end of main, some basic information will be printed to stderr. For heap profiling it will look like the following.

dhat: Total:     1,256 bytes in 6 blocks
dhat: At t-gmax: 1,256 bytes in 6 blocks
dhat: At t-end:  1,256 bytes in 6 blocks
dhat: The data has been saved to dhat-heap.json, and is viewable with dhat/dh_view.html

(“Blocks” is a synonym for “allocations”.)

For ad hoc profiling it will look like the following.

dhat: Total:     141 units in 11 events
dhat: The data has been saved to dhat-ad-hoc.json, and is viewable with dhat/dh_view.html

A file called dhat-heap.json (for heap profiling) or dhat-ad-hoc.json (for ad hoc profiling) will be written. It can be viewed in DHAT’s viewer.

If you don’t see this output, it may be because your program called std::process::exit, which exits a program without running any destructors. To work around this, explicitly call drop on the Profiler value just before exiting.

When doing heap profiling, if you unexpectedly see zero allocations in the output it may be because you forgot to set dhat::Alloc as the global allocator.

When doing heap profiling it is recommended that the lifetime of the Profiler value cover all of main. But it is still possible for allocations and deallocations to occur outside of its lifetime. Such cases are handled in the following ways.

  • Allocated before, untouched within: ignored.
  • Allocated before, freed within: ignored.
  • Allocated before, reallocated within: treated like a new allocationwithin.
  • Allocated after: ignored.

These cases are not ideal, but it is impossible to do better. dhat deliberately provides no way to reset the heap profiling state mid-run precisely because it leaves open the possibility of many such occurrences.

Viewing

Open a copy of DHAT’s viewer, version 3.17 or later. There are two ways to do this.

  • Easier: Use the online version.
  • Harder: Clone the Valgrind repository with git clone git://sourceware.org/git/valgrind.git and open dhat/dh_view.html.There is no need to build any code in this repository.

Then click on the “Load…” button to load dhat-heap.json or dhat-ad-hoc.json.

DHAT’s viewer shows a tree with nodes that look like this.

PP 1.1/2 {
  Total:     1,024 bytes (98.46%, 14,422,535.21/s) in 1 blocks (50%, 14,084.51/s), avg size 1,024 bytes, avg lifetime 35 µs (49.3% of program duration)
  Max:       1,024 bytes in 1 blocks, avg size 1,024 bytes
  At t-gmax: 1,024 bytes (98.46%) in 1 blocks (50%), avg size 1,024 bytes
  At t-end:  1,024 bytes (100%) in 1 blocks (100%), avg size 1,024 bytes
  Allocated at {
    #1: 0x10ae8441b: <alloc::alloc::Global as core::alloc::Allocator>::allocate (alloc/src/alloc.rs:226:9)
    #2: 0x10ae8441b: alloc::raw_vec::RawVec<T,A>::allocate_in (alloc/src/raw_vec.rs:207:45)
    #3: 0x10ae8441b: alloc::raw_vec::RawVec<T,A>::with_capacity_in (alloc/src/raw_vec.rs:146:9)
    #4: 0x10ae8441b: alloc::vec::Vec<T,A>::with_capacity_in (src/vec/mod.rs:609:20)
    #5: 0x10ae8441b: alloc::vec::Vec<T>::with_capacity (src/vec/mod.rs:470:9)
    #6: 0x10ae8441b: std::io::buffered::bufwriter::BufWriter<W>::with_capacity (io/buffered/bufwriter.rs:115:33)
    #7: 0x10ae8441b: std::io::buffered::linewriter::LineWriter<W>::with_capacity (io/buffered/linewriter.rs:109:29)
    #8: 0x10ae8441b: std::io::buffered::linewriter::LineWriter<W>::new (io/buffered/linewriter.rs:89:9)
    #9: 0x10ae8441b: std::io::stdio::stdout::{{closure}} (src/io/stdio.rs:680:58)
    #10: 0x10ae8441b: std::lazy::SyncOnceCell<T>::get_or_init_pin::{{closure}} (std/src/lazy.rs:375:25)
    #11: 0x10ae8441b: std::sync::once::Once::call_once_force::{{closure}} (src/sync/once.rs:320:40)
    #12: 0x10aea564c: std::sync::once::Once::call_inner (src/sync/once.rs:419:21)
    #13: 0x10ae81b1b: std::sync::once::Once::call_once_force (src/sync/once.rs:320:9)
    #14: 0x10ae81b1b: std::lazy::SyncOnceCell<T>::get_or_init_pin (std/src/lazy.rs:374:9)
    #15: 0x10ae81b1b: std::io::stdio::stdout (src/io/stdio.rs:679:16)
    #16: 0x10ae81b1b: std::io::stdio::print_to (src/io/stdio.rs:1196:21)
    #17: 0x10ae81b1b: std::io::stdio::_print (src/io/stdio.rs:1209:5)
    #18: 0x10ae2fe20: dhatter::main (dhatter/src/main.rs:8:5)
  }
}

Full details about the output are in the DHAT documentation. Note that DHAT uses the word “block” as a synonym for “allocation”.

When heap profiling, this crate doesn’t track memory accesses (unlike DHAT) and so the “reads” and “writes” measurements are not shown within DHAT’s viewer, and “sort metric” views involving reads, writes, or accesses are not available.

The backtraces produced by this crate are trimmed to reduce output file sizes and improve readability in DHAT’s viewer, in the following ways.

  • Only one allocation-related frame will be shown at the top of thebacktrace. That frame may be a function within alloc::alloc, a functionwithin this crate, or a global allocation function like __rg_alloc.
  • Common frames at the bottom of all backtraces, below main, are omitted.

Backtrace trimming is inexact and if the above heuristics fail more frames will be shown. ProfilerBuilder::trim_backtraces allows (approximate) control of how deep backtraces will be.

Heap usage testing

dhat lets you write tests that check that a certain piece of code does a certain amount of heap allocation when it runs. This is sometimes called “high water mark” testing. Sometimes it is precise (e.g. “this code should do exactly 96 allocations” or “this code should free all allocations before finishing”) and sometimes it is less precise (e.g. “the peak heap usage of this code should be less than 10 MiB”).

These tests are somewhat fragile, because heap profiling involves global state (allocation stats), which introduces complications.

  • dhat will panic if more than one Profiler is running at a time, but Rust tests run in parallel by default. So parallel running of heap usage tests must be prevented.
  • If you use something like the serial_test crate to run heap usage tests in serial, Rust’s test runner code by default still runs in parallel with those tests, and it allocates memory. These allocations will be counted by the Profiler as if they are part of the test, which will likely cause test failures.

Therefore, the best approach is to put each heap usage test in its own integration test file. Each integration test runs in its own process, and so cannot interfere with any other test. Also, if there is only one test in an integration test file, Rust’s test runner code does not use any parallelism, and so will not interfere with the test. If you do this, a simple cargo test will work as expected.

Alternatively, if you really want multiple heap usage tests in a single integration test file you can write your own custom test harness, which is simpler than it sounds.

But integration tests have some limits. For example, they can only be used to test items from libraries, not binaries. One way to get around this is to restructure things so that most of the functionality is in a library, and the binary is a thin wrapper around the library.

Failing that, a blunt fallback is to run cargo test -- --test-threads=1. This disables all parallelism in tests, avoiding all the problems. It allows the use of unit tests and multiple tests per integration test file, at the cost of a non-standard invocation and slower test execution.

With all that in mind, configuration of Cargo.toml is much the same as for the profiling use case.

Here is an example showing what is possible. This code would go in an integration test within a crate’s tests/ directory:

#[global_allocator]
static ALLOC: dhat::Alloc = dhat::Alloc;

#[test]
fn test() {
    let _profiler = dhat::Profiler::builder().testing().build();

    let _v1 = vec![1, 2, 3, 4];
    let v2 = vec![5, 6, 7, 8];
    drop(v2);
    let v3 = vec![9, 10, 11, 12];
    drop(v3);

    let stats = dhat::HeapStats::get();

    // Three allocations were done in total.
    dhat::assert_eq!(stats.total_blocks, 3);
    dhat::assert_eq!(stats.total_bytes, 48);

    // At the point of peak heap size, two allocations totalling 32 bytes existed.
    dhat::assert_eq!(stats.max_blocks, 2);
    dhat::assert_eq!(stats.max_bytes, 32);

    // Now a single allocation remains alive.
    dhat::assert_eq!(stats.curr_blocks, 1);
    dhat::assert_eq!(stats.curr_bytes, 16);
}

The testing call puts the profiler into testing mode, which allows the stats provided by HeapStats::get to be checked with dhat::assert! and similar assertions. These assertions work much the same as normal assertions, except that if any of them fail a heap profile will be saved.

When viewing the heap profile after a test failure, the best choice of sort metric in the viewer will depend on which stat was involved in the assertion failure.

  • total_blocks: “Total (blocks)”
  • total_bytes: “Total (bytes)”
  • max_blocks or max_bytes: “At t-gmax (bytes)”
  • curr_blocks or curr_bytes: “At t-end (bytes)”

This should give you a good understanding of why the assertion failed.

Note: if you try this example test it may work in a debug build but fail in a release build, because the compiler may optimize away allocations whose results are unused. This is a common problem for contrived examples but less common for real tests. The std::hint::black_box function (stable since Rust 1.66) may also be helpful in this situation.
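As a rough illustration of that last point, the sketch below passes a vector through std::hint::black_box. The helper name allocate_observed is hypothetical, and black_box is only a best-effort hint to the optimizer, so this does not guarantee that the allocation survives in a release build.

```rust
use std::hint::black_box;

/// Hypothetical helper: builds a small vector and passes it through
/// `black_box`, which discourages the optimizer from treating it as unused.
fn allocate_observed() -> Vec<u32> {
    let v = vec![1, 2, 3, 4];
    black_box(v)
}
```

In a heap usage test, a value wrapped this way is more likely to still be allocated when the stats are checked.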

Ad hoc usage testing

Ad hoc usage testing is also possible. It can be used to ensure certain code points in your program are hit a particular number of times during execution. It works in much the same way as heap usage testing, but ProfilerBuilder::ad_hoc must be specified, AdHocStats::get is used instead of HeapStats::get, and there is no possibility of Rust’s test runner code interfering with the tests.

dhat
pub struct Alloc

A global allocator that tracks allocations and deallocations on behalf of the Profiler type.

It must be set as the global allocator (via #[global_allocator]) when doing heap profiling.

fn append_asterisk_if_ascii(target: &str, result: &mut String)
target: &str
str

String slices.

See also the std::str module.

The str type, also called a ‘string slice’, is the most primitive string type. It is usually seen in its borrowed form, &str. It is also the type of string literals, &'static str.

Basic Usage

String literals are string slices:

let hello_world = "Hello, World!";

Here we have declared a string slice initialized with a string literal. String literals have a static lifetime, which means the string hello_world is guaranteed to be valid for the duration of the entire program. We can explicitly specify hello_world’s lifetime as well:

let hello_world: &'static str = "Hello, world!";

Representation

A &str is made up of two components: a pointer to some bytes, and a length. You can look at these with the [as_ptr] and [len] methods:

use std::slice;
use std::str;

let story = "Once upon a time...";

let ptr = story.as_ptr();
let len = story.len();

// story has nineteen bytes
assert_eq!(19, len);

// We can re-build a str out of ptr and len. This is all unsafe because
// we are responsible for making sure the two components are valid:
let s = unsafe {
    // First, we build a &[u8]...
    let slice = slice::from_raw_parts(ptr, len);

    // ... and then convert that slice into a string slice
    str::from_utf8(slice)
};

assert_eq!(s, Ok(story));

Note: This example shows the internals of &str. unsafe should not be used to get a string slice under normal circumstances. Use as_str instead.

Invariant

Rust libraries may assume that string slices are always valid UTF-8.

Constructing a non-UTF-8 string slice is not immediate undefined behavior, but any function called on a string slice may assume that it is valid UTF-8, which means that a non-UTF-8 string slice can lead to undefined behavior down the road.
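In safe code the invariant is upheld by checked conversions. A minimal sketch, assuming a hypothetical helper named checked_text, shows std::str::from_utf8 rejecting bytes that are not valid UTF-8 rather than producing a broken string slice:

```rust
use std::str;

/// Hypothetical helper: returns the text if `bytes` is valid UTF-8,
/// or `None` otherwise, using the checked `str::from_utf8` conversion.
fn checked_text(bytes: &[u8]) -> Option<&str> {
    str::from_utf8(bytes).ok()
}
```

Calling checked_text(b"Once upon a time...") yields the string slice, while a byte such as 0xff, which never appears in valid UTF-8, yields None.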

fn main() -> Result<(), Box<dyn std::error::Error>>
core::result
pub enum Result<T, E> {
    Ok( /* … */ ),
    Err( /* … */ ),
}

Result is a type that represents either success (Ok) or failure (Err).

See the module documentation for details.

alloc::boxed
pub struct Box<T, A = Global>(Unique<T>, A)
where
    T: ?Sized,
    A: Allocator,

A pointer type that uniquely owns a heap allocation of type T.

See the module-level documentation for more.

extern crate std

The Rust Standard Library

The Rust Standard Library is the foundation of portable Rust software, a set of minimal and battle-tested shared abstractions for the broader Rust ecosystem. It offers core types, like [Vec<T>] and [Option<T>], library-defined operations on language primitives, standard macros, [I/O] and [multithreading], among many other things.

std is available to all Rust crates by default. Therefore, the standard library can be accessed in use statements through the path std, as in use std::env.

How to read this documentation

If you already know the name of what you are looking for, the fastest way to find it is to use the search button at the top of the page.

Otherwise, you may want to jump to one of these useful sections:

If this is your first time, the documentation for the standard library is written to be casually perused. Clicking on interesting things should generally lead you to interesting places. Still, there are important bits you don’t want to miss, so read on for a tour of the standard library and its documentation!

Once you are familiar with the contents of the standard library you may begin to find the verbosity of the prose distracting. At this stage in your development you may want to press the “Summary” button near the top of the page to collapse it into a more skimmable view.

While you are looking at the top of the page, also notice the “Source” link. Rust’s API documentation comes with the source code and you are encouraged to read it. The standard library source is generally high quality and a peek behind the curtains is often enlightening.

What is in the standard library documentation?

First of all, The Rust Standard Library is divided into a number of focused modules, all listed further down this page. These modules are the bedrock upon which all of Rust is forged, and they have mighty names like [std::slice] and [std::cmp]. Modules’ documentation typically includes an overview of the module along with examples, and are a smart place to start familiarizing yourself with the library.

Second, implicit methods on primitive types are documented here. This can be a source of confusion for two reasons:

  1. While primitives are implemented by the compiler, the standard library implements methods directly on the primitive types (and it is the only library that does so), which are documented in the section on primitives.
  2. The standard library exports many modules with the same name as primitive types. These define additional items related to the primitive type, but not the all-important methods.

So for example there is a page for the primitive type char that lists all the methods that can be called on characters (very useful), and there is a page for the module std::char that documents iterator and error types created by these methods (rarely useful).

Note the documentation for the primitives [str] and [T] (also called ‘slice’). Many method calls on String and [Vec<T>] are actually calls to methods on [str] and [T] respectively, via deref coercions.

Third, the standard library defines [The Rust Prelude], a small collection of items - mostly traits - that are imported into every module of every crate. The traits in the prelude are pervasive, making the prelude documentation a good entry point to learning about the library.

And finally, the standard library exports a number of standard macros, and lists them on this page (technically, not all of the standard macros are defined by the standard library - some are defined by the compiler - but they are documented here the same). Like the prelude, the standard macros are imported by default into all crates.

Contributing changes to the documentation

Check out the Rust contribution guidelines here. The source for this documentation can be found on GitHub in the ‘library/std/’ directory. To contribute changes, make sure you read the guidelines first, then submit pull-requests for your suggested changes.

Contributions are appreciated! If you see a part of the docs that can be improved, submit a PR, or chat with us first on Zulip #docs.

A Tour of The Rust Standard Library

The rest of this crate documentation is dedicated to pointing out notable features of The Rust Standard Library.

Containers and collections

The option and result modules define optional and error-handling types, [Option<T>] and [Result<T, E>]. The iter module defines Rust’s iterator trait, Iterator, which works with the for loop to access collections.

The standard library exposes three common ways to deal with contiguous regions of memory:

  • [Vec<T>] - A heap-allocated vector that is resizable at runtime.
  • [T; N] - An inline array with a fixed size at compile time.
  • [T] - A dynamically sized slice into any other kind of contiguous storage, whether heap-allocated or not.

Slices can only be handled through some kind of pointer, and as such come in many flavors such as:

  • &[T] - shared slice
  • &mut [T] - mutable slice
  • Box<[T]> - owned slice

[str], a UTF-8 string slice, is a primitive type, and the standard library defines many methods for it. Rust [str]s are typically accessed as immutable references: &str. Use the owned String for building and mutating strings.

For converting to strings use the format macro, and for converting from strings use the [FromStr] trait.
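The two directions can be sketched together. The function name round_trip is hypothetical; it formats a number with format! and parses it back through the FromStr trait:

```rust
use std::str::FromStr;

/// Converts a number to a string with `format!`, then parses it back
/// through the `FromStr` trait. Returns the parsed value on success.
fn round_trip(n: u32) -> Result<u32, std::num::ParseIntError> {
    let text = format!("{n}");
    u32::from_str(&text)
}
```

The more common spelling of the second step is text.parse::<u32>(), which calls the same FromStr implementation.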

Data may be shared by placing it in a reference-counted box or the [Rc] type, and if further contained in a [Cell] or [RefCell], may be mutated as well as shared. Likewise, in a concurrent setting it is common to pair an atomically-reference-counted box, [Arc], with a [Mutex] to get the same effect.
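A minimal sketch of both pairings follows. The function names are hypothetical and the data is deliberately trivial; the point is only which wrapper goes with which setting.

```rust
use std::cell::RefCell;
use std::rc::Rc;
use std::sync::{Arc, Mutex};
use std::thread;

/// Single-threaded sharing: two `Rc` handles mutate one `RefCell<Vec<_>>`.
fn shared_single_threaded() -> Vec<i32> {
    let shared = Rc::new(RefCell::new(Vec::new()));
    let other = Rc::clone(&shared);
    shared.borrow_mut().push(1);
    other.borrow_mut().push(2);
    // Copy the contents out so the borrow ends before `shared` is dropped.
    let contents = shared.borrow().to_vec();
    contents
}

/// Multi-threaded sharing: each thread increments a counter behind `Arc<Mutex<_>>`.
fn shared_across_threads(threads: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                *counter.lock().unwrap() += 1;
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}
```

Rc and RefCell are not thread-safe, so the compiler rejects sending them to another thread; Arc and Mutex pay for atomic operations and locking to make that possible.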

The collections module defines maps, sets, linked lists and other typical collection types, including the common [HashMap<K, V>].

Platform abstractions and I/O

Besides basic data types, the standard library is largely concerned with abstracting over differences in common platforms, most notably Windows and Unix derivatives.

Common types of I/O, including [files], [TCP], and [UDP], are defined in the io, fs, and net modules.

The thread module contains Rust’s threading abstractions. sync contains further primitive shared memory types, including [atomic], [mpmc] and [mpsc], which contains the channel types for message passing.

Use before and after main()

Many parts of the standard library are expected to work before and after main(); but this is not guaranteed or ensured by tests. It is recommended that you write your own tests and run them on each platform you wish to support. This means that use of std before/after main, especially of features that interact with the OS or global state, is exempted from stability and portability guarantees and instead only provided on a best-effort basis. Nevertheless bug reports are appreciated.

On the other hand core and alloc are most likely to work in such environments with the caveat that any hookable behavior such as panics, oom handling or allocators will also depend on the compatibility of the hooks.

Some features may also behave differently outside main, e.g. stdio could become unbuffered, some panics might turn into aborts, backtraces might not get symbolicated or similar.

Non-exhaustive list of known limitations:

  • after-main use of thread-locals, which also affects additional features:
  • under UNIX, before main, file descriptors 0, 1, and 2 may be unchanged (they are guaranteed to be open during main, and are opened to /dev/null O_RDWR if they weren’t open on program start)
std
pub mod error
core::error
pub trait Error
where
    Self: Debug + Display,

Error is a trait representing the basic expectations for error values, i.e., values of type E in Result<T, E>.

Errors must describe themselves through the Display and Debug traits. Error messages are typically concise lowercase sentences without trailing punctuation:

let err = "NaN".parse::<u32>().unwrap_err();
assert_eq!(err.to_string(), "invalid digit found in string");

Error source

Errors may provide cause information. Error::source is generally used when errors cross “abstraction boundaries”. If one module must report an error that is caused by an error from a lower-level module, it can allow accessing that error via Error::source(). This makes it possible for the high-level module to provide its own errors while also revealing some of the implementation for debugging.

In error types that wrap an underlying error, the underlying error should be either returned by the outer error’s Error::source(), or rendered by the outer error’s Display implementation, but not both.
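The wrapping guidance above can be sketched as follows. ConfigValueError and parse_value are hypothetical names used only for illustration; the underlying error is returned from source() and is deliberately left out of the Display output.

```rust
use std::error::Error;
use std::fmt;
use std::num::ParseIntError;

/// Hypothetical high-level error that wraps a lower-level parse error.
#[derive(Debug)]
struct ConfigValueError {
    key: String,
    cause: ParseIntError,
}

impl fmt::Display for ConfigValueError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // The cause is exposed through `source()`, so it is not repeated here.
        write!(f, "invalid value for key {}", self.key)
    }
}

impl Error for ConfigValueError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.cause)
    }
}

/// Hypothetical parser that reports failures as `ConfigValueError`.
fn parse_value(key: &str, raw: &str) -> Result<u32, ConfigValueError> {
    raw.parse::<u32>().map_err(|cause| ConfigValueError {
        key: key.to_string(),
        cause,
    })
}
```

A caller that wants the full story can walk the chain by calling source() repeatedly, while a caller that only prints the outer error sees one concise message.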

Example

Implementing the Error trait only requires that Debug and Display are implemented too.

use std::error::Error;
use std::fmt;
use std::path::PathBuf;

#[derive(Debug)]
struct ReadConfigError {
    path: PathBuf
}

impl fmt::Display for ReadConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let path = self.path.display();
        write!(f, "unable to read configuration at {path}")
    }
}

impl Error for ReadConfigError {}
let result: String
alloc::string::String
pub fn with_capacity(capacity: usize) -> String

Creates a new empty String with at least the specified capacity.

Strings have an internal buffer to hold their data. The capacity is the length of that buffer, and can be queried with the [capacity] method. This method creates an empty String, but one with an initial buffer that can hold at least capacity bytes. This is useful when you may be appending a bunch of data to the String, reducing the number of reallocations it needs to do.

If the given capacity is 0, no allocation will occur, and this method is identical to the [new] method.

Examples

let mut s = String::with_capacity(10);

// The String contains no chars, even though it has capacity for more
assert_eq!(s.len(), 0);

// These are all done without reallocating...
let cap = s.capacity();
for _ in 0..10 {
    s.push('a');
}

assert_eq!(s.capacity(), cap);

// ...but this may make the string reallocate
s.push('a');
let _profiler: Profiler
dhat
pub struct Profiler

A type whose lifetime dictates the start and end of profiling.

Profiling starts when the first value of this type is created. Profiling stops when (a) this value is dropped or (b) a dhat assertion fails, whichever comes first. When that happens, profiling data may be written to file, depending on how the Profiler has been configured. Only one Profiler can be running at any point in time.

dhat::Profiler
pub fn builder() -> ProfilerBuilder

Creates a new ProfilerBuilder, which defaults to heap profiling.

dhat::ProfilerBuilder
pub fn testing(self) -> Self

Requests testing mode, which allows the use of dhat::assert! and related macros, and disables saving of profile data on Profiler drop.

Examples

let _profiler = dhat::Profiler::builder().testing().build();
dhat::ProfilerBuilder
pub fn build(self) -> Profiler

Creates a Profiler from the builder and initiates profiling.

Panics

Panics if another Profiler is running.

let stats: HeapStats
dhat
pub struct HeapStats {
    pub total_blocks: u64,
    pub total_bytes: u64,
    pub curr_blocks: usize,
    pub curr_bytes: usize,
    pub max_blocks: usize,
    /* … */
}

Stats from heap profiling.

dhat::HeapStats
pub fn get() -> Self

Gets the current heap stats.

Panics

Panics if called when a Profiler is not running or not doing heap profiling.

std::macros
macro_rules! println

Prints to the standard output, with a newline.

On all platforms, the newline is the LINE FEED character (\n/U+000A) alone (no additional CARRIAGE RETURN (\r/U+000D)).

This macro uses the same syntax as format, but writes to the standard output instead. See [std::fmt] for more information.

The println! macro will lock the standard output on each call. If you call println! within a hot loop, this behavior may be the bottleneck of the loop. To avoid this, lock stdout with io::stdout().lock():

use std::io::{stdout, Write};

let mut lock = stdout().lock();
writeln!(lock, "hello world").unwrap();

Use println! only for the primary output of your program. Use [eprintln!] instead to print error and progress messages.

See the formatting documentation in std::fmt for details of the macro argument syntax.

Panics

Panics if writing to [io::stdout] fails.

Writing to non-blocking stdout can cause an error, which will lead this macro to panic.

Examples

println!(); // prints just a newline
println!("hello there!");
println!("format {} arguments", "some");
let local_variable = "some";
println!("format {local_variable} arguments");
dhat::HeapStats
pub max_blocks: usize

Number of blocks (a.k.a. allocations) allocated at the global peak, i.e. when curr_bytes peaked.

dhat::HeapStats
pub max_bytes: usize

Number of bytes allocated at the global peak, i.e. when curr_bytes peaked.

dhat::HeapStats
pub total_blocks: u64

Number of blocks (a.k.a. allocations) allocated over the entire run.

dhat::HeapStats
pub total_bytes: u64

Number of bytes allocated over the entire run.

core::result::Result
Ok(T)

Contains the success value

fn append_asterisk_if_ascii(target: &str, result: &mut String)
alloc::string::String
pub fn push(&mut self, ch: char)

Appends the given char to the end of this String.

Examples

let mut s = String::from("abc");

s.push('1');
s.push('2');
s.push('3');

assert_eq!("abc123", s);
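Applied to the case study, push gives one plausible body for the append_asterisk_if_ascii signature. Only the signature appears here, so the body below is an assumption based on the problem statement, not necessarily the version the post settles on.

```rust
/// Appends an asterisk to `result` when every character of `target` is ASCII.
fn append_asterisk_if_ascii(target: &str, result: &mut String) {
    if target.is_ascii() {
        // `push` appends a single `char` in place; no new `String` is built.
        result.push('*');
    }
}
```

Because push mutates through the &mut String, there is no temporary String for the asterisk and no reassignment of result.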
fn append_asterisk_if_ascii(target: &str, result: &mut String)
let num_asterisks: usize
let mut result: String
#[cfg]

Valid forms are:

  • #[cfg(predicate)]