Ownership and Borrowing
The memory of a computer program is primarily structured into a stack and a heap. The stack is structured: it stores values in the order it gets them and removes the values in the opposite order ("last in, first out"). During program execution, the stack is used to store function parameters, non-static local variables, function return values and return addresses, or pointers to the heap.
Under the covers, most CPUs maintain a pointer to the top of the stack in one of their internal registers. Pushing data onto the stack is a matter of copying a value into the next available memory slot. Popping data off the stack is simply a matter of adjusting the stack pointer register. Therefore, the stack is fast, but it suffers from one major constraint: all data stored on the stack must have a known, fixed size.
Data with an unknown size at compile time or a size that might change must be stored on the heap instead. The heap is less organized and managed by a memory allocator subroutine. Upon request for a memory block, the memory allocator finds empty memory in the heap, marks it as being in use, and returns a pointer, which is the memory address of the allocated memory. Because a pointer to the heap is a known, fixed size, the pointer can be stored on the stack, but when you want the actual data, the pointer must be followed ("dereferenced").
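As a rough illustration of the fixed-size handle described above, the sketch below compares the stack footprint of two String values of very different lengths. The helper name handle_sizes is made up for this example, and the exact number of bytes depends on the platform, so the code only checks that the two sizes match.

```rust
use std::mem::size_of_val;

/// Hypothetical helper: returns the stack footprint (in bytes) of a short
/// and a long `String`. Only the handle is measured, not the heap buffer.
fn handle_sizes() -> (usize, usize) {
    let short = String::from("hi");
    let long = "x".repeat(10_000);
    (size_of_val(&short), size_of_val(&long))
}

fn main() {
    let (short, long) = handle_sizes();
    // The text itself lives on the heap; the stack only holds a fixed-size
    // handle, so both values occupy the same number of bytes on the stack.
    assert_eq!(short, long);
    println!("stack footprint: {short} bytes for both strings");

    // A `Box<i32>` is another heap allocation reached through a pointer.
    let boxed: Box<i32> = Box::new(7);
    // The pointer must be followed (dereferenced) to read the value.
    println!("boxed value: {}", *boxed);
}
```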
Allocating space on the heap is slower than pushing data onto the stack, because the allocator must first find a big enough space to hold the data and then perform bookkeeping to prepare for the next allocation. The allocator must track outstanding allocations to ensure that they do not overlap and that memory that is no longer used is not leaked (lost; not reusable for the remainder of the program's execution). The heap further suffers from fragmentation, which arises when many small chunks of free memory are interspersed with allocated memory, to the point where large enough chunks of free memory cannot be allocated, leading to "out of memory" errors.
Some languages (Java, C#, Go...) use garbage collection (GC), which regularly reclaims no-longer-used memory as the program runs. Some collectors also move memory blocks around to defragment the free memory into larger spans ("compaction"). GC is memory-safe and automatic, but it has a runtime cost, as it must either temporarily stop the program or run in the background.
In other languages (C, C++...), the programmer must explicitly allocate and free the memory. This is very error-prone: each allocation must be paired with exactly one deallocation. Forgetting to free memory leaks it. Freeing memory too early (leaving dangling pointers, a "use after free" error) or more than once (a "double free" error) corrupts memory and results in undefined behavior, often a program crash.
Rust manages memory through its ownership system instead: the memory is automatically returned once the (one and only one) variable that owns it goes out of scope. We will see below how Rust enforces very strict rules to ensure memory safety without runtime costs.
Ownership
Rust's ownership system ensures automatic memory safety without the need for a garbage collector.
Ownership is a set of rules that the compiler enforces at compile time. If these rules are violated, the program won't compile:
- Each value in Rust has a variable that is its owner.
- There can only be one owner at a time.
- When the owner goes out of scope, the value is dropped (meaning Rust automatically calls a special drop method to deallocate the memory).
The scope of a variable is defined as follows: a variable is valid from the point it is declared until the end of its current scope (usually a block of code enclosed in curly braces { }, typically the body of a function).
The first and third rules ensure automatic memory cleanup (no leaks). The second rule ensures no "double free". The following example illustrates these rules:
fn main() {
    // `s` is not valid here, it is not yet declared.
    { // Start of the scope.
        // When a String is created, it requests memory from the heap.
        let s = String::from("hello");
        // `s` is valid from this point forward.
        println!("{s}");
    } // The scope is now over, and `s` is no longer valid.
    // Rust automatically returns this memory to the allocator.
    // This prevents memory leaks.
    // ERROR: println!("{s}");
}
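To make the third rule observable, the following sketch implements the Drop trait on a small type that records its name in a log when it is dropped. The Tracked type and the drop_order function are hypothetical names introduced only for this illustration; the log uses a RefCell so that several values can append to it.

```rust
use std::cell::RefCell;

/// Hypothetical type that records its name in a shared log when dropped.
struct Tracked<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        // Called automatically when the owner goes out of scope.
        self.log.borrow_mut().push(self.name);
    }
}

/// Returns the events in the order in which they happened.
fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _outer = Tracked { name: "outer", log: &log };
        {
            let _inner = Tracked { name: "inner", log: &log };
        } // `_inner` goes out of scope: its `drop` runs here.
        log.borrow_mut().push("between scopes");
    } // `_outer` goes out of scope: its `drop` runs here.
    log.into_inner()
}

fn main() {
    assert_eq!(drop_order(), ["inner", "between scopes", "outer"]);
}
```

Each value is dropped exactly once, at the closing brace of the scope that owns it, without any explicit call in the code.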
Move Semantics
When you assign a heap-allocated value (like a String) from one variable to another, Rust moves the ownership of the value: the new variable owns the data, and the original variable becomes invalid.
Moving enforces the second ownership rule above: there can only be one owner of a value at a time.
This is very different from what is common in other programming languages: they perform either a "shallow copy" (copying the pointer to the heap data) or a "deep copy" (copying the heap data to another heap location and storing a pointer to the new location in the new variable). Shallow copies are cheap but result in multiple owners of the heap data. Deep copies are safe but expensive, since they involve allocating and copying an arbitrary number of bytes on the heap.
In contrast, Rust's move is like a shallow copy, but with the added rule of invalidating the original owner. It is therefore cheap and safe.
The following example illustrates this concept:
//! This example demonstrates move semantics in Rust.
//!
//! Note that `String` (and all non-`Copy` types) have "move" semantics.

fn main() {
    // The variable `s1` owns the string "hello".
    let s1 = String::from("hello");
    // The value in `s1` is MOVED into `s2`.
    // `s2` now owns the string "hello" and Rust invalidates `s1`.
    // This is NOT a shallow copy or a deep copy.
    let s2 = s1;
    println!("{s2}, world!");
    // ERROR: println!("{s1}, world!"); // `s1` is invalid.
}
// `s2` gets out of scope here, therefore the String it owns is dropped
// (deallocated). `s1` invalidation earlier prevents a "double free" error,
// where two variables might try to deallocate the same memory when they go out
// of scope.
Let's explain what happens behind the scenes: the local variable s1 (on the stack) contains a pointer to the string's heap-allocated data (here, the UTF-8 bytes of the String), along with its length and capacity. During the assignment let s2 = s1, the pointer (together with the length and capacity) is copied into s2 (also on the stack). The heap data is not touched.
Most importantly, the s1 variable is made inaccessible. That means that, during compilation, the Rust compiler makes sure that s1 cannot be referenced by any line of code after the "move" event (or the program would simply not compile). Note that there are no runtime checks, just purely compiler-enforced rules.
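The claim that the heap data is not touched can be checked with a small sketch: record the address of the string buffer before the move and compare it with the address seen through the new owner. The function name buffer_stays_in_place is an assumption made for this example.

```rust
/// Hypothetical helper: moves a `String` and reports whether the heap buffer
/// stayed at the same address.
fn buffer_stays_in_place() -> bool {
    let s1 = String::from("hello");
    let before: *const u8 = s1.as_ptr(); // Address of the heap buffer.
    let s2 = s1; // Move: only the stack handle is copied.
    let after: *const u8 = s2.as_ptr();
    before == after
}

fn main() {
    assert!(buffer_stays_in_place());
    println!("the heap buffer was not copied or relocated by the move");
}
```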
Assignment (of non-Copy values) is not the only event that triggers a move: passing a variable to a function does as well:
// Move semantics are not limited to assignments.
// They apply to function parameters as well.

// This function consumes its String parameter...
fn consume(x: String) {
    println!("{x}");
} // ...then, when `x` goes out of scope, its value is dropped.

// This function consumes, processes, then returns its `String` parameter.
fn consume_and_return(x: String) -> String {
    println!("{x}");
    x
}

fn main() {
    let s1 = String::from("Rust");
    // s1 is moved into the `consume` function:
    consume(s1);
    // `s1` is no longer available:
    // println!("{s1}"); // ERROR: borrow of moved value: `s1`.

    // You can return the value that has been consumed,
    // if you still need it.
    // Most often, however, it is more convenient to borrow the `String`
    // using references (see below).
    let s2 = String::from("Rust");
    let s3 = consume_and_return(s2);
    println!("{s3}");
}
Clone (Deep Copy)
Rust will never automatically create deep copies of your data, because, as described above, it can be expensive.
Instead, you may explicitly request a deep copy by calling the clone method of the std::clone::Clone trait:
/// This example demonstrates the use of the `clone` method
/// to create a deep copy of a `String`.
fn main() {
    // Create a String on the heap.
    let mut s1 = String::from("hello");
    // Clone the String, creating a new String with the same content on the
    // heap.
    let s2 = s1.clone();
    // The implementation of `Clone` for `String` allocates a new memory chunk
    // on the heap, deep copies the pointed-to string buffer of `s1` into
    // it, and stores a pointer to the new heap location.

    // Both `s1` and `s2` remain accessible after the clone.
    println!("{s1}");
    println!("{s2}");
    assert_eq!(s1, s2); // `s1` and `s2` have the same contents...
    // ...but `s1` and `s2` do NOT point to the same heap memory location.
    assert!(!std::ptr::eq(s1.as_ptr(), s2.as_ptr()));
    // `s1` and `s2` are independent of each other. Let's modify `s1`:
    s1.push('!');
    assert_ne!(s1, s2);
}
You can implement the Clone trait for your custom types (structs, enums...) to provide any type-specific behavior necessary to duplicate values safely.
However, you will often simply add the #[derive(Clone)] attribute to have the compiler automagically implement the Clone trait for you.
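The two options can be sketched side by side. The Profile and Label types below are hypothetical: the first relies on #[derive(Clone)], the second spells out by hand roughly what a derived implementation does (clone each field in turn).

```rust
/// Hypothetical struct; `#[derive(Clone)]` clones each field in turn.
#[derive(Clone, Debug, PartialEq)]
struct Profile {
    name: String,
    scores: Vec<u32>,
}

/// Hypothetical struct with a hand-written `Clone` implementation.
#[derive(Debug, PartialEq)]
struct Label {
    text: String,
}

impl Clone for Label {
    fn clone(&self) -> Self {
        // Deep copy the only field.
        Label { text: self.text.clone() }
    }
}

fn main() {
    let original = Profile { name: String::from("Ferris"), scores: vec![1, 2, 3] };
    let mut copy = original.clone();
    // The clone owns its own heap data, so changing it leaves the original intact.
    copy.scores.push(4);
    assert_eq!(original.scores, [1, 2, 3]);
    assert_eq!(copy.scores, [1, 2, 3, 4]);
    assert_eq!(copy.name, original.name);

    let label = Label { text: String::from("draft") };
    let duplicate = label.clone();
    assert_eq!(label, duplicate);
}
```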
Copy Semantics
For stack-only, fixed-size variables (which include integers, floats, bools, chars, tuples thereof, and immutable references), there's no need for "move" semantics, because there's no heap data or requirements for special deallocation logic.
Such stack-only values, and more precisely all types that implement the std::marker::Copy trait, use "Copy Semantics" instead:
When you assign a variable of a Copy type to another, a simple bitwise copy of the value is made, and the original variable remains valid.
The Copy trait is a marker trait, meaning it doesn't have any methods. Types that implement Copy must also implement Clone. A type cannot implement Copy if it or any of its parts implements the Drop trait (since it is used for custom cleanup, like deallocating heap memory).
fn main() {
    // Integers implement the `Copy` trait, so they are bit-wise copied instead
    // of "moved".
    let x = 5; // x is an integer.
    let y = x; // y is a copy of x.
    // Both x and y remain valid:
    println!("x = {x}, y = {y}");

    // It is possible to make a custom type `Copy` by using the `derive`
    // attribute:
    #[derive(Copy, Clone, Debug)]
    struct S(i32);
    // Notes:
    // - `Clone` is a supertrait of `Copy`, so everything which is
    //   `Copy` must also implement `Clone`.
    // - `#[derive(Copy)]` requires that all of the struct's components
    //   implement `Copy`.
    // - You could also implement `Copy` and `Clone` manually with `impl`
    //   blocks.

    let a = S(5); // `a` is a struct that implements `Copy`.
    let b = a; // `b` is a copy of `a`.
    // Both `a` and `b` remain valid:
    println!("a = {a:?}, b = {b:?}");
}
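One way to see which types are Copy is a generic function with a Copy bound, as in the sketch below. The duplicate helper is a made-up name: it uses its argument twice, which only compiles because the bound guarantees that the value is copied rather than moved.

```rust
/// Hypothetical helper: only compiles for types that implement `Copy`,
/// because it uses `x` twice without moving or cloning it.
fn duplicate<T: Copy>(x: T) -> (T, T) {
    (x, x)
}

fn main() {
    assert_eq!(duplicate(5), (5, 5));
    assert_eq!(duplicate('a'), ('a', 'a'));
    assert_eq!(duplicate((1.5, true)), ((1.5, true), (1.5, true)));

    // Shared references are `Copy`, even when the referent is not:
    let s = String::from("hello");
    let (r1, r2) = duplicate(&s);
    assert_eq!(r1, r2);

    // These would not compile:
    // duplicate(String::from("hello")); // ERROR: `String` is not `Copy`.
    // let mut n = 0;
    // duplicate(&mut n); // ERROR: `&mut i32` is not `Copy`.
}
```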
References and Borrowing
A reference in Rust is essentially a pointer (a memory address) to a value in memory, plus additional guarantees that the pointed-to data is valid (while a reference to an object exists, the object cannot be destroyed / dropped).
Crucially, references do not own the value they point to. Creating a reference is called borrowing.
Immutable references (of type &T if T is the base type), also called shared references, provide read-only access to the underlying data:
//! Immutable Reference Example.

fn main() {
    let a = 42;
    // Use the `&` operator to create a reference (i.e. "borrow"):
    let b: &i32 = &a;
    // Use `*` to dereference (follow the pointer):
    let c: i32 = *b;
    println!("c: {c}");
    // Shared references are read-only:
    // *b += 1; // ERROR: cannot assign to `*b`, which is behind a `&`
    // reference.

    // The `.` operator used to retrieve a field from a struct also
    // automatically dereferences:
    struct S {
        field: u32,
    }
    let s: &S = &S { field: 3 };
    let _field: u32 = s.field;
}
Mutable references (of type &mut T), also called exclusive references, allow you to borrow a value and modify it:
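fn main() {
    let mut a: i32 = 1;
    // Create a mutable reference:
    let b: &mut i32 = &mut a;
    // Use `*` to dereference it:
    *b += 1;
    // `b` is no longer used, so `a` can be read again:
    println!("a: {a}"); // `a` is now 2.
}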
Note that if you have a mutable reference to a value, you can have no other simultaneous references to that value.
In other words, you can have either one and only one mutable reference (&mut T) or any number of immutable references (&T) to a particular piece of data at any given point in the program. In effect, references function like a read/write lock:
#[derive(Debug)]
struct MyStruct(bool);

fn main() {
    let mut s1 = MyStruct(true);
    // We can take multiple immutable (shared) references at the same time:
    let ref_s1 = &s1;
    let ref_s2 = &s1;
    // You cannot modify `s1` or obtain a mutable (exclusive) reference to it
    // when holding immutable references.
    // let ref_mut = &mut s1;
    // ERROR: "cannot borrow `s1` as mutable because it is also borrowed as
    // immutable."
    println!("{ref_s1:?} {ref_s2:?}"); // Last use of the `ref_*` variables.

    // However, you can reassign the variable or obtain a mutable reference,
    // once the shared references are no longer in use.
    s1 = MyStruct(false);
    s1.0 = true;
    println!("{s1:?}");
}
This strict rule prevents data races, which occur when:
- Two or more pointers (or references) access the same data concurrently,
- At least one of them is a write,
- There's no synchronization mechanism being used to control the access.
Data races lead to undefined behavior, which can manifest as crashes, incorrect results, or subtle bugs that are hard to track down.
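The sketch below shows how these rules play out with threads. It assumes the scoped threads API of the standard library (std::thread::scope, available since Rust 1.63); the parallel_sum function is a hypothetical helper. Two threads may read the same data through shared references, while the commented-out variant with two concurrent writers is rejected by the compiler.

```rust
use std::thread;

/// Hypothetical helper: sums a slice using two threads that only read it.
fn parallel_sum(data: &[u64]) -> u64 {
    let (left, right) = data.split_at(data.len() / 2);
    // Scoped threads may borrow local data; any number of shared (`&`)
    // borrows can coexist because nobody can write through them.
    thread::scope(|scope| {
        let a = scope.spawn(|| left.iter().sum::<u64>());
        let b = scope.spawn(|| right.iter().sum::<u64>());
        a.join().unwrap() + b.join().unwrap()
    })
}

fn main() {
    let mut values: Vec<u64> = vec![1, 2, 3, 4];
    let total = parallel_sum(&values);
    assert_eq!(total, 10);

    // Writing requires exclusive access, which is available again now that
    // the shared borrows have ended:
    values.push(total);
    assert_eq!(values, [1, 2, 3, 4, 10]);

    // Two threads mutating `values` at once would be rejected at compile time:
    // thread::scope(|scope| {
    //     scope.spawn(|| values.push(1));
    //     scope.spawn(|| values.push(2)); // ERROR: cannot borrow `values` as
    //                                     // mutable more than once at a time.
    // });
}
```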
Borrowing when Calling a Function
We discussed above that passing a variable to a function by value will move or copy it, just as assignment does. To avoid transferring ownership of non-Copy data every time you call a function, you will very often "borrow" the value using references.
The following example shows a function that takes a sample struct by reference (&T), instead of by value (T). While that struct has move semantics, it is not consumed by the function when it is passed by reference. The function borrows, but does not gain ownership of, what it refers to, thus the referred value is not dropped when the function returns:
//! Demonstrates the concept of borrowing in Rust.

// This struct does not implement `Copy` and therefore has "move semantics":
#[derive(Debug)]
struct MyStruct(bool);

// This function takes an (immutable) reference to `MyStruct`.
// We can read but not modify `s`.
fn calculate(s: &MyStruct) {
    println!("{s:?}");
}

fn main() {
    let s1 = MyStruct(true);
    // `ref_s1` is an immutable reference to `s1`.
    // We call the action of creating a reference "borrowing".
    let ref_s1 = &s1;
    // We pass the reference to `calculate`:
    calculate(ref_s1);
    // `s` goes out of scope at the end of the `calculate` function.
    // Because this function does not have ownership of what it refers
    // to, `s1` is _not_ dropped and remains valid after borrowing:
    println!("{s1:?}");
    // Immutable references are `Copy`, thus `ref_s1` is also still valid:
    println!("{ref_s1:?}");
}
The same applies to mutable references:
/// This function takes a mutable reference `&mut` to a `String`
/// and appends ", world" to it.
fn change(some_string: &mut String) {
    some_string.push_str(", world"); // Modifies the string in place.
    println!("{some_string}");
}

fn main() {
    let mut s = String::from("hello"); // The `mut` keyword is required.
    // Create a mutable reference to `s`:
    let ref_mut1 = &mut s;
    // You can pass the string by mutable reference to a function and modify it.
    change(ref_mut1);
    // While the `String` type has "move semantics", the function does not
    // consume the string, since the mutable reference does not own it.

    // You cannot create other mutable or immutable references while the
    // exclusive reference is in use:
    // let ref_mut2 = &mut s;
    // ERROR: cannot borrow `s` as mutable more than once at a time.
    // let ref1 = &s;
    // ERROR: cannot borrow `s` as immutable because it is also borrowed as
    // mutable.
    println!("{ref_mut1}");
    // The original data can be borrowed again only after the mutable reference
    // has been used for the last time.
    let _ref2 = &s;
}
Related Topics
- Lifetimes.
- Rust Patterns.