Hashing

Hashing converts input data of any size into a fixed-size value called a "digest" (also known as a hash value or hash code), usually displayed as a string of hexadecimal characters. Hash functions are designed to be fast and deterministic (the same input always produces the same output), and to make collisions, where two distinct inputs produce the same hash, rare. Collisions cannot be ruled out entirely, because there are more possible inputs than outputs. Cryptographic hash functions are additionally one-way: it is computationally infeasible to recover the original input from the hash value.
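
Determinism and fixed output size can be observed with the standard library alone. The following is a minimal sketch using std's DefaultHasher, a non-cryptographic hasher whose exact algorithm is unspecified and may change between Rust releases. The `hash_of` helper is a hypothetical name introduced for this illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hash;
use std::hash::Hasher;

/// Hypothetical helper: hashes any `Hash` value into a 64-bit hash code.
fn hash_of<T: Hash + ?Sized>(value: &T) -> u64 {
    // `DefaultHasher::new()` always starts from the same state, so the
    // result is deterministic within one build of the program.
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    let short = hash_of("a");
    let long = hash_of(&"a".repeat(10_000));
    // Inputs of very different sizes both map to a fixed-size (64-bit) value.
    println!("short input: {short:016x}");
    println!("long input:  {long:016x}");
    // The same input always produces the same output.
    assert_eq!(hash_of("hello"), hash_of("hello"));
}
```

A 64-bit hash code like this is suitable for hash tables, not for security purposes.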

Hashes are widely used in various applications such as:

  • Hash Maps: Efficient data retrieval.
  • Data Integrity: Ensuring that data has not been altered by comparing the hash value of the original data with the hash value of the received data.
  • Password Storage: Storing hashed versions of passwords rather than plain text to enhance security.
  • Digital Signatures: Verifying the authenticity and integrity of messages or documents.

Use the following crates for cryptographic and non-cryptographic hashing:

  • blake3 and sha2 for general-purpose cryptographic hashing.
  • ahash for fast hashing in in-memory hash maps.
  • rustc-hash for fast, non-cryptographic hashing.
  • murmur3 for non-cryptographic hashing suitable for general hash-based lookup.
  • fnv for the Fowler–Noll–Vo hash function, which is efficient for small keys.
  • hashbrown, a Rust port of Google's high-performance SwissTable hash map, designed as a drop-in replacement for Rust's standard HashMap and HashSet types.
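
Hasher crates such as fnv, ahash, and rustc-hash plug into the standard HashMap through the `Hasher` and `BuildHasher` traits. The sketch below shows that mechanism with a hand-written 64-bit FNV-1a hasher. The `Fnv1a` type, the `FnvMap` alias, and the `fnv1a` function are hypothetical names for this illustration. In practice, the fnv crate supplies an equivalent, tested hasher.

```rust
use std::collections::HashMap;
use std::hash::BuildHasherDefault;
use std::hash::Hasher;

/// Illustrative 64-bit FNV-1a hasher (hypothetical type, not the `fnv` crate).
struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        // FNV-1a 64-bit offset basis.
        Fnv1a(0xcbf2_9ce4_8422_2325)
    }
}

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for &byte in bytes {
            // XOR the byte in, then multiply by the 64-bit FNV prime.
            self.0 ^= u64::from(byte);
            self.0 = self.0.wrapping_mul(0x0000_0100_0000_01b3);
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// A standard `HashMap` that uses the hasher above instead of the default.
type FnvMap<K, V> = HashMap<K, V, BuildHasherDefault<Fnv1a>>;

/// Hashes a byte slice directly with FNV-1a.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut hasher = Fnv1a::default();
    hasher.write(bytes);
    hasher.finish()
}

fn main() {
    let mut map: FnvMap<&str, u32> = FnvMap::default();
    map.insert("one", 1);
    map.insert("two", 2);
    println!("two -> {:?}", map.get("two"));
    println!("FNV-1a of \"a\": {:016x}", fnv1a(b"a"));
}
```

A fixed-seed hasher like this trades DoS resistance for speed, so it is best reserved for keys that do not come from untrusted input.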

You may also use crc or crc32fast for CRC checksums.
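
To illustrate what a CRC checksum computes, the following is a minimal, unoptimized sketch of CRC-32 (the IEEE variant used by zip and PNG), written with the standard library only. The `crc32` function here is a hypothetical helper, not the API of the crc or crc32fast crates, which should be preferred in real code. A CRC detects accidental corruption; it offers no protection against deliberate tampering.

```rust
/// Bitwise CRC-32 (IEEE, reflected polynomial 0xEDB88320).
fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFF_u32;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // If the low bit is set, shift and XOR in the polynomial;
            // otherwise just shift.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

fn main() {
    let sent = b"important payload".to_vec();
    let checksum = crc32(&sent);

    // Simulate corruption in transit by flipping one bit.
    let mut received = sent.clone();
    received[3] ^= 0x01;

    println!("checksum of sent data: {checksum:08x}");
    println!("intact copy matches: {}", crc32(&sent) == checksum);
    println!("corrupted copy matches: {}", crc32(&received) == checksum);
}
```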

TODO: distinguish between general-purpose, fast, DoS-resistant, and cryptographic hashing.

Cryptographic hash function

Calculate the SHA-256 Digest of a File

SHA-256 (Secure Hash Algorithm 256-bit) is part of the SHA-2 family of cryptographic hash functions. It produces a fixed-size 256-bit hash value (32 bytes, usually written as 64 hexadecimal characters) from input data of any size. SHA-256 is widely used in applications such as digital signatures, certificate generation, and data integrity verification.
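
The size arithmetic is worth spelling out: 256 bits are 32 bytes, and each byte is written as two hexadecimal characters, giving 64 characters. The sketch below checks this with a hypothetical `to_hex` helper and a 32-byte placeholder value standing in for a real digest. In practice, a crate such as data-encoding performs the encoding.

```rust
/// Hypothetical helper: encodes bytes as lowercase hexadecimal.
fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|byte| format!("{byte:02x}")).collect()
}

fn main() {
    // A stand-in for a SHA-256 digest: any 32-byte value has the same length.
    let digest_sized = [0xab_u8; 32];
    let hex = to_hex(&digest_sized);
    println!(
        "{} bits -> {} bytes -> {} hex characters",
        digest_sized.len() * 8,
        digest_sized.len(),
        hex.len()
    );
}
```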

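The following example writes some data to a file, then calculates the SHA-256 digest (a ring::digest::Digest) of the file's contents using ring::digest::Context. The digest is printed as uppercase hexadecimal using data-encoding.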

use std::fs;
use std::fs::File;
use std::io::BufReader;
use std::io::Read;
use std::io::Write;

use anyhow::Result;
use data_encoding::HEXUPPER;
use ring::digest::Context;
use ring::digest::Digest;
use ring::digest::SHA256;

/// Calculates the SHA-256 digest of the data read from the given reader.
fn sha256_digest<R: Read>(mut reader: R) -> Result<Digest> {
    let mut context = Context::new(&SHA256);
    let mut buffer = [0; 1024];

    loop {
        let count = reader.read(&mut buffer)?;
        if count == 0 {
            break;
        }
        context.update(&buffer[..count]);
    }

    Ok(context.finish())
}

fn main() -> Result<()> {
    // Create the output directory if it does not already exist.
    fs::create_dir_all("temp")?;
    let path = "temp/file.txt";

    let mut output = File::create(path)?;
    write!(output, "We will generate a digest of this text")?;

    let input = File::open(path)?;
    let reader = BufReader::new(input);
    let digest = sha256_digest(reader)?;

    println!("SHA-256 digest is {}", HEXUPPER.encode(digest.as_ref()));

    Ok(())
}

Use General-purpose Hashing Algorithms

For more algorithms, see the RustCrypto hashes collection, which includes sha2, sha1, and md-5.

Hash with blake3

blake3 implements the BLAKE3 cryptographic hash function. According to its authors, BLAKE3 is much faster than MD5, SHA-1, SHA-2, and SHA-3 while remaining secure (unlike MD5 and SHA-1), and it is designed to take advantage of SIMD instructions and multithreading. Its default output is 32 bytes, but it can produce output of arbitrary length, which is useful for applications such as key derivation. BLAKE3 is not a password hashing algorithm, however; use a dedicated algorithm such as Argon2 for passwords. BLAKE3 also supports incremental hashing, i.e. updating the hash state with new data without recomputing the entire hash, which is useful for streaming data or input received in chunks.

use blake3::Hasher;

/// This example demonstrates various ways to use the blake3 hashing library in
/// Rust.
fn main() {
    // Example 1: Hashing a simple string.
    // We use the `blake3::hash` function to compute the hash of a string.
    // The `as_bytes()` method converts the string to a byte slice.
    let input_string = "Hello, world!";
    let hash = blake3::hash(input_string.as_bytes());
    println!("Hash of '{input_string}': {hash}");

    // Example 2: Incremental hashing.
    // We create a new `Hasher` instance and update it with multiple byte
    // slices. The `finalize()` method computes the final hash.
    let mut hasher = Hasher::new();
    hasher.update(b"The quick brown ");
    hasher.update(b"fox jumps over ");
    hasher.update(b"the lazy dog.");
    let hash2 = hasher.finalize();
    println!("Incremental hash: {hash2}");

    // Example 3: Hashing a larger byte array.
    let large_data: Vec<u8> = (0..1024).map(|i| (i % 256) as u8).collect(); // Example 1KB data
    let hash3 = blake3::hash(&large_data);
    println!("Hash of 1KB data: {hash3}");

    // Example 4: Using a 32-byte key for keyed hashing (a MAC).
    let key: &[u8; 32] = b"mysecretkey__________-----------";
    let mut hasher_keyed = blake3::Hasher::new_keyed(key);
    hasher_keyed.update(b"Message to be keyed hashed");
    let keyed_hash = hasher_keyed.finalize();
    // OR let mac = blake3::keyed_hash(key, b"foo");
    println!("Keyed hash: {keyed_hash}");

    // Example 5: Deriving a key using a context string:
    let context = "My application context";
    // Given cryptographic key material of any length and a context string of
    // any length, `derive_key` outputs a 32-byte derived subkey.
    // The context string should be hardcoded, globally unique, and
    // application-specific. A good default format for such strings is
    // "[application] [commit timestamp] [purpose]", e.g., "example.com
    // 2019-12-25 16:18:03 session tokens v1".
    let derived_key = blake3::derive_key(context, b"Input key material");
    println!("Derived Key: {derived_key:?}");

    // Example 6: Extended output (XOF).
    let mut output = [0u8; 1000];
    let mut hasher = blake3::Hasher::new();
    hasher.update(b"Input for extended output");
    // Finalize the hash state and return an `OutputReader`, which can supply
    // any number of output bytes.
    let mut output_reader = hasher.finalize_xof();
    output_reader.fill(&mut output); // `OutputReader` also implements `Read` and `Seek`.
    println!("Output: {output:x?}");
}

Hash with sha2

SHA-2 (Secure Hash Algorithm 2) is a family of cryptographic hash functions designed by the National Security Agency (NSA) and standardized by NIST.

sha2 is a pure Rust implementation of the SHA-2 hash function family, including SHA-224, SHA-256, SHA-384, and SHA-512. SHA-256 is the most commonly used variant.

// Constant-Time Base64 encoding:
use base64ct::Base64;
use base64ct::Encoding;
// Convenience wrapper trait covering functionality of
// cryptographic hash functions with fixed output size.
use sha2::Digest;
// SHA-256 hasher.
use sha2::Sha256;
// SHA-512 hasher.
use sha2::Sha512;

fn main() -> anyhow::Result<()> {
    // If a complete message is available, then we can use the convenience
    // `Digest::digest` method:
    let hash1 = Sha256::digest(b"my message");
    // Print the hash as a hexadecimal string:
    println!("SHA-256 hash #1: {hash1:x}");

    // Otherwise, create a Sha256 hasher:
    let mut hasher = Sha256::new();

    // Add input data:
    let data = b"hello world";
    hasher.update(data);

    // `update` can be called repeatedly and is generic over `AsRef<[u8]>`:
    hasher.update("String data");

    // Read hash digest and consume hasher:
    let hash2 = hasher.finalize();
    println!("SHA-256 hash #2: {hash2:x}");

    // Same exercise, but using `Sha512` and `chain_update`:
    let hash3 = Sha512::new()
        .chain_update(b"Hello world!")
        // `chain_update` can be called repeatedly and is generic over `AsRef<[u8]>`.
        .chain_update("String data")
        .finalize();

    let base64_hash = Base64::encode_string(&hash3);
    println!("Base64-encoded hash #3: {base64_hash}");

    // Hash the contents of a file:
    // First, create a temporary file (inside of `env::temp_dir()`).
    use std::io::Seek;
    use std::io::Write;
    let mut file = tempfile::tempfile()?;
    writeln!(file, "Some data")?;
    // Move the cursor back to the start of the file; otherwise `copy` below
    // would read nothing and the hash would be that of an empty input.
    file.rewind()?;
    // or: let mut file = fs::File::open(&some_path)?;

    // `io::copy` copies the entire contents of a reader into a writer,
    // in this case the hasher.
    let mut hasher = Sha256::new();

    std::io::copy(&mut file, &mut hasher)?;
    let hash4 = hasher.finalize();
    // Constant-time conversion to hexadecimal:
    let hex_hash = base16ct::lower::encode_string(&hash4);
    println!("Hex-encoded hash #4: {hex_hash}");

    Ok(())
}

foldhash

foldhash is a fast, non-cryptographic, minimally DoS-resistant hashing algorithm, intended for in-memory uses such as hash maps.
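
DoS resistance in this context generally means that the hash is seeded with a value an attacker cannot predict, so keys cannot be precomputed to collide in a hash map. The sketch below illustrates the idea with the standard library's RandomState rather than with foldhash itself, whose API is not shown here. The `seeded_hash` helper is a hypothetical name for this illustration.

```rust
use std::collections::hash_map::RandomState;
use std::hash::BuildHasher;

/// Hypothetical helper: hashes `key` under the seed held by `state`.
fn seeded_hash(state: &RandomState, key: &str) -> u64 {
    state.hash_one(key)
}

fn main() {
    // Each `RandomState` carries its own randomly chosen keys.
    let state_a = RandomState::new();
    let state_b = RandomState::new();

    // Within one state, hashing is deterministic...
    assert_eq!(seeded_hash(&state_a, "key"), seeded_hash(&state_a, "key"));

    // ...but the same key hashes differently under another seed, so an
    // attacker cannot predict which keys will collide.
    println!("state A: {:016x}", seeded_hash(&state_a, "key"));
    println!("state B: {:016x}", seeded_hash(&state_b, "key"));
}
```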

Legacy Hashing Algorithms

For legacy applications, you may consider using the following hashing algorithms. Note that these algorithms are considered weak by modern cryptographic standards and should not be used for security-sensitive applications.

Hash with sha1

sha1 implements the SHA-1 hash function.

use hex_literal::hex;
use sha1::Digest;
use sha1::Sha1;

// BEWARE: SHA-1 is considered cryptographically broken and should NOT be used
// for new security-critical applications. It is primarily used for legacy
// compatibility or non-security-sensitive purposes.

fn main() {
    // Create a SHA-1 hasher:
    let mut hasher = Sha1::new();

    // Process input message:
    hasher.update(b"hello world");

    // Compute the hash digest:
    let result = hasher.finalize();

    // Assert the expected hash value:
    assert_eq!(result[..], hex!("2aae6c35c94fcfb415dbe95f408b9ce91ee846ed"));

    println!("SHA-1 hash of 'hello world': {result:x}");
}
// Example adapted from <https://docs.rs/sha1/0.10.6/sha1/index.html>.

Hash with md-5

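md-5 is the RustCrypto implementation of the MD5 hash function. The example below instead uses the separate md5 crate and its `compute` function.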

//! MD5 hashing example.
//!
//! WARNING: MD5 should be considered cryptographically broken and unsuitable
//! for further use. Collision attacks against MD5 are both practical and
//! trivial.
//!
//! The `md5` crate does not implement the `digest` traits, so it is not
//! interoperable with the RustCrypto ecosystem.
fn main() {
    // Input data:
    let data = "hello world";

    // Compute MD5 hash:
    let digest = md5::compute(data);

    // Print the hash as a hexadecimal string:
    println!("MD5 hash of '{data}': {digest:x}");
}

Cryptographic Algorithms

Use ring or the RustCrypto crates (such as sha2). Choose carefully based on security needs and audit history.

  • argon2, scrypt, bcrypt for password hashing.
  • aes-gcm-siv, aes-gcm, and chacha20poly1305 for AEAD encryption.
  • rsa for RSA.
  • ed25519, ecdsa, dsa for digital signatures.
  • der, pem-rfc7468, pkcs8, x509-cert for certificates.