Hashing
Hashing is a process that converts input data of any size into a "digest": a fixed-size value, typically displayed as a string of hexadecimal digits, also known as a hash value or hash code. Hash functions are designed to be fast and deterministic (the same input always produces the same output) and to make collisions, where two distinct inputs produce the same hash, rare. Because the output has a fixed size, collisions cannot be ruled out entirely; cryptographic hash functions are designed so that finding one is computationally infeasible. Cryptographic hash functions are also one-way: given only a hash value, it is infeasible to recover the original input.
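As a minimal sketch of the "deterministic" and "fixed-size" properties, the following uses only the standard library's `DefaultHasher`, which is a non-cryptographic hasher. The helper name `hash_of` is an illustrative assumption, not part of any crate mentioned here.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical helper: hashes any `Hash` value to a 64-bit value.
/// `DefaultHasher::new()` starts from fixed keys, so results repeat
/// within one build of the program.
fn hash_of<T: Hash + ?Sized>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    let short = "a";
    let long = "a".repeat(10_000);

    // Deterministic: the same input hashes to the same value.
    assert_eq!(hash_of(short), hash_of(short));

    // Fixed size: both outputs are 64-bit values (16 hex digits),
    // regardless of the input length.
    println!("{:016x}", hash_of(short));
    println!("{:016x}", hash_of(long.as_str()));
}
```

The exact values printed are not specified by the standard library and may change between Rust releases, so they should not be stored or sent over the network.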
Hashes are widely used in various applications such as:
- Hash Maps: Efficient data retrieval.
- Data Integrity: Ensuring that data has not been altered by comparing the hash value of the original data with the hash value of the received data.
- Password Storage: Storing hashed versions of passwords rather than plain text to enhance security.
- Digital Signatures: Verifying the authenticity and integrity of messages or documents.
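The data-integrity use case above can be sketched as follows. This is only an illustration of the compare-the-digests idea: `checksum` and `is_intact` are hypothetical helper names, and the standard library's non-cryptographic `DefaultHasher` stands in for a real cryptographic hash such as SHA-256 or BLAKE3, which is what a real integrity check should use.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Hypothetical helper: a 64-bit checksum of a byte slice.
/// NOT cryptographic; it stands in for SHA-256 or BLAKE3 here.
fn checksum(data: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(data);
    hasher.finish()
}

/// Returns true when the received bytes hash to the expected value.
fn is_intact(received: &[u8], expected: u64) -> bool {
    checksum(received) == expected
}

fn main() {
    // The sender computes and publishes the checksum of the original data.
    let original = b"transfer 100 to alice";
    let expected = checksum(original);

    // The receiver recomputes the checksum over what actually arrived.
    let unchanged = b"transfer 100 to alice";
    let tampered = b"transfer 900 to alice";

    println!("unchanged intact: {}", is_intact(unchanged, expected));
    println!("tampered intact: {}", is_intact(tampered, expected));
}
```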
Use the following crates for general-purpose and cryptographic hashing:
- blake3 and sha2 for general-purpose hashing.
- ahash for use in in-memory hash maps.
- rustc-hash for fast, non-cryptographic hashing.
- murmur3, a non-cryptographic hash function suitable for general hash-based lookup.
- fnv, the Fowler–Noll–Vo hash function, which is more efficient for smaller hash keys.
- hashbrown, a Rust port of Google's high-performance SwissTable hash map, adapted to make it a drop-in replacement for Rust's standard HashMap and HashSet types.
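To illustrate how a non-cryptographic hasher plugs into the standard `HashMap`, here is a hand-written 64-bit FNV-1a hasher using only the standard library. It is a sketch for illustration, not the implementation from the fnv crate; the names `Fnv1a` and `FnvMap` are assumptions made for this example.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// Hand-written 64-bit FNV-1a, for illustration only.
struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        // 64-bit FNV offset basis.
        Fnv1a(0xcbf2_9ce4_8422_2325)
    }
}

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for &byte in bytes {
            // FNV-1a: XOR the byte in first, then multiply by the FNV prime.
            self.0 ^= u64::from(byte);
            self.0 = self.0.wrapping_mul(0x0000_0100_0000_01b3);
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// A `HashMap` that uses the hasher above instead of the default SipHash.
type FnvMap<K, V> = HashMap<K, V, BuildHasherDefault<Fnv1a>>;

fn main() {
    let mut ports: FnvMap<&str, u16> = FnvMap::default();
    ports.insert("http", 80);
    ports.insert("https", 443);
    println!("https -> {:?}", ports.get("https")); // prints "https -> Some(443)"
}
```

A hasher like this is unseeded, so it offers no protection against deliberately colliding keys; it is a reasonable trade-off only when the keys are not attacker-controlled.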
You may also use crc or crc32fast for CRC checksums.
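For intuition about what a CRC checksum computes, the following is a minimal bit-by-bit CRC-32 (the common IEEE 802.3 variant) written against the standard library only. It is a sketch rather than the API of the crc or crc32fast crates, which use much faster table-driven or SIMD implementations; the function name `crc32` is chosen here for illustration.

```rust
/// Bitwise CRC-32 (IEEE 802.3), using the reflected polynomial 0xEDB88320.
fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // `mask` is all ones when the low bit is set, otherwise zero.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

fn main() {
    // "123456789" is the conventional check input for CRC algorithms.
    println!("{:08x}", crc32(b"123456789")); // prints "cbf43926"
}
```

A CRC detects accidental corruption well but provides no protection against deliberate tampering, since it is easy to construct data with a chosen CRC.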
TODO distinguish between general-purpose / fast / DoS-resistant / crypto hashing.
Calculate the SHA-256 Digest of a File
SHA-256 (Secure Hash Algorithm 256-bit) is part of the SHA-2 family of cryptographic hash functions. It produces a fixed-size 256-bit (32-byte) hash value from input data of any size, usually written as 64 hexadecimal characters. SHA-256 is widely used in applications such as digital signatures, certificate generation, and data integrity verification.
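The relationship between digest size and printed length can be checked with a small standard-library sketch. The helper `to_hex` is a hypothetical name, and the byte array is a placeholder standing in for a real 256-bit digest, not an actual SHA-256 output.

```rust
use std::fmt::Write;

/// Encodes bytes as lowercase hexadecimal, two characters per byte.
fn to_hex(bytes: &[u8]) -> String {
    let mut out = String::with_capacity(bytes.len() * 2);
    for byte in bytes {
        write!(out, "{byte:02x}").expect("writing to a String cannot fail");
    }
    out
}

fn main() {
    // Placeholder bytes standing in for a real 256-bit digest.
    let digest = [0xabu8; 32];
    let hex = to_hex(&digest);
    // prints "256 bits -> 64 hex characters"
    println!("{} bits -> {} hex characters", digest.len() * 8, hex.len());
}
```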
The following example writes some data to a file, then calculates the SHA-256 digest::Digest of the file's contents using digest::Context from the ring crate.
```rust
use std::fs;
use std::fs::File;
use std::io::BufReader;
use std::io::Read;
use std::io::Write;

use anyhow::Result;
use data_encoding::HEXUPPER;
use ring::digest::Context;
use ring::digest::Digest;
use ring::digest::SHA256;

/// Calculates the SHA-256 digest of the data read from the given reader.
fn sha256_digest<R: Read>(mut reader: R) -> Result<Digest> {
    let mut context = Context::new(&SHA256);
    let mut buffer = [0; 1024];

    loop {
        let count = reader.read(&mut buffer)?;
        if count == 0 {
            break;
        }
        context.update(&buffer[..count]);
    }

    Ok(context.finish())
}

fn main() -> Result<()> {
    if !fs::exists("temp")? {
        fs::create_dir("temp")?;
    }
    let path = "temp/file.txt";

    let mut output = File::create(path)?;
    write!(output, "We will generate a digest of this text")?;

    let input = File::open(path)?;
    let reader = BufReader::new(input);
    let digest = sha256_digest(reader)?;

    println!("SHA-256 digest is {}", HEXUPPER.encode(digest.as_ref()));
    Ok(())
}
```
Use General-purpose Hashing Algorithms
For more algorithms, see Rust Crypto Hashes: sha2, sha1, md-5
Hash with blake3
blake3 implements the BLAKE3 hash function. BLAKE3 is a cryptographic hash function that is much faster than MD5, SHA-1, SHA-2, and SHA-3 and, unlike MD5 and SHA-1, is considered secure. It is designed to take advantage of parallel processing (SIMD and multithreading). BLAKE3 can produce output of arbitrary length, from short digests to longer ones, which is useful for applications such as key derivation. BLAKE3 is not a password hashing algorithm, however; use a dedicated algorithm such as Argon2 for passwords. BLAKE3 also allows incremental hashing, i.e. updating the hash state with new data without recomputing the entire hash. This is useful for streaming data or for input that arrives in chunks.
```rust
use blake3::Hasher;

/// This example demonstrates various ways to use the blake3 hashing library in
/// Rust.
fn main() {
    // Example 1: Hashing a simple string.
    // We use the `blake3::hash` function to compute the hash of a string.
    // The `as_bytes()` method converts the string to a byte slice.
    let input_string = "Hello, world!";
    let hash = blake3::hash(input_string.as_bytes());
    println!("Hash of '{input_string}': {hash}");

    // Example 2: Incremental hashing.
    // We create a new `Hasher` instance and update it with multiple byte
    // slices. The `finalize()` method computes the final hash.
    let mut hasher = Hasher::new();
    hasher.update(b"The quick brown ");
    hasher.update(b"fox jumps over ");
    hasher.update(b"the lazy dog.");
    let hash2 = hasher.finalize();
    println!("Incremental hash: {hash2}");

    // Example 3: Hashing a larger byte array.
    let large_data: Vec<u8> = (0..1024).map(|i| (i % 256) as u8).collect(); // Example 1KB data
    let hash3 = blake3::hash(&large_data);
    println!("Hash of 1KB data: {hash3}");

    // Example 4: Using a 32-byte key for keyed hashing (a MAC).
    let key: &[u8; 32] = b"mysecretkey__________-----------";
    let mut hasher_keyed = blake3::Hasher::new_keyed(key);
    hasher_keyed.update(b"Message to be keyed hashed");
    let keyed_hash = hasher_keyed.finalize();
    // OR, as a one-shot call:
    let _mac = blake3::keyed_hash(key, b"foo");
    println!("Keyed hash: {keyed_hash}");

    // Example 5: Deriving a key using a context string:
    let context = "My application context";
    // Given cryptographic key material of any length and a context string of
    // any length, `derive_key` outputs a 32-byte derived subkey.
    // The context string should be hardcoded, globally unique, and
    // application-specific. A good default format for such strings is
    // "[application] [commit timestamp] [purpose]", e.g., "example.com
    // 2019-12-25 16:18:03 session tokens v1".
    let derived_key = blake3::derive_key(context, b"Input key material");
    println!("Derived Key: {derived_key:?}");

    // Example 6: Extended output.
    let mut output = [0u8; 1000];
    let hasher = blake3::Hasher::new();
    // Finalize the hash state and return an OutputReader, which can supply any
    // number of output bytes.
    let mut output_reader = hasher.finalize_xof();
    output_reader.fill(&mut output);
    // OutputReader also implements Read and Seek.
    println!("Output: {output:x?}");
}
```
Hash with sha2
SHA-2 (Secure Hash Algorithm 2) is a family of cryptographic hash functions designed by the National Security Agency (NSA) and standardized by NIST.
sha2 is a pure Rust implementation of the SHA-2 hash function family, including SHA-224, SHA-256, SHA-384, and SHA-512. SHA-256 is the most commonly used variant.
```rust
// Constant-Time Base64 encoding:
use base64ct::Base64;
use base64ct::Encoding;
// Convenience wrapper trait covering functionality of
// cryptographic hash functions with fixed output size.
use sha2::Digest;
// SHA-256 hasher.
use sha2::Sha256;
// SHA-512 hasher.
use sha2::Sha512;

fn main() -> anyhow::Result<()> {
    // If a complete message is available, then we can use the convenience
    // `Digest::digest` method:
    let hash1 = Sha256::digest(b"my message");
    // Print the hash as a hexadecimal string:
    println!("SHA-256 hash #1: {hash1:x}");

    // Otherwise, create a Sha256 hasher:
    let mut hasher = Sha256::new();
    // Add input data:
    let data = b"hello world";
    hasher.update(data);
    // `update` can be called repeatedly and is generic over `AsRef<[u8]>`:
    hasher.update("String data");
    // Read hash digest and consume hasher:
    let hash2 = hasher.finalize();
    println!("SHA-256 hash #2: {hash2:x}");

    // Same exercise, but using `Sha512` and `chain_update`:
    let hash3 = Sha512::new()
        .chain_update(b"Hello world!")
        // `chain_update` can be called repeatedly and is generic over `AsRef<[u8]>`.
        .chain_update("String data")
        .finalize();
    let base64_hash = Base64::encode_string(&hash3);
    println!("Base64-encoded hash #3: {base64_hash}");

    // Hash the contents of a file:
    // First, we will create a file inside of `env::temp_dir()`.
    let mut file = tempfile::tempfile()?;
    use std::io::Seek;
    use std::io::Write;
    writeln!(file, "Some data")?;
    // Move the cursor back to the start of the file; otherwise the copy
    // below would read nothing and we would hash an empty input.
    file.rewind()?;
    // or: let mut file = fs::File::open(&some_path)?;

    // Copies the entire contents of a reader into a writer,
    // in this case the hasher.
    let mut hasher = Sha256::new();
    std::io::copy(&mut file, &mut hasher)?;
    let hash4 = hasher.finalize();
    // Constant-time conversion to hexadecimal:
    let hex_hash = base16ct::lower::encode_string(&hash4);
    println!("Hex-encoded hash #4: {hex_hash}");
    Ok(())
}
```
foldhash
A fast, non-cryptographic, minimally DoS-resistant hashing algorithm.
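The idea behind DoS resistance in a hash-map hasher is that the hash depends on a secret, randomly chosen seed, so an attacker cannot precompute many keys that collide. The sketch below illustrates that idea with the standard library's `RandomState` (the default SipHash-based hasher factory), not with foldhash itself; `seeded_hash` is a hypothetical helper name.

```rust
use std::collections::hash_map::RandomState;
use std::hash::BuildHasher;

/// Hypothetical helper: hashes `key` with the given seeded hasher factory.
fn seeded_hash(state: &RandomState, key: &str) -> u64 {
    state.hash_one(key)
}

fn main() {
    // Each `RandomState` carries its own randomly chosen keys.
    let a = RandomState::new();
    let b = RandomState::new();

    // Same state, same key: identical hash, so map lookups still work.
    assert_eq!(seeded_hash(&a, "user:42"), seeded_hash(&a, "user:42"));

    // Different states will almost certainly disagree on the same key,
    // which is what makes collisions hard to predict from the outside.
    println!("a: {:016x}", seeded_hash(&a, "user:42"));
    println!("b: {:016x}", seeded_hash(&b, "user:42"));
}
```

`BuildHasher::hash_one` requires a reasonably recent Rust toolchain (1.71 or later).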
Legacy Hashing Algorithms
For legacy applications, you may consider using the following hashing algorithms. Note that these algorithms are considered weak by modern cryptographic standards and should not be used for security-sensitive applications.
Hash with sha1
sha1 implements the SHA-1 hash function.
```rust
use hex_literal::hex;
use sha1::Digest;
use sha1::Sha1;

// BEWARE: SHA-1 is considered cryptographically broken and should NOT be used
// for new security-critical applications. It is primarily used for legacy
// compatibility or non-security-sensitive purposes.
fn main() {
    // Create a SHA-1 hasher:
    let mut hasher = Sha1::new();

    // Process input message:
    hasher.update(b"hello world");

    // Compute the hash digest:
    let result = hasher.finalize();

    // Assert the expected hash value:
    assert_eq!(result[..], hex!("2aae6c35c94fcfb415dbe95f408b9ce91ee846ed"));
    println!("SHA-1 hash of 'hello world': {result:x}");
}
// Example adapted from <https://docs.rs/sha1/0.10.6/sha1/index.html>.
```
Hash with md-5
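md-5 implements the MD5 hash function. Note that the example that follows calls md5::compute, which comes from the separate md5 crate, as its comments explain.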
```rust
//! MD5 hashing example.
//!
//! WARNING: MD5 should be considered cryptographically broken and unsuitable
//! for further use. Collision attacks against MD5 are both practical and
//! trivial.
//!
//! The `md5` crate does not implement the `digest` traits, so it is not
//! interoperable with the RustCrypto ecosystem.

fn main() {
    // Input data:
    let data = "hello world";

    // Compute MD5 hash:
    let digest = md5::compute(data);

    // Print the hash as a hexadecimal string:
    println!("MD5 hash of '{data}': {digest:x}");
}
```
Cryptographic Algorithms
Use ring, rust-crypto, sha2. Choose carefully based on security needs and audit history.
- argon2, scrypt, bcrypt for password hashing,
- aes-gcm-siv, aes-gcm, and chacha20poly1305 for AEAD encryption,
- rsa for RSA,
- ed25519, ecdsa, dsa for digital signatures,
- der, pem-rfc7468, pkcs8, x509-cert for certificates.
Related Topics
- Algorithms.
- Data Structures.