Bit Arrays
Recipe | Crates | Categories |
---|---|---|
Work with C-style Flags using bitflags | ||
Work with C-style Flags using flagset | ||
Create and Manipulate Bit Vectors with bitvec | ||
Store Large Sets of Integers with roaring |
A bit array (also known as bit map, bit set, bit string, or bit vector) is an data structure that allocates one or more adjacent bits for specific purposes, so that any single bit or group of bits within the structure can be set or inspected.
It is a way to represent a set of boolean flags or values with a limited range in a compact manner, using individual bits or small group of bits within an integer or an array of integers. Bit arrays save memory and potentially improve performance.
Work with C-style Flags using bitflags
bitflags
↗ generates ergonomic types for C-style flags. You can use bitflags
to provide user-friendly bindings to C APIs, where flags may or may not be fully known in advance, with string parsing and formatting support.
A flag is a set of bits in a bits type (fixed-width unsigned integers) that may have a unique name. Bits are not required to be exclusive to a flag. Bits are not required to be contiguous.
The following example creates a type MyFlags
with the help of the bitflags::bitflags
↗ macro and subsequently shows basic bitwise operations and formatting:
//! `bitflags` generate types for C-style flags with ergonomic APIs. use std::fmt; use bitflags::bitflags; // Define a flag type using the `bitflags` macro. bitflags! { // Attributes can be applied to flags types: #[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)] // You can derive additional traits on generated flags types if you enable the corresponding features: // #[derive(Serialize, Deserialize)] pub struct MyFlags: u32 { // A flag is a set of bits that may have a unique name. const FLAG_A = 0b0000_0001; const FLAG_B = 0b0000_0010; // Or 1 << 1; const FLAG_C = 0b0000_0100; // Flags that set multiple bits are possible: const FLAG_ABC = Self::FLAG_A.bits() | Self::FLAG_B.bits() | Self::FLAG_C.bits(); // Any bits in a flag you define are called "known bits". Any other bits are "unknown bits". // Some operators will unset any unknown bits they encounter // If you're generating flags types for an external source, such as a C API, // you can define an extra unnamed flag as a mask of all bits the external source may ever set. // const _ = !0; } } // `impl` blocks can be added to flags types to implement additional methods. impl MyFlags { // Convert the bitflags to a u64: pub fn as_u64(&self) -> u64 { self.bits() as u64 } } // Implement the `Display` trait for `MyFlags` to customize its string // representation: impl fmt::Display for MyFlags { fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { write!(f, "{:032b}", self.bits()) } } fn main() { // Demonstrate bitwise operations with the defined flags. // Union: let e1 = MyFlags::FLAG_A | MyFlags::FLAG_C; let e2 = MyFlags::FLAG_B | MyFlags::FLAG_C; assert_eq!((e1 | e2), MyFlags::FLAG_ABC); // Intersection: assert_eq!((e1 & e2), MyFlags::FLAG_C); // Difference: assert_eq!((e1 - e2), MyFlags::FLAG_A); // Complement: assert_eq!(!e2, MyFlags::FLAG_A); // Use the `fmt::Display` implementation above: println!("e1: {e1} e2: {e2}"); // Demonstrate formatting options for the bitflags: let flags = MyFlags::FLAG_ABC; assert_eq!(format!("{flags}"), "00000000000000000000000000000111"); assert_eq!(format!("{:?}", MyFlags::FLAG_B), "MyFlags(FLAG_B)"); assert_eq!( format!("{:?}", MyFlags::FLAG_A | MyFlags::FLAG_B), "MyFlags(FLAG_A | FLAG_B)" ); println!("{flags:?}"); // Bitwise pattern matching: bitflags::bitflags_match!(flags, { MyFlags::FLAG_A | MyFlags::FLAG_B => println!("The value is A and B"), _ => println!("The value is not A and B"), }); }
Work with C-style Flags using flagset
flagset
↗ is an alternative to crates like bitflags
↗ and enumflags
↗. A flagset
flag definition looks just like a regular enumeration, with the addition of the field-size type:
//! Example of using the `flagset` crate to define and use bit flags.
//!
//! `flagset` has no dependencies, incl. the stdlib, so it can be used in
//! `no_std` environments, such as embedded systems or kernel development.
use std::os::raw::c_int;
use flagset::FlagSet;
use flagset::flags;
// Flags are defined using the `flags!` macro:
flags! {
// Flag values can be defined implicitly:
enum FlagsA: u8 { // Note the field-size type.
Foo, // 0b0001.
Bar, // 0b0010.
Baz, // 0b0100.
}
// Flag values can also be defined explicitly...
enum FlagsB: c_int {
Foo = 0x01,
Bar = 2,
Baz = 0b0110, // ...and overlap.
All = (FlagsB::Foo | FlagsB::Bar | FlagsB::Baz).bits(),
}
}
// Attributes can be used on the enumeration itself or any of the values.
// A collection of flags is a `FlagSet<T>`.
// If you are storing the flags in memory, the raw `FlagSet<T>` type should be
// used.
#[derive(Debug)]
struct Container(FlagSet<FlagsA>);
impl Container {
// However, if you want to receive flags as an input to a function, you
// should use `impl Into<FlagSet<T>>`.
fn new(flags: impl Into<FlagSet<FlagsA>>) -> Container {
Container(flags.into())
}
}
fn main() {
let container = Container::new(FlagsA::Foo | FlagsA::Bar);
assert_eq!(container.0.bits(), 0b011);
println!("{container:?}");
assert_eq!(Container::new(None).0.bits(), 0b000);
}
// Adapted from <https://docs.rs/flagset>
Create and Manipulate Bit Vectors with bitvec
bitvec
↗ provides efficient storage and manipulation of bit vectors. It addresses memory by bits, for packed collections and bitfields. This example demonstrates the usage of the bitvec
crate:
//! Add the following to your `Cargo.toml`: //! ```toml //! [dependencies] //! bitvec = "1.0.1" # Or latest //! ``` use bitvec::prelude::*; fn main() { // Create a new `BitVec` with default parameters: let mut bv = bitvec![u8, Msb0; 0, 1, 0, 1, 1, 0, 1, 0]; println!("Original BitVec: {bv}"); // Access individual bits: println!("First bit: {}", bv[0]); println!("Second bit: {}", bv[1]); // Modify bits. The parameters are: index, value: bv.set(0, true); bv.set(5, true); println!("Modified BitVec: {bv}"); // Get the length: println!("BitVec length: {}", bv.len()); // Check if all bits are set: println!("All bits set: {}", bv.all()); // Check if any bits are set: println!("Any bits set: {}", bv.any()); // Count set bits: println!("Number of set bits: {}", bv.count_ones()); // Iterate through bits: print!("Iterating through bits: "); for bit in bv.iter() { print!("{bit} "); } println!(); println!("\n===== Different Endianness and Storage Types ====="); // Create `BitVec`s with different storage types and endianness: let bv_lsb0 = bitvec![u16, Lsb0; 0, 1, 0, 1, 1, 0, 1, 0]; println!("u16 LSB BitVec: {bv_lsb0}"); let bv_u32 = bitvec![u32, Msb0; 0, 1, 0, 1, 1, 0, 1, 0]; println!("u32 MSB BitVec: {bv_u32}"); println!("\n===== Bit Manipulation Operations ====="); // Create two `BitVec`s for bitwise operations: let mut a = bitvec![u8, Msb0; 1, 1, 0, 0, 1, 0, 1, 0]; let b = bitvec![u8, Msb0; 0, 1, 1, 0, 0, 1, 1, 1]; println!("a: {a}"); println!("b: {b}"); // Bitwise AND: let c = a.clone() & b.clone(); println!("a & b: {c}"); // Bitwise OR: let d = a.clone() | b.clone(); println!("a | b: {d}"); // Bitwise XOR: let e = a.clone() ^ b.clone(); println!("a ^ b: {e}"); // Bitwise NOT: let f = !a.clone(); println!("!a: {f}"); // In-place operations: a &= &b; println!("a &= b: {a}"); println!("\n===== Slicing and Ranges ====="); let mut bv_slice = bitvec![u8, Msb0; 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0]; println!("Original BitVec: {bv_slice}"); // Get a slice: let slice = &bv_slice[2..6]; println!("Slice [2..6]: {slice}"); // Modify a slice: bv_slice[4..8].fill(true); println!("After filling [4..8] with `1`s: {bv_slice}"); println!("\n===== Practical Applications ====="); // Example 1: Bit flags e.g. file permissions. let mut flags = bitvec![u8, Msb0; 0; 8]; // Set flags: flags.set(0, true); // READ. flags.set(1, true); // WRITE. flags.set(2, false); // EXECUTE. println!("Permission flags: {flags}"); println!("Can read: {}", flags[0]); println!("Can write: {}", flags[1]); println!("Can execute: {}", flags[2]); // Example 2: Bit packing - pack multiple boolean values efficiently. let mut packed = BitVec::<u8, Msb0>::new(); // Appends single bits to the vector: packed.push(true); // is_active. packed.push(false); // is_admin. packed.push(true); // is_verified. packed.push(false); // is_premium. println!("Packed bits: {packed}"); // Example 3: Bit-packed struct. use bitvec::field::BitField; let mut data = bitvec![u16, Lsb0; 0; 16]; // Pack (small) values into specific bit ranges: data[0..3].store(0b101u8); data[3..8].store(0b10011u8); data[8..16].store(0b11001010u8); println!("Bit-packed data: {data}"); // Extract values: let value1: u8 = data[0..3].load(); let value2: u8 = data[3..8].load(); let value3: u8 = data[8..16].load(); println!("Extracted values: {value1}, {value2}, {value3}"); }
Store Large Sets of Integers with roaring
Bitmaps are commonly used as fast data structures, for example as indices for databases, search engines, and big data processing systems.
Unfortunately, they can use too much memory. To compensate, roaring
↗ implements compressed bitmap data structures. They efficiently store and manipulate large sets of integers:
use roaring::RoaringBitmap; fn main() { // Create two bitmaps representing two sets of user IDs: // These are highly optimized data structures for storing sets of 32-bit // unsigned integers (u32). let mut online_users = RoaringBitmap::new(); let mut subscribed_users = RoaringBitmap::new(); // Add some user IDs to the bitmaps. Users with IDs 1, 5, 1000 are online. // Roaring bitmaps automatically handle compression, making them very // memory-efficient for both sparse and dense data. online_users.insert(1); online_users.insert(5); online_users.insert(1000); println!("Online users: {online_users:?}"); println!("Number of online users: {}", online_users.len()); // Users with IDs 5, 1000, 2000 are subscribed. subscribed_users.insert(5); subscribed_users.insert(1000); subscribed_users.insert(2000); println!("Subscribed users: {subscribed_users:?}"); // Perform a set operation: union. // The '|' operator calculates the union of the two sets. // This finds all users who are either online OR subscribed. let all_relevant_users = &online_users | &subscribed_users; println!("All relevant users (union): {all_relevant_users:?}"); // Check for the presence of specific user IDs. let user_id_to_check = 99; if all_relevant_users.contains(user_id_to_check) { println!("User {user_id_to_check} is in the relevant set."); } // Intersection (&): users who are both online AND subscribed. let active_and_subscribed = &online_users & &subscribed_users; println!("Active and subscribed (intersection): {active_and_subscribed:?}"); // Difference (-): users who are online BUT NOT subscribed. let online_only = &online_users - &subscribed_users; println!("Online but not subscribed (difference): {online_only:?}"); }
Related Topics
- Vectors.