Character Sets
Recipe | Crates | Categories |
---|---|---|
URL Encoding | ||
Encode a String as application/x-www-form-urlencoded | ||
Encode and Decode Hexadecimal | ||
Encode and Decode base64 | ||
Percent-encode a String
Encode an input string with `percent_encoding::utf8_percent_encode` from the `percent_encoding` crate, then decode it with the `percent_encoding::percent_decode` function.
```rust
//! Example of percent encoding and decoding strings.

use std::str::Utf8Error;

use percent_encoding::AsciiSet;
use percent_encoding::CONTROLS;
use percent_encoding::percent_decode;
use percent_encoding::utf8_percent_encode;

/// https://url.spec.whatwg.org/#fragment-percent-encode-set
const FRAGMENT: &AsciiSet =
    &CONTROLS.add(b' ').add(b'"').add(b'<').add(b'>').add(b'`');

fn main() -> Result<(), Utf8Error> {
    // The input string we want to encode.
    // Note that the string contains spaces, which are not allowed in the
    // fragment part of a URL. The FRAGMENT AsciiSet will encode spaces as
    // %20.
    let input = "confident, productive systems programming";

    // Encode the string using the FRAGMENT AsciiSet.
    let iter = utf8_percent_encode(input, FRAGMENT);
    let encoded: String = iter.collect();
    println!("{}", encoded);
    assert_eq!(encoded, "confident,%20productive%20systems%20programming");

    // Decode the encoded string.
    let iter = percent_decode(encoded.as_bytes());
    let decoded = iter.decode_utf8()?;
    println!("{}", decoded);
    assert_eq!(decoded, "confident, productive systems programming");

    Ok(())
}
```
The encode set defines which bytes (in addition to non-ASCII and controls) need to be percent-encoded. The choice of this set depends on context. For example, `url` encodes `?` in a URL path but not in a query string.

The return value of encoding is an iterator of `&str` slices that collects into a `std::string::String`.
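To make the role of the encode set concrete, here is a minimal std-only sketch. It is not the `percent_encoding` crate's implementation or API: `percent_encode_with`, `in_path_set` and `in_query_set` are hypothetical names, and the two sets are simplified versions loosely modelled on the WHATWG URL Standard. The same input produces different output depending on which set is applied.

```rust
/// Percent-encodes `input`, escaping every byte for which `must_encode`
/// returns true. Non-ASCII and control bytes are always escaped.
/// (Hypothetical helper for illustration only.)
fn percent_encode_with(input: &str, must_encode: fn(u8) -> bool) -> String {
    let mut out = String::new();
    for &byte in input.as_bytes() {
        let always = byte >= 0x80 || byte < 0x20 || byte == 0x7F;
        if always || must_encode(byte) {
            // Escape as '%' followed by two uppercase hex digits.
            out.push_str(&format!("%{:02X}", byte));
        } else {
            out.push(byte as char);
        }
    }
    out
}

/// Simplified, assumed set for a URL path: `?` would start the query,
/// so it has to be escaped here.
fn in_path_set(byte: u8) -> bool {
    matches!(
        byte,
        b' ' | b'"' | b'#' | b'<' | b'>' | b'?' | b'`' | b'{' | b'}'
    )
}

/// Simplified, assumed set for a query string: `?` is left as-is.
fn in_query_set(byte: u8) -> bool {
    matches!(byte, b' ' | b'"' | b'#' | b'<' | b'>')
}

fn main() {
    let input = "what is rust?";
    // prints "what%20is%20rust%3F"
    println!("{}", percent_encode_with(input, in_path_set));
    // prints "what%20is%20rust?"
    println!("{}", percent_encode_with(input, in_query_set));
}
```

In real code, prefer building an `AsciiSet` with `CONTROLS.add(...)` as in the recipe above, so the set is a compile-time constant.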
Encode a String as application/x-www-form-urlencoded
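Encodes a string into `application/x-www-form-urlencoded` syntax using `form_urlencoded::byte_serialize` and subsequently decodes it with `form_urlencoded::parse`. `byte_serialize` returns an iterator of `&str` slices that collects into a `std::string::String`. `parse` returns an iterator of decoded `(key, value)` pairs, which the example concatenates before collecting them into a `String`.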
```rust
//! This example demonstrates how to URL-encode and decode strings using the
//! `url` crate.

use url::form_urlencoded::byte_serialize;
use url::form_urlencoded::parse;

fn main() {
    // Encode a string to URL-encoded format.
    // The `byte_serialize` function takes a byte slice and returns an iterator
    // of encoded `&str` slices.
    let urlencoded: String = byte_serialize("What is ❤?".as_bytes()).collect();
    assert_eq!(urlencoded, "What+is+%E2%9D%A4%3F");
    println!("urlencoded:'{}'", urlencoded);

    // `parse` yields (key, value) pairs. The input has no `=`, so the whole
    // string is the key and the value is empty.
    let decoded: String = parse(urlencoded.as_bytes())
        .map(|(key, val)| [key, val].concat())
        .collect();
    assert_eq!(decoded, "What is ❤?");
    println!("decoded:'{}'", decoded);
}
```
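The rules behind this format are easy to state: ASCII letters, digits and `*`, `-`, `.`, `_` pass through, a space becomes `+`, and every other byte becomes `%XX`. The following std-only sketch mirrors those rules and shows why parsing yields key/value pairs. The function names (`form_encode`, `form_decode`, `parse_pairs`) are hypothetical and this is not the `url` crate's implementation. Malformed `%` sequences are simply passed through, which is a choice made for this sketch.

```rust
/// Encodes a string with the form-urlencoded byte rules described above.
fn form_encode(input: &str) -> String {
    let mut out = String::new();
    for &byte in input.as_bytes() {
        match byte {
            b'a'..=b'z' | b'A'..=b'Z' | b'0'..=b'9' | b'*' | b'-' | b'.' | b'_' => {
                out.push(byte as char)
            }
            b' ' => out.push('+'),
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

/// Returns the numeric value of one ASCII hex digit, if it is one.
fn hex_val(byte: u8) -> Option<u8> {
    (byte as char).to_digit(16).map(|d| d as u8)
}

/// Decodes `+` to a space and `%XX` to the byte it names.
fn form_decode(input: &str) -> String {
    let bytes = input.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        match bytes[i] {
            b'+' => {
                out.push(b' ');
                i += 1;
            }
            b'%' if i + 2 < bytes.len() => match (hex_val(bytes[i + 1]), hex_val(bytes[i + 2])) {
                (Some(hi), Some(lo)) => {
                    out.push(hi * 16 + lo);
                    i += 3;
                }
                // Not a valid escape: keep the '%' literally.
                _ => {
                    out.push(b'%');
                    i += 1;
                }
            },
            other => {
                out.push(other);
                i += 1;
            }
        }
    }
    String::from_utf8_lossy(&out).into_owned()
}

/// Splits `a=1&b=2` into decoded (key, value) pairs. A pair without `=`
/// becomes a key with an empty value.
fn parse_pairs(input: &str) -> Vec<(String, String)> {
    input
        .split('&')
        .filter(|pair| !pair.is_empty())
        .map(|pair| {
            let (key, value) = pair.split_once('=').unwrap_or((pair, ""));
            (form_decode(key), form_decode(value))
        })
        .collect()
}

fn main() {
    let encoded = form_encode("What is ❤?");
    println!("{}", encoded); // prints "What+is+%E2%9D%A4%3F"
    println!("{:?}", parse_pairs("lang=rust+lang&q=a%26b"));
}
```

Because `&` and `=` inside a key or value are escaped before the pairs are joined, the decoder can split on the raw separators first and decode each piece afterwards.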
Encode and Decode Hexadecimal
The `data_encoding` crate provides a `HEXUPPER::encode` method, which takes a `&[u8]` and returns a `std::string::String` containing the hexadecimal representation of the data.

Similarly, the `HEXUPPER::decode` method takes a `&[u8]` and returns a `Vec<u8>` if the input data is successfully decoded, or a `DecodeError` otherwise.

The example below converts `&[u8]` data to its hexadecimal equivalent, compares the result with the expected value, and then decodes it back to the original bytes.
```rust
//! Example of encoding and decoding a string using hexadecimal encoding.

use data_encoding::DecodeError;
use data_encoding::HEXUPPER;

fn main() -> Result<(), DecodeError> {
    // The original string to be encoded.
    let original = b"The quick brown fox jumps over the lazy dog.";
    let expected = "54686520717569636B2062726F776E20666F78206A756D7073206F76\
        657220746865206C617A7920646F672E";

    let encoded = HEXUPPER.encode(original);
    println!("{}", encoded);
    assert_eq!(encoded, expected);

    let decoded = HEXUPPER.decode(&encoded.into_bytes())?;
    println!("{:?}", decoded);
    assert_eq!(&decoded[..], &original[..]);

    Ok(())
}
```
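For readers curious what uppercase hexadecimal encoding does under the hood, here is a small std-only sketch. The names `hex_upper_encode`, `hex_upper_decode` and `upper_hex_val` are hypothetical and are not part of `data_encoding`. Each byte maps to two hex digits, and decoding fails on odd-length input or on any character outside `0-9A-F` (rejecting lowercase is a choice made for this sketch).

```rust
/// Encodes each byte as two uppercase hexadecimal digits.
fn hex_upper_encode(data: &[u8]) -> String {
    data.iter().map(|byte| format!("{:02X}", byte)).collect()
}

/// Returns the value of one uppercase hex digit, or None if it is not one.
fn upper_hex_val(byte: u8) -> Option<u8> {
    match byte {
        b'0'..=b'9' => Some(byte - b'0'),
        b'A'..=b'F' => Some(byte - b'A' + 10),
        _ => None,
    }
}

/// Decodes pairs of uppercase hex digits back into bytes.
/// Returns None for odd-length input or any invalid digit.
fn hex_upper_decode(text: &str) -> Option<Vec<u8>> {
    if text.len() % 2 != 0 {
        return None;
    }
    text.as_bytes()
        .chunks(2)
        .map(|pair| {
            let hi = upper_hex_val(pair[0])?;
            let lo = upper_hex_val(pair[1])?;
            Some((hi << 4) | lo)
        })
        .collect()
}

fn main() {
    let encoded = hex_upper_encode(b"The");
    println!("{}", encoded); // prints "546865"
    println!("{:?}", hex_upper_decode(&encoded));
}
```

Collecting an iterator of `Option<u8>` into `Option<Vec<u8>>` stops at the first `None`, which keeps the error handling short.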
Encode and Decode base64
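Encodes a byte slice into a `base64` `String` using `BASE64_STANDARD.encode` and decodes it with `BASE64_STANDARD.decode`. Both methods come from the `Engine` trait, which `base64::prelude` brings into scope.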
```rust
//! This example demonstrates how to encode and decode a string using `base64`.

use std::str;

use anyhow::Result;
use base64::prelude::*;

fn main() -> Result<()> {
    let hello = b"hello rustaceans";
    let encoded: String = BASE64_STANDARD.encode(hello);
    let decoded: Vec<u8> = BASE64_STANDARD.decode(&encoded)?;

    println!("origin: {}", str::from_utf8(hello)?);
    println!("base64 encoded: {}", encoded);
    println!("back to origin: {}", str::from_utf8(&decoded)?);

    Ok(())
}
```
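As a rough illustration of what a standard, padded base64 engine computes, the std-only sketch below packs each group of three bytes into four 6-bit symbols and pads short groups with `=`. The names `base64_encode` and `base64_decode` are hypothetical and are not the `base64` crate's API. The decoder here is deliberately lenient: it ignores trailing padding and does not validate it, which a production decoder would.

```rust
/// The standard base64 alphabet (RFC 4648).
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Encodes bytes as padded base64 using the standard alphabet.
fn base64_encode(data: &[u8]) -> String {
    let mut out = String::with_capacity((data.len() + 2) / 3 * 4);
    for chunk in data.chunks(3) {
        // Pack up to three bytes into a 24-bit group, zero-filling the rest.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let group = (b0 << 16) | (b1 << 8) | b2;

        out.push(ALPHABET[((group >> 18) & 63) as usize] as char);
        out.push(ALPHABET[((group >> 12) & 63) as usize] as char);
        // Symbols that would cover only filler bits are replaced by '='.
        if chunk.len() > 1 {
            out.push(ALPHABET[((group >> 6) & 63) as usize] as char);
        } else {
            out.push('=');
        }
        if chunk.len() > 2 {
            out.push(ALPHABET[(group & 63) as usize] as char);
        } else {
            out.push('=');
        }
    }
    out
}

/// Decodes standard base64, ignoring trailing '=' padding.
/// Returns None if a character is outside the alphabet.
fn base64_decode(text: &str) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    let mut buffer: u32 = 0;
    let mut bits: u32 = 0;
    for byte in text.trim_end_matches('=').bytes() {
        let value = ALPHABET.iter().position(|&c| c == byte)? as u32;
        // Append six bits, then emit a byte whenever eight are available.
        buffer = (buffer << 6) | value;
        bits += 6;
        if bits >= 8 {
            bits -= 8;
            out.push((buffer >> bits) as u8);
        }
    }
    Some(out)
}

fn main() {
    let encoded = base64_encode(b"foobar");
    println!("{}", encoded); // prints "Zm9vYmFy"
    println!("{:?}", base64_decode("Zm8=")); // prints "Some([102, 111])"
}
```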
URL Encoding
The `percent-encoding` crate handles URL encoding and decoding.
```rust
//! The following demonstrates how to use percent encoding to handle special
//! characters in URLs.
//!
//! URLs use special characters to indicate the parts of the request. For
//! example, a `?` question mark marks the end of a path and the start of a
//! query string. Percent encoding replaces reserved characters with the `%`
//! escape character followed by a byte value as two hexadecimal digits. For
//! example, an ASCII space is replaced with `%20`.
//!
//! This example uses the NON_ALPHANUMERIC set to encode everything that is not
//! an ASCII letter or digit.

use anyhow::Result;
use percent_encoding::AsciiSet;
use percent_encoding::CONTROLS;
use percent_encoding::NON_ALPHANUMERIC;
use percent_encoding::percent_decode_str;
use percent_encoding::percent_encode;
use percent_encoding::utf8_percent_encode;

/// Encodes a string for use as a URL component.
fn encode_url_component(input: &str) -> String {
    percent_encode(input.as_bytes(), NON_ALPHANUMERIC).to_string()
}

/// Decodes a percent-encoded string.
fn decode_url_component(encoded: &str) -> Result<String> {
    // Decode the percent-encoded string and convert it to UTF-8.
    let decoded = percent_decode_str(encoded).decode_utf8()?.to_string();
    Ok(decoded)
}

// Custom encoding with a specific ASCII set.
fn custom_encode(input: &str) -> String {
    // Create a 'fragment' custom set that only encodes spaces and some special
    // characters.
    /// See https://url.spec.whatwg.org/#fragment-percent-encode-set
    const FRAGMENT: &AsciiSet =
        &CONTROLS.add(b' ').add(b'"').add(b'<').add(b'>').add(b'`');

    // Percent-encode the UTF-8 encoding of the given string.
    utf8_percent_encode(input, FRAGMENT).to_string()
}

fn main() -> Result<()> {
    // Example URL component encoding
    let original = "Hello, World! @#$%^&*()";
    let encoded = encode_url_component(original);
    let decoded = decode_url_component(&encoded)?;

    println!("Original: {}", original);
    println!("Encoded: {}", encoded);
    println!("Decoded: {}", decoded);

    // Demonstrate custom encoding
    let custom_input = "special chars: %+";
    let custom_encoded = custom_encode(custom_input);

    println!("\nCustom Encoding:");
    println!("Original: {}", custom_input);
    println!("Encoded: {}", custom_encoded);

    Ok(())
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_encode_decode() -> Result<()> {
        let test_cases = vec![
            "Hello, World!",
            "https://example.com/path?key=value",
            "特殊",
            "spaces and @ symbols",
        ];

        for input in test_cases {
            let encoded = encode_url_component(input);
            let decoded = decode_url_component(&encoded)?;
            assert_eq!(
                input, decoded,
                "Encoding and decoding failed for: {}",
                input
            );
        }
        Ok(())
    }

    #[test]
    fn test_custom_encoding() {
        let input = "test with spaces";
        let custom_encoded = custom_encode(input);
        assert!(custom_encoded.contains("%20"), "Spaces should be encoded");
        assert!(
            !custom_encoded.contains("%2B"),
            "Plus should not be encoded in this test"
        );
    }
}
```
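Decoding is a two-step process in the example above: first `%XX` escapes are turned back into raw bytes, then those bytes are checked as UTF-8, which is why `decode_utf8()` can fail. The std-only sketch below walks through both steps. The names `percent_decode_bytes`, `percent_decode_utf8` and `hex_digit` are hypothetical and this is not the `percent_encoding` crate's implementation. Passing malformed escapes through unchanged is an assumption of this sketch.

```rust
use std::string::FromUtf8Error;

/// Returns the numeric value of one ASCII hex digit, if it is one.
fn hex_digit(byte: u8) -> Option<u8> {
    (byte as char).to_digit(16).map(|d| d as u8)
}

/// Step 1: replace every `%XX` escape with the byte it names.
/// Anything that is not a well-formed escape is copied through unchanged.
fn percent_decode_bytes(input: &str) -> Vec<u8> {
    let bytes = input.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'%' && i + 2 < bytes.len() {
            if let (Some(hi), Some(lo)) = (hex_digit(bytes[i + 1]), hex_digit(bytes[i + 2])) {
                out.push(hi * 16 + lo);
                i += 3;
                continue;
            }
        }
        out.push(bytes[i]);
        i += 1;
    }
    out
}

/// Step 2: interpret the decoded bytes as UTF-8, which may fail.
fn percent_decode_utf8(input: &str) -> Result<String, FromUtf8Error> {
    String::from_utf8(percent_decode_bytes(input))
}

fn main() {
    // prints "Ok(\"Hello, World!\")"
    println!("{:?}", percent_decode_utf8("Hello%2C%20World%21"));
    // %FF on its own is not valid UTF-8, so this is an Err.
    println!("{}", percent_decode_utf8("%FF").is_err()); // prints "true"
}
```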