Character Sets

Percent-encode a String

Percent-encode an input string with the percent_encoding::utf8_percent_encode function from the percent_encoding⮳ crate, then decode it with the percent_encoding::percent_decode⮳ function.

//! Example of percent encoding and decoding strings.

use std::str::Utf8Error;

use percent_encoding::AsciiSet;
use percent_encoding::CONTROLS;
use percent_encoding::percent_decode;
use percent_encoding::utf8_percent_encode;

/// https://url.spec.whatwg.org/#fragment-percent-encode-set
const FRAGMENT: &AsciiSet =
    &CONTROLS.add(b' ').add(b'"').add(b'<').add(b'>').add(b'`');

fn main() -> Result<(), Utf8Error> {
    // The input string we want to encode.
    // Note that the string contains spaces, which are not allowed in the
    // fragment part of a URL. The FRAGMENT AsciiSet will encode spaces as
    // %20.
    let input = "confident, productive systems programming";

    // Encode the string using the FRAGMENT AsciiSet.
    let iter = utf8_percent_encode(input, FRAGMENT);
    let encoded: String = iter.collect();
    println!("{}", encoded);
    assert_eq!(encoded, "confident,%20productive%20systems%20programming");

    // Decode the encoded string.
    let iter = percent_decode(encoded.as_bytes());
    let decoded = iter.decode_utf8()?;
    println!("{}", decoded);
    assert_eq!(decoded, "confident, productive systems programming");

    Ok(())
}

The encode set defines which bytes (in addition to non-ASCII bytes and controls) need to be percent-encoded. The choice of this set depends on context. For example, the url crate encodes ? in a URL path but not in a query string.
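To make the role of the encode set concrete, here is a minimal std-only sketch. It is a hand-written illustration, not the percent_encoding implementation: the encode_with function and the query_like and path_like sets are hypothetical names, and the two sets are only loosely modeled on the URL specification.

```rust
//! Hand-rolled illustration of how the encode set changes the output.
//! This is NOT how the `percent_encoding` crate is implemented.

/// Percent-encodes `input`. Control bytes and non-ASCII bytes are always
/// escaped; other bytes are escaped only when `in_set` returns true.
fn encode_with(input: &str, in_set: fn(u8) -> bool) -> String {
    let mut out = String::new();
    for byte in input.bytes() {
        if byte < 0x20 || byte >= 0x7F || in_set(byte) {
            // `%` followed by the byte value as two uppercase hex digits.
            out.push_str(&format!("%{:02X}", byte));
        } else {
            out.push(byte as char);
        }
    }
    out
}

/// Hypothetical query-style set: space, `"`, `#`, `<`, `>`.
fn query_like(byte: u8) -> bool {
    matches!(byte, b' ' | b'"' | b'#' | b'<' | b'>')
}

/// Hypothetical path-style set: the query-style set plus `?`, a backtick,
/// `{` and `}`.
fn path_like(byte: u8) -> bool {
    query_like(byte) || matches!(byte, b'?' | b'`' | b'{' | b'}')
}

fn main() {
    let input = "what is this?";
    // The same input encodes differently depending on the chosen set.
    println!("path:  {}", encode_with(input, path_like));
    println!("query: {}", encode_with(input, query_like));
}
```

With the path-style set the trailing ? is escaped as %3F, while the query-style set leaves it untouched; spaces become %20 in both cases.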

The return value of encoding is an iterator of &str slices which collect into a std::string::String⮳.

Encode a String as application/x-www-form-urlencoded

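Encodes a string into application/x-www-form-urlencoded syntax using form_urlencoded::byte_serialize⮳ and subsequently decodes it with form_urlencoded::parse⮳. byte_serialize returns an iterator of &str slices that collects directly into a std::string::String⮳. parse returns an iterator of (key, value) pairs, so the example concatenates each pair before collecting.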

//! This example demonstrates how to URL-encode and decode strings using the
//! `url` crate.

use url::form_urlencoded::byte_serialize;
use url::form_urlencoded::parse;

fn main() {
    // Encode a string to URL-encoded format.
    // The `byte_serialize` function takes a byte slice and returns an iterator
    // of `&str` slices that together form the encoded string.
    let urlencoded: String = byte_serialize("What is ❤?".as_bytes()).collect();
    assert_eq!(urlencoded, "What+is+%E2%9D%A4%3F");
    println!("urlencoded:'{}'", urlencoded);

    let decoded: String = parse(urlencoded.as_bytes())
        .map(|(key, val)| [key, val].concat())
        .collect();
    assert_eq!(decoded, "What is ❤?");
    println!("decoded:'{}'", decoded);
}
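The main visible difference from plain percent-encoding is that form encoding writes a space as + and escapes almost every other non-alphanumeric byte. The following std-only sketch illustrates that rule. The form_encode function is a hypothetical helper written for this page, and the set of bytes it leaves unescaped is an assumption based on the URL specification's form serializer, not code taken from the url crate.

```rust
//! Hand-rolled illustration of application/x-www-form-urlencoded output.
//! This is a sketch, not the `url` crate's implementation.

/// Serializes bytes in form-urlencoded style (hypothetical helper).
fn form_encode(input: &[u8]) -> String {
    let mut out = String::new();
    for &byte in input {
        match byte {
            // A space becomes `+` rather than `%20`.
            b' ' => out.push('+'),
            // Assumed "safe" bytes: ASCII alphanumerics and `*`, `-`, `.`, `_`.
            b'*' | b'-' | b'.' | b'_' | b'0'..=b'9' | b'A'..=b'Z' | b'a'..=b'z' => {
                out.push(byte as char)
            }
            // Everything else is written as `%` plus two uppercase hex digits.
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

fn main() {
    let encoded = form_encode("What is ❤?".as_bytes());
    println!("{}", encoded);
}
```

For the input used in the example above, this sketch produces the same string, What+is+%E2%9D%A4%3F. Because = and & are escaped, encoded values can be joined into key=value&key=value pairs without ambiguity.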

Encode and Decode Hexadecimal

The data_encoding⮳ crate provides a HEXUPPER::encode method which takes a &[u8] and returns a std::string::String⮳ containing the hexadecimal representation of the data.

Similarly, a HEXUPPER::decode method is provided which takes a &[u8] and returns a Vec<u8> if the input data is successfully decoded.

The example below converts &[u8] data to its hexadecimal equivalent and compares the result to the expected value. It then decodes the hexadecimal string and checks that the bytes match the original input.

//! Example of encoding and decoding a string using hexadecimal encoding.

use data_encoding::DecodeError;
use data_encoding::HEXUPPER;

fn main() -> Result<(), DecodeError> {
    // The original string to be encoded.
    let original = b"The quick brown fox jumps over the lazy dog.";
    let expected = "54686520717569636B2062726F776E20666F78206A756D7073206F76\
        657220746865206C617A7920646F672E";

    let encoded = HEXUPPER.encode(original);
    println!("{}", encoded);
    assert_eq!(encoded, expected);

    let decoded = HEXUPPER.decode(&encoded.into_bytes())?;
    println!("{:?}", decoded);
    assert_eq!(&decoded[..], &original[..]);

    Ok(())
}
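For intuition about what the hexadecimal encoding does, here is a small std-only sketch. The hex_encode and hex_decode functions are hypothetical helpers written for illustration, not part of data_encoding. Unlike the example above, this decoder accepts both uppercase and lowercase digits.

```rust
//! Hand-rolled hexadecimal encoding, for illustration only.

/// Writes each byte as two uppercase hexadecimal digits.
fn hex_encode(data: &[u8]) -> String {
    data.iter().map(|byte| format!("{:02X}", byte)).collect()
}

/// Parses pairs of hexadecimal digits back into bytes.
/// Returns `None` on odd length or on any non-hex character.
fn hex_decode(hex: &str) -> Option<Vec<u8>> {
    let bytes = hex.as_bytes();
    if bytes.len() % 2 != 0 {
        return None;
    }
    bytes
        .chunks(2)
        .map(|pair| {
            let hi = (pair[0] as char).to_digit(16)?;
            let lo = (pair[1] as char).to_digit(16)?;
            Some((hi * 16 + lo) as u8)
        })
        .collect()
}

fn main() {
    let encoded = hex_encode(b"The quick");
    println!("{}", encoded);
    println!("{:?}", hex_decode(&encoded));
}
```

Each input byte maps to exactly two output characters, so the encoded string is always twice as long as the input.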

Encode and Decode base64

Encodes a byte slice into a base64 String using BASE64_STANDARD.encode and decodes it with BASE64_STANDARD.decode. Both methods are made available by importing base64::prelude.

//! This example demonstrates how to encode and decode a string using `base64`.

use std::str;

use anyhow::Result;
use base64::prelude::*;

fn main() -> Result<()> {
    let hello = b"hello rustaceans";
    let encoded: String = BASE64_STANDARD.encode(hello);
    let decoded: Vec<u8> = BASE64_STANDARD.decode(&encoded)?;

    println!("origin: {}", str::from_utf8(hello)?);
    println!("base64 encoded: {}", encoded);
    println!("back to origin: {}", str::from_utf8(&decoded)?);

    Ok(())
}
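To show what the standard alphabet with padding produces, here is a std-only encoder sketch. The base64_encode function and the ALPHABET constant are hypothetical names written for illustration. This is not the base64 crate's implementation, and it covers encoding only.

```rust
//! Hand-rolled base64 encoder (standard alphabet, `=` padding).
//! For illustration only.

const ALPHABET: &[u8] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Encodes `data` three bytes at a time into four output characters.
fn base64_encode(data: &[u8]) -> String {
    let mut out = String::with_capacity((data.len() + 2) / 3 * 4);
    for chunk in data.chunks(3) {
        // Pack up to three bytes into 24 bits, zero-filling missing bytes.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;

        // Split the 24 bits into four 6-bit indices into the alphabet.
        out.push(ALPHABET[(n >> 18) as usize & 63] as char);
        out.push(ALPHABET[(n >> 12) as usize & 63] as char);
        if chunk.len() > 1 {
            out.push(ALPHABET[(n >> 6) as usize & 63] as char);
        } else {
            out.push('=');
        }
        if chunk.len() > 2 {
            out.push(ALPHABET[n as usize & 63] as char);
        } else {
            out.push('=');
        }
    }
    out
}

fn main() {
    println!("{}", base64_encode(b"hello rustaceans"));
}
```

The input here is 16 bytes long, which leaves a single byte in the last chunk, so the output ends with two = padding characters.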

URL Encoding

percent-encoding handles URL encoding and decoding.

//! The following demonstrates how to use percent encoding to handle special
//! characters in URLs.
//!
//! URLs use special characters to indicate the parts of the request. For
//! example, a `?` question mark marks the end of a path and the start of a
//! query string. Percent encoding replaces reserved characters with the `%`
//! escape character followed by a byte value as two hexadecimal digits. For
//! example, an ASCII space is replaced with `%20`.
//!
//! This example uses the NON_ALPHANUMERIC set to encode everything that is not
//! an ASCII letter or digit.

use anyhow::Result;
use percent_encoding::AsciiSet;
use percent_encoding::CONTROLS;
use percent_encoding::NON_ALPHANUMERIC;
use percent_encoding::percent_decode_str;
use percent_encoding::percent_encode;
use percent_encoding::utf8_percent_encode;

/// Encodes a string for use as a URL component.
fn encode_url_component(input: &str) -> String {
    percent_encode(input.as_bytes(), NON_ALPHANUMERIC).to_string()
}

/// Decodes a percent-encoded string.
fn decode_url_component(encoded: &str) -> Result<String> {
    // Decode the percent-encoded string and convert it to UTF-8.
    let decoded = percent_decode_str(encoded).decode_utf8()?.to_string();
    Ok(decoded)
}

/// Encodes a string using a custom ASCII set.
fn custom_encode(input: &str) -> String {
    // A custom "fragment" set: controls plus space, `"`, `<`, `>` and a
    // backtick. Non-ASCII bytes are always encoded as well.
    // See https://url.spec.whatwg.org/#fragment-percent-encode-set
    const FRAGMENT: &AsciiSet =
        &CONTROLS.add(b' ').add(b'"').add(b'<').add(b'>').add(b'`');
    // Percent-encode the UTF-8 encoding of the given string.
    utf8_percent_encode(input, FRAGMENT).to_string()
}

fn main() -> Result<()> {
    // Example URL component encoding
    let original = "Hello, World! @#$%^&*()";
    let encoded = encode_url_component(original);
    let decoded = decode_url_component(&encoded)?;

    println!("Original:  {}", original);
    println!("Encoded:   {}", encoded);
    println!("Decoded:   {}", decoded);

    // Demonstrate custom encoding
    let custom_input = "special chars: %+";
    let custom_encoded = custom_encode(custom_input);
    println!("\nCustom Encoding:");
    println!("Original:  {}", custom_input);
    println!("Encoded:   {}", custom_encoded);

    Ok(())
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_encode_decode() -> Result<()> {
        let test_cases = vec![
            "Hello, World!",
            "https://example.com/path?key=value",
            "特殊",
            "spaces and @ symbols",
        ];

        for input in test_cases {
            let encoded = encode_url_component(input);
            let decoded = decode_url_component(&encoded)?;

            assert_eq!(
                input, decoded,
                "Encoding and decoding failed for: {}",
                input
            );
        }

        Ok(())
    }

    #[test]
    fn test_custom_encoding() {
        let input = "test with spaces";
        let custom_encoded = custom_encode(input);

        assert!(custom_encoded.contains("%20"), "Spaces should be encoded");
        assert!(
            !custom_encoded.contains("%2B"),
            "Plus should not be encoded in this test"
        );
    }
}
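The decoding direction described in the module comment above (a % followed by two hexadecimal digits stands for one byte) can be sketched with the standard library alone. The decode_percent and hex_val functions are hypothetical helpers written for illustration, not the percent_encoding implementation. In this sketch a malformed escape is copied through unchanged, which is an assumption of the sketch.

```rust
//! Hand-rolled percent-decoding, for illustration only.

use std::string::FromUtf8Error;

/// Returns the value of one hexadecimal digit, if `byte` is one.
fn hex_val(byte: u8) -> Option<u8> {
    (byte as char).to_digit(16).map(|digit| digit as u8)
}

/// Replaces every well-formed `%XX` escape with the byte it denotes,
/// then interprets the resulting bytes as UTF-8.
fn decode_percent(input: &str) -> Result<String, FromUtf8Error> {
    let bytes = input.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'%' && i + 2 < bytes.len() {
            if let (Some(hi), Some(lo)) =
                (hex_val(bytes[i + 1]), hex_val(bytes[i + 2]))
            {
                out.push(hi * 16 + lo);
                i += 3;
                continue;
            }
        }
        // Not an escape (or a malformed one): copy the byte as-is.
        out.push(bytes[i]);
        i += 1;
    }
    String::from_utf8(out)
}

fn main() -> Result<(), FromUtf8Error> {
    println!("{}", decode_percent("Hello%2C%20World%21")?);
    println!("{}", decode_percent("%E7%89%B9%E6%AE%8A")?);
    Ok(())
}
```

Decoding works on bytes first and validates UTF-8 only at the end, because a multi-byte character such as 特 is spread across several %XX escapes.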