Uniform Resource Locators (URLs)
Parse a URL from a String to a Url Type
The url::Url::parse method from the url crate validates and parses a &str into a url::Url struct. The input string may be malformed, so this method returns Result<Url, ParseError>.
Once the URL has been parsed, it can be used with all of the methods of the url::Url type.
    use url::ParseError;
    use url::Url;

    /// Demonstrates URL parsing.
    ///
    /// This function parses a URL string and prints the path part of the URL.
    fn main() -> Result<(), ParseError> {
        // Define a URL string.
        let s = "https://github.com/rust-lang/rust/issues?labels=E-easy&state=open";
        let parsed = Url::parse(s)?;
        println!("The path part of the URL is: {}", parsed.path());
        Ok(())
    }
Create a Base URL by Removing Path Segments
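A base URL consists of a scheme (protocol) and a host (domain), with no path, query string, or fragment. To derive one, strip each of those parts from the given URL. The example below clears the path with url::Url::set_path, the query string with url::Url::set_query, and the fragment with url::Url::set_fragment. url::PathSegmentsMut::clear is an alternative way to remove the path segments.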
    //! This module demonstrates how to extract the base URL from a given URL.
    //!
    //! It uses the `url` crate to parse and manipulate URLs.
    use anyhow::Result;
    use url::Url;

    /// Extracts the base URL from a given URL.
    ///
    /// This function takes a URL and removes the path segments and query
    /// parameters, effectively returning the base URL.
    ///
    /// # Arguments
    ///
    /// * `url` - The URL to extract the base from.
    ///
    /// # Returns
    ///
    /// Returns a `Result` containing the base URL or an error if the URL cannot be
    /// processed.
    fn base_url(mut url: Url) -> Result<Url> {
        url.set_fragment(None);
        url.set_query(None);
        url.set_path("");
        // You could also use `path_segments_mut` to return an object with methods
        // to manipulate the URL's path segments.
        // match url.path_segments_mut() {
        //     Ok(mut path) => {
        //         path.clear();
        //     }
        //     Err(_) => {
        //         // Some (uncommon) URLs are said to be cannot-be-a-base:
        //         // they don't have a username, password, host, or port,
        //         // and their "path" is an arbitrary string rather than
        //         // slash-separated segments.
        //         return Err(anyhow::anyhow!("This URL is cannot-be-a-base."));
        //     }
        // }
        Ok(url)
    }

    fn main() -> Result<()> {
        let full = "https://github.com/rust-lang/cargo?asdf";
        let url = Url::parse(full)?;
        let base = base_url(url)?;
        assert_eq!(base.as_str(), "https://github.com/");
        println!("The base of the URL is: {}", base);
        Ok(())
    }
Create new URLs from a Base URL
The url::Url::join method creates a new URL from a base URL and a relative path.
    //! Demonstrates how to build a URL by joining a base URL with a path.
    use url::ParseError;
    use url::Url;

    /// Builds a GitHub URL by joining a base URL with a given path.
    ///
    /// # Arguments
    ///
    /// * `path` - The path to append to the base GitHub URL.
    ///
    /// Returns a `Result` containing the joined `Url` or a `ParseError`.
    fn build_github_url(path: &str) -> Result<Url, ParseError> {
        const GITHUB: &str = "https://github.com";
        let base =
            Url::parse(GITHUB).expect("This hardcoded URL is known to be valid");
        let joined = base.join(path)?;
        Ok(joined)
    }

    fn main() -> Result<(), ParseError> {
        let path = "/rust-lang/cargo";
        let gh = build_github_url(path)?;
        println!("The joined URL is: {}", gh);
        assert_eq!(gh.as_str(), "https://github.com/rust-lang/cargo");
        Ok(())
    }
Extract the URL Origin (Scheme / Host / Port)
The url::Url struct exposes methods such as scheme, host, and port_or_known_default to extract information about the URL it represents.
    //! Demonstrates parsing a URL and extracting its origin components.
    use url::Host;
    use url::ParseError;
    use url::Url;

    fn main() -> Result<(), ParseError> {
        let s = "ftp://rust-lang.org/examples";
        let url = Url::parse(s)?;
        assert_eq!(url.scheme(), "ftp");
        assert_eq!(url.host(), Some(Host::Domain("rust-lang.org")));
        assert_eq!(url.port_or_known_default(), Some(21));
        println!("The origin is as expected!");
        Ok(())
    }
url::Url::origin returns the same information as a single url::Origin value.
    //! This example demonstrates how to parse a URL and extract its origin.
    //!
    //! The `url::Url` struct is used to parse the URL string.
    //! The `url::Host` enum is used to represent the host part of the URL.
    //! The `url::Origin` enum is used to represent the origin of the URL.
    use anyhow::Result;
    use url::Host;
    use url::Origin;
    use url::Url;

    fn main() -> Result<()> {
        let s = "ftp://rust-lang.org/examples";
        let url = Url::parse(s)?;
        let expected_scheme = "ftp".to_owned();
        let expected_host = Host::Domain("rust-lang.org".to_owned());
        let expected_port = 21;
        let expected =
            Origin::Tuple(expected_scheme, expected_host, expected_port);
        let origin = url.origin();
        assert_eq!(origin, expected);
        println!("The origin is as expected!");
        Ok(())
    }

    #[test]
    fn test() -> anyhow::Result<()> {
        main()?;
        Ok(())
    }
Remove Fragment Identifiers and Query Pairs from a URL
Parse the URL with url::Url::parse, then slice it with url::Position to strip unneeded URL parts.
    //! This example demonstrates how to parse a URL and extract a portion of it.
    //!
    //! The `Url::parse` function is used to parse a URL string into a `Url` object.
    //! The `Position` enum is used to specify a position within the URL.
    //! In this case, `Position::AfterPath` is used to specify the position after
    //! the path. The `cleaned` variable is then assigned a slice of the URL string
    //! from the beginning to the specified position. Finally, the `cleaned` string
    //! is printed to the console.
    use url::ParseError;
    use url::Position;
    use url::Url;

    fn main() -> Result<(), ParseError> {
        let parsed = Url::parse(
            "https://github.com/rust-lang/rust/issues?labels=E-easy&state=open",
        )?;
        let cleaned: &str = &parsed[..Position::AfterPath];
        println!("`cleaned`: {}", cleaned);
        Ok(())
    }