General Programming Language Parsing

Parse JavaScript

swc_ecma_parser is a feature-complete ECMAScript / TypeScript parser written in Rust.

//! This example demonstrates how to use the `swc_ecma_parser` crate to parse
//! JavaScript code. It creates a simple JavaScript program, parses it, and
//! then prints the parsed Abstract Syntax Tree (AST) to the console.
//!
//! `swc_ecma_parser` is a library for parsing ECMAScript (JavaScript) code.
//!
//! In `Cargo.toml`, add:
//! ```toml
//! [dependencies]
//! swc_ecma_parser = "11.0.0" # Or latest
//! swc_ecma_ast = "8.0"
//! swc_common = "8.0"
//! ```
use swc_common::FileName;
use swc_common::input::StringInput;
use swc_common::sync::Lrc;
use swc_ecma_ast::EsVersion;
use swc_ecma_ast::Program;
use swc_ecma_parser::EsSyntax;
use swc_ecma_parser::Parser;
use swc_ecma_parser::Syntax;
use swc_ecma_parser::lexer::Lexer;

fn main() {
    let cm: Lrc<swc_common::SourceMap> = Default::default();
    let fm = cm.new_source_file(
        FileName::Custom("example.js".into()).into(),
        "const a = 1;".into(),
    );

    let lexer = Lexer::new(
        Syntax::Es(EsSyntax {
            jsx: true,
            ..Default::default()
        }),
        EsVersion::latest(),
        StringInput::from(&*fm),
        None,
    );

    let mut parser = Parser::new_from(lexer);
    let program: Program = parser.parse_program().expect("Failed to parse");

    // Print the parsed program.
    println!("{:#?}", program);
}

Parse SQL

sqlparser is a general SQL lexer and parser with support for ANSI SQL:2011.

//! This example demonstrates how to use the `sqlparser` crate to parse SQL
//! statements, extract information from them, and regenerate the original SQL
//! text from the parsed Abstract Syntax Tree (AST).
//!
//! Add to your `Cargo.toml`:
//! ```toml
//! [dependencies]
//! sqlparser = { version = "0.54.0", features = ["visitor"] }
//! anyhow = "1" # `main` returns `anyhow::Result`
//! ```
use std::ops::ControlFlow;

use sqlparser::ast::SetExpr;
use sqlparser::ast::Statement;
use sqlparser::dialect::GenericDialect;
use sqlparser::parser::Parser;

fn main() -> anyhow::Result<()> {
    let sql = "SELECT * FROM users WHERE age > 18";

    let dialect = GenericDialect {}; // Or AnsiDialect, PostgreSqlDialect, etc.

    // Create a parser for a `Dialect`. `try_with_sql` tokenizes the SQL
    // string and sets the parser's state to parse the resulting tokens.
    let mut parser = Parser::new(&dialect).try_with_sql(sql)?;
    // You may configure the parser with e.g. `.with_recursion_limit(n)` or
    // `.with_options(options)`, chained before `try_with_sql`.

    // Parse potentially multiple statements, separated by semicolons.
    let statements = parser.parse_statements()?;

    // You may also use `parse_sql`:
    // let statements = Parser::parse_sql(
    //   &dialect, "SELECT * FROM foo"
    // )?;

    // Iterate by reference so that `statements` can be reused below
    // without cloning the whole AST.
    for statement in &statements {
        // `statement` is a top-level construct: SELECT, INSERT, CREATE, etc.
        match statement {
            // SELECT statement.
            Statement::Query(query) => match &*query.body {
                // SELECT .. FROM .. HAVING (no ORDER BY or set operations).
                SetExpr::Select(select) => {
                    println!("SELECT statement:");
                    println!("  Projection: {:?}", select.projection);
                    println!("  From: {:?}", select.from);
                    println!("  Where: {:?}", select.selection);
                }
                _ => println!("Not a SELECT statement"),
            },
            _ => println!("Not a Query statement"),
        }
    }

    // The original SQL text can be generated from the AST
    // (Abstract Syntax Tree).
    assert_eq!(statements[0].to_string(), sql);

    // You may also visit all statements, expressions, or tables.
    // You can also implement a custom `Visitor`.
    let mut visited = vec![];
    sqlparser::ast::visit_statements(&statements, |stmt| {
        visited.push(format!("Statement: {}", stmt));
        ControlFlow::<()>::Continue(())
    });
    println!("{:?}", visited);

    Ok(())
}

See also diesel, an ORM and query builder. Note that diesel generates SQL from Rust code rather than parsing SQL text.

Parse Rust Code

syn parses Rust code into an AST. quote is often used alongside syn for code generation.

See Write Proc Macros.

Parse WebAssembly (WAT/WASM)

wat parses WAT (WebAssembly Text Format) and encodes it into the WebAssembly binary format. parity-wasm is a more general WebAssembly tooling library.

Refer to the WASM chapter.