Welcome to Chapter 11! In the previous chapters, we meticulously built the core components of our Mermaid analyzer and fixer: the lexer, parser, AST, validator, rule engine, and formatter. We laid a strong foundation with a focus on strict correctness and deterministic behavior. However, a production-grade tool is only as reliable as its test suite. This chapter is dedicated to establishing a comprehensive testing strategy that ensures the integrity, robustness, and long-term maintainability of our mermaid-tool.

In this chapter, we will implement three crucial types of testing: unit tests for granular component verification, golden (snapshot) tests for verifying complex input-output transformations, and fuzz testing to uncover unexpected edge cases and vulnerabilities by feeding random, malformed inputs. By the end of this chapter, our mermaid-tool will be backed by a robust test harness, giving us high confidence in its correctness and its ability to handle real-world, often messy, Mermaid code. This is a critical step towards a truly production-ready compiler-like tool.

Planning & Design

Building a compiler-like tool demands a multi-faceted testing approach. Our strategy will combine the precision of unit tests, the comprehensive coverage of golden tests, and the resilience-building power of fuzz testing.

Testing Strategy Overview

  1. Unit Tests: These are small, isolated tests that verify the behavior of individual functions, methods, or modules. They ensure that each component works as expected in isolation. We’ll add these directly within the src modules using #[cfg(test)].
  2. Golden Tests (Snapshot Testing): For components that transform input to output (like the lexer, parser, and formatter), golden tests are invaluable. They capture the expected output for a given input and store it as a “snapshot.” Subsequent test runs compare the current output against this snapshot, immediately flagging any regressions. This is particularly useful for verifying the complex structures of ASTs or formatted code. We’ll use the insta crate for this.
  3. Fuzz Testing: This technique involves feeding random, often malformed, inputs to our program to discover crashes, panics, or unexpected behavior. For a tool that processes arbitrary user input, fuzzing is essential for identifying robustness issues that traditional tests might miss. We’ll leverage cargo-fuzz to integrate libfuzzer with our Rust code.

Testing Process Flow

flowchart TD
    A[Start Testing Process] --> B{Test Type?}
    subgraph Unit_Tests["Unit Tests (src/**/*.rs)"]
        B -->|Unit Test| C[Lexer Unit Tests]
        C --> D[Parser Unit Tests]
        D --> E[Validator Unit Tests]
        E --> F[Rule Engine Unit Tests]
        F --> G[Formatter Unit Tests]
    end
    subgraph Golden_Tests["Golden Tests (tests/golden/)"]
        B -->|Golden Test| H[Lexer Snapshot Tests]
        H --> I[Parser Snapshot Tests]
        I --> J[Formatter Snapshot Tests]
        J --> K{Snapshots Match?}
        K -->|Yes| L[Golden Tests Passed]
        K -->|No| M[Review & Accept Snapshots]
    end
    subgraph Fuzz_Tests["Fuzz Tests (fuzz/)"]
        B -->|Fuzz Test| N[Lexer Fuzzing]
        N --> O[Parser Fuzzing]
        O --> P{Crash/Panic Found?}
        P -->|Yes| Q[Analyze Crash & Fix Bug]
        P -->|No| R[Fuzzing Continues]
    end
    L --> S[All Tests Complete]
    M --> A
    Q --> A

File Structure for Tests

Our project will adopt the following test file structure:

  • Unit Tests: Located alongside the code they test, typically in a tests submodule within the src file (e.g., src/lexer.rs would have a mod tests { ... }).
  • Golden Tests: Reside in the top-level tests/ directory, following a convention like tests/golden/<component>_snapshots/.
  • Fuzz Tests: Managed by cargo-fuzz in a dedicated fuzz/ directory at the project root.
mermaid-tool/
├── src/
│   ├── main.rs
│   ├── lexer.rs          # Contains lexer code and its unit tests
│   ├── parser.rs         # Contains parser code and its unit tests
│   ├── ast.rs
│   ├── validator.rs      # Contains validator code and its unit tests
│   ├── rule_engine.rs    # Contains rule engine code and its unit tests
│   ├── formatter.rs      # Contains formatter code and its unit tests
│   └── diagnostics.rs
├── tests/
│   ├── golden/
│   │   ├── lexer_snapshots/
│   │   ├── parser_snapshots/
│   │   └── formatter_snapshots/
│   └── integration_tests.rs # (Optional, for higher-level CLI tests)
├── fuzz/
│   ├── Cargo.toml
│   └── fuzz_targets/
│       ├── lexer_fuzzer.rs
│       └── parser_fuzzer.rs
└── Cargo.toml

Step-by-Step Implementation

1. Setup/Configuration: Adding Test Dependencies

First, we need to add the necessary testing crates to our Cargo.toml.

Open Cargo.toml at the project root and add the following under [dev-dependencies]:

# Cargo.toml

[dev-dependencies]
# For snapshot testing
insta = { version = "1.34", features = ["yaml"] }
# For fuzz testing (not a direct dependency, but cargo-fuzz uses it)
# We'll install cargo-fuzz separately
# proptest = "1.2" # Optional for property-based testing, but fuzzing is more comprehensive

Why these dependencies?

  • insta: A powerful snapshot testing library for Rust. It serializes data structures (like our tokens or AST) or strings to a file and compares them on subsequent runs. If there’s a mismatch, it provides a diff and allows easy updating. We enable the yaml feature for human-readable snapshot files.

2. Core Implementation: Unit Tests

We’ve likely already added some basic unit tests in previous chapters. Now, let’s ensure each core module has comprehensive unit tests covering its specific logic, edge cases, and error conditions. We’ll focus on demonstrating the structure for a couple of key modules.

a) Lexer Unit Tests

Navigate to src/lexer.rs. Inside, add or expand the #[cfg(test)] module.

// src/lexer.rs

// ... existing lexer code ...

#[cfg(test)]
mod tests {
    use super::*;
    use crate::diagnostics::{Diagnostic, Span}; // Assuming Diagnostic and Span are defined
    use crate::token::{Token, TokenType};

    #[test]
    fn test_empty_input() {
        let mut lexer = Lexer::new("");
        assert_eq!(lexer.next_token().unwrap().token_type, TokenType::Eof);
        assert!(lexer.diagnostics.is_empty());
    }

    #[test]
    fn test_basic_graph_declaration() {
        let mut lexer = Lexer::new("graph TD");
        assert_eq!(lexer.next_token().unwrap().token_type, TokenType::KeywordGraph);
        assert_eq!(lexer.next_token().unwrap().token_type, TokenType::DirectionTD);
        assert_eq!(lexer.next_token().unwrap().token_type, TokenType::Eof);
        assert!(lexer.diagnostics.is_empty());
    }

    #[test]
    fn test_node_and_edge() {
        let input = "A-->B";
        let mut lexer = Lexer::new(input);
        assert_eq!(lexer.next_token().unwrap().token_type, TokenType::Identifier); // A
        assert_eq!(lexer.next_token().unwrap().token_type, TokenType::ArrowDirectional); // -->
        assert_eq!(lexer.next_token().unwrap().token_type, TokenType::Identifier); // B
        assert_eq!(lexer.next_token().unwrap().token_type, TokenType::Eof);
        assert!(lexer.diagnostics.is_empty());
    }

    #[test]
    fn test_multiline_input_with_comments() {
        let input = r#"
        graph TD
            A[Node A] --> B(Node B) %% This is a comment
            B --> C{Decision}
        "#;
        let mut lexer = Lexer::new(input);

        // Expect tokens, skipping whitespace
        assert_eq!(lexer.next_token().unwrap().token_type, TokenType::KeywordGraph);
        assert_eq!(lexer.next_token().unwrap().token_type, TokenType::DirectionTD);
        assert_eq!(lexer.next_token().unwrap().token_type, TokenType::Identifier); // A
        assert_eq!(lexer.next_token().unwrap().token_type, TokenType::BracketSquareOpen); // [
        assert_eq!(lexer.next_token().unwrap().token_type, TokenType::StringLiteral); // Node A
        assert_eq!(lexer.next_token().unwrap().token_type, TokenType::BracketSquareClose); // ]
        assert_eq!(lexer.next_token().unwrap().token_type, TokenType::ArrowDirectional); // -->
        assert_eq!(lexer.next_token().unwrap().token_type, TokenType::Identifier); // B
        assert_eq!(lexer.next_token().unwrap().token_type, TokenType::ParenOpen); // (
        assert_eq!(lexer.next_token().unwrap().token_type, TokenType::StringLiteral); // Node B
        assert_eq!(lexer.next_token().unwrap().token_type, TokenType::ParenClose); // )
        assert_eq!(lexer.next_token().unwrap().token_type, TokenType::Comment); // %% This is a comment
        assert_eq!(lexer.next_token().unwrap().token_type, TokenType::Identifier); // B
        assert_eq!(lexer.next_token().unwrap().token_type, TokenType::ArrowDirectional); // -->
        assert_eq!(lexer.next_token().unwrap().token_type, TokenType::Identifier); // C
        assert_eq!(lexer.next_token().unwrap().token_type, TokenType::BracketCurlyOpen); // {
        assert_eq!(lexer.next_token().unwrap().token_type, TokenType::StringLiteral); // Decision
        assert_eq!(lexer.next_token().unwrap().token_type, TokenType::BracketCurlyClose); // }
        assert_eq!(lexer.next_token().unwrap().token_type, TokenType::Eof);
        assert!(lexer.diagnostics.is_empty());
    }

    #[test]
    fn test_invalid_character() {
        let mut lexer = Lexer::new("A -> #Invalid");
        // Lexer should tokenize valid parts and report error for invalid char
        assert_eq!(lexer.next_token().unwrap().token_type, TokenType::Identifier); // A
        assert_eq!(lexer.next_token().unwrap().token_type, TokenType::ArrowDirectional); // ->
        assert_eq!(lexer.next_token().unwrap().token_type, TokenType::Invalid); // #
        assert_eq!(lexer.next_token().unwrap().token_type, TokenType::Identifier); // Invalid
        assert_eq!(lexer.next_token().unwrap().token_type, TokenType::Eof);
        assert_eq!(lexer.diagnostics.len(), 1);
        assert_eq!(lexer.diagnostics[0].code, "L001"); // Assuming L001 is for invalid character
    }

    // Add more tests for:
    // - All arrow types
    // - Different bracket types
    // - String literals with escaped quotes
    // - Subgraphs
    // - Class diagrams keywords
    // - Sequence diagram syntax
    // - Numbers, operators, etc.
}

Explanation:

  • We use #[cfg(test)] to ensure this module is only compiled when running tests.
  • use super::*; brings the Lexer struct and its methods into scope.
  • Each #[test] function focuses on a specific aspect of the lexer’s behavior.
  • We manually assert the TokenType of each token returned by lexer.next_token().
  • We also check lexer.diagnostics to ensure no unexpected errors are reported for valid input, and expected errors for invalid input.
  • This granular testing helps pinpoint issues quickly.

b) Parser Unit Tests

Similarly, for src/parser.rs, you would create a #[cfg(test)] module to test parsing logic.

// src/parser.rs

// ... existing parser code ...

#[cfg(test)]
mod tests {
    use super::*;
    use crate::lexer::Lexer; // Need Lexer to generate tokens
    use crate::ast::*; // Assuming AST structs are in ast.rs
    use crate::diagnostics::{Diagnostic, Severity};

    fn parse_test_input(input: &str) -> (Option<Diagram>, Vec<Diagnostic>) {
        let mut lexer = Lexer::new(input);
        let tokens = lexer.tokenize(); // Assuming a tokenize method that returns Vec<Token>
        let mut parser = Parser::new(tokens);
        let diagram = parser.parse();
        (diagram, parser.diagnostics)
    }

    #[test]
    fn test_simple_flowchart_td() {
        let input = "graph TD\nA-->B";
        let (diagram, diagnostics) = parse_test_input(input);

        assert!(diagnostics.is_empty(), "Diagnostics: {:?}", diagnostics);
        let diagram = diagram.expect("Should have parsed a diagram");
        assert_eq!(diagram.diagram_type, DiagramType::Flowchart);
        assert_eq!(diagram.direction, Some(GraphDirection::TD));
        // Further assertions to check the structure of nodes and edges in the AST
        if let DiagramContent::Flowchart(fc) = diagram.content {
            assert_eq!(fc.nodes.len(), 2);
            assert_eq!(fc.edges.len(), 1);
            assert_eq!(fc.nodes[0].id.as_str(), "A");
            assert_eq!(fc.nodes[1].id.as_str(), "B");
            assert_eq!(fc.edges[0].source_node.as_str(), "A");
            assert_eq!(fc.edges[0].target_node.as_str(), "B");
            assert_eq!(fc.edges[0].arrow.edge_type, EdgeType::Directional);
        } else {
            panic!("Expected a flowchart diagram");
        }
    }

    #[test]
    fn test_flowchart_with_subgraph() {
        let input = r#"
        graph LR
            subgraph My Subgraph
                A --> B
            end
            C --> A
        "#;
        let (diagram, diagnostics) = parse_test_input(input);

        assert!(diagnostics.is_empty(), "Diagnostics: {:?}", diagnostics);
        let diagram = diagram.expect("Should have parsed a diagram");
        // Assertions for subgraph presence and structure
        if let DiagramContent::Flowchart(fc) = diagram.content {
            assert_eq!(fc.subgraphs.len(), 1);
            assert_eq!(fc.subgraphs[0].id.as_str(), "My_Subgraph"); // Normalized ID
            assert_eq!(fc.subgraphs[0].label, Some("My Subgraph".to_string()));
            assert_eq!(fc.subgraphs[0].nodes.len(), 2);
            assert_eq!(fc.subgraphs[0].edges.len(), 1);
            assert_eq!(fc.edges.len(), 1); // C --> A
        } else {
            panic!("Expected a flowchart diagram");
        }
    }

    #[test]
    fn test_parser_error_unmatched_bracket() {
        let input = "graph TD\nA[Node A --> B"; // Missing closing bracket
        let (diagram, diagnostics) = parse_test_input(input);

        assert!(diagram.is_none()); // Parsing should fail or be incomplete
        assert!(!diagnostics.is_empty());
        assert_eq!(diagnostics[0].severity, Severity::Error);
        assert_eq!(diagnostics[0].code, "P001"); // Assuming P001 is for unmatched bracket
        assert!(diagnostics[0].message.contains("Expected ']'"));
    }

    // Add more tests for:
    // - All diagram types (sequence, class, state, gantt, etc.)
    // - Complex node definitions (shapes, labels)
    // - Various edge types and labels
    // - Multiline statements
    // - Error recovery scenarios
    // - Empty diagrams
}

Explanation:

  • A helper parse_test_input function simplifies test setup by handling lexing and parsing.
  • Tests assert the presence and correctness of Diagram and its content (e.g., Flowchart struct).
  • Error cases are tested by asserting that diagram is None and diagnostics contain the expected errors.
  • These tests ensure the parser correctly translates token streams into the expected AST structure and reports errors accurately.

3. Core Implementation: Golden Tests (Snapshot Testing with insta)

Golden tests are perfect for verifying that our lexer, parser, and formatter consistently produce the expected output for a given input.

a) Setting up Golden Test Directory

Create the tests/golden/ directory and its subdirectories:

mkdir -p tests/golden/lexer_snapshots
mkdir -p tests/golden/parser_snapshots
mkdir -p tests/golden/formatter_snapshots
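One caveat: Cargo compiles only files placed directly under tests/ as integration test crates, so test files inside tests/golden/ are ignored unless a top-level file declares them as modules. A minimal wiring sketch, assuming the file names used in this chapter (this is build wiring, not executable logic on its own):

```rust
// tests/golden_tests.rs -- hypothetical harness file at the top of tests/.
// Cargo builds this file as one test crate; the #[path] attributes pull in
// the golden test files from the tests/golden/ subdirectory as modules.
#[path = "golden/test_lexer.rs"]
mod test_lexer;
#[path = "golden/test_parser.rs"]
mod test_parser;
#[path = "golden/test_formatter.rs"]
mod test_formatter;
```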

b) Lexer Golden Tests

Create a new file tests/golden/test_lexer.rs:

// tests/golden/test_lexer.rs

use insta::assert_debug_snapshot;
// Integration tests link against the library crate, so import via the crate
// name (`mermaid_tool`), not `crate::`.
use mermaid_tool::lexer::Lexer;
use mermaid_tool::token::Token; // Assuming Token is public

#[test]
fn lexer_snapshot_basic_flowchart() {
    let input = "graph TD\n    A[Start] --> B(Process)\n    B --> C{Decision}\n    C -->|Yes| D[End]";
    let mut lexer = Lexer::new(input);
    let tokens: Vec<Token> = lexer.tokenize(); // Assuming tokenize() method returns Vec<Token>
    assert_debug_snapshot!("lexer_basic_flowchart", tokens);
}

#[test]
fn lexer_snapshot_sequence_diagram() {
    let input = r#"
    sequenceDiagram
        Alice->>Bob: Hello Bob, how are you?
        Bob-->>Alice: I am good thanks!
    "#;
    let mut lexer = Lexer::new(input);
    let tokens: Vec<Token> = lexer.tokenize();
    assert_debug_snapshot!("lexer_sequence_diagram", tokens);
}

#[test]
fn lexer_snapshot_class_diagram_with_members() {
    let input = r#"
    classDiagram
        Class01 <|-- Class02
        Class03 *-- Class04
        Class05 o-- Class06
        Class07 .. Class08
        Class09 -- Class10
        Class01 : +getField1()
        Class01 : +getField2()
        Class01 : +method1()
        Class01 : -method2()
        Class02 : +method3()
    "#;
    let mut lexer = Lexer::new(input);
    let tokens: Vec<Token> = lexer.tokenize();
    assert_debug_snapshot!("lexer_class_diagram_members", tokens);
}

#[test]
fn lexer_snapshot_multiline_labels_and_comments() {
    let input = r#"
    graph TD
        A["This is a very
        long multiline label"] --> B{A decision
        point}; %% Important comment
    "#;
    let mut lexer = Lexer::new(input);
    let tokens: Vec<Token> = lexer.tokenize();
    assert_debug_snapshot!("lexer_multiline_labels_comments", tokens);
}

// Add more snapshots for complex syntax, error cases, various diagram types.

Explanation:

  • use insta::assert_debug_snapshot; brings the macro into scope.
  • lexer.tokenize() is assumed to be a method that consumes the lexer and returns a Vec<Token>. If Lexer only has next_token(), you’d need to collect them.
  • The first argument to assert_debug_snapshot! is a name for the snapshot file. By default, insta writes snapshots into a snapshots/ directory next to the test file (e.g., tests/golden/snapshots/). To match our lexer_snapshots/ convention, you can redirect it with insta::with_settings!({snapshot_path => "lexer_snapshots"}, { ... }) around the assertion.
  • The second argument is the value to be serialized. We use debug serialization, so your Token struct and its members should derive Debug.
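If your Lexer exposes only next_token(), the collection loop is simple. Here is a self-contained sketch; MiniLexer and Tok are hypothetical stand-ins for this chapter's Lexer and Token so the pattern can run on its own:

```rust
// Stand-in token type; the real TokenType is richer.
#[derive(Debug, PartialEq)]
enum Tok {
    Word(String),
    Eof,
}

// Stand-in lexer exposing only a `next_token` method.
struct MiniLexer {
    words: Vec<String>,
    pos: usize,
}

impl MiniLexer {
    fn new(input: &str) -> Self {
        Self {
            words: input.split_whitespace().map(str::to_string).collect(),
            pos: 0,
        }
    }

    fn next_token(&mut self) -> Tok {
        match self.words.get(self.pos) {
            Some(w) => {
                self.pos += 1;
                Tok::Word(w.clone())
            }
            None => Tok::Eof,
        }
    }
}

// Collect tokens until EOF; include the Eof token so snapshots capture it.
fn tokenize(lexer: &mut MiniLexer) -> Vec<Tok> {
    let mut tokens = Vec::new();
    loop {
        let tok = lexer.next_token();
        let done = tok == Tok::Eof;
        tokens.push(tok);
        if done {
            break;
        }
    }
    tokens
}

fn main() {
    let mut lexer = MiniLexer::new("graph TD");
    println!("{:?}", tokenize(&mut lexer)); // [Word("graph"), Word("TD"), Eof]
}
```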

c) Parser Golden Tests

Create a new file tests/golden/test_parser.rs:

// tests/golden/test_parser.rs

use insta::assert_debug_snapshot;
// Import via the crate name, not `crate::`, since this is an integration test.
use mermaid_tool::lexer::Lexer;
use mermaid_tool::parser::Parser;
use mermaid_tool::ast::Diagram; // Assuming Diagram is public and derives Debug

fn parse_and_get_diagram(input: &str) -> Option<Diagram> {
    let mut lexer = Lexer::new(input);
    let tokens = lexer.tokenize();
    let mut parser = Parser::new(tokens);
    parser.parse()
}

#[test]
fn parser_snapshot_simple_flowchart() {
    let input = "graph TD\n    A[Start] --> B(Process)";
    let diagram = parse_and_get_diagram(input);
    assert_debug_snapshot!("parser_simple_flowchart", diagram);
}

#[test]
fn parser_snapshot_flowchart_with_subgraph() {
    let input = r#"
    graph LR
        subgraph My_Subgraph["My Subgraph"]
            A --> B
        end
        C --> A
    "#;
    let diagram = parse_and_get_diagram(input);
    assert_debug_snapshot!("parser_flowchart_with_subgraph", diagram);
}

#[test]
fn parser_snapshot_sequence_diagram_full() {
    let input = r#"
    sequenceDiagram
        participant Alice
        participant Bob
        Alice->>Bob: Hello Bob, how are you?
        activate Bob
        Bob-->>Alice: I am good thanks!
        deactivate Bob
        Note left of Alice: Alice thinks a lot
        alt successful case
            Alice->Bob: Let's go!
        else some other way
            Alice->Bob: Oh no!
        end
    "#;
    let diagram = parse_and_get_diagram(input);
    assert_debug_snapshot!("parser_sequence_diagram_full", diagram);
}

// Add more snapshots for various diagram types, complex structures,
// and even malformed inputs where you expect a partial AST or specific errors.

Explanation:

  • The parse_and_get_diagram helper simplifies getting the AST.
  • We assert the Debug representation of the Option<Diagram> directly. This will capture the entire parsed AST structure.
  • For insta to work, all structs within your Diagram (e.g., Node, Edge, Subgraph, DiagramType, etc.) must derive Debug.

d) Formatter Golden Tests

Create a new file tests/golden/test_formatter.rs:

// tests/golden/test_formatter.rs

use insta::assert_snapshot; // Using assert_snapshot for string output
// Import via the crate name, not `crate::`, since this is an integration test.
use mermaid_tool::lexer::Lexer;
use mermaid_tool::parser::Parser;
use mermaid_tool::formatter::Formatter; // Assuming a Formatter struct

fn format_test_input(input: &str) -> String {
    let mut lexer = Lexer::new(input);
    let tokens = lexer.tokenize();
    let mut parser = Parser::new(tokens);
    let diagram = parser.parse().expect("Failed to parse for formatting test"); // Expect valid diagrams for formatter tests
    let mut formatter = Formatter::new(); // Assuming Formatter can be initialized
    formatter.format_diagram(&diagram) // Assuming format_diagram returns a String
}

#[test]
fn formatter_snapshot_simple_flowchart() {
    let input = "graph TD\nA-->B";
    let formatted_output = format_test_input(input);
    assert_snapshot!("formatter_simple_flowchart", formatted_output);
}

#[test]
fn formatter_snapshot_unformatted_flowchart() {
    let input = "graph TD\n    A[Node A]    -->  B(Node B)\nB   -->C{Decision}\n    C-->  |Yes|D[End]"; // messy but parseable: the helper expects a valid diagram
    let formatted_output = format_test_input(input);
    assert_snapshot!("formatter_unformatted_flowchart", formatted_output);
}

#[test]
fn formatter_snapshot_multiline_labels_and_comments() {
    let input = r#"
    graph TD
        A["This is a very
        long multiline label"] --> B{A decision
        point}; %% Important comment
    "#;
    let formatted_output = format_test_input(input);
    assert_snapshot!("formatter_multiline_labels_comments", formatted_output);
}

#[test]
fn formatter_snapshot_sequence_diagram_complex() {
    let input = r#"
    sequenceDiagram
        participant Alice as The Great Alice
        participant Bob as Mr. Bob
        Alice->>Bob: Hello Bob, how are you?
        Note right of Bob: Bob processes request
        activate Bob
        Bob-->>Alice: I am good thanks!
        deactivate Bob
        alt successful case
            Alice->>Bob: Let's proceed
        else some other way
            Alice->>Bob: Abort!
        end
        loop Every day
            Bob->>Alice: Daily report
        end
    "#;
    let formatted_output = format_test_input(input);
    assert_snapshot!("formatter_sequence_diagram_complex", formatted_output);
}

// Add more formatter tests to cover all diagram types and complex formatting rules.

Explanation:

  • For formatter tests, insta::assert_snapshot! is often preferred over assert_debug_snapshot! because we want to compare the actual string output, not its debug representation.
  • The format_test_input helper ensures we’re testing the full pipeline from raw input to formatted output.

Running Golden Tests:

  1. First Run (Generate Snapshots):
    cargo test
    
    This will run all tests. insta will detect missing snapshots and write the current output as pending .snap.new files; by default, the corresponding tests fail until those pending snapshots are reviewed and accepted.
  2. Review and Accept Snapshots:
    cargo insta review
    
    This command opens an interactive tool (or prints to the console if not interactive) that lets you review the newly generated or changed snapshots. You can accept them (if they’re correct) or reject them (if they represent a bug). To accept all pending snapshots at once, run cargo insta accept.
  3. Subsequent Runs:
    cargo test
    
    Now, cargo test will compare the current output against the accepted snapshots. If any mismatch occurs, the test will fail, and insta will provide a diff. You’d then use cargo insta review again to either accept the new behavior (if it’s an intentional change) or fix your code (if it’s a regression).

4. Core Implementation: Fuzz Testing (cargo-fuzz)

Fuzz testing is crucial for discovering unexpected panics or crashes when our tool processes malformed or random Mermaid code. This is particularly important for the lexer and parser, which are the first line of defense against arbitrary input.

a) Install cargo-fuzz

If you haven’t already, install the cargo-fuzz command-line tool:

cargo install cargo-fuzz

b) Initialize Fuzz Project

Navigate to your project root (mermaid-tool/) and initialize the fuzzing setup:

cargo fuzz init

This command creates a fuzz/ directory with its own Cargo.toml and an example fuzz target.

c) Configure fuzz/Cargo.toml

The fuzz/Cargo.toml needs to know about our main project’s crates. Open fuzz/Cargo.toml and add a path dependency to mermaid-tool:

# fuzz/Cargo.toml

[package]
name = "mermaid-tool-fuzz"
version = "0.0.0"
publish = false
edition = "2021"

[dependencies]
libfuzzer-sys = "0.4"
# Add a path dependency to your main project crate
mermaid-tool = { path = ".." }

[[bin]]
name = "lexer_fuzzer"
path = "fuzz_targets/lexer_fuzzer.rs"
test = false
doc = false

[[bin]]
name = "parser_fuzzer"
path = "fuzz_targets/parser_fuzzer.rs"
test = false
doc = false

Explanation:

  • libfuzzer-sys: Provides the interface for libfuzzer.
  • mermaid-tool = { path = ".." }: This is critical. It tells the fuzz project how to find and link against our main mermaid-tool library, allowing fuzz targets to call our lexer and parser.
  • [[bin]] sections declare our fuzz targets as binaries.

d) Create Fuzz Targets

Now, create the actual fuzz targets in fuzz/fuzz_targets/.

fuzz/fuzz_targets/lexer_fuzzer.rs

// fuzz/fuzz_targets/lexer_fuzzer.rs

#![no_main]
use libfuzzer_sys::fuzz_target;

// Import our lexer from the main project
use mermaid_tool::lexer::Lexer;

fuzz_target!(|data: &[u8]| {
    // Convert the raw fuzzer input to a string.
    // Handle potential invalid UTF-8 gracefully to avoid panics during conversion,
    // as libfuzzer can produce arbitrary byte sequences.
    if let Ok(s) = std::str::from_utf8(data) {
        let mut lexer = Lexer::new(s);
        // The goal here is to ensure the lexer doesn't panic or crash,
        // even with malformed input. We don't care about the output correctness,
        // just that it completes without an unhandled error.
        let _ = lexer.tokenize(); // Assuming tokenize() method
        // Or if using next_token() repeatedly:
        // loop {
        //     let token = lexer.next_token().unwrap(); // Use unwrap() if it can't fail, or handle error if it returns Result
        //     if token.token_type == mermaid_tool::token::TokenType::Eof {
        //         break;
        //     }
        // }
    }
});

Explanation:

  • #![no_main] tells Rust not to expect a main function, as libfuzzer provides its own entry point.
  • fuzz_target!(|data: &[u8]| { ... }); is the macro provided by libfuzzer-sys that defines the fuzzing entry point. data is the random input.
  • We attempt to convert data to a &str. Fuzzers often generate non-UTF-8 bytes, so from_utf8 is crucial. We only proceed if it’s valid UTF-8, to focus on the lexer’s handling of valid strings, even if they are syntactically incorrect Mermaid. If you want to test invalid UTF-8, you’d feed the raw data bytes directly to a lexer that handles byte streams.
  • We then instantiate our Lexer and call its tokenize method. The key is that this call should never panic or crash, regardless of the input. Any panic found by the fuzzer is a bug.
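As an alternative to discarding invalid UTF-8 entirely, some fuzz targets lossily convert the bytes so that inputs mixing valid Mermaid text with arbitrary byte sequences still reach the lexer (invalid bytes become U+FFFD). A small self-contained sketch of that conversion:

```rust
// Lossy conversion: invalid UTF-8 sequences are replaced with U+FFFD rather
// than causing the whole input to be skipped. The fuzzer then also exercises
// the lexer on partially-garbled inputs.
fn to_fuzz_string(data: &[u8]) -> String {
    String::from_utf8_lossy(data).into_owned()
}

fn main() {
    let data: &[u8] = b"graph TD \xff\xfe A-->B";
    let s = to_fuzz_string(data);
    // the replacement character marks where the invalid bytes were
    println!("{}", s.contains('\u{FFFD}')); // true
}
```

In the fuzz target, you would pass `to_fuzz_string(data)` to `Lexer::new` instead of the `from_utf8` guard.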

fuzz/fuzz_targets/parser_fuzzer.rs

// fuzz/fuzz_targets/parser_fuzzer.rs

#![no_main]
use libfuzzer_sys::fuzz_target;

use mermaid_tool::lexer::Lexer;
use mermaid_tool::parser::Parser;

fuzz_target!(|data: &[u8]| {
    if let Ok(s) = std::str::from_utf8(data) {
        // First, lex the input
        let mut lexer = Lexer::new(s);
        let tokens = lexer.tokenize();

        // Then, parse the tokens
        let mut parser = Parser::new(tokens);
        let _ = parser.parse(); // The goal is to not panic during parsing
    }
});

Explanation:

  • This target takes the raw input, lexes it, and then parses the resulting tokens.
  • Again, the objective is to ensure that the entire lexing-parsing pipeline does not panic or crash, even with arbitrary, syntactically invalid token streams.

e) Running Fuzz Tests

To run a specific fuzz target:

cargo fuzz run lexer_fuzzer
# Or for the parser fuzzer:
cargo fuzz run parser_fuzzer

Interpreting Fuzz Results:

  • libfuzzer will continuously generate and mutate inputs.
  • If it finds an input that causes a crash (e.g., a panic! or segmentation fault), it will stop, print a stack trace, and save the problematic input to a file in fuzz/artifacts/<target_name>/.
  • You then take this artifact file, use it as a test case in your unit or golden tests, debug the issue, and fix it.
  • Fuzzing can run for hours or days. It’s often integrated into CI/CD for continuous security and stability checks.
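The "artifact becomes a test case" step can be sketched as follows. Here, process is a hypothetical stand-in for the real lexer/parser pipeline, and the crash bytes are illustrative, not a real artifact:

```rust
// Regression-test pattern for fuzzer findings: embed the crashing input in a
// test so the bug, once fixed, stays fixed. In the real project you would
// paste the bytes from fuzz/artifacts/<target_name>/ and call the actual
// Lexer and Parser instead of this stand-in.
fn process(data: &[u8]) {
    // mirror the fuzz target: only valid UTF-8 reaches the pipeline
    if let Ok(s) = std::str::from_utf8(data) {
        let _ = s.split_whitespace().count(); // stand-in for lex + parse
    }
}

#[test]
fn regression_no_panic_on_fuzzer_input() {
    // illustrative bytes; a real case would come from the saved artifact file
    process(b"graph TD\nA-->\xffB");
    // reaching this point means no panic occurred
}

fn main() {
    process(b"graph TD\nA-->\xffB");
    println!("ok");
}
```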

Production Considerations

  1. CI/CD Integration: All unit, golden, and fuzz tests should be integrated into your CI/CD pipeline.
    • cargo test should run on every pull request or commit.
    • cargo insta review should be run locally, and accepted snapshots committed. CI should only run cargo test which fails if snapshots are not up-to-date.
    • Fuzz tests can be run periodically (e.g., nightly builds) or on dedicated infrastructure due to their long-running nature. A common pattern is to run each fuzz target for a fixed duration in CI, e.g., cargo fuzz run lexer_fuzzer -- -max_total_time=300 for a five-minute run.
  2. Test Maintenance:
    • Snapshots: When making intentional changes to lexer, parser, or formatter output, remember to cargo insta review and commit the updated snapshots. Neglecting this leads to constant test failures.
    • Fuzzing Corpora: Over time, cargo-fuzz builds up a “corpus” of interesting inputs that have helped find bugs or increase code coverage. This corpus is stored in fuzz/corpus/<target_name>/. Commit these corpus files to your repository to accelerate future fuzzing runs and share findings.
  3. Performance:
    • Large golden test snapshots can increase repository size and cargo test runtime if not managed. Keep snapshots focused.
    • Fuzzing is CPU-intensive. Ensure your CI/CD setup can handle it without blocking other critical tasks.
  4. Security: Fuzz testing directly contributes to the security of our tool by finding unexpected crashes that could potentially be exploited (e.g., leading to denial-of-service if the tool is used in a server context, or simply data corruption).

Code Review Checkpoint

At this stage, we have significantly enhanced our project’s reliability.

Files Created/Modified:

  • Cargo.toml: Added insta to [dev-dependencies].
  • src/lexer.rs: Added comprehensive unit tests within mod tests.
  • src/parser.rs: Added comprehensive unit tests within mod tests.
  • src/validator.rs: (Likely, similar additions for validator unit tests)
  • src/rule_engine.rs: (Likely, similar additions for rule engine unit tests)
  • src/formatter.rs: (Likely, similar additions for formatter unit tests)
  • tests/golden/test_lexer.rs: New file for lexer snapshot tests.
  • tests/golden/test_parser.rs: New file for parser snapshot tests.
  • tests/golden/test_formatter.rs: New file for formatter snapshot tests.
  • tests/golden/lexer_snapshots/*.snap: New snapshot files generated by insta.
  • tests/golden/parser_snapshots/*.snap: New snapshot files generated by insta.
  • tests/golden/formatter_snapshots/*.snap: New snapshot files generated by insta.
  • fuzz/Cargo.toml: New file created by cargo fuzz init, modified to include mermaid-tool path dependency.
  • fuzz/fuzz_targets/lexer_fuzzer.rs: New file for lexer fuzz target.
  • fuzz/fuzz_targets/parser_fuzzer.rs: New file for parser fuzz target.
  • fuzz/corpus/<target_name>/: New directory/files for fuzzing corpus.

Integration with Existing Code:

  • The unit tests directly exercise the public and internal (via super::*) APIs of our modules.
  • Golden tests integrate the lexer, parser, and formatter pipeline to verify end-to-end transformations.
  • Fuzz tests call the core lexer and parser functions, ensuring their robustness against arbitrary inputs.

Common Issues & Solutions

  1. Snapshot Mismatches:
    • Issue: cargo test fails with “snapshot mismatch” errors.
    • Solution: This usually means your code’s output has changed.
      • If the change is intentional: Run cargo insta review to inspect the diff and accept the new snapshot. Commit the updated .snap files.
      • If the change is unintentional (a bug): The diff will help you pinpoint where your code’s behavior diverged from the expectation. Fix the bug, then re-run cargo test.
  2. Fuzzer Finds a Crash:
    • Issue: cargo fuzz run <target> reports a crash and saves an artifact file.
    • Solution:
      1. Take the artifact file (e.g., fuzz/artifacts/lexer_fuzzer/crash-...).
      2. Create a new unit test or golden test case using the content of this artifact file as input.
      3. Run the test to reproduce the crash reliably in a debugger-friendly environment.
      4. Debug the code to understand why it crashed and implement a fix (e.g., better error handling, boundary checks, input validation).
      5. Once fixed, ensure the new test case passes and the fuzzer no longer crashes on that input.
  3. Slow Test Suite:
    • Issue: cargo test takes a long time to run.
    • Solution:
      • Parallelization: Rust tests run in parallel by default, but some tests might be I/O bound or hold locks.
      • Test Granularity: Ensure unit tests are small and fast.
      • Snapshot Size: Very large snapshots can be slow to compare and increase repo size. Consider if a smaller, representative snapshot is sufficient, or if the component needs to be broken down.
      • Profiling: Time individual tests, or use a profiler such as cargo flamegraph, to identify slow ones.
      • Conditional Tests: For very long-running tests (like extensive fuzzing), run them only in specific CI stages (e.g., nightly builds) or locally, not on every commit.

Testing & Verification

To verify all the testing infrastructure we’ve set up:

  1. Run All Unit and Golden Tests:

    cargo test
    

    All tests should pass. If any golden tests fail, use cargo insta review to accept new snapshots or fix your code.

  2. Run Fuzz Tests (briefly):

    cargo fuzz run lexer_fuzzer -- -runs=10000 # Run for 10,000 iterations
    cargo fuzz run parser_fuzzer -- -runs=10000 # Run for 10,000 iterations
    

    These commands will run the fuzzers for a limited number of iterations. They should complete without reporting any crashes. For full coverage, you’d let them run much longer.

By following these steps, you will have a robust and reliable test suite for your mermaid-tool, ensuring that future changes don’t introduce regressions and that the tool remains stable even with unexpected inputs.

Summary & Next Steps

In this chapter, we established a comprehensive testing strategy for our mermaid-tool, integrating:

  • Unit tests for granular verification of individual components like the lexer, parser, validator, rule engine, and formatter.
  • Golden (snapshot) tests using the insta crate to ensure consistent output for complex transformations, particularly for token streams, AST structures, and formatted Mermaid code.
  • Fuzz testing using cargo-fuzz and libfuzzer to robustly test the lexer and parser against arbitrary, potentially malformed inputs, uncovering crashes and vulnerabilities.

This multi-pronged approach significantly elevates the reliability and production-readiness of our tool. We’ve ensured that our compiler-like pipeline is not only correct for expected inputs but also resilient against unexpected and hostile data.

With a solid testing foundation in place, our mermaid-tool is almost ready for the world. In the next chapter, Chapter 12: Deployment Strategies and CI/CD, we will focus on packaging our Rust CLI tool for distribution, setting up continuous integration and deployment pipelines, and ensuring it can be easily installed and used by other developers.