Welcome to Chapter 10! So far, our Rust-based Static Site Generator (SSG) can parse content, apply templates, generate routes, and output static HTML. However, with every change to a source file, our SSG currently rebuilds the entire site. While fast for small projects, this full rebuild approach quickly becomes a bottleneck for larger sites, leading to frustratingly long development cycles.

In this chapter, we will tackle this performance issue head-on by implementing two crucial features: incremental builds and file system watching. Incremental builds allow our SSG to intelligently detect changes and only re-process the necessary files, drastically reducing build times. Coupled with a file system watcher, this will enable an incredibly smooth developer experience: save a file, and the site automatically rebuilds and refreshes in milliseconds, showing your changes instantly.

We will design a caching mechanism to track file states, use cryptographic hashing to reliably detect modifications, and integrate a robust file system watcher. By the end of this chapter, you will have a highly responsive SSG development server that makes content creation a joy, along with a deep understanding of how modern build tools achieve their speed and efficiency.

Planning & Design

The core idea behind incremental builds is to avoid redundant work. This requires a way to:

  1. Track File State: Store metadata about each source file (e.g., its hash or last modified timestamp) from the previous successful build.
  2. Detect Changes: Compare the current state of files with the tracked state to identify what has been added, modified, or deleted.
  3. Invalidate Cache: Based on detected changes, determine which generated output files (HTML, assets) are now stale and need to be rebuilt.
  4. Rebuild Selectively: Only run the build pipeline for the affected source files and their dependent outputs.
  5. Update Cache: After a successful incremental build, update the tracked state for the modified files.
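Steps 2 and 3 boil down to a three-way comparison between the cached and current hash maps. Here is a minimal sketch of that comparison (using plain `String` keys and hashes for illustration; the chapter's real implementation, built later, works with `PathBuf` and a richer `FileMetadata` struct):

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum Change {
    Added,
    Modified,
    Deleted,
}

/// Compares cached hashes against current hashes and reports what changed.
/// Unchanged files produce no entry at all.
fn diff_hashes(
    cached: &HashMap<String, String>,
    current: &HashMap<String, String>,
) -> Vec<(String, Change)> {
    let mut changes = Vec::new();
    // Anything in `current` is either new or possibly modified.
    for (path, hash) in current {
        match cached.get(path) {
            None => changes.push((path.clone(), Change::Added)),
            Some(old) if old != hash => changes.push((path.clone(), Change::Modified)),
            _ => {} // hash matches: unchanged
        }
    }
    // Anything only in `cached` has been deleted since the last build.
    for path in cached.keys() {
        if !current.contains_key(path) {
            changes.push((path.clone(), Change::Deleted));
        }
    }
    changes
}
```

Unchanged files fall through silently, which is exactly what makes the build incremental: work is proportional to the number of changes, not the size of the site.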

For file system watching, we need a separate thread or asynchronous task that monitors specified directories for changes and triggers the incremental build process.
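The shape of that wiring can be sketched with a plain channel: the watcher side sends build signals, and a consumer loop runs one incremental build per signal. This sketch uses `std::sync::mpsc` and a counter standing in for the real build call (the chapter's actual watcher, built later, is async and uses a tokio channel):

```rust
use std::sync::mpsc;

/// Signals the watcher side sends to the build loop.
enum WatcherEvent {
    Change,
    Shutdown,
}

/// Consumes watcher signals until shutdown, running one (stubbed) incremental
/// build per `Change` signal. Returns how many builds were triggered.
fn run(rx: mpsc::Receiver<WatcherEvent>) -> usize {
    let mut builds = 0;
    while let Ok(event) = rx.recv() {
        match event {
            WatcherEvent::Change => builds += 1, // stand-in for an incremental build
            WatcherEvent::Shutdown => break,
        }
    }
    builds
}
```

The key property is that the watcher and the builder are decoupled: the watcher only knows how to send `Change`, and the builder only knows how to react to it.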

Component Architecture for Incremental Builds and Watching

Let’s visualize the enhanced build process:

flowchart TD
    BuildTrigger[Build Trigger CLI or Watcher] --> A{Is Cache Valid}
    subgraph Incremental_Build_Flow["Incremental Build Flow"]
        A --->|No Initial Build| B[Scan All Source Files]
        A --->|Yes| C[Load Build Cache]
        B --> D[Compute File Hashes]
        C --> D
        D --> E[Compare Hashes with Cache]
        E --> F{Changes Detected}
        F --->|Yes| G[Identify Changed Files]
        F --->|No| BuildComplete[Build Complete No Changes]
        G --> H[Determine Affected Outputs]
        H --> I[Execute Partial Build Pipeline]
        I --> J[Update Build Cache]
        J --> BuildComplete
    end
    subgraph File_System_Watcher["File System Watcher"]
        K[Initialize Watcher] --> L[Monitor Directories]
        L --> M{File System Event}
        M --->|Change Add Delete| N[Debounce Events]
        N --> O[Send Build Signal]
    end
    O --> BuildTrigger
    I -.-> P[Output Generated Files]
    BuildComplete -.-> P
    P --> Q[Serve Files Dev Server]

Explanation of the Flow:

  • Build Trigger: Can be manually invoked (e.g., ssg build) or automatically by the file system watcher.
  • Cache Check: The builder first checks if a build cache exists and is valid.
  • Initial Build: If no cache or invalid, a full scan and hash computation occur.
  • Incremental Path: If a cache exists, it’s loaded, and current file hashes are compared against it.
  • Change Detection: Added, Modified, Deleted files are identified.
  • Affected Outputs: Based on changes, the system determines which specific pages or assets need to be rebuilt. For instance, a change in a content file affects only that page; a change in a template might affect many pages.
  • Partial Build: Only the necessary parts of the build pipeline are executed.
  • Update Cache: The cache is updated with the new file states.
  • File System Watcher: Runs in the background, listening for changes. It debounces multiple rapid events into a single build signal to prevent excessive rebuilds.
  • Serve Files: The development server serves the generated output, refreshing as changes occur.
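The debouncing described above can be modeled as a pure function over event timestamps: a rebuild fires only after a quiet window with no further events. This is a simulation for clarity only; the real watcher uses timers rather than timestamp lists:

```rust
/// Counts how many rebuilds a sequence of event timestamps (in milliseconds)
/// would trigger under trailing debounce: a rebuild fires only when at least
/// `window_ms` of quiet follows an event.
fn rebuild_count(events_ms: &[u64], window_ms: u64) -> usize {
    let mut count = 0;
    for (i, &t) in events_ms.iter().enumerate() {
        // An event triggers a rebuild if it is the last one, or if the gap
        // to the next event is at least the debounce window.
        let quiet = events_ms.get(i + 1).map_or(true, |&next| next - t >= window_ms);
        if quiet {
            count += 1;
        }
    }
    count
}
```

For example, with a 200ms window, a burst of saves at 0ms, 50ms, and 100ms collapses into a single rebuild, while a further save at 500ms triggers a second one.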

File Structure Additions/Modifications

We’ll primarily modify our src/build.rs and introduce a new src/watcher.rs module. We’ll also define a new struct for our build cache.

.
├── Cargo.toml
├── src/
│   ├── main.rs
│   ├── config.rs
│   ├── content.rs
│   ├── parser.rs
│   ├── renderer.rs
│   ├── template.rs
│   ├── router.rs
│   ├── server.rs         // Our dev server
│   ├── build.rs          // Will contain core build logic and incremental logic
│   ├── watcher.rs        // NEW: Handles file system watching
│   └── cache.rs          // NEW: Defines build cache structure and logic

Step-by-Step Implementation

a) Setup/Configuration

First, let’s add the necessary dependencies to our Cargo.toml.

Cargo.toml

[package]
name = "my_ssg"
version = "0.1.0"
edition = "2021"

[dependencies]
# ... existing dependencies ...
serde = { version = "1.0", features = ["derive"] }
serde_yaml = "0.9"
toml = "0.8"
pulldown-cmark = "0.9"
tera = "1.19"
anyhow = "1.0"
tracing = "0.1"
tracing-subscriber = "0.3"
walkdir = "2.3"
tokio = { version = "1.36", features = ["full"] } # Ensure "fs" feature is enabled for async file ops
lazy_static = "1.4"
regex = "1.10"
# New dependencies for incremental builds and watching
notify = "6.1" # For file system watching
sha2 = "0.10" # For cryptographic hashing of file contents
hex = "0.4"   # To convert hash bytes to hex string
serde_json = "1.0" # To persist the build cache as JSON

Explanation:

  • notify: A cross-platform file system notification library.
  • sha2: Provides SHA-2 hashing algorithms, essential for reliably detecting file changes.
  • hex: Converts byte arrays (from sha2) into human-readable hexadecimal strings.
  • serde_json: Serializes the build cache to, and deserializes it from, its on-disk JSON file.

b) Core Implementation

We’ll start by creating the cache.rs module to define our build cache structure and utility functions for hashing.

1. src/cache.rs - Build Cache Structure and Hashing

This module will contain the BuildCache struct, which stores information about processed files, and functions to compute file hashes.

src/cache.rs

use std::{
    collections::HashMap,
    fs,
    path::{Path, PathBuf},
    time::SystemTime,
};
use serde::{Deserialize, Serialize};
use sha2::{Digest, Sha256};
use anyhow::{Result, Context};
use tracing::{info, debug};

/// Represents the metadata for a single source file in the build cache.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct FileMetadata {
    pub path: PathBuf,
    pub hash: String, // SHA256 hash of the file content
    pub last_modified: SystemTime,
    // Add other relevant metadata if needed, e.g., dependencies, output paths
}

/// The main build cache structure.
#[derive(Debug, Default, Serialize, Deserialize)]
pub struct BuildCache {
    pub files: HashMap<PathBuf, FileMetadata>,
    pub last_full_build_time: Option<SystemTime>,
    // Potentially add more sophisticated dependency tracking here in the future
    // e.g., template_dependencies: HashMap<PathBuf, Vec<PathBuf>> // template -> content files using it
}

impl BuildCache {
    /// Loads the build cache from a JSON file.
    pub fn load(cache_path: &Path) -> Result<Self> {
        if !cache_path.exists() {
            debug!("Build cache file not found at {:?}. Starting with empty cache.", cache_path);
            return Ok(Self::default());
        }

        let content = fs::read_to_string(cache_path)
            .context(format!("Failed to read build cache from {:?}", cache_path))?;
        let cache: Self = serde_json::from_str(&content)
            .context(format!("Failed to deserialize build cache from {:?}", cache_path))?;
        info!("Build cache loaded from {:?}", cache_path);
        Ok(cache)
    }

    /// Saves the build cache to a JSON file.
    pub fn save(&self, cache_path: &Path) -> Result<()> {
        let content = serde_json::to_string_pretty(self)
            .context("Failed to serialize build cache")?;
        fs::write(cache_path, content)
            .context(format!("Failed to write build cache to {:?}", cache_path))?;
        info!("Build cache saved to {:?}", cache_path);
        Ok(())
    }

    /// Computes the SHA256 hash of a file's content.
    pub fn compute_file_hash(path: &Path) -> Result<String> {
        let mut file = fs::File::open(path)
            .context(format!("Failed to open file for hashing: {:?}", path))?;
        let mut hasher = Sha256::new();
        std::io::copy(&mut file, &mut hasher)
            .context(format!("Failed to read file for hashing: {:?}", path))?;
        Ok(hex::encode(hasher.finalize()))
    }

    /// Creates FileMetadata for a given path.
    pub fn create_file_metadata(path: &Path) -> Result<FileMetadata> {
        let hash = Self::compute_file_hash(path)?;
        let metadata = fs::metadata(path)
            .context(format!("Failed to get metadata for file: {:?}", path))?;
        let last_modified = metadata.modified()
            .context(format!("Failed to get last modified time for file: {:?}", path))?;

        Ok(FileMetadata {
            path: path.to_path_buf(),
            hash,
            last_modified,
        })
    }

    /// Updates the metadata for a single file in the cache.
    pub fn update_file(&mut self, path: &Path) -> Result<()> {
        let metadata = Self::create_file_metadata(path)?;
        self.files.insert(path.to_path_buf(), metadata);
        Ok(())
    }

    /// Removes a file from the cache.
    pub fn remove_file(&mut self, path: &Path) {
        self.files.remove(path);
    }
}

Explanation of src/cache.rs:

  • FileMetadata: Stores the path, hash (SHA256), and last_modified timestamp for each file. The hash is crucial for detecting content changes, and last_modified can be a quick check.
  • BuildCache: Contains a HashMap where keys are file paths and values are FileMetadata. It also tracks last_full_build_time for potential future use (e.g., for clearing cache if too old).
  • load and save: Handle serialization/deserialization of the cache to a .ssg_cache.json file using serde_json.
  • compute_file_hash: Reads a file and calculates its SHA256 hash. This is the most reliable way to know if a file’s content has changed.
  • create_file_metadata: A helper to create a FileMetadata instance.
  • update_file and remove_file: Methods to manage individual file entries in the cache.

2. src/build.rs - Integrating Incremental Logic

Now, let’s modify our build.rs to leverage this cache. We’ll introduce a BuildMode (full or incremental) and a BuildContext that holds the cache.

First, ensure src/build.rs has the necessary imports:

src/build.rs (add or modify imports)

use std::{
    fs,
    path::{Path, PathBuf},
    collections::HashMap,
};
use anyhow::{Result, Context};
use tracing::{info, debug, error, warn};
use walkdir::WalkDir;

use crate::{
    config::Config,
    content::{Content, FrontMatter},
    parser::parse_markdown_to_html,
    renderer::render_page,
    router::{Route, Router},
    template::TemplateEngine,
    // New imports
    cache::{BuildCache, FileMetadata},
};

/// Defines the build mode.
#[derive(Debug, Clone, Copy)]
pub enum BuildMode {
    Full,
    Incremental,
}

/// Context for the build process, including the cache.
pub struct BuildContext {
    pub config: Config,
    pub template_engine: TemplateEngine,
    pub router: Router,
    pub build_cache: BuildCache,
    pub output_dir: PathBuf,
}

impl BuildContext {
    pub fn new(config: Config, output_dir: PathBuf) -> Result<Self> {
        let template_engine = TemplateEngine::new(&config.template_dir)?;
        let router = Router::new(); // Initialize an empty router for now, it's populated during build.
        let build_cache = BuildCache::load(&output_dir.join(".ssg_cache.json"))?;

        Ok(Self {
            config,
            template_engine,
            router,
            build_cache,
            output_dir,
        })
    }
}

// ... existing functions like `setup_output_directory`, `copy_static_assets` ...

Now, let’s refactor the main build_site function to incorporate incremental logic. This will be a significant change.

src/build.rs (modify build_site and add helper functions)

// ... (previous code for BuildContext, setup_output_directory, copy_static_assets) ...

/// Enum to classify file changes
#[derive(Debug, PartialEq)]
pub enum FileChange {
    Added,
    Modified,
    Deleted,
    Unchanged,
}

/// Scans the source directory and compares current file states with the cache.
/// Returns categorized lists of changed content and template files.
pub fn detect_file_changes(
    build_context: &mut BuildContext,
    source_dir: &Path,
    template_dir: &Path,
    static_dir: &Path,
) -> Result<(
    HashMap<PathBuf, FileChange>, // All changed files (content, template, static)
    Vec<PathBuf>, // List of files that need a full rebuild (e.g., template changes)
)> {
    let mut changed_files: HashMap<PathBuf, FileChange> = HashMap::new();
    let mut files_to_rebuild_all: Vec<PathBuf> = Vec::new(); // Files whose change requires full rebuild

    // Track all current files encountered during scan
    let mut current_files_in_source: HashMap<PathBuf, FileMetadata> = HashMap::new();

    // 1. Scan Content, Template, and Static directories
    for entry in WalkDir::new(source_dir)
        .into_iter()
        .filter_map(|e| e.ok())
    {
        let path = entry.path().to_path_buf();
        if path.is_file() {
            let relative_path = path.strip_prefix(source_dir).unwrap_or(&path).to_path_buf();

            // Ignore cache file itself
            if relative_path.file_name().map_or(false, |name| name == ".ssg_cache.json") {
                continue;
            }

            let current_metadata = BuildCache::create_file_metadata(&path)?;
            current_files_in_source.insert(relative_path.clone(), current_metadata.clone());

            let cached_metadata_option = build_context.build_cache.files.get(&relative_path);

            match cached_metadata_option {
                Some(cached_metadata) => {
                    if cached_metadata.hash != current_metadata.hash {
                        debug!("File modified: {:?}", relative_path);
                        changed_files.insert(relative_path.clone(), FileChange::Modified);
                        // If a template file changes, it might affect many pages, so mark for full rebuild
                        if relative_path.starts_with(template_dir.strip_prefix(source_dir).unwrap_or(template_dir)) {
                            files_to_rebuild_all.push(relative_path.clone());
                        }
                    } else {
                        // Unchanged
                    }
                }
                None => {
                    debug!("File added: {:?}", relative_path);
                    changed_files.insert(relative_path.clone(), FileChange::Added);
                    if relative_path.starts_with(template_dir.strip_prefix(source_dir).unwrap_or(template_dir)) {
                        files_to_rebuild_all.push(relative_path.clone());
                    }
                }
            }
        }
    }

    // 2. Detect Deleted Files (present in cache but not in current scan)
    let mut deleted_files: Vec<PathBuf> = Vec::new();
    for cached_path in build_context.build_cache.files.keys() {
        if !current_files_in_source.contains_key(cached_path) {
            deleted_files.push(cached_path.clone());
        }
    }

    for path in deleted_files {
        debug!("File deleted: {:?}", path);
        changed_files.insert(path, FileChange::Deleted);
    }

    Ok((changed_files, files_to_rebuild_all))
}


/// The main entry point for building the static site.
/// Orchestrates the entire build process.
pub async fn build_site(build_context: &mut BuildContext, mode: BuildMode) -> Result<()> {
    info!("Starting build in {:?} mode...", mode);

    let ssg_root = &build_context.config.root_dir;
    let content_dir = ssg_root.join(&build_context.config.content_dir);
    let static_dir = ssg_root.join(&build_context.config.static_dir);
    let template_dir = ssg_root.join(&build_context.config.template_dir);
    let output_dir = &build_context.output_dir;

    // Ensure output directory exists and is clean for full builds
    if let BuildMode::Full = mode {
        setup_output_directory(output_dir)?;
    }

    // --- Step 1: Detect Changes (for incremental builds) ---
    let mut content_files_to_process: Vec<PathBuf> = Vec::new();
    let mut static_files_to_copy: Vec<PathBuf> = Vec::new();
    let mut template_files_to_update: Vec<PathBuf> = Vec::new();
    let mut pages_to_delete: Vec<PathBuf> = Vec::new(); // Original content paths of deleted pages

    let mut requires_full_rebuild = false;

    if let BuildMode::Incremental = mode {
        let (changed_files, files_triggering_full_rebuild) =
            detect_file_changes(build_context, ssg_root, &template_dir, &static_dir)?;

        if !files_triggering_full_rebuild.is_empty() {
            warn!("Changes in {:?} detected. Triggering full rebuild.", files_triggering_full_rebuild);
            requires_full_rebuild = true;
        }

        if requires_full_rebuild {
            info!("Performing full rebuild due to critical changes.");
            // Clear current cache files for a fresh start, except the cache file itself.
            build_context.build_cache.files.clear();
            setup_output_directory(output_dir)?; // Re-clean output for full rebuild
        } else {
            for (relative_path, change_type) in changed_files.iter() {
                let absolute_path = ssg_root.join(relative_path);

                if absolute_path.starts_with(&content_dir) {
                    match change_type {
                        FileChange::Added | FileChange::Modified => {
                            content_files_to_process.push(absolute_path.clone());
                            build_context.build_cache.update_file(&absolute_path)?;
                        }
                        FileChange::Deleted => {
                            // Mark for deletion from output
                            // Mark for deletion from output.
                            // For now, we only remove the entry from the cache.
                            // A more advanced system would track output paths to delete.
                            if build_context.build_cache.files.contains_key(relative_path) {
                                debug!("Content file deleted, removing from cache: {:?}", relative_path);
                                pages_to_delete.push(relative_path.clone()); // Store relative path for potential output deletion
                            }
                            build_context.build_cache.remove_file(relative_path);
                        }
                        FileChange::Unchanged => {} // Should not be in changed_files map
                    }
                } else if absolute_path.starts_with(&template_dir) {
                    // Templates are usually handled by requiring a full rebuild if their content changes significantly
                    // For now, we just update their cache entry.
                    match change_type {
                        FileChange::Added | FileChange::Modified => {
                            template_files_to_update.push(absolute_path.clone());
                            build_context.build_cache.update_file(&absolute_path)?;
                            // If template changes, all content files using it *might* need rebuilding.
                            // For simplicity, we are triggering a full rebuild for template changes.
                            // A more advanced system would track template dependencies.
                            // This path should ideally be covered by `files_triggering_full_rebuild`
                        }
                        FileChange::Deleted => {
                            debug!("Template file deleted, removing from cache: {:?}", relative_path);
                            build_context.build_cache.remove_file(relative_path);
                            // This would also trigger a full rebuild for dependency-aware systems.
                        }
                        FileChange::Unchanged => {}
                    }
                } else if absolute_path.starts_with(&static_dir) {
                    match change_type {
                        FileChange::Added | FileChange::Modified => {
                            static_files_to_copy.push(absolute_path.clone());
                            build_context.build_cache.update_file(&absolute_path)?;
                        }
                        FileChange::Deleted => {
                            debug!("Static file deleted, removing from cache: {:?}", relative_path);
                            build_context.build_cache.remove_file(relative_path);
                            // TODO: Implement deletion of corresponding static file from output_dir
                        }
                        FileChange::Unchanged => {}
                    }
                }
            }
        }
    }

    // --- Step 2: Process Content Files ---
    if let BuildMode::Full = mode {
        // For full build, scan all content files
        for entry in WalkDir::new(&content_dir)
            .into_iter()
            .filter_map(|e| e.ok())
            .filter(|e| e.path().is_file() && e.path().extension().map_or(false, |ext| ext == "md"))
        {
            content_files_to_process.push(entry.path().to_path_buf());
        }
    } else if content_files_to_process.is_empty() && static_files_to_copy.is_empty() && !requires_full_rebuild {
        info!("No relevant content or static file changes detected for incremental build.");
        // If nothing changed and no full rebuild required, just save cache and exit.
        build_context.build_cache.save(&output_dir.join(".ssg_cache.json"))?;
        return Ok(());
    }

    // For full builds we copy the entire static directory. For incremental
    // builds, `static_files_to_copy` holds only the changed static assets,
    // so we copy just those.
    if let BuildMode::Full = mode {
        copy_static_assets(ssg_root, &static_dir, output_dir)?;
    } else if !static_files_to_copy.is_empty() {
        // Only copy changed static files incrementally
        for static_file_path in static_files_to_copy {
            let relative_path = static_file_path.strip_prefix(ssg_root)?;
            let dest_path = output_dir.join(relative_path);
            if let Some(parent) = dest_path.parent() {
                fs::create_dir_all(parent)?;
            }
            fs::copy(&static_file_path, &dest_path)
                .context(format!("Failed to copy static file from {:?} to {:?}", static_file_path, dest_path))?;
            debug!("Copied static file: {:?}", relative_path);
        }
    }


    // Delete pages marked for deletion (from the router and output)
    for deleted_content_path in pages_to_delete {
        // This is a placeholder. A robust system would track the output path
        // associated with the content path and delete that specific file.
        // For now, we rely on full rebuilds to clean up deleted pages.
        warn!("Content file {:?} deleted. Output file deletion not yet implemented for incremental builds. Full rebuild recommended for cleanup.", deleted_content_path);
        build_context.router.remove_route_by_source(&deleted_content_path);
    }


    // Process content files (either all for full build, or only changed for incremental)
    let mut processed_contents: Vec<Content> = Vec::new();
    for content_file_path in content_files_to_process {
        debug!("Processing content file: {:?}", content_file_path);
        let content_result = Content::from_file(&content_file_path, &content_dir);

        match content_result {
            Ok(content) => {
                processed_contents.push(content);
                // Update cache for this file
                let relative_path = content_file_path.strip_prefix(ssg_root)?;
                build_context.build_cache.update_file(&ssg_root.join(relative_path))?;
            },
            Err(e) => {
                error!("Failed to process content file {:?}: {:?}", content_file_path, e);
                // Continue processing other files
            }
        }
    }


    // --- Step 3: Register Routes and Render Pages ---
    // For incremental builds, we need to ensure the router is up-to-date.
    // A full rebuild will clear and re-populate it entirely.
    // For incremental, we add/update routes for processed_contents.

    // If a full rebuild was triggered, or it's an initial full build, clear router.
    if requires_full_rebuild || matches!(mode, BuildMode::Full) {
        build_context.router = Router::new(); // Reset router
        // Re-scan all content files to rebuild the router correctly
        let all_content_files: Vec<PathBuf> = WalkDir::new(&content_dir)
            .into_iter()
            .filter_map(|e| e.ok())
            .filter(|e| e.path().is_file() && e.path().extension().map_or(false, |ext| ext == "md"))
            .map(|e| e.path().to_path_buf())
            .collect();

        processed_contents.clear(); // Clear, then re-populate with all content
        for content_file_path in all_content_files {
            let content_result = Content::from_file(&content_file_path, &content_dir);
            match content_result {
                Ok(content) => processed_contents.push(content),
                Err(e) => error!("Failed to re-process content file {:?} for full rebuild: {:?}", content_file_path, e),
            }
        }
    }

    // Register routes for all processed content (either all or changed)
    for content in &processed_contents {
        build_context.router.register_route(&content, &build_context.config)?;
    }

    // Render pages. If full rebuild, render all. If incremental, render only changed.
    // This is still a simplification. A truly incremental system would only render
    // pages whose content or *dependent templates* have changed.
    // For now, if any content changed, we re-render that specific page.
    // If a template changed, we triggered a full rebuild, so all pages get re-rendered.
    let pages_to_render: Vec<Route> = if requires_full_rebuild || matches!(mode, BuildMode::Full) {
        // Render all pages
        build_context.router.get_all_routes().cloned().collect()
    } else {
        // Render only pages whose content files were processed (added/modified)
        processed_contents.iter()
            .filter_map(|c| build_context.router.get_route_by_source_path(&c.source_path))
            .cloned()
            .collect()
    };

    for route in pages_to_render {
        match render_page(
            &route,
            &build_context.router,
            &build_context.template_engine,
            output_dir,
            &build_context.config,
        ) {
            Ok(_) => debug!("Rendered: {}", route.output_path.display()),
            Err(e) => error!("Failed to render route {}: {:?}", route.output_path.display(), e),
        }
    }

    // Finalize: Save the updated build cache
    build_context.build_cache.save(&output_dir.join(".ssg_cache.json"))?;
    info!("Build completed successfully in {:?} mode.", mode);
    Ok(())
}

Explanation of src/build.rs changes:

  • BuildMode enum: Distinguishes between a Full build (like the first run) and an Incremental build.
  • BuildContext: Now holds the BuildCache instance.
  • detect_file_changes: This new function is the heart of incremental detection.
    • It walks the source directories (content, templates, static).
    • For each file, it computes its current hash and compares it with the hash stored in build_context.build_cache.
    • It categorizes files as Added, Modified, or Deleted.
    • Crucially, if a template file is changed, it sets a flag requires_full_rebuild, because template changes often impact multiple pages and tracking these dependencies is complex (and often overkill for dev builds).
  • build_site modifications:
    • Takes a BuildMode argument.
    • If Incremental mode, it calls detect_file_changes.
    • If requires_full_rebuild is true (e.g., template change), it acts like a Full build.
    • Otherwise, it populates content_files_to_process and static_files_to_copy only with the detected changed files.
    • It updates the build_cache with new metadata for processed files and removes entries for deleted files.
    • Simplification: For now, if a content file is deleted, we print a warning and rely on a full rebuild to truly clean up the output. A more robust system would track the output path for each source file and delete it directly.
    • Template Dependency: The current implementation triggers a full rebuild if any template file changes. This is a common and practical simplification for SSGs during development. A more advanced system would track which content files use which templates and only rebuild those specific content files.
    • Router update: For full builds, the router is reset. For incremental, it just adds/updates routes for the processed_contents.
    • Page Rendering: For full builds or if a full rebuild was triggered, all pages are re-rendered. Otherwise, only pages corresponding to processed_contents (added/modified content files) are rendered.
    • Finally, the updated build_cache is saved.

3. src/watcher.rs - File System Watcher

This module will use the notify crate to monitor our source directories.

src/watcher.rs

use std::{
    path::{Path, PathBuf},
    time::Duration,
    sync::Arc,
};
use notify::{
    Config, Event, EventKind, RecommendedWatcher, RecursiveMode, Watcher,
};
use tokio::sync::{mpsc, Mutex};
use tracing::{info, debug, error, warn};

// Define an event type that the watcher sends
#[derive(Debug)]
pub enum WatcherEvent {
    Change,
    Shutdown,
}

/// Initializes and runs a file system watcher.
/// Sends `WatcherEvent::Change` on file modifications and `WatcherEvent::Shutdown` on error.
pub async fn start_watcher(
    ssg_root: PathBuf,
    content_dir: PathBuf,
    template_dir: PathBuf,
    static_dir: PathBuf,
    tx: mpsc::Sender<WatcherEvent>,
) -> anyhow::Result<()> {
    info!("Starting file system watcher on {:?}", ssg_root);

    let (event_tx, mut event_rx) = mpsc::channel(100); // Channel for raw notify events

    let mut watcher = RecommendedWatcher::new(
        move |res| match res {
            Ok(event) => {
                if let Err(e) = event_tx.blocking_send(event) {
                    error!("Failed to send watcher event: {:?}", e);
                }
            }
            Err(e) => error!("Watcher error: {:?}", e),
        },
        Config::default(),
    )?;

    // Watch the relevant directories recursively
    watcher.watch(&content_dir, RecursiveMode::Recursive)?;
    watcher.watch(&template_dir, RecursiveMode::Recursive)?;
    watcher.watch(&static_dir, RecursiveMode::Recursive)?;

    info!("Watcher is monitoring: {:?}, {:?}, {:?}", content_dir, template_dir, static_dir);

    // Debounce deadline: pushed back on every relevant event, cleared once it fires.
    // We store an `Instant` rather than a `Sleep` future, because `Sleep` is
    // `!Unpin` and cannot simply be awaited through a mutable reference.
    let mut deadline: Option<tokio::time::Instant> = None;

    loop {
        tokio::select! {
            // Receive raw events from the `notify` crate
            maybe_event = event_rx.recv() => {
                match maybe_event {
                    Some(event) => {
                        debug!("Raw watcher event: {:?}", event);
                        // Filter out irrelevant events (e.g., changes to output directory, temporary files)
                        if should_trigger_rebuild(&event, &ssg_root) {
                            // Reset the debounce window on any relevant event
                            deadline = Some(tokio::time::Instant::now() + Duration::from_millis(200));
                        }
                    }
                    None => {
                        // The channel closes only if the watcher is dropped.
                        warn!("Watcher event channel closed; stopping watcher loop.");
                        break;
                    }
                }
            }
            // Fire once the debounce window passes with no further events.
            // `pending()` keeps this branch inert while no deadline is set.
            _ = async {
                match deadline {
                    Some(d) => tokio::time::sleep_until(d).await,
                    None => std::future::pending::<()>().await,
                }
            }, if deadline.is_some() => {
                info!("Debounce timer expired, triggering rebuild.");
                if let Err(e) = tx.send(WatcherEvent::Change).await {
                    error!("Failed to send build signal: {:?}", e);
                    break; // Exit loop on send error
                }
                deadline = None;
            }
        }
    }

    Ok(())
}

/// Determines if a file system event should trigger a rebuild.
fn should_trigger_rebuild(event: &Event, ssg_root: &Path) -> bool {
    // Ignore events in the output directory and changes to the on-disk build cache file
    if event.paths.iter().any(|p| p.starts_with(ssg_root.join("public")) || p.starts_with(ssg_root.join(".ssg_cache.json"))) {
        return false;
    }

    match event.kind {
        EventKind::Access(_) => false, // Ignore access events
        EventKind::Modify(_) | EventKind::Create(_) | EventKind::Remove(_) | EventKind::Any => {
            // Filter out temporary files often created by editors (e.g., `.swp`, `~`, `.#`)
            if event.paths.iter().any(|p| {
                p.file_name()
                    .and_then(|name| name.to_str())
                    .map_or(false, |s| s.starts_with('.') || s.ends_with('~') || s.starts_with('#'))
            }) {
                debug!("Ignoring temporary file event: {:?}", event.paths);
                false
            } else {
                true
            }
        }
        _ => false, // Ignore other event kinds by default
    }
}

Explanation of src/watcher.rs:

  • start_watcher: An async function that sets up and runs the notify watcher.
  • mpsc::channel: A multi-producer, single-consumer channel is used to send WatcherEvent messages to the main build loop.
  • RecommendedWatcher: notify’s best-effort watcher for the current platform.
  • watcher.watch: Configures the watcher to monitor content, template, and static directories recursively.
  • Debouncing: a tokio timer implements debouncing. Each relevant file event pushes a 200ms deadline into the future; only when that deadline passes with no further events is a WatcherEvent::Change sent. This prevents multiple rapid rebuilds from a single save operation (e.g., an editor might write the file and then its metadata, producing two events in quick succession).
  • should_trigger_rebuild: A helper function to filter out irrelevant file system events, such as changes in the public output directory or temporary editor files.
4. src/main.rs - Integrating Watcher and Incremental Build

Finally, let’s update our main.rs to handle a watch command, which will start the development server and the file watcher.

src/main.rs (update main function)

use std::path::PathBuf;
use anyhow::Result;
use tracing::{error, info, Level};
use tracing_subscriber::FmtSubscriber;
use clap::Parser;
use tokio::sync::mpsc;

mod config;
mod content;
mod parser;
mod renderer;
mod template;
mod router;
mod server;
mod build;
mod cache; // NEW
mod watcher; // NEW

#[derive(Parser, Debug)]
#[command(author, version, about, long_about = None)]
struct Args {
    #[command(subcommand)]
    command: Commands,
}

#[derive(clap::Subcommand, Debug)]
enum Commands {
    /// Builds the static site
    Build {
        /// Path to the source directory (defaults to current directory)
        #[arg(short, long, default_value = ".")]
        source: PathBuf,
        /// Path to the output directory (defaults to 'public')
        #[arg(short, long, default_value = "public")]
        output: PathBuf,
    },
    /// Starts a development server with live-reloading
    Watch {
        /// Path to the source directory (defaults to current directory)
        #[arg(short, long, default_value = ".")]
        source: PathBuf,
        /// Path to the output directory (defaults to 'public')
        #[arg(short, long, default_value = "public")]
        output: PathBuf,
        /// Port for the development server
        #[arg(short, long, default_value_t = 8080)]
        port: u16,
    },
}

#[tokio::main]
async fn main() -> Result<()> {
    // Setup tracing for logging
    let subscriber = FmtSubscriber::builder()
        .with_max_level(Level::INFO) // Set default logging level
        .finish();
    tracing::subscriber::set_global_default(subscriber)
        .expect("setting default subscriber failed");

    let args = Args::parse();

    match args.command {
        Commands::Build { source, output } => {
            let config = config::Config::load(&source)?;
            let mut build_context = build::BuildContext::new(config, output)?;
            build::build_site(&mut build_context, build::BuildMode::Full).await?;
            info!("Static site built successfully!");
        }
        Commands::Watch { source, output, port } => {
            info!("Starting development server with watcher...");

            let config = config::Config::load(&source)?;
            let content_dir = source.join(&config.content_dir);
            let template_dir = source.join(&config.template_dir);
            let static_dir = source.join(&config.static_dir);

            // Initial full build
            let mut build_context = build::BuildContext::new(config, output.clone())?;
            build::build_site(&mut build_context, build::BuildMode::Full).await?;
            info!("Initial build complete. Serving on http://127.0.0.1:{}", port);

            // Channel for watcher events to trigger rebuilds
            let (tx, mut rx) = mpsc::channel(10);

            // Start file watcher in a separate task
            let watcher_tx = tx.clone();
            let ssg_root_clone = source.clone();
            let content_dir_clone = content_dir.clone();
            let template_dir_clone = template_dir.clone();
            let static_dir_clone = static_dir.clone();

            tokio::spawn(async move {
                if let Err(e) = watcher::start_watcher(
                    ssg_root_clone,
                    content_dir_clone,
                    template_dir_clone,
                    static_dir_clone,
                    watcher_tx,
                ).await {
                    error!("Watcher failed: {:?}", e);
                    // Send a shutdown signal if watcher fails
                    if let Err(send_err) = tx.send(watcher::WatcherEvent::Shutdown).await {
                        error!("Failed to send shutdown signal: {:?}", send_err);
                    }
                }
            });

            // Start development server in a separate task
            let mut server_handle = tokio::spawn(server::start_dev_server(port, output.clone()));

            // Main loop to listen for watcher events and trigger rebuilds
            loop {
                tokio::select! {
                    Some(event) = rx.recv() => {
                        match event {
                            watcher::WatcherEvent::Change => {
                                info!("File change detected. Triggering incremental build...");
                                // Re-load config and create a fresh build context for each
                                // build so changes to config.toml are picked up immediately.
                                // Log failures instead of propagating with `?`, so a
                                // half-saved config file doesn't kill the watch loop.
                                let new_config = match config::Config::load(&source) {
                                    Ok(cfg) => cfg,
                                    Err(e) => {
                                        error!("Failed to reload config: {:?}", e);
                                        continue;
                                    }
                                };
                                let mut new_build_context = match build::BuildContext::new(new_config, output.clone()) {
                                    Ok(ctx) => ctx,
                                    Err(e) => {
                                        error!("Failed to create build context: {:?}", e);
                                        continue;
                                    }
                                };
                                if let Err(e) = build::build_site(&mut new_build_context, build::BuildMode::Incremental).await {
                                    error!("Incremental build failed: {:?}", e);
                                } else {
                                    info!("Incremental build complete.");
                                    // TODO: Implement live-reload signal to the browser here
                                    // For now, manual refresh is needed, or integrate a WebSocket for live-reload.
                                }
                            }
                            watcher::WatcherEvent::Shutdown => {
                                error!("Watcher shutdown signal received. Exiting.");
                                break;
                            }
                        }
                    }
                    _ = &mut server_handle => {
                        error!("Development server stopped unexpectedly. Exiting.");
                        break;
                    }
                }
            }
        }
    }

    Ok(())
}

Explanation of src/main.rs changes:

  • New Watch command in clap::Parser.
  • When Watch command is run:
    • An initial Full build is performed.
    • An mpsc::channel is created to communicate between the watcher and the main loop.
    • watcher::start_watcher is spawned as a separate Tokio task, sending events to the channel.
    • server::start_dev_server is also spawned.
    • The main loop then uses tokio::select! to wait on both incoming watcher events and the server task handle.
    • Upon receiving WatcherEvent::Change, an Incremental build is triggered.
    • Upon WatcherEvent::Shutdown or server failure, the application exits.
  • Important: We re-load the Config and create a new BuildContext for each build (initial or incremental). This ensures that any changes to config.toml are picked up immediately without restarting the entire application.

c) Testing This Component

To test the incremental build and file watching:

  1. Build the project:

    cargo build
    
  2. Run the watcher:

    cargo run -- watch --source . --output public --port 8080
    

    You should see output indicating the initial full build and the watcher starting.

  3. Open your browser: Navigate to http://127.0.0.1:8080. You should see your site.

  4. Make a change:

    • Open a Markdown file in your content/ directory (e.g., content/posts/first-post.md).
    • Change some text.
    • Save the file.
  5. Observe the console:

    • You should see logs in your terminal indicating:
      • File modified: "content/posts/first-post.md"
      • File change detected. Triggering incremental build...
      • Processing content file: .../content/posts/first-post.md
      • Rendered: public/posts/first-post/index.html
      • Incremental build complete.
    • The build time should be significantly faster than a full rebuild.
  6. Verify in browser: Refresh your browser. You should see the updated content.

  7. Test template changes:

    • Modify a template file (e.g., templates/base.html).
    • Save the file.
    • You should see a warn! message: Changes in [...] detected. Triggering full rebuild. followed by a full rebuild.
  8. Test static asset changes:

    • Modify a static asset (e.g., static/css/style.css).
    • Save the file.
    • You should see File modified: "static/css/style.css" and Copied static file: "static/css/style.css" logs.
  9. Test adding/deleting files:

    • Add a new Markdown file to content/.
    • Delete an existing Markdown file from content/.
    • Observe the logs. For deletion, you’ll see a warning about output file cleanup.

Debugging Tips:

  • If the watcher isn’t triggering:
    • Ensure your source path is correct.
    • Check tracing logs for Watcher error messages.
    • Verify the watched directories exist.
    • Make sure you are saving the file, not just changing it.
  • If the incremental build is slow:
    • Check tracing logs. Are too many files being processed?
    • Is requires_full_rebuild being triggered unexpectedly?
    • Verify file hashing is working correctly.

Production Considerations

  • Production Builds: Incremental builds are primarily for development. For production deployments, always perform a Full build (cargo run -- build) to ensure a clean, consistent output. This avoids any potential stale content issues that might arise from complex incremental cache invalidation scenarios.
  • Performance:
    • Hashing: SHA256 is cryptographically secure but might be overkill for just file change detection. For extremely large sites, consider faster non-cryptographic hashes (like FNV) or simply relying on file modification timestamps if absolute reliability isn’t critical (though timestamps can be unreliable across different file systems or during sync operations). We’ll stick with SHA256 for robustness.
    • Cache Serialization: serde_json is generally fast, but for millions of files, saving/loading the cache could become a bottleneck. Binary formats like bincode could be faster.
    • Dependency Tracking: Our current template dependency tracking is basic (full rebuild on template change). For massive sites with many templates and partials, a sophisticated dependency graph (e.g., which content files use which templates/partials) would be necessary to achieve true incremental rendering. This is a significant complexity increase but offers ultimate performance.
  • Security: The build cache (.ssg_cache.json) should not contain any sensitive information. It primarily stores file paths and hashes, which are not security-critical. Ensure .ssg_cache.json is added to .gitignore.
  • Logging and Monitoring: The tracing crate is essential. During development, debug! and info! levels provide insight. In production, info! and warn! are typically sufficient. Ensure errors are always logged for easy debugging.
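Following the security note above — and assuming the default `public` output directory used throughout this chapter — the corresponding `.gitignore` entries would be:

```gitignore
# Generated output and build cache; never commit these
/public
.ssg_cache.json
```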

Code Review Checkpoint

At this point, you have implemented:

  • A BuildCache struct in src/cache.rs to store file metadata (path, hash, last modified).
  • Functions to compute SHA256 hashes of files.
  • Modified src/build.rs to:
    • Load and save the BuildCache.
    • detect_file_changes to compare current file states with the cache and identify Added, Modified, Deleted files.
    • An incremental build logic that only processes changed content/static files or triggers a full rebuild for template changes.
  • A src/watcher.rs module that uses the notify crate to monitor source directories.
  • A debouncing mechanism in the watcher to prevent excessive rebuilds.
  • Updated src/main.rs to include a watch command, which starts the watcher and an incremental build loop alongside the development server.

Files Created/Modified:

  • Cargo.toml (added notify, sha2, hex)
  • src/cache.rs (new)
  • src/watcher.rs (new)
  • src/build.rs (significant modifications to build_site, added BuildContext, detect_file_changes)
  • src/main.rs (added watch command, integrated watcher and incremental build loop)

This completes a major enhancement to our SSG’s development workflow, making it much more responsive and enjoyable to use.

Common Issues & Solutions

  1. Watcher not detecting changes:

    • Issue: You save a file, but the console doesn’t show any File change detected messages.
    • Solution:
      • Check Paths: Ensure the source directory passed to cargo run watch is correct and contains your content, templates, and static folders.
      • File Type/Name: Some editors create temporary files (e.g., .~filename, #filename#, .filename.swp) that our should_trigger_rebuild function might filter. Verify that the actual source file is being monitored.
      • Permissions: On some systems, notify might have issues with file system permissions. Run with elevated privileges if necessary (though usually not required for user directories).
      • Large Projects: For projects with an extremely large number of files, the watcher might struggle. Consider increasing the channel buffer size or the debounce duration.
      • OS-specific quirks: notify relies on OS-specific APIs. Some network drives or virtualized file systems might not emit events reliably.
    • Debugging: Add debug! logs inside should_trigger_rebuild and in the tokio::select! loop in start_watcher to see all raw events.
  2. Stale content after incremental build:

    • Issue: You change a file, an incremental build runs, but the browser still shows old content even after refreshing.
    • Solution:
      • Cache Invalidation Logic: This is the trickiest part. Our current logic for template changes triggers a full rebuild. If you see stale content, it might mean a change (e.g., in a partial included by a template) wasn’t correctly identified as needing a full rebuild or wasn’t correctly linked to affected content files.
      • Clean Build: If you encounter this, always try cargo run -- build (a full build) to confirm if the issue is with the incremental logic or the core rendering.
      • Debugging: Use debug! logs in detect_file_changes to verify that your file changes are correctly categorized as Added or Modified. Check the .ssg_cache.json file to see if the hashes are being updated.
  3. Build loop/excessive rebuilds:

    • Issue: The SSG enters a continuous rebuild loop or rebuilds too frequently for minor changes.
    • Solution:
      • Debouncing: Adjust the Duration::from_millis(200) in src/watcher.rs. Some editors might save in bursts. Increase it if needed (e.g., to 500ms).
      • Output Directory Filtering: Ensure your output_dir (e.g., public/) is correctly excluded from watching in should_trigger_rebuild. If the watcher monitors its own output, it will trigger an infinite loop. Also ensure .ssg_cache.json is excluded.
      • Temporary Files: Make sure the should_trigger_rebuild logic is robust in filtering out temporary editor files.

Testing & Verification

  1. Start the watcher:
    cargo run -- watch --source . --output public --port 8080
    
  2. Initial build check: Verify that the first build is a “Full” build and completes successfully.
  3. Content modification: Edit any .md file in your content directory. Save it. Observe the console for Incremental build complete within milliseconds (usually less than 100ms for small sites). Refresh your browser to confirm changes.
  4. New content creation: Create a brand new .md file in content. Save it. Verify it’s detected as Added and rendered correctly.
  5. Content deletion: Delete an existing .md file. Observe the logs. Acknowledge the warning about output file cleanup.
  6. Template modification: Edit a Tera template file in your templates directory. Save it. Verify a full rebuild is triggered and all pages are updated.
  7. Static asset modification: Change a CSS or image file in your static directory. Save it. Verify it’s detected and copied to the public directory.
  8. Configuration modification: Change a value in config.toml. Save it. This should trigger a full rebuild because the BuildContext is recreated, effectively re-parsing the config.
  9. Error handling: Introduce a syntax error in a Markdown file or a Tera template. The build should log an error but ideally not crash the watcher, allowing you to fix the error and trigger another build.

Summary & Next Steps

Congratulations! You’ve successfully implemented incremental builds and a file system watcher for your Rust SSG. This is a monumental step in improving the developer experience, making your SSG feel fast and responsive, comparable to modern static site generators. You now understand the core principles of build caching, change detection, and event-driven build systems.

We’ve laid the groundwork for a highly performant development workflow. While our current incremental logic for template changes is a full rebuild, it’s a practical trade-off for simplicity and correctness during development.

In the next chapter, we will focus on Chapter 11: Search Indexing and Pagefind Integration. We’ll learn how to generate a search index for your content and integrate it with a powerful client-side search library like Pagefind to provide a seamless search experience for your users.