Chapter Introduction
In previous chapters, we laid the foundation for our Rust-based Static Site Generator (SSG) by setting up a project, parsing Markdown into an Abstract Syntax Tree (AST), transforming it into HTML, and integrating a basic templating system with Tera. We also introduced frontmatter for essential metadata like titles and dates. While these are crucial, modern content platforms require more sophisticated management capabilities, especially when dealing with evolving documentation, multi-version APIs, or complex editorial workflows.
This chapter will guide you through implementing advanced content management features, focusing on content versioning and rich metadata handling. We’ll enhance our SSG to recognize version information directly from content file paths (e.g., content/docs/v1.0/my-article.md) and extend our frontmatter schema to include critical metadata such as content status (draft, published, deprecated), last updated timestamps, and relationships to other content. By the end of this chapter, our SSG will be capable of processing and storing a much richer set of content attributes, laying the groundwork for more dynamic routing, navigation, and content display in future chapters.
Planning & Design
Managing content effectively means not just rendering it, but also understanding its lifecycle, target audience, and relationships. Versioning is paramount for documentation sites or APIs, where multiple iterations of content coexist. Metadata provides the context needed to drive advanced features like filtering, search, and conditional rendering.
Architectural Overview for Advanced Content Processing
We will modify our content parsing pipeline to extract version information from the file path and enrich our FrontMatter struct with new fields. This involves updating our Content struct to hold this path-derived version and enhancing the frontmatter module with new, optional fields and custom data types for better validation.
File Structure & Data Model Updates
We’ll primarily be modifying src/frontmatter.rs to define the new metadata fields and src/content.rs to incorporate the path-based versioning and the expanded frontmatter.
src/frontmatter.rs:
- Introduce a ContentStatus enum for the content lifecycle.
- Add last_updated: Option<NaiveDate>, related_articles: Option<Vec<String>>, audience: Option<String>, and status: Option<ContentStatus> to the FrontMatter struct.
src/content.rs:
- Add path_version: Option<String> to the Content struct.
- Update the Content::from_file (or equivalent) function to parse the version from the file path.
Step-by-Step Implementation
a) Setup/Configuration
First, ensure your Cargo.toml includes serde_yaml and chrono for date parsing if not already present. We’ll specifically need chrono’s NaiveDate for last_updated fields.
Cargo.toml
# ... other dependencies
[dependencies]
# ... existing dependencies like serde, serde_derive, pulldown-cmark, tera, anyhow
serde_yaml = "0.9"
chrono = { version = "0.4", features = ["serde"] } # "serde" feature for (de)serialization
log = "0.4"
env_logger = "0.11"
serde_json = "1" # FrontMatter::extra stores serde_json::Value; add if not already present
Next, let’s create a new module for our custom content types, or modify existing ones. We’ll start by expanding src/frontmatter.rs.
b) Core Implementation
1. Define ContentStatus Enum and Update FrontMatter Struct
We need a way to categorize the state of our content (e.g., draft, published). An enum is perfect for this, and serde allows us to easily deserialize string representations into our enum variants.
Create or modify src/frontmatter.rs to include the ContentStatus enum and update the FrontMatter struct.
src/frontmatter.rs
use serde::{Deserialize, Serialize};
use chrono::NaiveDate;
use std::collections::HashMap;
use log::{warn, error};
/// Represents the lifecycle status of a content piece.
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
#[serde(rename_all = "lowercase")] // Allows "draft", "published", etc. in YAML/TOML
pub enum ContentStatus {
Draft,
Published,
Archived,
Deprecated,
}
/// Structure to hold frontmatter metadata from content files.
#[derive(Debug, Clone, PartialEq, Serialize, Deserialize)]
pub struct FrontMatter {
pub title: String,
pub date: NaiveDate,
pub draft: Option<bool>,
pub description: Option<String>,
pub slug: Option<String>,
pub weight: Option<usize>,
pub keywords: Option<Vec<String>>,
pub tags: Option<Vec<String>>,
pub categories: Option<Vec<String>>,
pub author: Option<String>,
pub show_reading_time: Option<bool>,
pub show_table_of_contents: Option<bool>,
pub show_comments: Option<bool>,
pub toc: Option<bool>,
// --- New Advanced Metadata Fields ---
pub status: Option<ContentStatus>,
pub last_updated: Option<NaiveDate>,
pub related_articles: Option<Vec<String>>,
pub audience: Option<String>,
// Allow for arbitrary additional fields
#[serde(flatten)]
pub extra: HashMap<String, serde_json::Value>,
}
impl Default for FrontMatter {
fn default() -> Self {
FrontMatter {
title: "Untitled".to_string(),
date: NaiveDate::from_ymd_opt(1970, 1, 1).unwrap(), // Sensible default
draft: Some(false),
description: None,
slug: None,
weight: None,
keywords: None,
tags: None,
categories: None,
author: None,
show_reading_time: Some(true),
show_table_of_contents: Some(true),
show_comments: Some(false),
toc: Some(true),
status: Some(ContentStatus::Draft), // Default to Draft
last_updated: None,
related_articles: None,
audience: None,
extra: HashMap::new(),
}
}
}
/// Parses the frontmatter string into a FrontMatter struct.
pub fn parse_frontmatter(content: &str) -> anyhow::Result<(FrontMatter, String)> {
let parts: Vec<&str> = content.split("+++").collect();
if parts.len() < 3 {
// Log a warning for malformed frontmatter, but try to proceed with default
warn!("Content file missing valid frontmatter delimiters (+++). Attempting to parse entire file as content.");
return Ok((FrontMatter::default(), content.to_string()));
}
let frontmatter_str = parts[1].trim();
let markdown_content = parts[2..].join("+++").trim().to_string();
match serde_yaml::from_str::<FrontMatter>(frontmatter_str) {
Ok(mut fm) => {
// Ensure essential fields have defaults if not provided
if fm.title == "Untitled" && !fm.extra.is_empty() {
if let Some(val) = fm.extra.get("title") {
if let Some(s) = val.as_str() {
fm.title = s.to_string();
fm.extra.remove("title"); // Remove to avoid duplication
}
}
}
if fm.date == NaiveDate::from_ymd_opt(1970, 1, 1).unwrap() && !fm.extra.is_empty() {
if let Some(val) = fm.extra.get("date") {
if let Some(s) = val.as_str() {
if let Ok(date) = NaiveDate::parse_from_str(s, "%Y-%m-%d") {
fm.date = date;
fm.extra.remove("date");
} else if let Ok(datetime) = NaiveDate::parse_from_str(s, "%Y-%m-%dT%H:%M:%S%z") { // ISO 8601
fm.date = datetime;
fm.extra.remove("date");
} else if let Ok(datetime) = NaiveDate::parse_from_str(s, "%Y-%m-%d %H:%M:%S") { // Common format
fm.date = datetime;
fm.extra.remove("date");
} else {
warn!("Failed to parse date from frontmatter: {}", s);
}
}
}
}
// Set last_updated if not provided but date is present
if fm.last_updated.is_none() && fm.date != NaiveDate::from_ymd_opt(1970, 1, 1).unwrap() {
fm.last_updated = Some(fm.date);
}
Ok((fm, markdown_content))
}
Err(e) => {
error!("Failed to parse frontmatter: {}. Content:\n{}", e, frontmatter_str);
// In a production SSG, you might want to return an error here
// or return a default FrontMatter with a warning.
// For now, let's return a default and log the error.
Ok((FrontMatter::default(), content.to_string()))
}
}
}
Explanation:
- ContentStatus enum: We use #[serde(rename_all = "lowercase")] so that each variant is written in lowercase in frontmatter (e.g., status: draft will work). The match is exact, not case-insensitive, so status: Draft is rejected.
- New FrontMatter fields: status, last_updated, related_articles, and audience are added as Option<T> because they are optional in content files.
- Default implementation: Updated to provide sensible defaults for the new fields, so the fallback FrontMatter used when parsing fails is always valid.
- parse_frontmatter: The error handling for parsing has been improved. If frontmatter parsing fails, it now logs an error with the problematic content and falls back to a default FrontMatter instance, allowing the build to continue (with the error logged). We also added logic to default last_updated to date if last_updated isn’t explicitly set. This provides a reasonable fallback.
2. Update Content Struct and Version Extraction
Now, let’s modify src/content.rs to include the path_version field and update our Content::from_file (or similar) function to extract this version from the file path. We’ll use regular expressions to reliably find version patterns like v1.0, v2, 2024, etc., within the path.
src/content.rs
use anyhow::{Context, Result};
use std::path::{Path, PathBuf};
use pulldown_cmark::{Parser, Options, html};
use crate::frontmatter::{FrontMatter, parse_frontmatter, ContentStatus};
use log::{debug, warn, error};
use regex::Regex; // New dependency for path version extraction
/// Represents a single piece of content (e.g., a blog post, a documentation page).
#[derive(Debug, Clone)]
pub struct Content {
pub file_path: PathBuf,
pub relative_path: PathBuf, // Path relative to the content root
pub front_matter: FrontMatter,
pub markdown: String,
pub html: String,
pub url: String, // The final URL for this content
// --- New Field ---
pub path_version: Option<String>, // Extracted from the file path
}
impl Content {
/// Creates a new `Content` instance by reading and parsing a Markdown file.
pub fn from_file(file_path: &Path, content_root: &Path) -> Result<Self> {
let content_string = std::fs::read_to_string(file_path)
.with_context(|| format!("Failed to read content file: {:?}", file_path))?;
let (front_matter, markdown) = parse_frontmatter(&content_string)
.context("Failed to parse frontmatter from content file")?;
let relative_path = file_path.strip_prefix(content_root)
.with_context(|| format!("Failed to get relative path for {:?}", file_path))?
.to_path_buf();
// --- Extract version from path ---
let path_version = Self::extract_version_from_path(&relative_path);
if let Some(version) = &path_version {
debug!("Extracted version '{}' from path: {:?}", version, relative_path);
}
// Convert Markdown to HTML
let mut options = Options::empty();
options.insert(Options::ENABLE_TABLES);
options.insert(Options::ENABLE_FOOTNOTES);
options.insert(Options::ENABLE_STRIKETHROUGH);
options.insert(Options::ENABLE_TASKLISTS);
options.insert(Options::ENABLE_SMART_PUNCTUATION);
options.insert(Options::ENABLE_HEADING_ATTRIBUTES); // For anchors
let parser = Parser::new_ext(&markdown, options);
let mut html_output = String::new();
html::push_html(&mut html_output, parser);
// Placeholder for URL generation (will be refined later)
let file_name = file_path.file_stem().and_then(|s| s.to_str()).unwrap_or("index");
let url = format!("/{}.html", file_name); // Basic URL, will be improved
Ok(Content {
file_path: file_path.to_path_buf(),
relative_path,
front_matter,
markdown,
html: html_output,
url,
path_version,
})
}
/// Extracts a version string from a given path.
/// Looks for patterns like 'v1.0', 'v2', '2024', etc., typically in a directory name.
fn extract_version_from_path(path: &Path) -> Option<String> {
// Regex to match common version patterns in a whole path segment,
// e.g., /v1.0/, /2024/, /version-2/.
// A segment qualifies if it is an optional 'v' or 'V' followed by digits and dots,
// or 'version-' followed by digits. Assumes '/' as the path separator.
let re = Regex::new(r"(?i)(?:^|/)(v?\d[\d\.]*|version-\d+)(?:/|$)")
.expect("Failed to compile version regex");
path.to_str()
.and_then(|s| re.captures(s))
.and_then(|caps| {
// Return the first captured group, which should be the version string
caps.get(1).map(|m| m.as_str().to_string())
})
}
}
Explanation:
- path_version: Option<String>: Added to the Content struct to store the version extracted from the file path.
- extract_version_from_path: This new private helper function uses the regex crate to find common version patterns (v1.0, 2024, version-2) within the content’s relative path.
  - We add regex = "1.10" to Cargo.toml.
  - The regex r"(?i)(?:^|/)(v?\d[\d\.]*|version-\d+)(?:/|$)" is designed to be flexible:
    - (?i) makes it case-insensitive.
    - (?:^|/) matches the start of the string or a directory separator.
    - (v?\d[\d\.]*|version-\d+) is the core version pattern:
      - v?\d[\d\.]*: Matches v1, v1.0, 1.0, 2024, etc. (v is optional, then a digit, then any number of digits or dots).
      - version-\d+: Matches version-1, version-2, etc.
    - (?:/|$) matches a directory separator or the end of the string.
    - We capture the actual version string in group 1.
- Content::from_file update: Calls Self::extract_version_from_path and stores the result in path_version.
- Logging: Added debug! logs to show when a version is extracted, aiding in debugging.
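To make the matching rules above concrete, here is a minimal sketch that applies the same rules segment by segment using only the standard library. It is an illustration rather than a replacement for the regex version: the function names is_version_segment and extract_version_from_components are hypothetical and do not appear elsewhere in this project.

```rust
use std::path::{Component, Path};

/// Returns true if `segment` looks like a version directory name:
/// an optional 'v'/'V' followed by a digit and then digits or dots,
/// or "version-" followed by digits (case-insensitive).
fn is_version_segment(segment: &str) -> bool {
    let lower = segment.to_ascii_lowercase();
    if let Some(rest) = lower.strip_prefix("version-") {
        return !rest.is_empty() && rest.chars().all(|c| c.is_ascii_digit());
    }
    // Drop an optional leading 'v' before checking the numeric part.
    let numeric = lower.strip_prefix('v').unwrap_or(lower.as_str());
    let mut chars = numeric.chars();
    match chars.next() {
        Some(first) if first.is_ascii_digit() => {
            chars.all(|c| c.is_ascii_digit() || c == '.')
        }
        _ => false,
    }
}

/// Hypothetical regex-free counterpart of extract_version_from_path:
/// returns the first path component that looks like a version.
fn extract_version_from_components(path: &Path) -> Option<String> {
    path.components()
        .filter_map(|component| match component {
            Component::Normal(name) => name.to_str(),
            _ => None,
        })
        .find(|name| is_version_segment(name))
        .map(|name| name.to_string())
}
```

Because it walks path components instead of searching for '/', this variant does not depend on the platform's path separator.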
Add regex dependency:
Cargo.toml
# ... other dependencies
[dependencies]
# ... existing dependencies
regex = "1.10" # Add this line
c) Testing This Component
To test these changes, we’ll create a new content file that utilizes both the path-based versioning and the expanded frontmatter fields.
1. Create Sample Content File
Create a new directory and file: content/docs/v1.0/introduction.md.
content/docs/v1.0/introduction.md
+++
title: "Introduction to Our Platform"
date: 2023-01-15
draft: false
description: "An introductory guide to understanding our platform's core concepts."
slug: "introduction"
tags: ["platform", "getting-started", "basics"]
categories: ["Documentation"]
author: "Dev Team"
status: "published"
last_updated: 2024-02-28
related_articles: ["/docs/v1.0/setup", "/docs/v2.0/whats-new"]
audience: "developers"
extra_field: "some_value"
+++
# Welcome to Our Platform (Version 1.0)
This document provides a comprehensive introduction to our platform's version 1.0 features.
## Key Concepts
* **Scalability**: Designed for high load.
* **Security**: Built with best practices.
## What's Next?
Explore our [setup guide](/docs/v1.0/setup) to get started.
And another one: content/blog/2024/new-features.md
content/blog/2024/new-features.md
+++
title: "Exciting New Features for 2024"
date: 2024-03-01
draft: false
description: "A look at the latest enhancements and features released in 2024."
slug: "new-features-2024"
tags: ["features", "updates", "release"]
categories: ["Blog"]
author: "Product Team"
status: "published"
last_updated: 2024-03-02
audience: "all-users"
+++
# New Features Released (2024)
We are thrilled to announce several major updates for our platform in 2024.
## Performance Improvements
Significant optimizations have been made across the board...
2. Update main.rs to Process Content
Ensure your main.rs is set up to load content files from the content directory.
src/main.rs
use anyhow::{Context, Result}; // Context provides the .context() calls below
use std::path::PathBuf;
use std::fs;
use crate::content::Content; // Import Content struct
use env_logger::Env;
use log::{info, error, debug};
mod frontmatter;
mod content;
mod template; // Assuming you have this from previous chapters
mod build; // Assuming you have this from previous chapters
fn main() -> Result<()> {
// Initialize logging
env_logger::Builder::from_env(Env::default().default_filter_or("info")).init();
info!("Starting SSG build process...");
let content_dir = PathBuf::from("content");
let output_dir = PathBuf::from("public");
// Ensure output directory exists and is clean
if output_dir.exists() {
fs::remove_dir_all(&output_dir)
.context(format!("Failed to remove existing output directory: {:?}", output_dir))?;
}
fs::create_dir_all(&output_dir)
.context(format!("Failed to create output directory: {:?}", output_dir))?;
let mut contents: Vec<Content> = Vec::new();
// Recursively read content files
for entry in walkdir::WalkDir::new(&content_dir) {
let entry = entry?;
let path = entry.path();
if path.is_file() && path.extension().map_or(false, |ext| ext == "md") {
debug!("Processing content file: {:?}", path);
match Content::from_file(path, &content_dir) {
Ok(content) => {
info!("Successfully parsed content: {:?} (URL: {})", content.relative_path, content.url);
debug!("Frontmatter: {:?}", content.front_matter);
debug!("Path Version: {:?}", content.path_version); // Log the new field
contents.push(content);
}
Err(e) => {
error!("Failed to process content file {:?}: {:?}", path, e);
}
}
}
}
info!("Processed {} content files.", contents.len());
// In a real scenario, you'd now pass `contents` to a build/render step
// For now, we'll just print some info to verify.
for content in &contents {
info!("--- Content Details ---");
info!(" Title: {}", content.front_matter.title);
info!(" Date: {}", content.front_matter.date);
info!(" Status: {}", content.front_matter.status.as_ref().map(|s| format!("{:?}", s)).unwrap_or_else(|| "N/A".to_string()));
info!(" Last Updated: {:?}", content.front_matter.last_updated);
info!(" Audience: {:?}", content.front_matter.audience);
info!(" Related Articles: {:?}", content.front_matter.related_articles);
info!(" Path Version: {:?}", content.path_version);
info!(" Relative Path: {:?}", content.relative_path);
info!(" URL: {}", content.url);
info!("-----------------------");
}
info!("SSG build process finished.");
Ok(())
}
Add walkdir dependency:
Cargo.toml
# ... other dependencies
[dependencies]
# ... existing dependencies
walkdir = "2.5" # Add this line
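Run the SSG with debug logging enabled (the default filter is info, which hides the debug! lines):
RUST_LOG=debug cargo run
Expected Behavior: You should see output similar to this (simplified; the order of files may differ):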
INFO ssg_project > Starting SSG build process...
DEBUG ssg_project > Processing content file: "content/docs/v1.0/introduction.md"
DEBUG ssg_project > Extracted version 'v1.0' from path: "docs/v1.0/introduction.md"
INFO ssg_project > Successfully parsed content: "docs/v1.0/introduction.md" (URL: /introduction.html)
DEBUG ssg_project > Frontmatter: FrontMatter { title: "Introduction to Our Platform", date: 2023-01-15, ..., status: Some(Published), last_updated: Some(2024-02-28), related_articles: Some(["/docs/v1.0/setup", "/docs/v2.0/whats-new"]), audience: Some("developers"), extra: {"extra_field": String("some_value")} }
DEBUG ssg_project > Path Version: Some("v1.0")
DEBUG ssg_project > Processing content file: "content/blog/2024/new-features.md"
DEBUG ssg_project > Extracted version '2024' from path: "blog/2024/new-features.md"
INFO ssg_project > Successfully parsed content: "blog/2024/new-features.md" (URL: /new-features.html)
DEBUG ssg_project > Frontmatter: FrontMatter { title: "Exciting New Features for 2024", date: 2024-03-01, ..., status: Some(Published), last_updated: Some(2024-03-02), related_articles: None, audience: Some("all-users"), extra: {} }
DEBUG ssg_project > Path Version: Some("2024")
INFO ssg_project > Processed 2 content files.
INFO ssg_project > --- Content Details ---
INFO ssg_project > Title: Introduction to Our Platform
INFO ssg_project > Date: 2023-01-15
INFO ssg_project > Status: Published
INFO ssg_project > Last Updated: Some(2024-02-28)
INFO ssg_project > Audience: Some("developers")
INFO ssg_project > Related Articles: Some(["/docs/v1.0/setup", "/docs/v2.0/whats-new"])
INFO ssg_project > Path Version: Some("v1.0")
INFO ssg_project > Relative Path: "docs/v1.0/introduction.md"
INFO ssg_project > URL: /introduction.html
INFO ssg_project > -----------------------
INFO ssg_project > --- Content Details ---
INFO ssg_project > Title: Exciting New Features for 2024
INFO ssg_project > Date: 2024-03-01
INFO ssg_project > Status: Published
INFO ssg_project > Last Updated: Some(2024-03-02)
INFO ssg_project > Audience: Some("all-users")
INFO ssg_project > Related Articles: None
INFO ssg_project > Path Version: Some("2024")
INFO ssg_project > Relative Path: "blog/2024/new-features.md"
INFO ssg_project > URL: /new-features.html
INFO ssg_project > -----------------------
INFO ssg_project > SSG build process finished.
This output confirms that:
- The path_version is correctly extracted from content/docs/v1.0/introduction.md as “v1.0” and from content/blog/2024/new-features.md as “2024”.
- The new frontmatter fields (status, last_updated, related_articles, audience, extra_field) are correctly parsed and deserialized into the FrontMatter struct.
- The ContentStatus enum works as expected.
Production Considerations
Error Handling for Metadata:
- Invalid ContentStatus: If a user specifies status: unknown in frontmatter, serde will fail to deserialize ContentStatus. Our current parse_frontmatter logs an error and defaults. For production, you might want a stricter approach, potentially failing the build for that specific file or marking the content as “invalid” rather than silently defaulting.
- Date Parsing: NaiveDate::parse_from_str can fail. We’ve added basic error logging, but robust date parsing might involve trying multiple formats or using a more forgiving library if strict ISO 8601 is not enforced.
- Missing Essential Metadata: While optional fields are fine, some metadata (e.g., title) might be critical. You could add a validation step after parsing frontmatter to ensure all required fields are present and log critical errors if they are not.
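The validation step suggested above could look roughly like the following sketch. FrontMatterSummary is a hypothetical, trimmed-down stand-in for the FrontMatter struct, and the two rules shown are assumptions chosen for illustration; a real project would decide which fields count as required.

```rust
/// Hypothetical, trimmed-down stand-in for the chapter's FrontMatter struct.
struct FrontMatterSummary {
    title: String,
    description: Option<String>,
}

/// Collects every validation problem instead of stopping at the first one,
/// so a single build run can report all issues in a file.
fn validate(fm: &FrontMatterSummary) -> Result<(), Vec<String>> {
    let mut problems = Vec::new();

    // "Untitled" is the default title, so it signals a missing value.
    if fm.title.trim().is_empty() || fm.title == "Untitled" {
        problems.push("title is missing or still the default".to_string());
    }

    // An optional field may be absent, but should not be blank when present.
    if matches!(fm.description.as_deref(), Some(d) if d.trim().is_empty()) {
        problems.push("description is present but empty".to_string());
    }

    if problems.is_empty() {
        Ok(())
    } else {
        Err(problems)
    }
}
```

Returning the full list of problems lets the caller decide whether to log them, skip the file, or fail the build.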
Performance Optimization:
- Regex Compilation: extract_version_from_path compiles the regex on every call. This is acceptable for typical SSG builds, where parsing an individual file is not a hot loop, but it is wasted work. Defining the Regex once (e.g., using lazy_static! or a static that is initialized on first use) avoids recompilation.
- Frontmatter Parsing Speed: serde_yaml is generally efficient. For extremely large numbers of files, profiling might reveal bottlenecks, but for typical SSG scales, it’s usually not the primary concern.
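One way to get the compile-once behavior without an extra crate is std::sync::OnceLock. The sketch below shows only the caching pattern: CompiledPattern and compile are hypothetical stand-ins for regex::Regex and Regex::new, and the counter exists purely to show that initialization runs a single time.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts how many times the expensive setup actually ran.
static COMPILE_COUNT: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for a compiled regex::Regex.
struct CompiledPattern {
    source: &'static str,
}

/// Hypothetical stand-in for Regex::new; records each invocation.
fn compile(source: &'static str) -> CompiledPattern {
    COMPILE_COUNT.fetch_add(1, Ordering::SeqCst);
    CompiledPattern { source }
}

/// Returns the shared pattern, compiling it on first use only.
fn version_pattern() -> &'static CompiledPattern {
    static PATTERN: OnceLock<CompiledPattern> = OnceLock::new();
    PATTERN.get_or_init(|| compile(r"(?i)(?:^|/)(v?\d[\d\.]*|version-\d+)(?:/|$)"))
}
```

In the real helper, the body of get_or_init would call Regex::new(...) and the static would hold a Regex instead of the stand-in type.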
Security Considerations:
- Arbitrary Frontmatter: The extra: HashMap<String, serde_json::Value> field allows arbitrary data. While this is flexible, ensure that any downstream processing of these extra fields is secure and doesn’t execute untrusted code or lead to injection vulnerabilities if used in dynamic contexts (e.g., client-side JavaScript). For static HTML, this risk is minimal.
- Path Traversal: Content::from_file derives relative_path with strip_prefix, which fails for files outside the content root. However, Path and PathBuf do not resolve .. components or symlinks on their own, so validate relative_path before using it to construct URLs or output file paths.
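A lexical check along these lines can be sketched with std::path::Component. The function name is_safe_relative_path is hypothetical, and the check does not follow symlinks; it only rejects paths that are absolute or contain .. components.

```rust
use std::path::{Component, Path};

/// Returns true only if every component is a plain name (or a leading "."),
/// so the path cannot climb out of the directory it is later joined onto.
/// This is a purely lexical check: symlinks are not resolved.
fn is_safe_relative_path(path: &Path) -> bool {
    path.components()
        .all(|component| matches!(component, Component::Normal(_) | Component::CurDir))
}
```

A caller could run this on relative_path before joining it onto the output directory and skip or reject any file that fails the check.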
Logging and Monitoring:
- Granular Logging: We’ve added debug!, info!, and error! logs. In a production environment, configuring env_logger to filter logs (e.g., only info and error in production, debug in development) is crucial.
- Monitoring Build Failures: If a build fails due to critical frontmatter errors, this should trigger alerts in a CI/CD pipeline.
Code Review Checkpoint
At this point, we have significantly enhanced our SSG’s ability to understand and categorize content.
Files Created/Modified:
- Cargo.toml: Added regex and walkdir dependencies. Ensured chrono has the serde feature.
- src/frontmatter.rs:
  - Defined ContentStatus enum.
  - Added status, last_updated, related_articles, audience to FrontMatter struct.
  - Updated FrontMatter::default() and parse_frontmatter for new fields and improved error handling.
- src/content.rs:
  - Added path_version: Option<String> to Content struct.
  - Implemented extract_version_from_path using regex to parse versions from file paths.
  - Updated Content::from_file to call extract_version_from_path.
- src/main.rs:
  - Updated to iterate through content files using walkdir.
  - Added logging to display the newly parsed path_version and advanced frontmatter fields.
  - Imported env_logger and initialized it for better debugging.
Integration with Existing Code:
The changes are largely additive and integrate smoothly. The Content struct now holds more data, which will be available for templating and routing logic in subsequent chapters. The parsing logic is more robust due to improved error handling and type-safe enums.
Common Issues & Solutions
Issue: ContentStatus deserialization errors (e.g., unknown variant `pending`, expected one of `draft`, `published`, `archived`, `deprecated`)
- Problem: The frontmatter specified a status value that doesn’t match any of the ContentStatus enum variants (e.g., status: Pending instead of status: draft).
- Solution:
  - Check spelling: Ensure the value in your frontmatter exactly matches one of the lowercase variant names produced by #[serde(rename_all = "lowercase")] (draft, published, archived, deprecated).
  - Add new variants: If you need a new status, add it to the ContentStatus enum in src/frontmatter.rs.
  - Implement a custom deserializer: For more complex mapping or error recovery for ContentStatus, you could implement TryFrom<String> for ContentStatus and use #[serde(try_from = "String")] or a custom serde deserializer. Our current setup relies on serde’s default string-to-enum mapping.
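The TryFrom<String> route mentioned above might be sketched as follows. This standalone version redefines ContentStatus without the serde derives so that it compiles on its own; in the project, the conversion would sit next to the existing enum, with #[serde(try_from = "String")] placed on it. Trimming whitespace and ignoring case are assumptions made for this example, not behavior of the current code.

```rust
use std::convert::TryFrom;
use std::str::FromStr;

/// Standalone copy of the chapter's enum, without the serde derives.
#[derive(Debug, Clone, PartialEq, Eq)]
enum ContentStatus {
    Draft,
    Published,
    Archived,
    Deprecated,
}

impl FromStr for ContentStatus {
    type Err = String;

    /// Accepts any letter case and surrounding whitespace (an assumption
    /// for this sketch) and reports the allowed values on failure.
    fn from_str(raw: &str) -> Result<Self, Self::Err> {
        match raw.trim().to_ascii_lowercase().as_str() {
            "draft" => Ok(ContentStatus::Draft),
            "published" => Ok(ContentStatus::Published),
            "archived" => Ok(ContentStatus::Archived),
            "deprecated" => Ok(ContentStatus::Deprecated),
            other => Err(format!(
                "unknown status '{}'; expected draft, published, archived, or deprecated",
                other
            )),
        }
    }
}

impl TryFrom<String> for ContentStatus {
    type Error = String;

    /// Delegates to FromStr so both entry points share one set of rules.
    fn try_from(raw: String) -> Result<Self, Self::Error> {
        raw.parse()
    }
}
```

With this in place, a value such as status: Published would be accepted, while status: pending would still produce an error that names the allowed values.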
Issue: NaiveDate parsing errors (e.g., input contains invalid characters or premature end of input)
- Problem: The date or last_updated field in your frontmatter is not in a format chrono expects (e.g., 2023/01/15 instead of 2023-01-15).
- Solution:
  - Standardize date format: Always use YYYY-MM-DD (e.g., 2026-03-02) in your frontmatter. This is the most common and easily parsed format.
  - Multiple format attempts: For more flexibility, you could modify parse_frontmatter to try parsing the date string with multiple NaiveDate::parse_from_str formats in a sequence, logging a warning if none succeed. For example, trying "%Y-%m-%d", then "%Y/%m/%d", etc. Be cautious not to make it too permissive, as ambiguity can lead to incorrect dates.
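The "try several formats in order" idea can be illustrated without chrono. The sketch below is a standard-library stand-in, not the project's parser: parse_ymd and parse_date_lenient are hypothetical names, the result is a plain (year, month, day) tuple rather than a NaiveDate, and the range checks are deliberately rough (they do not know how many days each month has).

```rust
/// Tries to read `raw` as year, month, day with `sep` between the parts.
fn parse_ymd(raw: &str, sep: char) -> Option<(i32, u32, u32)> {
    let mut parts = raw.trim().split(sep);
    let year = parts.next()?.parse::<i32>().ok()?;
    let month = parts.next()?.parse::<u32>().ok()?;
    let day = parts.next()?.parse::<u32>().ok()?;

    // Reject trailing parts and obviously impossible values.
    if parts.next().is_some() || !(1..=12).contains(&month) || !(1..=31).contains(&day) {
        return None;
    }
    Some((year, month, day))
}

/// Tries each accepted separator in order and returns the first success.
/// Only year-first layouts are accepted, which avoids day/month ambiguity.
fn parse_date_lenient(raw: &str) -> Option<(i32, u32, u32)> {
    ['-', '/'].iter().find_map(|&sep| parse_ymd(raw, sep))
}
```

With chrono, the same shape applies: iterate over a short list of format strings, call NaiveDate::parse_from_str for each, and log a warning if none succeeds.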
Issue: path_version is always None despite content having versioned paths.
- Problem: The extract_version_from_path regex might not be matching your specific versioning scheme in the file path.
- Solution:
  - Verify regex: Double-check the regex r"(?i)(?:^|/)(v?\d[\d\.]*|version-\d+)(?:/|$)" against your actual content file paths.
  - Test regex separately: Use an online regex tester (like regex101.com) with your content paths (e.g., /docs/v1.0/intro.md) and the regex to ensure it captures the desired version string.
  - Adjust regex: Modify the regex in extract_version_from_path to specifically match your versioning convention (e.g., if you use release-1.0 instead of v1.0).
  - Check log::debug! output: The debug! messages for Extracted version should help you see if the regex is finding anything. Ensure your RUST_LOG environment variable is set to debug to see these logs (RUST_LOG=debug cargo run).
Testing & Verification
To thoroughly test the changes in this chapter:
Content File Variations:
- Create a file content/docs/v1.0/article.md with status: published, last_updated: <date>, related_articles: [...], audience: "devs".
- Create a file content/docs/v2.0/article.md with status: draft.
- Create a file content/blog/my-post.md (no path version), with status: archived.
- Create a file with a malformed status (e.g., status: invalid-status) and observe the error logging and default fallback.
- Create a file with an invalid date format and observe the error logging.
- Create a file with no last_updated to check if it defaults to date.
Run with RUST_LOG=debug:
RUST_LOG=debug cargo run
Observe the detailed logs for each content file:
- Confirm Path Version: Some(...) is correctly extracted for versioned paths and None otherwise.
- Confirm all new frontmatter fields (Status, Last Updated, Audience, Related Articles) are correctly parsed and displayed in the Content Details section.
- Verify that extra_field is correctly stored in the extra HashMap.
- Check for any WARN or ERROR messages regarding frontmatter parsing for malformed files.
Unit Tests (Future Enhancement): While we’ve done manual verification, for a production SSG, you would write dedicated unit tests for:
- frontmatter::parse_frontmatter with various valid and invalid YAML inputs.
- Content::extract_version_from_path with diverse path strings.
- Content::from_file to ensure complete content object deserialization.
By performing these checks, you can verify that our SSG now correctly parses advanced metadata and version information, making the content pipeline much more robust and feature-rich.
Summary & Next Steps
In this chapter, we significantly upgraded our SSG’s content management capabilities. We implemented:
- Path-based Content Versioning: Our SSG can now automatically detect version information (e.g., v1.0, 2024) from file paths and store it with the content. This is crucial for managing documentation, APIs, or any content that evolves over time.
- Enhanced Frontmatter Metadata: We extended our FrontMatter struct to include richer, more descriptive metadata fields such as ContentStatus (draft, published, archived, deprecated), last_updated, related_articles, and audience. This gives content creators more control and allows for more dynamic rendering logic.
- Robust Parsing and Error Handling: We improved the frontmatter parsing logic with better error reporting and fallback mechanisms, making our SSG more resilient to malformed content files.
This enhanced content model is a cornerstone for building truly powerful and flexible static sites. Having rich metadata associated with each piece of content unlocks a multitude of possibilities for customization, filtering, and dynamic behavior.
In the next chapter, Chapter 10: Component Support in Markdown (Custom Syntax & Rendering), we will tackle a powerful feature inspired by modern frameworks like Astro: embedding interactive or reusable components directly within Markdown content using a custom syntax. This will allow content creators to inject dynamic elements or complex UI patterns without leaving the Markdown file, bridging the gap between static content and interactive web applications.