Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Hugo

mdvs works directly on a Hugo site’s content/ tree. Hugo accepts YAML (---), TOML (+++), and JSON ({...}) frontmatter; mdvs accepts the same three formats and auto-detects per file, so it doesn’t matter which convention your site uses — or whether you’ve drifted across formats over time.

Setup

Point mdvs at the content/ directory:

mdvs init path/to/site/content

This scans every markdown file, infers a typed schema from the frontmatter (across all three formats), and writes mdvs.toml alongside. If auto-build is enabled (the default), it also downloads the embedding model and builds the search index under .mdvs/.

Two artifacts are created next to content/:

  • mdvs.toml — commit to version control
  • .mdvs/ — add to .gitignore (search index, regenerable)

Some Hugo sites prefer to keep the schema and index alongside the site root rather than inside content/. In that case, run mdvs init . from the site root and use a glob:

[scan]
glob = "content/**"

Mixed-format vaults

Hugo’s docs show all three frontmatter formats interchangeably, and real-world sites often end up with a mix — an older --- post sitting next to a newer +++ post and an occasional {...} block emitted by a content tool. mdvs handles this transparently. A single mdvs.toml is inferred across all three formats; the same title, tags, draft fields collapse into one schema regardless of where they were written.

You can verify this with mdvs check after init:

$ mdvs check
Checked 142 files — no violations

Forcing a single format

If your site is opinionated about TOML (Hugo’s default for hugo new), tell mdvs:

[scan]
frontmatter_format = "toml"

Now any file that uses --- (YAML) or { (JSON) raises a FrontmatterUnrepresentable error during check, naming both the configured and detected delimiters. Useful when you want your CI to fail loudly if someone drops in a YAML post by accident.

Native TOML dates

Hugo’s TOML frontmatter often uses native Date / DateTime literals — unquoted, e.g.:

+++
title = "Launching v2"
date = 2024-09-01
publishedAt = 2024-09-01T09:00:00Z
+++

mdvs recognizes both as typed fields: date becomes FieldType::Date, publishedAt becomes FieldType::DateTime. No special configuration. You can then filter on them in search:

mdvs search "release notes" --where "publishedAt > '2024-01-01T00:00:00Z'"

Useful queries

Once the index is built, common Hugo-site workflows become one-liners:

Find drafts that have been sitting around:

mdvs search "" --where "draft = true" --output json

Posts in a particular taxonomy:

mdvs search "machine learning roundup" --where "'ml' = ANY(tags)"

Posts authored by a specific contributor in a date range:

mdvs search "authentication" \
  --where "author = 'alice' AND date >= '2024-01-01' AND date < '2024-04-01'"

The --where clause is SQL against your frontmatter — anything you can express as a column reference works. See the Search Guide for details.

Validating across an editorial workflow

Add mdvs check to your Hugo build pipeline so frontmatter drift fails CI:

# .github/workflows/build.yml (excerpt)
- name: Validate frontmatter
  run: mdvs check
- name: Build site
  run: hugo --minify

mdvs check returns exit code 1 if any file violates the schema (missing required field, wrong type, etc.), which is enough to break the build. The exact same mdvs.toml validates YAML, TOML, and JSON files uniformly — no per-format duplicate rules.

See the CI recipe for a more general-purpose CI workflow.