Introduction

mdvs treats your markdown directory like a database. It scans your files, infers a typed schema from frontmatter, validates it, and builds a local search index — all in a single binary with no external services.

Not a document database. A database for documents.

The problem

Markdown directories grow organically. You start with a few notes, add frontmatter when it’s useful, and eventually have hundreds of files with inconsistent metadata. Tags are misspelled. Required fields are missing. You can’t find anything without grep.

mdvs gives you structure without forcing you to change how you write.

Frontmatter

Frontmatter is the YAML block between --- fences at the top of a markdown file. It stores structured metadata alongside your content:

---
title: "Experiment A-017: SPR-A1 baseline calibration"    # String
status: completed                                         # String
author: Giulia Ferretti                                   # String
draft: false                                              # Boolean
priority: 2                                               # Integer
drift_rate: 0.023                                         # Float
tags:                                                     # String[]
  - calibration
  - SPR-A1
  - baseline
---
# Your markdown content starts here...

mdvs recognizes these types automatically. When it scans your files, it infers the type of each field from the values it finds — no configuration needed.
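As a rough illustration of value-based inference, here is a minimal Python sketch. It assumes the frontmatter has already been parsed into native values. The function name `infer_type`, the type labels, and the fallback for mixed or empty arrays are assumptions made for this example, not mdvs's actual implementation.

```python
def infer_type(value):
    """Map one parsed frontmatter value to a type label (labels are illustrative)."""
    # bool must be tested before int: in Python, True and False are also ints.
    if isinstance(value, bool):
        return "Boolean"
    if isinstance(value, int):
        return "Integer"
    if isinstance(value, float):
        return "Float"
    if isinstance(value, str):
        return "String"
    if isinstance(value, list):
        element_types = {infer_type(v) for v in value}
        if len(element_types) == 1:
            return element_types.pop() + "[]"
        # Assumption: mixed or empty arrays fall back to an array of strings.
        return "String[]"
    if isinstance(value, dict):
        return "Object"
    return "Null"


# The frontmatter from the example above, already parsed into Python values.
frontmatter = {
    "title": "Experiment A-017: SPR-A1 baseline calibration",
    "status": "completed",
    "draft": False,
    "priority": 2,
    "drift_rate": 0.023,
    "tags": ["calibration", "SPR-A1", "baseline"],
}

inferred = {field: infer_type(value) for field, value in frontmatter.items()}
for field, type_name in inferred.items():
    print(f"{field}: {type_name}")
```

The bool-before-int ordering matters in Python, where `isinstance(True, int)` is true. A real implementation in another language would face its own version of that ambiguity.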

Directory-aware schema

mdvs infers a three-dimensional schema from your files:

  • Types — boolean, integer, float, string, arrays, nested objects. Inferred automatically, with widening when files disagree.
  • Paths — which fields belong in which directories. draft only in blog/, sensor_type only in projects/alpha/notes/. Captured as allowed and required glob patterns.
  • Nullability — whether a field can be null. Tracked per field.

This means different directories can have different fields with different constraints — all inferred automatically from your existing files.
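To make widening and nullability concrete, the sketch below merges the values one field takes across several files. The widening rules shown (Integer and Float widen to Float, any other disagreement widens to String) are an assumption chosen for illustration. The text above does not spell out mdvs's exact rules, and the helper names are hypothetical.

```python
def scalar_type(value):
    """Type label for a scalar value (illustrative labels)."""
    if isinstance(value, bool):  # checked before int, since bool is an int subclass
        return "Boolean"
    if isinstance(value, int):
        return "Integer"
    if isinstance(value, float):
        return "Float"
    return "String"


def widen(a, b):
    """Smallest type covering both a and b, under the assumed rules."""
    if a == b:
        return a
    if {a, b} == {"Integer", "Float"}:
        return "Float"
    return "String"  # assumption: any other disagreement widens to String


def merge_field(values):
    """Return (type, nullable) for one field, given its value in each file."""
    nullable = any(v is None for v in values)
    merged = None
    for v in values:
        if v is None:
            continue  # nulls affect nullability, not the type
        t = scalar_type(v)
        merged = t if merged is None else widen(merged, t)
    return merged, nullable


# priority is 2 in one file, 1.5 in another, and null in a third.
print(merge_field([2, 1.5, None]))
```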

Tightest fit: mdvs init infers the strictest schema that’s consistent with your existing files. A field is inferred as allowed in a directory if at least one file there has it. It’s inferred as required if every file there has it. These rules propagate up: if every subdirectory requires a field, the parent directory does too. The result is the tightest set of constraints for which mdvs check still returns zero violations. You can always loosen them later.
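The tightest-fit rule can be sketched in a few lines of Python. This version assumes that a directory's files include those in its subdirectories, which is one way to get the upward propagation described above. It records plain directory names rather than glob patterns, and the function names `infer_paths` and `violations` are hypothetical.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def infer_paths(files):
    """files maps a relative path to the set of fields in that file.

    Returns (allowed, required): for each field, the directories where it is
    allowed (some file has it) or required (every file has it).
    """
    by_dir = defaultdict(list)
    for path, fields in files.items():
        # Assumption: a file counts toward its directory and every ancestor.
        for parent in PurePosixPath(path).parents:
            by_dir[str(parent)].append(fields)

    allowed, required = defaultdict(set), defaultdict(set)
    for directory, field_sets in by_dir.items():
        for field in set().union(*field_sets):
            allowed[field].add(directory)
            if all(field in fs for fs in field_sets):
                required[field].add(directory)
    return allowed, required


def violations(files, allowed, required):
    """List (path, field, problem) for files that break the constraints."""
    problems = []
    for path, fields in files.items():
        directory = str(PurePosixPath(path).parent)
        for field, dirs in required.items():
            if directory in dirs and field not in fields:
                problems.append((path, field, "missing required field"))
        for field in fields:
            if directory not in allowed.get(field, set()):
                problems.append((path, field, "field not allowed here"))
    return problems


files = {
    "blog/a.md": {"title", "draft"},
    "blog/b.md": {"title", "draft"},
    "projects/alpha/notes/n1.md": {"title", "sensor_type"},
    "projects/alpha/notes/n2.md": {"title"},
}
allowed, required = infer_paths(files)
print(violations(files, allowed, required))
```

Because the constraints are derived from the files themselves, checking those same files reports nothing. A new `blog/c.md` without `draft` would be flagged, since every existing file in `blog/` has that field.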

Two layers

mdvs has two distinct capabilities that work independently:

Validation — Scan your files, infer what frontmatter fields exist, which directories they appear in, and what types they have. Write the result to mdvs.toml. Then validate files against that schema. No model, no index, nothing to download.

Search — Chunk your markdown, embed it with a lightweight local model, store the vectors in Parquet files in .mdvs/, and query with natural language. Filter results on any frontmatter field using standard SQL.
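The filter-then-rank idea behind search can be illustrated with a toy example. Everything here is a stand-in: the three-dimensional vectors are hand-written rather than produced by an embedding model, an in-memory SQLite table replaces the Parquet files, and the `search` function and its table layout are hypothetical. It shows only the general shape of combining a SQL filter on frontmatter with similarity ranking.

```python
import json
import math
import sqlite3


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Toy index: one row per chunk, with frontmatter columns and a JSON-encoded vector.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE chunks (path TEXT, status TEXT, priority INTEGER, vector TEXT)"
)
rows = [
    ("a017.md", "completed", 2, [0.9, 0.1, 0.0]),
    ("a018.md", "draft", 1, [0.8, 0.2, 0.0]),
    ("b002.md", "completed", 3, [0.1, 0.9, 0.0]),
]
conn.executemany(
    "INSERT INTO chunks VALUES (?, ?, ?, ?)",
    [(path, status, priority, json.dumps(vec)) for path, status, priority, vec in rows],
)


def search(query_vector, where="1 = 1"):
    """Filter chunks with a SQL WHERE clause, then rank by cosine similarity.

    The clause is interpolated directly into the query, which is acceptable
    only in a toy example with trusted input.
    """
    cursor = conn.execute(f"SELECT path, vector FROM chunks WHERE {where}")
    scored = [(path, cosine(query_vector, json.loads(vec))) for path, vec in cursor]
    return sorted(scored, key=lambda item: item[1], reverse=True)


for path, score in search([1.0, 0.0, 0.0], where="status = 'completed'"):
    print(f"{path}  {score:.3f}")
```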

You need validation without search? Run mdvs init, customize the fields in mdvs.toml, and run mdvs check.

You want search without validation? Just run mdvs init and mdvs search. The inferred schema is used to extract metadata for search results, but you don’t have to worry about it if you don’t want to.

Use them together for the best experience, or separately if that’s what you need.

Using a nested directory of markdown files as a database

You can think of mdvs as a layer on top of your markdown files that gives you database-like capabilities. Here’s a rough mapping of concepts and commands:

| Concept | Database | mdvs |
|---|---|---|
| Define structure | CREATE TABLE | mdvs init |
| Per-table columns | Different columns per table | Per-directory fields via allowed/required globs |
| Enforce constraints | Constraint validation | mdvs check |
| Evolve structure | ALTER TABLE | mdvs update |
| Create an index | CREATE INDEX | mdvs build |
| Query | SELECT ... WHERE ... ORDER BY | mdvs search --where |

mdvs produces two artifacts: mdvs.toml (your schema, meant to be committed) and .mdvs/ (the search index, which version control can ignore).

What this book covers

This book uses a fictional research lab knowledge base (example_kb) as a running example. Every command, every output, every query is real and reproducible.

  • Getting Started — Install mdvs and run it on the example vault
  • Concepts — How schema inference, types, and validation work
  • Commands — Full reference for all 7 commands
  • Configuration — The mdvs.toml file explained
  • Search Guide — SQL filtering, array queries, and ranking
  • Recipes — Obsidian setup, CI integration