Constraints

Constraints are validation rules that go beyond type checking. While types ensure a value is a String or Integer, constraints refine what values are actually valid — for example, restricting a String field to a specific set of allowed values.

Constraints are not a new type. They’re an optional layer on top of the existing type system. A field without constraints is validated by type alone; a field with constraints gets an additional check.

Categories

The categories constraint restricts a field’s values to a declared set. It applies to:

  • String — the value must be one of the listed strings
  • Integer — the value must be one of the listed integers
  • Date — each category is a string in RFC 3339 full-date shape
  • DateTime — each category is a string in RFC 3339 datetime shape
  • Array(String), Array(Integer), Array(Date), Array(DateTime) — each element must be one of the listed values

Boolean, Float, and Object fields don’t support categories — Boolean is already two-valued, Float is continuous, and Object is structural.

TOML representation

Categories live in a [fields.field.constraints] sub-table:

[[fields.field]]
name = "status"
type = "String"
allowed = ["**"]
required = ["blog/**"]
nullable = false

[fields.field.constraints]
categories = ["active", "archived", "completed", "draft", "published"]

Integer categories:

[[fields.field]]
name = "priority"
type = "Integer"

[fields.field.constraints]
categories = [1, 2, 3]

Array categories constrain each element:

[[fields.field]]
name = "tags"
type = "Array(String)"

[fields.field.constraints]
categories = ["go", "python", "rust"]

A field without a [fields.field.constraints] section (or without a categories key) is unconstrained.

Validation

When a value doesn’t match any of the declared categories, check reports an InvalidCategory violation. For arrays, the violation lists the specific offending elements. See Validation for details.

Null values on categorical fields follow the existing nullable logic — if nullable = true, null skips the category check. The category constraint only fires on non-null values that pass the type check.
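The category check described above can be sketched in a few lines of Python. This is an illustrative model only, not the mdvs implementation: the function name `invalid_categories` and the use of Python lists for array fields are assumptions made for the example.

```python
def invalid_categories(value, categories):
    """Return the offending values; an empty list means the check passes.

    Hypothetical helper: models the rule, not the real mdvs code.
    """
    if value is None:
        # Null is handled by the nullable rule, so the category check is skipped.
        return []
    allowed = set(categories)
    # Arrays are checked element by element; scalars are treated as one element.
    elements = value if isinstance(value, list) else [value]
    return [v for v in elements if v not in allowed]


tags_allowed = ["go", "python", "rust"]
print(invalid_categories(["go", "java", "rust", "c"], tags_allowed))  # ['java', 'c']
print(invalid_categories(4, [1, 2, 3]))                               # [4]
```

Returning the offending elements rather than a boolean mirrors the statement that array violations list the specific elements that failed.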

Auto-inference

During init and update reinfer, mdvs automatically detects categorical fields using a heuristic with two conditions (both must hold):

  1. Max distinct values: the field has at most max_categories distinct values (default: 10)
  2. Minimum repetition: total occurrences / distinct values >= min_category_repetition (default: 3)

For array fields, distinct values and occurrences are counted at the element level.

Examples

  • status with 3 distinct values across 30 files: distinct=3, repetition=10 — categorical
  • title with 28 distinct values across 30 files: distinct=28 (exceeds cap) — not categorical
  • author with 5 distinct values across 5 files: repetition=1 (below threshold) — not categorical
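The two-condition heuristic and the three examples above can be reproduced with a short Python sketch. It assumes that null values are ignored when counting and that array fields arrive as Python lists; neither detail is stated by the documentation, and `is_categorical` is a hypothetical name.

```python
def is_categorical(values, max_categories=10, min_category_repetition=3):
    """Apply the two-condition heuristic to the observed values of one field."""
    flat = []
    for v in values:
        if v is None:
            continue  # assumption: nulls do not count as occurrences
        # Array fields are counted at the element level.
        flat.extend(v if isinstance(v, list) else [v])
    distinct = len(set(flat))
    if distinct == 0:
        return False
    repetition = len(flat) / distinct
    return distinct <= max_categories and repetition >= min_category_repetition


status = ["draft", "published", "archived"] * 10                # 3 distinct, 30 files
title = [f"title-{i}" for i in range(28)] + ["title-0", "title-1"]  # 28 distinct, 30 files
author = ["a", "b", "c", "d", "e"]                              # 5 distinct, 5 files

print(is_categorical(status))  # True
print(is_categorical(title))   # False
print(is_categorical(author))  # False
```

Both conditions are needed: the cap alone would accept `author`, and the repetition ratio alone would accept a field with hundreds of values that each recur a few times.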

Configurable thresholds

The thresholds are configurable in [fields]:

[fields]
max_categories = 10
min_category_repetition = 3

These control automatic inference only. Manually written categories in the TOML are unaffected by thresholds.

CLI flags on update reinfer override the TOML values per-invocation:

mdvs update example_kb reinfer --max-categories 15 --min-repetition 3

Range

The range constraint restricts a numeric field’s value to an inclusive [min, max] interval. It applies to:

  • Integer — value must satisfy min <= value <= max
  • Float — same, with float comparison
  • Array(Integer) — each element must satisfy the range
  • Array(Float) — same, element-wise

Both min and max are optional, so you can specify just one bound. Boolean, String, Date, DateTime, and Object fields don’t support range. Date / DateTime bounds (e.g. “published after 2024-01-01”) aren’t supported in v1: they require JSON Schema’s formatMinimum/formatMaximum vocabulary and are tracked as a follow-up.

TOML representation

[[fields.field]]
name = "rating"
type = "Integer"

[fields.field.constraints]
min = 1
max = 5

A Float field accepts integer bounds as well as float bounds; integer bounds are widened to f64 for comparison:

[[fields.field]]
name = "score"
type = "Float"

[fields.field.constraints]
min = 0
max = 100

Array example — each element checked against the bounds:

[[fields.field]]
name = "ratings"
type = "Array(Integer)"

[fields.field.constraints]
min = 1
max = 10

Validation

When a value is out of bounds, check reports an OutOfRange violation with the rule (min = N, max = N) and the offending value. For arrays, the violation lists the specific elements that are out of range.

Null values follow the existing nullable logic — if nullable = true, null skips the range check.
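As a rough model of the range check, the following Python sketch treats both bounds as optional and inclusive, and checks arrays element by element. The function name `out_of_range` and the parameter names `lo` and `hi` (standing in for min and max) are illustrative assumptions.

```python
def out_of_range(value, lo=None, hi=None):
    """Return the values that violate the inclusive [lo, hi] interval.

    Hypothetical helper: either bound may be omitted (None).
    """
    if value is None:
        return []  # null is governed by the nullable rule, not by range

    def in_bounds(x):
        return (lo is None or x >= lo) and (hi is None or x <= hi)

    elements = value if isinstance(value, list) else [value]
    return [x for x in elements if not in_bounds(x)]


print(out_of_range(3, lo=1, hi=5))               # []
print(out_of_range([1, 11, 5, 0], lo=1, hi=10))  # [11, 0]
print(out_of_range(1000, lo=1))                  # [] (no upper bound)
```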

Type rules

Bound types must match the field type:

  • Integer fields require integer bounds. Float bounds (e.g., min = 0.5) are rejected at config load because they are almost certainly a mistake: an integer value can never equal 0.5.
  • Float fields accept both integer and float bounds (integer bounds widen to f64).

If both bounds are present, min must be <= max — otherwise rejected at config load.
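The load-time rules can be modelled as a small validator. The sketch below is an assumption-laden illustration: `validate_range_bounds` is a hypothetical name, the error strings are invented, and it assumes Array(Integer) follows the same bound-type rule as Integer.

```python
def validate_range_bounds(field_type, lo=None, hi=None):
    """Return a list of config errors for a range constraint (empty if valid)."""
    # Strip the Array(...) wrapper so arrays share the element type's rules.
    if field_type.startswith("Array(") and field_type.endswith(")"):
        base = field_type[len("Array("):-1]
    else:
        base = field_type

    if base not in ("Integer", "Float"):
        return [f"{field_type} fields do not support range"]

    errors = []
    for name, bound in (("min", lo), ("max", hi)):
        if bound is None:
            continue
        is_int = isinstance(bound, int) and not isinstance(bound, bool)
        if base == "Integer" and not is_int:
            errors.append(f"{name} must be an integer for {field_type} fields")
    if lo is not None and hi is not None and lo > hi:
        errors.append("min must be <= max")
    return errors


print(validate_range_bounds("Integer", 0.5, 5))  # one error: float bound on Integer
print(validate_range_bounds("Float", 0, 100))    # [] (integer bounds are accepted)
```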

Manual overrides

Use the --with flag on update reinfer to override the default heuristic for specific fields:

# Force categorical (skip heuristic threshold)
mdvs update example_kb reinfer title --with=categorical

# Infer min/max from observed numeric values
mdvs update example_kb reinfer sample_count --with=range

# Strip all constraints
mdvs update example_kb reinfer status --with=none

--with takes a comma-separated list of constraint kinds: categorical, range, or none. Incompatible kinds (e.g., range,categorical on the same field) are rejected at parse time. --with=none cannot be combined with other kinds. The flag requires named fields.
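The parsing rules for the flag value can be illustrated with a short Python sketch. It only models the rules listed above; `parse_with` is a hypothetical function, and the real CLI may report errors differently.

```python
KNOWN_KINDS = {"categorical", "range", "none"}


def parse_with(raw):
    """Parse a comma-separated --with value into a sorted list of kinds.

    Raises ValueError for unknown or incompatible kinds.
    """
    kinds = {k.strip() for k in raw.split(",") if k.strip()}
    if not kinds:
        raise ValueError("--with needs at least one constraint kind")
    unknown = kinds - KNOWN_KINDS
    if unknown:
        raise ValueError(f"unknown constraint kind(s): {sorted(unknown)}")
    if "none" in kinds and len(kinds) > 1:
        raise ValueError("none cannot be combined with other kinds")
    if {"categorical", "range"} <= kinds:
        raise ValueError("categorical and range are incompatible on the same field")
    return sorted(kinds)


print(parse_with("categorical"))  # ['categorical']
print(parse_with("none"))         # ['none']
```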

Manual TOML edit — you can also add or remove constraints by hand. Running update (without reinfer) preserves existing constraints as-is. Only update reinfer re-evaluates them.

Length

The length constraint bounds string length or array length. It applies to:

  • String: min_length <= len(value) <= max_length, where length is the Unicode scalar count
  • Array(T): min_length <= array length <= max_length

[[fields.field]]
name = "slug"
type = "String"

[fields.field.constraints]
min_length = 3
max_length = 64

Both bounds are optional. Integer fields, Float fields, and Boolean fields don’t support length. Length violations surface as OutOfRange. If both bounds are present, min_length <= max_length is enforced at config load.
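Counting Unicode scalars instead of bytes or visible characters matters for non-ASCII text. The Python sketch below illustrates the difference; for well-formed text, Python's `len` on a `str` counts code points, which matches the scalar count. The function `length_out_of_range` is a hypothetical stand-in for the real check.

```python
def length_out_of_range(value, min_length=None, max_length=None):
    """Return True if a string or list violates the optional length bounds."""
    if value is None:
        return False  # assumption: null is left to the nullable rule
    n = len(value)  # str: code points; list: number of elements
    too_short = min_length is not None and n < min_length
    too_long = max_length is not None and n > max_length
    return too_short or too_long


composed = "caf\u00e9"     # 'é' as a single scalar
decomposed = "cafe\u0301"  # 'e' followed by a combining acute accent

print(len(composed))                   # 4
print(len(decomposed))                 # 5
print(len(composed.encode("utf-8")))   # 5 bytes, not used for the constraint
print(length_out_of_range("ab", min_length=3, max_length=64))  # True
```

Two strings that render identically can therefore have different lengths under this rule, which is worth keeping in mind when choosing tight bounds.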

Pattern

The pattern constraint runs a regular expression against String values:

[[fields.field]]
name = "version"
type = "String"

[fields.field.constraints]
pattern = '^v\d+\.\d+\.\d+$'

The regex is compiled at config load time — invalid syntax fails fast. Pattern is currently String-only. Pattern violations surface as WrongType (with detail naming the offending value). Categorical fields can’t also have a pattern — categories already enumerate the legal forms. Date and DateTime fields don’t accept pattern either — the type’s format is itself the pattern (see Date and DateTime).
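A minimal Python sketch of the compile-once, fail-fast behavior follows. It uses Python's `re` module, whose syntax may differ in details from the engine mdvs uses, and it assumes an unanchored search so that anchoring is left to the pattern itself (as with `^` and `$` in the example above). The helper names are hypothetical.

```python
import re


def compile_pattern(pattern):
    """Compile at load time so that invalid syntax is reported immediately."""
    try:
        return re.compile(pattern)
    except re.error as exc:
        raise ValueError(f"invalid pattern {pattern!r}: {exc}") from exc


def matches_pattern(value, compiled):
    # Assumption: unanchored search; the pattern supplies its own anchors.
    return compiled.search(value) is not None


version = compile_pattern(r"^v\d+\.\d+\.\d+$")
print(matches_pattern("v1.2.3", version))       # True
print(matches_pattern("1.2.3", version))        # False
print(matches_pattern("v1.2.3-beta", version))  # False
```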

Conflicts between constraint kinds

Some combinations are mutually exclusive on the same field:

  • categories + anything else — categories enumerate the legal values; other constraints would be redundant or contradictory. Rejected at config load.
  • range + length — range bounds numeric values; length bounds size. They apply to different field types (numeric vs. String/Array), so they should never collide in practice; the check is still enforced.

Compatible combinations: min/max together; min_length/max_length together; pattern with min_length/max_length.
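The conflict rules can be expressed as a check over the keys of a constraints table. The Python sketch below is illustrative only: `constraint_conflicts` and its messages are hypothetical, and it covers just the two rules listed above.

```python
def constraint_conflicts(constraints):
    """Return conflict messages for one field's constraints table (a dict)."""
    keys = set(constraints)
    has_categories = "categories" in keys
    has_range = bool(keys & {"min", "max"})
    has_length = bool(keys & {"min_length", "max_length"})
    has_pattern = "pattern" in keys

    errors = []
    if has_categories and (has_range or has_length or has_pattern):
        errors.append("categories cannot be combined with other constraints")
    if has_range and has_length:
        errors.append("range and length cannot be combined")
    return errors


print(constraint_conflicts({"pattern": "^[a-z-]+$", "min_length": 3, "max_length": 64}))  # []
print(constraint_conflicts({"categories": ["a", "b"], "pattern": "^[ab]$"}))
```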

Constraint kinds summary

| Constraint | Field types | Violation |
|---|---|---|
| categories | String, Integer, Array(String), Array(Integer) | InvalidCategory |
| min / max | Integer, Float, Array(Integer), Array(Float) | OutOfRange |
| min_length / max_length | String, Array(T) | OutOfRange |
| pattern | String | WrongType |

Each constraint kind is a key in the [fields.field.constraints] sub-table. Compatibility is checked at config load time.