
vrk validate

vrk validate is a JSON schema validator for shell pipelines. In --strict mode it exits 1 on the first mismatch, stopping the pipeline before bad data propagates.

The problem

An LLM is asked to return records shaped like {"sentiment":"string","confidence":"number"} for 500 product reviews. 487 are correct; 13 return confidence as a string instead of a number. The analysis script crashes on float("high"), and the entire pipeline has to rerun because nothing validated the data between the LLM and the database.

The solution

vrk validate checks every JSON record in a JSONL stream against a type schema. Valid records pass through to stdout. Invalid records are dropped with a warning to stderr. --strict halts the pipeline on the first bad record. --fix sends invalid records to an LLM for repair.

Before and after

Before

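# IFS= and -r keep whitespace and backslashes in each JSON line intact
cat records.jsonl | while IFS= read -r line; do
  printf '%s\n' "$line" | jq -e '.name | type == "string"' > /dev/null && \
  printf '%s\n' "$line" | jq -e '.age | type == "number"' > /dev/null && \
  printf '%s\n' "$line"
done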

After

cat records.jsonl | vrk validate --schema '{"name":"string","age":"number"}'

Example

cat llm-output.jsonl | vrk validate --schema '{"sentiment":"string","confidence":"number"}' --strict

Exit codes

Code  Meaning
0     All records passed
1     Record failed in --strict mode, or scanner error
2     --schema missing, schema JSON invalid, unknown flag

Flags

Flag      Short  Type    Description
--schema  -s     string  JSON schema inline or path to .json file (required)
--strict         bool    Exit 1 on first invalid record
--fix            bool    Send invalid records to vrk prompt for repair
--json    -j     bool    Append metadata trailer after all output
--quiet   -q     bool    Suppress stderr output

Schema format

The schema is a JSON object where keys are required field names and values are type strings: string, number, boolean, array, object.

{"name":"string","age":"number","active":"boolean"}

Extra keys in the input records are ignored - only the schema keys are checked. You can provide the schema as an inline JSON string or as a path to a .json file.
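
To make these rules concrete, here is a rough jq approximation of the check described above: every schema key must be present with the named type, and extra keys are ignored. This is a sketch, not how vrk validate is implemented. check_types is a hypothetical helper name, and the sketch assumes jq 1.5 or later is installed and that every input line is parseable JSON.

```shell
# Sketch only: approximates the flat type check with jq.
# check_types is a hypothetical helper, not part of vrk.
# Usage: check_types '{"name":"string","age":"number"}' < records.jsonl
check_types() {
  jq -c --argjson schema "$1" '
    . as $rec
    | select(type == "object")
    | select(
        all($schema | to_entries[];
            .key as $k | .value as $t
            | ($rec | has($k)) and (($rec[$k] | type) == $t))
      )
  '
}
```

The schema's type strings (string, number, boolean, array, object) match the names that jq's type function returns, which is why the comparison can be done directly. Unlike vrk validate, this sketch drops failing records silently rather than warning on stderr.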

How it works

Valid records pass through silently

$ printf '{"name":"Alice","age":30}\n{"name":"Bob","age":25}\n' | \
    vrk validate --schema '{"name":"string","age":"number"}'
{"name":"Alice","age":30}
{"name":"Bob","age":25}

No output on stderr. Exit 0. Only clean data reaches stdout.

Invalid records are dropped with a warning

$ printf '{"name":"Alice","age":30}\n{"name":"Bob"}\n' | \
    vrk validate --schema '{"name":"string","age":"number"}'
{"name":"Alice","age":30}

Stderr shows: warning: validation failed: age is missing

Bob’s record was missing age, so it was dropped. Alice’s record passed through. The pipeline continues with only clean data.

--strict halts on first failure

$ printf '{"name":"Alice","age":30}\n{"name":"Bob"}\n' | \
    vrk validate --schema '{"name":"string","age":"number"}' --strict
{"name":"Alice","age":30}
$ echo $?
1

Stderr shows: warning: validation failed: age is missing

In strict mode, the first invalid record stops the pipeline with exit 1. Use this when partial results are worse than no results.
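
In the --strict example, Alice's record is already on stdout when the exit 1 arrives, so a consumer attached by a pipe may have acted on partial output. One way to make the halt all-or-nothing is to buffer the validated stream and release it only on exit 0. The sketch below illustrates that pattern under assumptions: strict_stage and run_gated are hypothetical names, and strict_stage merely imitates a strict validator so the example runs without vrk.

```shell
# strict_stage is a hypothetical stand-in for `vrk validate ... --strict`:
# it echoes lines until one contains the marker BAD, then fails with status 1.
strict_stage() {
  while IFS= read -r line; do
    case $line in
      *BAD*) echo "warning: validation failed" >&2; return 1 ;;
    esac
    printf '%s\n' "$line"
  done
}

# Buffer the validator's output and release it only if the stage exited 0.
run_gated() {
  tmp=$(mktemp) || return 1
  strict_stage > "$tmp" && status=0 || status=$?
  if [ "$status" -eq 0 ]; then
    cat "$tmp"    # stand-in for the real downstream consumer
  fi
  rm -f "$tmp"
  return "$status"
}
```

With a real validator, the call to strict_stage would be replaced by the vrk validate invocation, and cat by whatever stores or forwards the records.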

--fix sends invalid records to an LLM for repair

cat messy-output.jsonl | \
  vrk validate --schema '{"sentiment":"string","confidence":"number"}' --fix

Invalid records are sent to vrk prompt with instructions to fix them to match the schema. The repaired records are emitted if they now pass validation. Requires ANTHROPIC_API_KEY or OPENAI_API_KEY.

--json appends metadata

$ printf '{"name":"Alice","age":30}\n{"name":"Bob"}\n' | \
    vrk validate --schema '{"name":"string","age":"number"}' --json
{"name":"Alice","age":30}
{"_vrk":"validate","total":2,"valid":1,"invalid":1}

The metadata trailer tells you how many records passed and failed.
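
A script that consumes this output usually needs to split the trailer from the records. The following is a minimal sketch, assuming the trailer is always the final line and uses the compact formatting shown in the example; the sample data is copied from the output above rather than produced by running vrk.

```shell
# Sample output copied from the --json example above.
output='{"name":"Alice","age":30}
{"_vrk":"validate","total":2,"valid":1,"invalid":1}'

# Assumption: the trailer is the final line, as in the example.
trailer=$(printf '%s\n' "$output" | tail -n 1)
records=$(printf '%s\n' "$output" | sed '$d')

# Extract the invalid count from the compact trailer. A JSON-aware tool
# such as jq would be more robust if the formatting ever differs.
invalid=$(printf '%s\n' "$trailer" | sed -n 's/.*"invalid":\([0-9][0-9]*\).*/\1/p')

if [ "${invalid:-0}" -gt 0 ]; then
  echo "dropped $invalid record(s)" >&2
fi
```

Here records holds only the validated lines and invalid holds the failure count, so a wrapper can decide whether the drop rate is acceptable before continuing.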

Pipeline integration

Validate LLM output before storing

# Ask an LLM a question, validate the response shape, then store it
echo "What are the top 3 risks?" | \
  vrk prompt --schema '{"risks":"array","summary":"string"}' | \
  vrk validate --schema '{"risks":"array","summary":"string"}' --strict | \
  vrk kv set --ns analysis latest-risks

Validate then assert specific values

# Validate schema shape, then check that confidence is high enough
echo "$LLM_RESPONSE" | \
  vrk validate --schema '{"answer":"string","confidence":"number"}' --strict | \
  vrk assert '.confidence >= 0.8'

Process a batch and log failures

# Validate each record, emit structured logs for monitoring
cat batch-output.jsonl | \
  vrk validate --schema '{"result":"string","score":"number"}' --json | \
  vrk emit --tag validation --parse-level

When it fails

Schema missing:

$ cat data.jsonl | vrk validate
usage error: validate: --schema is required
$ echo $?
2

Invalid schema JSON:

$ cat data.jsonl | vrk validate --schema 'not json'
usage error: validate: invalid schema JSON
$ echo $?
2

Strict mode on invalid record:

$ printf '{"name":"Alice"}\n' | vrk validate --schema '{"name":"string","age":"number"}' --strict
$ echo $?
1
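
A wrapper script can branch on these exit codes, since a data failure and a usage error usually call for different responses. The sketch below is one possible mapping based on the exit code table; describe_exit is a hypothetical helper name, and the commented wiring assumes a SCHEMA variable and file names that are placeholders.

```shell
# describe_exit is a hypothetical helper; the meanings follow the exit code table.
describe_exit() {
  case $1 in
    0) echo "ok" ;;
    1) echo "invalid record or scanner error" ;;
    2) echo "usage error" ;;
    *) echo "unexpected exit code $1" ;;
  esac
}

# Example wiring (commented out because it needs vrk installed):
# vrk validate --schema "$SCHEMA" --strict < data.jsonl > clean.jsonl
# describe_exit "$?"
```

Exit 2 points at the invocation itself, so retrying the same command is unlikely to help, while exit 1 points at the data.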