Validate LLM output before it propagates
Bad structured output exits 1 before reaching downstream systems - you catch schema drift at the source, not in production.
Gate the pipeline on schema correctness. Exit 1 on mismatch stops the next stage from running.
Pipeline
cat doc.txt \
| vrk prompt --system "Extract entities as JSON" \
| vrk validate --schema entities.json \
| vrk kv set entities
The problem
You asked the LLM for structured JSON. It returned something that looks like JSON but has a missing field, an extra key, or a wrong type. Your downstream code parses it, hits an unexpected null, and fails three steps later with an error that traces back to bad LLM output.
Schema validation at the source catches this immediately. The pipeline stops at the point of failure, not somewhere downstream where the root cause is hidden.
How the pipeline works
vrk prompt calls the LLM and prints the raw response. vrk validate checks the response against entities.json (a JSON Schema file). If it matches, the data passes through. If not, vrk validate exits 1 and vrk kv set never runs.
No bad data reaches storage. No downstream code processes an invalid response.
The schema file
A simple JSON Schema that defines what you expect:
{
"type": "object",
"required": ["entities"],
"properties": {
"entities": {
"type": "array",
"items": {
"type": "object",
"required": ["name", "type"],
"properties": {
"name": { "type": "string" },
"type": { "type": "string" }
}
}
}
}
}
Combining with retries
LLMs sometimes produce valid JSON that doesn’t match your schema. Combine with vrk coax to retry:
vrk coax --times 3 --backoff exp:1s --on 1 -- \
sh -c 'cat doc.txt \
| vrk prompt --system "Extract entities as JSON" \
| vrk validate --schema entities.json'
If validation fails (exit 1), vrk coax retries the entire sub-pipeline up to 3 times with exponential backoff.