vrk tok
vrk tok is a command-line token counter for LLM pipelines.
The problem
A 15,000-token document goes to a model with a 4,096-token window. No error. The response looks plausible. Three days later a QA reviewer finds inconsistencies. The model only saw the first third of the input. Silent truncation is the most expensive bug in LLM pipelines because it looks like success.
The solution
vrk tok counts tokens and gates pipelines in 5ms. Uses cl100k_base (exact for GPT-4, ~95% for Claude). Without --check it prints the count. With --check N it passes input through if within budget, or exits 1 with empty stdout if over, killing the pipeline before it wastes an API call.
Before and after
Before
# "Probably fine" - counting words as a proxy for tokens
wc -w system-prompt.txt
# 1,847 words... is that under 4,096 tokens? Who knows.
After
cat system-prompt.txt | vrk tok
Example
cat system-prompt.txt | vrk tok --check 8000
Exit codes
| Code | Meaning |
|---|---|
| 0 | Measurement success; or –check within limit |
| 1 | –check over limit; I/O error; tokenizer error |
| 2 | Usage error - unknown flag, no stdin, –check without value |
Flags
| Flag | Short | Type | Description |
|---|---|---|---|
--check | int | Pass input through if ≤N tokens; exit 1 with empty stdout if over | |
--model | string | Tokenizer - cl100k_base (default) | |
--json | -j | bool | Emit JSON (measurement) or JSON error (gate). Does not wrap passthrough. |
--quiet | -q | bool | Suppress stderr on failure |
How it works
Measurement mode (default)
Counts tokens and prints the number to stdout:
$ cat system-prompt.txt | vrk tok
8847
$ cat system-prompt.txt | vrk tok --json
{"tokens":8847,"model":"cl100k_base"}
Gate mode (–check N)
--check N turns tok from a measurement tool into a pipeline gate. If the input fits within N tokens, the full input passes through to stdout unchanged - you can pipe it directly to the next stage. If it exceeds N tokens, stdout is empty and the exit code is 1, which stops any pipeline.
# Within budget - input passes through
$ echo 'short input' | vrk tok --check 4000
short input
# Over budget - empty stdout, exit 1
$ printf 'You are a helpful assistant.' | vrk tok --check 3
# (no stdout)
$ echo $?
1
Gate before an LLM call so the pipeline only continues if within budget:
cat document.txt | vrk tok --check 8000 | vrk prompt --system 'Summarize this'
The –json flag
In measurement mode, --json wraps the count in a JSON object:
$ echo 'Hello, world!' | vrk tok --json
{"tokens":4,"model":"cl100k_base"}
When --check fails and --json is active, the error goes to stdout as JSON (stderr stays empty):
$ printf 'You are a helpful assistant.' | vrk tok --check 3 --json
{"code":1,"error":"6 tokens exceeds limit of 3","limit":3,"tokens":6}
The –quiet flag
Suppresses the stderr error message on --check failure. The exit code is still 1, so pipelines still stop - you just don’t get the human-readable message.
Parsing token counts downstream with jq
TOKENS=$(cat prompt.txt | vrk tok --json | jq -r '.tokens')
if [ "$TOKENS" -gt 8000 ]; then
echo "Prompt too large: $TOKENS tokens" >&2
exit 1
fi
Pipeline integration
Budget check in CI
Enforce that a system prompt stays within budget across deploys:
# ci/check-prompt-budget.sh
cat prompts/system.txt | vrk tok --check 6000
if [ $? -ne 0 ]; then
echo "System prompt exceeds 6000-token budget. Refactor before merging." >&2
exit 1
fi
Measure, then chunk what’s too large
# Process a directory of markdown files for summarization.
# Skip anything that fits in one call; chunk anything that doesn't.
for f in docs/*.md; do
TOKENS=$(cat "$f" | vrk tok --json | jq -r '.tokens')
if [ "$TOKENS" -le 8000 ]; then
cat "$f" | vrk prompt --system 'Summarize this document'
else
cat "$f" | vrk chunk --size 4000 --overlap 200 | \
while IFS= read -r chunk; do
echo "$chunk" | jq -r '.text' | vrk prompt --system 'Summarize this section'
done
fi
done
Gate before prompt with mask
# Redact secrets, check budget, then send to an LLM
cat debug-output.log | vrk mask | vrk tok --check 12000 | \
vrk prompt --system 'What went wrong in this log output?'
When it fails
Over budget without --json:
$ printf 'You are a helpful assistant.' | vrk tok --check 3
tok: 6 tokens exceeds limit of 3
$ echo $?
1
Over budget with --json (error goes to stdout, stderr empty):
$ printf 'You are a helpful assistant.' | vrk tok --check 3 --json
{"code":1,"error":"6 tokens exceeds limit of 3","limit":3,"tokens":6}
$ echo $?
1
Unknown flag:
$ echo 'hi' | vrk tok --verbose
usage error: unknown flag: --verbose
$ echo $?
2