
vrk urlinfo

vrk urlinfo parses URLs into structured JSON components with dot-path field extraction.

The problem

cut -d'/' -f3 extracts a hostname only until a URL carries a port number or basic-auth credentials. https://api.example.com:8080/path returns “api.example.com:8080”, port included. A URL of the form user:pass@host returns the credentials along with the host. The minority of URLs that don’t fit the simple pattern are the ones that break the pipeline.
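The failure is easy to reproduce without vrk. The sketch below wraps the cut approach in a helper, naive_host (a hypothetical name used only for illustration), and runs it against three URL shapes:

```shell
# Naive host extraction: split on "/" and take the third field.
# naive_host is a hypothetical helper name, not part of vrk.
naive_host() {
  printf '%s\n' "$1" | cut -d'/' -f3
}

naive_host 'https://api.example.com/path'            # api.example.com
naive_host 'https://api.example.com:8080/path'       # api.example.com:8080
naive_host 'https://user:pass@api.example.com/path'  # user:pass@api.example.com
```

Only the first call returns a bare hostname. The other two leak the port or the credentials into whatever consumes the output.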

The solution

vrk urlinfo parses URLs into structured JSON components: scheme, host, port, path, query parameters, fragment, and user. --field with dot-path syntax extracts a single component. It handles the edge cases that cut and regex approaches miss, such as ports and embedded credentials, and it makes no network calls: parsing is purely local.

Before and after

Before

echo 'https://api.example.com:8080/path?q=test' | cut -d'/' -f3
# Returns "api.example.com:8080" - includes port

After

vrk urlinfo --field host 'https://api.example.com:8080/path?q=test'
# Returns "api.example.com" - host only, no port

Example

vrk urlinfo 'https://api.example.com:8080/v1/search?q=llm+tools&limit=10'

Exit codes

Code  Meaning
0     Success
1     Invalid URL that cannot be parsed, or I/O error
2     Interactive TTY with no stdin or positional arg

Flags

Flag     Short  Type    Description
--field  -F     string  Extract a single field as plain text (supports dot-path for query params)
--json   -j     bool    Append metadata trailer
--quiet  -q     bool    Suppress stderr output

How it works

Full JSON output

$ vrk urlinfo 'https://api.example.com:8080/v1/search?q=llm+tools&limit=10#results'
{"scheme":"https","host":"api.example.com","port":8080,"path":"/v1/search","query":{"limit":"10","q":"llm tools"},"fragment":"results","user":""}

Every URL component is extracted into a structured JSON object. Query parameters are parsed into a nested object.

Extract a single field (--field)

$ vrk urlinfo --field host 'https://api.example.com:8080/v1/search'
api.example.com

$ vrk urlinfo --field path 'https://api.example.com:8080/v1/search'
/v1/search

$ vrk urlinfo --field query.q 'https://api.example.com?q=llm+tools'
llm tools

Dot-path syntax reaches into query parameters: query.q, query.page, etc.

Batch processing

$ printf 'https://a.com/path\nhttps://b.com/other\n' | vrk urlinfo --field host
a.com
b.com

When URLs arrive on stdin, vrk urlinfo processes them one per line.

JSON metadata (--json)

$ printf 'https://a.com\nhttps://b.com\n' | vrk urlinfo --json
{"scheme":"https","host":"a.com",...}
{"scheme":"https","host":"b.com",...}
{"_vrk":"urlinfo","count":2}
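Because the trailer arrives on the same stream as the per-URL records, a consumer usually needs to tell the two apart. A minimal sketch, assuming the trailer is the only line that starts with {"_vrk": as in the sample above; urlinfo_records and urlinfo_trailer are hypothetical helper names, not vrk commands:

```shell
# Hypothetical helpers for splitting --json output into records and trailer.
# Assumption: the trailer is the only line beginning with {"_vrk":

# Pass through the per-URL records, dropping the metadata trailer.
urlinfo_records() {
  grep -v '^[{]"_vrk":'
}

# Pass through only the metadata trailer.
urlinfo_trailer() {
  grep '^[{]"_vrk":'
}
```

Typical use would be along the lines of vrk urlinfo --json < urls.txt | urlinfo_records, which leaves one JSON object per URL for the next stage.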

Available fields

Field        Example value
scheme       https
host         api.example.com
port         8080 (0 if not specified)
path         /v1/search
query        {"q":"llm tools","limit":"10"}
query.<key>  value of a specific parameter
fragment     results
user         username (from user@host URLs)

Pipeline integration

Group URLs by domain

# Extract unique domains from a list of URLs
while IFS= read -r url; do
  vrk urlinfo --field host "$url"
done < urls.txt | sort -u

Extract and decode query parameters

# Get a query parameter and decode it
vrk urlinfo --field query.q 'https://example.com?q=hello%20world' | vrk pct --decode

# Grab a page, extract links, group by domain
vrk grab https://example.com | vrk links --bare | \
  while IFS= read -r url; do
    vrk urlinfo --field host "$url"
  done | sort | uniq -c | sort -rn

When it fails

Invalid URL:

$ vrk urlinfo 'not a url'
error: urlinfo: invalid URL
$ echo $?
1

No input:

$ vrk urlinfo
usage error: urlinfo: no URL provided
$ echo $?
2
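In a script, these exit codes can drive the error handling directly. The following is a sketch only: safe_host is a hypothetical wrapper, and it assumes --quiet and --field can be combined in one call, which the flag table suggests but does not state outright.

```shell
# Hypothetical wrapper: print the host for a URL, or report why it failed.
# Relies only on the exit codes documented above (0, 1, 2).
safe_host() {
  # --quiet suppresses vrk's own stderr so the wrapper controls the messages.
  host=$(vrk urlinfo --quiet --field host "$1")
  status=$?
  case "$status" in
    0) printf '%s\n' "$host" ;;
    1) printf 'skipped (invalid URL or I/O error): %s\n' "$1" >&2
       return 1 ;;
    *) printf 'vrk urlinfo failed with exit code %s\n' "$status" >&2
       return "$status" ;;
  esac
}
```

Calling safe_host "$url" inside a read loop lets one bad URL be logged and skipped instead of silently producing an empty line in the output.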