Forward Proxy & Watch Mode
The forward proxy intercepts HTTPS traffic from any AI agent by acting as an HTTP CONNECT proxy with MITM TLS. Unlike the MCP gateway (which requires pointing agents at Quint), the forward proxy works with standard HTTP_PROXY / HTTPS_PROXY environment variables — zero code changes required.
How It Works
On macOS, traffic reaches the forward proxy via two paths: (1) explicit HTTP_PROXY / HTTPS_PROXY env vars, and (2) a transparent NE extension that intercepts specific LLM API hostnames and hands the flow to the daemon’s relay listener on :9091. Both paths share the same MITM pipeline — the only difference is how traffic is routed to it. See Network Extension for the transparent path.
Quick Start
```bash
# Start the forward proxy with dashboard
quint watch

# In another terminal, run your agent with proxy env vars
export HTTP_PROXY=http://localhost:9090
export HTTPS_PROXY=http://localhost:9090
export SSL_CERT_FILE=~/.quint/ca/quint-ca-bundle.pem
export NODE_EXTRA_CA_CERTS=~/.quint/ca/quint-ca.crt

# Run any agent — traffic is automatically intercepted
claude --model claude-sonnet-4-20250514 "summarize this repo"
```
The dashboard opens at http://localhost:8080 showing live agent activity.
CLI Flags
| Flag | Default | Description |
|---|---|---|
| --port | 9090 | Proxy listen port |
| --dashboard-port | 8080 | Dashboard UI port |
| --policy | auto-detect | Path to policy.json |
| --static-dir | embedded | Serve dashboard from local dir (dev mode) |
| --no-dashboard | false | Skip starting the dashboard |
| --no-open | false | Don't auto-open browser |
Agent Environment Variables
Set these in any terminal where your agent runs:
| Variable | Value | Purpose |
|---|---|---|
| HTTP_PROXY | http://localhost:9090 | Route HTTP traffic through Quint |
| HTTPS_PROXY | http://localhost:9090 | Route HTTPS traffic through Quint |
| SSL_CERT_FILE | ~/.quint/ca/quint-ca-bundle.pem | Trust Quint's CA (Go, Python, curl) |
| NODE_EXTRA_CA_CERTS | ~/.quint/ca/quint-ca.crt | Trust Quint's CA (Node.js) |
Named Agents
To explicitly name an agent, use the proxy URL’s username field:
```bash
HTTP_PROXY=http://my-research-bot@localhost:9090
HTTPS_PROXY=http://my-research-bot@localhost:9090
```
This overrides auto-discovery and assigns the agent a fixed identity.
CA Certificate
Quint generates a local CA on first run using ECDSA P-256:
- CA certificate: ~/.quint/ca/quint-ca.crt (valid 10 years)
- CA private key: ~/.quint/ca/quint-ca.key
- Combined bundle: ~/.quint/ca/quint-ca-bundle.pem (system CAs + Quint CA)
- Leaf certificates: generated per-hostname, cached in memory, valid 24 hours
The CA never leaves your machine. Leaf certificates are signed on-the-fly for each unique hostname the agent connects to.
Provider Classification
Quint automatically classifies intercepted traffic into 46+ AI providers using domain matching:
| Provider | Domains |
|---|---|
| anthropic | api.anthropic.com, mcp-proxy.anthropic.com |
| openai | api.openai.com, chatgpt.com |
| google | generativelanguage.googleapis.com, aiplatform.googleapis.com |
| azure-openai | openai.azure.com, cognitive.microsoft.com |
| aws-bedrock | bedrock-runtime.*.amazonaws.com |
| mistral | api.mistral.ai |
| groq | api.groq.com |
| deepseek | api.deepseek.com |
| cohere | api.cohere.com |
Full list includes 40+ providers: Together, Replicate, Fireworks, Perplexity, xAI, HuggingFace, Cerebras, SambaNova, NVIDIA, OpenRouter, Cloudflare, and 8 Chinese providers (Zhipu, Baidu, Alibaba, ByteDance, Moonshot, 01.AI, MiniMax, SiliconFlow).
Classification uses three tiers:
- Exact domain match — fastest, covers all known API endpoints
- Pattern-based — catches region-specific AWS/Azure/Databricks URLs
- Root domain fallback — handles unknown subdomains (e.g., console.anthropic.com → anthropic)
Domain Policy
Control which domains are allowed or blocked:
```json
{
  "forward_proxy": {
    "default_action": "allow",
    "log_bodies": true,
    "max_body_log_size": 8192,
    "domains": [
      { "domain": "*.openai.com", "action": "allow" },
      { "domain": "pastebin.com", "action": "deny" }
    ]
  }
}
```
Rules are evaluated first-match-wins with glob pattern support.
LLM API Parsing
The proxy parses 7 LLM API formats from intercepted HTTPS traffic, extracting tool calls with their arguments:
| Format | Detection | Tool Call Extraction |
|---|---|---|
| Anthropic | Host anthropic.com | tool_use blocks in response content |
| OpenAI | Host openai.com, mistral.ai | tool_calls array in message |
| OpenAI Responses | Path /v1/responses | function_call in output items |
| Bedrock Converse | Host bedrock + camelCase body | toolUse blocks |
| Gemini | Host googleapis.com or path :generateContent | functionCall in parts |
| Azure OpenAI | Host *.openai.azure.com | Same as OpenAI |
| Generic | Fallback | Model field extraction only |
Tool calls fire OnToolCall callbacks into the daemon, which feeds the unified session tracker.
Transparent Interception (macOS)
On macOS, the forward proxy has a sibling: the Network Extension, a NETransparentProxyProvider that redirects outbound flows to known LLM API hosts into the same MITM pipeline — no HTTP_PROXY, no env vars, no per-app CA trust. Both paths (CONNECT and NE) converge on the same serveMITMImpl inside the daemon, so request parsing, tool-call extraction, audit logging, and session attribution are identical. See the Network Extension page for the complete NE architecture, flow lifecycle, and backpressure contract.
| | CONNECT path (HTTP_PROXY) | NE transparent path (macOS) |
|---|---|---|
| Works on | Any OS, any runtime | macOS 11+ only |
| Setup | Env vars + CA trust | User approves extension once |
| Best for | Dev, CI, Linux servers | End-user laptops |
Streaming Responses (SSE & AWS eventstream)
Anthropic’s Messages API and Bedrock’s invoke-with-response-stream both return long-lived streaming responses. Two subtle framing rules apply when MITMing these on keep-alive connections:
- http.ReadResponse strips Transfer-Encoding: chunked from the header map and moves it to resp.TransferEncoding. Any code that checks resp.Header.Values("Transfer-Encoding") will therefore never see chunked — which would cause us to emit SSE bodies with neither Content-Length nor chunked framing, so the client hangs waiting for an end-of-response that never comes.
- Streaming responses have no known length, and we can't Connection: close the keep-alive socket. The only way the client learns the response is finished is chunked framing with a 0-length terminator.
The daemon therefore always re-chunks SSE and AWS eventstream responses regardless of how the upstream framed them. io.Copy streams the body through httputil.NewChunkedWriter directly to the client socket (bypassing resp.Write’s 4 KB bufio buffering so tokens arrive as they’re generated), then cw.Close() emits the terminator chunk.
Without this, Claude Code would display the full response, then hang on the spinner forever, and every subsequent turn on the same keep-alive connection would stall behind it.
Architecture
The forward proxy integrates with all other Quint subsystems:
- ES Extension — OS-level ground truth. Proxy provides content, ES provides file operations. Divergence between the two is the key signal.
- Agent Identity — auto-discovers agent type from User-Agent headers
- Sub-Agent Detection — detects model divergence and concurrency spikes across CONNECT tunnels
- Cloud Scoring — enriches local risk scores with the cloud API
- Cloud Ingestion — events flow through the cloud forwarder to /v1/ingest, then fan out via SNS FIFO + SQS
- RBAC — enforces cloud JWT token policies