Forward Proxy & Watch Mode
The forward proxy intercepts HTTPS traffic from any AI agent by acting as an HTTP CONNECT proxy with MITM TLS. Unlike the MCP gateway (which requires pointing agents at Quint), the forward proxy works with standard HTTP_PROXY / HTTPS_PROXY environment variables — zero code changes required.
How It Works
On macOS, traffic reaches the forward proxy via two paths: (1) explicit HTTP_PROXY / HTTPS_PROXY env vars, and (2) a transparent NE extension that intercepts specific LLM API hostnames and hands the flow to the daemon’s relay listener on :9091. Both paths share the same MITM pipeline — the only difference is how traffic is routed to it. See Network Extension for the transparent path.
Quick Start
```bash
# Start the forward proxy with dashboard
quint watch

# In another terminal, run your agent with proxy env vars
export HTTP_PROXY=http://localhost:9090
export HTTPS_PROXY=http://localhost:9090
export SSL_CERT_FILE=~/.quint/ca/quint-ca-bundle.pem
export NODE_EXTRA_CA_CERTS=~/.quint/ca/quint-ca.crt

# Run any agent — traffic is automatically intercepted
claude --model claude-sonnet-4-20250514 "summarize this repo"
```
The dashboard opens at http://localhost:8080 showing live agent activity.
CLI Flags
| Flag | Default | Description |
|---|---|---|
| --port | 9090 | Proxy listen port |
| --dashboard-port | 8080 | Dashboard UI port |
| --policy | auto-detect | Path to policy.json |
| --static-dir | embedded | Serve dashboard from local dir (dev mode) |
| --no-dashboard | false | Skip starting the dashboard |
| --no-open | false | Don't auto-open browser |
Agent Environment Variables
Set these in any terminal where your agent runs:
| Variable | Value | Purpose |
|---|---|---|
| HTTP_PROXY | http://localhost:9090 | Route HTTP traffic through Quint |
| HTTPS_PROXY | http://localhost:9090 | Route HTTPS traffic through Quint |
| SSL_CERT_FILE | ~/.quint/ca/quint-ca-bundle.pem | Trust Quint's CA (Go, Python, curl) |
| NODE_EXTRA_CA_CERTS | ~/.quint/ca/quint-ca.crt | Trust Quint's CA (Node.js) |
Named Agents
To explicitly name an agent, use the proxy URL’s username field:
```bash
HTTP_PROXY=http://my-research-bot@localhost:9090
HTTPS_PROXY=http://my-research-bot@localhost:9090
```
This overrides auto-discovery and assigns the agent a fixed identity.
CA Certificate
Quint generates a local CA on first run using ECDSA P-256:
- CA certificate: ~/.quint/ca/quint-ca.crt (valid 10 years)
- CA private key: ~/.quint/ca/quint-ca.key
- Combined bundle: ~/.quint/ca/quint-ca-bundle.pem (system CAs + Quint CA)
- Leaf certificates: generated per-hostname, cached in memory, valid 24 hours
The CA never leaves your machine. Leaf certificates are signed on-the-fly for each unique hostname the agent connects to.
Provider Classification
Quint automatically classifies intercepted traffic into 46+ AI providers using domain matching:
| Provider | Domains |
|---|---|
| anthropic | api.anthropic.com, mcp-proxy.anthropic.com |
| openai | api.openai.com, chatgpt.com |
| google | generativelanguage.googleapis.com, aiplatform.googleapis.com |
| azure-openai | openai.azure.com, cognitive.microsoft.com |
| aws-bedrock | bedrock-runtime.*.amazonaws.com |
| mistral | api.mistral.ai |
| groq | api.groq.com |
| deepseek | api.deepseek.com |
| cohere | api.cohere.com |
Full list includes 40+ providers: Together, Replicate, Fireworks, Perplexity, xAI, HuggingFace, Cerebras, SambaNova, NVIDIA, OpenRouter, Cloudflare, and 8 Chinese providers (Zhipu, Baidu, Alibaba, ByteDance, Moonshot, 01.AI, MiniMax, SiliconFlow).
Classification uses three tiers:
- Exact domain match — fastest, covers all known API endpoints
- Pattern-based — catches region-specific AWS/Azure/Databricks URLs
- Root domain fallback — handles unknown subdomains (e.g., console.anthropic.com → anthropic)
Domain Policy
Control which domains are allowed or blocked:
```json
{
  "forward_proxy": {
    "default_action": "allow",
    "log_bodies": true,
    "max_body_log_size": 8192,
    "domains": [
      { "domain": "*.openai.com", "action": "allow" },
      { "domain": "pastebin.com", "action": "deny" }
    ]
  }
}
```
Rules are evaluated first-match-wins with glob pattern support.
LLM API Parsing
The proxy parses 7 LLM API formats from intercepted HTTPS traffic, extracting tool calls with their arguments:
| Format | Detection | Tool Call Extraction |
|---|---|---|
| Anthropic | Host anthropic.com | tool_use blocks in response content |
| OpenAI | Host openai.com, mistral.ai | tool_calls array in message |
| OpenAI Responses | Path /v1/responses | function_call in output items |
| Bedrock Converse | Host bedrock + camelCase body | toolUse blocks |
| Gemini | Host googleapis.com or path :generateContent | functionCall in parts |
| Azure OpenAI | Host *.openai.azure.com | Same as OpenAI |
| Generic | Fallback | Model field extraction only |
Tool calls fire OnToolCall callbacks into the daemon, which feeds the unified session tracker.
Transparent Interception (macOS)
On macOS, the forward proxy has a sibling: the Network Extension, a NETransparentProxyProvider that redirects outbound flows to known LLM API hosts into the same MITM pipeline — no HTTP_PROXY, no env vars, no per-app CA trust. Both paths (CONNECT and NE) converge on the same serveMITMImpl inside the daemon, so request parsing, tool-call extraction, audit logging, and session attribution are identical. See the Network Extension page for the complete NE architecture, flow lifecycle, and backpressure contract.
| | CONNECT path (HTTP_PROXY) | NE transparent path (macOS) |
|---|---|---|
| Works on | Any OS, any runtime | macOS 11+ only |
| Setup | Env vars + CA trust | User approves extension once |
| Best for | Dev, CI, Linux servers | End-user laptops |
Streaming Responses (SSE & AWS eventstream)
Anthropic’s Messages API and Bedrock’s invoke-with-response-stream both return long-lived streaming responses. Two subtle framing rules apply when MITMing these on keep-alive connections:
- http.ReadResponse strips Transfer-Encoding: chunked from the header map and moves it to resp.TransferEncoding. Any code that checks resp.Header.Values("Transfer-Encoding") will therefore never see chunked — which would cause us to emit SSE bodies with neither Content-Length nor chunked framing, so the client hangs waiting for an end-of-response that never comes.
- Streaming responses have no known length, and we can't Connection: close the keep-alive socket. The only way the client learns the response is finished is chunked framing with a 0-length terminator.
The daemon therefore always re-chunks SSE and AWS eventstream responses regardless of how the upstream framed them. io.Copy streams the body through httputil.NewChunkedWriter directly to the client socket (bypassing resp.Write’s 4 KB bufio buffering so tokens arrive as they’re generated), then cw.Close() emits the terminator chunk.
Without this, Claude Code would display the full response, then hang on the spinner forever, and every subsequent turn on the same keep-alive connection would stall behind it.
Architecture
The forward proxy integrates with all other Quint subsystems:
- ES Extension — OS-level ground truth. Proxy provides content, ES provides file operations. Divergence between the two is the key signal.
- Agent Identity — auto-discovers agent type from User-Agent headers
- Sub-Agent Detection — detects model divergence and concurrency spikes across CONNECT tunnels
- Cloud Scoring — enriches local risk scores with the cloud API
- Cloud Ingestion — events flow through the cloud forwarder to /v1/ingest, then fan out via SNS FIFO + SQS
- RBAC — enforces cloud JWT token policies