v1.0.0 — Graph Intelligence Engine

Released April 14, 2026. P5 GNN Tier 2 complete — the intelligence loop is closed.

What’s New

Full Intelligence Loop

The platform now has a complete feedback loop from detection to fleet-wide protection:
Proxy (tool call intercepted)
  → NATS
    → BI Service Stage 1: Rule scoring
      → Stage 2: Baseline computation
        → Stage 3: Calibration (false neg/pos detection)
          → Stage 4: GraphIngester → Memgraph (action graph)
            → Stage 5: GNNScorer → VGAE inference → anomaly score
              → SignatureDistiller → FlowMatrix ThreatSignature
                → NATS → ALL proxies learn the new pattern (~30s)
An attack pattern detected at one proxy is automatically distilled into a structural signature and pushed to every proxy in the fleet.

Memgraph Graph Database

Agent actions are now materialized as a property graph in Memgraph, enabling structural pattern detection that the proxy’s FlowMatrix signatures cannot express.
Node types:
  • Action — one MCP tool call with capability, risk score, deviation, confidence band
  • Session — groups actions into a single agent invocation
  • Agent — persistent identity across sessions
  • Resource — files, APIs, databases accessed by actions
Edge types:
  • NEXT — temporal ordering between consecutive actions (with capability transition labels)
  • BELONGS_TO — action → session membership
  • STARTED_BY — session → agent ownership
Performance: 6,577 events/sec ingestion, cross-batch NEXT edge continuity, tenant-scoped MERGE keys.
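
A minimal sketch of how tenant-scoped MERGE keys and cross-batch NEXT continuity could fit together. The labels and edge types come from the lists above; every property name (`tenant_id`, `action_id`, `transition`, and so on) and the `build_statements` helper are assumptions for illustration, not the actual GraphIngester code. The function only builds parameterized Cypher; running it is left to whichever Bolt client is in use.

```python
from typing import Any

# Hypothetical Cypher templates. Every MERGE key includes tenant_id so two
# tenants can never collapse onto the same node.
UPSERT_ACTION = """
MERGE (g:Agent {tenant_id: $tenant_id, agent_id: $agent_id})
MERGE (s:Session {tenant_id: $tenant_id, session_id: $session_id})
MERGE (s)-[:STARTED_BY]->(g)
MERGE (a:Action {tenant_id: $tenant_id, action_id: $action_id})
SET a.capability = $capability, a.risk_score = $risk_score
MERGE (a)-[:BELONGS_TO]->(s)
"""

LINK_NEXT = """
MATCH (p:Action {tenant_id: $tenant_id, action_id: $prev_action_id})
MATCH (a:Action {tenant_id: $tenant_id, action_id: $action_id})
MERGE (p)-[n:NEXT]->(a)
SET n.transition = $transition
"""

ACTION_FIELDS = ("tenant_id", "agent_id", "session_id",
                 "action_id", "capability", "risk_score")


def build_statements(event: dict[str, Any], last_action: dict) -> list:
    """Return (query, params) pairs for one event.

    last_action maps (tenant_id, session_id) to the previous
    (action_id, capability); keeping it between batches is what lets a
    NEXT edge span a batch boundary.
    """
    tenant = event["tenant_id"]
    key = (tenant, event["session_id"])
    statements = [(UPSERT_ACTION, {f: event[f] for f in ACTION_FIELDS})]
    previous = last_action.get(key)
    if previous is not None:
        prev_id, prev_cap = previous
        statements.append((LINK_NEXT, {
            "tenant_id": tenant,
            "prev_action_id": prev_id,
            "action_id": event["action_id"],
            "transition": f"{prev_cap}->{event['capability']}",
        }))
    last_action[key] = (event["action_id"], event["capability"])
    return statements
```

Because the lookup key includes the tenant, two tenants that reuse the same session ID never get a NEXT edge between their actions. In a long-running service the `last_action` map would need a size bound.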

VGAE Autoencoder

Variational Graph Autoencoder for unsupervised anomaly detection. It learns to reconstruct normal session graphs, so a high reconstruction error marks a session as anomalous.
  • Encoder: GraphSAGE (input → 128 → 64, mean aggregation)
  • Decoder: Inner product (edge reconstruction) + MLP (feature reconstruction)
  • Loss: Edge reconstruction + feature reconstruction + beta * KL divergence
  • Anomaly score: tanh-normalized reconstruction error in [0, 1] with NaN guard
  • Negative sampling: Excludes positive edges to prevent training corruption
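
The last two bullets can be sketched in plain Python. The `scale` parameter, the choice to return 0.0 for a NaN input, and both function names are assumptions; the release notes only state that the score is tanh-normalized with a NaN guard and that negative samples exclude positive edges.

```python
import math
import random


def anomaly_score(recon_error: float, scale: float = 1.0) -> float:
    """Squash a non-negative reconstruction error into [0, 1] with tanh."""
    if math.isnan(recon_error) or scale <= 0:
        return 0.0  # assumed fallback; a deployment might prefer to fail closed
    return math.tanh(max(recon_error, 0.0) / scale)


def sample_negative_edges(num_nodes, positive_edges, k, rng):
    """Draw up to k directed non-edges, never returning a positive edge."""
    positives = set(positive_edges)
    real = {(u, v) for u, v in positives if u != v}
    # Cap k at the number of non-edges that exist so the loop terminates.
    k = min(k, num_nodes * (num_nodes - 1) - len(real))
    negatives = set()
    while len(negatives) < k:
        u, v = rng.randrange(num_nodes), rng.randrange(num_nodes)
        if u != v and (u, v) not in positives:
            negatives.add((u, v))
    return sorted(negatives)
```

Rejection sampling is the simplest way to guarantee the exclusion; on very dense graphs, enumerating the non-edges directly would be cheaper.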

Signature Distillation

When the GNN flags a session as anomalous, the SignatureDistiller:
  1. Extracts the session’s capability transition matrix from Memgraph
  2. Normalizes to a [12x12] probability distribution
  3. Computes a JSD threshold (tighter for higher-confidence detections)
  4. Packages as a proxy-compatible ThreatSignature JSON
  5. Publishes to the NATS subject quint.signatures.{org_id}
  6. Every proxy in the fleet receives the signature and adds it to its ThreatSignatureRegistry
Signature format:
{
  "id": "QT-GNN-A1B2C3D4",
  "name": "GNN-learned: high anomaly",
  "flow_shape": [[0.0, 0.3, ...], ...],
  "max_jsd": 0.25,
  "severity": "high",
  "weight": 0.85,
  "agent_types": [],
  "min_depth": 0
}
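
The distillation steps can be sketched as follows. This is an illustration under stated assumptions, not the SignatureDistiller itself: normalizing by the matrix total (rather than per row), the linear `jsd_threshold` mapping and its 0.15 to 0.35 range, the hash-derived ID suffix, using the anomaly score as `weight`, and the severity cut-off are all hypothetical. Only the field names and the 12x12 shape come from the notes above. NATS publishing is omitted.

```python
import hashlib
import json
import math

NUM_CAPS = 12  # capability classes, per the [12x12] shape above


def normalize(counts):
    """Scale a 12x12 transition-count matrix so all cells sum to 1."""
    if len(counts) != NUM_CAPS or any(len(row) != NUM_CAPS for row in counts):
        raise ValueError("expected a 12x12 matrix")
    total = sum(sum(row) for row in counts)
    if total <= 0:
        raise ValueError("session has no capability transitions")
    return [[cell / total for cell in row] for row in counts]


def jsd(p, q):
    """Jensen-Shannon divergence (log base 2, range [0, 1]) of two matrices."""
    flat_p = [x for row in p for x in row]
    flat_q = [x for row in q for x in row]
    mid = [(a + b) / 2 for a, b in zip(flat_p, flat_q)]

    def kl(dist):
        return sum(a * math.log2(a / m) for a, m in zip(dist, mid) if a > 0)

    return (kl(flat_p) + kl(flat_q)) / 2


def jsd_threshold(score, loose=0.35, tight=0.15):
    """Assumed mapping: a higher anomaly score yields a tighter max_jsd."""
    score = min(max(score, 0.0), 1.0)
    return loose - (loose - tight) * score


def distill(counts, score):
    """Package a flagged session's transitions as a signature dict."""
    flow = normalize(counts)
    digest = hashlib.sha256(json.dumps(flow).encode()).hexdigest()[:8].upper()
    return {
        "id": f"QT-GNN-{digest}",
        "name": "GNN-learned: high anomaly",
        "flow_shape": flow,
        "max_jsd": round(jsd_threshold(score), 3),
        "severity": "high" if score >= 0.8 else "medium",  # assumed cut-off
        "weight": round(score, 2),                         # assumed mapping
        "agent_types": [],
        "min_depth": 0,
    }


def matches(signature, observed_counts):
    """Proxy-side check: is the observed flow within max_jsd of the shape?"""
    observed = normalize(observed_counts)
    return jsd(signature["flow_shape"], observed) <= signature["max_jsd"]
```

A session with the same transition mix as the distilled one yields a divergence of 0 and matches; a session whose transitions are entirely different yields a divergence of 1 and does not.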

Feature Enrichment (Phase 2)

Node features extended from 15-dim (Phase 1) to 21-dim (Phase 2):
| Feature | Dim | Source |
|---|---|---|
| Capability one-hot | 12 | From event |
| Risk score | 1 | Proxy scoring |
| Deviation score | 1 | Proxy behavioral |
| Confidence band | 1 | Proxy gates |
| Hour of day (cyclical) | 2 | From timestamp |
| Inter-action gap | 1 | From NEXT edge |
| Depth | 1 | Agent nesting |
| In-degree / out-degree | 2 | From Memgraph |
Phase 3 (143-dim) will add 90 GraphReasoner rule bits and graph centrality features.
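
A sketch of how the 21-dim Phase 2 vector could be assembled. The dimension counts follow the feature list above; the ordering of the blocks, the sine/cosine encoding of the hour, the `log1p` scaling of the gap, and the function name `encode_action` are assumptions.

```python
import math

NUM_CAPS = 12


def encode_action(capability_index, risk, deviation, confidence_band,
                  hour, gap_seconds, depth, in_degree, out_degree):
    """Build one 21-dim node feature vector (12 + 3 + 2 + 1 + 1 + 2)."""
    if not 0 <= capability_index < NUM_CAPS:
        raise ValueError("capability_index must be in [0, 12)")
    one_hot = [0.0] * NUM_CAPS
    one_hot[capability_index] = 1.0
    # Cyclical encoding keeps 23:00 and 00:00 close together.
    angle = 2 * math.pi * (hour % 24) / 24
    return one_hot + [
        float(risk), float(deviation), float(confidence_band),
        math.sin(angle), math.cos(angle),
        math.log1p(max(gap_seconds, 0.0)),  # assumed compression of long gaps
        float(depth), float(in_degree), float(out_degree),
    ]
```

The two-value hour encoding is why the hour of day takes 2 dimensions instead of 1: a single scalar would place midnight and 23:00 at opposite ends of the range.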

Deployment Architecture

Memgraph is core at team tier and above — not enterprise-only:
| Tier | Memgraph | Models |
|---|---|---|
| Local | No graph (proxy fingerprints only) | N/A |
| Team | Shared Memgraph instance | Quint’s pre-trained models |
| Enterprise | Dedicated per-tenant instance | Custom-trained on tenant data |
| Global | Federated (anonymized embeddings) | Universal threat signatures |

What Changed from v0.9.0

| Component | v0.9.0 | v1.0.0 |
|---|---|---|
| BI Service stages | 3 (score, baseline, calibrate) | 5 (+graph ingest, +GNN scoring) |
| Detection method | FlowMatrix signatures only | FlowMatrix + graph structural analysis |
| Signature learning | 5 hand-crafted | 5 hand-crafted + GNN-learned |
| Attack propagation | None (local only) | Fleet-wide via NATS (~30s) |
| Graph database | None | Memgraph (28K+ actions, 600+ sessions) |
| Feature dimensions | N/A | 21-dim (Phase 2) |
| Test count | 513 | 700+ (191 BI Service + 500+ proxy) |

Bug Bounty Results

Two rounds of 4-agent bug bounty across the entire P5 codebase:
| Round | Bugs Found | Fixed |
|---|---|---|
| Round 1 | 7 (3 critical, 4 high) | 7 |
| Round 2 | 4 (1 high, 3 medium) | 4 |
Key fixes:
  • Import crash on startup (dead SessionSnapshotEvent import)
  • Sync torch blocking event loop → run_in_executor
  • Anomaly score range [0.5, 1.0] → [0.0, 1.0] via tanh
  • Negative sampling 9% collision rate → positive edge exclusion
  • Memory leaks: bounded _last_action_by_session, _scored_sessions, _distilled_sessions
  • Dedup race condition in signature distiller
  • Null guard on Memgraph capability fields
  • Session ID removed from signature description (info leak)

By the Numbers

| Metric | Value |
|---|---|
| PRs merged (platform) | 5 (#15, #16, #17, #18, #19) |
| PRs merged (proxy) | 3 (#43, #44, #41) |
| New Python modules | 10 (graph package) |
| BI Service tests | 191 (all pass) |
| Stress tests | 16 (6,577 events/sec, 0 cross-tenant edges) |
| VGAE model tests | 14 |
| Distiller tests | 7 |
| GNNScorer tests | 6 |
| Bug bounty bugs fixed | 11 |
| Memgraph nodes seeded | 28,019 |
| Intelligence loop latency | ~30 seconds end-to-end |

What’s Next

| Version | Scope | Dependencies |
|---|---|---|
| v1.1.0 | P6 — Compliance (SOC2, NIST, ISO mapping, audit export) | Schema design |
| v1.2.0 | P7 — Adaptive (federated intelligence, dynamic thresholds) | Production data |
| - | Feature enrichment to 143-dim (90 rule bits + centrality) | GraphReasoner wiring |
| - | Proxy signature receiver (NATS subscriber for learned sigs) | NATS infrastructure |
| - | Model training on real captured data | 200+ real sessions |

v1.0.1 — Accuracy Hardening (April 15, 2026)

Multi-Level Detection Stack

Replaced the single VGAE autoencoder with a 4-level ensemble detector:
| Level | Method | What it catches | Weight |
|---|---|---|---|
| 1 | Per-node VGAE reconstruction | Suspicious individual actions | 8.3% |
| 2 | Supervised GAT classifier | Known attack patterns (10 types) | 47.7% |
| 3 | Mahalanobis distance | Novel/zero-day attacks (never trained) | 30.5% |
| 4 | Node-max scoring | Sessions with any extreme action | 13.5% |
Weights are learned via logistic regression on validation data, not hand-tuned.
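
One way to read the weight column is as a convex combination of the four per-level scores. The sketch below assumes exactly that; how the logistic-regression coefficients are turned into these percentages is not stated here, and the level keys and `ensemble_score` name are hypothetical.

```python
# Reported weights, expressed as fractions. The key names are assumptions.
LEVEL_WEIGHTS = {
    "vgae_node": 0.083,
    "gat_classifier": 0.477,
    "mahalanobis": 0.305,
    "node_max": 0.135,
}


def ensemble_score(level_scores: dict) -> float:
    """Weighted sum of per-level scores, each expected in [0, 1]."""
    missing = set(LEVEL_WEIGHTS) - set(level_scores)
    if missing:
        raise KeyError(f"missing level scores: {sorted(missing)}")
    return sum(w * level_scores[name] for name, w in LEVEL_WEIGHTS.items())
```

Under this reading, a session that only the GAT classifier flags with full confidence scores 0.477, which is below the 0.65 threshold quoted in the accuracy results; at least one other level has to agree before the combined score crosses it.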

Accuracy Results

| Metric | Before (v1.0.0) | After (v1.0.1) |
|---|---|---|
| AUROC | Not measured | 1.000 |
| FPR @0.65 threshold | 71% | 0.0% |
| Detection rate | 30% | 90%+ |
| Separation | +0.117 | +0.437 |

Key Fix: GAT Classifier Collapse

The supervised classifier was outputting identical predictions for every input (P(attack)=0.63 constant). Root cause: training data was not shuffled — all normals processed before all attacks. Fixed with:
  • Epoch-level shuffling
  • Gradient accumulation over 8 samples
  • Gradient clipping at 1.0

Training Data at Scale

| Property | v1.0.0 | v1.0.1 |
|---|---|---|
| Sessions | 1,500 | 50,000 |
| Archetypes | 4 | 8 |
| Attack types | 5 | 10 (35 variants) |
| Topologies | Linear only | 5 types |
| Stealth attacks | None | 30% (normal features, attack structure) |
| Features | 21-dim | 38-dim (+ n-gram transitions) |

N-gram Features (+3 dims)

  • Bigram surprise: How rare is this capability transition? Novel transitions (read to upload) score high.
  • Window entropy: Shannon entropy of capabilities in last 10 actions. Attack kernels have high diversity.
  • Export density: Fraction of upload/send/download in last 20 actions. Catches slow-drip exfiltration.
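
A minimal sketch of the three features. The window sizes (10 and 20) and the export capabilities come from the bullets above; the add-one smoothing over 144 possible bigrams, the use of surprisal in bits, and all function names are assumptions.

```python
import math
from collections import Counter

EXPORT_CAPS = {"upload", "send", "download"}


def bigram_surprise(prev_cap, cap, bigram_counts, vocab_size=12):
    """Surprisal in bits of a transition, with add-one smoothing."""
    total = sum(bigram_counts.values())
    seen = bigram_counts.get((prev_cap, cap), 0)
    prob = (seen + 1) / (total + vocab_size ** 2)
    return -math.log2(prob)


def window_entropy(capabilities, window=10):
    """Shannon entropy (bits) of the capabilities in the last `window` actions."""
    recent = capabilities[-window:]
    if not recent:
        return 0.0
    counts = Counter(recent)
    n = len(recent)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def export_density(capabilities, window=20):
    """Fraction of export-type capabilities in the last `window` actions."""
    recent = capabilities[-window:]
    if not recent:
        return 0.0
    return sum(cap in EXPORT_CAPS for cap in recent) / len(recent)
```

For a history of eight reads followed by an upload and a send, `export_density` returns 0.2, and a never-seen read-to-upload transition scores higher surprise than a frequent read-to-read one.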

Adversarial Robustness

7 evasion scenarios tested:
| Attack | Detection | Key finding |
|---|---|---|
| Business-hours evasion | Per-level: 77-100% | Ensemble needs max-boost |
| Slow drip (5% malicious) | Per-level: 100% | N-gram export density catches it |
| Mimicry (normal + 5 uploads) | Node-max: 97% | Per-node scoring catches outliers |
| Novel zero-day | 100% | Mahalanobis distance is the zero-day backstop |
| Capability-consistent | 0% | Confirmed limit; needs resource sensitivity |

Score Calibration

  • Per-level percentile scoring against normal baseline distribution
  • Alert tiers: Hard (any level > 0.9), Soft (2+ levels > 0.5), Standard
  • Calibrator persisted alongside model for production deployment
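
Percentile scoring and the alert tiers can be sketched as below. The thresholds (0.9, 0.5, two or more levels) come from the bullets above; treating "standard" as the fall-through tier, and the class and function names, are assumptions.

```python
from bisect import bisect_right


class PercentileCalibrator:
    """Map a raw level score to its percentile among normal-session scores."""

    def __init__(self, normal_scores):
        if not normal_scores:
            raise ValueError("need at least one baseline score")
        self._sorted = sorted(normal_scores)

    def __call__(self, raw_score: float) -> float:
        # Fraction of baseline scores that are <= the raw score.
        return bisect_right(self._sorted, raw_score) / len(self._sorted)


def alert_tier(calibrated_levels) -> str:
    """Classify a session from its per-level calibrated scores."""
    if any(score > 0.9 for score in calibrated_levels):
        return "hard"
    if sum(score > 0.5 for score in calibrated_levels) >= 2:
        return "soft"
    return "standard"
```

Calibrating each level against the normal baseline puts the four detectors on a common scale, so a fixed threshold such as 0.9 means the same thing for a reconstruction error as for a Mahalanobis distance.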

GNN vs Baseline

Proved the graph structure adds value beyond flat features:
| Model | AUROC | AUPRC | F1 |
|---|---|---|---|
| GNN Ensemble | 1.000 | 1.000 | 1.000 |
| Feature Baseline | 0.982 | 0.978 | 0.966 |

AWS Deployment Architecture

Team Tier (5-50 agents)

Shared ECS cluster. Each team gets a BI Service task with Memgraph sidecar (1-2GB). Data isolation via tenant_id on all graph queries. Shared NATS, Redis, and Postgres.

Enterprise Tier (100-10K agents)

Dedicated per-tenant infrastructure. Memgraph on memory-optimized EC2 (r6g.xlarge, 32GB). BI Service auto-scales horizontally. Can deploy in customer’s own AWS account for data residency.

Global Tier (Federated Intelligence)

Anonymized embeddings only cross tenant boundaries:
  • FlowMatrix [12x12] capability transitions (no tool names)
  • Latent embeddings [64-dim] (compressed, non-invertible)
  • Capability distributions [12] (percentages only)
Never shared: tool names, resource paths, agent IDs, arguments, customer identity.
An attack detected at Tenant A becomes a universal threat signature pushed to all tenants within ~5 minutes.
Tenant A detects anomaly → anonymize → Global BI
  → cluster with Tenant B, C detections
    → universal ThreatSignature
      → push to ALL tenants → ALL proxies
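
The anonymize step is easiest to get right with an allow-list: copy only the three shareable artifacts and drop everything else. The sketch below assumes hypothetical field names (`flow_matrix`, `embedding`, `capability_distribution`); only the shapes come from the list above.

```python
# Shareable fields and their expected shapes. The key names are assumptions.
SHAREABLE_SHAPES = {
    "flow_matrix": (12, 12),
    "embedding": (64,),
    "capability_distribution": (12,),
}


def _has_shape(value, shape) -> bool:
    if len(shape) == 1:
        return len(value) == shape[0]
    return len(value) == shape[0] and all(len(row) == shape[1] for row in value)


def anonymize(detection: dict) -> dict:
    """Return only allow-listed fields; anything else never leaves the tenant."""
    payload = {}
    for field, shape in SHAREABLE_SHAPES.items():
        value = detection[field]
        if not _has_shape(value, shape):
            raise ValueError(f"{field} does not have shape {shape}")
        payload[field] = value
    return payload
```

An allow-list fails safe: a new field added to the detection record later stays private unless it is deliberately added to `SHAREABLE_SHAPES`.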

v1.0.2 — Rule Engine + 143-dim Features (April 15, 2026)

GraphReasoner Rule Engine

90 inference rules across 7 categories, mapped to 11 compliance frameworks. Replaces the fallback scorer (deviation_score * 100) with domain-knowledge-based scoring.
Rule Categories:
| Category | Rules | Examples |
|---|---|---|
| Capability risk | 20 | Exec after reads, exfiltration indicators, C2 patterns |
| Tool risk | 15 | Shell execution, sudo, netcat, Docker, SSH |
| Temporal | 10 | Off-hours, burst patterns, session duration |
| Sequence | 15 | Read→upload exfil, credential relay, payload staging |
| Resource | 12 | Credential files, private keys, path traversal |
| Behavioral | 11 | Novel tools, deviation spikes, multi-signal corroboration |
| Compliance | 7 | Unencrypted exports, missing MFA, human oversight |
Compliance Frameworks: SOC2, NIST 800-53, ISO 27001, OWASP LLM Top 10, MITRE ATT&CK, GDPR, EU AI Act, NIST AI RMF, PCI DSS, HIPAA, CCPA

Baseline-Aware Scoring

Rules are no longer blind to agent behavior. Each rule’s score is modulated by how surprising the action is for this specific agent:
effective_score = rule.score_delta × baseline_surprise(capability)
  • DevOps agent runs exec (normal for them) → surprise = 0.0 → score = 0
  • Coding assistant runs exec (never does this) → surprise = 1.0 → score = full
  • Cold start agent (unknown) → surprise = 0.5 → score = half
This eliminates ~60% of false positives from baseline-blind rules.
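
A sketch of one surprise function that reproduces the three cases above. The formula is an assumption: it treats any capability that makes up at least `routine_share` of an agent's history as fully expected, and falls back to 0.5 when fewer than `min_observations` actions have been seen. Both parameters and their defaults are hypothetical.

```python
def baseline_surprise(capability, history,
                      min_observations=20, routine_share=0.05):
    """Return a surprise factor in [0, 1] for this agent and capability.

    history maps capability name to the number of times the agent used it.
    """
    total = sum(history.values())
    if total < min_observations:
        return 0.5  # cold start: not enough data to judge
    share = history.get(capability, 0) / total
    return max(0.0, 1.0 - share / routine_share)


def effective_score(score_delta, capability, history):
    """effective_score = rule.score_delta x baseline_surprise(capability)."""
    return score_delta * baseline_surprise(capability, history)
```

With a rule worth 30 points, a DevOps agent whose history is 40% exec scores 0, a coding assistant that has never run exec scores the full 30, and an agent with no history scores 15.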

Capability-Based Tool Detection

Tool risk rules use the 12-capability classification system instead of regex pattern matching on tool names. Renaming bash to custom_executor doesn’t evade detection — the capability is still exec.

143-dim GNN Features

Rule firing bits are now wired into the GNN feature pipeline as node features:
| Feature Block | Dims | Source |
|---|---|---|
| Phase 1 (capability + scores) | 15 | Proxy behavioral scoring |
| Phase 2 (temporal + degree) | 6 | Timestamps + Memgraph |
| Phase 3 (action type + novelty + n-grams) | 14 | Event metadata + sequence analysis |
| Phase 4 (rule bits + top-K scores) | 108 | RuleEngine evaluation |
| Total | 143 | |
The 90 binary rule bits encode domain knowledge directly as GNN features. The model learns which rules correlate with attacks and how they interact — something hand-tuned score deltas cannot express.
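
A sketch of how the 108-dim Phase 4 block could be laid out. The 90 rule bits are stated above; the remaining 18 dimensions follow from 108 minus 90, but their layout is not described, so filling them with the 18 highest rule scores in descending order (zero-padded) is an assumption, as is the `encode_rules` name.

```python
NUM_RULES = 90
TOP_K = 18  # 108 total dims minus 90 rule bits


def encode_rules(fired: dict) -> list:
    """Encode fired rules as 90 binary bits plus the top-K rule scores.

    fired maps rule index (0..89) to that rule's score for this action.
    """
    bits = [0.0] * NUM_RULES
    for index in fired:
        if not 0 <= index < NUM_RULES:
            raise ValueError(f"rule index out of range: {index}")
        bits[index] = 1.0
    top = sorted((float(s) for s in fired.values()), reverse=True)[:TOP_K]
    top += [0.0] * (TOP_K - len(top))  # zero-pad when fewer than K rules fired
    return bits + top
```

Concatenating this block with the 35 dimensions from Phases 1 to 3 gives the 143-dim node vector.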

How to Enable

# Proxy policy
behavioral:
  enabled: true
  shadow_mode: true

# BI Service environment
BI_GRAPH_REASONER_ENABLED=true   # 90-rule engine
BI_MEMGRAPH_ENABLED=true         # graph database
BI_GNN_ENABLED=true              # multi-level detector

Performance

| Component | Latency |
|---|---|
| Proxy Gates 1-3 | <1 µs per action |
| Rule evaluation (90 rules) | 0.133 ms per event |
| Phase 4 encoding (143-dim) | 8.1 ms per 1000 nodes |
| GNN inference per session | 0.44 ms avg |
| Full pipeline (proxy → GNN → signature) | ~30 seconds |