# Graph Neural Network
The GNN provides Layer 2 (structural) scoring by analyzing the relational context around each action. It operates on a heterogeneous graph with 13 node types and 20 edge types, using multi-task learning for risk classification, severity regression, and threat category detection.
## Architecture: SubgraphGNN

```
Event → Subgraph Extraction (k=3 hop) → Node Feature Encoding
      → 2-layer HeteroConv (SAGEConv per edge type, sum aggregation)
      → Global Mean Pooling (over Action nodes)
      → Multi-Task Heads
```
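The first stage, k-hop subgraph extraction, can be sketched as a breadth-first traversal that stops expanding at depth k. This is a minimal illustration, not the production extractor; the `adjacency` dict and function name are assumptions for the sketch.

```python
from collections import deque

def k_hop_subgraph(adjacency, seed_nodes, k=3):
    """Collect every node within k hops of the seed set via BFS.

    adjacency: dict mapping node id -> iterable of neighbor ids
    seed_nodes: nodes anchoring the subgraph (e.g. the event's Action nodes)
    """
    visited = set(seed_nodes)
    frontier = deque((n, 0) for n in seed_nodes)
    while frontier:
        node, depth = frontier.popleft()
        if depth == k:
            continue  # do not expand past the hop limit
        for neighbor in adjacency.get(node, ()):
            if neighbor not in visited:
                visited.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return visited
```

The extracted node set then determines which feature vectors and typed edges are handed to the HeteroConv layers.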
## Multi-Task Outputs
| Head | Output | Loss |
|---|---|---|
| Risk level classification | 5 classes (none/low/medium/high/critical) | Cross-entropy |
| Severity score regression | 0-100 continuous | MSE |
| Threat category multi-label | 10 binary labels | Binary cross-entropy |
| Binary threat detection | is_threat (0/1) | Binary cross-entropy |
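During training the four head losses are combined into a single objective. The sketch below uses plain-Python loss formulas and illustrative weights (the weighting scheme and the `pred`/`target` field names are assumptions, not the production values):

```python
import math

def multi_task_loss(pred, target, weights=(1.0, 0.01, 1.0, 1.0)):
    """Weighted sum of the four head losses. The severity weight is small
    because severity spans 0-100 while the other losses are O(1)."""
    w_risk, w_sev, w_cat, w_thr = weights
    # Cross-entropy on the 5-way risk classification head
    ce = -math.log(pred["risk_probs"][target["risk_class"]])
    # MSE on the 0-100 severity regression head
    mse = (pred["severity"] - target["severity"]) ** 2
    # Binary cross-entropy averaged over the 10 category labels
    bce_cat = -sum(
        t * math.log(p) + (1 - t) * math.log(1 - p)
        for p, t in zip(pred["category_probs"], target["category_labels"])
    ) / len(pred["category_probs"])
    # Binary cross-entropy on the is_threat head
    p, t = pred["threat_prob"], target["is_threat"]
    bce_thr = -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return w_risk * ce + w_sev * mse + w_cat * bce_cat + w_thr * bce_thr
```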
## Threat Categories (10)

| Category | Description |
|---|---|
| policy_violation | Customer policy breach |
| exfiltration | Data leaving security boundary |
| privilege_escalation | Increasing access beyond scope |
| scope_creep | Accessing new resource types |
| excessive_agency | Agent acting beyond instructions |
| multi_step_attack | Coordinated action chain |
| prompt_injection | Prompt manipulation attempt |
| tool_poisoning | MCP tool description manipulation |
| supply_chain | Dependency/server compromise |
| anomalous_access | Statistical behavioral anomaly |
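The multi-label head emits one sigmoid output per category; converting those outputs to active labels is a simple thresholding step. A minimal sketch (the 0.5 threshold and function name are illustrative assumptions):

```python
THREAT_CATEGORIES = [
    "policy_violation", "exfiltration", "privilege_escalation",
    "scope_creep", "excessive_agency", "multi_step_attack",
    "prompt_injection", "tool_poisoning", "supply_chain",
    "anomalous_access",
]

def active_categories(category_probs, threshold=0.5):
    """Return the category names whose sigmoid output clears the threshold."""
    return [
        name for name, p in zip(THREAT_CATEGORIES, category_probs)
        if p >= threshold
    ]
```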
## Node Types & Features

### Core Nodes
| Node Type | Feature Dim | Key Features |
|---|---|---|
| Action | 52 | action_type(20) + verb_risk(1) + tool_hash(8) + tool_risk(1) + param_features(20) + time(2) |
| Agent | 24 | agent_type(14) + model(4) + deployment_context(6) |
| Resource | 22 | sensitivity(1) + resource_type(13) + data_classification(5) + name_risk(3) |
| User | 17 | clearance(1) + department(10) + role(6) |
| Policy | 10 | severity(1) + rule_type(6) + scope_hash(3) |
| Session | 4 | message_count(1) + tool_call_count(1) + time_features(2) |
### Data Nodes

| Node Type | Feature Dim | Key Features |
|---|---|---|
| DataField | 3 | is_pii(1) + is_sensitive(1) + field_risk(1) |
| ExternalEndpoint | 6 | is_whitelisted(1) + domain_hash(4) + suspicious(1) |
### MCP Nodes

| Node Type | Feature Dim | Key Features |
|---|---|---|
| MCPServer | 4 | is_verified(1) + is_local(1) + transport(1) + version_changed(1) |
| MCPTool | 2 | has_description_changed(1) + is_duplicate_name(1) |
| MCPResource | 2 | sensitivity_level(1) + is_static(1) |
| MCPPrompt | 1 | single feature |
| OAuthToken | 2 | is_broad_scope(1) + scope_breadth(1) |
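Each node's feature vector is the concatenation of its blocks from the tables above. As a worked example, an Action node's 52-dim vector can be assembled like this (the function and argument names are assumptions for illustration):

```python
def one_hot(index, size):
    """One-hot encode a categorical index into a fixed-size block."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec

def encode_action(action_type_idx, verb_risk, tool_hash8, tool_risk,
                  param_features20, time2):
    """Concatenate the Action feature blocks into one 52-dim vector:
    action_type(20) + verb_risk(1) + tool_hash(8) + tool_risk(1)
    + param_features(20) + time(2) = 52.
    """
    features = (
        one_hot(action_type_idx, 20)
        + [verb_risk]
        + list(tool_hash8)
        + [tool_risk]
        + list(param_features20)
        + list(time2)
    )
    assert len(features) == 52
    return features
```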
## Edge Types (20)

```python
# Core behavioral
("Agent", "PERFORMED", "Action")
("Action", "ACCESSED", "Resource")
("Action", "TOUCHED_FIELD", "DataField")
("Action", "SENT_TO", "ExternalEndpoint")
("Session", "INITIATED_BY", "User")
("Agent", "BELONGS_TO", "Session")
("Resource", "GOVERNED_BY", "Policy")

# Policy evaluation
("Policy", "PERMITS", "Action")
("Policy", "BLOCKS", "Action")

# Temporal chains
("Action", "PRECEDED_BY", "Action")
("Action", "ESCALATED_FROM", "Action")
("Action", "SIMILAR_TO", "Action")

# MCP topology
("Agent", "CONNECTED_TO", "MCPServer")
("Action", "INVOKED", "MCPTool")
("Action", "READ_RESOURCE", "MCPResource")
("Action", "USED_PROMPT", "MCPPrompt")
("MCPServer", "EXPOSES", "MCPTool")
("MCPServer", "AUTHENTICATED_BY", "OAuthToken")
("MCPTool", "DESCRIPTION_CHANGED", "MCPTool")
("MCPTool", "CROSS_SERVER_CALL", "MCPTool")
```
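A heterogeneous convolution needs the edges grouped by their `(src_type, relation, dst_type)` key, with parallel source/target id lists (the layout PyTorch Geometric's `edge_index` uses). A minimal grouping sketch, with an assumed triple format:

```python
from collections import defaultdict

def build_edge_indices(triples):
    """Group typed edges into per-relation index lists.

    triples: iterable of ((src_type, src_id), relation, (dst_type, dst_id))
    Returns {(src_type, relation, dst_type): ([src_ids], [dst_ids])}.
    """
    indices = defaultdict(lambda: ([], []))
    for (src_type, src_id), rel, (dst_type, dst_id) in triples:
        srcs, dsts = indices[(src_type, rel, dst_type)]
        srcs.append(src_id)
        dsts.append(dst_id)
    return dict(indices)
```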
## Score Production
The GNN maps its probability distribution to a 0-100 score using risk class centers:
```python
centers = {"none": 5, "low": 20, "medium": 45, "high": 72, "critical": 92}
gnn_score = sum(prob[c] * centers[c] for c in risk_classes)
```
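For a concrete probability distribution, the mapping is just the expected value over class centers. A self-contained worked example (the distribution values are illustrative):

```python
centers = {"none": 5, "low": 20, "medium": 45, "high": 72, "critical": 92}

def score_from_probs(probs):
    """Expected risk score: probability-weighted sum of the class centers."""
    return sum(probs[c] * centers[c] for c in centers)

probs = {"none": 0.05, "low": 0.10, "medium": 0.20, "high": 0.45, "critical": 0.20}
# 0.05*5 + 0.10*20 + 0.20*45 + 0.45*72 + 0.20*92 = 62.05
score = score_from_probs(probs)
```

A high-leaning distribution thus lands in the 60s rather than snapping to the `high` center of 72, which keeps the score sensitive to the model's uncertainty.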
## GNN Predictor API

```python
from quint_graph.gnn import GNNPredictor

predictor = GNNPredictor(
    model_path="path/to/model.pt",
    hidden_channels=64,
    num_layers=2,
    device="cpu",
)

# Full inference
result = predictor.predict(event, centrality=None)
# Returns: risk_class, probabilities, confidence, severity, is_threat

# Score only
score = predictor.predict_score(event, centrality=None)
# Returns: float 0-100
```
## Training Data
The GNN trains on synthetic data generated without proprietary customer data:
| Source | Examples | Percentage |
|---|---|---|
| Simulated (benign) | 12,000-18,000 | ~40% |
| Simulated (threat) | 8,000-12,000 | ~25% |
| Adversarial evasion | 5,000-8,000 | ~18% |
| LLM-generated scenarios | 3,000-5,000 | ~12% |
| Real-world (design partners) | 500-2,000 | ~5% |
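One way to realize the mix above is to sample each training example's source by its target percentage. A minimal sketch using the percentage column as weights (the source names and sampler are illustrative assumptions):

```python
import random

# Target proportions from the table above
TRAINING_MIX = {
    "simulated_benign": 0.40,
    "simulated_threat": 0.25,
    "adversarial_evasion": 0.18,
    "llm_generated": 0.12,
    "real_world": 0.05,
}

def sample_source(rng=random):
    """Draw a training-example source according to the target mix."""
    sources = list(TRAINING_MIX)
    weights = [TRAINING_MIX[s] for s in sources]
    return rng.choices(sources, weights=weights, k=1)[0]
```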
## Evaluation Targets
| Metric | Target |
|---|---|
| Risk Score MAE | < 8 points |
| Risk Level Accuracy | > 92% |
| Threat Detection F1 | > 95% |
| False Positive Rate | < 3% |
| Adversarial Detection Rate | > 75% |
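The first two rows of the table are standard metrics; a minimal sketch of how they can be computed over held-out examples (function names are illustrative, not a project API):

```python
def mae(pred_scores, true_scores):
    """Mean absolute error between predicted and true 0-100 risk scores."""
    return sum(abs(p - t) for p, t in zip(pred_scores, true_scores)) / len(true_scores)

def f1_score(pred_labels, true_labels):
    """F1 for the binary is_threat head (labels are 0/1)."""
    tp = sum(1 for p, t in zip(pred_labels, true_labels) if p and t)
    fp = sum(1 for p, t in zip(pred_labels, true_labels) if p and not t)
    fn = sum(1 for p, t in zip(pred_labels, true_labels) if not p and t)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```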
GPU is NOT required for inference. For our graph size (~1,500 ontology nodes plus dynamic event nodes), CPU inference takes 5-15 ms per event.