umma.dev

AI Regulation: Architecting for Compliance

The EU AI Act represents a shift from voluntary ethics to mandatory legal engineering. For architects, the Regulation’s “risk-based approach” (Recital 14) is now a non-functional requirement as critical as latency or scalability.

The Risk Gateway Pattern

The Act specifies that risk management cannot be a static, one-off assessment.

What are we evaluating against? The Act defines specific tiers of risk that must be identified:

  • Article 5 (Prohibited AI Practices): Bans systems that pose an “unacceptable risk”, such as those deploying “subliminal techniques beyond a person’s consciousness” to materially distort their behaviour.
  • Article 6 (High-Risk AI Systems): Regulates systems intended to be used as a “safety component of a product” or in critical areas like biometrics and critical infrastructure.

A robust architectural pattern for handling this is a “Compliance Gateway”, which works like a specialised firewall for your AI. Where a Web Application Firewall (WAF) blocks SQL injection attacks, a Compliance Gateway blocks legally risky prompts.

It sits between the user and your LLM, analysing the content of every request. This “inspection” can be implemented using lightweight, fast-running classification models (like a distilled BERT model) that score the input for specific prohibited patterns. If a prompt triggers a high risk score, for example by matching patterns of subliminal manipulation, the gateway rejects it immediately, acting as a hard safety circuit before the request ever reaches your core model.

type RiskLevel = 'LOW' | 'HIGH' | 'UNACCEPTABLE';

interface AIRequest {
  userId: string;
  context: string;
  prompt: string;
}

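// Placeholder detectors, included only so the example runs. In production
// these would call the lightweight classifier models described above.
const containsSubliminalTechniques = (request: AIRequest): boolean =>
  /subliminal/i.test(request.prompt);

const isSafetyComponent = (request: AIRequest): boolean =>
  /safety component|critical infrastructure/i.test(request.context);

const classifyRisk = (request: AIRequest): RiskLevel => {
  // Art. 5(1): Prohibited AI Practices
  // We can deploy lightweight classifier models at the edge to detect
  // "subliminal techniques" or manipulation attempts in real time.
  if (containsSubliminalTechniques(request)) {
    return 'UNACCEPTABLE';
  }

  // Art. 6: High-risk AI Systems
  // If the request context involves safety components (Art. 6(1)) or an
  // Annex III area such as critical infrastructure (Art. 6(2)), we tag it
  // for high-integrity processing pathways.
  if (isSafetyComponent(request)) {
    return 'HIGH';
  }

  return 'LOW';
};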
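
To show where the classifier sits in the request path, here is a minimal sketch of the gateway itself. It assumes the classifier and the downstream model call are injected as plain functions; `createGateway`, `GatewayResult` and the rejection message are hypothetical names for illustration, not part of any library.

```typescript
type RiskLevel = 'LOW' | 'HIGH' | 'UNACCEPTABLE';

interface AIRequest {
  userId: string;
  context: string;
  prompt: string;
}

// Either the downstream result (tagged with its risk tier) or a rejection.
type GatewayResult<T> =
  | { allowed: true; risk: RiskLevel; result: T }
  | { allowed: false; risk: RiskLevel; reason: string };

// Wraps a downstream handler (for example, the LLM call) behind a risk check.
// `forward` may return a Promise; the gateway passes it through untouched.
function createGateway<T>(
  classify: (request: AIRequest) => RiskLevel,
  forward: (request: AIRequest) => T,
) {
  return (request: AIRequest): GatewayResult<T> => {
    const risk = classify(request);
    if (risk === 'UNACCEPTABLE') {
      // Hard stop: the downstream handler is never invoked.
      return { allowed: false, risk, reason: 'Rejected by compliance gateway (Art. 5)' };
    }
    // LOW and HIGH both proceed; HIGH stays tagged so later stages can
    // route it to a stricter processing pathway.
    return { allowed: true, risk, result: forward(request) };
  };
}
```

Keeping the classifier and the model behind function parameters means the gateway can be unit tested with stubs, and the classifier can be swapped without touching the request path.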

Traceability in the RAG Stack

Art. 12 mandates “record-keeping” to ensure “traceability of the functioning” of the system. In the context of Retrieval Augmented Generation (RAG), this creates a significant data engineering challenge. To support the “post-market monitoring” required by Art. 61, we cannot simply log the output; we must reconstruct the entire cognitive state of the system at the moment of decision.

Logging the Vector Database interactions

When a model hallucinates or provides a biased answer, the root cause is often the retrieved context, not the model weights. To build a defensible audit trail, your observability stack must capture the precise document chunks retrieved, their similarity scores, and the system prompts used to condition the LLM. Without this snapshot, we cannot prove whether a failure was a retrieval error or a reasoning error, making it impossible to fulfil our monitoring obligations.
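
As a rough illustration of such a snapshot, the sketch below builds one audit record per retrieval. The field names, the `snapshotRetrieval` function and the threshold handling are assumptions made for this example; it also assumes chunk IDs point at immutable, versioned content, otherwise the chunk text (or a hash of it) would need to be stored too.

```typescript
interface RetrievedChunk {
  chunkId: string;
  similarity: number;
  text: string;
}

interface RetrievalAuditRecord {
  traceId: string;
  timestamp: string;
  query: string;
  systemPrompt: string;
  similarityThreshold: number;
  chunks: { chunkId: string; similarity: number }[];
}

// Applies the similarity threshold, then returns both the context handed to
// the LLM and a record describing exactly what that context was built from.
function snapshotRetrieval(
  input: {
    traceId: string;
    query: string;
    systemPrompt: string;
    similarityThreshold: number;
    candidates: RetrievedChunk[];
  },
  now: Date = new Date(),
): { context: string; record: RetrievalAuditRecord } {
  const used = input.candidates
    .filter((c) => c.similarity >= input.similarityThreshold)
    .sort((a, b) => b.similarity - a.similarity);

  return {
    context: used.map((c) => c.text).join('\n'),
    record: {
      traceId: input.traceId,
      timestamp: now.toISOString(),
      query: input.query,
      systemPrompt: input.systemPrompt,
      similarityThreshold: input.similarityThreshold,
      chunks: used.map(({ chunkId, similarity }) => ({ chunkId, similarity })),
    },
  };
}
```

Because the record is produced by the same function that assembles the context, the log cannot drift from what the model actually saw.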

Distributed Transparency (Art. 13)

Art. 13(1) requires systems to be “sufficiently transparent to enable users to interpret the system’s output”. In a distributed microservices backend, transparency is a function of request tracing.

The fix is strict ID propagation, using a standard such as OpenTelemetry. A unique compliance_id is generated at the ingress and must survive every network hop: from the API gateway, through the vector search service, to the inference engine and finally to the structured logger. This stitches together a fragmented technical execution into a coherent legal narrative, allowing us to definitively answer why a specific user received a specific output.

{
  "traceId": "abc-123",
  "compliance_ref": "ART-13-TRANSPARENCY",
  "step": "vector_search",
  "query_context": {
    "vector_db_latency_ms": 45,
    "retrieved_chunk_ids": ["doc_882", "doc_991"],
    "similarity_threshold_applied": 0.85
  }
}
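
The propagation rule can be sketched without any tracing library. The example below is illustrative only: the header name `x-compliance-id` and the helper names are hypothetical, and a real deployment would more likely let OpenTelemetry's context propagation carry the value.

```typescript
// Hypothetical header name; pick whatever your platform standardises on.
const COMPLIANCE_HEADER = 'x-compliance-id';

type HeaderMap = Record<string, string>;

interface StepLog {
  compliance_id: string;
  step: string;
  detail: Record<string, unknown>;
}

// Reads the ID and fails loudly if a hop dropped it.
function readComplianceId(headers: HeaderMap): string {
  const id = headers[COMPLIANCE_HEADER];
  if (!id) {
    throw new Error(`Missing ${COMPLIANCE_HEADER}: refusing to continue untraced`);
  }
  return id;
}

// Ingress: reuse an inbound ID if present, otherwise mint one.
function ensureComplianceId(headers: HeaderMap, mint: () => string): HeaderMap {
  return headers[COMPLIANCE_HEADER]
    ? headers
    : { ...headers, [COMPLIANCE_HEADER]: mint() };
}

// Every outgoing call: copy the ID onto the next request's headers.
function forwardHeaders(inbound: HeaderMap, extra: HeaderMap = {}): HeaderMap {
  return { ...extra, [COMPLIANCE_HEADER]: readComplianceId(inbound) };
}

// Structured log entry that always carries the ID.
function logStep(headers: HeaderMap, step: string, detail: Record<string, unknown>): StepLog {
  return { compliance_id: readComplianceId(headers), step, detail };
}
```

Throwing on a missing ID is a deliberate choice in this sketch: an untraceable request is treated as a failure rather than silently logged without its reference.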

Documentation as a Build Artifact

Art. 11 mandates “technical documentation” that demonstrates compliance, with the critical constraint that it must be kept “up-to-date at all times”. In a DevOps environment with frequent deployments, manual documentation is obsolete the moment it is written.

The solution is to embrace Documentation as Code. We move compliance artifacts into the CI/CD pipeline, generating the required “Model Cards” automatically during the build process. By cryptographically hashing the training dataset and linking it to the specific git commit of the inference code, we create an immutable chain of custody.

This automation ensures that every production deployment carries its own compliance passport. We can objectively prove which data version was used (satisfying Art. 10(2)) and which automated evaluation tests passed (proving robustness under Art. 15), turning the “technical documentation” from a bureaucratic burden into a verified build artifact.

# Generated by CI/CD Pipeline
model_version: v1.2.0
regulatory_status: "compliant_art_11"
data_provenance:
  snapshot_uri: "s3://training-data/2026-02-09.parquet"
  sha256: "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
evaluation_metrics:
  robustness_score: 0.94
  bias_parity_check: "PASSED"
code_commit: 7b3f1a...
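
One way to enforce this in the pipeline is a build step that refuses to emit the card unless its provenance fields are well formed and every evaluation passed. The sketch below is an assumption-laden illustration: `buildModelCard` and the `evaluations` shape are hypothetical, and the dataset hash is assumed to have been computed by an earlier pipeline step.

```typescript
interface EvaluationResult {
  name: string;
  passed: boolean;
  score?: number;
}

interface ModelCard {
  model_version: string;
  code_commit: string;
  data_provenance: { snapshot_uri: string; sha256: string };
  evaluations: EvaluationResult[];
}

// Validates the inputs and returns the card; throwing here fails the build.
function buildModelCard(input: ModelCard): ModelCard {
  if (!/^[0-9a-f]{64}$/.test(input.data_provenance.sha256)) {
    throw new Error('data_provenance.sha256 must be a 64-character hex digest');
  }
  if (!/^[0-9a-f]{7,40}$/.test(input.code_commit)) {
    throw new Error('code_commit must be a git commit hash (7 to 40 hex characters)');
  }
  if (input.evaluations.length === 0) {
    throw new Error('at least one evaluation result is required');
  }
  const failed = input.evaluations.filter((e) => !e.passed).map((e) => e.name);
  if (failed.length > 0) {
    throw new Error(`evaluations failed: ${failed.join(', ')}`);
  }
  // Shallow copy so later mutation of the input does not alter the card.
  return { ...input };
}

// Serialised form to publish alongside the deployment artifact.
function renderModelCard(card: ModelCard): string {
  return JSON.stringify(card, null, 2);
}
```

Running this as a required pipeline stage means a deployment cannot reach production without a card that names its data snapshot, its commit and its passing checks.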