Annexo
← Home

About Annexo

The independent trust layer for AI agents.

Lifts and cars get independently inspected before people rely on them. AI agents now make real decisions inside real companies — yet nothing independent observes how they actually behave. Annexo is building that independent layer for the AI agent economy: we test how an agent behaves, keep watching it live, and issue independent assurance — observed behaviour, with evidence and provenance on every finding. We didn't build the agent, so we have no stake in making it look good. That's exactly why the company running it, its buyers, and regulators can all rely on the same result.

Founder

Benjamin Hellmich

Data & AI transformation leader · 12 years

Benjamin has spent over a decade building and governing AI at scale. As a Senior Manager for Data & AI and GenAI at Accenture's strategy practice in Munich, he has led more than 100 AI use cases for large, regulated enterprises.

That work included designing BaFin- and GDPR-aligned AI governance for a global asset manager running 200+ models across 20 countries, and delivering GenAI programmes with returns above 15× for industrial and energy groups. Annexo turns that same governance discipline into independent verification of the AI agents enterprises now deploy.

Earlier he worked as a software engineer, an investment analyst and in M&A, and began his career in banking. He holds an MSc in Finance from Cranfield and a BSc in Computer Science, and advises an international humanitarian organisation on data architecture across 30+ countries.

Connect on LinkedIn

Selected companies the team has worked with

AccentureCapgeminiSiemens AGSiemens EnergyAllianzMetroDeutsche BahnDeutsche Bank
12 yrs
governing AI inside regulated firms
100+
AI & GenAI use cases led
20
countries in one governance programme
15×+
return on flagship GenAI programmes

Why Annexo

Verification is the moat.

No team can grade its own homework — the people who build an agent can't be the ones who vouch for it. Independent verification is the credible signal, and it only works if it's neutral: the company deploying the agent, the buyers relying on it, and the regulators watching all trust the same observed result. Regulators learned this long ago, and the EU AI Act, GDPR, DORA and NIS2 now define what good behaviour looks like — they expect observed evidence, not the vendor's word. Annexo tests an agent's behaviour against those expectations and traces every finding back to what we actually saw, so the person who has to defend it can.

Put your agent on the bench.

Tell us about the agent you're deploying and where it runs. We'll show you exactly how independent verification works — what we observe, what we watch live, and the evidence you walk away with. Insurer-backed assurance is on the horizon (in development); independent assurance comes first.

About Annexo

Annexo is the independent trust layer for AI agents: it verifies how a third party’s AI agent actually behaves with live tests, watches it for drift, and produces audit-ready evidence for buyers, regulators and insurers. Every result is observed behaviour at the time of testing — never a certification, conformity assessment, guarantee, or legal advice. Annexo is not a notified body.

Frequently asked questions

What is Annexo?
Annexo is an independent trust layer for AI agents. It verifies how a third party’s AI agent actually behaves with live behavioural probes, watches it for drift over time, and produces audit-ready assurance evidence a buyer, regulator or insurer can rely on. The thesis is simple: a builder cannot credibly grade its own homework, so verification has to be independent.
Who is Annexo for?
EU and DACH enterprises deploying AI agents in regulated settings — insurance, banking, industrial — and the consultancies that build agents for them. Later, insurers underwriting agent risk.
How does Annexo verify an AI agent?
Point the verify console at your own AI agent endpoint or run a built-in sample agent. A live probe battery runs against it — prompt injection, tool poisoning, guardrails under pressure, AI disclosure, PII handling, request logging — and resolves into an evidence dashboard. Your agent’s API key is held in memory for that one request only and is never stored.
Does Annexo certify or guarantee that an AI agent is compliant?
No. Annexo is not a notified body and does not certify, guarantee, or give legal advice. Every result is observed behaviour at the time of testing, reported as a status — holding, watch, or surfaced — never a pass/fail verdict or a conformity assessment.
What about EU regulations like the EU AI Act, GDPR, DORA and NIS2?
Annexo also produces done-for-you EU conformity dossiers — the evidence and technical documentation mapped to the EU AI Act, GDPR, DORA and NIS2, produced from your system and audit-ready. It is the deliverable, not a substitute for your own counsel or a conformity assessment body.
Where is Annexo’s data processed?
In the EU. Compute runs in the Frankfurt (fra1) region and persisted data uses an EU-region store, in line with EU data-residency expectations.