Annexo is an independent trust layer for AI agents. It verifies how a third party’s AI agent actually behaves with live behavioural probes, watches it for drift over time, and produces audit-ready assurance evidence a buyer, regulator or insurer can rely on. The thesis is simple: a builder cannot credibly grade its own homework, so verification has to be independent.

EU and DACH enterprises deploying AI agents in regulated settings — insurance, banking, industrial — and the consultancies that build agents for them. Later, insurers underwriting agent risk.

How does Annexo verify an AI agent?

Point the verify console at your own AI agent endpoint or run a built-in sample agent. A live probe battery runs against it — prompt injection, tool poisoning, guardrails under pressure, AI disclosure, PII handling, request logging — and resolves into an evidence dashboard. Your agent’s API key is held in memory for that one request only and is never stored.

Does Annexo certify or guarantee that an AI agent is compliant?

No. Annexo is not a notified body and does not certify, guarantee, or give legal advice. Every result is observed behaviour at the time of testing, reported as a status — holding, watch, or surfaced — never a pass/fail verdict or a conformity assessment.

What about EU regulations like the EU AI Act, GDPR, DORA and NIS2?

Annexo also produces done-for-you EU conformity dossiers — the evidence and technical documentation mapped to the EU AI Act, GDPR, DORA and NIS2, produced from your system and audit-ready. It is the deliverable, not a substitute for your own counsel or a conformity assessment body.

Where is Annexo’s data processed?

In the EU. Compute runs in the Frankfurt (fra1) region and persisted data uses an EU-region store, in line with EU data-residency expectations.

Annexo

Launch check Talk to us

Independent model-risk verification for live AI agents

Your AI agents carry model risk. We test how they behave — independently.

We run live behavioural probes on your AI agents — and hand you the evidence.

Probed under attack

Watched as it drifts

Independent evidence

Talk to us See it in action

No agent we built, no policy we sell — no conflictLive behaviour, re-tested as it driftsEvidence your risk owner can sign off on

What you're buying into — in three beats

01 · Today

Your AI agents make real decisions no one watches.

02 · With Annexo

We test and watch them live — independently.

03 · What you get

Evidence your risk owner can sign off on.

The trust gap

Your agent acts on its own. Who do you trust to say it's safe?

Before customers, a regulator, or a regulated buyer rely on it, someone has to vouch for how it actually behaves. Three answers fail. One holds.

You can't see what your AI agents are doing — they make real decisions, sealed inside a black box, out of your sight.

Trust yourself

Your team built it, tested it, signed it off. Homework, self-graded.

Trust the vendor

The platform that built it says it's fine. The maker can't be the grader.

Trust a dashboard

Green tiles your team typed in by hand. Recorded — never observed.

The right answer · Annexo

Independent verification of observed behaviour.

Annexo has no stake in the answer. We observe how your agent actually behaves and put the evidence on the record — independent, continuous, ready for your risk owner to sign.

Book a scoping call

The builder hopes for a pass. The insurer hopes for a pass. Annexo records what the agent actually did — either way.

The rail

One spine for agent trust: Test, Monitor, Attest, Insure.

Watch the agent get caught — then watch the rail that catches it forever. Independent verification of how your agents actually behave, from a first test through to underwritten risk. Two rungs run today; two are forming with our partners.

Test

Monitor

Attest

Insure

Test the AI agent, watch it live, attest it, and — on the horizon — insure it. One independent rail, from first test to underwritten risk.

Test and Monitor are live today. Attest is in pilot and is not a conformity assessment — Annexo is not a notified body. Insure is a direction, in development with risk partners and not yet available. Nothing here is legal advice.

See it in action

Three ways to watch it work.

No setup, no sign-up — open one and watch independent verification catch what a vendor demo never shows. Fictional sample agents, illustrative findings.

The Aria catch

Watch a claims agent hold under inspection — then cave under pressure and pay a claim it had flagged as fraud.

Universal chatbot check

Point the check at any chatbot and see how it handles disclosure, pressure and questions it shouldn't answer.

The adaptive auditor

An auditor that probes an agent, follows what it finds, and writes up the observed behaviour as it goes.

Illustrative · fictional sample agents

Continuous agent monitoring

Don't vouch for your AI agents. Prove them.

Every other tool automates the paperwork and asks everyone to trust it. We connect to your live agents and continuously prove how they behave — guardrails, prompt-injection resistance, logging, Art. 50 disclosure — each mapped to the obligations it must meet, and watched for drift as your estate changes. That evidence is what lets you turn an agent on, and what lets you sell it to a regulated buyer.

Proof you can demonstrate — not trust you assert.

Open the live console

Agents under assurance

continuously monitored

Clear to run unsupervised

1 of 3

holding on every probe

Findings caught

surfaced for review

Example agents · open showcaseWatch them get caught →

Live today · Test + Monitor

How verification produces audit-ready evidence today.

The first two rungs already ship. Connect your stack, we verify behaviour against the obligations that apply, and you walk away with audit-ready evidence — including a done-for-you EU conformity dossier across the EU AI Act, GDPR, DORA and NIS2. This is the present output of the rail, not the whole of it.

See a redacted sample dossier

✓independent

We don't build your agents

no reason to pass them

3agents verified

On the sample fleet

illustrative · sample fleet

2findings surfaced

Caught, not shipped

illustrative · sample fleet

✓continuous

Verified in seconds, watched continuously

behaviour, not paperwork

See your agents verified — in 30 minutes.

Bring one AI agent. We run it on the rail live and show you exactly how it behaves — what holds, and what gets caught — then talk about putting your whole fleet under independent watch. No paperwork, no commitment.

Book a scoping call See what's blocking your launch See an example dossier

About Annexo

Annexo is the independent trust layer for AI agents: it verifies how a third party’s AI agent actually behaves with live tests, watches it for drift, and produces audit-ready evidence for buyers, regulators and insurers. Every result is observed behaviour at the time of testing — never a certification, conformity assessment, guarantee, or legal advice. Annexo is not a notified body.

Frequently asked questions

What is Annexo?: Annexo is an independent trust layer for AI agents. It verifies how a third party’s AI agent actually behaves with live behavioural probes, watches it for drift over time, and produces audit-ready assurance evidence a buyer, regulator or insurer can rely on. The thesis is simple: a builder cannot credibly grade its own homework, so verification has to be independent.
Who is Annexo for?: EU and DACH enterprises deploying AI agents in regulated settings — insurance, banking, industrial — and the consultancies that build agents for them. Later, insurers underwriting agent risk.
How does Annexo verify an AI agent?: Point the verify console at your own AI agent endpoint or run a built-in sample agent. A live probe battery runs against it — prompt injection, tool poisoning, guardrails under pressure, AI disclosure, PII handling, request logging — and resolves into an evidence dashboard. Your agent’s API key is held in memory for that one request only and is never stored.
Does Annexo certify or guarantee that an AI agent is compliant?: No. Annexo is not a notified body and does not certify, guarantee, or give legal advice. Every result is observed behaviour at the time of testing, reported as a status — holding, watch, or surfaced — never a pass/fail verdict or a conformity assessment.
What about EU regulations like the EU AI Act, GDPR, DORA and NIS2?: Annexo also produces done-for-you EU conformity dossiers — the evidence and technical documentation mapped to the EU AI Act, GDPR, DORA and NIS2, produced from your system and audit-ready. It is the deliverable, not a substitute for your own counsel or a conformity assessment body.
Where is Annexo’s data processed?: In the EU. Compute runs in the Frankfurt (fra1) region and persisted data uses an EU-region store, in line with EU data-residency expectations.