Annexo
Talk to us
Independent · point-in-time · not a certification

The Founder's Review · for founders shipping fast

An independent review of how your AI product actually behaves.

Built by founders, for founders shipping fast.

You can't credibly grade your own homework — the builder wants a pass; we want the truth. Annexo points adversarial probes at your AI agent, reads your code the way an attacker would, and hands you exactly what it observed: classified holding, watch, or surfaced. No verdict, no compliance theatre — just an honest, prioritized backlog you can ship against.

No agent endpoint handy? Run a free Site Check on any URL — security hygiene + how AI is used, from the outside.

Evidence in minutes, freeAdversarially verifiedNo data stored

We have no stake in your launch. When Annexo reviews another founder's product it is genuinely independent — that independence is the thing we sell. And we run this exact review on our own products, which is the most honest proof we can offer.

Two ways in

Start free in minutes — or request the deep review.

Free · instant · self-serve

Live behavioural probe

Point Annexo at your agent's OpenAI-compatible endpoint and the probe battery runs live in your browser — prompt injection, AI-interaction disclosure (Art. 50), PII echo, request logging. Evidence on the dashboard in minutes.

  • Runs against your live endpoint, right now
  • Your API key is held in memory for the one request — never stored, echoed, or logged
  • Or run a built-in sample agent with zero setup
Run the live probe →

This is the /verify console — already live.

Request · deep review

Full AI product review

The complete review across the 10-dimension rubric — agent behaviour probed adversarially, plus a static read of your code, auth, and data boundaries. You get the observed findings and a prioritized remediation backlog: the same review we run on our own products.

  • All 10 dimensions, adversarially verified
  • Findings classified holding / watch / surfaced — never a pass/fail verdict
  • A prioritized remediation backlog you can ship against
Request a full review →

What we check

Ten dimensions, two layers.

Agent behaviour we probe live; the rest we read from your code. Every finding is re-verified adversarially before it lands on your backlog.

Live

Agent behaviour

01

Prompt injection

Can planted instructions in user input or tool output steer the agent off its task?

02

Output poisoning

Is model output ever trusted as markup or code — the path to XSS and worse?

03

AI disclosure

Does the agent disclose it's an AI when a person could reasonably be misled (Art. 50)?

04

PII echo

Does the agent echo back or leak personal data it was handed or shown?

Code

Static review

05

Secret hygiene

Any keys or tokens in committed files, client bundles, or NEXT_PUBLIC leakage?

06

SSRF

When the server fetches a user-supplied URL, can it be aimed at internal or metadata addresses?

07

Auth & tenancy

Is every record scoped to its owner, with no cross-tenant read or write?

08

Input validation

Is untrusted input validated and parameterized before it hits a query or a model?

09

Rate-limit / cost

Can a caller drain your model spend or hammer an endpoint with no durable ceiling?

10

Webhook integrity

Are inbound webhooks signature-verified and fail-closed (e.g. Stripe)?

11

PII residency

Where does personal data live and flow — is residency and retention documented?

12

Embed / key exposure

Do embeds or public write-keys expose more than the one thing they're meant to?

The honest proof

We run this exact review on our own products.

We dogfood the rubric on every product in our own portfolio before we point it at anyone else's. A self-review is never presented as independent assurance — but running the same sharp tool on our own code is the most honest way to show what it catches.

!Builderbuilt the agentAnnexono stake in the answer?InsurerIN DEVELOPMENT1 findingobserved · on the record

The builder wants a pass. The insurer wants a pass. The independent examiner records the one finding both would rather not see — either way.

Request a full review

Tell us about your product.

A few details and we'll come back within one business day with scope and timing. No call required, no booking link — just a reply.

holdingwatchsurfaced

These are observed findings from a point-in-time, independent review — classified holding, watch, or surfaced. It is not a certification, not a conformity assessment, not a security warranty, and not legal advice. Annexo is not a notified body. A review does not conclude that a system "is" or "is not" secure or compliant; it shows what we observed so you can decide what to do next.

About Annexo

Annexo is the independent trust layer for AI agents: it verifies how a third party’s AI agent actually behaves with live tests, watches it for drift, and produces audit-ready evidence for buyers, regulators and insurers. Every result is observed behaviour at the time of testing — never a certification, conformity assessment, guarantee, or legal advice. Annexo is not a notified body.

Frequently asked questions

What is Annexo?
Annexo is an independent trust layer for AI agents. It verifies how a third party’s AI agent actually behaves with live behavioural probes, watches it for drift over time, and produces audit-ready assurance evidence a buyer, regulator or insurer can rely on. The thesis is simple: a builder cannot credibly grade its own homework, so verification has to be independent.
Who is Annexo for?
EU and DACH enterprises deploying AI agents in regulated settings — insurance, banking, industrial — and the consultancies that build agents for them. Later, insurers underwriting agent risk.
How does Annexo verify an AI agent?
Point the verify console at your own AI agent endpoint or run a built-in sample agent. A live probe battery runs against it — prompt injection, tool poisoning, guardrails under pressure, AI disclosure, PII handling, request logging — and resolves into an evidence dashboard. Your agent’s API key is held in memory for that one request only and is never stored.
Does Annexo certify or guarantee that an AI agent is compliant?
No. Annexo is not a notified body and does not certify, guarantee, or give legal advice. Every result is observed behaviour at the time of testing, reported as a status — holding, watch, or surfaced — never a pass/fail verdict or a conformity assessment.
What about EU regulations like the EU AI Act, GDPR, DORA and NIS2?
Annexo also produces done-for-you EU conformity dossiers — the evidence and technical documentation mapped to the EU AI Act, GDPR, DORA and NIS2, produced from your system and audit-ready. It is the deliverable, not a substitute for your own counsel or a conformity assessment body.
Where is Annexo’s data processed?
In the EU. Compute runs in the Frankfurt (fra1) region and persisted data uses an EU-region store, in line with EU data-residency expectations.