The Founder's Review · for founders shipping fast
Built by founders, for founders shipping fast.
You can't credibly grade your own homework — the builder wants a pass; we want the truth. Annexo points adversarial probes at your AI agent, reads your code the way an attacker would, and hands you exactly what it observed: classified holding, watch, or surfaced. No verdict, no compliance theatre — just an honest, prioritized backlog you can ship against.
No agent endpoint handy? Run a free Site Check on any URL — security hygiene + how AI is used, from the outside.
We have no stake in your launch. When Annexo reviews another founder's product it is genuinely independent — that independence is the thing we sell. And we run this exact review on our own products, which is the most honest proof we can offer.
Two ways in
Point Annexo at your agent's OpenAI-compatible endpoint and the probe battery runs live in your browser — prompt injection, AI-interaction disclosure (Art. 50), PII echo, request logging. Evidence on the dashboard in minutes.
This is the /verify console — already live.
The complete review across the 10-dimension rubric — agent behaviour probed adversarially, plus a static read of your code, auth, and data boundaries. You get the observed findings and a prioritized remediation backlog: the same review we run on our own products.
What we check
Agent behaviour we probe live; the rest we read from your code. Every finding is re-verified adversarially before it lands on your backlog.
Prompt injection
Can planted instructions in user input or tool output steer the agent off its task?
Output poisoning
Is model output ever trusted as markup or code — the path to XSS and worse?
AI disclosure
Does the agent disclose it's an AI when a person could reasonably be misled (Art. 50)?
PII echo
Does the agent echo back or leak personal data it was handed or shown?
Secret hygiene
Any keys or tokens in committed files, client bundles, or NEXT_PUBLIC leakage?
SSRF
When the server fetches a user-supplied URL, can it be aimed at internal or metadata addresses?
Auth & tenancy
Is every record scoped to its owner, with no cross-tenant read or write?
Input validation
Is untrusted input validated and parameterized before it hits a query or a model?
Rate-limit / cost
Can a caller drain your model spend or hammer an endpoint with no durable ceiling?
Webhook integrity
Are inbound webhooks signature-verified and fail-closed (e.g. Stripe)?
PII residency
Where does personal data live and flow — is residency and retention documented?
Embed / key exposure
Do embeds or public write-keys expose more than the one thing they're meant to?
The honest proof
We dogfood the rubric on every product in our own portfolio before we point it at anyone else's. A self-review is never presented as independent assurance — but running the same sharp tool on our own code is the most honest way to show what it catches.
The builder wants a pass. The insurer wants a pass. The independent examiner records the one finding both would rather not see — either way.
Request a full review
A few details and we'll come back within one business day with scope and timing. No call required, no booking link — just a reply.
Annexo is the independent trust layer for AI agents: it verifies how a third party’s AI agent actually behaves with live tests, watches it for drift, and produces audit-ready evidence for buyers, regulators and insurers. Every result is observed behaviour at the time of testing — never a certification, conformity assessment, guarantee, or legal advice. Annexo is not a notified body.