← Innovation Labs/Lab L2LiveAI safety & governance

Trustworthy AI Toolkit — Red-team · Eval · Lineage · EU AI Act

Trustworthy AI cannot be a slide deck. The Toolkit ships continuous red-teaming, bias and toxicity evals, model lineage and signed releases — all wired into CI so a model that fails audit literally cannot reach production. Compliance becomes a build artefact.

Research thesis

If safety is a checklist at release, you've already lost. It must be a CI gate on every commit.

Audit-ready releases

100%

Red-team coverage (MITRE ATLAS)

+ 4.1×

Time to regulator dossier

9d → 0d

Become a design partner →See live experiments

Active experiments

What the lab is testing right now

Continuous prompt-injection red-team

Adversarial agents probe production endpoints daily across MITRE ATLAS coverage.

Bias suites per protected class

Statistical parity, equalised odds, calibration — measured on every model card revision.

Lineage proofs

Cryptographically signed chain from data → features → model → deploy → response.

Policy-as-code guardrails

Rego rules enforced at the model gateway: residency, consent, retention, exit-list filtering.

Shippable artefacts

Everything the lab ships

Eval harness
Quality, bias, toxicity, jailbreak and ATLAS suites running on every commit.
Red-team agents
Continuous adversarial probes with severity scoring and auto-tickets to detection-as-code.
Lineage plane
Signed graph of every artefact in the model lifecycle, queryable for audit.
Regulator dossier
Auto-generated EU AI Act, NIST AI RMF and ISO 42001 evidence packs, updated continuously.
Policy gateway
Rego rules at the model gateway: residency, consent, retention, response filters.

Lab team

Responsible AI Principal
Adversarial ML Lead
Policy & Legal Engineer
Lineage / MLOps Lead

Partners we collaborate with

NIST AI RMFMITRE ATLASHugging FaceOpenAIAnthropicMicrosoft Responsible AI

Example output · Policy · model.gateway.regorego

package axp.gateway

default allow = false

allow {
  input.tenant == "acme-emea"
  input.region == "eu-west"
  input.consent.has("model_training") == false  # never train on PII
  not contains(input.prompt, sensitive_terms[_])
  input.eval_score >= 0.94
}

audit[a] {
  a := {
    "ts":      time.now_ns(),
    "model":   input.model,
    "tenant":  input.tenant,
    "score":   input.eval_score,
    "reason":  "policy.allow",
  }
}

Engagement timeline

Weeks 1–6 · first regulator dossier signed off by week 4

1
Weeks 1–2
Baseline + lineage
Wire eval harness, lineage capture, model cards across the production fleet.
2
Weeks 2–4
Red-team + dossier
Continuous adversarial probes, severity scoring, first regulator dossier signed.
3
Weeks 4–6
CI gates live
Quality, bias and ATLAS gates fail-build on regressions; exec scorecard live.

Flagship pods

Productionised by these squads

EU AI Act Readiness Pod

Red-Team Continuous Pod

Bias & Fairness Pod

Model Lineage Pod

Selected publications

Receipts, not just thesis

Continuous red-teaming reduces jailbreak success by 78% at iso-cost
USENIX Security Workshop·2025
From policy doc to Rego: making AI Act controls executable
AXP Internal Whitepaper·2026

FAQs

What partners actually ask

Is this a one-off audit?

No — it's a continuous control plane. Every commit re-runs evals, red-team and lineage; the dossier auto-updates.

EU AI Act ready?

Yes — high-risk and limited-risk obligations are mapped, with Annex IV evidence pack auto-generated.

Does it slow shipping?

The opposite. Failing fast in CI is far cheaper than failing in front of a regulator.

Can we use our own evals?

Yes — the harness is plug-in. Bring HELM, Big-Bench Hard, internal golden sets or domain suites.

Design-partner programme · L2 Trustworthy AI Toolkit

Co-build Trustworthy AI Toolkit with us in Weeks 1–6.

We'll respond within one business day with a scoping note, a fixed-price outcome contract, and a named principal cleared for your domain. Design partners get first-look access, joint publication rights and roadmap influence.

• Outcome-priced — no T&M.
• Sovereign by default — your data, your region, your keys.
• Refund-backed if the contracted KPI isn't hit.
• Joint publication rights and conference slots.