← Innovation Labs/Lab L2LiveAI safety & governance

Trustworthy AI ToolkitRed-team · Eval · Lineage · EU AI Act

Trustworthy AI cannot be a slide deck. The Toolkit ships continuous red-teaming, bias and toxicity evals, model lineage and signed releases — all wired into CI so a model that fails audit literally cannot reach production. Compliance becomes a build artefact.

Research thesis
If safety is a checklist at release, you've already lost. It must be a CI gate on every commit.
Audit-ready releases
100%
Red-team coverage (MITRE ATLAS)
+ 4.1×
Time to regulator dossier
9d → 0d
Active experiments

What the lab is testing right now

Continuous prompt-injection red-team

Adversarial agents probe production endpoints daily across MITRE ATLAS coverage.

Bias suites per protected class

Statistical parity, equalised odds, calibration — measured on every model card revision.

Lineage proofs

Cryptographically signed chain from data → features → model → deploy → response.

Policy-as-code guardrails

Rego rules enforced at the model gateway: residency, consent, retention, exit-list filtering.

Shippable artefacts

Everything the lab ships

  • Eval harness
    Quality, bias, toxicity, jailbreak and ATLAS suites running on every commit.
  • Red-team agents
    Continuous adversarial probes with severity scoring and auto-tickets to detection-as-code.
  • Lineage plane
    Signed graph of every artefact in the model lifecycle, queryable for audit.
  • Regulator dossier
    Auto-generated EU AI Act, NIST AI RMF and ISO 42001 evidence packs, updated continuously.
  • Policy gateway
    Rego rules at the model gateway: residency, consent, retention, response filters.
Lab team
  • Responsible AI Principal
  • Adversarial ML Lead
  • Policy & Legal Engineer
  • Lineage / MLOps Lead
Partners we collaborate with
NIST AI RMFMITRE ATLASHugging FaceOpenAIAnthropicMicrosoft Responsible AI
Example output · Policy · model.gateway.regorego
package axp.gateway

default allow = false

allow {
  input.tenant == "acme-emea"
  input.region == "eu-west"
  input.consent.has("model_training") == false  # never train on PII
  not contains(input.prompt, sensitive_terms[_])
  input.eval_score >= 0.94
}

audit[a] {
  a := {
    "ts":      time.now_ns(),
    "model":   input.model,
    "tenant":  input.tenant,
    "score":   input.eval_score,
    "reason":  "policy.allow",
  }
}
Engagement timeline

Weeks 1–6 · first regulator dossier signed off by week 4

  1. 1
    Weeks 1–2
    Baseline + lineage

    Wire eval harness, lineage capture, model cards across the production fleet.

  2. 2
    Weeks 2–4
    Red-team + dossier

    Continuous adversarial probes, severity scoring, first regulator dossier signed.

  3. 3
    Weeks 4–6
    CI gates live

    Quality, bias and ATLAS gates fail-build on regressions; exec scorecard live.

Flagship pods

Productionised by these squads

EU AI Act Readiness Pod
Red-Team Continuous Pod
Bias & Fairness Pod
Model Lineage Pod
Selected publications

Receipts, not just thesis

  • Continuous red-teaming reduces jailbreak success by 78% at iso-cost
    USENIX Security Workshop·2025
  • From policy doc to Rego: making AI Act controls executable
    AXP Internal Whitepaper·2026
FAQs

What partners actually ask

Is this a one-off audit?

No — it's a continuous control plane. Every commit re-runs evals, red-team and lineage; the dossier auto-updates.

EU AI Act ready?

Yes — high-risk and limited-risk obligations are mapped, with Annex IV evidence pack auto-generated.

Does it slow shipping?

The opposite. Failing fast in CI is far cheaper than failing in front of a regulator.

Can we use our own evals?

Yes — the harness is plug-in. Bring HELM, Big-Bench Hard, internal golden sets or domain suites.

Design-partner programme · L2 Trustworthy AI Toolkit

Co-build Trustworthy AI Toolkit with us in Weeks 1–6.

We'll respond within one business day with a scoping note, a fixed-price outcome contract, and a named principal cleared for your domain. Design partners get first-look access, joint publication rights and roadmap influence.

  • • Outcome-priced — no T&M.
  • • Sovereign by default — your data, your region, your keys.
  • • Refund-backed if the contracted KPI isn't hit.
  • • Joint publication rights and conference slots.
By submitting you agree to our outreach for this enquiry. Your details are stored in our governed lead system.