v0.2.0 now on GitHub · EU AI Act enforcement: August 2026

Your AI agent shipped.
Did you test it?

The Preflight suite is the pre-deploy test pipeline for AI agents — regression, compliance, orchestration gates, config promotion, and observability. Five tools. One pipeline. One exit code.

❯ npm install -g github:StanislavBG/agent-gate github:StanislavBG/agent-shift github:StanislavBG/agent-trace

View on GitHub →

~ agent-gate run && agent-shift promote staging

$ agent-gate run

  ✔  stepproof      18/18 scenarios passed      (4.2s)
  ✔  agent-comply   EU AI Act: Low risk          (1.1s)
  ✔  cost           $0.31 / run (budget: $0.50) (0.4s)

  VERDICT: PASS  — gates cleared, receipt saved

$ agent-shift promote staging --require-gate-pass ./gate-result.json

  ✔  gate receipt   verified sha256 match
  ✔  config diff    model unchanged, 1 prompt delta
  ✔  snapshot       snap-20260320-a3f1 saved

  PROMOTED  — staging → production

MIT · No telemetry · Runs offline · All gates run in parallel

The gap

Five things that silently break
every AI deploy.

Behavioral regression

A model update or prompt change breaks step 3 of your 7-step workflow.

You find out from a customer.

Compliance violation

Someone adds a model call that crosses an EU AI Act Annex III threshold.

You find out from legal.

Cost overrun

A prompt tweak triples token usage. No alert. No CI check.

You find out from the invoice.

Config drift

Staging ran with model A. Production promoted with model B. No diff, no snapshot, no rollback path.

You find out from the incident report.

No observability

Your agent ran. Was that 3 LLM calls or 47? Did step 5 retry? Was that 2 seconds or 20?

You find out from an unexplained invoice.

Your existing CI catches none of these. That's the gap.

The Preflight suite

Five tools, one story.

validate → comply → gate → deploy → observe. Each tool is independently useful. Together they form a complete pre-deploy pipeline for AI agents.

◈

stepproof

Behavioral regression testing

YAML scenario definitions, N iterations each
Per-step pass rate thresholds
LLM-judge assertions
Exit 1 on regression

"Did my AI pipeline break?"

$ stepproof run ./scenarios/

GitHub →
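
To make "per-step pass rate thresholds" concrete, here is a minimal sketch of the idea. It is not stepproof's actual code or scenario format: the result shape, the step names, and the helpers `stepPassRates` and `findRegressions` are all hypothetical.

```javascript
// Hypothetical data shape: one array per iteration, one entry per step.
// This is an illustration only, not stepproof's real output format or API.
function stepPassRates(iterations) {
  const totals = new Map(); // step name -> { passed, runs }
  for (const iteration of iterations) {
    for (const { step, passed } of iteration) {
      const entry = totals.get(step) ?? { passed: 0, runs: 0 };
      entry.runs += 1;
      if (passed) entry.passed += 1;
      totals.set(step, entry);
    }
  }
  const rates = {};
  for (const [step, { passed, runs }] of totals) {
    rates[step] = passed / runs;
  }
  return rates;
}

// Returns the steps whose observed pass rate is below their threshold.
function findRegressions(rates, thresholds) {
  return Object.keys(thresholds).filter(
    (step) => (rates[step] ?? 0) < thresholds[step]
  );
}

// Four iterations of a two-step scenario (made-up results).
const iterations = [
  [{ step: "classify", passed: true }, { step: "summarize", passed: true }],
  [{ step: "classify", passed: true }, { step: "summarize", passed: false }],
  [{ step: "classify", passed: true }, { step: "summarize", passed: false }],
  [{ step: "classify", passed: true }, { step: "summarize", passed: true }],
];
const thresholds = { classify: 0.9, summarize: 0.75 };

const rates = stepPassRates(iterations);
const regressions = findRegressions(rates, thresholds);
// A CI wrapper would exit with this code, so any regression fails the build.
const exitCode = regressions.length > 0 ? 1 : 0;

console.log(rates);       // { classify: 1, summarize: 0.5 }
console.log(regressions); // [ 'summarize' ]
```

Running each scenario several times and comparing a rate against a threshold, rather than asserting on a single run, is what lets a nondeterministic step pass when it is reliable enough and fail when it has degraded.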

◎

agent-comply

EU AI Act compliance scanning

Scans for AI model calls across your codebase
EU AI Act Annex III risk classification
Model card & AI-SBOM generation
Fails on undocumented high-risk usage

"Is this legal in the EU?"

$ agent-comply report

GitHub →

Suite flagship

⬡

agent-gate

Unified pre-deploy CI gate

Runs stepproof + agent-comply in parallel
Token cost estimation & budget cap
Model allowlist enforcement
Single PASS/FAIL exit code for CI

"Is this safe to ship?"

$ agent-gate run

GitHub →
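
The budget cap and model allowlist can be pictured as a single check over the calls an agent run makes. The sketch below rests on assumptions: the price table, the model names `model-a` and `model-b`, and the function `checkCostGate` are hypothetical, not agent-gate's real configuration or API.

```javascript
// Hypothetical per-1K-token prices in USD. Real prices differ by provider.
const PRICE_PER_1K = {
  "model-a": { input: 0.001, output: 0.002 },
  "model-b": { input: 0.01, output: 0.03 },
};

// Sums the estimated cost of every call in one agent run.
function estimateRunCost(calls) {
  return calls.reduce((total, { model, inputTokens, outputTokens }) => {
    const price = PRICE_PER_1K[model];
    if (!price) throw new Error(`No price configured for ${model}`);
    return (
      total +
      (inputTokens / 1000) * price.input +
      (outputTokens / 1000) * price.output
    );
  }, 0);
}

// Fails when the run is over budget or uses a model outside the allowlist.
function checkCostGate(calls, { budgetUsd, allowedModels }) {
  const usedModels = [...new Set(calls.map((c) => c.model))];
  const disallowed = usedModels.filter((m) => !allowedModels.includes(m));
  const costUsd = estimateRunCost(calls);
  const overBudget = costUsd > budgetUsd;
  return { costUsd, overBudget, disallowed, pass: !overBudget && disallowed.length === 0 };
}

const calls = [
  { model: "model-a", inputTokens: 2000, outputTokens: 500 },
  { model: "model-b", inputTokens: 1000, outputTokens: 1000 },
];
const result = checkCostGate(calls, { budgetUsd: 0.5, allowedModels: ["model-a"] });

console.log(result.costUsd.toFixed(3)); // 0.043
console.log(result.disallowed);         // [ 'model-b' ]
console.log(result.pass);               // false
```

In this example the run is well under budget but still fails, because one call uses a model that is not on the allowlist.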

⇅

agent-shift

Config versioning & environment promotion

Snapshot agent config: model, prompts, tools, guardrails
Diff environments or snapshots side-by-side
Promote with optional gate-pass receipt verification
Rollback to any prior snapshot instantly

"Your agent passed the gates. Now promote it safely."

$ agent-shift promote staging

GitHub →
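
As a rough illustration of what a config diff between environments reports, the sketch below compares two snapshots field by field. The snapshot shape follows the fields listed above (model, prompts, tools, guardrails), but it is an assumption, as is the helper `diffConfigs`. agent-shift's real snapshot format may differ.

```javascript
// Hypothetical snapshot shape. Not agent-shift's actual file format.
function diffConfigs(a, b) {
  const changes = [];
  const keys = new Set([...Object.keys(a), ...Object.keys(b)]);
  for (const key of [...keys].sort()) {
    // JSON.stringify is a simple stand-in for deep equality.
    // It is sensitive to object key order, which is acceptable for a sketch.
    if (JSON.stringify(a[key]) !== JSON.stringify(b[key])) {
      changes.push({ key, from: a[key], to: b[key] });
    }
  }
  return changes;
}

const staging = {
  model: "gpt-4o-mini",
  prompts: { system: "You are a support agent." },
  tools: ["search"],
  guardrails: ["pii-filter"],
};
// Production differs from staging only in the model.
const production = { ...staging, model: "gpt-4o" };

const drift = diffConfigs(staging, production);
for (const { key, from, to } of drift) {
  console.log(`${key}: ${from} (staging) vs ${to} (production)`);
}
// model: gpt-4o-mini (staging) vs gpt-4o (production)
```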

▣

agent-trace

Local agent observability

OTel GenAI spans stored in local SQLite
Query traces from the terminal — no dashboard
Works offline, no cloud account required
SARIF export for GitHub Security tab

"What did my agent actually do?"

$ agent-trace record 'node agent.js'

GitHub →
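
The questions a trace answers (how many LLM calls, which steps retried, how long the run took) reduce to a small aggregation over span records. The sketch below uses a hypothetical span shape with illustrative field names. It does not reproduce the OTel GenAI schema or agent-trace's SQLite layout.

```javascript
// Hypothetical span records. Field names are illustrative only.
function summarizeTrace(spans) {
  if (spans.length === 0) {
    return { llmCalls: 0, retriesByStep: {}, wallClockMs: 0 };
  }
  const llmSpans = spans.filter((s) => s.kind === "llm");
  const retriesByStep = {};
  for (const s of llmSpans) {
    // Any attempt after the first counts as a retry of that step.
    if (s.attempt > 1) {
      retriesByStep[s.step] = (retriesByStep[s.step] ?? 0) + 1;
    }
  }
  const start = Math.min(...spans.map((s) => s.startMs));
  const end = Math.max(...spans.map((s) => s.endMs));
  return { llmCalls: llmSpans.length, retriesByStep, wallClockMs: end - start };
}

const spans = [
  { step: "plan", kind: "llm", attempt: 1, startMs: 0, endMs: 400 },
  { step: "lookup", kind: "tool", attempt: 1, startMs: 400, endMs: 650 },
  { step: "answer", kind: "llm", attempt: 1, startMs: 650, endMs: 1200 },
  { step: "answer", kind: "llm", attempt: 2, startMs: 1200, endMs: 2000 },
];

const summary = summarizeTrace(spans);
console.log(summary);
// { llmCalls: 3, retriesByStep: { answer: 1 }, wallClockMs: 2000 }
```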

Install the tool you need. Add the rest when you need it.

"I want to catch AI regressions"

stepproof

"I need EU compliance artifacts for legal"

agent-comply

"I want one CI gate for everything"

agent-gate

"I need safe environment promotion with rollback"

agent-shift

"I want to see what my agent did at runtime"

agent-trace

"Starting fresh — what do I install?"

agent-gate + agent-shift + agent-trace — the full pipeline

How it works

One CI step. Five tools. Ship with confidence.

validate → comply → gate → deploy → observe. The first three checks run in parallel. agent-shift closes the loop. agent-trace records what happened.

git push (triggers CI)
  → in parallel:
      ◈ stepproof (~4.2s): 18 scenarios, pass rate thresholds, regression exit
      ◎ agent-comply (~1.1s): Annex III scan, model card, AI-SBOM
      cost check (~0.4s): token estimate vs. budget, model allowlist
  → ⬡ agent-gate (PASS/FAIL): unified verdict + gate receipt JSON
  → ⇅ agent-shift (~0.2s): snapshot config, verify receipt, promote
  → ✔ deploy: confident push with full audit trail

stepproof, agent-comply, and cost check run in parallel via Promise.allSettled. agent-shift only runs after agent-gate exits 0 — no gate receipt, no promotion.
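
The parallel pattern described above can be sketched in a few lines. This is an illustration under stated assumptions, not agent-gate's source: the three gate functions are hypothetical stubs that stand in for the real tools, and `runGates` is a made-up name.

```javascript
// Hypothetical stubs. Each gate resolves on success and rejects on failure.
const gates = {
  stepproof: async () => ({ detail: "18/18 scenarios passed" }),
  "agent-comply": async () => ({ detail: "EU AI Act: Low risk" }),
  cost: async () => {
    throw new Error("$1.82 / run (budget: $0.50)");
  },
};

async function runGates(gateFns) {
  const names = Object.keys(gateFns);
  // allSettled waits for every gate to finish, so one failure
  // does not cancel or hide the results of the others.
  const settled = await Promise.allSettled(names.map((n) => gateFns[n]()));
  const results = settled.map((outcome, i) => ({
    gate: names[i],
    ok: outcome.status === "fulfilled",
    detail:
      outcome.status === "fulfilled"
        ? outcome.value.detail
        : outcome.reason.message,
  }));
  const pass = results.every((r) => r.ok);
  // A CLI would exit with exitCode. Promotion is attempted only when it is 0.
  return { verdict: pass ? "PASS" : "FAIL", exitCode: pass ? 0 : 1, results };
}

const reportPromise = runGates(gates).then((report) => {
  for (const r of report.results) {
    console.log(r.ok ? "✔" : "✗", r.gate, r.detail);
  }
  console.log("VERDICT:", report.verdict);
  return report;
});
```

`Promise.allSettled` is preferred here over `Promise.all` because `Promise.all` rejects on the first failure, which would report only one broken gate when several have failed.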

~ gate failure + config drift

$ agent-gate run

  ✗  stepproof      14/18 scenarios failed
     → stepproof run --verbose to debug step failures

  ✔  agent-comply   EU AI Act: Low risk          (1.1s)

  ✗  cost           $1.82 / run (budget: $0.50) — OVER BUDGET
     → Check for prompt expansion in recent commits

  VERDICT: FAIL  — 2 of 3 gates blocked deploy

$ agent-shift check

  ✗  config drift   staging ≠ production
     → model: gpt-4o-mini (staging) vs gpt-4o (production)
     → Run: agent-shift diff staging production
     → Run: agent-shift rollback production   (to revert)

Drop this into your CI. Done.

# .github/workflows/preflight.yml
name: Preflight Agent Check
on: [push, pull_request]

jobs:
  preflight:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'

      - name: Install Preflight suite
        run: |
          npm install -g github:StanislavBG/stepproof
          npm install -g github:StanislavBG/agent-comply
          npm install -g github:StanislavBG/agent-gate
          npm install -g github:StanislavBG/agent-shift
          npm install -g github:StanislavBG/agent-trace

      - name: Run Preflight gates
        run: agent-gate run --config .preflight.json
        # Exits 1 on regression, compliance failure, or budget breach

      - name: Promote staging config to production
        if: github.ref == 'refs/heads/main'
        run: agent-shift promote staging --require-gate-pass ./gate-result.json
        # No gate receipt → no promotion. Enforced.

Why five tools, not one?

Each tool solves a distinct, independently valuable problem. stepproof tests behavior. agent-comply scans for legal risk. agent-gate unifies both for CI. agent-shift handles what comes after the gate: promoting the validated config to production with a rollback path if something goes wrong post-deploy. agent-trace is the observability layer: wrap any agent run to record OTel spans in local SQLite — no cloud account, no dashboard, data stays on your machine.

Passing a gate is a point-in-time verdict. Promoting a config is a stateful operation with environment history, diffs, and rollback. Observing production behavior closes the loop: traces recorded by agent-trace link back to the stepproof scenarios that defined expected behavior. You need all five, and they wire together: agent-gate writes a receipt, agent-shift verifies it, and agent-trace records what actually ran.

Pricing

Free at the CLI. Paid at the team layer.

All five CLI tools are MIT, forever. The cloud dashboard (team compliance history, PDF artifacts, and audit exports) is where we charge.

Free

$0 always

OSS CLI — all five tools

✔ stepproof + agent-comply + agent-gate + agent-shift + agent-trace
✔ Local execution, zero telemetry
✔ MIT License
✔ Full scan, test, gate, snapshot, promote
✔ JSON + Markdown artifact output
✔ Community YAML policy library
✔ CI/CD integration (GitHub Actions, etc.)

$ npm i -g github:StanislavBG/agent-gate github:StanislavBG/agent-shift github:StanislavBG/agent-trace

Cloud dashboard coming.

Run your first check
in 60 seconds.
