The Preflight suite is the pre-deploy test pipeline for AI agents — regression, compliance, orchestration gates, config promotion, and observability. Five tools. One pipeline. One exit code.
$ agent-gate run
✔ stepproof 18/18 scenarios passed (4.2s)
✔ agent-comply EU AI Act: Low risk (1.1s)
✔ cost $0.31 / run (budget: $0.50) (0.4s)
VERDICT: PASS — gates cleared, receipt saved

$ agent-shift promote staging --require-gate-pass ./gate-result.json
✔ gate receipt verified sha256 match
✔ config diff model unchanged, 1 prompt delta
✔ snapshot snap-20260320-a3f1 saved
PROMOTED — staging → production
A model update or prompt change breaks step 3 of your 7-step workflow.
Someone adds a model call that crosses an EU AI Act Annex III threshold.
A prompt tweak triples token usage. No alert. No CI check.
Staging ran with model A. Production promoted with model B. No diff, no snapshot, no rollback path.
Your agent ran. Was that 3 LLM calls or 47? Did step 5 retry? Was that 2 seconds or 20?
Your existing CI catches none of these. That's the gap.
validate → comply → gate → deploy → observe. Each tool is independently useful. Together they form a complete pre-deploy pipeline for AI agents.
stepproof · agent-comply · agent-gate · agent-shift · agent-trace
agent-gate + agent-shift + agent-trace: the full pipeline
The first three checks run in parallel via Promise.allSettled. agent-shift closes the loop. agent-trace records what happened.
agent-shift only runs after agent-gate exits 0 — no gate receipt, no promotion.
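The parallel-then-aggregate behavior described above can be sketched in TypeScript. This is a minimal illustration, not agent-gate's actual implementation: the `GateResult` shape, the `runGates` function, and the stub checks are all hypothetical. It shows why `Promise.allSettled` suits this job: a check that throws is reported as a failed gate instead of aborting the others, and the verdict collapses to a single exit code.

```typescript
// Hypothetical shapes for illustration only; not agent-gate's real API.
interface GateResult {
  name: string;
  passed: boolean;
  detail: string;
}

type Gate = () => Promise<GateResult>;

// Run every gate concurrently. allSettled never rejects, so one crashing
// check cannot hide the results of the others.
async function runGates(
  gates: Gate[]
): Promise<{ results: GateResult[]; exitCode: number }> {
  const settled = await Promise.allSettled(gates.map((gate) => gate()));
  const results = settled.map((outcome, i): GateResult =>
    outcome.status === "fulfilled"
      ? outcome.value
      : { name: `gate-${i}`, passed: false, detail: String(outcome.reason) }
  );
  // One exit code for the whole pipeline: 0 only if every gate passed.
  const exitCode = results.every((r) => r.passed) ? 0 : 1;
  return { results, exitCode };
}

// Stub gates standing in for the regression, compliance, and cost checks.
const exampleGates: Gate[] = [
  async () => ({ name: "stepproof", passed: true, detail: "18/18 scenarios passed" }),
  async () => ({ name: "agent-comply", passed: true, detail: "EU AI Act: Low risk" }),
  async () => {
    const costPerRun = 1.82; // illustrative numbers
    const budget = 0.5;
    return {
      name: "cost",
      passed: costPerRun <= budget,
      detail: `$${costPerRun} / run (budget: $${budget})`,
    };
  },
];

runGates(exampleGates).then(({ results, exitCode }) => {
  for (const r of results) {
    console.log(`${r.passed ? "✔" : "✗"} ${r.name} ${r.detail}`);
  }
  console.log(`VERDICT: ${exitCode === 0 ? "PASS" : "FAIL"}`);
});
```

With the stub numbers above, the cost gate is over budget, so the sketch reports a failing verdict and a non-zero exit code even though the other two gates pass.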
$ agent-gate run
✗ stepproof 14/18 scenarios failed
  → stepproof run --verbose to debug step failures
✔ agent-comply EU AI Act: Low risk (1.1s)
✗ cost $1.82 / run (budget: $0.50) — OVER BUDGET
  → Check for prompt expansion in recent commits
VERDICT: FAIL — 2 of 3 gates blocked deploy

$ agent-shift check
✗ config drift staging ≠ production
  → model: gpt-4o-mini (staging) vs gpt-4o (production)
  → Run: agent-shift diff staging production
  → Run: agent-shift rollback production (to revert)
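The config drift check in the failing run above amounts to a key-by-key comparison of two environment configs. A minimal TypeScript sketch follows, assuming a flat string-to-string config; the `AgentConfig` and `Drift` types, the `diffConfigs` function, and the `promptVersion` key are hypothetical and not taken from agent-shift.

```typescript
// Hypothetical flat config shape; real agent-shift configs may differ.
type AgentConfig = Record<string, string>;

interface Drift {
  key: string;
  left: string | undefined;
  right: string | undefined;
}

// Report every key whose value differs between two environments,
// including keys present on only one side.
function diffConfigs(left: AgentConfig, right: AgentConfig): Drift[] {
  const keys = new Set([...Object.keys(left), ...Object.keys(right)]);
  const drift: Drift[] = [];
  for (const key of Array.from(keys).sort()) {
    if (left[key] !== right[key]) {
      drift.push({ key, left: left[key], right: right[key] });
    }
  }
  return drift;
}

const staging: AgentConfig = { model: "gpt-4o-mini", promptVersion: "v7" };
const production: AgentConfig = { model: "gpt-4o", promptVersion: "v7" };

for (const d of diffConfigs(staging, production)) {
  console.log(`→ ${d.key}: ${d.left} (staging) vs ${d.right} (production)`);
}
```

An empty result means the two environments match; any entry is a reason to block promotion or to inspect the diff before rolling back.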
# .github/workflows/preflight.yml
name: Preflight Agent Check
on: [push, pull_request]
jobs:
  preflight:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
      - name: Install Preflight suite
        run: |
          npm install -g github:StanislavBG/stepproof
          npm install -g github:StanislavBG/agent-comply
          npm install -g github:StanislavBG/agent-gate
          npm install -g github:StanislavBG/agent-shift
          npm install -g github:StanislavBG/agent-trace
      - name: Run Preflight gates
        run: agent-gate run --config .preflight.json
        # Exits 1 on regression, compliance failure, or budget breach
      - name: Promote config to staging
        if: github.ref == 'refs/heads/main'
        run: agent-shift promote staging --require-gate-pass ./gate-result.json
        # No gate receipt → no promotion. Enforced.
Each tool solves a distinct, independently valuable problem. stepproof tests behavior. agent-comply scans for legal risk. agent-gate unifies both for CI. agent-shift handles what comes after the gate: promoting the validated config to production with a rollback path if something goes wrong post-deploy. agent-trace is the observability layer: wrap any agent run to record OTel spans in local SQLite — no cloud account, no dashboard, data stays on your machine.
Passing a gate is a point-in-time verdict. Promoting a config is a stateful operation with environment history, diffs, and rollback. Observing production behavior closes the loop: agent-trace traces wire back to the stepproof scenarios that defined expected behavior. You need all five — and they wire together. agent-gate writes a receipt; agent-shift verifies it; agent-trace records what actually ran.
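The receipt handoff between gate and promotion can be illustrated with a content hash. The sketch below is an assumption about how such a check might work, not the documented format of gate-result.json: the `GateReceipt` shape and the `writeReceipt` and `canPromote` functions are hypothetical. It uses Node's built-in `node:crypto` module to bind a PASS verdict to the exact config text that was tested.

```typescript
import { createHash } from "node:crypto";

// Hypothetical receipt shape; the real gate-result.json format may differ.
interface GateReceipt {
  verdict: "PASS" | "FAIL";
  configSha256: string;
}

function sha256Hex(text: string): string {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

// Gate side: record the verdict together with a hash of the tested config.
function writeReceipt(configText: string, verdict: "PASS" | "FAIL"): GateReceipt {
  return { verdict, configSha256: sha256Hex(configText) };
}

// Promotion side: allow promotion only if the gate passed AND the config
// being promoted is byte-for-byte the one the gate evaluated.
function canPromote(receipt: GateReceipt, configText: string): boolean {
  return receipt.verdict === "PASS" && receipt.configSha256 === sha256Hex(configText);
}

const testedConfig = JSON.stringify({ model: "gpt-4o-mini" });
const receipt = writeReceipt(testedConfig, "PASS");

console.log(canPromote(receipt, testedConfig)); // true: same config, gate passed
console.log(canPromote(receipt, JSON.stringify({ model: "gpt-4o" }))); // false: config changed after the gate
```

Hashing the config rather than trusting a bare PASS flag is what makes the verdict point-in-time: a receipt earned by one config cannot be reused to promote another.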
All five CLI tools are MIT-licensed, forever. The cloud dashboard (for team compliance history, PDF artifacts, and audit exports) is where we charge.
Team compliance history, shared audit trails, PDF reports for legal, agent-shift promotion timelines. Join the list to get early access.
Install any tool and run it against your codebase. No account. No config file required to start.
agent-gate run