Private early access · 12 teams onboarded

Stop chasing logs.
Know why your deploy failed.

NessForge correlates CI pipeline output, service traces, and your recent diff to surface the root cause — before you're paging your team at 2am.

$ nessforge analyze --deploy staging-deploy-1847
▸ Fetching pipeline run #1847 from GitHub Actions...
▸ Correlating 3 service traces across staging cluster...
▸ Comparing against 14 commits merged since last stable deploy...
✗ Deploy failed. Root cause identified (confidence: 91%)
  auth-service v2.14.1 → v2.15.0
  └─ Config key SESSION_TTL renamed to SESSION_TIMEOUT_MS
     auth-proxy still reads SESSION_TTL → received undefined → crashed
  Introduced in: 9f3c2a1 · Alice Chen · 2h ago
  → nessforge.com/investigate/deploy-1847

The investigation shouldn't take longer than the outage

When a deployment fails across a distributed system, the investigation typically starts with five or six tabs: GitHub Actions for the pipeline run, Datadog or Grafana for service metrics, kubectl events for pod crashes, your changelog tool for recent PRs, and Slack to find out what else changed that afternoon. Twenty minutes in, you've found the change. Another hour to confirm which downstream service it broke, and why.

The underlying data exists — it's just scattered. CI logs, git history, environment configs, and service dependency information all live in separate systems with no shared vocabulary. NessForge builds a live model of the causal relationships between your CI runs, service graph, config changes, and deploy history. When something breaks, you see the chain immediately instead of assembling it by hand in the middle of an incident.

What NessForge does

Deployment root cause analysis

After a failed deploy, NessForge presents a ranked list of probable causes, each linked to the specific commit, config value, or environment variable that changed. It doesn't just show which service failed. It reconstructs the cascade: where the chain started, what propagated the failure, and why CI passed anyway.

Pre-merge risk scoring

Before a change hits your main branch, NessForge compares the diff against your pipeline history to flag statistically unusual patterns. Not "this might break things," but "this changes the SESSION_TIMEOUT_MS config key, and the last 3 times that key changed, auth-proxy failed to start within 60 seconds."
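
To make the idea concrete, here is a minimal sketch of history-based risk flagging. It is an illustration under stated assumptions, not NessForge's actual model: the function names, the shape of the history records, and the 0.5 threshold are all hypothetical.

```python
from collections import defaultdict


def key_failure_rates(history):
    """Map each config key to {service: share of that key's changes followed by a failure}.

    history is a hypothetical list of (changed_keys, failed_services) pairs,
    one pair per past deploy.
    """
    changes = defaultdict(int)
    failures = defaultdict(lambda: defaultdict(int))
    for changed_keys, failed_services in history:
        for key in changed_keys:
            changes[key] += 1
            for service in failed_services:
                failures[key][service] += 1
    return {
        key: {svc: n / changes[key] for svc, n in failures[key].items()}
        for key in changes
    }


def flag_diff(diff_keys, history, threshold=0.5):
    """Return (key, service, rate) warnings for keys in the diff with a risky past."""
    rates = key_failure_rates(history)
    warnings = []
    for key in sorted(diff_keys):
        for service, rate in sorted(rates.get(key, {}).items()):
            if rate >= threshold:
                warnings.append((key, service, rate))
    return warnings
```

Given a history in which auth-proxy failed after each of three changes to one key, a diff touching that key would produce a single warning naming the key, the service, and a rate of 1.0.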

Pipeline health monitoring

NessForge tracks build time trends, flaky test clusters, and failure rates by service and team over time. It flags when your CI is quietly accumulating technical debt — a test that's been intermittent for two weeks, a build step that doubled in duration after a dependency upgrade — before it compounds into an incident.
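
Two of these signals are simple enough to sketch. The example below is a rough illustration, not the product's implementation: it assumes a test counts as flaky when it both passes and fails on the same commit, and that a build step has regressed when its median duration at least doubles. The function names and record shapes are hypothetical.

```python
from collections import defaultdict
from statistics import median


def find_flaky_tests(runs):
    """Return names of tests that both passed and failed on the same commit.

    runs is a hypothetical iterable of (commit_sha, test_name, passed) tuples.
    """
    outcomes = defaultdict(set)
    for sha, test, passed in runs:
        outcomes[(sha, test)].add(passed)
    return sorted({test for (_, test), seen in outcomes.items() if seen == {True, False}})


def duration_regressed(before, after, factor=2.0):
    """True when the median duration after a change is at least factor times the median before."""
    return median(after) >= factor * median(before)
```

A test that fails on one commit and passes on a different one is not flagged here, since the code itself changed between runs.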

How it works

No agent to install, no changes to your pipeline YAML. NessForge reads from your existing CI provider and git host.

Connect your CI provider

Authorize NessForge over OAuth with GitHub Actions, GitLab CI, or CircleCI. It indexes your pipeline history (the last 90 days by default) in about 5 minutes.

NessForge maps your service graph

It reads job definitions and deploy step patterns across your repos to build a model of which services get built, tested, and deployed together — and which ones share config or talk to each other.
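
One simple way to picture this mapping is a co-deployment graph. The sketch below is an assumption-laden illustration rather than a description of NessForge internals: it takes a hypothetical list of jobs, each already reduced to the service names it deploys, and counts how often each pair of services ships together.

```python
from collections import defaultdict
from itertools import combinations


def build_service_graph(jobs):
    """Return {service: {neighbor: times deployed together}}.

    jobs is a hypothetical list of service-name lists, one list per pipeline job.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for services in jobs:
        # Deduplicate within a job, then count every unordered pair once.
        for a, b in combinations(sorted(set(services)), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return {svc: dict(neighbors) for svc, neighbors in graph.items()}
```

Services that only ever deploy alone do not appear in the result, and a real model would also need the shared-config and service-to-service signals mentioned above.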

Every deploy gets correlated in real time

On each new deploy, NessForge compares incoming changes against known-good baselines and historical failure patterns. It knows what "normal" looks like for each service, so anomalies surface immediately.
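
As a rough illustration of what comparing against a baseline can mean, the sketch below flags a metric that sits more than a few standard deviations from its history. The z-score approach, the function name, and the threshold of 3.0 are assumptions chosen for the example, not a statement of how NessForge scores anomalies.

```python
from statistics import mean, stdev


def is_anomalous(baseline, value, z_threshold=3.0):
    """True when value is more than z_threshold standard deviations from the baseline mean.

    baseline is a hypothetical list of past measurements for one service,
    such as startup time in seconds.
    """
    if len(baseline) < 2:
        return False  # Not enough history to judge.
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > z_threshold
```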

Structured report when something breaks

Failures produce a ranked root cause report: the probable cause, the change that introduced it, affected services, and a confidence score based on how well the evidence fits. You get the investigation — not just the alert.
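
The shape of such a report might look like the sketch below. The field names and the ranking rule (highest confidence first) are hypothetical, picked to mirror the elements listed above rather than taken from a published NessForge schema.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cause:
    """One candidate root cause; all field names are illustrative."""
    summary: str
    commit: str
    affected_services: List[str]
    confidence: float  # Between 0.0 and 1.0.


def rank_causes(causes):
    """Order candidate causes from most to least confident."""
    return sorted(causes, key=lambda c: c.confidence, reverse=True)
```

With the sample output from the top of the page, the top entry would carry the SESSION_TTL rename, commit 9f3c2a1, and a confidence of 0.91.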

Used by teams shipping fast

"We cut our mean time to root cause from about 45 minutes to under 5. The first time NessForge pointed us at the exact commit and config key before we'd even opened Datadog, we knew this was the right tool."

— Engineering Lead, Series B SaaS company (name withheld during early access)

Join the early access cohort

We're working with a small number of engineering teams to validate the analysis models against real production environments. Spots are limited.