Getting Started

Rounds is an AI SRE that lives next to your existing telemetry and operations tools. Get it running in under five minutes — no schema changes, no second copy of your data.

1. Install

Pick the install path that matches your environment:

npm (single machine)Helm (Kubernetes)

bash

npx @syntropize/rounds

bash

helm upgrade --install rounds \
  oci://ghcr.io/syntropize/charts/rounds \
  --namespace observability \
  --create-namespace \
  --set secretEnv.LLM_API_KEY='replace-with-your-provider-key'

For more detail, see Install with npm or Install with Helm.

2. Open the web UI

The web UI runs on http://localhost:5173 (npm install) or whatever Ingress / port-forward you configured for the Kubernetes deployment.

The setup wizard walks you through the minimum needed to start asking questions:

Create your administrator account (name, email, password — minimum 12 characters)
Configure an LLM provider — paste an Anthropic, OpenAI, or Gemini API key, point at a local Ollama server, or supply an apiKeyHelper script for short-lived / vault-issued credentials
Add a metrics datasource — Prometheus, VictoriaMetrics, Mimir, Thanos, Cortex, or any Prometheus-API-compatible backend
Optionally add Loki for log search and a Kubernetes ops connector so investigations can inspect pods, events, rollouts, and prepare approval-gated remediations

You don't have to do step 3 or 4 in the wizard — once Rounds is running, you can also add datasources, ops connectors, and low-risk org settings by chatting with the agent ("connect my prod Prometheus at http://..."). The agent collects what it needs, previews the change, and applies it under your RBAC and the GuardedAction risk model.

3. Try a prompt

Once setup is complete, click the chat button and ask:

Create a dashboard for HTTP latency

Rounds will discover your metrics, build queries, validate them, and create a dashboard with overview stats, trend charts, and per-handler breakdowns — all grounded in your actual data.

Then try an investigation prompt:

Why is checkout latency high right now?

If metrics, logs, and a Kubernetes connector are configured, Rounds will query telemetry, inspect cluster state, write a report with citations on every claim, and recommend next actions. Mutating cluster actions are never executed silently:

When you ask the agent to do something risky, it surfaces a Run / Confirm / Apply prompt inline in chat.
When the agent is running unattended (auto-investigation triggered by a firing alert), the proposed fix is delivered as a RemediationPlan with formal Approve / Reject / Modify controls; the owning team / on-call is notified.

Common first prompts

Goal	Prompt
Build a dashboard	`Create a dashboard for checkout latency and errors`
Edit a dashboard	`Add p99 latency by route and remove panels with no data`
Understand a dashboard	`Explain what this dashboard tells me and what looks abnormal`
Create an alert	`Alert me when p95 latency is above 500ms for 10 minutes`
Investigate an alert	`Why did the high latency alert fire?`
Investigate the cluster	`Check whether Kubernetes is causing the latency spike`

What's next

Configuration — environment variables for production tuning
Chat & agents — dashboard, alert, investigation, and remediation workflows
Authentication — adding users, OAuth providers, role-based access control
API Reference — automate via REST and service account tokens

Getting Started ​

1. Install ​

2. Open the web UI ​

3. Try a prompt ​

Common first prompts ​

What's next ​