An issue represents a single metric violation on a specific target (application, host, pod, or other entity). When a rule's threshold is breached, Atatus creates an issue for each affected target and groups it into an incident based on the policy's incident preference.
Issues vs incidents
| Concept | What it represents | Example |
|---|---|---|
| Issue | One metric on one target breached a threshold | "Web Response Time exceeded 2s on checkout-service" |
| Incident | A container that groups one or more related issues | "Production API Health — Critical" (may contain multiple issues) |
A single incident can contain multiple issues. For example, if CPU usage spikes on three hosts, the policy creates three issues (one per host) grouped into one or more incidents depending on the incident preference.
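As a rough illustration of how the same three issues can end up in one incident or several, the sketch below groups issues by a key derived from the incident preference. The preference names (`per_policy`, `per_rule`, `per_rule_and_target`) and the dictionary layout are hypothetical, chosen only for this example; the actual options are whatever the policy's incident preference setting offers.

```python
from collections import defaultdict

# Hypothetical preference names, used only for illustration.
KEY_FUNCS = {
    "per_policy": lambda i: (i["policy"],),
    "per_rule": lambda i: (i["policy"], i["rule"]),
    "per_rule_and_target": lambda i: (i["policy"], i["rule"], i["target"]),
}


def group_issues(issues, preference):
    """Group issue dicts into incidents keyed by the chosen preference."""
    key_of = KEY_FUNCS[preference]
    incidents = defaultdict(list)
    for issue in issues:
        incidents[key_of(issue)].append(issue)
    return dict(incidents)


# Three issues from one CPU rule, one per host (illustrative names).
issues = [
    {"policy": "Infra", "rule": "CPU high", "target": host}
    for host in ("host-1", "host-2", "host-3")
]

print(len(group_issues(issues, "per_policy")))           # one shared incident
print(len(group_issues(issues, "per_rule_and_target")))  # one incident per host
```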
Issue details
Every issue records:
| Field | Description |
|---|---|
| Rule | The alert rule that was violated |
| Policy | The alert policy the rule belongs to |
| Target | The application, host, or entity where the violation occurred |
| Metric | The specific metric that breached (e.g., Web Response Time, CPU Used Percentage) |
| Severity | Critical or Warning — determined by which threshold tier was breached |
| Operator | The comparison used (above, below, or equal) |
| Threshold | The configured threshold value that the metric breached |
| Duration | How long the issue has been open |
| Start time | When the violation was first detected |
| Status | Opened or Closed |
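For readers who think in code, the fields above can be modeled as a small record type. This is a minimal sketch, not the Atatus data model: the class name, attribute names, and types are assumptions, and duration is derived from the start time rather than stored.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Issue:
    # Attribute names mirror the table above; they are illustrative only.
    rule: str
    policy: str
    target: str
    metric: str
    severity: str          # "Critical" or "Warning"
    operator: str          # "above", "below", or "equal"
    threshold: float
    start_time: datetime
    status: str = "Opened"  # "Opened" or "Closed"

    def duration(self, now: datetime) -> timedelta:
        """How long the issue has been open as of `now`."""
        return now - self.start_time


issue = Issue(
    rule="Web Response Time above 2s",
    policy="Production API Health",
    target="checkout-service",
    metric="Web Response Time",
    severity="Critical",
    operator="above",
    threshold=2.0,
    start_time=datetime(2024, 1, 1, 14, 5, tzinfo=timezone.utc),
)
print(issue.duration(datetime(2024, 1, 1, 14, 12, tzinfo=timezone.utc)))
```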
Issue lifecycle
Threshold breached    Condition resolves
  │                           │
  ▼                           ▼
Opened ──────────────────► Closed
  │                           ▲
  │        User closes        │
  └───────────────────────────┘
- Opened — The metric crossed the threshold for the configured duration. Atatus creates the issue, links it to an incident, and sends notifications.
- Closed — The condition resolved (metric returned to normal) or a user manually closed the issue.
Issues do not have an "Acknowledged" state — acknowledgment happens at the incident level.
Severity levels
| Severity | When it triggers |
|---|---|
| Critical | The metric breached the Critical threshold |
| Warning | The metric breached the Warning threshold but not the Critical threshold |
If both Warning and Critical thresholds are configured and the metric exceeds both, the issue is created with Critical severity.
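The precedence rule above can be sketched as a small function that checks the Critical threshold first and falls back to Warning. The function name and signature are hypothetical, and treating "above" and "below" as strict comparisons is an assumption; the source does not say how a value exactly on the threshold is handled.

```python
import operator
from typing import Optional

# Assumed mapping from operator names to comparisons (strict for above/below).
_COMPARE = {"above": operator.gt, "below": operator.lt, "equal": operator.eq}


def severity_for(
    value: float,
    op: str,
    warning: Optional[float] = None,
    critical: Optional[float] = None,
) -> Optional[str]:
    """Return "Critical", "Warning", or None when no threshold is breached."""
    breached = _COMPARE[op]
    # Critical is checked first so it wins when both thresholds are breached.
    if critical is not None and breached(value, critical):
        return "Critical"
    if warning is not None and breached(value, warning):
        return "Warning"
    return None


print(severity_for(2.5, "above", warning=1.0, critical=2.0))
print(severity_for(1.5, "above", warning=1.0, critical=2.0))
```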
Closing issues
Issues can be closed in two ways:
- Automatic — The alerting engine detects that the metric has returned to normal (no longer violating the threshold). The issue is closed automatically and the `closedBy` field is set to "Atatus".
- Manual — A user closes the issue from the Alerting page. The `closedBy` field records the user's name.
When all issues in an incident are closed, the incident is also closed automatically.
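The closing behavior described above can be modeled roughly as follows. This is a sketch under assumptions: the class and function names are made up for illustration, the incident is given only two states here, and only the `closedBy` values ("Atatus" or the user's name) come from the text above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Issue:
    name: str
    status: str = "Opened"
    closed_by: Optional[str] = None  # mirrors the closedBy field


@dataclass
class Incident:
    issues: List[Issue] = field(default_factory=list)
    status: str = "Opened"  # simplified; real incidents may have more states


def close_issue(incident: Incident, issue: Issue, user: Optional[str] = None) -> None:
    """Close an issue; with no user given, treat it as an automatic close."""
    issue.status = "Closed"
    issue.closed_by = user if user is not None else "Atatus"
    # The incident closes once every issue it contains is closed.
    if all(i.status == "Closed" for i in incident.issues):
        incident.status = "Closed"


a, b = Issue("cpu on host-1"), Issue("cpu on host-2")
incident = Incident(issues=[a, b])
close_issue(incident, a)                # automatic close
close_issue(incident, b, user="Priya")  # manual close by a (made-up) user
```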
Viewing issues
Navigate to Alerting > Issues to see all issues. You can filter by:
- Status — Open or Closed
- Incident — View issues belonging to a specific incident
- Project — Filter by application or project
- Search — Search by rule name, target name, or metric
Each issue links to its parent incident and shows a chart of the metric's value over time relative to the configured threshold.
Example
A policy "Production API Health" has a rule: Web Response Time above 2 seconds for 5 minutes, using the "all" time function.
At 14:00, the checkout-service application starts responding slowly. For every minute from 14:00 to 14:05, the average response time exceeds 2 seconds.
At 14:05, the alerting engine evaluates the rule:
1. Issue created — "Web Response Time exceeded 2 seconds on checkout-service" with severity Critical
2. The issue is grouped into an incident based on the policy's incident preference
3. Notifications are sent to the configured channels
At 14:12, response time drops back below 2 seconds and stays there for the full evaluation window. The engine closes the issue automatically.
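The evaluation in this example can be approximated with a check that every sample in the window breaches the threshold. This is a simplified sketch, assuming one averaged sample per minute and a strict "above" comparison; the function name and sample values are illustrative, and the real engine's windowing may differ.

```python
from typing import Sequence


def breaches_all(samples: Sequence[float], threshold: float, window: int) -> bool:
    """True when each of the last `window` samples is above the threshold."""
    if len(samples) < window:
        return False  # not enough data to cover the full window
    return all(value > threshold for value in samples[-window:])


# Per-minute average response times in seconds (made-up values).
# The first sample is from before the slowdown; the next five are slow minutes.
response_times = [1.2, 2.4, 2.6, 2.3, 2.8, 2.5]

print(breaches_all(response_times, threshold=2.0, window=5))
```

With an "all" style check, a single sample at or below the threshold inside the window is enough to keep the rule from firing, which is why the example requires every minute to exceed 2 seconds.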