Overview
Analytics alerts let you define rules that monitor API request performance and error rates. These rules evaluate metrics from the analytics.api_requests ClickHouse table, covering throughput, duration percentiles, and HTTP error counts.
Every analytics alert rule uses the Analytics Metric rule type.
Available metrics
| Metric | Summary Function | Unit |
|---|---|---|
| Throughput | throughput | calls |
| Response Time - avg | average | seconds |
| Response Time - p75 | average | seconds |
| Response Time - p90 | average | seconds |
| Response Time - p95 | average | seconds |
| 4XX Failure Count | throughput | — |
| 5XX Failure Count | throughput | — |
| 4XX and 5XX Failure Count | throughput | — |
All metrics are queried from the analytics.api_requests table.
Enter response time thresholds in seconds. The alert engine converts units automatically.
Targets
Analytics alert rules do not use project-level targets. Use Group By to scope which API endpoints or dimensions the rule evaluates.
Filters
Analytics rules inherit the standard filter mechanism. You can narrow evaluation to specific API endpoints, status codes, or other request attributes using filter conditions.
Group By
Group By is a required field for analytics alert rules. It groups the evaluation by API endpoint or another request dimension. Each group is evaluated independently against the threshold, and each group that violates the threshold creates a separate incident.
Common Group By values include:
- API endpoint path
- HTTP method
- Service name
- Status code category
Evaluation logic
The supported operators are:
| Operator | Triggers when |
|---|---|
above |
Value is greater than or equal to the threshold |
below |
Value is less than or equal to the threshold |
equal |
Value equals the threshold |
Each group produced by the Group By field is compared against the threshold independently. If three API endpoints all exceed the threshold, three separate incidents are opened.
Examples
API latency alert
Detect when the 95th percentile response time for any API endpoint exceeds an acceptable limit.
| Field | Value |
|---|---|
| Metric | Response Time - p95 |
| Operator | above |
| Threshold | 3 (3 seconds) |
| Group By | API endpoint |
This rule evaluates the 95th percentile duration for each API endpoint. If any endpoint's P95 latency exceeds 3 seconds in the evaluation window, an incident is opened for that endpoint. This helps catch slow endpoints before they affect user experience.
5XX error spike
Alert when server error responses surge for any API endpoint, indicating a potential backend failure.
| Field | Value |
|---|---|
| Metric | 5XX Failure Count |
| Operator | above |
| Threshold | 50 |
| Group By | API endpoint |
This rule counts 5XX responses per API endpoint. If any endpoint returns more than 50 server errors in the evaluation window, an incident is created. Pairing this with a throughput rule gives visibility into whether errors correlate with traffic spikes.
+1-415-800-4104