Overview
Logs alerts let you define rules that monitor log event volume and size across your infrastructure. When the count or byte volume of log events crosses a threshold you specify, Atatus opens an incident and notifies the channels attached to the alert policy.
Every logs alert rule uses the Logs Metric rule type. Rules are evaluated against the logs.events_v3 ClickHouse table in per-minute buckets.
Available metrics
| Metric | Summary Function | Unit |
|---|---|---|
| Event Count | throughput | events |
| Used Bytes | throughput | bytes |
Both metrics are queried from the logs.events_v3 table.
Targets
Logs alert rules do not use project-level targets. Use filters and Group By to scope which log events the rule evaluates.
Filters
Logs rules support two filtering mechanisms:
ATQL filter strings
The logsFilters field accepts ATQL (Atatus Trace Query Language) filter strings. These are free-form query expressions that match against log event attributes.
Filter conditions
The filterConditions field accepts structured conditions with the following operators:
| Operator | Description |
|---|---|
| is | Exact match |
| is not | Exclude exact match |
| contains | Substring match |
| not contains | Exclude substring match |
You can combine multiple filter conditions. All conditions must be satisfied for a log event to be included in the evaluation.
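The AND semantics described above can be sketched in Python. This is a minimal illustration and not the Atatus implementation: the dictionary shape of a condition (`field`, `operator`, `value`), the function name `matches_all`, and the handling of missing attributes are assumptions made for the example. Only the four operator names come from the table above.

```python
from typing import Callable, Dict, List

# Operator names come from the table above. The implementations are an
# assumed reading of "exact match" and "substring match" on string values.
OPERATORS: Dict[str, Callable[[str, str], bool]] = {
    "is": lambda actual, expected: actual == expected,
    "is not": lambda actual, expected: actual != expected,
    "contains": lambda actual, expected: expected in actual,
    "not contains": lambda actual, expected: expected not in actual,
}


def matches_all(event: Dict[str, str], conditions: List[Dict[str, str]]) -> bool:
    """Return True only if the event satisfies every condition (logical AND)."""
    return all(
        # Assumption: a missing attribute is treated as an empty string.
        OPERATORS[c["operator"]](event.get(c["field"], ""), c["value"])
        for c in conditions
    )


conditions = [
    {"field": "level", "operator": "is", "value": "error"},
    {"field": "environment", "operator": "is", "value": "production"},
]
event = {"level": "error", "environment": "production", "host": "web-1"}
print(matches_all(event, conditions))  # True
```

An event from a staging environment would fail the second condition and be excluded from evaluation, even though it matches the first.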
Group By
Group By is optional. When set, the rule splits evaluation by a log attribute such as service, level, or host. Each distinct value of the grouped attribute is evaluated independently, and each value that violates the threshold creates a separate incident.
If Group By is omitted, all matching log events are aggregated into a single value for threshold comparison.
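The difference between grouped and ungrouped evaluation can be illustrated with a short Python sketch. It assumes log events are plain dictionaries and uses a hypothetical `_all` key to stand for the single aggregated value. Neither reflects how Atatus stores or names things internally.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional


def aggregate(events: Iterable[dict], group_by: Optional[str] = None) -> Dict[str, int]:
    """Count events per distinct group_by value, or in one bucket when group_by is None."""
    counts: Dict[str, int] = defaultdict(int)
    for event in events:
        # "_all" is a hypothetical key standing in for the single ungrouped value.
        key = str(event.get(group_by)) if group_by else "_all"
        counts[key] += 1
    return dict(counts)


events = [
    {"service": "checkout", "level": "error"},
    {"service": "checkout", "level": "info"},
    {"service": "search", "level": "error"},
]
print(aggregate(events, group_by="service"))  # {'checkout': 2, 'search': 1}
print(aggregate(events))                      # {'_all': 3}
```

With Group By set, each key in the result is compared against the threshold on its own, so `checkout` and `search` could each open a separate incident. Without it, only the single combined value is compared.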
Evaluation logic
The alerting engine queries ClickHouse with a structure similar to:
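```sql
SELECT
    groupByField,
    -- "operator" and "threshold" stand for the rule's comparison and threshold value
    countIf(value operator threshold) AS violationCount,
    count() AS totalCount
FROM (
    SELECT
        groupByField,
        count() AS value
    FROM logs.events_v3
    WHERE accountId = :accountId
        AND <filters>
    GROUP BY groupByField
)
-- The outer aggregation is also grouped so that groupByField can be selected
GROUP BY groupByField
```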
The supported operators are:
| Operator | Triggers when |
|---|---|
| above | Value is greater than or equal to the threshold |
| below | Value is less than or equal to the threshold |
| equal | Value equals the threshold |
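The operator semantics in the table can be expressed as a small Python sketch. Note that above and below are inclusive, so a value exactly equal to the threshold triggers both. The function name `violating_groups` and the input shape (a mapping of group value to aggregated metric) are assumptions for illustration only.

```python
import operator
from typing import Callable, Dict, List

# Mapping taken from the table above: "above" and "below" include the threshold.
COMPARATORS: Dict[str, Callable[[float, float], bool]] = {
    "above": operator.ge,
    "below": operator.le,
    "equal": operator.eq,
}


def violating_groups(values: Dict[str, float], op: str, threshold: float) -> List[str]:
    """Return the groups whose aggregated value violates the threshold."""
    compare = COMPARATORS[op]
    return [group for group, value in values.items() if compare(value, threshold)]


values = {"web-1": 1000, "web-2": 999, "web-3": 1500}
print(violating_groups(values, "above", 1000))  # ['web-1', 'web-3']
```

Here `web-1` sits exactly on the threshold and still counts as a violation under above.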
Examples
Log volume spike
Detect when total log ingestion exceeds a byte threshold, which may indicate a noisy deployment or a logging loop.
| Field | Value |
|---|---|
| Metric | Used Bytes |
| Operator | above |
| Threshold | 500000000 (500 MB) |
| Group By | service |
| Filters | none |
This rule evaluates log byte volume per service in per-minute buckets. If any single service produces 500 MB or more of logs in an evaluation window (the above operator includes the threshold itself), an incident is opened for that service.
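As a worked illustration of this rule, the following Python sketch sums bytes per service and flags any service at or over the threshold. The event shape and the `bytes` field name are hypothetical, and the sample sizes are invented purely to make the arithmetic visible. The inclusive comparison follows the operator table in the Evaluation logic section.

```python
from collections import defaultdict
from typing import Dict, List

THRESHOLD_BYTES = 500_000_000  # 500 MB (decimal), as in the rule above


def services_over_threshold(events: List[dict], threshold: int = THRESHOLD_BYTES) -> List[str]:
    """Sum bytes per service and return the services that would open an incident."""
    used: Dict[str, int] = defaultdict(int)
    for event in events:
        # "bytes" is a hypothetical field name for the size of one log event.
        used[event["service"]] += event["bytes"]
    # "above" is inclusive: a total equal to the threshold also triggers.
    return sorted(service for service, total in used.items() if total >= threshold)


# Invented sample data for a single evaluation window.
events = [
    {"service": "checkout", "bytes": 300_000_000},
    {"service": "checkout", "bytes": 250_000_000},
    {"service": "search", "bytes": 120_000_000},
]
print(services_over_threshold(events))  # ['checkout']
```

In this sample, `checkout` totals 550,000,000 bytes and triggers, while `search` at 120,000,000 bytes does not.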
Error log frequency
Alert when a high number of error-level log events appear from a specific environment.
| Field | Value |
|---|---|
| Metric | Event Count |
| Operator | above |
| Threshold | 1000 |
| Group By | host |
| Filter conditions | level is error, environment is production |
This rule counts log events where level is error and environment is production, grouped by host. If any host reaches 1,000 or more error log events in the evaluation window, an incident is created for that host.