Overview

Logs alerts let you define rules that monitor log event volume and size across your infrastructure. When the count or byte volume of log events crosses a threshold you specify, Atatus opens an incident and notifies the channels attached to the alert policy.

Every logs alert rule uses the Logs Metric rule type. Rules are evaluated against the logs.events_v3 ClickHouse table in per-minute buckets.

Available metrics

Metric       Summary Function  Unit
Event Count  throughput        events
Used Bytes   throughput        bytes

Both metrics are queried from the logs.events_v3 table.

Targets

Logs alert rules do not use project-level targets. Use filters and Group By to scope which log events the rule evaluates.

Filters

Logs rules support two filtering mechanisms:

ATQL filter strings

The logsFilters field accepts ATQL (Atatus Trace Query Language) filter strings. These are free-form query expressions that match against log event attributes.

Filter conditions

The filterConditions field accepts structured conditions with the following operators:

Operator      Description
is            Exact match
is not        Exclude exact match
contains      Substring match
not contains  Exclude substring match

You can combine multiple filter conditions. All conditions must be satisfied for a log event to be included in the evaluation.
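
The AND semantics of filter conditions can be illustrated with a minimal Python sketch. This is not Atatus code: the tuple layout for a condition, the `matches` helper, and the treatment of a missing attribute as an empty string are assumptions made for illustration, and matching is assumed to be case-sensitive. Only the four operator names come from the table above.

```python
from typing import Callable, Dict, List, Tuple

# Operator names come from the documentation; the comparison semantics
# (case-sensitive string matching) are an assumption of this sketch.
OPERATORS: Dict[str, Callable[[str, str], bool]] = {
    "is": lambda actual, expected: actual == expected,
    "is not": lambda actual, expected: actual != expected,
    "contains": lambda actual, expected: expected in actual,
    "not contains": lambda actual, expected: expected not in actual,
}

# Hypothetical representation of one condition: (attribute, operator, value).
Condition = Tuple[str, str, str]


def matches(event: Dict[str, str], conditions: List[Condition]) -> bool:
    """Return True only if the event satisfies every condition (logical AND)."""
    return all(
        # A missing attribute is treated as an empty string (assumption).
        OPERATORS[op](event.get(attr, ""), value)
        for attr, op, value in conditions
    )


conditions = [("level", "is", "error"), ("environment", "is", "production")]
events = [
    {"level": "error", "environment": "production", "host": "web-1"},
    {"level": "error", "environment": "staging", "host": "web-2"},
    {"level": "info", "environment": "production", "host": "web-1"},
]
matched = [e for e in events if matches(e, conditions)]
print(len(matched))  # 1
```

Only the first event passes, because it is the only one that satisfies both conditions at once.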

Group By

Group By is optional. When set, the rule splits evaluation by a log attribute such as service, level, or host. Each distinct value of the grouped attribute is evaluated independently, and each value that violates the threshold creates a separate incident.

If Group By is omitted, all matching log events are aggregated into a single value for threshold comparison.
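
The difference between grouped and ungrouped evaluation can be sketched in Python as follows. The `count_events` helper, the `"*"` key used for the single aggregated value, and the `"unknown"` fallback for events that lack the attribute are all hypothetical choices for this illustration, not part of the product.

```python
from collections import Counter
from typing import Dict, Iterable, Optional


def count_events(
    events: Iterable[Dict[str, str]], group_by: Optional[str] = None
) -> Dict[str, int]:
    """Count events per distinct value of group_by, or in total when it is None."""
    if group_by is None:
        # No Group By: all matching events collapse into one aggregated value.
        return {"*": sum(1 for _ in events)}
    # With Group By: one value per distinct attribute value, compared independently.
    return dict(Counter(e.get(group_by, "unknown") for e in events))


events = [{"service": "api"}, {"service": "api"}, {"service": "worker"}]
print(count_events(events, "service"))  # {'api': 2, 'worker': 1}
print(count_events(events))             # {'*': 3}
```

With Group By set to service, a threshold of 2 would be compared against `api` and `worker` separately; without it, the threshold is compared against the single total of 3.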

Evaluation logic

The alerting engine queries ClickHouse with a structure similar to:

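-- `operator` and `threshold` stand in for the rule's comparison and threshold value.
-- Outer query: count the rows whose aggregated value violates the threshold.
SELECT
  groupByField,
  countIf(value operator threshold) AS violationCount,
  count() AS totalCount
FROM (
  -- Inner query: aggregate the matching log events per group.
  SELECT
    groupByField,
    count() AS value
  FROM logs.events_v3
  WHERE accountId = :accountId
    AND <filters>
  GROUP BY groupByField
)
GROUP BY groupByField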

The supported operators are:

Operator  Triggers when
above     Value is greater than or equal to the threshold
below     Value is less than or equal to the threshold
equal     Value equals the threshold
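
Because above and below are inclusive, a value exactly at the threshold triggers both. A small Python sketch of the comparison table makes this explicit; the `violates` function and the `COMPARATORS` mapping are hypothetical names used only for this example.

```python
import operator
from typing import Callable, Dict

# Mapping of rule operators to comparisons, following the table above.
COMPARATORS: Dict[str, Callable[[float, float], bool]] = {
    "above": operator.ge,  # greater than or equal to
    "below": operator.le,  # less than or equal to
    "equal": operator.eq,
}


def violates(value: float, op: str, threshold: float) -> bool:
    """Return True when the value triggers the rule for the given operator."""
    return COMPARATORS[op](value, threshold)


print(violates(1000, "above", 1000))  # True: the boundary value triggers
print(violates(999, "above", 1000))   # False
print(violates(1000, "below", 1000))  # True: the boundary value triggers
```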

Examples

Log volume spike

Detect when log ingestion for a service reaches a byte threshold, which may indicate a noisy deployment or a logging loop.

Field      Value
Metric     Used Bytes
Operator   above
Threshold  500000000 (500 MB)
Group By   service
Filters    none

This rule evaluates log byte volume per service in each one-minute bucket. Because the above operator is inclusive, an incident is opened for any single service that produces 500 MB or more of logs in an evaluation window.
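
As a rough illustration of the per-service, per-minute evaluation, the following Python sketch sums bytes into one-minute buckets and reports the services that reach the threshold in any bucket. The `LogEvent` shape (epoch-second `timestamp`, `service`, `size_bytes`) is hypothetical, and treating a single violating bucket as enough to flag a service is an assumption of this sketch rather than documented behavior.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Set, Tuple

THRESHOLD_BYTES = 500_000_000  # 500 MB, as in the rule above


class LogEvent(NamedTuple):
    # Hypothetical event shape used only for this sketch.
    timestamp: int   # epoch seconds
    service: str
    size_bytes: int


def services_over_threshold(
    events: Iterable[LogEvent], threshold: int = THRESHOLD_BYTES
) -> Set[str]:
    """Return services whose byte volume reaches the threshold in any one-minute bucket."""
    totals: Dict[Tuple[str, int], int] = defaultdict(int)
    for e in events:
        minute = e.timestamp // 60  # epoch seconds -> minute bucket
        totals[(e.service, minute)] += e.size_bytes
    # "above" is inclusive, so compare with >=.
    return {service for (service, _), total in totals.items() if total >= threshold}


events = [
    LogEvent(0, "checkout", 300_000_000),
    LogEvent(30, "checkout", 250_000_000),  # same minute: 550 MB in total
    LogEvent(30, "search", 300_000_000),
    LogEvent(90, "search", 300_000_000),    # next minute: buckets stay at 300 MB each
]
print(sorted(services_over_threshold(events)))  # ['checkout']
```

Here `search` logs 600 MB overall but never reaches 500 MB within a single minute, so only `checkout` is flagged.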

Error log frequency

Alert when a high number of error-level log events appear from a specific environment.

Field              Value
Metric             Event Count
Operator           above
Threshold          1000
Group By           host
Filter conditions  level is error, environment is production

This rule counts log events where level is error and environment is production, grouped by host. Because the above operator is inclusive, an incident is created for any host that reaches 1,000 or more error log events in the evaluation window.