Aggregation¶

Count¶

* | count()
event_id=1 | count()

Count with Field¶

Count non-null values of a specific field, or use unique=true for distinct counting:

* | count(computer_name)
* | count(computer_name, unique=true)

Sum / Avg / Max / Min¶

* | sum(bytes)
* | avg(response_time)
* | max(bytes)
* | min(response_time)

Percentile and Standard Deviation¶

* | percentile(response_time)
* | stdDev(response_time)

Returns p50, p75, p99 for percentile().

Median and MAD¶

* | median(response_time)
* | mad(response_time)
* | groupBy(computer_name, function=mad(latency)) | _mad > 50

median() returns the median value (_median). mad() returns both the median (_median) and the median absolute deviation (_mad), a robust measure of variability resistant to outliers. Unlike standard deviation, MAD is not skewed by extreme values, making it ideal for anomaly detection on noisy data.

Select First / Last¶

Return the value from the earliest or latest event in each group:

* | groupBy(user) | selectFirst(timestamp)
* | groupBy(user) | selectLast(timestamp)

Multiple Aggregations with Multi¶

* | multi(count(), avg(response_time), sum(bytes))

Count with Named Parameters¶

* | groupby(user) | multi(count(field=event_id, distinct=true, as=unique_events))
* | groupby(user) | multi(count(field=event_id, as=total))

Use distinct=true or unique=true for unique counts (uniqExact), and as= to name the output column.

Collect (groupArray)¶

* | groupby(user) | multi(collect(image))

Collects all values of a field into an array per group.

Top (Frequency Distribution)¶

* | groupby(user) | multi(top(field=event_id, percent=true, as=top_events))

Shows the top values with their frequency. Use percent=true to show percentages.

Skewness and Kurtosis¶

* | skewness(response_time)
* | kurtosis(response_time)

skewness() returns the population skewness (_skewness). kurtosis() returns the population excess kurtosis (_kurtosis). Both work on chained aggregation outputs:

* | groupby(computer_name, function=count()) | skewness(_count)
* | groupby(computer_name, function=count()) | kurtosis(_count)

Frequency¶

Build a frequency table with count, percentage, and cumulative percentage:

* | frequency(event_name)
* | frequency(status_code)

Returns value, _count, _percentage, and _cumulative_pct columns, sorted by count descending.

IQR (Interquartile Range)¶

* | iqr(response_time)
* | groupby(computer_name, function=count()) | iqr(_count)

Returns _q1 (25th percentile), _q3 (75th percentile), and _iqr (Q3 - Q1).

Head/Tail (Pareto Analysis)¶

Segment values into "head" and "tail" groups based on cumulative percentage (80/20 rule):

* | headTail(event_name)
* | headTail(computer_name, threshold=90)

Returns value, _count, _percentage, _cumulative_pct, and _segment (head or tail). Default threshold is 80%.

Modified Z-Score¶

Per-row modified z-score using median and MAD, robust to outliers:

* | modifiedZScore(response_time)

Returns _median, _mad, _modified_z for each row. Formula: 0.6745 * (x - median) / MAD.

MAD Outlier Detection¶

Modified z-score with an outlier flag:

* | madOutlier(response_time)
* | madOutlier(latency, threshold=2.5)

Returns _median, _mad, _modified_z, and _is_outlier (1 if |_modified_z| > threshold). Default threshold: 3.5. Filter outliers with:

* | madOutlier(response_time) | _is_outlier = 1

Analyze Fields¶

Compute statistics across all or selected fields:

* | analyzeFields()
* | analyzeFields(response_time, bytes, limit=100000)

Returns per-field stats: field_name, _events, _distinct_vals, _mean, _min, _max, _stdev. Default scan limit: 50,000 rows (max: 200,000, use limit=max for maximum).

Group By¶

* | groupBy(image)
* | groupBy(image, user)

The default aggregation function is count(). Use function= to specify a different aggregation:

* | groupBy(user, function=sum(bytes))
* | groupBy(user, function=avg(response_time))

Counting Groups¶

Piping count() after groupBy() counts the number of groups (not per-group counts). Combine with singleval() to display as a single metric:

* | groupBy(computer_name) | count() | singleval()

Distinct Count with groupBy¶

* | groupBy(computer_name, function=count(field=user, unique=true))

Multiple Aggregations with groupBy¶

Use function=multi(...) to compute multiple aggregations per group:

* | groupBy(computer_name, function=multi(count(computer_name), count(user, unique=true), sum(bytes)))