`dedup`, `head` & `fields`

Once data is off disk, these commands trim the result set further down the pipeline — removing duplicate rows, capping how many rows you keep, and dropping columns you don't need. They're filters too, just operating on the in-flight table rather than the index.

dedup — remove duplicate rows

dedup removes results that repeat a value (or combination of values) you specify, keeping the first occurrence:

... | dedup clientip                ← one row per client IP
... | dedup host, sourcetype        ← one row per host+sourcetype pair

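By default it keeps the first event per group in search order. Because event searches return the newest events first, that is normally the most recent event for each value. Events that lack the field entirely are dropped unless you set keepempty=true. This is useful for "show me the distinct X" without a full stats.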
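
The keep-the-first-occurrence rule can be sketched outside Splunk. What follows is a rough Python model of the behavior described above, not Splunk's implementation; the `dedup` function and the sample rows are hypothetical, and dropping rows that lack the field is assumed to mirror the default keepempty=false setting.

```python
def dedup(rows, *fields):
    """Keep the first row seen for each distinct combination of `fields`."""
    seen = set()
    kept = []
    for row in rows:
        # Rows missing any dedup field are dropped, mirroring the
        # default keepempty=false behavior (an assumption of this model).
        if any(f not in row for f in fields):
            continue
        key = tuple(row[f] for f in fields)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


# Hypothetical sample rows, listed in search order.
rows = [
    {"clientip": "10.0.0.1", "status": 200},
    {"clientip": "10.0.0.2", "status": 404},
    {"clientip": "10.0.0.1", "status": 500},  # repeats the first clientip
    {"status": 200},                          # has no clientip at all
]

unique = dedup(rows, "clientip")
print(unique)
```

Passing two field names, as in `dedup(rows, "clientip", "status")`, keys on the pair instead, so the third row survives because its status differs.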

head / tail — cap the row count

head N returns the first N results; tail N returns the last N, in reverse order (the final result comes first). Both default to 10 when N is omitted:

... | head 20                       ← first 20 rows
... | tail 20                       ← last 20 rows

head is also a cheap way to sanity-check a search on a small sample before running it over everything.
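
The difference in ordering between the two is easy to miss. A minimal Python sketch, assuming results behave like an ordered list (the `head` and `tail` functions here are illustrative stand-ins, not Splunk code):

```python
def head(rows, n=10):
    """First n rows, in their original order."""
    return rows[:n]


def tail(rows, n=10):
    """Last n rows, emitted in reverse order (final row first)."""
    if n <= 0:
        return []
    return rows[-n:][::-1]


rows = list(range(1, 8))      # stand-in for seven results, 1 through 7
first_three = head(rows, 3)   # rows 1, 2, 3
last_three = tail(rows, 3)    # rows 7, 6, 5
print(first_three, last_three)
```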

fields — remove columns

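fields trims columns. + (the default) keeps only the named fields; - removes them. Internal fields such as _raw and _time stay in the results unless you remove them explicitly with fields -: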

... | fields + host, ip             ← keep host and ip, drop other extracted fields
... | fields - _raw, punct          ← drop noisy columns
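
As a simplified illustration of keep versus remove, the two modes can be modeled as dictionary filters in Python. The function names and sample row are hypothetical, and this sketch treats every column alike, so it ignores any special handling Splunk applies to internal underscore-prefixed fields.

```python
def fields_keep(rows, *names):
    """Model of `fields + a, b`: keep only the named columns."""
    return [{k: v for k, v in row.items() if k in names} for row in rows]


def fields_drop(rows, *names):
    """Model of `fields - a, b`: remove the named columns."""
    return [{k: v for k, v in row.items() if k not in names} for row in rows]


# Hypothetical single-row result set.
rows = [{"host": "web01", "ip": "10.0.0.1", "punct": "..-", "_raw": "GET / 200"}]

kept = fields_keep(rows, "host", "ip")
dropped = fields_drop(rows, "_raw", "punct")
print(kept, dropped)
```

On this sample both calls leave the same two columns, which shows that the choice between + and - is mostly about which list is shorter to type.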

Drop fields early for speed

Removing large unused fields (like _raw) early in the pipeline reduces the data each later command has to carry. It's a filtering optimization, not just cosmetics.

Where these sit in the pipeline

These come after your base filters and time range have already done the heavy lifting:

index=web sourcetype=access_combined earliest=-1h   ← stage 1: filter off disk
| dedup clientip                                    ← trim duplicate rows
| fields + clientip, uri_path, status               ← keep only what you need
| head 100                                          ← cap the sample
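
To make the order of operations concrete, here is a small Python sketch that chains the same three steps over a hypothetical list of already-filtered events. It is a model under simplifying assumptions (in-memory dictionaries, invented helper names), not a description of how Splunk executes the search.

```python
def dedup(rows, field):
    """Keep the first row per distinct value of `field`."""
    seen, kept = set(), []
    for row in rows:
        if field in row and row[field] not in seen:
            seen.add(row[field])
            kept.append(row)
    return kept


def keep_fields(rows, *names):
    """Keep only the named columns."""
    return [{k: row[k] for k in names if k in row} for row in rows]


def head(rows, n=10):
    """Cap the result at the first n rows."""
    return rows[:n]


# Hypothetical events that already passed the stage 1 filters.
events = [
    {"clientip": "10.0.0.1", "uri_path": "/", "status": 200, "_raw": "10.0.0.1 GET / 200"},
    {"clientip": "10.0.0.2", "uri_path": "/login", "status": 302, "_raw": "10.0.0.2 GET /login 302"},
    {"clientip": "10.0.0.1", "uri_path": "/cart", "status": 200, "_raw": "10.0.0.1 GET /cart 200"},
]

result = dedup(events, "clientip")                                 # trim duplicate rows
result = keep_fields(result, "clientip", "uri_path", "status")     # keep only what you need
result = head(result, 100)                                         # cap the sample
print(result)
```

Each step receives the output of the previous one, so every row or column removed early is work the later steps never see.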

That's the end of the filter stage. From here you move on to transforming the data.
