Use the stats command and functions (2024)

This topic discusses how to use the statistical functions with the transforming commands chart, timechart, stats, eventstats, and streamstats.

  • For more information about the stats command and syntax, see the "stats" command in the Search Reference.
  • For the list of stats functions, see "Statistical and charting functions" in the Search Reference.

About the stats commands and functions

The stats, streamstats, and eventstats commands each enable you to calculate summary statistics on the results of a search or the events retrieved from an index. The stats command works on the search results as a whole. The streamstats command calculates statistics for each event at the time the event is seen, in a streaming manner. The eventstats command calculates statistics on all search results and adds the aggregation inline to each event for which it is relevant. See more about the differences between these commands in the next section.

The chart command returns your results in a data structure that supports visualization as a chart (such as a column, line, area, or pie chart). You can decide what field is tracked on the x-axis of the chart. The timechart command returns your results formatted as a time-series chart, where your data is plotted against an x-axis that is always a time field. Read more about visualization features and options in the Visualization Reference of the Data Visualization Manual.

The stats, chart, and timechart commands (and their related commands eventstats and streamstats) are designed to work in conjunction with statistical functions. The statistical functions let you count the occurrences of a field and calculate sums, averages, ranges, and so on, of the field values.

For the list of statistical functions and how they're used, see "Statistical and charting functions" in the Search Reference.

Stats, eventstats, and streamstats

The eventstats and streamstats commands are variations on the stats command.

The stats command works on the search results as a whole and returns only the fields that you specify. For example, the following search returns a table with two columns and up to 10 rows, one for each distinct clientip value among the 10 events.

sourcetype=access_* | head 10 | stats sum(bytes) as ASumOfBytes by clientip

The ASumOfBytes and clientip fields are the only fields that exist after the stats command. For example, the following search returns empty cells in the bytes column because it is not a result field.

sourcetype=access_* | head 10 | stats sum(bytes) as ASumOfBytes by clientip | table bytes, ASumOfBytes, clientip

To see more fields other than ASumOfBytes and clientip in the results, you need to include them in the stats command. Also, if you want to perform calculations on any of the original fields in your raw events, you need to do that before the stats command.
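The grouping behavior described above can be illustrated outside Splunk. The following Python sketch is an emulation only, not how Splunk implements stats; the sample events and the stats_sum helper are hypothetical. It shows that only the group-by field and the aggregate survive the command.

```python
from collections import defaultdict

# Hypothetical sample events; in Splunk these would come from the search.
events = [
    {"clientip": "10.0.0.1", "bytes": 100},
    {"clientip": "10.0.0.2", "bytes": 250},
    {"clientip": "10.0.0.1", "bytes": 300},
]

def stats_sum(rows, field, by, alias):
    """Emulate `stats sum(<field>) as <alias> by <by>`: one row per group."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[by]] += row[field]
    # Only the group-by field and the aggregate are kept; bytes is gone.
    return [{by: key, alias: total} for key, total in sorted(totals.items())]

result = stats_sum(events, "bytes", "clientip", "ASumOfBytes")
print(result)
```

Because the output rows no longer carry a bytes field, any calculation that needs the original values has to happen before the aggregation step.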

The eventstats command computes the same statistics as the stats command, but it adds the results to the original events instead of replacing them. When you run the following search, it returns an events list instead of a results table, because the eventstats command keeps the original events.

sourcetype=access_* | head 10 | eventstats sum(bytes) as ASumOfBytes by clientip

You can use the table command to format the results as a table that displays the fields you want. Now, you can also view the values of bytes (or any of the original fields in your raw events) in your results.

sourcetype=access_* | head 10 | eventstats sum(bytes) as ASumOfBytes by clientip | table bytes, ASumOfBytes, clientip
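As a rough illustration of the eventstats behavior, the Python sketch below computes the per-group total and then attaches it to every original event. It is an emulation under assumed sample data, and eventstats_sum is a hypothetical helper, not a Splunk API.

```python
from collections import defaultdict

# Hypothetical sample events; in Splunk these would come from the search.
events = [
    {"clientip": "10.0.0.1", "bytes": 100},
    {"clientip": "10.0.0.2", "bytes": 250},
    {"clientip": "10.0.0.1", "bytes": 300},
]

def eventstats_sum(rows, field, by, alias):
    """Emulate `eventstats sum(<field>) as <alias> by <by>`."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[by]] += row[field]
    # Every original event is returned, with its group total added as a field.
    return [{**row, alias: totals[row[by]]} for row in rows]

annotated = eventstats_sum(events, "bytes", "clientip", "ASumOfBytes")
for row in annotated:
    print(row)
```

The number of rows is unchanged, and each row still has its own bytes value next to the group total.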

The streamstats command also aggregates the calculated statistics to the original raw event, but it does this at the time the event is seen. To demonstrate this, include the _time field in the earlier search and use streamstats.

sourcetype=access_* | head 10 | sort _time | streamstats sum(bytes) as ASumOfBytes by clientip | table _time, clientip, bytes, ASumOfBytes

Instead of a total sum for each clientip (as returned by stats and eventstats), this search calculates a running sum for each clientip, updated with each event in the order that it is seen. The streamstats command is useful for reporting on events at a known time range.
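The running-total behavior can be sketched in Python as follows. This is an emulation with hypothetical sample events and a hypothetical streamstats_sum helper; it assumes the current event is included in each running total.

```python
from collections import defaultdict

# Hypothetical sample events with a simplified numeric _time.
events = [
    {"_time": 3, "clientip": "10.0.0.1", "bytes": 300},
    {"_time": 1, "clientip": "10.0.0.1", "bytes": 100},
    {"_time": 2, "clientip": "10.0.0.2", "bytes": 250},
]

def streamstats_sum(rows, field, by, alias):
    """Emulate `streamstats sum(<field>) as <alias> by <by>`."""
    running = defaultdict(int)
    out = []
    for row in rows:  # rows are processed in the order they arrive
        running[row[by]] += row[field]
        out.append({**row, alias: running[row[by]]})
    return out

# Equivalent of `sort _time` before the streaming step.
ordered = sorted(events, key=lambda e: e["_time"])
running_totals = streamstats_sum(ordered, "bytes", "clientip", "ASumOfBytes")
print([row["ASumOfBytes"] for row in running_totals])
```

Only the last event for each clientip carries the same value that stats or eventstats would report for that group.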

Examples

Example 1

This example creates a chart of how many new users go online each hour of the day.

... | sort _time | streamstats dc(userid) as dcusers | delta dcusers as deltadcusers | timechart sum(deltadcusers)

The dc (or distinct_count) function returns a count of the unique values of userid and renames the resulting field dcusers.

If you don't rename the result with an AS clause (for example, "dc(userid) as dcusers"), the resulting field is named after the function call, such as "dc(userid)".

The delta command is used to find the difference between the current and previous dcusers value. Then, the sum of this delta is charted over time.
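The chain of a running distinct count, a delta, and a sum per time bucket can be traced with a small Python sketch. The events are hypothetical (hour, userid) pairs, and the sketch assumes that the first event has no delta value because it has no previous event to compare against.

```python
from collections import defaultdict

# Hypothetical (hour, userid) events, already sorted by time.
events = [(9, "ann"), (9, "bob"), (9, "ann"), (10, "cy"), (10, "bob"), (11, "dee")]

seen = set()
previous = None
new_users_per_hour = defaultdict(int)

for hour, userid in events:
    seen.add(userid)
    dcusers = len(seen)                 # streamstats dc(userid) as dcusers
    if previous is not None:            # no delta exists for the first event
        new_users_per_hour[hour] += dcusers - previous  # sum(deltadcusers)
    previous = dcusers

print(dict(new_users_per_hour))
```

A delta of 1 marks an event that introduced a userid not seen before, and a delta of 0 marks a returning user, so the sum per hour counts first appearances.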

Example 2

This example calculates the median for a field, then charts the count of events over time, grouping every event whose value is less than the median into a single series.

... | eventstats median(bytes) as medbytes | eval snap=if(bytes>=medbytes, bytes, "smaller") | timechart count by snap

Eventstats is used to calculate the median for all the values of bytes from the previous search.
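The labeling step can be sketched in Python. The byte values are hypothetical, an odd number of values is used so that the median is an actual data point, and the time bucketing that timechart performs is left out.

```python
from collections import Counter
from statistics import median

# Hypothetical bytes values from the events of a search.
bytes_values = [120, 300, 450, 800, 950]

medbytes = median(bytes_values)          # eventstats median(bytes) as medbytes

# eval snap=if(bytes>=medbytes, bytes, "smaller")
snap = [b if b >= medbytes else "smaller" for b in bytes_values]

counts = Counter(snap)                   # count by snap (without time buckets)
print(counts)
```

Every value below the median collapses into the "smaller" label, while each value at or above the median keeps its own label.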

Example 3

This example calculates the standard deviation and variance of calculated fields.

sourcetype=log4j ERROR earliest=-7d@d latest=@d | eval warns=errorGroup+"-"+errorNum | stats count as Date_Warns_Count by date_mday,warns | stats stdev(Date_Warns_Count), var(Date_Warns_Count) by warns

This search returns errors from the last 7 days and creates the new field, warns, from extracted fields errorGroup and errorNum. The stats command is used twice. First, it counts the events for each combination of day of the month (date_mday) and warns value. Then, it calculates the standard deviation and variance of that daily count for each warns value.
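The two-stage aggregation can be sketched in Python. The events are hypothetical (date_mday, warns) pairs. The sketch uses the sample standard deviation and sample variance from the statistics module, on the understanding that the stdev and var functions are also sample statistics.

```python
from collections import Counter, defaultdict
from statistics import stdev, variance

# Hypothetical (date_mday, warns) pairs, one per matching event.
events = [
    (1, "db-42"), (1, "db-42"),
    (2, "db-42"), (2, "db-42"), (2, "db-42"), (2, "db-42"),
    (1, "net-7"), (2, "net-7"),
]

# First stats: count as Date_Warns_Count by date_mday, warns
daily_counts = Counter(events)

# Second stats: group the daily counts by warns
counts_by_warns = defaultdict(list)
for (day, warns), count in daily_counts.items():
    counts_by_warns[warns].append(count)

# stdev and variance here need at least two daily counts per warns value,
# which this sample data provides.
summary = {
    warns: {"stdev": stdev(counts), "var": variance(counts)}
    for warns, counts in counts_by_warns.items()
}
print(summary)
```

The second stage never sees individual events, only the per-day counts produced by the first stage.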

Example 4

You can use the calculated fields as filter parameters for your search.

sourcetype=access_* | eval URILen = len(useragent) | eventstats avg(URILen) as AvgURILen, stdev(URILen) as StdDevURILen | where URILen > AvgURILen+(2*StdDevURILen) | chart count by URILen span=10 cont=true

In this example, eventstats is used to calculate the average and standard deviation of the URI lengths from useragent. Then, these numbers are used as filters for the retrieved events.
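The filter itself is a simple threshold test, sketched here in Python with hypothetical length values. The final chart step is omitted.

```python
from statistics import mean, stdev

# Hypothetical URILen values, one per event.
lengths = [60, 62, 58, 61, 59, 60, 63, 57, 61, 240]

avg_len = mean(lengths)       # eventstats avg(URILen) as AvgURILen
std_len = stdev(lengths)      # eventstats stdev(URILen) as StdDevURILen

# where URILen > AvgURILen+(2*StdDevURILen)
threshold = avg_len + 2 * std_len
outliers = [n for n in lengths if n > threshold]
print(outliers)
```

Because the average and standard deviation are computed over all events before filtering, the threshold is the same for every event.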

FAQs

How to use the stats command in Splunk?

A getting-started guide for the related tstats command breaks the task into two choices:
  1. Aggregation Functions: Choose an appropriate aggregation function, such as count, sum, avg, min, or max, based on your analysis needs.
  2. Fields and Time Field: Specify the fields you want to analyze and the time field over which you want to aggregate data.

What is the function of the stats command?

Use this command to provide summary statistics, optionally grouped by a field. The output for this query includes one field for each of the fields specified in the query, along with one field for each aggregation.

How does the sort command order results by default in Splunk?

By default, the sort command tries to automatically determine what it is sorting. If the field contains numeric values, the collating sequence is numeric. If the field contains IP address values, the collating sequence is for IP addresses.

When you use the stats command with a by clause, what is returned?

If the stats command is used without a BY clause, only one row is returned, which is the aggregation over the entire incoming result set. If a BY clause is used, one row is returned for each distinct value in the field specified in the BY clause.

Which of the following are common functions used with stats command in Splunk?

Common aggregate functions include Average, Count, Minimum, Maximum, Standard Deviation, Sum, and Variance. Most aggregate functions are used with numeric fields.

How do I sort values in Splunk?

The basic steps to create a custom sort order are:
  1. Use the eval command to create a new field, which we'll call sort_field.
  2. Use the case function to assign a number to each unique value and place those values in the sort_field.
  3. Use the sort command to sort the results based on the numbers in the sort_field.

What are the three default roles in Splunk?

The predefined roles are:
  • admin: This role has the most capabilities.
  • power: This role can edit all shared objects and alerts, tag events, and other similar tasks.
  • user: This role can create and edit its own saved searches, run searches, edit preferences, create and edit event types, and other similar tasks.

What is the stats command in Splunk?

The stats command is a fundamental Splunk command. It will perform any number of statistical functions on a field, which could be as simple as a count or average, or something more advanced like a percentile or standard deviation.

What is the difference between stats and eventstats commands?

The eventstats command calculates the same statistical results as the stats command. The only difference is that it does not return a table of results; instead, it adds the aggregates to the original raw data. The streamstats command uses the events before the current event to compute the aggregate statistics that are applied to each event.

What is the top function in Splunk?

The top command in Splunk serves as a tool for identifying the most frequent or highest-ranking values within a dataset. By specifying fields and criteria, users can pinpoint the top values, facilitating trend analysis, anomaly detection, and performance monitoring.

What is the difference between stats and chart command in Splunk?

You can specify any number of "group by" fields to the stats command, whereas the chart and timechart commands can have only one "group by" field (with timechart it is always _time) and one "split by" field.

What is the difference between stats and transaction commands in Splunk?

Both the stats command and the transaction command are similar in that they enable you to aggregate individual events together based on field values. The stats command is meant to calculate statistics on events grouped by one or more fields and discard the events (unless you are using eventstats or streamstats).

What is the difference between tstats and stats command in Splunk?

tstats is faster than stats since tstats only looks at the indexed metadata (the .tsidx files in the buckets on the indexers) whereas stats is working off the data (in this case the raw events) before that command. Since tstats can only look at the indexed metadata it can only search fields that are in the metadata.

What is the metadata command in Splunk?

The metadata command is a generating command, which means it is the first command in a search. For those not fully up to speed on Splunk, there are certain fields that are written at index time. These fields are: _time, source (where the event originated; could be a filepath or a protocol/port value)

Article information

Author: Roderick King
