Using Splunk Streamstats to Calculate Alert Volume

By Josh Neubecker | Published on November 10th, 2020

Dynamic thresholding using standard deviation is a common method we use to detect anomalies in Splunk correlation searches. However, one of the pitfalls of this method is that these searches are difficult to tune. This is where the wonderful streamstats command comes to the rescue.

This Splunk tutorial will cover why tuning standard deviation searches is different from using a static threshold, how to use streamstats, and how we can use streamstats to get immediate feedback on alert volume.

Tuning Using Streamstats

1. Understanding the problem

With a static threshold search that runs over 60 minutes, calculating alert volume over 30 days is as simple as running the count in 60-minute buckets over 30 days. This is different with a dynamic threshold.

Typically, a standard deviation search will calculate a threshold based on the last 7 to 30 days to compare against the last hour of data. Running the same search to see approximately how many notables would be generated in 30 days will calculate the threshold differently than when it runs as a correlation search.

When running a correlation search, the threshold is based on historical data. When the same search is used to calculate the alert volume for the whole 30 days, the threshold for any given hour except the last is based on historical, current, and future data.

This is where we can use streamstats to calculate the threshold for any given hour based on the 30 days that precede it.

Still confusing? Let’s take a look at a few examples.

2. What does streamstats even do?

To understand how we can do this, we need to understand how streamstats works. In my experience, streamstats is the most confusing of the stats commands. I find it’s easier to show than explain. Let’s start with a basic example using data from the makeresults command and work our way up.

Example 1: streamstats without options

The streamstats command runs statistics as events come in. In this case, it counts how many times each color has appeared so far, and it also generates an incremental count for our testing.
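As a rough sketch of this kind of test (not the original listing), the search below builds six sample events with makeresults and alternates a hypothetical color field between "red" and "blue". The field names event_num, color, and color_count are assumptions chosen for illustration.

```spl
| makeresults count=6
| streamstats count as event_num
| eval color=if(event_num % 2 == 1, "red", "blue")
| streamstats count as color_count by color
| table event_num color color_count
```

The first streamstats numbers the events from 1 to 6. The second keeps a running count per color, so color_count climbs 1, 1, 2, 2, 3, 3 as the alternating events arrive.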

Example 2: streamstats with a window

With a window, streamstats will calculate statistics based on the number of events specified. In this case, streamstats looks at the current event and the previous. This causes the count by color to be 1 for each event because the previous event is always a different color. A common expectation with streamstats is that the window by default would be separate for each color. To do this, see our next example.
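A minimal sketch of the windowed case, assuming six sample events whose hypothetical color field alternates between "red" and "blue" (the field names are illustrative, not taken from the original listing):

```spl
| makeresults count=6
| streamstats count as event_num
| eval color=if(event_num % 2 == 1, "red", "blue")
| streamstats window=2 count as color_count by color
| table event_num color color_count
```

Because the single two-event window is shared by all colors, it always holds one red and one blue event, and color_count stays at 1 on every row.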

Example 3: streamstats with a window and global=false

When global=false, a separate window is kept for each color. This is the behavior we need for testing alert volume.
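The per-group window can be sketched with six sample events whose hypothetical color field alternates between "red" and "blue". The only change from a plain windowed streamstats is the global=false option, and the field names are assumptions for illustration.

```spl
| makeresults count=6
| streamstats count as event_num
| eval color=if(event_num % 2 == 1, "red", "blue")
| streamstats window=2 global=false count as color_count by color
| table event_num color color_count
```

Each color now has its own two-event window, so color_count is 1 for the first event of each color and 2 for every event after that.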

3. How can we use streamstats to help us?

Let’s take a simplified standard deviation search for finding an anomalous amount of failed logins.

Running this search over 7 days will count the number of failed authentications by src for each hour. Then it will calculate an upper bound based on the average and standard deviation of the counts for each hour by src. Finally, it will only show events where the failure count for the last hour was above the upper bound.
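A search of that shape might look like the sketch below. It is an illustration rather than the original listing: the index name your_auth_index, the action=failure filter, the src field, and the multiplier of 2 standard deviations are all assumptions, and the time range picker is assumed to be set to the last 7 days.

```spl
index=your_auth_index action=failure
| bin _time span=1h
| stats count by _time src
| eventstats avg(count) as avg stdev(count) as stdev by src
| eval upper_bound=avg + (2 * stdev)
| where _time >= relative_time(now(), "-1h@h") AND count > upper_bound
```

Note that eventstats computes one average and one standard deviation per src across the whole search range, which is why every hour in the results is compared against the same threshold.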

Removing the time constraint would show anomalies for the entire time frame, but the results will be different than when the search runs every hour. To get a more accurate representation, we need to use streamstats to look at the previous 7 days for each individual hour.

Changing eventstats to streamstats and specifying window=168 (there are 168 hours in a week, and our time span is 1 hour) with global=false calculates the threshold for each event from the last 7 days of counts. Constraining the results to the past 7 days in the where clause and running the search over 14 days ensures there is a full history for each event. To exclude the current event from the threshold, set current=false.
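As a sketch of that change (not the original listing), the search below assumes a hypothetical index named your_auth_index with action and src fields, a multiplier of 2 standard deviations, and a time range picker set to the last 14 days.

```spl
index=your_auth_index action=failure
| bin _time span=1h
| stats count by _time src
| streamstats window=168 global=false avg(count) as avg stdev(count) as stdev by src
| eval upper_bound=avg + (2 * stdev)
| where _time > relative_time(now(), "-7d") AND count > upper_bound
```

One thing to keep in mind is that window=168 counts result rows, not hours. A src with no failures in some hours has no row for those hours, so its 168-row window can reach back further than 7 days.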

Bonus Example: Creating an alert volume test for ESCU SMB Traffic Spike

The starting point is the SMB Traffic Spike search from Splunk’s Enterprise Security Content Update (ESCU) app. To turn it into an alert volume test:

  • Change stats to streamstats with window=168 (for 7 days of history) and global=false
  • In this search, the average and standard deviation are calculated only on events older than relative_time(maxtime, "-70m@m"), which leaves out the current hour, so set current=false
  • Remove `max(eval(if(_time >= relative_time(maxtime, "-70m@m"), count, null))) as count`, because we want to keep the original count from each event
  • Add the time constraint `_time>relative_time(now(), "-7d")` and run the search over 14 days

Putting all that together, here is the search:
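The sketch below shows how those four changes might combine. It is written from the steps above rather than copied from the ESCU app, so treat the details as assumptions: it assumes the CIM Network_Traffic data model, SMB identified by ports 139 and 445 or app=smb, an upper bound of two standard deviations, and a time range picker set to the last 14 days.

```spl
| tstats count from datamodel=Network_Traffic where (All_Traffic.dest_port=139 OR All_Traffic.dest_port=445 OR All_Traffic.app=smb) by All_Traffic.src _time span=1h
| rename "All_Traffic.src" as src
| streamstats window=168 global=false current=false avg(count) as avg stdev(count) as stdev by src
| eval upperBound=avg + (stdev * 2)
| where count > upperBound AND _time > relative_time(now(), "-7d")
| table _time src count avg stdev upperBound
```

With current=false, the first row for each src has no prior history, so its avg and stdev are empty and the where clause drops it.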

Using this, we can change the threshold, filter noise, and immediately see what our changes will do.

Conclusion

Using this method, you can immediately see how many alerts would be generated from a standard deviation search. This allows for instant feedback when tuning out sources, users, or failure reasons. Hopefully, this post gave you a better understanding of how to apply streamstats when creating and tuning alerts, as well as in other applications.

About Hurricane Labs

Hurricane Labs is a dynamic Managed Services Provider that unlocks the potential of Splunk and security for diverse enterprises across the United States. With a dedicated, Splunk-focused team and an emphasis on humanity and collaboration, we provide the skills, resources, and results to help make our customers’ lives easier.

For more information, visit www.hurricanelabs.com and follow us on Twitter @hurricanelabs.
