Grok processor

Serverless Stack GA 9.2.0

The grok processor parses unstructured log messages by matching them against a set of predefined patterns and extracting fields from the matches. It can parse a wide variety of log formats.

You can provide multiple patterns to the grok processor. The processor tries to match the log message against each pattern in the order they are provided. When a pattern matches, the processor extracts the fields and skips the remaining patterns.

If a pattern doesn't match, the grok processor tries the next pattern. If no pattern matches, the grok processor fails, and you can troubleshoot the issue. Instead of writing grok patterns yourself, you can have Streams generate them for you. Refer to generate patterns for more information.
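The first-match-wins behavior described above can be sketched in a few lines of Python. This is only an illustration of the ordering logic, not the processor's implementation: the two regular expressions are hypothetical stand-ins for grok patterns, and `first_match` is a name invented for this example.

```python
import re

# Hypothetical stand-ins for grok patterns, written as plain regexes
# with named groups. They are tried in list order.
PATTERNS = [
    re.compile(r"(?P<client_ip>\d{1,3}(?:\.\d{1,3}){3}) (?P<method>[A-Z]+) (?P<path>/\S*)"),
    re.compile(r"(?P<level>[A-Z]+): (?P<text>.+)"),
]


def first_match(message, patterns=PATTERNS):
    """Return (index, fields) for the first pattern that matches.

    Later patterns are never evaluated once an earlier one matches.
    Raises ValueError when no pattern matches, mirroring a processor failure.
    """
    for index, pattern in enumerate(patterns):
        match = pattern.search(message)
        if match:
            return index, match.groupdict()
    raise ValueError(f"no pattern matched: {message!r}")


index, fields = first_match("10.0.0.1 GET /index.html")
print(index, fields)
```

Because evaluation stops at the first match, a message that matches the first pattern costs one attempt, while a message that only matches the last pattern costs one attempt per pattern in the list.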

Tip

To improve pipeline performance, put the most common patterns first, then add more specific patterns. This reduces the number of times the grok processor has to run.

To parse a log message with a grok processor:

  1. Set the Source Field to the field you want to search for grok matches.
  2. Set the patterns you want to use in the Grok patterns field. Refer to the example pattern for more information on patterns.

This functionality uses the Elasticsearch Grok pipeline processor. Refer to the Grok processor Elasticsearch documentation for more information.
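For orientation, here is a rough sketch of what an equivalent Elasticsearch grok processor definition might look like, built as a Python dictionary and printed as JSON. The `field` and `patterns` options belong to the Elasticsearch grok processor. The assumption that they correspond to the Source Field and Grok patterns settings, the `message` field name, the pipeline description, and the sample pattern are all illustrative; the configuration Streams actually stores may differ.

```python
import json

# Illustrative grok processor definition (values are assumptions).
grok_processor = {
    "grok": {
        # Assumed counterpart of the Source Field setting.
        "field": "message",
        # Assumed counterpart of the Grok patterns setting; tried in order.
        "patterns": [
            "%{IP:client} %{WORD:method} %{URIPATHPARAM:request}",
        ],
    }
}

# A pipeline is a list of processors plus optional metadata.
pipeline = {
    "description": "Example pipeline with a single grok processor",
    "processors": [grok_processor],
}

pipeline_json = json.dumps(pipeline, indent=2)
print(pipeline_json)
```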

Grok patterns are defined in the following format:

{
  "MY_DATE": "%{YEAR}-%{MONTHNUM}-%{MONTHDAY}"
}

Here, MY_DATE is the name of the pattern. You can then reference this pattern in the processor, optionally followed by the name of the field to extract into:

%{MY_DATE:date}
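
To make the relationship between a pattern definition and its use more concrete, the sketch below expands `%{NAME}` and `%{NAME:field}` references into a Python regular expression. It is a simplified illustration, not how Elasticsearch implements grok: the `BASE` definitions are loose stand-ins for the built-in YEAR, MONTHNUM, and MONTHDAY patterns, `to_regex` is a helper invented for this example, and the sample log line is made up.

```python
import re

# Simplified stand-ins for built-in definitions (the real ones are stricter).
BASE = {
    "YEAR": r"\d{4}",
    "MONTHNUM": r"(?:0?[1-9]|1[0-2])",
    "MONTHDAY": r"(?:0[1-9]|[12]\d|3[01]|[1-9])",
}

# The custom definition from the text above.
CUSTOM = {"MY_DATE": "%{YEAR}-%{MONTHNUM}-%{MONTHDAY}"}

# Matches %{NAME} or %{NAME:field}.
REFERENCE = re.compile(r"%\{(\w+)(?::(\w+))?\}")


def to_regex(expression, definitions):
    """Recursively replace pattern references with their regex bodies."""
    def expand(match):
        name, field = match.group(1), match.group(2)
        body = to_regex(definitions[name], definitions)
        # A reference with a field name becomes a named capture group.
        return f"(?P<{field}>{body})" if field else f"(?:{body})"
    return REFERENCE.sub(expand, expression)


regex = to_regex("%{MY_DATE:date}", {**BASE, **CUSTOM})
match = re.search(regex, "backup finished on 2024-03-07 without errors")
print(match.group("date"))  # prints "2024-03-07"
```

The part after the colon (`date`) only names the extracted field; the part before it (`MY_DATE`) selects which definition to match.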
		
Note

Requires an LLM Connector to be configured.

Instead of writing the Grok patterns by hand, you can use the Generate Patterns button to generate the patterns for you.

Generated patterns work best on semi-structured data. For highly customized logs with a lot of free text, writing patterns manually generally produces more accurate results.

To add a generated grok pattern:

  1. Select Create, then Create processor.
  2. Select Grok from the Processor menu.
  3. Select Generate pattern.
  4. Select Accept to add a generated pattern to the list of patterns used by the grok processor.

Under the hood, the 100 samples on the right side are grouped into categories of similar messages. For each category, a Grok pattern is generated by sending a few samples to the LLM. Matching patterns are then shown in the UI.
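
The grouping step can be pictured with a small sketch. The source does not describe how Streams categorizes messages, so everything here is an assumption made for illustration: the `shape` signature (digit runs and letter runs collapsed to placeholders) is a hypothetical heuristic, and `pick_samples` and `per_category` are invented names. The point is only the overall flow of grouping similar messages and keeping a few samples per group to send to the LLM.

```python
import re
from collections import defaultdict


def shape(message):
    """Crude signature: collapse digit runs to '0' and letter runs to 'a'."""
    signature = re.sub(r"\d+", "0", message)
    return re.sub(r"[A-Za-z]+", "a", signature)


def pick_samples(messages, per_category=3):
    """Group messages by signature and keep a few samples per group."""
    groups = defaultdict(list)
    for message in messages:
        groups[shape(message)].append(message)
    return {sig: msgs[:per_category] for sig, msgs in groups.items()}


samples = pick_samples([
    "GET /a 200",
    "GET /b 404",
    "user bob logged in",
])
for signature, examples in samples.items():
    print(signature, examples)
```

In this toy input, the two request lines share one signature and the login line gets its own, so two categories would each contribute samples.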

Warning

Generating patterns can incur additional costs, depending on the LLM connector you are using. Typically, a single iteration uses between 1,000 and 5,000 tokens, depending on the number of identified categories and the length of the messages.