Elasticsearch output plugin v10.5.1

  • Plugin version: v10.5.1
  • Released on: 2020-04-30
  • Changelog

For other versions, see the overview list.

To learn more about Logstash, see the Logstash Reference.

For questions about the plugin, open a topic in the Discuss forums. For bugs or feature requests, open an issue on GitHub. For the list of Elastic supported plugins, please consult the Elastic Support Matrix.

If you plan to use the Kibana web interface to analyze data transformed by Logstash, use the Elasticsearch output plugin to get your data into Elasticsearch.

This output only speaks the HTTP protocol, as it is the preferred protocol for interacting with Elasticsearch. In previous versions it was possible to communicate with Elasticsearch through the transport protocol, which is now reserved for internal cluster communication between nodes. Using the transport protocol to communicate with the cluster has been deprecated in Elasticsearch 7.0.0 and will be removed in 8.0.0.

You can learn more about Elasticsearch at https://www.elastic.co/products/elasticsearch

Compatibility Note

When connected to Elasticsearch 7.x, modern versions of this plugin use the required _doc document-type when inserting documents.

If you are using an earlier version of Logstash and wish to connect to Elasticsearch 7.x, first upgrade Logstash to version 6.8 to ensure it picks up changes to the Elasticsearch index template.

If you are using a custom template, ensure your template uses the _doc document-type before connecting to Elasticsearch 7.x.

You can run Elasticsearch on your own hardware, or use our hosted Elasticsearch Service on Elastic Cloud. The Elasticsearch Service is available on AWS, Google Cloud Platform, and Microsoft Azure. Try the Elasticsearch Service for free.

Writing to different indices: best practices

You cannot use dynamic variable substitution when ilm_enabled is true and ilm_rollover_alias is used.

If you’re sending events to the same Elasticsearch cluster but targeting different indices, you can:

  • use different Elasticsearch outputs, each one with a different value for the index parameter
  • use one Elasticsearch output and use the dynamic variable substitution for the index parameter
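
As a rough sketch of the first option, conditionals in the output section can route events to separate Elasticsearch outputs. The log_type field and the index names here are assumptions chosen for illustration:

```
output {
  # Each elasticsearch block is a separate client with its own connection pool.
  if [log_type] == "production" {
    elasticsearch {
      index => "prod-%{+YYYY.MM.dd}"
    }
  } else {
    elasticsearch {
      index => "test-%{+YYYY.MM.dd}"
    }
  }
}
```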

Each Elasticsearch output is a new client connected to the cluster:

  • it has to initialize the client and connect to Elasticsearch (restart time is longer if you have more clients)
  • it has an associated connection pool

In order to minimize the number of open connections to Elasticsearch, maximize the bulk size and reduce the number of "small" bulk requests (which could easily fill up the queue), it is usually more efficient to have a single Elasticsearch output.

Example:

output {
  elasticsearch {
    index => "%{[some_field][sub_field]}-%{+YYYY.MM.dd}"
  }
}
		

What to do in case there is no field in the event containing the destination index prefix?

You can use the mutate filter and conditionals to add a [@metadata] field (see https://www.elastic.co/guide/en/logstash/7.8/event-dependent-configuration.html#metadata) to set the destination index for each event. The [@metadata] fields will not be sent to Elasticsearch.

Example:

filter {
  if [log_type] in [ "test", "staging" ] {
    mutate { add_field => { "[@metadata][target_index]" => "test-%{+YYYY.MM}" } }
  } else if [log_type] == "production" {
    mutate { add_field => { "[@metadata][target_index]" => "prod-%{+YYYY.MM.dd}" } }
  } else {
    mutate { add_field => { "[@metadata][target_index]" => "unknown-%{+YYYY}" } }
  }
}
output {
  elasticsearch {
    index => "%{[@metadata][target_index]}"
  }
}
		

Retry Policy

The retry policy has changed significantly in the 8.1.1 release. This plugin uses the Elasticsearch bulk API to optimize its imports into Elasticsearch. These requests may experience either partial or total failures. The bulk API sends batches of requests to an HTTP endpoint. Error codes for the HTTP request are handled differently than error codes for individual documents.

HTTP requests to the bulk API are expected to return a 200 response code. All other response codes are retried indefinitely.

Document errors are handled as follows:

  • 400 and 404 errors are sent to the dead letter queue (DLQ), if enabled. If a DLQ is not enabled, a log message will be emitted, and the event will be dropped. See DLQ Policy for more info.
  • 409 errors (conflict) are logged as a warning and dropped.

Note that 409 exceptions are no longer retried. Please set a higher retry_on_conflict value if you experience 409 exceptions. It is more performant for Elasticsearch to retry these exceptions than this plugin.
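
To illustrate, here is a minimal sketch of an update output that lets Elasticsearch retry conflicting updates itself. The doc_id field and the value 5 are assumptions for illustration, not recommendations:

```
output {
  elasticsearch {
    action => "update"
    # Hypothetical event field holding the document ID.
    document_id => "%{[doc_id]}"
    # Elasticsearch retries a conflicting update up to 5 times before reporting a 409.
    retry_on_conflict => 5
  }
}
```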

DLQ Policy

Mapping (400) errors from Elasticsearch can lead to data loss. Mapping errors cannot be handled without human intervention, because someone has to look at the field that caused the mapping mismatch. If the DLQ is enabled, the original events causing the mapping errors are stored in a file that can be processed at a later time. Often, the offending field can be removed and the event re-indexed to Elasticsearch. If the DLQ is not enabled and a mapping error happens, the problem is logged as a warning and the event is dropped. See dead-letter-queues for more information about processing events in the DLQ.
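
One possible way to reprocess such events is a separate pipeline that reads the DLQ with the dead_letter_queue input plugin, removes the offending field, and re-indexes the event. This is only a sketch: it assumes the DLQ has been enabled in logstash.yml (dead_letter_queue.enable: true), and the path, pipeline ID, and field name are hypothetical:

```
input {
  dead_letter_queue {
    # Assumed location; use the directory configured as path.dead_letter_queue.
    path => "/var/lib/logstash/dead_letter_queue"
    # ID of the pipeline that wrote the entries ("main" is assumed here).
    pipeline_id => "main"
    # Remember the read position across restarts.
    commit_offsets => true
  }
}
filter {
  # Hypothetical field that caused the mapping conflict.
  mutate { remove_field => ["offending_field"] }
}
output {
  elasticsearch {
    index => "logstash-%{+yyyy.MM.dd}"
  }
}
```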

Index Lifecycle Management

The Index Lifecycle Management feature requires plugin version 9.3.1 or higher.

This feature requires an Elasticsearch instance of 6.6.0 or higher with at least a Basic license.

Logstash can use Index Lifecycle Management to automate the management of indices over time.

The use of Index Lifecycle Management is controlled by the ilm_enabled setting. By default, this setting detects whether the Elasticsearch instance supports ILM, and uses it if it is available. ilm_enabled can also be set to true or false to override the automatic detection, or disable ILM.

This will overwrite the index settings and adjust the Logstash template to write the necessary settings for the template to support index lifecycle management, including the index policy and rollover alias to be used.

Logstash will create a rollover alias for the indices to be written to, including a pattern for how the actual indices will be named. Unless an existing ILM policy has been specified, a default policy will also be created. The default policy is configured to roll over an index when it reaches 50 gigabytes in size or is 30 days old, whichever happens first.

The default rollover alias is called logstash, with a default pattern for the rollover index of {now/d}-000001, which names indices after the date on which the index is rolled over, followed by an incrementing number. Note that the pattern must end with a dash and a number that will be incremented.

See the Rollover API documentation for more details on naming.

The rollover alias, ILM pattern, and policy can be modified.

See config below for an example:

output {
  elasticsearch {
    ilm_rollover_alias => "custom"
    ilm_pattern => "000001"
    ilm_policy => "custom_policy"
  }
}
		

Custom ILM policies must already exist on the Elasticsearch cluster before they can be used.

If the rollover alias or pattern is modified, the index template will need to be overwritten, as the settings index.lifecycle.name and index.lifecycle.rollover_alias are automatically written to the template.

If the index property is supplied in the output definition, it will be overwritten by the rollover alias.

Batch Sizes

This plugin attempts to send batches of events as a single request. However, if a request exceeds 20MB, it is broken up into multiple batch requests. If a single document exceeds 20MB, it is sent as a single request.

DNS Caching

This plugin uses the JVM to look up DNS entries and is subject to the value of networkaddress.cache.ttl, a global setting for the JVM.

As an example, to set your DNS TTL to 1 second you would set the LS_JAVA_OPTS environment variable to -Dnetworkaddress.cache.ttl=1.

Keep in mind that a connection with keepalive enabled will not reevaluate its DNS value while the keepalive is in effect.

HTTP Compression

This plugin supports request and response compression. Response compression is enabled by default, and for Elasticsearch versions 5.0 and later no configuration is needed in Elasticsearch for it to send back compressed responses. For versions before 5.0, http.compression must be set to true in Elasticsearch to take advantage of response compression when using this plugin.

For request compression, regardless of the Elasticsearch version, users have to enable the http_compression setting in their Logstash config file.
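
For example, request compression could be switched on as follows. The host address is an assumption:

```
output {
  elasticsearch {
    hosts => ["http://127.0.0.1:9200"]
    # Gzip the request bodies sent to Elasticsearch (disabled by default).
    http_compression => true
  }
}
```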

Authentication

Authentication to a secure Elasticsearch cluster is possible using one of the user/password, cloud_auth or api_key options.
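
A minimal sketch of the user/password variant is shown here. The host, the user name, and the ES_PWD variable (resolved from the environment or the Logstash keystore) are assumptions for illustration:

```
output {
  elasticsearch {
    hosts => ["https://es.example.com:9200"]
    # Hypothetical user with write access to the target indices.
    user => "logstash_writer"
    # Resolved at startup so the password is not stored in the config file.
    password => "${ES_PWD}"
  }
}
```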

Elasticsearch Output Configuration Options

This plugin supports the following configuration options plus the Common options described later.

Setting | Input type | Required
action | string | No
api_key | password | No
bulk_path | string | No
cacert | a valid filesystem path | No
cloud_auth | password | No
cloud_id | string | No
custom_headers | hash | No
doc_as_upsert | boolean | No
document_id | string | No
document_type | string | No
failure_type_logging_whitelist | array | No
healthcheck_path | string | No
hosts | uri | No
http_compression | boolean | No
ilm_enabled | string, one of ["true", "false", "auto"] | No
ilm_pattern | string | No
ilm_policy | string | No
ilm_rollover_alias | string | No
index | string | No
keystore | a valid filesystem path | No
keystore_password | password | No
manage_template | boolean | No
parameters | hash | No
parent | string | No
password | password | No
path | string | No
pipeline | string | No
pool_max | number | No
pool_max_per_route | number | No
proxy | uri | No
resurrect_delay | number | No
retry_initial_interval | number | No
retry_max_interval | number | No
retry_on_conflict | number | No
routing | string | No
script | string | No
script_lang | string | No
script_type | string, one of ["inline", "indexed", "file"] | No
script_var_name | string | No
scripted_upsert | boolean | No
sniffing | boolean | No
sniffing_delay | number | No
sniffing_path | string | No
ssl | boolean | No
ssl_certificate_verification | boolean | No
template | a valid filesystem path | No
template_name | string | No
template_overwrite | boolean | No
timeout | number | No
truststore | a valid filesystem path | No
truststore_password | password | No
upsert | string | No
user | string | No
validate_after_inactivity | number | No
version | string | No
version_type | string, one of ["internal", "external", "external_gt", "external_gte", "force"] | No

Also see Common options for a list of options supported by all output plugins.

action

  • Value type is string
  • Default value is "index"

The Elasticsearch action to perform. Valid actions are:

  • index: indexes a document (an event from Logstash).
  • delete: deletes a document by id (an id is required for this action).
  • create: indexes a document; fails if a document with that id already exists in the index.
  • update: updates a document by id. Update has a special case where you can upsert, that is, update a document if it exists and create it if it is not already present. See the doc_as_upsert option. NOTE: This does not work and is not supported in Elasticsearch 1.x. Please upgrade to ES 2.x or greater to use this feature with Logstash!
  • A sprintf style string to change the action based on the content of the event. The value %{[foo]} would use the foo field for the action.

For more details on actions, check out the Elasticsearch bulk API documentation.
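
As an illustration of the sprintf form, the action can be decided in the filter stage and carried in a [@metadata] field. The op and doc_id fields are hypothetical names used only for this sketch:

```
filter {
  # Pick the bulk action per event based on a hypothetical "op" field.
  if [op] == "delete" {
    mutate { add_field => { "[@metadata][action]" => "delete" } }
  } else {
    mutate { add_field => { "[@metadata][action]" => "index" } }
  }
}
output {
  elasticsearch {
    action => "%{[@metadata][action]}"
    # An id is required for delete actions.
    document_id => "%{[doc_id]}"
  }
}
```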

api_key

  • Value type is password
  • There is no default value for this setting.

Authenticate using Elasticsearch API key. Note that this option also requires enabling the ssl option.

Format is id:api_key where id and api_key are as returned by the Elasticsearch Create API key API.
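
A minimal sketch of API key authentication might look like this. The host and the ES_API_KEY variable, assumed to hold a value in id:api_key format, are illustrative:

```
output {
  elasticsearch {
    hosts => ["https://es.example.com:9200"]
    # api_key requires SSL to be enabled.
    ssl => true
    # Resolved from the environment or the Logstash keystore.
    api_key => "${ES_API_KEY}"
  }
}
```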

bulk_path

  • Value type is string
  • There is no default value for this setting.

HTTP path to which the _bulk requests are sent. This defaults to a concatenation of the path parameter and "_bulk".

cacert

  • Value type is path
  • There is no default value for this setting.

The .cer or .pem file to validate the server’s certificate.

cloud_auth

  • Value type is password
  • There is no default value for this setting.

Cloud authentication string ("<username>:<password>" format) is an alternative for the user/password pair.

For more details, check out the Logstash-to-Cloud documentation.

cloud_id

  • Value type is string
  • There is no default value for this setting.

Cloud ID, from the Elastic Cloud web console. If set, hosts should not be used.

For more details, check out the Logstash-to-Cloud documentation.
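
As a sketch, an output targeting Elastic Cloud could combine cloud_id and cloud_auth. The CLOUD_ID and CLOUD_AUTH variable names are assumptions; their values would come from the Elastic Cloud console and your credentials:

```
output {
  elasticsearch {
    # hosts is intentionally omitted because cloud_id is set.
    cloud_id => "${CLOUD_ID}"
    # Expected in "<username>:<password>" format.
    cloud_auth => "${CLOUD_AUTH}"
  }
}
```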

doc_as_upsert

  • Value type is boolean
  • Default value is false

Enable doc_as_upsert for update mode. Creates a new document with the source if document_id doesn’t exist in Elasticsearch.
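
For instance, the following sketch updates a document when it exists and creates it from the event otherwise. The user_id field is a hypothetical name:

```
output {
  elasticsearch {
    action => "update"
    # Hypothetical event field used as the document ID.
    document_id => "%{[user_id]}"
    # Use the event itself as the new document when the ID is not found.
    doc_as_upsert => true
  }
}
```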

document_id

  • Value type is string
  • There is no default value for this setting.

The document ID for the index. Useful for overwriting existing entries in Elasticsearch with the same ID.

document_type

  • Value type is string
  • There is no default value for this setting.
  • This option is deprecated

This option is deprecated due to the removal of types in Elasticsearch 6.0. It will be removed in the next major version of Logstash.

This value is ignored and has no effect for Elasticsearch clusters 8.x.

This sets the document type to write events to. Generally you should try to write only similar events to the same type. String expansion %{foo} works here. If you don’t set a value for this option:

  • for Elasticsearch clusters 8.x: no value will be used;
  • for Elasticsearch clusters 7.x: the value of _doc will be used;
  • for Elasticsearch clusters 6.x: the value of doc will be used;
  • for Elasticsearch clusters 5.x and below: the event’s type field will be used; if the field is not present, the value of doc will be used.

failure_type_logging_whitelist

  • Value type is array
  • Default value is []

Set the Elasticsearch errors in the whitelist that you don’t want to log. A useful example is when you want to skip all 409 errors which are document_already_exists_exception.
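
Following the example in the text, a sketch that silences log entries for that error type could look like this. The exact exception name reported can differ between Elasticsearch versions, so treat the value as an assumption to verify against your logs:

```
output {
  elasticsearch {
    action => "create"
    # Hypothetical event field used as the document ID.
    document_id => "%{[doc_id]}"
    # Do not log failures of this type.
    failure_type_logging_whitelist => ["document_already_exists_exception"]
  }
}
```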

custom_headers

  • Value type is hash
  • There is no default value for this setting.

Pass a set of key value pairs as the headers sent in each request to an Elasticsearch node. The headers will be used for any kind of request (_bulk request, template installation, health checks and sniffing). These custom headers will be overridden by settings like http_compression.
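
A short sketch of the hash syntax follows. The header name and value are made up for illustration:

```
output {
  elasticsearch {
    # Sent with bulk, template, health check, and sniffing requests alike.
    custom_headers => {
      "X-Request-Source" => "logstash"
    }
  }
}
```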

healthcheck_path

  • Value type is string
  • There is no default value for this setting.

HTTP path where a HEAD request is sent when a backend is marked down. The request is sent in the background to see if the backend has come back again before it is once again eligible to service requests. If you have custom firewall rules you may need to change this.

hosts

  • Value type is uri
  • Default value is [//127.0.0.1]

Sets the host(s) of the remote instance. If given an array, it will load balance requests across the hosts specified in the hosts parameter. Remember that the HTTP protocol uses the HTTP address (e.g. 9200, not 9300).

Examples:

`"127.0.0.1"`
`["127.0.0.1:9200","127.0.0.2:9200"]`
`["http://127.0.0.1"]`
`["https://127.0.0.1:9200"]`
`["https://127.0.0.1:9200/mypath"]` (If using a proxy on a subpath)
		

It is important to exclude dedicated master nodes from the hosts list to prevent LS from sending bulk requests to the master nodes. So this parameter should only reference either data or client nodes in Elasticsearch.

Any special characters present in the URLs here MUST be URL escaped! This means # should be put in as %23 for instance.

http_compression

  • Value type is boolean
  • Default value is false

Enable gzip compression on requests. Note that response compression is on by default for Elasticsearch v5.0 and beyond.

ilm_enabled

  • Value can be any of: true, false, auto
  • Default value is auto

The default setting of auto will automatically enable the Index Lifecycle Management feature, if the Elasticsearch cluster is running Elasticsearch version 7.0.0 or higher with the ILM feature enabled, and disable it otherwise.

Setting this flag to false will disable the Index Lifecycle Management feature, even if the Elasticsearch cluster supports ILM. Setting this flag to true will enable Index Lifecycle Management feature, if the Elasticsearch cluster supports it. This is required to enable Index Lifecycle Management on a version of Elasticsearch earlier than version 7.0.0.

This feature requires a Basic License or above to be installed on an Elasticsearch cluster version 6.6.0 or later.

ilm_pattern

  • Value type is string
  • Default value is {now/d}-000001

Pattern used for generating indices managed by Index Lifecycle Management. The value specified in the pattern will be appended to the write alias, and incremented automatically when a new index is created by ILM.

Date Math can be used when specifying an ILM pattern; see the Rollover API docs for details.

Updating the pattern will require the index template to be rewritten.

The pattern must finish with a dash and a number that will be automatically incremented when indices roll over.

ilm_policy

  • Value type is string
  • Default value is logstash-policy

Modify this setting to use a custom Index Lifecycle Management policy, rather than the default. If this value is not set, the default policy will be automatically installed into Elasticsearch.

If this setting is specified, the policy must already exist in the Elasticsearch cluster.

ilm_rollover_alias

  • Value type is string
  • Default value is logstash

The rollover alias is the alias to which indices managed using Index Lifecycle Management will be written.

If both index and ilm_rollover_alias are specified, ilm_rollover_alias takes precedence.

Updating the rollover alias will require the index template to be rewritten.

ilm_rollover_alias does NOT support dynamic variable substitution as index does.

index

  • Value type is string
  • Default value is "logstash-%{+yyyy.MM.dd}"

The index to write events to. This can be dynamic using the %{foo} syntax. The default value will partition your indices by day so you can more easily delete old data or only search specific date ranges. Indexes may not contain uppercase characters. For weekly indexes, ISO 8601 format is recommended, e.g. logstash-%{+xxxx.ww}. Logstash uses Joda to format the index pattern from the event timestamp. Joda formats are defined here.
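
As a brief illustration, the weekly pattern mentioned above would be configured like this:

```
output {
  elasticsearch {
    # xxxx is the ISO week-year and ww the ISO week of the year in Joda syntax.
    index => "logstash-%{+xxxx.ww}"
  }
}
```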

keystore

  • Value type is path
  • There is no default value for this setting.

The keystore used to present a certificate to the server. It can be either .jks or .p12.

keystore_password

  • Value type is password
  • There is no default value for this setting.

Set the keystore password.

manage_template

  • Value type is boolean
  • Default value is true

From Logstash 1.3 onwards, a template is applied to Elasticsearch during Logstash’s startup if one with the name template_name does not already exist. By default, the contents of this template are the default template for logstash-%{+YYYY.MM.dd}, which always matches indices based on the pattern logstash-*. Should you require support for other index names, or would like to change the mappings in the template in general, a custom template can be specified by setting template to the path of a template file.

Setting manage_template to false disables this feature. If you require more control over template creation, (e.g. creating indices dynamically based on field names) you should set manage_template to false and use the REST API to apply your templates manually.
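
As a sketch of the custom template route, the output below installs a user-supplied template for a non-default index name. The path, template name, and index name are assumptions, and the template file itself must contain an index pattern that matches the index:

```
output {
  elasticsearch {
    index => "myapp-%{+YYYY.MM.dd}"
    # Hypothetical template file whose index pattern matches myapp-*.
    template => "/etc/logstash/templates/myapp.json"
    template_name => "myapp"
    # Replace an existing template of the same name on startup.
    template_overwrite => true
  }
}
```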

parameters

  • Value type is hash
  • There is no default value for this setting.

Pass a set of key value pairs as the URL query string. This query string is added to every host listed in the hosts configuration. If the hosts list contains urls that already have query strings, the one specified here will be appended.

parent

  • Value type is string
  • Default value is nil

For child documents, ID of the associated parent. This can be dynamic using the %{foo} syntax.

password

  • Value type is password
  • There is no default value for this setting.

Password to authenticate to a secure Elasticsearch cluster.

path

  • Value type is string
  • There is no default value for this setting.

HTTP path at which the Elasticsearch server lives. Use this if you must run Elasticsearch behind a proxy that remaps the root path where the Elasticsearch HTTP API lives. Note that if you use paths as components of URLs in the hosts field, you may not also set this field; doing so will raise an error at startup.

pipeline

  • Value type is string
  • Default value is nil

Set which ingest pipeline you wish to execute for an event. You can also use event-dependent configuration here, like pipeline => "%{INGEST_PIPELINE}".
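
One way to use the event-dependent form is to set the pipeline name in a [@metadata] field first. The log_type field and the pipeline names are hypothetical, and the ingest pipelines are assumed to already exist in Elasticsearch:

```
filter {
  # Choose an ingest pipeline per event; both pipeline names are assumptions.
  if [log_type] == "access" {
    mutate { add_field => { "[@metadata][ingest_pipeline]" => "access-logs" } }
  } else {
    mutate { add_field => { "[@metadata][ingest_pipeline]" => "generic-logs" } }
  }
}
output {
  elasticsearch {
    pipeline => "%{[@metadata][ingest_pipeline]}"
  }
}
```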

pool_max

  • Value type is number
  • Default value is 1000

While the output tries to reuse connections efficiently, there is a maximum. This sets the maximum number of open connections the output will create. Setting this too low may mean frequently closing and reopening connections, which adds overhead.

pool_max_per_route

  • Value type is number
  • Default value is 100

While the output tries to reuse connections efficiently, there is a maximum per endpoint. This sets the maximum number of open connections per endpoint the output will create. Setting this too low may mean frequently closing and reopening connections, which adds overhead.

proxy

  • Value type is uri
  • There is no default value for this setting.

Set the address of a forward HTTP proxy. This setting accepts only URI arguments to prevent leaking credentials. An empty string is treated as if proxy was not set. This is useful when using environment variables e.g. proxy => '${LS_PROXY:}'.

resurrect_delay

  • Value type is number
  • Default value is 5

How long to wait, in seconds, between resurrection attempts. Resurrection is the process by which backend endpoints marked down are checked to see if they have come back to life.

retry_initial_interval

  • Value type is number
  • Default value is 2

Set the initial interval, in seconds, between bulk retries. The interval is doubled on each retry, up to retry_max_interval.

retry_max_interval

  • Value type is number
  • Default value is 64

Set the maximum interval, in seconds, between bulk retries.
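
To make the interplay of the two retry settings concrete, here is a sketch with non-default values. The numbers are assumptions, and the sequence in the comment assumes the interval is doubled and then capped at the maximum:

```
output {
  elasticsearch {
    # Waits between bulk retries would be 1, 2, 4, 8, 16 seconds,
    # then stay at the 30 second cap for every further retry.
    retry_initial_interval => 1
    retry_max_interval => 30
  }
}
```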

retry_on_conflict

  • Value type is number
  • Default value is 1

The number of times Elasticsearch should internally retry an update or upsert of a document.

routing

  • Value type is string
  • There is no default value for this setting.

A routing override to be applied to all processed events. This can be dynamic using the %{foo} syntax.

script

  • Value type is string
  • Default value is ""

Set the script name for scripted update mode.

Example:

output {
  elasticsearch {
    script => "ctx._source.message = params.event.get('message')"
  }
}
		
script_lang

  • Value type is string
  • Default value is "painless"

Set the language of the used script. If not set, this defaults to painless in ES 5.0. When using indexed (stored) scripts on Elasticsearch 6 and higher, you must set this parameter to "" (empty string).

script_type

  • Value can be any of: inline, indexed, file
  • Default value is ["inline"]

Define the type of script referenced by the "script" variable:

  • inline: "script" contains an inline script
  • indexed: "script" contains the name of a script directly indexed in Elasticsearch
  • file: "script" contains the name of a script stored in Elasticsearch’s config directory
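
As a sketch of the indexed type, the output below references a stored script by name. The script name and the doc_id field are hypothetical, and the script is assumed to already be stored in Elasticsearch:

```
output {
  elasticsearch {
    action => "update"
    # Hypothetical event field used as the document ID.
    document_id => "%{[doc_id]}"
    # Hypothetical name of a script already stored in Elasticsearch.
    script => "update_message"
    script_type => "indexed"
    # Must be an empty string for stored scripts on Elasticsearch 6 and higher.
    script_lang => ""
  }
}
```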

script_var_name

  • Value type is string
  • Default value is "event"

Set the variable name passed to the script (scripted update).

scripted_upsert

  • Value type is boolean
  • Default value is false

If enabled, the script is in charge of creating a non-existent document (scripted update).

sniffing

  • Value type is boolean
  • Default value is false

This setting asks Elasticsearch for the list of all cluster nodes and adds them to the hosts list. For Elasticsearch 1.x and 2.x any nodes with http.enabled (on by default) will be added to the hosts list, including master-only nodes! For Elasticsearch 5.x and 6.x any nodes with http.enabled (on by default) will be added to the hosts list, excluding master-only nodes.

sniffing_delay

  • Value type is number
  • Default value is 5

How long to wait, in seconds, between sniffing attempts.
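
For example, sniffing with a longer delay could be configured as follows. The seed host and the delay value are assumptions:

```
output {
  elasticsearch {
    # Seed node; discovered nodes are added to this list.
    hosts => ["http://127.0.0.1:9200"]
    sniffing => true
    # Refresh the node list every 10 seconds instead of the default 5.
    sniffing_delay => 10
  }
}
```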

sniffing_path

  • Value type is string
  • There is no default value for this setting.

HTTP path to be used for the sniffing requests. The default value is computed by concatenating the path value and "_nodes/http". If sniffing_path is set, it will be used as an absolute path. Do not use a full URL here, only paths, e.g. "/sniff/_nodes/http".

ssl

  • Value type is boolean
  • There is no default value for this setting.

Enable SSL/TLS secured communication to the Elasticsearch cluster. Leaving this unspecified will use whatever scheme is specified in the URLs listed in hosts. If no explicit protocol is specified, plain HTTP will be used. If SSL is explicitly disabled here, the plugin will refuse to start if an HTTPS URL is given in hosts.

ssl_certificate_verification

  • Value type is boolean
  • Default value is true

Option to validate the server’s certificate. Disabling this severely compromises security. For more information on disabling certificate verification, please read https://www.cs.utexas.edu/~shmat/shmat_ccs12.pdf
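
A minimal sketch of a TLS-enabled output that keeps certificate verification on and trusts a private CA. The host and certificate path are assumptions:

```
output {
  elasticsearch {
    hosts => ["https://es.example.com:9200"]
    ssl => true
    # Hypothetical path to the CA certificate that signed the server certificate.
    cacert => "/etc/logstash/certs/ca.pem"
    # Left at its default; shown only to make the intent explicit.
    ssl_certificate_verification => true
  }
}
```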

template

  • Value type is path
  • There is no default value for this setting.

You can set the path to your own template here, if you so desire. If not set, the included template will be used.

template_name

  • Value type is string
  • Default value is "logstash"

This configuration option defines how the template is named inside Elasticsearch. Note that if you have used the template management features and subsequently change this, you will need to prune the old template manually, e.g.

curl -XDELETE 'http://localhost:9200/_template/OldTemplateName?pretty'

where OldTemplateName is whatever the former setting was.

template_overwrite

  • Value type is boolean
  • Default value is false

The template_overwrite option will always overwrite the indicated template in Elasticsearch with either the one indicated by template or the included one. This option is set to false by default. If you always want to stay up to date with the template provided by Logstash, this option could be very useful to you. Likewise, if you have your own template file managed by puppet, for example, and you wanted to be able to update it regularly, this option could help there as well.

Please note that if you are using your own customized version of the Logstash template (logstash), setting this to true will make Logstash overwrite the "logstash" template (i.e. remove all customized settings).

timeout

  • Value type is number
  • Default value is 60

Set the timeout, in seconds, for network operations and requests sent to Elasticsearch. If a timeout occurs, the request will be retried.

truststore

  • Value type is path
  • There is no default value for this setting.

The truststore to validate the server’s certificate. It can be either .jks or .p12. Use either truststore or cacert.

truststore_password

  • Value type is password
  • There is no default value for this setting.

Set the truststore password.

upsert

  • Value type is string
  • Default value is ""

Set upsert content for update mode. Creates a new document with this parameter as a JSON string if document_id doesn’t exist.

user

  • Value type is string
  • There is no default value for this setting.

Username to authenticate to a secure Elasticsearch cluster.

validate_after_inactivity

  • Value type is number
  • Default value is 10000

How long to wait before checking if the connection is stale before executing a request on a connection using keepalive. You may want to set this lower if you get connection errors regularly. Quoting the Apache Commons docs (this client is based on Apache Commons): "Defines period of inactivity in milliseconds after which persistent connections must be re-validated prior to being leased to the consumer. Non-positive value passed to this method disables connection validation. This check helps detect connections that have become stale (half-closed) while kept inactive in the pool." See these docs for more info.

version

  • Value type is string
  • There is no default value for this setting.

The version to use for indexing. Use sprintf syntax like %{my_version} to use a field value here. See https://www.elastic.co/blog/elasticsearch-versioning-support.

version_type

  • Value can be any of: internal, external, external_gt, external_gte, force
  • There is no default value for this setting.

The version_type to use for indexing. See https://www.elastic.co/blog/elasticsearch-versioning-support. See also https://www.elastic.co/guide/en/elasticsearch/reference/7.8/docs-index_.html#_version_types
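
As an illustration of version and version_type together, the sketch below takes the version from the event so that Elasticsearch rejects writes carrying an older version. The record_id and revision fields are hypothetical:

```
output {
  elasticsearch {
    # Hypothetical event fields holding the document ID and its version number.
    document_id => "%{[record_id]}"
    version => "%{[revision]}"
    # With external versioning, a write succeeds only if its version is higher
    # than the one currently stored.
    version_type => "external"
  }
}
```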

Common Options

These configuration options are supported by all output plugins:

Setting | Input type | Required
enable_metric | boolean | No
id | string | No

enable_metric

  • Value type is boolean
  • Default value is true

Disable or enable metric logging for this specific plugin instance. By default we record all the metrics we can, but you can disable metrics collection for a specific plugin.

id

  • Value type is string
  • There is no default value for this setting.

Add a unique ID to the plugin configuration. If no ID is specified, Logstash will generate one. It is strongly recommended to set this ID in your configuration. This is particularly useful when you have two or more plugins of the same type, for example, two elasticsearch outputs. Adding a named ID in this case will help in monitoring Logstash when using the monitoring APIs.

output {
  elasticsearch {
    id => "my_plugin_id"
  }
}