= kafka
:type: output
:status: stable
:categories: ["Services"]

////
THIS FILE IS AUTOGENERATED!

To make changes, edit the corresponding source file under:

https://github.com/redpanda-data/connect/tree/main/internal/impl/.

And:

https://github.com/redpanda-data/connect/tree/main/cmd/tools/docs_gen/templates/plugin.adoc.tmpl.
////

// © 2024 Redpanda Data Inc.

component_type_dropdown::[]

The kafka output type writes a batch of messages to Kafka brokers and waits for acknowledgement before propagating it back to the input.

[tabs]
======
Common::
+
--
```yml
# Common config fields, showing default values
output:
  label: ""
  kafka:
    addresses: [] # No default (required)
    topic: "" # No default (required)
    target_version: 2.1.0 # No default (optional)
    key: ""
    partitioner: fnv1a_hash
    compression: none
    static_headers: {} # No default (optional)
    metadata:
      exclude_prefixes: []
    max_in_flight: 64
    batching:
      count: 0
      byte_size: 0
      period: ""
      check: ""
```
--
Advanced::
+
--
```yml
# All config fields, showing default values
output:
  label: ""
  kafka:
    addresses: [] # No default (required)
    tls:
      enabled: false
      skip_cert_verify: false
      enable_renegotiation: false
      root_cas: ""
      root_cas_file: ""
      client_certs: []
    sasl:
      mechanism: none
      user: ""
      password: ""
      access_token: ""
      token_cache: ""
      token_key: ""
    topic: "" # No default (required)
    client_id: benthos
    target_version: 2.1.0 # No default (optional)
    rack_id: ""
    key: ""
    partitioner: fnv1a_hash
    partition: ""
    custom_topic_creation:
      enabled: false
      partitions: -1
      replication_factor: -1
    compression: none
    static_headers: {} # No default (optional)
    metadata:
      exclude_prefixes: []
    inject_tracing_map: meta = @.merge(this) # No default (optional)
    max_in_flight: 64
    idempotent_write: false
    ack_replicas: false
    max_msg_bytes: 1000000
    timeout: 5s
    retry_as_batch: false
    batching:
      count: 0
      byte_size: 0
      period: ""
      check: ""
      processors: [] # No default (optional)
    max_retries: 0
    backoff:
      initial_interval: 3s
      max_interval: 10s
      max_elapsed_time: 30s
    timestamp: ${! timestamp_unix() } # No default (optional)
```
--
======

The config field `ack_replicas` determines whether we wait for acknowledgement from all replicas or just a single broker.

Both the `key` and `topic` fields can be dynamically set using function interpolations described in xref:configuration:interpolation.adoc#bloblang-queries[Bloblang queries].

xref:configuration:metadata.adoc[Metadata] will be added to each message sent as headers (version 0.11+), but can be restricted using the field <<metadata>>.

== Strict ordering and retries

When strict ordering is required for messages written to topic partitions it is important to ensure that both the field `max_in_flight` is set to `1` and that the field `retry_as_batch` is set to `true`.

You must also ensure that failed batches are never rerouted back to the same output. This can be done by setting the field `max_retries` to `0` and `backoff.max_elapsed_time` to empty, which will apply back pressure indefinitely until the batch is sent successfully.

However, this also means that manual intervention will eventually be required in cases where the batch cannot be sent due to configuration problems such as an incorrect `max_msg_bytes` estimate. A less strict but automated alternative would be to route failed batches to a dead letter queue using a xref:components:outputs/fallback.adoc[`fallback` broker], but this would allow subsequent batches to be delivered in the meantime whilst those failed batches are dealt with.
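As an illustration, a minimal sketch of the strict ordering setup described above; the broker address and topic name are placeholders:

```yml
# A sketch, not a drop-in config: address and topic are placeholders
output:
  kafka:
    addresses: [ localhost:9092 ]
    topic: ordered_events
    max_in_flight: 1     # one batch in flight at a time
    retry_as_batch: true # retry whole batches to preserve ordering
    max_retries: 0       # no discrete retry limit
    backoff:
      max_elapsed_time: "" # retry indefinitely, applying back pressure
```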
== Troubleshooting

If you're seeing issues writing to or reading from Kafka with this component then it's worth trying out the newer xref:components:outputs/kafka_franz.adoc[`kafka_franz` output].

- I'm seeing logs that report `Failed to connect to kafka: kafka: client has run out of available brokers to talk to (Is your cluster reachable?)`, but the brokers are definitely reachable.

Unfortunately this error message will appear for a wide range of connection problems even when the broker endpoint can be reached. Double check your authentication configuration and also ensure that you have <<tls,enabled TLS>> if applicable.

== Performance

This output benefits from sending multiple messages in flight in parallel for improved performance. You can tune the max number of in flight messages (or message batches) with the field `max_in_flight`.

This output benefits from sending messages as a batch for improved performance. Batches can be formed at both the input and output level. You can find out more xref:configuration:batching.adoc[in this doc].

== Fields

=== `addresses`

A list of broker addresses to connect to. If an item of the list contains commas it will be expanded into multiple addresses.

*Type*: `array`

```yml
# Examples

addresses:
  - localhost:9092

addresses:
  - localhost:9041,localhost:9042

addresses:
  - localhost:9041
  - localhost:9042
```

=== `tls`

Custom TLS settings can be used to override system defaults.

*Type*: `object`

=== `tls.enabled`

Whether custom TLS settings are enabled.

*Type*: `bool`

*Default*: `false`

=== `tls.skip_cert_verify`

Whether to skip server side certificate verification.

*Type*: `bool`

*Default*: `false`

=== `tls.enable_renegotiation`

Whether to allow the remote server to repeatedly request renegotiation. Enable this option if you're seeing the error message `local error: tls: no renegotiation`.

*Type*: `bool`

*Default*: `false`

Requires version 3.45.0 or newer

=== `tls.root_cas`

An optional root certificate authority to use. This is a string, representing a certificate chain from the parent trusted root certificate, to possible intermediate signing certificates, to the host certificate.

[CAUTION]
====
This field contains sensitive information that usually shouldn't be added to a config directly, read our xref:configuration:secrets.adoc[secrets page for more info].
====

*Type*: `string`

*Default*: `""`

```yml
# Examples

root_cas: |-
  -----BEGIN CERTIFICATE-----
  ...
  -----END CERTIFICATE-----
```

=== `tls.root_cas_file`

An optional path of a root certificate authority file to use. This is a file, often with a .pem extension, containing a certificate chain from the parent trusted root certificate, to possible intermediate signing certificates, to the host certificate.

*Type*: `string`

*Default*: `""`

```yml
# Examples

root_cas_file: ./root_cas.pem
```

=== `tls.client_certs`

A list of client certificates to use. For each certificate either the fields `cert` and `key`, or `cert_file` and `key_file` should be specified, but not both.

*Type*: `array`

*Default*: `[]`

```yml
# Examples

client_certs:
  - cert: foo
    key: bar

client_certs:
  - cert_file: ./example.pem
    key_file: ./example.key
```

=== `tls.client_certs[].cert`

A plain text certificate to use.

*Type*: `string`

*Default*: `""`

=== `tls.client_certs[].key`

A plain text certificate key to use.

[CAUTION]
====
This field contains sensitive information that usually shouldn't be added to a config directly, read our xref:configuration:secrets.adoc[secrets page for more info].
====

*Type*: `string`

*Default*: `""`
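In line with the caution above, a sketch that keeps key material out of the config file by interpolating environment variables; the variable names `TLS_CERT` and `TLS_KEY` are hypothetical:

```yml
# A sketch: TLS_CERT and TLS_KEY are hypothetical environment variables
tls:
  enabled: true
  client_certs:
    - cert: ${TLS_CERT}
      key: ${TLS_KEY}
```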
=== `tls.client_certs[].cert_file`

The path of a certificate to use.

*Type*: `string`

*Default*: `""`

=== `tls.client_certs[].key_file`

The path of a certificate key to use.

*Type*: `string`

*Default*: `""`

=== `tls.client_certs[].password`

A plain text password for when the private key is password encrypted in PKCS#1 or PKCS#8 format. The obsolete `pbeWithMD5AndDES-CBC` algorithm is not supported for the PKCS#8 format.

Because the obsolete pbeWithMD5AndDES-CBC algorithm does not authenticate the ciphertext, it is vulnerable to padding oracle attacks that can let an attacker recover the plaintext.

[CAUTION]
====
This field contains sensitive information that usually shouldn't be added to a config directly, read our xref:configuration:secrets.adoc[secrets page for more info].
====

*Type*: `string`

*Default*: `""`

```yml
# Examples

password: foo

password: ${KEY_PASSWORD}
```

=== `sasl`

Enables SASL authentication.

*Type*: `object`

=== `sasl.mechanism`

The SASL authentication mechanism, if left empty SASL authentication is not used.

*Type*: `string`

*Default*: `"none"`

|===
| Option | Summary

| `OAUTHBEARER`
| OAuth Bearer based authentication.

| `PLAIN`
| Plain text authentication. NOTE: When using plain text auth it is extremely likely that you'll also need to <<tls,enable TLS>>.

| `SCRAM-SHA-256`
| Authentication using the SCRAM-SHA-256 mechanism.

| `SCRAM-SHA-512`
| Authentication using the SCRAM-SHA-512 mechanism.

| `none`
| Default, no SASL authentication.

|===

=== `sasl.user`

A PLAIN username. It is recommended that you use environment variables to populate this field.

*Type*: `string`

*Default*: `""`

```yml
# Examples

user: ${USER}
```

=== `sasl.password`

A PLAIN password. It is recommended that you use environment variables to populate this field.

[CAUTION]
====
This field contains sensitive information that usually shouldn't be added to a config directly, read our xref:configuration:secrets.adoc[secrets page for more info].
====

*Type*: `string`

*Default*: `""`

```yml
# Examples

password: ${PASSWORD}
```

=== `sasl.access_token`

A static OAUTHBEARER access token.

*Type*: `string`

*Default*: `""`

=== `sasl.token_cache`

Instead of using a static `access_token` allows you to query a xref:components:caches/about.adoc[`cache`] resource to fetch OAUTHBEARER tokens from.

*Type*: `string`

*Default*: `""`

=== `sasl.token_key`

Required when using a `token_cache`, the key to query the cache with for tokens.

*Type*: `string`

*Default*: `""`

=== `topic`

The topic to publish messages to.
This field supports xref:configuration:interpolation.adoc#bloblang-queries[interpolation functions].

*Type*: `string`

=== `client_id`

An identifier for the client connection.

*Type*: `string`

*Default*: `"benthos"`

=== `target_version`

The version of the Kafka protocol to use. This limits the capabilities used by the client and should ideally match the version of your brokers. Defaults to the oldest supported stable version.

*Type*: `string`

```yml
# Examples

target_version: 2.1.0

target_version: 3.1.0
```

=== `rack_id`

A rack identifier for this client.

*Type*: `string`

*Default*: `""`

=== `key`

The key to publish messages with.
This field supports xref:configuration:interpolation.adoc#bloblang-queries[interpolation functions].

*Type*: `string`

*Default*: `""`

=== `partitioner`

The partitioning algorithm to use.

*Type*: `string`

*Default*: `"fnv1a_hash"`

Options: `fnv1a_hash`, `murmur2_hash`, `random`, `round_robin`, `manual`.
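For example, a sketch of manual partitioning via the `partition` field documented below; the `partition_id` metadata key is hypothetical and assumed to be set by an upstream component:

```yml
# A sketch: assumes upstream components set a partition_id metadata key
partitioner: manual
partition: ${! metadata("partition_id") }
```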
=== `partition`

The manually-specified partition to publish messages to, relevant only when the field `partitioner` is set to `manual`. Must be able to parse as a 32-bit integer.
This field supports xref:configuration:interpolation.adoc#bloblang-queries[interpolation functions].

*Type*: `string`

*Default*: `""`

=== `custom_topic_creation`

If enabled, topics will be created with the specified number of partitions and replication factor if they do not already exist.

*Type*: `object`

=== `custom_topic_creation.enabled`

Whether to enable custom topic creation.

*Type*: `bool`

*Default*: `false`

=== `custom_topic_creation.partitions`

The number of partitions to create for new topics. Leave at `-1` to use the broker configured default. Otherwise, must be at least 1.

*Type*: `int`

*Default*: `-1`

=== `custom_topic_creation.replication_factor`

The replication factor to use for new topics. Leave at `-1` to use the broker configured default. Otherwise, must be an odd number, and less than or equal to the number of brokers.

*Type*: `int`

*Default*: `-1`

=== `compression`

The compression algorithm to use.

*Type*: `string`

*Default*: `"none"`

Options: `none`, `snappy`, `lz4`, `gzip`, `zstd`.

=== `static_headers`

An optional map of static headers that should be added to messages in addition to metadata.

*Type*: `object`

```yml
# Examples

static_headers:
  first-static-header: value-1
  second-static-header: value-2
```

=== `metadata`

Specify criteria for which metadata values are sent with messages as headers.

*Type*: `object`

=== `metadata.exclude_prefixes`

Provide a list of explicit metadata key prefixes to be excluded when adding metadata to sent messages.

*Type*: `array`

*Default*: `[]`

=== `inject_tracing_map`

EXPERIMENTAL: A xref:guides:bloblang/about.adoc[Bloblang mapping] used to inject an object containing tracing propagation information into outbound messages. The specification of the injected fields will match the format used by the service wide tracer.

*Type*: `string`

Requires version 3.45.0 or newer

```yml
# Examples

inject_tracing_map: meta = @.merge(this)

inject_tracing_map: root.meta.span = this
```

=== `max_in_flight`

The maximum number of messages to have in flight at a given time. Increase this to improve throughput.

*Type*: `int`

*Default*: `64`

=== `idempotent_write`

Enable the idempotent write producer option. This requires the `IDEMPOTENT_WRITE` permission on `CLUSTER` and can be disabled if this permission is not available.

*Type*: `bool`

*Default*: `false`

=== `ack_replicas`

Ensure that messages have been copied across all replicas before acknowledging receipt.

*Type*: `bool`

*Default*: `false`

=== `max_msg_bytes`

The maximum size in bytes of messages sent to the target topic.

*Type*: `int`

*Default*: `1000000`

=== `timeout`

The maximum period of time to wait for message sends before abandoning the request and retrying.

*Type*: `string`

*Default*: `"5s"`

=== `retry_as_batch`

When enabled, forces an entire batch of messages to be retried if any individual message fails on a send, otherwise only the individual messages that failed are retried. Disabling this helps to reduce message duplicates during intermittent errors, but also makes it impossible to guarantee strict ordering of messages.

*Type*: `bool`

*Default*: `false`
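Combining several of the delivery fields above, a sketch of a configuration that favours delivery guarantees over throughput; the broker address and topic name are placeholders:

```yml
# A sketch: trades throughput for stronger delivery guarantees
output:
  kafka:
    addresses: [ localhost:9092 ]
    topic: audit_log
    idempotent_write: true # requires the IDEMPOTENT_WRITE permission on CLUSTER
    ack_replicas: true     # wait for all replicas to acknowledge
    retry_as_batch: true   # retry whole batches rather than individual messages
```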
=== `batching`

Allows you to configure a xref:configuration:batching.adoc[batching policy].

*Type*: `object`

```yml
# Examples

batching:
  byte_size: 5000
  count: 0
  period: 1s

batching:
  count: 10
  period: 1s

batching:
  check: this.contains("END BATCH")
  count: 0
  period: 1m
```

=== `batching.count`

A number of messages at which the batch should be flushed. If `0` disables count based batching.

*Type*: `int`

*Default*: `0`

=== `batching.byte_size`

An amount of bytes at which the batch should be flushed. If `0` disables size based batching.

*Type*: `int`

*Default*: `0`

=== `batching.period`

A period in which an incomplete batch should be flushed regardless of its size.

*Type*: `string`

*Default*: `""`

```yml
# Examples

period: 1s

period: 1m

period: 500ms
```

=== `batching.check`

A xref:guides:bloblang/about.adoc[Bloblang query] that should return a boolean value indicating whether a message should end a batch.

*Type*: `string`

*Default*: `""`

```yml
# Examples

check: this.type == "end_of_transaction"
```

=== `batching.processors`

A list of xref:components:processors/about.adoc[processors] to apply to a batch as it is flushed. This allows you to aggregate and archive the batch however you see fit. Please note that all resulting messages are flushed as a single batch, therefore splitting the batch into smaller batches using these processors is a no-op.

*Type*: `array`

```yml
# Examples

processors:
  - archive:
      format: concatenate

processors:
  - archive:
      format: lines

processors:
  - archive:
      format: json_array
```

=== `max_retries`

The maximum number of retries before giving up on the request. If set to zero there is no discrete limit.

*Type*: `int`

*Default*: `0`

=== `backoff`

Control time intervals between retry attempts.

*Type*: `object`

=== `backoff.initial_interval`

The initial period to wait between retry attempts.

*Type*: `string`

*Default*: `"3s"`

```yml
# Examples

initial_interval: 50ms

initial_interval: 1s
```

=== `backoff.max_interval`

The maximum period to wait between retry attempts.

*Type*: `string`

*Default*: `"10s"`

```yml
# Examples

max_interval: 5s

max_interval: 1m
```

=== `backoff.max_elapsed_time`

The maximum overall period of time to spend on retry attempts before the request is aborted. Setting this value to a zeroed duration (such as `0s`) will result in unbounded retries.

*Type*: `string`

*Default*: `"30s"`

```yml
# Examples

max_elapsed_time: 1m

max_elapsed_time: 1h
```

=== `timestamp`

An optional timestamp to set for each message. When left empty, the current timestamp is used.
This field supports xref:configuration:interpolation.adoc#bloblang-queries[interpolation functions].

*Type*: `string`

```yml
# Examples

timestamp: ${! timestamp_unix() }

timestamp: ${! metadata("kafka_timestamp_unix") }
```
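Finally, a minimal end-to-end sketch combining several of the fields documented above; every address, topic and field name here is a placeholder:

```yml
# A sketch only: replace the placeholders with values for your cluster
output:
  kafka:
    addresses: [ broker-1:9092, broker-2:9092 ]
    topic: ${! metadata("target_topic") } # interpolated per message
    key: ${! json("user.id") }            # hash-partition by a document field
    compression: gzip
    max_in_flight: 64
    batching:
      count: 50
      period: 1s
```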