= openai_chat_completion
:type: processor
:status: experimental
:categories: ["AI"]

////
     THIS FILE IS AUTOGENERATED!

     To make changes, edit the corresponding source file under:

     https://github.com/redpanda-data/connect/tree/main/internal/impl/.

     And:

     https://github.com/redpanda-data/connect/tree/main/cmd/tools/docs_gen/templates/plugin.adoc.tmpl
////

// © 2024 Redpanda Data Inc.

component_type_dropdown::[]

Generates responses to messages in a chat conversation, using the OpenAI API.

Introduced in version 4.32.0.

[tabs]
======
Common::
+
--

```yml
# Common config fields, showing default values
label: ""
openai_chat_completion:
  server_address: https://api.openai.com/v1
  api_key: "" # No default (required)
  model: gpt-4o # No default (required)
  prompt: "" # No default (optional)
  system_prompt: "" # No default (optional)
  image: 'root = this.image.decode("base64") # decode base64 encoded image' # No default (optional)
  max_tokens: 0 # No default (optional)
  temperature: 0 # No default (optional)
  user: "" # No default (optional)
  response_format: text
  json_schema:
    name: "" # No default (required)
    schema: "" # No default (required)
```

--
Advanced::
+
--

```yml
# All config fields, showing default values
label: ""
openai_chat_completion:
  server_address: https://api.openai.com/v1
  api_key: "" # No default (required)
  model: gpt-4o # No default (required)
  prompt: "" # No default (optional)
  system_prompt: "" # No default (optional)
  image: 'root = this.image.decode("base64") # decode base64 encoded image' # No default (optional)
  max_tokens: 0 # No default (optional)
  temperature: 0 # No default (optional)
  user: "" # No default (optional)
  response_format: text
  json_schema:
    name: "" # No default (required)
    description: "" # No default (optional)
    schema: "" # No default (required)
  schema_registry:
    url: "" # No default (required)
    name_prefix: schema_registry_id_
    subject: "" # No default (required)
    refresh_interval: "" # No default (optional)
    tls:
      skip_cert_verify: false
      enable_renegotiation: false
      root_cas: ""
      root_cas_file: ""
      client_certs: []
    oauth:
      enabled: false
      consumer_key: ""
      consumer_secret: ""
      access_token: ""
      access_token_secret: ""
    basic_auth:
      enabled: false
      username: ""
      password: ""
    jwt:
      enabled: false
      private_key_file: ""
      signing_method: ""
      claims: {}
      headers: {}
  top_p: 0 # No default (optional)
  frequency_penalty: 0 # No default (optional)
  presence_penalty: 0 # No default (optional)
  seed: 0 # No default (optional)
  stop: [] # No default (optional)
```

--
======

This processor sends the contents of user prompts to the OpenAI API, which generates responses. By default, the processor submits the entire payload of each message as a string, unless you use the `prompt` configuration field to customize it.

To learn more about chat completion, see the https://platform.openai.com/docs/guides/chat-completions[OpenAI API documentation^].

== Examples

[tabs]
======
Use GPT-4o to analyze an image::
+
--

This example fetches image URLs from stdin and has GPT-4o describe the image.

```yaml
input:
  stdin:
    scanner:
      lines: {}
pipeline:
  processors:
    - http:
        verb: GET
        url: "${!content().string()}"
    - openai_chat_completion:
        model: gpt-4o
        api_key: TODO
        prompt: "Describe the following image"
        image: "root = content()"
output:
  stdout:
    codec: lines
```

--
======

== Fields

=== `server_address`

The OpenAI API endpoint that the processor sends requests to. Update the default value to use another OpenAI-compatible service.

*Type*: `string`

*Default*: `"https://api.openai.com/v1"`

=== `api_key`

The API key for the OpenAI API.

[CAUTION]
====
This field contains sensitive information that usually shouldn't be added to a config directly. Read our xref:configuration:secrets.adoc[secrets page for more info].
====

*Type*: `string`

=== `model`

The name of the OpenAI model to use.

*Type*: `string`

```yml
# Examples

model: gpt-4o

model: gpt-4o-mini

model: gpt-4

model: gpt-4-turbo
```

=== `prompt`

The user prompt you want to generate a response for.
By default, the processor submits the entire payload as a string.
This field supports xref:configuration:interpolation.adoc#bloblang-queries[interpolation functions].

*Type*: `string`

=== `system_prompt`

The system prompt to submit along with the user prompt.
This field supports xref:configuration:interpolation.adoc#bloblang-queries[interpolation functions].

*Type*: `string`

=== `image`

An image to send along with the prompt. The mapping result must be a byte array.
This field supports xref:configuration:interpolation.adoc#bloblang-queries[interpolation functions].

*Type*: `string`

Requires version 4.38.0 or newer

```yml
# Examples

image: 'root = this.image.decode("base64") # decode base64 encoded image'
```

=== `max_tokens`

The maximum number of tokens that can be generated in the chat completion.

*Type*: `int`

=== `temperature`

The sampling temperature to use, between 0 and 2. Higher values, such as 0.8, make the output more random, while lower values, such as 0.2, make it more focused and deterministic.

We generally recommend altering this or `top_p`, but not both.

*Type*: `float`

=== `user`

A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse.
This field supports xref:configuration:interpolation.adoc#bloblang-queries[interpolation functions].

*Type*: `string`

=== `response_format`

Specifies the model's output format. If you set this field to `json_schema`, you must also configure either the `json_schema` or the `schema_registry` field.

*Type*: `string`

*Default*: `"text"`

Options: `text`, `json`, `json_schema`.

=== `json_schema`

The JSON schema to use when responding in `json_schema` format. To learn more about which JSON schema features are supported, see the https://platform.openai.com/docs/guides/structured-outputs/supported-schemas[OpenAI documentation^].

*Type*: `object`

=== `json_schema.name`

The name of the schema.

*Type*: `string`

=== `json_schema.description`

Additional description of the schema for the LLM.

*Type*: `string`

=== `json_schema.schema`

The JSON schema for the LLM to use when generating the output.

*Type*: `string`

=== `schema_registry`

The schema registry to dynamically load schemas from when responding in `json_schema` format. Schemas themselves must be in JSON format. To learn more about which JSON schema features are supported, see the https://platform.openai.com/docs/guides/structured-outputs/supported-schemas[OpenAI documentation^].

*Type*: `object`

=== `schema_registry.url`

The base URL of the schema registry service.

*Type*: `string`

=== `schema_registry.name_prefix`

The prefix of the name for this schema. The schema ID is used as a suffix.

*Type*: `string`

*Default*: `"schema_registry_id_"`

=== `schema_registry.subject`

The subject name to fetch the schema for.

*Type*: `string`

=== `schema_registry.refresh_interval`

The refresh rate for getting the latest schema. If not specified, the schema does not refresh.

*Type*: `string`

=== `schema_registry.tls`

Custom TLS settings can be used to override system defaults.

*Type*: `object`

=== `schema_registry.tls.skip_cert_verify`

Whether to skip server side certificate verification.

*Type*: `bool`

*Default*: `false`

=== `schema_registry.tls.enable_renegotiation`

Whether to allow the remote server to repeatedly request renegotiation. Enable this option if you're seeing the error message `local error: tls: no renegotiation`.

*Type*: `bool`

*Default*: `false`

Requires version 3.45.0 or newer

=== `schema_registry.tls.root_cas`

An optional root certificate authority to use. This is a string, representing a certificate chain from the parent trusted root certificate, to possible intermediate signing certificates, to the host certificate.

[CAUTION]
====
This field contains sensitive information that usually shouldn't be added to a config directly. Read our xref:configuration:secrets.adoc[secrets page for more info].
====

*Type*: `string`

*Default*: `""`

```yml
# Examples

root_cas: |-
  -----BEGIN CERTIFICATE-----
  ...
  -----END CERTIFICATE-----
```

=== `schema_registry.tls.root_cas_file`

An optional path of a root certificate authority file to use. This is a file, often with a .pem extension, containing a certificate chain from the parent trusted root certificate, to possible intermediate signing certificates, to the host certificate.

*Type*: `string`

*Default*: `""`

```yml
# Examples

root_cas_file: ./root_cas.pem
```

=== `schema_registry.tls.client_certs`

A list of client certificates to use. For each certificate, specify either the fields `cert` and `key`, or `cert_file` and `key_file`, but not both.

*Type*: `array`

*Default*: `[]`

```yml
# Examples

client_certs:
  - cert: foo
    key: bar

client_certs:
  - cert_file: ./example.pem
    key_file: ./example.key
```

=== `schema_registry.tls.client_certs[].cert`

A plain text certificate to use.

*Type*: `string`

*Default*: `""`

=== `schema_registry.tls.client_certs[].key`

A plain text certificate key to use.

[CAUTION]
====
This field contains sensitive information that usually shouldn't be added to a config directly. Read our xref:configuration:secrets.adoc[secrets page for more info].
====

*Type*: `string`

*Default*: `""`

=== `schema_registry.tls.client_certs[].cert_file`

The path of a certificate to use.

*Type*: `string`

*Default*: `""`

=== `schema_registry.tls.client_certs[].key_file`

The path of a certificate key to use.

*Type*: `string`

*Default*: `""`

=== `schema_registry.tls.client_certs[].password`

A plain text password for when the private key is password encrypted in PKCS#1 or PKCS#8 format. The obsolete `pbeWithMD5AndDES-CBC` algorithm is not supported for the PKCS#8 format.

Because the obsolete `pbeWithMD5AndDES-CBC` algorithm does not authenticate the ciphertext, it is vulnerable to padding oracle attacks that can let an attacker recover the plaintext.

[CAUTION]
====
This field contains sensitive information that usually shouldn't be added to a config directly. Read our xref:configuration:secrets.adoc[secrets page for more info].
====

*Type*: `string`

*Default*: `""`

```yml
# Examples

password: foo

password: ${KEY_PASSWORD}
```

=== `schema_registry.oauth`

Allows you to specify open authentication via OAuth version 1.

*Type*: `object`

=== `schema_registry.oauth.enabled`

Whether to use OAuth version 1 in requests.

*Type*: `bool`

*Default*: `false`

=== `schema_registry.oauth.consumer_key`

A value used to identify the client to the service provider.

*Type*: `string`

*Default*: `""`

=== `schema_registry.oauth.consumer_secret`

A secret used to establish ownership of the consumer key.

[CAUTION]
====
This field contains sensitive information that usually shouldn't be added to a config directly. Read our xref:configuration:secrets.adoc[secrets page for more info].
====

*Type*: `string`

*Default*: `""`

=== `schema_registry.oauth.access_token`

A value used to gain access to the protected resources on behalf of the user.

*Type*: `string`

*Default*: `""`

=== `schema_registry.oauth.access_token_secret`

A secret provided in order to establish ownership of a given access token.

[CAUTION]
====
This field contains sensitive information that usually shouldn't be added to a config directly. Read our xref:configuration:secrets.adoc[secrets page for more info].
====

*Type*: `string`

*Default*: `""`

=== `schema_registry.basic_auth`

Allows you to specify basic authentication.

*Type*: `object`

=== `schema_registry.basic_auth.enabled`

Whether to use basic authentication in requests.

*Type*: `bool`

*Default*: `false`

=== `schema_registry.basic_auth.username`

A username to authenticate as.

*Type*: `string`

*Default*: `""`

=== `schema_registry.basic_auth.password`

A password to authenticate with.

[CAUTION]
====
This field contains sensitive information that usually shouldn't be added to a config directly. Read our xref:configuration:secrets.adoc[secrets page for more info].
====

*Type*: `string`

*Default*: `""`

=== `schema_registry.jwt`

BETA: Allows you to specify JWT authentication.

*Type*: `object`

=== `schema_registry.jwt.enabled`

Whether to use JWT authentication in requests.

*Type*: `bool`

*Default*: `false`

=== `schema_registry.jwt.private_key_file`

A PEM-encoded private key file in PKCS#1 or PKCS#8 format.

*Type*: `string`

*Default*: `""`

=== `schema_registry.jwt.signing_method`

A method used to sign the token, such as RS256, RS384, RS512, or EdDSA.

*Type*: `string`

*Default*: `""`

=== `schema_registry.jwt.claims`

The claims to include in the JWT.

*Type*: `object`

*Default*: `{}`

=== `schema_registry.jwt.headers`

Add optional key/value headers to the JWT.

*Type*: `object`

*Default*: `{}`

=== `top_p`

An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with `top_p` probability mass. For example, 0.1 means only the tokens comprising the top 10% probability mass are considered.

We generally recommend altering this or `temperature`, but not both.

*Type*: `float`

=== `frequency_penalty`

Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

*Type*: `float`

=== `presence_penalty`

Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

*Type*: `float`

=== `seed`

If specified, OpenAI makes a best effort to sample deterministically, so that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed.

*Type*: `int`

=== `stop`

Up to 4 sequences where the API will stop generating further tokens.

*Type*: `array`
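
As a minimal sketch of how `stop` might be combined with the other sampling fields, the following fragment sets a low `temperature`, a fixed `seed`, and two stop sequences. The values shown, the stop strings, and the `OPENAI_API_KEY` environment variable name are illustrative assumptions, not defaults or recommendations.

```yml
# Illustrative values only, not defaults

openai_chat_completion:
  api_key: "${OPENAI_API_KEY}" # Assumed environment variable name
  model: gpt-4o
  temperature: 0.2 # Lower values make the output more focused
  seed: 42 # Best-effort determinism, not guaranteed
  stop: # Up to 4 sequences
    - "END"
    - "###"
```

Because `top_p` is left unset in this sketch, only `temperature` shapes the sampling, in line with the recommendation to alter one or the other but not both.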