= archive
:type: processor
:status: stable
:categories: ["Parsing","Utility"]

////
THIS FILE IS AUTOGENERATED!

To make changes, edit the corresponding source file under:

https://github.com/redpanda-data/connect/tree/main/internal/impl/.

And:

https://github.com/redpanda-data/connect/tree/main/cmd/tools/docs_gen/templates/plugin.adoc.tmpl.
////

// © 2024 Redpanda Data Inc.

component_type_dropdown::[]

Archives all the messages of a batch into a single message according to the selected archive format.

```yml
# Config fields, showing default values
label: ""
archive:
  format: "" # No default (required)
  path: ""
```

Some archive formats (such as `tar` and `zip`) treat each archive item (message part) as a file with a path. Since message parts only contain raw data, a unique path must be generated for each part. This can be done by using function interpolations on the `path` field, as described in xref:configuration:interpolation.adoc#bloblang-queries[Bloblang queries]. For formats that aren't file based (such as `binary`), the `path` field is ignored.

The resulting archived message adopts the metadata of the _first_ message part of the batch.

The functionality of this processor depends on being applied across messages that are batched. You can find out more about batching xref:configuration:batching.adoc[in this doc].

== Fields

=== `format`

The archiving format to apply.

*Type*: `string`

|===
| Option | Summary

| `binary`
| Archive messages to a https://github.com/redpanda-data/benthos/blob/main/internal/message/message.go#L96[binary blob format^].

| `concatenate`
| Join the raw contents of each message into a single binary message.

| `json_array`
| Attempt to parse each message as a JSON document and append the result to an array, which becomes the contents of the resulting message.

| `lines`
| Join the raw contents of each message and insert a line break between each one.

| `tar`
| Archive messages to a Unix standard tape archive.

| `zip`
| Archive messages to a zip file.

|===

=== `path`

The path to set for each message in the archive (when applicable).
This field supports xref:configuration:interpolation.adoc#bloblang-queries[interpolation functions].

*Type*: `string`

*Default*: `""`

```yml
# Examples

path: ${!count("files")}-${!timestamp_unix_nano()}.txt

path: ${!meta("kafka_key")}-${!json("id")}.json
```

== Examples

[tabs]
======
Tar Archive::
+
--

If we had JSON messages in a batch, each of the form:

```json
{"doc":{"id":"foo","body":"hello world 1"}}
```

And we wished to tar archive them, setting their filenames to their respective unique IDs (with the extension `.json`), our config might look like this:

```yaml
pipeline:
  processors:
    - archive:
        format: tar
        path: ${!json("doc.id")}.json
```

--
======
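
As a further illustration (a sketch, not taken from the generated examples above), the same batch of JSON messages could instead be collapsed into a single JSON array message using the `json_array` format. Since this format is not file based, no `path` is required:

```yaml
pipeline:
  processors:
    - archive:
        format: json_array
```

With this config, each message of the batch is parsed as a JSON document and appended to an array, so a batch containing `{"id":1}` and `{"id":2}` becomes the single message `[{"id":1},{"id":2}]`.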