
Azure Blob Storage

Microsoft Azure Long Term Storage

Synopsis

Creates a target that writes log messages to Azure Blob Storage with support for various file formats, authentication methods, and retry mechanisms. Inherits file format capabilities from the base file target.

Schema

```yaml
- name: <string>
  description: <string>
  type: azblob
  pipelines: <pipeline[]>
  status: <boolean>
  properties:
    account: <string>
    tenant_id: <string>
    client_id: <string>
    client_secret: <string>
    container: <string>
    name: <string>
    format: <string>
    extension: <string>
    compression: <string>
    schema: <string>
    field_format: <string>
    no_buffer: <boolean>
    timeout: <numeric>
    max_size: <numeric>
    batch_size: <numeric>
    containers:
      - container: <string>
        name: <string>
        format: <string>
        compression: <string>
        extension: <string>
        schema: <string>
    function_app: <string>
    function_token: <string>
    interval: <string|numeric>
    cron: <string>
    debug:
      status: <boolean>
      dont_send_logs: <boolean>
```

Configuration

The following fields are used to define the target:

| Field | Required | Default | Description |
|---|---|---|---|
| name | Y | - | Target name |
| description | N | - | Optional description |
| type | Y | - | Must be azblob |
| pipelines | N | - | Optional post-processor pipelines |
| status | N | true | Enable/disable the target |

Azure

| Field | Required | Default | Description |
|---|---|---|---|
| account | Y | - | Azure storage account name |
| tenant_id | N(1) | - | Azure tenant ID (required unless using managed identity or function app) |
| client_id | N(1) | - | Azure client ID (required unless using managed identity or function app) |
| client_secret | N(1) | - | Azure client secret (required unless using managed identity or function app) |
| container | N(2) | "vmetric" | Default/fallback container name (catch-all for unmatched events) |

(1) = Conditionally required: not needed when authenticating with a managed identity or a Function App.
(2) = Required if you want a catch-all container for unmatched events, or if not using the containers array.

Connection

| Field | Required | Default | Description |
|---|---|---|---|
| timeout | N | 30 | Connection timeout in seconds |
| max_size | N | 0 | Maximum file size in bytes before uploading |
| batch_size | N | 100000 | Maximum number of messages per file |
Note: When max_size is reached, the current file is uploaded to blob storage and a new file is created. For unlimited file size, set the field to 0.
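As an illustrative sketch of how these fields combine (the target name and the values below are examples, not recommendations), a file is rotated and uploaded once it grows past the configured size or message count:

```yaml
targets:
  - name: sized_blob            # illustrative name
    type: azblob
    properties:
      account: "mystorageaccount"
      # tenant_id, client_id, client_secret omitted for brevity
      container: "logs"
      timeout: 60               # wait up to 60 seconds for the connection
      max_size: 268435456       # upload and start a new file at ~256MB
      batch_size: 50000         # at most 50,000 messages per file
```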

Function App (Optional)

| Field | Required | Default | Description |
|---|---|---|---|
| function_app | N | - | Azure Function App URL for uploading blobs |
| function_token | N | - | Authentication token for the Function App |

If function_app is specified, the target uploads blobs through the Function App instead of using the Azure Blob Storage SDK directly. This is useful when direct access to the storage account is restricted.

Files

The following fields can be used for files:

| Field | Required | Default | Description |
|---|---|---|---|
| name | N | "vmetric.{{.Timestamp}}.{{.Extension}}" | Blob name template |
| format | N | "json" | File format (json, multijson, avro, parquet) |
| extension | N | Matches format | Custom file extension |
| compression | N | "zstd" | Compression algorithm |
| schema | N | - | Data schema for Avro/Parquet formats. Can be a built-in schema name, a path to a schema file, or an inline schema definition |
| no_buffer | N | false | Disable write buffering |
| field_format | N | - | Data normalization format. See the applicable Normalization section |

Multiple Containers

You can define multiple output containers with different settings:

```yaml
targets:
  - name: multi_container_blob
    type: azblob
    properties:
      containers:
        - container: "security-logs"
          name: "security_{{.Year}}_{{.Month}}_{{.Day}}.parquet"
          format: "parquet"
          schema: "CommonSecurityLog"
        - container: "system-logs"
          name: "system_{{.Year}}_{{.Month}}_{{.Day}}.json"
          format: "json"
```

Scheduler

| Field | Required | Default | Description |
|---|---|---|---|
| interval | N | realtime | Execution frequency |
| cron | N | - | Cron expression for scheduled execution |
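For example, assuming a numeric interval is interpreted as seconds, a sketch of a scheduled (rather than realtime) target might look like this:

```yaml
targets:
  - name: scheduled_blob        # illustrative name
    type: azblob
    properties:
      account: "mystorageaccount"
      # tenant_id, client_id, client_secret omitted for brevity
      container: "logs"
      interval: 300             # assumed to mean: run every 300 seconds
      # cron: "0 * * * *"       # alternatively, a cron expression for hourly runs
```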

Debug Options

| Field | Required | Default | Description |
|---|---|---|---|
| debug.status | N | false | Enable debug logging |
| debug.dont_send_logs | N | false | Process logs but don't send to target (testing) |

Details

The Azure Blob Storage target supports writing to multiple containers with various file formats and schemas. When the SystemS3 field is set on a log entry, its value is used to route the message to the appropriate container.

Container Routing and Catch-All Behavior

The target uses a routing system to direct events to the appropriate container:

  1. Explicit Container Matching: If an event has a SystemS3 field, the target looks for a container defined in the containers array with a matching name
  2. Catch-All Container: If no matching container is found (or if SystemS3 is not set), the event is routed to the default container specified at the root level

The container and schema properties at the root level serve as a catch-all mechanism. This is particularly useful for automation scenarios where systems may look for specific containers with specific schemas. If no matching container is found in the containers array, these events will fall back to the default container instead of being dropped.
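As a sketch of this behavior (container names are illustrative), with the configuration below an event whose SystemS3 field equals "security-logs" is written to the security-logs container, while events without a SystemS3 value, or with an unmatched one, fall back to general-logs:

```yaml
targets:
  - name: routed_blob                 # illustrative name
    type: azblob
    properties:
      account: "mystorageaccount"
      # tenant_id, client_id, client_secret omitted for brevity
      container: "general-logs"       # catch-all for unmatched events
      containers:
        - container: "security-logs"  # matched when SystemS3 == "security-logs"
          format: "parquet"
          schema: "CommonSecurityLog"
```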

The target supports the following built-in schema templates for structured data formats:

  • Syslog - Standard schema for Syslog messages
  • CommonSecurityLog - Schema compatible with Common Security Log Format (CSL)

You can also reference custom schema files by name (without the .json extension). The system will search for schema files in:

  1. User schema directory: <user-path>/schemas/
  2. Package schema directory: <package-path>/schemas/

Schema files are searched recursively in these directories, and filename matching is case-insensitive.
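The Avro with Custom Schema example below shows a configuration that references such a schema file by name.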

Templates

The following template variables can be used in the blob name:

| Variable | Description | Example |
|---|---|---|
| {{.Year}} | Current year | 2024 |
| {{.Month}} | Current month | 01 |
| {{.Day}} | Current day | 15 |
| {{.Timestamp}} | Current timestamp in nanoseconds | 1703688533123456789 |
| {{.Format}} | File format | json |
| {{.Extension}} | File extension | json |
| {{.Compression}} | Compression type | zstd |
| {{.TargetName}} | Target name | my_logs |
| {{.TargetType}} | Target type | azblob |
| {{.Table}} | Container name | logs |
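As an illustration, the blob name template in the sketch below combines several of these variables; using the example values from the table, it would render to something like logs/my_logs/2024/01/15/data_1703688533123456789.json:

```yaml
targets:
  - name: my_logs
    type: azblob
    properties:
      account: "mystorageaccount"
      # tenant_id, client_id, client_secret omitted for brevity
      container: "logs"
      format: "json"
      name: "logs/{{.TargetName}}/{{.Year}}/{{.Month}}/{{.Day}}/data_{{.Timestamp}}.{{.Extension}}"
```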

Formats

| Format | Description |
|---|---|
| json | Each log entry is written as a separate JSON line (JSONL format) |
| multijson | All log entries are written as a single JSON array |
| avro | Apache Avro format with schema and compression support |
| parquet | Apache Parquet columnar format with schema |

Compression

Some formats support built-in compression to reduce storage costs and transfer times. When supported, compression is applied at the file/block level before upload.

| Format | Default | Compression Codecs |
|---|---|---|
| JSON | - | Not supported |
| MultiJSON | - | Not supported |
| Avro | zstd | deflate, snappy, zstd |
| Parquet | zstd | gzip, snappy, zstd, brotli, lz4 |

Note: Files with no messages (i.e. with counter=0) are automatically skipped during upload.

Examples

The following examples show common upload configurations.

JSON

The minimum configuration for JSON blob storage:

```yaml
targets:
  - name: basic_blob
    type: azblob
    properties:
      account: "mystorageaccount"
      tenant_id: "00000000-0000-0000-0000-000000000000"
      client_id: "00000000-0000-0000-0000-000000000000"
      client_secret: "your-client-secret"
```
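With only the authentication fields set, the remaining options fall back to the defaults described above: events are written as JSON lines to the vmetric container using the default blob name template.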

Multiple Containers with Catch-All

Configuration with multiple target containers and a catch-all default container:

```yaml
targets:
  - name: multi_container_blob
    type: azblob
    properties:
      account: "mystorageaccount"
      tenant_id: "00000000-0000-0000-0000-000000000000"
      client_id: "00000000-0000-0000-0000-000000000000"
      client_secret: "your-client-secret"
      # Catch-all container for unmatched events
      container: "general-logs"
      schema: "Syslog"
      containers:
        - container: "security-logs"
          name: "security_{{.Year}}_{{.Month}}_{{.Day}}.parquet"
          format: "parquet"
          schema: "CommonSecurityLog"
          compression: "zstd"
        - container: "system-logs"
          name: "system_{{.Year}}_{{.Month}}_{{.Day}}.json"
          format: "json"
        - container: "application-logs"
          name: "app_{{.Year}}_{{.Month}}_{{.Day}}.avro"
          format: "avro"
          schema: "Syslog"
          compression: "snappy"
```

Parquet

Configuration for daily partitioned Parquet files:

```yaml
targets:
  - name: parquet_blob
    type: azblob
    properties:
      account: "mystorageaccount"
      tenant_id: "00000000-0000-0000-0000-000000000000"
      client_id: "00000000-0000-0000-0000-000000000000"
      client_secret: "your-client-secret"
      container: "logs"
      format: "parquet"
      compression: "zstd"
      name: "logs/year={{.Year}}/month={{.Month}}/day={{.Day}}/data_{{.Timestamp}}.parquet"
      schema: "Syslog"
      max_size: 536870912 # 512MB
```

Avro with Custom Schema

Configuration for Avro format with a custom schema file:

```yaml
targets:
  - name: avro_blob
    type: azblob
    properties:
      account: "mystorageaccount"
      tenant_id: "00000000-0000-0000-0000-000000000000"
      client_id: "00000000-0000-0000-0000-000000000000"
      client_secret: "your-client-secret"
      container: "logs"
      format: "avro"
      compression: "snappy"
      name: "logs_{{.Year}}_{{.Month}}_{{.Day}}.avro"
      schema: "MyCustomSchema"
```

Function App Upload

Configuration using Azure Function App for uploading:

```yaml
targets:
  - name: function_blob
    type: azblob
    properties:
      account: "mystorageaccount"
      function_app: "https://my-function-app.azurewebsites.net/api/BlobStorage"
      function_token: "your-function-token"
      container: "logs"
```

Debug Configuration

Configuration with debugging enabled:

```yaml
targets:
  - name: debug_blob
    type: azblob
    properties:
      account: "mystorageaccount"
      tenant_id: "00000000-0000-0000-0000-000000000000"
      client_id: "00000000-0000-0000-0000-000000000000"
      client_secret: "your-client-secret"
      debug:
        status: true
        dont_send_logs: true # Test mode that doesn't actually upload
```

High Volume with Batching

Configuration optimized for high-volume ingestion:

```yaml
targets:
  - name: high_volume_blob
    type: azblob
    properties:
      account: "mystorageaccount"
      tenant_id: "00000000-0000-0000-0000-000000000000"
      client_id: "00000000-0000-0000-0000-000000000000"
      client_secret: "your-client-secret"
      container: "logs"
      format: "parquet"
      compression: "zstd"
      batch_size: 50000
      max_size: 536870912 # 512MB
      timeout: 60
```