STEF SDL

Schema Definition Language
Define schemas for STEF serialization

Overview

The STEF Schema Definition Language (SDL) is used to define schemas for STEF serialization. It provides a simple, type-safe way to describe data structures that can be efficiently serialized and deserialized using the STEF format.

Package Declaration

Every STEF schema file begins with a package declaration:

package com.example.myschema

Package names use dot notation and can have one or more dot-delimited components.

Language-Specific Package Handling

Different target languages handle package names differently when generating code:

Comments

STEF SDL supports single-line comments that begin with // and run to the end of the line:

// This is a comment
package com.example // Comments can appear at end of lines

Primitive Types

STEF SDL provides a set of primitive data types. The examples on this page use bool, int64, uint64, float64, and string.

Structs

Structs define composite data types with named fields:

struct Person {
  Name string
  Age uint64
  Email string
}

Root Structs

The root attribute marks a struct as the top-level record type in a STEF stream:

struct Record root {
  Timestamp uint64
  Data Person
}

Multiple structs can be marked as root in a single schema, allowing the STEF stream to contain different types of records:

struct MetricRecord root {
  Timestamp uint64
  Metric Metric
}

struct TraceRecord root {
  Timestamp uint64
  Span Span
}

When multiple root structs are defined, each record in the stream will be one of the root types, and the STEF format includes type information to distinguish between them during deserialization.

Dictionary Compression

Fields whose values repeat often can be dictionary-compressed with the dict modifier, which takes the dictionary name in parentheses:

struct Event {
  EventType string dict(EventTypes)
  Message string
}

Structs can also have dictionary compression applied:

struct Resource dict(Resources) {
  Name string
  Version string
}

Dictionary names allow the same dictionary to be shared across multiple fields, even in different structs, as long as the fields have the same type:

struct MetricEvent {
  ServiceName string dict(ServiceNames)
  EventType string dict(EventTypes)
}

struct TraceEvent {
  ServiceName string dict(ServiceNames)  // Same dictionary as above
  SpanName string dict(SpanNames)
}

This sharing enables more efficient compression when the same values appear across different record types.
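The idea behind a shared dictionary can be sketched in Go. This is a conceptual illustration only, not STEF's actual encoding or stefgen output: the Dict type and its Encode and Decode methods are hypothetical names introduced here. Each distinct value is stored once, and repeated occurrences are replaced by a small index, no matter which field they come from.

```go
package main

import "fmt"

// Dict is a hypothetical string dictionary: each distinct value is stored
// once, and later occurrences are represented by an integer index.
type Dict struct {
	index  map[string]int
	values []string
}

// NewDict returns an empty dictionary.
func NewDict() *Dict {
	return &Dict{index: make(map[string]int)}
}

// Encode returns the index for s and reports whether s was newly added.
func (d *Dict) Encode(s string) (idx int, added bool) {
	if i, ok := d.index[s]; ok {
		return i, false
	}
	i := len(d.values)
	d.index[s] = i
	d.values = append(d.values, s)
	return i, true
}

// Decode returns the value stored at idx.
func (d *Dict) Decode(idx int) string {
	return d.values[idx]
}

func main() {
	// One dictionary shared by two fields, in the spirit of
	// dict(ServiceNames) on MetricEvent.ServiceName and TraceEvent.ServiceName.
	serviceNames := NewDict()

	i1, new1 := serviceNames.Encode("checkout") // value seen in a MetricEvent
	i2, new2 := serviceNames.Encode("checkout") // same value seen in a TraceEvent

	fmt.Println(i1, new1)                // 0 true
	fmt.Println(i2, new2)                // 0 false
	fmt.Println(serviceNames.Decode(i2)) // checkout
}
```

In this sketch the second occurrence costs only an index, which is why sharing one dictionary across record types pays off when the same values recur.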

Optional Fields

Fields can be marked as optional, meaning they may not be present in every record:

struct User {
  Name string
  Email string optional
  Phone string optional
}

Arrays

Array types are denoted with square brackets and can contain zero or more elements of the specified type:

struct Container {
  Items []string
  Numbers []int64
  Objects []Person
}

Arrays are variable-length: they can be empty or hold any number of elements.

Oneofs (Union Types)

Oneofs define union types that can hold one of several possible field types:

oneof JsonValue {
  String string
  Number float64
  Bool bool
  Array []JsonValue
  Object JsonObject
}

A oneof may also be empty, i.e. contain none of the listed values.
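One way to picture a oneof, including its empty state, is a tagged union. The Go sketch below is hand-written for illustration and is not what stefgen generates; ValueKind, KindNone, and the constructor names are hypothetical. It covers only the scalar alternatives of JsonValue and treats the zero value as "empty".

```go
package main

import "fmt"

// ValueKind is a hypothetical discriminator. Its zero value means "empty".
type ValueKind int

const (
	KindNone ValueKind = iota
	KindString
	KindNumber
	KindBool
)

// JsonValue is an illustrative stand-in for a oneof: at most one of the
// payload fields is meaningful, as selected by kind.
type JsonValue struct {
	kind ValueKind
	str  string
	num  float64
	b    bool
}

func StringValue(s string) JsonValue  { return JsonValue{kind: KindString, str: s} }
func NumberValue(n float64) JsonValue { return JsonValue{kind: KindNumber, num: n} }
func BoolValue(b bool) JsonValue      { return JsonValue{kind: KindBool, b: b} }

// Kind reports which alternative is currently held.
func (v JsonValue) Kind() ValueKind { return v.kind }

// Describe renders the held alternative, or "empty" when none is set.
func (v JsonValue) Describe() string {
	switch v.kind {
	case KindString:
		return "string: " + v.str
	case KindNumber:
		return fmt.Sprintf("number: %g", v.num)
	case KindBool:
		return fmt.Sprintf("bool: %t", v.b)
	default:
		return "empty"
	}
}

func main() {
	var none JsonValue // zero value holds no alternative
	fmt.Println(none.Describe())
	fmt.Println(StringValue("hi").Describe())
	fmt.Println(NumberValue(1.5).Describe())
	fmt.Println(BoolValue(true).Describe())
}
```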

Multimaps

Multimaps define key-value collections:

multimap Attributes {
  key string
  value AnyValue
}

Multimaps can also use dictionary compression:

multimap Labels {
  key string dict(LabelKeys)
  value string dict(LabelValues)
}
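As a rough mental model, a multimap can be viewed as an ordered list of key-value pairs. The Go sketch below assumes, as the name suggests but this page does not state, that the same key may appear more than once. The Pair and Multimap types are hypothetical and do not reflect stefgen's generated API.

```go
package main

import "fmt"

// Pair is one key-value entry.
type Pair struct {
	Key   string
	Value string
}

// Multimap is a hypothetical in-memory stand-in: an ordered list of pairs
// in which the same key may appear more than once (an assumption).
type Multimap []Pair

// Add appends a pair without replacing earlier pairs that share the key.
func (m *Multimap) Add(key, value string) {
	*m = append(*m, Pair{Key: key, Value: value})
}

// Values returns every value stored under key, in insertion order.
func (m Multimap) Values(key string) []string {
	var out []string
	for _, p := range m {
		if p.Key == key {
			out = append(out, p.Value)
		}
	}
	return out
}

func main() {
	var labels Multimap
	labels.Add("env", "prod")
	labels.Add("region", "eu")
	labels.Add("env", "canary") // second pair with the same key

	fmt.Println(labels.Values("env")) // [prod canary]
	fmt.Println(len(labels))          // 3
}
```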

Enums

Enums define named constant values:

enum MetricType {
  Gauge = 0
  Counter = 1
  Histogram = 2
  Summary = 3
}

Each enum member must be explicitly assigned an unsigned integer value. Decimal, hexadecimal (0x), octal (0o), and binary (0b) literals are supported:

enum StatusCode {
  OK = 0
  NotFound = 0x194        // 404 in hexadecimal
  InternalError = 0o764   // 500 in octal
  Custom = 0b1111101000   // 1000 in binary
}
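To double-check a literal written in a non-decimal base, it can help to parse it programmatically. The sketch below uses Go's strconv.ParseUint with base 0, which selects the base from the 0x, 0o, or 0b prefix. Whether the STEF parser follows exactly the same literal rules as Go is an assumption, and parseEnumValue is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseEnumValue parses an unsigned integer literal. Base 0 tells ParseUint
// to infer the base from the prefix: 0x (hex), 0o (octal), 0b (binary),
// otherwise decimal. Negative literals are rejected.
func parseEnumValue(lit string) (uint64, error) {
	return strconv.ParseUint(lit, 0, 64)
}

func main() {
	for _, lit := range []string{"0", "404", "0x194", "0o764", "0b1111101000"} {
		v, err := parseEnumValue(lit)
		if err != nil {
			fmt.Println(lit, "error:", err)
			continue
		}
		fmt.Printf("%s = %d\n", lit, v)
	}
}
```

Here 0x194 parses to 404, 0o764 to 500, and 0b1111101000 to 1000.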

Complete Example

The following schema combines an enum, a multimap, a oneof, dictionary compression, an optional field, an array, and a root struct:

package com.example.monitoring

// Enum for metric types
enum MetricType {
  Gauge = 0
  Counter = 1
  Histogram = 2
}

// Key-value attributes
multimap Attributes {
  key string dict(AttributeKeys)
  value AttributeValue
}

// Union type for attribute values
oneof AttributeValue {
  StringValue string
  IntValue int64
  FloatValue float64
  BoolValue bool
}

// Resource information with dictionary compression
struct Resource dict(Resources) {
  ServiceName string dict(ServiceNames)
  ServiceVersion string dict(ServiceVersions)
  Attributes Attributes
}

// Metric data point
struct DataPoint {
  Timestamp uint64
  Value float64
  Attributes Attributes
}

// Main metric structure
struct Metric {
  Name string dict(MetricNames)
  Type MetricType
  Unit string dict(Units)
  Description string optional
  DataPoints []DataPoint
}

// Root record type
struct MetricRecord root {
  Resource Resource
  Metric Metric
}

Type References

STEF SDL supports forward references: a type can be referenced before it is defined in the file. The parser resolves all type references after parsing the complete schema.
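Resolving references only after the whole schema has been read amounts to a two-pass check. The Go sketch below shows the general technique, not the STEF parser's actual implementation: Decl, Unresolved, and the primitives table are hypothetical, and the primitive names are limited to those used in the examples on this page.

```go
package main

import (
	"fmt"
	"sort"
)

// Decl is a hypothetical parsed declaration: a type name plus the type
// names its fields refer to.
type Decl struct {
	Name string
	Refs []string
}

// primitives lists the primitive names used in this page's examples
// (assumed here, possibly incomplete).
var primitives = map[string]bool{
	"bool": true, "int64": true, "uint64": true, "float64": true, "string": true,
}

// Unresolved returns, in sorted order, every referenced name that is
// neither a primitive nor declared anywhere in decls.
func Unresolved(decls []Decl) []string {
	// Pass 1: record every declared name, regardless of order.
	declared := make(map[string]bool, len(decls))
	for _, d := range decls {
		declared[d.Name] = true
	}
	// Pass 2: check each reference against the full set of declarations.
	missing := make(map[string]bool)
	for _, d := range decls {
		for _, ref := range d.Refs {
			if !primitives[ref] && !declared[ref] {
				missing[ref] = true
			}
		}
	}
	out := make([]string, 0, len(missing))
	for name := range missing {
		out = append(out, name)
	}
	sort.Strings(out)
	return out
}

func main() {
	decls := []Decl{
		{Name: "MetricRecord", Refs: []string{"uint64", "Metric"}}, // Metric is declared later
		{Name: "Metric", Refs: []string{"string", "float64"}},
		{Name: "TraceRecord", Refs: []string{"uint64", "Span"}}, // Span is never declared
	}
	fmt.Println(Unresolved(decls)) // [Span]
}
```

Because the first pass collects every name before any reference is checked, the forward reference to Metric resolves, while the reference to the undeclared Span is reported.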

Recursive Type Declarations

STEF SDL allows recursive type declarations, enabling the definition of tree-like data structures.

Self-Referential Types

A type can reference itself, which is useful for tree structures:

// Binary tree node
struct TreeNode {
  Value int64
  Left TreeNode optional
  Right TreeNode optional
}
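For intuition, the same shape can be written by hand in Go, with nil pointers playing the role of absent optional children. This is an illustrative sketch and not stefgen output; the Sum and Depth helpers are hypothetical. It also shows why the child fields need to be optional: without a way to omit a child, no finite tree could be built.

```go
package main

import "fmt"

// TreeNode mirrors the schema above. A nil pointer stands for an absent
// optional child.
type TreeNode struct {
	Value int64
	Left  *TreeNode
	Right *TreeNode
}

// Sum adds up the values of all nodes reachable from n.
func Sum(n *TreeNode) int64 {
	if n == nil {
		return 0
	}
	return n.Value + Sum(n.Left) + Sum(n.Right)
}

// Depth returns the number of nodes on the longest root-to-leaf path.
func Depth(n *TreeNode) int {
	if n == nil {
		return 0
	}
	l, r := Depth(n.Left), Depth(n.Right)
	if l > r {
		return l + 1
	}
	return r + 1
}

func main() {
	root := &TreeNode{
		Value: 1,
		Left:  &TreeNode{Value: 2},
		Right: &TreeNode{Value: 3, Left: &TreeNode{Value: 4}},
	}
	fmt.Println(Sum(root), Depth(root)) // 10 3
}
```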

Mutually Referential Types

Multiple types can reference each other, creating more complex recursive relationships:

// Expression tree with operators and operands
struct Expression {
  Node ExpressionNode
}

oneof ExpressionNode {
  Literal LiteralValue
  BinaryOp BinaryOperation
  UnaryOp UnaryOperation
}

struct LiteralValue {
  Value float64
}

struct BinaryOperation {
  Operator string
  Left Expression   // References back to Expression
  Right Expression  // References back to Expression
}

struct UnaryOperation {
  Operator string
  Operand Expression  // References back to Expression
}

The STEF parser resolves both self-referential and mutually referential declarations, so tree-shaped and graph-shaped schemas like the ones above can be modeled directly.

Syntax Rules

Generated Code

Use the stefgen tool to generate serialization code from your STEF schema:

stefgen --lang=go myschema.stef

This generates efficient serializers and deserializers in your target language.