Schema Registry
The Schema Registry is the authoritative catalogue of payload schemas used across all Open-M pipelines. Every MPPM envelope declares a schema_ref. The Schema Registry resolves that reference, enforces compatibility between connected components at deploy time, and enables the IDE's mapping editor to render accurate field trees for UTL-X mapping.
Why a Schema Registry?
A pipeline YAML without schema references is structurally incomplete. It describes the topology — which components are connected — but not the contract: what data flows on each connection. Without contracts:
- The control plane cannot validate that component A's output schema is compatible with component B's input schema at deploy time. Incompatibility is discovered at runtime — when a message fails to parse.
- The IDE cannot offer meaningful mapping assistance. The UTL-X mapping editor needs both the upstream and downstream schemas to render field trees.
- Schema evolution is ungoverned. If a component updates its output schema, there is no record of what was originally expected, so breaking changes cannot be detected by CI/CD.
- Operations staff looking at a DLQ message cannot determine what the message should have looked like without separate documentation.
Schema references in the pipeline YAML are not optional decoration — they are part of the pipeline's operational contract. Schema content, however, is never inlined. It lives in the Schema Registry and is referenced by version.
Schema ref format
Every schema reference is a namespaced, versioned identifier:
```
{namespace}.schemas.{name}:{version}

# Examples
logistics.schemas.order-received:1.0.0
logistics.schemas.sap-order-idoc:2.1.0
finance.schemas.invoice-outbound:3.0.0
```
The namespace segment matches the pipeline namespace, keeping schema ownership co-located with the pipelines that use the schemas. A schema owned by the logistics team lives in logistics.schemas.* and is governed by that team.
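The three-part grammar above lends itself to mechanical parsing. Here is a minimal sketch in Python: the namespace/name/version shape is taken from this section, but the exact character classes allowed in a namespace or name are assumptions for illustration (the section does not specify the registry's naming rules):

```python
import re

# Ref grammar from this section: {namespace}.schemas.{name}:{version}
# NOTE: the [a-z0-9-] character classes are assumptions; the registry's
# actual naming rules are not specified in this document.
REF_PATTERN = re.compile(
    r"^(?P<namespace>[a-z][a-z0-9-]*)"
    r"\.schemas\."
    r"(?P<name>[a-z][a-z0-9-]*)"
    r":(?P<version>\d+\.\d+\.\d+)$"
)

def parse_schema_ref(ref: str) -> dict:
    """Split a schema ref into its namespace, name, and semver version."""
    m = REF_PATTERN.match(ref)
    if m is None:
        raise ValueError(f"not a valid schema ref: {ref!r}")
    return m.groupdict()
```

For example, parse_schema_ref("logistics.schemas.order-received:1.0.0") yields namespace logistics, name order-received, version 1.0.0.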
Schema types
Open-M supports eight payload formats, each with a declared schema type in the registry.
Binary formats (PROTO, AVRO) are valid at the pipeline edge but discouraged inside the pipeline: without the schema in hand, binary payloads cannot be tapped, displayed in the IDE's message inspector, or previewed in log events. Adapters at the pipeline entry point should decode binary → JSON before the payload enters internal hops.
Where schema refs appear in the pipeline YAML
Schema refs appear in three places in the pipeline YAML descriptor:
1. spec.schemas declaration stanza
All schemas used by the pipeline are declared once at the top level. This lets the control plane fetch and cache every schema in a single batch at deploy time, and lets the IDE verify that all refs exist before rendering the canvas.
```yaml
spec:
  schemas:
    - { ref: logistics.schemas.order-received:1.0.0, type: JSCH }
    - { ref: logistics.schemas.order-enriched-customer:1.0.0, type: JSCH }
    - { ref: logistics.schemas.fulfillment-dispatch:2.0.0, type: XSC }
```
2. Component port definitions
Each component's input and output ports declare their schema. The control plane checks port compatibility when validating connection schema pairs.
```yaml
components:
  - id: customer-enricher
    ref: logistics.services.customer-lookup:2.1.0
    ports:
      input:
        schema_ref: logistics.schemas.order-received:1.0.0
        format: JSON
      output:
        schema_ref: logistics.schemas.order-enriched-customer:1.0.0
        format: JSON
```
3. Connection route schema refs
Each connection declares the schema of the MPPM payload traveling on its topic. For Mode 1 (no transform) this is a single schema_ref. For Mode 2 (UTL-X transform), source_schema_ref and target_schema_ref are both declared — the transform bridges between them.
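As a hedged sketch, the two modes might look as follows in the descriptor. Only schema_ref, source_schema_ref, and target_schema_ref are taken from this section; the from, to, and transform keys, the component ids, and the mapping file path are illustrative assumptions:

```yaml
connections:
  # Mode 1 — no transform: a single schema governs the topic
  - from: order-ingest
    to: customer-enricher
    schema_ref: logistics.schemas.order-received:1.0.0

  # Mode 2 — a UTL-X transform bridges two schemas
  - from: customer-enricher
    to: fulfillment-dispatcher
    source_schema_ref: logistics.schemas.order-enriched-customer:1.0.0
    target_schema_ref: logistics.schemas.fulfillment-dispatch:2.0.0
    transform: ./mappings/enriched-to-dispatch.utlx
```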
Registering a schema
```bash
# Register a JSON Schema
open-m schema register \
  --ref logistics.schemas.order-received:1.0.0 \
  --type JSCH \
  --file ./schemas/order-received.json \
  --env production

# Register an XSD (XML Schema)
open-m schema register \
  --ref logistics.schemas.fulfillment-dispatch:2.0.0 \
  --type XSC \
  --file ./schemas/fulfillment-dispatch.xsd \
  --env production

# Check what's registered
open-m schema list --namespace logistics

# View a specific schema
open-m schema get logistics.schemas.order-received:1.0.0
```
Compatibility rules
Schema versions follow semver. The compatibility level required determines the deployment model:
| Version bump | Allowed schema changes | Deploy model | Old envelopes readable? |
|---|---|---|---|
| PATCH x.x.N | No structural change. Documentation, descriptions, examples only. | Rolling restart. | ✓ identical |
| MINOR x.N.0 | New optional fields with defaults. No field removal. No type changes. Consumers that haven't upgraded ignore the new fields. | Rolling restart. Old and new envelopes coexist safely. | ✓ backward compatible |
| MAJOR N.0.0 | Any change permitted: field removal, type changes, structural restructure. | Blue/green deployment. Old version drains before new activates. No mixed-version envelopes on the same topic. | ✗ breaking change |
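The table's breaking boundary can be sketched as a check on the MAJOR component alone: PATCH and MINOR bumps keep old envelopes readable, a MAJOR bump does not. This is a minimal illustration; a real registry check would also have to verify the structural constraints listed per row (optional-only additions for MINOR, and so on), which this sketch does not attempt:

```python
def is_backward_compatible(old: str, new: str) -> bool:
    """Per the table above: envelopes written against `old` stay readable
    after a PATCH or MINOR bump; a MAJOR bump is a breaking change."""
    old_major = int(old.split(".", 1)[0])
    new_major = int(new.split(".", 1)[0])
    return old_major == new_major

def deploy_model(old: str, new: str) -> str:
    """Rolling restart for compatible bumps, blue/green for MAJOR bumps."""
    return "rolling" if is_backward_compatible(old, new) else "blue/green"
```

For instance, moving from 1.2.0 to 2.0.0 selects blue/green, matching the table's rule that no mixed-version envelopes share a topic across a MAJOR bump.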
Deploy-time validation
When a pipeline is deployed, the control plane performs schema validation before any Kubernetes manifest is applied or any topic is provisioned. This is the equivalent of compile-time type checking for your integration:
- Ref existence check. Every schema_ref in the YAML must resolve to a registered schema in the registry.
- Port compatibility check. For every connection, the control plane checks that the upstream component's declared output schema is compatible with the downstream component's declared input schema.
- Transform bridge check. For Mode 2 UTL-X connections, the source_schema_ref must match the upstream port's output schema, and the target_schema_ref must match the downstream port's input schema.
- Version mismatch check. If a component manifest declares output.schema_ref: logistics.schemas.order-enriched-customer:1.0.0 but the pipeline YAML's connection stanza declares source_schema_ref: logistics.schemas.order-enriched-customer:2.0.0, the deploy fails.
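The transform bridge and version mismatch checks both reduce to exact ref comparison, version string included. A hedged sketch, assuming verbatim string equality is the rule; the function name and return shape are illustrative, not the control plane's API:

```python
def check_transform_bridge(upstream_output_ref: str,
                           downstream_input_ref: str,
                           source_schema_ref: str,
                           target_schema_ref: str) -> list:
    """Return deploy-blocking errors for a Mode 2 UTL-X connection.

    Refs are compared verbatim, so a version mismatch (1.0.0 vs 2.0.0)
    fails exactly like a wrong schema name."""
    errors = []
    if source_schema_ref != upstream_output_ref:
        errors.append(
            f"source_schema_ref {source_schema_ref} does not match "
            f"upstream output {upstream_output_ref}"
        )
    if target_schema_ref != downstream_input_ref:
        errors.append(
            f"target_schema_ref {target_schema_ref} does not match "
            f"downstream input {downstream_input_ref}"
        )
    return errors
```

An empty list means the connection passes; any entry blocks the deploy before manifests are applied or topics are provisioned.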
At runtime, the wrapper validates the schema_ref of every received MPPM envelope against the registry. A validation failure is a permanent error: the envelope is routed directly to the DLQ with no retry, regardless of the configured delivery guarantee.
Binary format handling (PROTO, AVRO)
Protobuf and Avro payloads require special handling because the raw bytes are meaningless without the schema. The Open-M Schema Registry stores the .proto file or Avro schema, and the wrapper uses it to serialise and deserialise at the topic boundary.
Because binary payloads cannot be rendered in the IDE's message inspector without the schema, and cannot appear in log event previews, the recommended pattern is:
- Inbound adapter at the pipeline entry point decodes binary → JSON.
- All internal hops use JSON (JSCH-governed).
- Outbound adapter at the pipeline exit point re-encodes JSON → binary if the target system requires it.
This keeps the pipeline interior inspectable, mappable, and debuggable. Binary encoding stays at the edge where it belongs.