Pipeline YAML Overview
The pipeline YAML descriptor is the canonical artefact in Open-M. It is human-readable, Git-diffable, and drives everything: the IDE canvas renders it, the control plane provisions topics and subscriptions from it, and CI/CD validates and deploys it. You never have to write XML, click through a wizard, or hand-edit a binary file.
Why YAML?
Enterprise middleware platforms like TIBCO BusinessWorks and MuleSoft use XML as their canonical flow format. XML makes for unresolvable merge conflicts, is unreadable without the designer tool, and requires every change to flow through a single IDE. Open-M uses YAML because:
- Human readable. A developer can understand a pipeline's topology from the YAML without opening any tool.
- Git-native. Adding a component or arrow appears as a small, localised addition. Diffs are reviewable in a pull request.
- Kubernetes-consistent. The same syntax family as Helm charts, ArgoCD config, and every CI/CD tool in the cloud-native ecosystem.
- Validatable with JSON Schema. VS Code provides real-time autocomplete and error highlighting without any Open-M-specific plugin.
- Canvas is a rendering, not the source. The IDE reads the YAML and renders it as a graph. Every canvas action writes back to the YAML. There is one source of truth.
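To make the Git-native point concrete, here is what adding one component and one arrow might look like in a pull request diff. This is an illustrative sketch: the audit-log component, the file-writer adapter ref, and its config are hypothetical names, not documented Open-M adapters; the connection id follows the auto-naming convention described later in this page.

```diff
   components:
     - id: http-inbound
       ref: open-m.adapters.http-inbound:1.4.0
+    - id: audit-log
+      ref: open-m.adapters.file-writer:1.0.0
+      config: { path: /var/log/orders }
   connections:
+    - id: http-inbound--audit-log
+      from: { component: http-inbound, port: output }
+      to: { component: audit-log, port: input }
```

The whole change is a localised addition: a reviewer can see the new node and the new arrow without opening any designer tool.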
Top-level structure
```yaml
apiVersion: open-m/v1
kind: Pipeline
metadata:
  name: order-fulfillment
  namespace: logistics.orders
  version: 2.5.0
  description: "Receives orders, enriches, validates inventory, dispatches."
spec:
  defaults:        # pipeline-wide defaults, all overridable per component/connection
    ...
  schemas:         # all schema refs used in this pipeline
    ...
  components:      # nodes on the canvas
    ...
  connections:     # arrows on the canvas — each is a topic + subscription + route
    ...
  error_handling:
    ...
  scaling:         # kept separate from logical flow
    ...
```
The two mandatory top-level fields are apiVersion: open-m/v1 and kind: Pipeline. These allow the control plane and IDE to version-gate the parser and provide forward-compatible parsing as the descriptor format evolves.
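As a sketch, the smallest descriptor that carries both gates plus the metadata used throughout this page would look like the following. Treat it as illustrative: whether empty components and connections lists pass validation, and exactly which metadata fields are mandatory, are assumptions here rather than documented minimums.

```yaml
apiVersion: open-m/v1   # version-gates the parser
kind: Pipeline          # tells the control plane and IDE what this document is
metadata:
  name: hello-world
  namespace: sandbox.demo
  version: 0.1.0
spec:
  components: []
  connections: []
```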
Stanzas at a glance
- spec.defaults: pipeline-wide defaults (delivery guarantee, subscription type, retry policy, logging), all overridable per component or connection.
- spec.schemas: every schema ref used in the pipeline, with its type.
- spec.components: the nodes on the canvas; each has a component ref, config, typed ports, canvas placement, and a log topic.
- spec.connections: the arrows on the canvas; each is a topic plus a subscription plus a route, optionally with a transform and error routing.
- spec.error_handling: DLQ strategy, replay, and alerting.
- spec.scaling: cluster placement and replica counts, kept separate from the logical flow.
Full minimal example
A two-component pipeline — HTTP inbound to SAP RFC — with a UTL-X field mapping on the arrow, a single error handler, and single-cluster scaling:
```yaml
apiVersion: open-m/v1
kind: Pipeline
metadata:
  name: order-to-sap
  namespace: logistics.orders
  version: 1.0.0
spec:
  defaults:
    delivery_guarantee: at_least_once
    subscription_type: Shared
    retry:
      max_attempts: 3
      strategy: exponential_backoff
      initial_delay_ms: 500
    logging:
      level: INFO
  schemas:
    - { ref: logistics.schemas.http-order:1.0.0, type: JSCH }
    - { ref: logistics.schemas.sap-order-idoc:1.0.0, type: XSC }
  components:
    - id: http-inbound
      ref: open-m.adapters.http-inbound:1.4.0
      config: { path: /orders, method: POST }
      ports:
        output: { schema_ref: logistics.schemas.http-order:1.0.0, format: JSON }
      placement: { x: 100, y: 300 }
      logging:
        topic: logistics.orders.order-to-sap.http-inbound.log
    - id: sap-rfc-out
      ref: open-m.connectors.sap-rfc:2.1.0
      config: { bapi: BAPI_SALESORDER_CREATEFROMDAT2 }
      ports:
        input: { schema_ref: logistics.schemas.sap-order-idoc:1.0.0, format: XML }
      placement: { x: 600, y: 300 }
      logging:
        topic: logistics.orders.order-to-sap.sap-rfc-out.log
  connections:
    - id: conn-http-to-sap                                            # auto
      from: { component: http-inbound, port: output }
      to: { component: sap-rfc-out, port: input }
      route:
        topic: logistics.orders.order-to-sap.http-inbound.out         # auto
        subscription: logistics.orders.order-to-sap.sap-rfc-out.sub   # auto
      source_schema_ref: logistics.schemas.http-order:1.0.0
      target_schema_ref: logistics.schemas.sap-order-idoc:1.0.0
      transform:
        type: utlx
        mode: ref
        ref: logistics.mappings.http-order-to-idoc:1.0.0
      error_routing:
        topic: logistics.orders.order-to-sap.http-inbound.err         # auto
        subscription: logistics.orders.order-to-sap.error-handler.err-sub  # auto
  error_handling:
    dlq_strategy: per_component
    replay_enabled: true
    alert_on_dlq: true
  scaling:
    clusters:
      - cluster-ref: k8s-prod-eu-west
        components:
          - { id: http-inbound, replicas: 2 }
          - { id: sap-rfc-out, replicas: 1 }
```
Auto-generated fields
When you draw an arrow in the IDE canvas, these fields are auto-populated from the naming convention. You can override any of them in the text editor:
- connections[].id — {from-id}--{to-id}
- route.topic — {namespace}.{pipeline-name}.{from-component-id}.out
- route.subscription — {namespace}.{pipeline-name}.{to-component-id}.sub
- error_routing.topic — {namespace}.{pipeline-name}.{from-component-id}.err
- components[].logging.topic — {namespace}.{pipeline-name}.{component-id}.log
- DLQ topics — {namespace}.{pipeline-name}.{component-id}.dlq
Subscription names are stable identifiers. In Pulsar, a subscription name is durable — if it changes between deployments, the consumer restarts from the latest offset and silently drops unprocessed messages. Never rename a subscription on a production pipeline without a deliberate migration plan. Auto-generated subscription names are stable as long as component IDs and pipeline names don't change.
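One safe migration is to pin the subscription explicitly when a component is renamed, so the auto-naming convention does not regenerate it. A sketch using the example pipeline's names, with a hypothetical rename of sap-rfc-out to sap-dispatcher:

```yaml
connections:
  - id: conn-http-to-sap
    from: { component: http-inbound, port: output }
    to: { component: sap-dispatcher, port: input }   # renamed from sap-rfc-out
    route:
      topic: logistics.orders.order-to-sap.http-inbound.out
      # Pinned to the previous auto-generated name so the durable Pulsar
      # cursor survives the rename and no messages are dropped.
      subscription: logistics.orders.order-to-sap.sap-rfc-out.sub
```

The override keeps the consumer attached to its existing cursor; without it, the convention would derive a new .sub name from the new component id.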
YAML as the canonical artefact
The pipeline YAML is always the source of truth. The IDE canvas is a visual rendering of the YAML — not the other way around. Every canvas action (drag a component, draw an arrow, change a config value) writes back to the YAML file immediately. The canvas and the YAML are always in sync.
This means your pipelines live in Git. Pull requests show exactly what changed: which component was added, which connection was rewired, what config value changed. CI/CD validates the YAML schema and runs connector tests before merge. There are no locked binary files, no single IDE chokepoint.
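A minimal CI gate along these lines might look like the following GitHub Actions job. The open-m CLI name and its validate subcommand are assumptions for illustration, not documented commands; substitute whatever validation entry point your control plane ships.

```yaml
name: pipeline-ci
on: [pull_request]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Hypothetical CLI: schema-validate every pipeline descriptor before merge
      - run: open-m validate pipelines/*.yaml
```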
Install the Open-M VS Code extension to get JSON Schema validation, component ref autocomplete, and hover documentation directly in your editor — no canvas required. The extension registers a JSON Schema for apiVersion: open-m/v1, kind: Pipeline documents automatically.