UTL-X Mapping
UTL-X (Universal Transformation Language Extended) is Open-M's native mapping language — an open-source, format-agnostic functional transformation language. Pure, stateless, deterministic. It runs inline on the connection arrow without an extra component hop, or as a full mapping component for complex scenarios. AGPL v3, dual-licensed for commercial use.
What is UTL-X?
UTL-X is a format-agnostic functional transformation language inspired by MuleSoft DataWeave, XSLT, and functional programming principles. It abstracts away the source format — the same transformation logic works whether the input is XML, JSON, CSV, YAML, Protobuf, or Avro. The language is pure: every expression returns a value, there are no side effects, and transformations are always deterministic.
In Open-M pipelines, UTL-X serves as the glue between components with incompatible schemas. It is deliberately lightweight for simple field mapping, and scales to N-input fan-in scenarios using the MPPM step window for contextual history.
UTL-X is a standalone open-source project at github.com/grauwen/utl-x, separately usable outside Open-M. It is embedded in the Open-M runtime as the native mapping engine. The conformance suite has 465 tests passing at 100%.
The three mapping modes
Open-M does not force every field mapping through a full component. Instead, mapping is placed where it belongs — either invisible (Mode 1), lightweight on the arrow (Mode 2), or as an explicit component (Mode 3).
In the IDE, Mode 2 renders as a decorated arrow with a ◇ diamond icon at the midpoint. Mode 3 renders as an explicit rectangular node on the canvas. Mode 1 is a plain arrow with no decoration.
Mode 1 — Plain connection
When the source component's output schema and the target component's input schema are identical, no transform is needed. The connection carries the MPPM envelope payload unchanged. This is the most efficient path — zero processing overhead beyond the Pulsar topic hop.
    connections:
      - id: kafka-to-processor
        from:
          component: kafka-in
          port: output
        to:
          component: order-processor
          port: input
        # No transform stanza: schemas match, plain connection
Mode 2 — UTL-X inline on the arrow
The most common mapping pattern. A UTL-X transform is declared directly on the connection in the pipeline YAML. It executes inside the receiving component's wrapper — in-process, after dequeue, before business logic — with no extra Pulsar topic or component pod.
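The execution order described above (dequeue, then mapping, then business logic, all in one process) can be pictured with a small Python sketch. This is only an illustration: `make_wrapper`, the envelope's `payload` key, and the sample mapping are hypothetical names, not the Open-M wrapper API.

```python
from typing import Any, Callable

Payload = dict[str, Any]


def make_wrapper(transform: Callable[[Payload], Payload],
                 business_logic: Callable[[Payload], Any]) -> Callable[[dict], Any]:
    """Return a message handler that maps the payload before the component sees it."""

    def on_message(envelope: dict) -> Any:
        payload = envelope["payload"]     # message has already been dequeued
        mapped = transform(payload)       # inline mapping runs in the same process
        return business_logic(mapped)     # component logic only sees the target shape

    return on_message


# Hypothetical mapping and component, for illustration only.
rename = lambda p: {"Name": p["BSTNK"], "StageName": "Prospecting"}
handler = make_wrapper(rename, lambda opp: f"created {opp['Name']}")
```

Because the mapping is an ordinary in-process call, no extra topic or pod sits between the arrow and the receiving component.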
Mode 2 has two sub-variants: inline (the mapping body is embedded in the YAML, suitable for small unique mappings under ~20 field assignments) and ref (references a versioned mapping stored in the Mapping Registry, suitable for complex or reused mappings).
Inline mappings and the step window: An inline Mode 2 mapping sits on a single arrow and can only access the step history of that pipe. The wrapper extracts one or more steps from the same MPPM envelope and passes them as named inputs — e.g. input: current json, previous json. Nothing from any other pipe is reachable from an inline mapping. To combine payloads from multiple pipes, use Mode 2 ref with N-input, or Mode 3.
Mode 2 inline
    connections:
      - id: sap-to-salesforce
        from:
          component: sap-idoc-in
          port: output
        to:
          component: salesforce-opportunity
          port: input
        route:
          topic: logistics.order-pipeline.sap-idoc-in.out
          source_schema_ref: logistics.schemas.sap-order-idoc:1.0.0
          target_schema_ref: logistics.schemas.sf-opportunity:2.1.0
        transform:
          type: utlx
          mode: inline
          mapping: |
            %utlx 1.0
            input: current json, previous json   // two steps from this pipe only
            output json
            ---
            {
              Name: $current.ORDERS05.E1EDK01.BSTNK,
              AccountId: $current.ORDERS05.E1EDKA1[0].PARTN,
              Amount: $current.ORDERS05.E1EDP01.NETWR |> toNumber(),
              CloseDate: $current.ORDERS05.E1EDK03.DATUM |> toDate("yyyyMMdd"),
              StageName: "Prospecting",
              PrevOrderRef: $previous.ORDERS05.E1EDK01.BSTNK   // step before on same pipe
            }
          target_schema_ref: logistics.schemas.sf-opportunity:2.1.0
Mode 2 ref
For complex or reused mappings, store the UTL-X script in the Mapping Registry and reference it by version. This allows independent versioning, testing, and reuse across pipelines.
    transform:
      type: utlx
      mode: ref
      ref: logistics.mappings.sap-order-to-sf-opportunity:2.0.0
      target_schema_ref: logistics.schemas.sf-opportunity:2.1.0
When to use inline vs ref: inline for pipeline-unique mappings under ~20 field assignments. ref when the mapping is reused across pipelines, requires independent version control, or exceeds ~20 assignments. ref mappings are searchable in the Mapping Registry and visible in the IDE's mapping browser.
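The registry references in this page all have the shape `<dotted-name>:<major.minor.patch>`. The Python sketch below splits such a reference into name and version. The grammar is inferred from the examples rather than stated by the source, and `parse_ref` and `MappingRef` are hypothetical helpers.

```python
from typing import NamedTuple


class MappingRef(NamedTuple):
    name: str                       # dotted registry name
    version: tuple[int, int, int]   # (major, minor, patch)


def parse_ref(ref: str) -> MappingRef:
    """Split 'name:MAJOR.MINOR.PATCH' into its parts (assumed format)."""
    name, sep, version = ref.rpartition(":")
    if not sep or not name:
        raise ValueError(f"expected '<name>:<version>', got {ref!r}")
    parts = version.split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected a three-part numeric version, got {version!r}")
    major, minor, patch = (int(p) for p in parts)
    return MappingRef(name, (major, minor, patch))


ref = parse_ref("logistics.mappings.sap-order-to-sf-opportunity:2.0.0")
```

Keeping the version as a tuple of integers makes comparisons such as `ref.version >= (2, 0, 0)` behave numerically rather than lexically.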
Mode 3 — Full mapping component
An explicit component node on the pipeline canvas. Required when any of the following apply:
- The mapping makes external calls (database lookups, API enrichment)
- The mapping has side effects (writing audit records, sending notifications)
- The mapping changes cardinality through aggregation or splitting (N→1 or 1→N)
- A non-UTL-X engine (XSLT, JSONata, jq) is used
- True fan-in from independent correlation chains (different correlation_ids) is needed
- MPPM step-window traceability in the ops dashboard is required for compliance
    components:
      - id: enrich-with-customer-data
        ref: open-m.connectors.utlx-mapper:1.0.0
        config:
          mapping_ref: crm.mappings.order-enrichment:3.1.0
          engine: utlx
        ports:
          input:
            schema_ref: crm.schemas.raw-order:1.0.0
          output:
            schema_ref: crm.schemas.enriched-order:2.0.0
Language syntax
Every UTL-X transformation follows the same three-part structure: a header declaring version and formats, a separator (---), and a functional expression body that produces the output.
    // Header: format declarations
    %utlx 1.0
    input auto     // auto-detect: XML, JSON, CSV, YAML
    output json    // target format
    ---
    // Body: pure functional expression
    {
      invoice: {
        id: $input.Order.@id,
        customer: $input.Order.Customer.Name,
        total: $input.Order.Items |> map(item => item.Price * item.Qty) |> sum()
      }
    }
UTL-X uses C-style comments — // for single-line and /* */ for multi-line. Hash (#) is not a comment character and will cause parse errors in the body.
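To make the header / separator / body structure concrete, here is a rough Python sketch that splits a script at the `---` line and reads the declarations. It is not the UTL-X parser: `parse_header` is a hypothetical helper, it strips only `//` comments (not `/* */`), and it assumes that a single unnamed input is addressed as `input`, as in the `$input` examples.

```python
def parse_header(script: str) -> dict:
    """Split a UTL-X script at the first '---' line and read the header.

    Returns {'version', 'inputs', 'output', 'body'}; inputs is a list of
    (name, format) pairs.
    """
    header, sep, body = script.partition("\n---")
    if not sep:
        raise ValueError("missing '---' separator")
    result = {"version": None, "inputs": [], "output": None, "body": body.strip()}
    for raw in header.splitlines():
        line = raw.split("//", 1)[0].strip()       # drop trailing // comments
        if not line:
            continue
        if line.startswith("%utlx"):
            result["version"] = line.split()[1]
        elif line.startswith("input"):
            decl = line[len("input"):].lstrip(":").strip()
            for item in decl.split(","):
                words = item.split()
                if not words:
                    continue
                # 'input auto' has one word; 'input: order json' has two
                name, fmt = words if len(words) == 2 else ("input", words[0])
                result["inputs"].append((name, fmt))
        elif line.startswith("output"):
            result["output"] = line[len("output"):].lstrip(":").strip()
    return result


parsed = parse_header(
    "%utlx 1.0\n"
    "input: current json, previous json // two steps\n"
    "output json\n"
    "---\n"
    "{ orderId: $current.id }"
)
```

Here `parsed["inputs"]` lists the aliases the body may reference with a `$` prefix, and `parsed["body"]` holds the expression after the separator.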
    // Pipe operator: chains transformations
    $input.items |> filter(i => i.active) |> map(i => i.name)

    // Lambda arrow
    items |> map(item => { name: item.label, qty: item.count })

    // Safe navigation: returns null instead of error
    $input.Order?.Customer?.Address

    // XML attribute access
    $input.Order.@id

    // Index access
    $input.Order.Items[0]

    // Filter using pipe
    $input.Order.Items |> filter(i => i.Price > 100)
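For readers new to the pipe operator, the Python sketch below mimics how a chain like `items |> map(...) |> sum()` threads one value through successive stages. It assumes the left-hand value becomes the first argument of each stage, which matches the function-call forms shown elsewhere on this page. `pipe` is a hypothetical helper, and the sample items are invented.

```python
from functools import reduce


def pipe(value, *stages):
    """Feed value through each stage in order, like value |> a |> b."""
    return reduce(lambda acc, stage: stage(acc), stages, value)


items = [
    {"Name": "bolt", "Price": 10.0, "Qty": 2, "active": True},
    {"Name": "nut", "Price": 5.0, "Qty": 1, "active": False},
]

# $input.Order.Items |> map(item => item.Price * item.Qty) |> sum()
total = pipe(
    items,
    lambda xs: [x["Price"] * x["Qty"] for x in xs],
    sum,
)

# $input.items |> filter(i => i.active) |> map(i => i.name)
active_names = pipe(
    items,
    lambda xs: [x for x in xs if x["active"]],
    lambda xs: [x["Name"] for x in xs],
)
```

Each stage is a pure function of its input, so the chain can be read left to right without tracking any intermediate state.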
Format-agnostic selectors
The same path syntax works regardless of whether the input is XML, JSON, CSV, or YAML. UTL-X translates the path to the appropriate native access at runtime via its Universal Data Model (UDM).
    $input.Order.Customer.Name                          // Simple path
    $input.Order.Items[0]                               // Index access
    $input.Order.@id                                    // XML attribute / JSON property
    $input.Order.Items |> map(i => i.Name)              // Extract from all elements
    $input.Order.Items |> filter(i => i.Total > 1000)   // Filter by condition

    // Multi-input: named with $ prefix (see N-input section)
    $order.header.id
    $pricing.lines[0].unitPrice
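One way to see why a single path can serve several formats is to normalise each input into the same tree shape before selecting. The Python sketch below does this for XML and JSON. It is an illustration only: the real Universal Data Model is not described in this section, and storing XML attributes under an `@`-prefixed key is an assumption. `xml_to_udm` and `select` are hypothetical helpers, and text mixed with child elements is ignored.

```python
import json
import xml.etree.ElementTree as ET


def xml_to_udm(elem: ET.Element):
    """Turn an element into plain dicts/lists/strings; attributes get an '@' key."""
    node = {f"@{k}": v for k, v in elem.attrib.items()}
    for child in elem:
        value = xml_to_udm(child)
        if child.tag in node:                      # repeated tag becomes a list
            existing = node[child.tag]
            if not isinstance(existing, list):
                node[child.tag] = existing = [existing]
            existing.append(value)
        else:
            node[child.tag] = value
    if not node:                                   # leaf element: just its text
        return (elem.text or "").strip()
    return node


def select(root, path: str):
    """Resolve a dotted path such as 'Order.Customer.Name'; None if absent."""
    current = root
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


xml_doc = ET.fromstring('<Order id="A1"><Customer><Name>Acme</Name></Customer></Order>')
json_doc = json.loads('{"Order": {"@id": "A1", "Customer": {"Name": "Acme"}}}')
from_xml = {xml_doc.tag: xml_to_udm(xml_doc)}
```

With both documents in the same shape, `select(from_xml, "Order.Customer.Name")` and `select(json_doc, "Order.Customer.Name")` resolve to the same value, and a missing segment yields `None` instead of raising.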
Standard library
UTL-X ships with 635 standard library functions covering strings, arrays, dates, math, type conversion, and schema operations. All functions work on the Universal Data Model and are format-independent.
    // String functions
    upper($input.name)
    trim($input.description)
    replace(value, ({"\\n": "", "\\t": ""}))   // object literal needs ( )
    contains($input.code, "SAP")

    // Array / collection functions
    map(items, item => item.name)
    filter(items, item => item.active)
    reduce(items, (acc, item) => acc + item.amount, 0)
    sum(prices)
    groupBy(items, item => item.category)
    flatten(nestedList)
    distinct(values)

    // Date / time functions
    toDate($input.dateStr, "yyyyMMdd")
    formatDate(date, "yyyy-MM-dd")
    now()

    // Type conversion
    toNumber($input.amount)
    toString(42)
    toBoolean("true")

    // Null handling
    default($input.optional?.field, "N/A")
    isNull($input.value)
The step window
Every MPPM envelope carries a step window — an ordered history of the last N payloads the message passed through. This gives UTL-X access to previous states of the data without any stateful store or external lookup.
The step window is an Open-M MPPM envelope concept — it is not a UTL-X language feature. UTL-X itself has no awareness of message history. If a previous step's payload is needed inside a UTL-X mapping, the Open-M wrapper extracts the relevant envelope step and passes it as a named input declared in the UTL-X header. Previous steps are just inputs like any other — they must be named.
    %utlx 1.0
    input: current json, previous json
    output json
    ---
    {
      // Current payload: named input "$current"
      orderId: $current.id,
      finalAmount: $current.pricing.finalAmount,

      // Previous step payload: named input "$previous"
      originalPrice: $previous.pricing.baseAmount,
      sourceSystem: $previous.metadata.origin,

      // Computed across both
      priceChange: $current.pricing.finalAmount - $previous.pricing.baseAmount
    }
The Open-M pipeline YAML specifies which envelope steps to extract and under which input alias to pass them to the UTL-X script. The mapping itself stays pure — it only sees named inputs, with no knowledge of the MPPM envelope or step indices.
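The division of labour described here, where the wrapper owns the history and the mapping only sees named inputs, can be sketched in Python. Everything below is an assumption made for illustration: `StepWindow`, the "steps back" binding (0 for the current step, 1 for the one before), and `price_change` are hypothetical, not the MPPM envelope format.

```python
from collections import deque


class StepWindow:
    """Bounded, ordered history of the most recent payloads (newest last)."""

    def __init__(self, depth: int):
        self._steps = deque(maxlen=depth)   # oldest entries fall off automatically

    def record(self, payload: dict) -> None:
        self._steps.append(payload)

    def named_inputs(self, bindings: dict[str, int]) -> dict[str, dict]:
        """Map alias -> payload, where 0 is the current step, 1 the one before.

        Raises ValueError when the window does not reach that far back.
        """
        out = {}
        for alias, steps_back in bindings.items():
            if not 0 <= steps_back < len(self._steps):
                raise ValueError(f"step {steps_back} for {alias!r} is outside the window")
            out[alias] = self._steps[-1 - steps_back]
        return out


def price_change(current: dict, previous: dict) -> dict:
    """Pure mapping: it receives named inputs and knows nothing about the window."""
    return {
        "orderId": current["id"],
        "priceChange": current["pricing"]["finalAmount"] - previous["pricing"]["baseAmount"],
    }


window = StepWindow(depth=3)
window.record({"id": "o-1", "pricing": {"baseAmount": 100}})
window.record({"id": "o-1", "pricing": {"finalAmount": 90}})
result = price_change(**window.named_inputs({"current": 0, "previous": 1}))
```

The mapping function stays testable on its own, because step selection happens entirely outside it.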
Multiple MPPM envelopes as inputs: A UTL-X mapping can receive payloads from N independent MPPM envelopes — each from a different correlation chain — as long as each is passed as a distinctly named input by the Open-M wrapper. Mode 3 is only required when the join needs stateful waiting — i.e. the envelopes don't arrive at the same time and need to be held until all are present. If the wrapper can assemble them synchronously, Mode 2 is valid.
Input naming convention
Since previous step payloads and multi-envelope inputs are all just named inputs, a clear naming convention makes mappings self-documenting. The recommended pattern is {source}{Stage} — name after what the data represents, not its technical position:
    // Current + previous step from the same pipeline
    input: orderCurrent json, orderPrevious json

    // Multiple independent sources
    input: orderValidated json, pricingResult json, customerProfile xml

    // Named after the upstream microservice/component
    input: sapIdocIn xml, workdayEmployee json, sfOpportunity json

    // Step history from a named pipe: explicit depth in name
    input: pipe1Step0 json, pipe1Step1 json, pipe2Step0 xml

    // Mixed formats: each declared with its own format
    input: orderHeader xml, orderLines csv, pricing json
The transform.inputs stanza in the Open-M pipeline YAML binds each arrow or envelope step to an alias, and each alias must match exactly the name declared in the UTL-X header.
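Since the alias names have to line up exactly, a pre-deployment check is easy to picture. The Python sketch below compares the names declared in a header with the aliases bound in the YAML. `check_bindings` is a hypothetical helper, and treating the match as case-sensitive is an assumption drawn from the word "exactly".

```python
def check_bindings(header_aliases: list[str], yaml_aliases: list[str]) -> list[str]:
    """Return human-readable problems; an empty list means the names line up."""
    problems = []
    declared, bound = set(header_aliases), set(yaml_aliases)
    for name in sorted(declared - bound):
        problems.append(f"header declares '{name}' but no input is bound to it")
    for name in sorted(bound - declared):
        problems.append(f"input '{name}' is bound but not declared in the header")
    for name in sorted(n for n in bound if yaml_aliases.count(n) > 1):
        problems.append(f"alias '{name}' is bound more than once")
    return problems


issues = check_bindings(["order", "pricing"], ["order", "Pricing"])
```

In this example the capitalised `Pricing` produces two findings: one for the unbound header name and one for the undeclared alias.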
N-input fan-in (Mode 2)
UTL-X supports multiple named input arrows converging on a single transform. Each input carries its own step window. The transform is declared on the trigger arrow (the one whose arrival fires the mapping) and references context arrows by alias.
This is valid as Mode 2 only when all inputs share the same correlation_id, meaning they are parts of the same message chain. The wrapper assembles the context synchronously from a short-term buffer (default: 1,000 envelopes, 5-second TTL).
    transform:
      type: utlx
      mode: ref
      ref: logistics.mappings.compose-dispatch:1.0.0
      trigger: true   # this arrow owns the transform
      inputs:
        - alias: order
          arrow: conn-order-to-dispatch-mapper
          source_schema_ref: logistics.schemas.order-validated:1.0.0
          window_depth: 3
        - alias: pricing
          arrow: conn-pricing-to-dispatch-mapper
          source_schema_ref: logistics.schemas.order-pricing:1.0.0
          window_depth: 2
      correlation_mode: same_chain
      target_schema_ref: logistics.schemas.fulfillment-dispatch:2.0.0
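The short-term buffer mentioned above (1,000 envelopes, 5 second TTL by default) might behave roughly like the Python sketch below. Only the two default numbers come from the source. The keying by (correlation_id, alias), oldest-first eviction, and the `ContextBuffer` class itself are assumptions for illustration. Time is passed in explicitly so the behaviour is deterministic.

```python
from collections import OrderedDict


class ContextBuffer:
    """Holds recent payloads per (correlation_id, alias), bounded by size and age."""

    def __init__(self, capacity: int = 1000, ttl_seconds: float = 5.0):
        self.capacity = capacity
        self.ttl = ttl_seconds
        self._items = OrderedDict()   # (correlation_id, alias) -> (stored_at, payload)

    def put(self, correlation_id: str, alias: str, payload: dict, now: float) -> None:
        key = (correlation_id, alias)
        self._items.pop(key, None)            # re-insert so the newest sits at the end
        self._items[key] = (now, payload)
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)   # evict the oldest entry

    def assemble(self, correlation_id: str, aliases: list[str], now: float):
        """Return {alias: payload} if every alias is present and fresh, else None."""
        inputs = {}
        for alias in aliases:
            entry = self._items.get((correlation_id, alias))
            if entry is None or now - entry[0] > self.ttl:
                return None
            inputs[alias] = entry[1]
        return inputs


buf = ContextBuffer(capacity=2, ttl_seconds=5.0)
buf.put("c1", "pricing", {"total": 10}, now=0.0)
buf.put("c1", "order", {"id": 1}, now=1.0)
ready = buf.assemble("c1", ["order", "pricing"], now=2.0)
```

When `assemble` returns `None`, a context input is missing or has expired, which is the point at which a stateful join (Mode 3) would be needed instead.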
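Inside the UTL-X mapping, each named input is accessible via its alias with a $ prefix. The alias resolves to that input's current payload; earlier steps from its window are reachable only if they are declared as additional named inputs: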
    %utlx 1.0
    input: order json, pricing json
    output json
    ---
    {
      // Access named inputs by alias
      dispatchId: $order.header.orderId,
      customerId: $order.customer.id,
      totalAmount: $pricing.summary.grossTotal,
      currency: $pricing.summary.currency,
      lines: $order.lines |> map(line => {
        sku: line.productCode,
        qty: line.quantity,
        price: $pricing.lines[line.lineId].unitPrice
      })
    }
Step window propagation rule: Only the trigger arrow's history propagates forward into the downstream envelope. If a previous step payload is needed inside the mapping, it must be passed as an additional named input declared in the UTL-X header — the wrapper extracts the relevant envelope step and passes it by name. If relevant context data is needed further downstream, explicitly include it in the mapping output.
Which mode to use?
| Situation | Mode | Reason |
|---|---|---|
| Schemas match exactly | Mode 1 | No transform needed. Zero overhead. |
| Simple field rename or type conversion, unique to this pipeline | Mode 2 inline | Pure, stateless, <20 assignments. Lives in the YAML. |
| Complex mapping reused across pipelines | Mode 2 ref | Independently versioned in Mapping Registry. |
| Fan-in from same correlation chain (N inputs, same correlation_id) | Mode 2 ref (N-input) | Same-chain context assembled synchronously by wrapper. |
| Fan-in from independent correlation chains (different correlation_ids) | Mode 3 | Requires stateful wait. Use stateful-join component. |
| Mapping makes external API or database calls | Mode 3 | Side effects disqualify Mode 2 (must be pure). |
| Aggregation or splitting (N→1 or 1→N) | Mode 3 | Cardinality changes require explicit component. |
| Non-UTL-X engine (XSLT, JSONata, jq) | Mode 3 | External engines always run as components. |
| Compliance requires MPPM step traceability | Mode 3 | Mode 3 adds an MPPM step entry visible in the ops dashboard. |
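The table above can also be read as a decision function. The Python sketch below encodes it. The field names, the order in which conditions are checked, and treating "about 20 assignments" as a hard threshold of 20 are assumptions made for illustration, and `choose_mode` is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass
class MappingNeeds:
    schemas_match: bool = False
    external_calls: bool = False           # DB lookups, API enrichment, side effects
    changes_cardinality: bool = False      # N->1 aggregation or 1->N splitting
    non_utlx_engine: bool = False          # XSLT, JSONata, jq
    independent_chains: bool = False       # inputs with different correlation_ids
    needs_step_traceability: bool = False  # compliance requirement
    input_count: int = 1
    reused: bool = False
    assignments: int = 0


def choose_mode(n: MappingNeeds) -> str:
    # Anything that breaks purity or needs an explicit node goes to Mode 3.
    if (n.external_calls or n.changes_cardinality or n.non_utlx_engine
            or n.independent_chains or n.needs_step_traceability):
        return "Mode 3"
    if n.schemas_match:
        return "Mode 1"
    # Same-chain fan-in, reuse, or a large mapping belongs in the registry.
    if n.input_count > 1 or n.reused or n.assignments >= 20:
        return "Mode 2 ref"
    return "Mode 2 inline"


mode = choose_mode(MappingNeeds(input_count=2))
```

Checking the Mode 3 conditions first reflects that they disqualify the lighter modes regardless of mapping size.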
UTL-X vs DataWeave and alternatives
| Feature | UTL-X | DataWeave | XSLT | JSONata / jq |
|---|---|---|---|---|
| Format agnostic | ✓ XML, JSON, CSV, YAML, Avro, Protobuf | ✓ | ✗ XML only | ✗ JSON only |
| Licence | ✓ AGPL v3 / Commercial | ✗ Proprietary (Salesforce) | ✓ W3C open standard | ✓ MIT / MIT |
| Inline on message arrow | ✓ Mode 2 | ✗ Always a component | ✗ | ✗ |
| Step window history | ✓ Built into MPPM | ✗ | ✗ | ✗ |
| N-input fan-in | ✓ Same-chain, Mode 2 | ~ Via DataWeave scripts | ✗ | ✗ |
| Functional / pure | ✓ | ✓ | ✓ | ✓ |
| Standard library | ✓ 635 functions | ✓ Rich | ~ XPath/EXSLT | ~ Limited |
| LSP / IDE support | ✓ JSON-RPC 2.0 daemon | ✓ MuleSoft IDE only | ~ | ✗ |
| Use without middleware platform | ✓ Standalone CLI | ✗ Requires MuleSoft | ✓ | ✓ |