Parallel steps & dependencies

Z.E.N. workflows are graphs, not lists. A node can depend on zero, one, or many earlier nodes. A node with no unresolved dependencies is ready to run, and all ready nodes run in parallel. The structure of that graph is technically called a DAG (directed acyclic graph); you don't have to think about that name to use it.

The mental model

You write nodes. Each node optionally lists what it depends on:

yaml

nodes:
  - id: a
    type: prompt
    prompt: "Step one"

  - id: b
    type: prompt
    depends_on: [a]
    prompt: "Step two, after step one"

  - id: c
    type: prompt
    depends_on: [a]
    prompt: "Another step two, also after step one"

  - id: d
    type: prompt
    depends_on: [b, c]
    prompt: "Step three, after both"

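Z.E.N. fires a first. As soon as a finishes, b and c fire in parallel because they only need a. When both finish, d fires. Total wall-clock time is roughly time(a) + max(time(b), time(c)) + time(d), not the sum of all four. For example, if a and d take 5 seconds each, b takes 20, and c takes 10, the run takes about 5 + 20 + 5 = 30 seconds instead of 40.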

Why parallel matters

Most useful workflows have steps that don't depend on each other. Pulling Slack and Notion. Searching three databases. Drafting and fact-checking. If the workflow runs them serially, you wait for the sum of all their run times. If the workflow runs them in parallel, you wait only for the slowest individual step. That is often the difference between a workflow that runs in 30 seconds and one that runs in 3 minutes.

What "depends_on" actually checks

When b lists depends_on: [a], two things happen:

Ordering. b won't start until a is in a terminal state (completed, failed, or cancelled).
Failure propagation. By default, if a fails, b is marked skipped. The whole downstream subtree of b is also skipped.

You can override failure propagation per-node with on_failure: continue if downstream work should run even when the upstream failed.
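A minimal sketch of that override. This page doesn't show where on_failure goes, so placing it on the downstream node (the one that should still run) is an assumption, as are the node ids, the /extras path, and the prompt text.

```yaml
nodes:
  - id: fetch-extras
    type: bash
    # -f makes curl exit non-zero on an HTTP error, so this node can fail.
    command: curl -sf https://api.example.com/extras

  - id: report
    type: prompt
    depends_on: [fetch-extras]
    # Assumed placement: run this node even if fetch-extras failed.
    on_failure: continue
    prompt: "Write the daily report"
```

Without the on_failure line, a failed fetch-extras would leave report marked skipped, along with anything downstream of it.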

Outputs flow through the graph

Every node's output is available to its downstream nodes as $<node-id>.output:

yaml

nodes:
  - id: pull-data
    type: bash
    command: curl -s https://api.example.com/things

  - id: summarize
    type: prompt
    depends_on: [pull-data]
    prompt: |
      Summarize the following list of things:

      $pull-data.output

Variables interpolate at render time, just before the node fires.
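The same syntax should work when a node has more than one upstream. Here is a sketch of a morning brief with two independent pulls and a combine step; the node ids and prompt text are made up for illustration, and only the keys and the $<node-id>.output form come from this page.

```yaml
nodes:
  # No depends_on, so both pulls are ready as soon as the workflow starts.
  - id: pull-slack
    type: prompt
    prompt: "List yesterday's important Slack threads"

  - id: pull-notion
    type: prompt
    prompt: "List the Notion pages edited yesterday"

  # Waits for both pulls, then reads each output by node id.
  - id: combine
    type: prompt
    depends_on: [pull-slack, pull-notion]
    prompt: |
      Write a morning brief from these two lists:

      $pull-slack.output

      $pull-notion.output
```

Listing both pulls in depends_on is what keeps combine from starting before both outputs exist.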

The graph has to be acyclic

A node can't depend on itself, directly or indirectly. If your workflow says a -> b -> c -> a, Z.E.N. rejects it at load time. (This is the "acyclic" half of "directed acyclic graph.")
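For illustration, here is that cycle written out, with placeholder prompts. Each depends_on looks reasonable on its own; the loop only shows up when you follow the chain.

```yaml
nodes:
  - id: a
    type: prompt
    depends_on: [c]   # a waits for c
    prompt: "Step a"

  - id: b
    type: prompt
    depends_on: [a]   # b waits for a
    prompt: "Step b"

  - id: c
    type: prompt
    depends_on: [b]   # c waits for b, which closes the loop
    prompt: "Step c"
```

None of these three nodes could ever become ready, which is the kind of workflow that gets rejected at load time.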

If you genuinely need a loop, use a loop node. Those have a controlled iteration model that doesn't break the acyclic guarantee.

Performance tuning

For workflows with many independent nodes, you can cap parallelism so you don't blow through your provider's rate limit:

yaml

concurrency:
  max_parallel_nodes: 4

Set it at the workflow level. The default is unbounded (Z.E.N. will run every available node concurrently). Hit a rate limit once and you'll want to cap.
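A sketch of a capped workflow, assuming concurrency sits at the top level next to nodes (this page only says "workflow level"). The node ids and prompts are hypothetical.

```yaml
concurrency:
  max_parallel_nodes: 2

nodes:
  # Three independent searches: all are ready at once,
  # but the cap lets only two run at a time.
  - id: search-orders
    type: prompt
    prompt: "Search the orders database for refunds"

  - id: search-tickets
    type: prompt
    prompt: "Search the support tickets for refunds"

  - id: search-invoices
    type: prompt
    prompt: "Search the invoices database for refunds"

  - id: merge
    type: prompt
    depends_on: [search-orders, search-tickets, search-invoices]
    prompt: "Merge the three result sets into one summary"
```

With the cap at 2, the third search should start once one of the first two finishes, and merge still waits for all three.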

What you can ignore

The DAG terminology, in normal conversation. When you describe a workflow to someone, "the morning brief has two parallel pulls and a combine step" is what you say. The shape is the DAG; you don't have to call it that.

Next

Author a workflow for the bigger picture.
Loop nodes when you genuinely need a repeating step.
Approval steps when you want a human in the middle.
