Pipeline Builder Overview
Design multi-step data pipelines with a visual node-and-edge canvas — connect subtasks, reviews, and logic gates into end-to-end workflows.
The Pipeline Builder is a visual canvas for designing data pipelines. You connect nodes to define the steps, routing, and quality control in your pipeline. Pipelines range from a single-step form that collects one human response to complex multi-stage configurations with LLM-generated fields, cross-node dependencies, automated quality gates, and multi-round human review loops, all within the same drag-and-drop builder.
Opening the builder
New pipeline:
- Navigate to your project in the sidebar.
- Go to Pipelines and click New Pipeline.
- A Create New Pipeline dialog appears — choose Start from Scratch to create a blank pipeline, or select a template if available.
- The builder opens in a full-screen canvas view.
Existing pipeline:
- Click an existing pipeline to open its Data Explorer.
- Click Pipeline Editor ↗ in the header to open the builder.
Canvas basics
The builder uses a node-and-edge graph model:
- Nodes represent pipeline steps (subtasks, reviews, logic gates).
- Edges connect nodes together, defining the flow of tasks.
- Handles are the connection points on each node — inputs on the left, outputs on the right.
Navigation
- Pan: Click and drag on the canvas background.
- Zoom: Scroll or pinch to zoom in and out.
- Select: Click a node to select it and open its configuration panel.
- Multi-select: Hold Shift and click multiple nodes.
Adding nodes
Drag nodes from the left sidebar onto the canvas.
Connecting nodes
Drag from an output handle to an input handle to create a connection. The builder validates that the overall graph is connected — every node must be reachable from the start node and have a path to an end node.
Pipeline structure
Every pipeline requires:
- Exactly one Start node — the entry point where tasks begin.
- Exactly one End node — where tasks complete.
- A connected path from Start to End through at least one work node.
The builder validates your graph in real time. If the structure is invalid (for example, a node is disconnected or a path is missing), it shows an error count in the header bar. Click the error indicator to see a list of specific issues.
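The structural rules above can be sketched as a small graph check. This is a minimal illustration, not the builder's actual validator: the `nodes` and `edges` shapes, the lowercase kind names, and the error wording are all assumptions made for this example.

```python
from collections import defaultdict, deque

# Hypothetical kind names; the product's internal schema may differ.
WORK_KINDS = {"subtask", "review", "logic_gate"}


def reachable(start, adjacency):
    """Return every node reachable from `start` by breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def validate_pipeline(nodes, edges):
    """nodes: {node_id: kind}; edges: list of (source_id, target_id).

    Returns a list of error strings, empty when the structure is valid.
    """
    errors = []
    starts = [n for n, kind in nodes.items() if kind == "start"]
    ends = [n for n, kind in nodes.items() if kind == "end"]
    if len(starts) != 1:
        errors.append(f"expected exactly one Start node, found {len(starts)}")
    if len(ends) != 1:
        errors.append(f"expected exactly one End node, found {len(ends)}")
    if errors:
        return errors

    forward = defaultdict(list)
    backward = defaultdict(list)
    for src, dst in edges:
        forward[src].append(dst)
        backward[dst].append(src)

    from_start = reachable(starts[0], forward)  # nodes Start can reach
    to_end = reachable(ends[0], backward)       # nodes that can reach End

    for node in sorted(nodes):
        if node not in from_start:
            errors.append(f"{node} is not reachable from Start")
        if node not in to_end:
            errors.append(f"{node} has no path to End")

    # At least one work node must sit on a Start-to-End path.
    on_path = from_start & to_end
    if not any(nodes.get(n) in WORK_KINDS for n in on_path):
        errors.append("no work node on a path from Start to End")
    return errors
```

Running the search once forward from Start and once backward from End keeps the check linear in the size of the graph, which matters if validation runs on every canvas change.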
Node types
| Node | Purpose |
|---|---|
| Start | Entry point where tasks begin. Can only connect to Subtask nodes. |
| Subtask | A work step that can collect human input, serve predefined data, or generate LLM responses. |
| Review | Human quality review of upstream work, with approve and reject/rework paths. |
| Logic Gate | Automated conditional routing with Then/Else branches. |
| End | Completion point where tasks finish. |
Nodes can have multiple input and output connections. See Node Types for detailed documentation on each.
Configuration panel
Clicking any node opens its configuration panel on the right side of the canvas. The panel contents vary by node type:
- Subtask nodes: Form builder for designing the contributor form, LLM field configuration, evaluation bindings.
- Review nodes: Linked subtask selection, review fields, reject reassignment mode, pass assignment mode, max re-review limits.
- Logic gate nodes: Rule builder with conditions and operators.
Pipeline settings
Pipeline-level settings are accessible from the settings panel in the left sidebar. Each section has an enable/disable toggle:
| Setting | Description |
|---|---|
| Claim Timeout | How long a contributor has to complete a claimed task, in hours (1–720), before it is automatically released back to the pool. |
| Claim Limits | Maximum concurrent claims per contributor, configurable per role. |
| Schedule | Time windows when tasks are available for claiming, configurable per role. Includes day-of-week selection and an optional auto-release toggle. |
| Task Groups | Group tasks by a dataset column for contributor uniqueness constraints. Only available when the pipeline uses CSV/dataset seeding — i.e., any node has a Predefined field, a dataset-sourced LLM prompt, a dataset-sourced model, or predefined dynamic options/labels. |
| Tags | Labels for organizing and filtering tasks. Supports predefined tags, optional custom tags from contributors, and role-based tag visibility (all, admin_only, or specific roles). |
The pipeline name is editable inline in the builder header.
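Two of the settings above carry rules that are easy to state in code: the Claim Timeout range and the availability of Task Groups. The sketch below is illustrative only. The `ClaimTimeout` class, the flag names in `SEEDING_FLAGS`, and the choice to skip range checks on a disabled setting are assumptions for this example, not the product's data model.

```python
from dataclasses import dataclass

# Hypothetical per-node flags standing in for the four dataset-seeding
# conditions listed in the Task Groups row.
SEEDING_FLAGS = (
    "has_predefined_field",
    "has_dataset_llm_prompt",
    "has_dataset_model",
    "has_predefined_dynamic_options",
)


@dataclass
class ClaimTimeout:
    enabled: bool
    hours: int


def claim_timeout_errors(setting):
    """Check the documented 1-720 hour range (assumed to apply only when enabled)."""
    if setting.enabled and not 1 <= setting.hours <= 720:
        return [f"claim timeout must be 1-720 hours, got {setting.hours}"]
    return []


def task_groups_available(nodes):
    """nodes: list of dicts. True if any node uses dataset seeding."""
    return any(node.get(flag, False) for node in nodes for flag in SEEDING_FLAGS)
```

For instance, `task_groups_available([{"has_predefined_field": True}])` is true, while a pipeline whose nodes set none of these flags would not offer the Task Groups section.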
Pipeline lifecycle
Pipelines move through the following statuses:
| Status | Description |
|---|---|
| Draft | Initial state. The pipeline is being designed and has not been published. |
| Active | Published and accepting tasks. Contributors can claim and work on tasks. |
| Paused | Published but not actively running. All claimed tasks are reverted: contributors cannot submit work, and tasks from the paused pipeline are hidden from contributor views. When the pipeline is reactivated, tasks are resumed and re-queued. |
| Completed | All tasks have finished. The pipeline is no longer accepting new work. |
| Archived | Retired permanently. Cannot be reactivated. |
| Pending Activation | A new version has been created but is awaiting admin review before it becomes the live version. The previous active version stays live until the admin confirms. This is a version-level status, not a pipeline-level status. |
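The pipeline-level statuses can be pictured as a small state machine. The transition table below is inferred from the descriptions above and is an assumption, not a documented specification. In particular, which statuses can move to Archived or Completed is a guess. Pending Activation is left out because it is a version-level status.

```python
from enum import Enum


class Status(Enum):
    DRAFT = "draft"
    ACTIVE = "active"
    PAUSED = "paused"
    COMPLETED = "completed"
    ARCHIVED = "archived"


# Assumed transitions, inferred from the status descriptions.
ALLOWED = {
    Status.DRAFT: {Status.ACTIVE, Status.PAUSED},      # publish, or publish to paused
    Status.ACTIVE: {Status.PAUSED, Status.COMPLETED, Status.ARCHIVED},
    Status.PAUSED: {Status.ACTIVE, Status.ARCHIVED},   # reactivate or retire
    Status.COMPLETED: {Status.ARCHIVED},
    Status.ARCHIVED: set(),                            # cannot be reactivated
}


def can_transition(current, target):
    """Return True if the assumed table allows moving from current to target."""
    return target in ALLOWED[current]
```

The one rule the table takes directly from the source is that Archived has no outgoing transitions.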
Saving and publishing a draft
Save and publish are separate actions:
- Save stores your work without publishing. You can save a draft at any time and come back later. Validation errors do not block saving: you'll see a warning, but the save goes through.
- Publish & Activate validates the pipeline, creates version 1, and immediately starts accepting tasks. Validation errors must be resolved before publishing.
- Publish to Paused (overflow menu) creates the version without activating — useful when you want to review the version before going live.
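The difference between the three actions comes down to how validation errors are treated. A minimal sketch, assuming validation yields a list of error strings. The function names, return shapes, and the `ValidationBlocked` exception are hypothetical.

```python
class ValidationBlocked(Exception):
    """Raised when publishing is attempted with unresolved validation errors."""


def save_draft(validation_errors):
    # Saving always succeeds; errors are surfaced as warnings only.
    return {"saved": True, "warnings": list(validation_errors)}


def publish(validation_errors, activate=True):
    # Publishing is gated on a clean validation result.
    if validation_errors:
        raise ValidationBlocked(
            f"{len(validation_errors)} error(s) must be resolved before publishing"
        )
    # Publishing a draft creates version 1; activate=False models
    # the Publish to Paused option.
    return {"version": 1, "status": "active" if activate else "paused"}
```

Here `publish([], activate=False)` stands in for Publish to Paused: the version is created, but the pipeline does not start accepting tasks.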
Updating a live pipeline
For pipelines that are already Active or Paused:
- Make your changes in the builder.
- Click Save. The system saves your changes and compares them against the last published version to detect what changed. Validation errors do not block saving.
- If the pipeline has no tasks yet, the new version activates immediately.
- If the pipeline has existing tasks, the version enters Pending Activation and the Version Management Modal opens for the admin to review changes and decide how to handle in-flight tasks. Validation errors must be resolved before activation.
See Versioning & Publishing for details on change tiers, the admin decision UI, and the migration process.
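The save-time comparison and the two possible outcomes can be sketched as follows. This is an assumption-laden illustration. It treats a pipeline as a mapping of node ID to configuration and reports only added, removed, and modified nodes. The real change detection and its change tiers are described in Versioning & Publishing and may work differently.

```python
def diff_nodes(published, draft):
    """Compare two {node_id: config} mappings and list what changed."""
    added = sorted(draft.keys() - published.keys())
    removed = sorted(published.keys() - draft.keys())
    modified = sorted(
        node_id
        for node_id in draft.keys() & published.keys()
        if draft[node_id] != published[node_id]
    )
    return {"added": added, "removed": removed, "modified": modified}


def version_outcome(existing_task_count):
    """Decide what happens to the new version after saving a live pipeline."""
    if existing_task_count == 0:
        return "activates_immediately"
    # With existing tasks, an admin reviews the changes first.
    return "pending_activation"
```

For example, saving a pipeline with 12 existing tasks yields `"pending_activation"`, which corresponds to the Version Management Modal opening for admin review.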