
Evaluations (Compositions)

Reusable bundles of criteria.

An Evaluation is a named, reusable bundle of criteria. It is a grouping primitive in the library: you curate a set of criteria that logically belong together (for example "Response Quality Assessment", "Safety Screen", or "Dataset Benchmark v1") and track the set as one versioned record.

Evaluations are currently a library-only concept: an Evaluation cannot be attached to a pipeline node as a single unit. To use the criteria in a bundle, add each criterion individually to a node via the Evaluators panel or Form Builder.

Structure

| Field | Description | Required |
|---|---|---|
| Name | Internal identifier. | Yes |
| Description | What the bundle evaluates and why. | Yes |
| Criteria | One or more criteria from the library. Each included criterion pins a specific version (see Versioning). | Yes (at least one) |
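
As a rough illustration of the structure above, the record could be modeled as follows. This is a sketch only: the class names, the `criterion_id` field, and the integer version numbers are assumptions made for the example, not the platform's actual schema or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PinnedCriterion:
    criterion_id: str  # hypothetical library identifier
    version: int       # criterion version pinned into the bundle (assumed integer)


@dataclass(frozen=True)
class Evaluation:
    name: str
    description: str
    criteria: tuple[PinnedCriterion, ...]

    def __post_init__(self) -> None:
        # All three fields are required; Criteria needs at least one entry.
        if not self.name.strip():
            raise ValueError("Name is required")
        if not self.description.strip():
            raise ValueError("Description is required")
        if not self.criteria:
            raise ValueError("At least one criterion is required")


# Example bundle; the criterion ids and versions are made up.
safety_screen = Evaluation(
    name="Safety Screen",
    description="Screens responses for unsafe content.",
    criteria=(PinnedCriterion("toxicity", 3), PinnedCriterion("pii-leak", 1)),
)
```

Constructing an `Evaluation` with an empty `criteria` tuple raises `ValueError`, mirroring the "at least one" requirement in the table.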

Creating an Evaluation

  1. Go to Evaluations → Evaluations.
  2. Click New Evaluation.
  3. Enter Name and Description.
  4. In Criteria, add one or more criteria from the library.
  5. Click Save.

Versioning

When you add a criterion to an Evaluation, the criterion's current version at that moment is pinned into the Evaluation.

If an included criterion is later updated in the library, the platform automatically bumps the Evaluation's version and moves the pin to the new criterion version. You do not need to re-open the Evaluation to pick up the change: the latest Evaluation version always follows the criterion as it evolves.

Editing the Evaluation itself (changing the criteria list, name, or description) also creates a new Evaluation version. Past versions are visible in the detail page's version history.
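
The versioning rules in this section can be sketched as a small in-memory model. Everything here is an assumption made for illustration: the class and method names are hypothetical, versions are modeled as integers starting at 1, and the platform's real storage and numbering may differ.

```python
from dataclasses import dataclass, field


@dataclass
class EvaluationRecord:
    name: str
    description: str
    pins: dict[str, int]  # criterion id -> pinned criterion version
    version: int = 1      # assumed starting version
    history: list[dict] = field(default_factory=list)

    def _bump(self) -> None:
        # Save the outgoing version to the history, then move to the next one.
        self.history.append(
            {"version": self.version, "name": self.name,
             "description": self.description, "pins": dict(self.pins)}
        )
        self.version += 1

    def add_criterion(self, criterion_id: str, current_version: int) -> None:
        # Editing the criteria list creates a new Evaluation version and
        # pins the criterion's version at the moment it is added.
        self._bump()
        self.pins[criterion_id] = current_version

    def on_criterion_updated(self, criterion_id: str, new_version: int) -> None:
        # A library update to an included criterion bumps the Evaluation
        # and moves the pin; criteria outside the bundle are ignored.
        if criterion_id not in self.pins:
            return
        self._bump()
        self.pins[criterion_id] = new_version


ev = EvaluationRecord("Safety Screen", "Screens unsafe content.", {"toxicity": 3})
ev.on_criterion_updated("toxicity", 4)
print(ev.version, ev.pins)  # 2 {'toxicity': 4}
print(ev.history[0]["pins"])  # {'toxicity': 3}
```

The history entry keeps the earlier pin, which is how a past Evaluation version can still show the criterion version it contained at the time.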

Archiving and deletion

The same rules apply as for criteria: deleting an Evaluation that is in use archives it, while deleting an unused one removes it permanently. To show archived items in the library again, click the archive icon in the sidebar.
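
The delete rule reduces to a single branch on whether the Evaluation is in use. A minimal sketch, with hypothetical names; what counts as "in use" is not defined on this page, so it is passed in as a plain flag:

```python
from enum import Enum


class DeleteOutcome(Enum):
    ARCHIVED = "archived"  # kept, and shown again via the archive icon
    DELETED = "deleted"    # removed permanently


def delete_evaluation(in_use: bool) -> DeleteOutcome:
    """Return what a delete request does under the rule described above."""
    return DeleteOutcome.ARCHIVED if in_use else DeleteOutcome.DELETED
```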