Custom Models (BYOK)
Bring your own API keys for Fireworks, Together AI, Bedrock, HuggingFace, and OpenAI-compatible endpoints.
Pipelines supports bring-your-own-key (BYOK) models. You register a model under your organization with credentials you control, and workflows can then select it alongside the built-in platform models.
- Platform models are pre-configured by Pipelines and routed through OpenRouter. No credentials required.
- Custom models (BYOK) are registered by you, authenticate directly against your provider, and appear in the workflow model picker under "Custom".
You find, add, edit, and remove custom models from Models in the left sidebar.
Supported providers
The Provider dropdown in the Add Custom Model dialog exposes these five options:
| Provider | What it's for | Required credentials |
|---|---|---|
| Fireworks | Fireworks-hosted and fine-tuned models | API Key |
| Together AI | Together AI-hosted models | API Key |
| Bedrock | Models on your AWS Bedrock account (Anthropic, Meta, Mistral, Amazon, etc.) | AWS Region + either IAM Access Key pair or Bedrock API Key (bearer token) |
| HuggingFace | Models behind a HuggingFace Inference Endpoint | API Token + Inference Endpoint URL |
| OpenAI-compatible | Any server that implements the OpenAI chat completions API (vLLM, LiteLLM proxy, Ollama gateways, self-hosted proxies, and in practice the OpenAI public API) | Base URL (required) + API Key (optional) |
Pipelines does not expose a dedicated "OpenAI", "Anthropic", or "Google" provider type for BYOK.
- Google (Gemini) models: use the built-in platform models — there is no BYOK path for Google.
- OpenAI models: available as platform models, or via
OpenAI-compatiblepointed athttps://api.openai.com/v1.- Anthropic models: available as platform models, or via
Bedrockon AWS.
Who can manage and use custom models
- Managing custom models (add, edit, test, delete) requires the Org Admin role.
- Using custom models — selecting them in the workflow model picker — is available to every user with access to the org, including Project Admins and Contributors assigned to a project in that org.
Adding a custom model
- From Models in the sidebar, click Add Custom Model.
- Pick a Provider. Credential fields below the dropdown change to match.
- (Optional) If your org already has another active model for the same provider with a stored API key, a Reuse saved credentials dropdown appears. Selecting an existing model copies its API key and Base URL (where applicable); the credential inputs are disabled and show "(Using saved credentials)".
- For Bedrock, the reused source's authentication method must match the one you select below — cross-method reuse is rejected.
- If the source model has no stored API key, the server rejects reuse with a validation error.
- Fill in the provider-specific credentials (see Per-provider fields).
- Fill in the common fields:
- Provider Model ID (required) — the exact model identifier your provider expects (e.g.,
accounts/acme/models/llama-ft-v2on Fireworks,anthropic.claude-3-5-sonnet-20240620-v1:0on Bedrock). Placeholder text shows a per-provider example. - Display Name (required) — shown in the workflow model picker. If cleared, the picker falls back to the Model Slug.
- Model Slug (required) — stable identifier used in workflow configs and CSV seeding. Auto-derived from Display Name (lowercased, non-alphanumerics collapsed to
-), but editable. The slug must not match an existing platform model ID, and must not clash with another active custom model in the same org — the server rejects collisions with HTTP 409. See Reactivating a deleted model for how slug reuse interacts with deleted models. - Max Tokens (required, default
4096, minimum1) — completion cap passed to the provider (not the model's total context window). Pipelines does not clamp this; oversized values surface as provider errors. - Capability toggles (see Capability flags).
- Input cost/token (USD) / Output cost/token (USD) (optional) — used for cost tracking in analytics. Plain USD-per-token (e.g.,
0.0000003for $0.30 per 1M input tokens). Leave blank to disable cost tracking; Test Connection may auto-fill suggested values.
- Provider Model ID (required) — the exact model identifier your provider expects (e.g.,
- Click Test Connection, then Save Model. The test sends a minimal prompt asking the model to return
{"status":"ok"}and shows "Connection test passed." or the exact provider error.- The test honors your Supports JSON mode toggle: on uses the provider's native JSON-schema API; off uses prompt instructions only. Turn JSON mode off before testing if your provider lacks native structured output.
- On success, cost fields auto-fill from Pipelines' pricing lookup only if they were still empty. If no pricing is available, a yellow "Cost tracking requires per-token pricing" warning appears.
Save Model is disabled in the Add dialog until Test Connection succeeds. Changing Provider, Max Tokens, or any capability toggle clears the success state. Edits to credentials, Provider Model ID, reasoning parameter name, or costs do not invalidate the last test — re-test yourself if you change them.
In the detail page's Edit mode, Save is not gated on a successful test. Leaving a credential field blank during an Edit-mode test reuses the stored credential.
Per-provider fields
Fireworks / Together AI
| Field | Required | Description |
|---|---|---|
| API Key | Yes | Provider API key (fw_... for Fireworks). |
Bedrock
Bedrock has two authentication methods. Pick one from Authentication Method:
-
IAM Key Pair (Access Key + Secret Key)
Field Required Description AWS Region Yes Bedrock region, e.g., us-west-2. A bad region surfaces as a Test Connection error.Access Key ID Yes IAM access key ID ( AKIA...). Shown in full on the detail page so you can see which IAM identity is configured.Secret Access Key Yes IAM secret access key. Encrypted at rest; never returned by the API. -
Bedrock API Key (Bearer Token)
Field Required Description AWS Region Yes Bedrock region. Bedrock API Key Yes Short-term (12h) or long-term (30 day) API key from the AWS Bedrock console.
Switching Authentication Method on an existing Bedrock model requires fresh credentials in the same request; the server rejects cross-method reuse with a 422.
HuggingFace
| Field | Required | Description |
|---|---|---|
| API Token | Yes | HuggingFace token (hf_...). |
| Inference Endpoint URL | Yes | A full HuggingFace-compatible inference URL. Supported values include a deployed Inference Endpoint (e.g., https://xxxx.us-east-1.aws.endpoints.huggingface.cloud) or the Serverless Inference API for a specific model (e.g., https://api-inference.huggingface.co/models/<repo-id>). Paste the full URL — not just a Hub repo path. |
OpenAI-compatible
| Field | Required | Description |
|---|---|---|
| Base URL | Yes | Root URL of the OpenAI-compatible endpoint, e.g., https://my-vllm.example.com/v1. Required so requests are never routed to an unintended endpoint. The server enforces this even on PATCH. |
| API Key | No | Only required if your endpoint authenticates requests. Leave blank for open endpoints. |
Capability flags
These toggles tell the platform what the model can do. They control where the model is selectable in the workflow builder and how Pipelines calls it.
| Toggle | Default | Effect |
|---|---|---|
| Supports JSON mode | On | Turn off if the provider cannot return structured JSON natively. |
| Supports vision / image inputs | Off | Turn on if the model accepts image content. |
| Supports tool use | On | Turn off if the model does not support tool/function calling. When off, the model is hidden on LLM fields that have tool bindings. |
| Supports extended reasoning | Off | Turn on for reasoning/thinking-capable models. Reveals the Reasoning parameter name field (see below). |
Capability flags are not auto-detected — set them based on what the underlying model supports. Incorrect flags can cause runtime errors (e.g., JSON mode on when the provider doesn't support it) or silently hide the model from the picker. All toggles can be changed later via Edit on the detail page.
Reasoning parameter name
When Supports extended reasoning is on, the Reasoning parameter name field selects how the reasoning budget is sent to the underlying model. There are exactly two valid values:
| Value | Sent as | Use for |
|---|---|---|
max_tokens (default) | {"reasoning": {"max_tokens": N}} | Anthropic (Claude) and Google (Gemini) reasoning models. |
effort | {"reasoning": {"effort": "<low|medium|high>"}} | OpenAI (GPT reasoning) and xAI (Grok) reasoning models. |
Notes:
- The field is free-text, but only
max_tokensandeffortare recognized. Any other value (including typos likeMax_Tokensorreasoning_effort) silently falls back tomax_tokensbehavior. - Only meaningful when the underlying model supports an extended-reasoning API. Enabling it on a non-reasoning model does nothing or produces provider errors.
- For BYOK, the reasoning payload goes directly to your provider — not through OpenRouter.
- The typical BYOK case is Claude on Bedrock — leave the default
max_tokens.
What happens after you save
- The model appears immediately in Models (under the Custom type filter) and in the workflow model picker under the "Custom" group.
- The Pipeline Builder filters the picker by required capabilities (e.g., an LLM field with a tool binding only lists models with Supports tool use on).
- The Model Slug is the stable reference written into workflow configs and accepted by CSV seeding flows.
- Credentials are encrypted at rest and never returned by the API. The detail page shows only a masked
****<last4>for the stored secret (Bedrock Secret Access Key, bearer token, or provider API key); very short secrets display as plain****. The Bedrock Access Key ID is not a secret and is shown in full so you can see which IAM identity is configured.
Managing custom models
On the Models page, click a custom model row to open its detail page. From there you can:
- Edit — change the Display Name, Provider Model ID, endpoint URL / Base URL, Max Tokens, every capability toggle (including Supports extended reasoning and its Reasoning parameter name), per-token costs, and replace the API key. For Bedrock you can also change AWS Region, Access Key ID, and Authentication Method (switching methods requires new credentials).
- Secret fields show the placeholder "Leave blank to keep existing". Leaving a secret blank preserves the stored credential; entering a new value rotates it. There is no UI affordance to remove a stored secret without replacing it — delete the model if you need to decommission credentials.
- Permanent after creation: Provider and Model Slug. If you need to change either, delete the model and re-add it (you can reuse the same Model Slug only if the new Provider matches the old one).
- Test — re-run the connection test with the currently saved or edited values. Unlike Add, editing and saving is not gated on a successful test here.
- Delete — deactivates the model. The confirmation dialog shows the number of active workflows still referencing the model's slug. Those workflows will fail to generate LLM responses until a different model is assigned.
Reactivating a deleted model
You can re-add a model with the same slug after deletion, which reactivates the inactive record — but only if the provider type matches (the server returns 409 otherwise). Reactivation overwrites all fields from the new request (display name, provider model ID, credentials, capability flags, costs), so you don't need to match the previous settings.
When to use BYOK vs. platform models
| Consideration | Platform models | BYOK (custom models) |
|---|---|---|
| Setup | None | Provider account + credentials |
| Routing | Through OpenRouter | Direct to your provider |
| Model selection | Pipelines' curated platform model list | Any model your account can reach |
| Rate limits | Shared pool on the Pipelines platform | Your account's rate limits |
| Data flow | Egress to OpenRouter and the upstream provider | Egress to the provider (or your proxy) only |
| Cost tracking in analytics | Built in | Requires input/output cost per token to be set on the model |
Use platform models when you want zero setup and are fine with OpenRouter's routing. Use BYOK when you need a specific model not on the platform list, want your own account's rate limits, or must meet data residency / egress requirements that preclude routing through OpenRouter.
Example: adding a Fireworks fine-tune
In the Add Custom Model dialog:
- Provider:
Fireworks+ Fireworks API key. - Provider Model ID:
accounts/your-org/models/llama-3-1-8b-instruct-v4(use the exact ID from Fireworks). - Display Name:
Llama 3.1 8B (internal v4)— slug auto-fills tollama-3-1-8b-internal-v4. - Max Tokens:
4096. - Leave JSON mode and tool use on; turn vision and extended reasoning off.
Run Test Connection, then Save Model. If the pricing lookup misses, fill the cost fields manually to enable analytics cost tracking. The model appears in the workflow model picker under "Custom" as Llama 3.1 8B (internal v4).