
Temporal Nexus - in Temporal Cloud

SUPPORT, STABILITY, and DEPENDENCY INFO

Temporal Nexus is available in Public Preview.

This Temporal Nexus guide covers using Temporal Nexus in Temporal Cloud.

Introduction to Nexus

Nexus Services are exposed from a Nexus Endpoint that is created in the Nexus Registry. Nexus Endpoints include service documentation that enables others to use the Nexus Services provided by a Nexus Endpoint, for example from caller Workflows.

Nexus Overview

Temporal has built-in Nexus Machinery that performs the low-level Nexus RPC operations on the wire and provides an integrated Temporal SDK experience to create Nexus Services in a handler Worker and use them from a caller Workflow through a Nexus Endpoint.

In Temporal Cloud, the Nexus Machinery routes Nexus requests across Namespace boundaries through a secure mTLS Envoy mesh, where they are sync matched to a Worker in the handler Namespace, which processes the Nexus request.

Nexus caller Workflows will send commands to Temporal Cloud to schedule or cancel a Nexus Operation in the caller Namespace. Handler Workers will poll for Nexus tasks in the handler Namespace.

Nexus Operation handlers can be implemented as Asynchronous Operations or Synchronous Operations, using Temporal SDK builders that enable running a Nexus Service in a Worker, typically the same Worker as the underlying primitives the Nexus Service abstracts.

For example, an asynchronous operation that starts a Workflow in a different Namespace:

Unlike a traditional RPC, an arbitrary-duration Nexus Operation has an identity (ID) that lasts beyond the process lifetimes of the caller and handler, and can be used to re-attach to a long-lived Nexus Operation, for example one backed by a Temporal Workflow.

When you schedule a Nexus Operation in a caller Workflow, a command is sent to Temporal to schedule the Operation, and then the Temporal Nexus Machinery is responsible for making the Nexus RPC calls on your behalf. This means you don’t have to use Nexus RPC directly, only the Temporal SDK along with a Temporal Service.


Similar to executing an Activity, to execute a Nexus Operation from a Workflow:

  1. The caller Workflow schedules a Nexus Operation using the Temporal SDK, which sends a ScheduleNexusOperation command in the existing RespondWorkflowTaskCompleted RPC. The command is recorded as NexusOperationScheduled in the caller’s Workflow history.
  2. The Nexus handler Worker uses the Temporal SDK to PollNexusTaskQueue and process the Nexus request.
  3. The Nexus handler Worker sends a RespondNexusTaskCompleted through the SDK with the result of the Nexus request, which may be one of: Started, Completed, Failed, TimedOut, or Canceled. See Nexus Operation Events for more info.
  4. The caller Workflow then processes the Nexus Operation event through PollWorkflowTaskQueue.

See the Temporal Nexus encyclopedia entry for more details.

Access Control

Like Namespaces, a Nexus Endpoint is an account-scoped resource that is global within a Temporal Cloud account. Any user with the Developer role (or higher) in an account who is also a Namespace Admin on the Endpoint’s target Namespace can manage (create, update, or delete) a Nexus Endpoint. All users with a Read-only role (or higher) in an account can view and browse the full list of Endpoints.

Runtime access from a Workflow in a caller Namespace to a Nexus Endpoint is controlled by an allowlist policy (of caller Namespaces) for each Endpoint in the Nexus API registry. Workers authenticate with Temporal Cloud as they do today with mTLS certificates or API keys as allowed by the Namespace configuration. In Temporal Cloud, Nexus requests are sent from the caller’s Namespace to the handler’s Namespace over a secure multi-region mTLS Envoy mesh.
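As a rough illustration of the allowlist idea, the following Go sketch models an Endpoint with a target Namespace and a set of allowed caller Namespaces. The Endpoint type, its fields, and the Authorize method are hypothetical names invented for this example; they are not part of the Temporal API, and the real check is performed by Temporal Cloud.

```go
package main

import "fmt"

// Endpoint is a hypothetical, simplified model of a Nexus Endpoint entry:
// a target (handler) Namespace plus an allowlist of caller Namespaces.
type Endpoint struct {
	Name            string
	TargetNamespace string
	AllowedCallers  map[string]bool
}

// Authorize returns nil when callerNamespace is on the allowlist,
// and an error otherwise.
func (e Endpoint) Authorize(callerNamespace string) error {
	if !e.AllowedCallers[callerNamespace] {
		return fmt.Errorf("namespace %q is not on the allowlist for endpoint %q",
			callerNamespace, e.Name)
	}
	return nil
}

func main() {
	ep := Endpoint{
		Name:            "myendpoint",
		TargetNamespace: "handler-ns",
		AllowedCallers:  map[string]bool{"caller-ns": true},
	}
	fmt.Println("caller-ns:", ep.Authorize("caller-ns"))
	fmt.Println("other-ns:", ep.Authorize("other-ns"))
}
```

In this model a caller Namespace that is missing from the allowlist is rejected before any request reaches the handler Namespace.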

See Nexus Security for additional details.

Automatic Retries

Once the caller Workflow schedules an Operation with the caller’s Temporal Service, the caller’s Nexus Machinery keeps trying to start the Operation, with automatic retries and exponential backoff. If a Nexus Operation returns a retryable error when attempting to start, the Operation is retried up to the default Retry Policy’s maximum attempts and expiration interval.
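To make exponential backoff concrete, here is a minimal Go sketch of how retry delays could grow between attempts. The BackoffPolicy type and the values used in main (1 second initial interval, coefficient 2, 10 second cap) are assumptions chosen for illustration only; they are not the defaults used by the Nexus Machinery.

```go
package main

import (
	"fmt"
	"time"
)

// BackoffPolicy is a hypothetical policy type for this sketch.
type BackoffPolicy struct {
	InitialInterval time.Duration // delay before the first retry
	Coefficient     float64       // multiplier applied after each attempt
	MaxInterval     time.Duration // upper bound on any single delay
}

// NextDelay returns the delay to wait after the given failed attempt
// (1-based): InitialInterval * Coefficient^(attempt-1), capped at MaxInterval.
func (p BackoffPolicy) NextDelay(attempt int) time.Duration {
	delay := float64(p.InitialInterval)
	for i := 1; i < attempt; i++ {
		delay *= p.Coefficient
		if delay >= float64(p.MaxInterval) {
			return p.MaxInterval
		}
	}
	if delay > float64(p.MaxInterval) {
		return p.MaxInterval
	}
	return time.Duration(delay)
}

func main() {
	// Illustrative values, not Temporal defaults.
	p := BackoffPolicy{InitialInterval: time.Second, Coefficient: 2, MaxInterval: 10 * time.Second}
	for attempt := 1; attempt <= 6; attempt++ {
		fmt.Printf("after attempt %d wait %v\n", attempt, p.NextDelay(attempt))
	}
}
```

With these assumed values the delays double from 1 second until they reach the 10 second cap, after which every retry waits the capped interval.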

See Nexus Automatic Retries for additional details.

Failure Handling

In a Nexus Operation handler, you can throw a Nexus HandlerError with an associated Nexus HandlerErrorType. If an unknown error is thrown from a Nexus handler, it is treated as a retryable 500 InternalServerError, which Nexus retries until the Schedule-to-Close Timeout is exceeded (capped by the server at 60 days). See Nexus Operation duration limits.

All retryable errors are automatically retried by the Nexus Machinery, as discussed in Automatic Retries. If a non-retryable error is returned from the handler, then it is returned as a Nexus Operation Failure to the caller Workflow. The list of non-retryable errors is currently not configurable.
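The decision described above can be sketched in Go as a small classifier. Everything here is a simplified stand-in: the HandlerError struct, its Retryable field, the Outcome values, and the type labels used in main are hypothetical and do not mirror the Nexus SDK types. The only behavior taken from the text is that unknown errors are treated as retryable and non-retryable errors fail the Operation.

```go
package main

import (
	"errors"
	"fmt"
)

// HandlerError is a hypothetical stand-in for an error a handler raises
// deliberately, carrying a type label and an explicit retryability flag.
type HandlerError struct {
	Type      string
	Retryable bool
	Message   string
}

func (e *HandlerError) Error() string { return e.Type + ": " + e.Message }

// ErrDatabaseDown is an example of an "unknown" error: a plain error that
// the handler did not wrap in a HandlerError.
var ErrDatabaseDown = errors.New("database unavailable")

// Outcome is what the caller side does with a failed start attempt.
type Outcome string

const (
	Retry         Outcome = "retry"
	FailOperation Outcome = "fail operation"
)

// Classify expects a non-nil error. A non-retryable HandlerError fails the
// Operation; every other error, including unknown ones, is retried.
func Classify(err error) Outcome {
	var he *HandlerError
	if errors.As(err, &he) && !he.Retryable {
		return FailOperation
	}
	return Retry
}

func main() {
	fmt.Println(Classify(ErrDatabaseDown))
	fmt.Println(Classify(&HandlerError{Type: "BAD_REQUEST", Message: "missing name"}))
	fmt.Println(Classify(&HandlerError{Type: "UNAVAILABLE", Retryable: true, Message: "try later"}))
}
```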

A Nexus Operation Failure is delivered to the Workflow Execution when a Nexus Operation fails. It contains information about the failure and the Nexus Operation Execution; for example, the Nexus Operation name and Nexus Operation Id. The reason for the failure is in the message, and the underlying cause is typically an Application Error or a Canceled Error.

note

This differs from how Activities and Workflows handle errors and retries.

See Failures and Automatic Retries for additional details.

Execution Semantics

At-least-once Execution Semantics and Idempotency

Nexus Operations, like Activities, have at-least-once execution semantics: the caller's Nexus Machinery may try to start the Operation multiple times. As with Activities, Nexus Operation handlers should be idempotent. Idempotency is not required in every case, but it is highly recommended.

At-most-once Execution Semantics through an Underlying WorkflowIdReusePolicy

To dedupe work and get at-most-once execution semantics, an Operation can start a Workflow with a WorkflowIdReusePolicy of RejectDuplicates, which allows only one Workflow Execution per Workflow Id within a Namespace for the retention period.
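The deduplication effect can be illustrated with a small in-memory model in Go. This is only a sketch of the idea: the Registry type, its Start method, and ErrAlreadyStarted are hypothetical names, and in a real deployment the uniqueness check is enforced by the Temporal Service, not by handler code.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrAlreadyStarted is returned when a Workflow Id has already been used.
var ErrAlreadyStarted = errors.New("workflow id already used")

// Registry is a hypothetical in-memory stand-in for a service that rejects
// duplicate Workflow Ids.
type Registry struct {
	mu      sync.Mutex
	started map[string]bool
	Runs    int // number of executions actually started
}

func NewRegistry() *Registry {
	return &Registry{started: make(map[string]bool)}
}

// Start begins an execution for workflowID at most once. Repeated calls
// with the same id are rejected instead of starting duplicate work.
func (r *Registry) Start(workflowID string) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.started[workflowID] {
		return ErrAlreadyStarted
	}
	r.started[workflowID] = true
	r.Runs++
	return nil
}

func main() {
	r := NewRegistry()
	// Simulate the same start request being delivered three times.
	for attempt := 1; attempt <= 3; attempt++ {
		fmt.Printf("attempt %d: %v\n", attempt, r.Start("order-42"))
	}
	fmt.Println("executions started:", r.Runs)
}
```

Because the retried requests reuse the same Workflow Id, only the first attempt starts work, which is what turns at-least-once delivery into at-most-once execution in this model.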

Execution Debugging

Execution debugging with Nexus includes end-to-end executions that span:

  • Caller Workflows
  • One or more Nexus Operations routed within and across Namespaces
  • Underlying Temporal primitives created by a Nexus Operation handler like a Workflow

Multi-level Nexus calls are supported:

  • Workflow A → Nexus Op 1 → Workflow B → Nexus Op 2 → Workflow C

Underlying Workflow ID is returned as the Nexus Operation ID

When a caller Workflow starts a Nexus Operation that is processed by a Temporal SDK New-Workflow-Run-Operation handler, the underlying Workflow ID is returned as the Nexus Operation ID, which is reflected in the Nexus Operation Started event in the caller’s Workflow history.

Workflow history

This can be used to search the handler’s Namespace for that Workflow ID:

Search handler's Namespace

This may also be done using: temporal workflow show --detailed

--------------- [5] NexusOperationScheduled ---------------
endpoint: myendpoint
endpointId: 80a4fb3e7ab145eabc6a3b15e327548f
eventTime: 2024-08-28T03:44:34.985230930Z
input.Language: es
input.Name: Nexus
operation: say-hello
requestId: 1307660f-7f2e-4626-8629-851a0e468482
scheduleToCloseTimeout: 0s
service: my-hello-service
taskId: 158300487
version: 1265
workflowTaskCompletedEventId: 4

--------------- [6] NexusOperationStarted ---------------
eventTime: 2024-08-28T03:44:35.198292012Z
operationId: 1307660f-7f2e-4626-8629-851a0e468482
requestId: 1307660f-7f2e-4626-8629-851a0e468482
scheduledEventId: 5
taskId: 158300491
version: 1265

The Workflow ID can then be searched using: temporal workflow list --query.

However, this requires manual steps and knowledge of the Endpoint’s target Namespace, which is why bi-directional linking for Nexus Operations exists: it lets you navigate forwards and backwards across Workflow event histories, through the Nexus Operations and the underlying Temporal primitives they may create.

Pending Operations

Similar to pending Activities, pending Nexus Operations are displayed in the Workflow details page and using: temporal workflow describe.

For example, from the Temporal UI:
Pending Operations

For example, from the temporal CLI:

temporal workflow describe

Pending Nexus Operations: 1

Endpoint myendpoint
Service my-hello-service
Operation echo
OperationID
State BackingOff
Attempt 6
ScheduleToCloseTimeout 0s
NextAttemptScheduleTime 20 seconds from now
LastAttemptCompleteTime 11 seconds ago
LastAttemptFailure {"message":"unexpected response status: \"500 Internal Server Error\": internal error","applicationFailureInfo":{}}

Pending Callbacks

Nexus callbacks are sent from the handler’s Namespace to the caller’s Namespace to complete an asynchronous Nexus Operation. These show up in the UI and using: temporal workflow describe.

For example, from the Temporal UI:
Pending Callbacks

For example, from the temporal CLI:

temporal workflow describe


Callbacks: 1

URL https://nexus.phil-caller-Namespace.a2dd6.cluster.tmprl.cloud:7243/Namespaces/phil-caller-Namespace.a2dd6/nexus/callback
Trigger WorkflowClosed
State Succeeded
Attempt 1
RegistrationTime 32 minutes ago

Bi-directional linking

To support deep linking for Nexus Operations that may span Workflows across Namespace boundaries and invoke Updates, Signals, and other operations on a Workflow, linking support is available for Workflow history events. This enables a given history event to link to events in other Workflows, including across Namespace boundaries.

Temporal SDK Nexus Operation builders, like NewWorkflowRunOperation, use this Event History linking capability to auto-wire bi-directional links from a specific Nexus Operation event in the caller’s Workflow history to a specific event in the handler’s Workflow history. In Public Preview, this auto-wiring is supported only for NewWorkflowRunOperation; it is not currently supported for NewSyncOperation.

This enables bi-directional navigation across Workflow histories:

  • Forward through the Nexus Operation execution:
    • From a Nexus Operation event in the caller’s Workflow history.
    • To the underlying event in the handler’s Workflow.
  • Backwards through the Nexus Operation execution:
    • From the underlying event in the handler’s Workflow.
    • To a Nexus Operation event in the caller’s Workflow history.

Metrics

Scheduling and processing a Nexus Operation is reported through existing cloud metrics with the following operation metric labels:

  • Caller Namespace
    • RespondWorkflowTaskCompleted: This is used to schedule the Nexus Operation.
  • Handler Namespace
    • PollNexusTaskQueue
    • RespondNexusTaskCompleted
    • RespondNexusTaskFailed

See Cloud Metrics for additional details.

Audit Logging

The following Nexus control plane actions are sent to the Audit Logging integration:

  • Create Nexus Endpoint: CreateNexusEndpoint
  • Update Nexus Endpoint: UpdateNexusEndpoint
  • Delete Nexus Endpoint: DeleteNexusEndpoint

See Audit Logging for additional details.

Rate Limiting

Nexus requests (commands, polling) are counted as part of the overall Namespace RPS limit in both the caller and handler Namespaces. The default Namespace RPS limit is 1600 and adjusts automatically based on recent usage (over the prior 7 days).

SLOs & SLAs

Nexus requests (commands, polling) have the same latency SLOs and error rate SLAs as other Worker requests in both the caller and handler Namespaces.

See Availability and SLA for additional details.

Limits

Max Nexus Endpoints

By default, each account is provisioned with a maximum of 10 Nexus Endpoints. You can request an increase beyond the initial limit of 10 Endpoints by opening a support ticket.

Workflow Max Nexus Operations

A single Workflow Execution can have a maximum of 30 in-flight Nexus Operations and 30 total Nexus Operations (as Public Preview does not yet remove completed Nexus Operations from mutable state). After that limit is reached, no more Nexus Operations will be processed for that Workflow Execution.
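One way to avoid running into this limit unexpectedly is to track how many Operations a Workflow Execution has scheduled. The Go sketch below shows such a counter; the OperationBudget type and its methods are hypothetical helpers invented for this example, and the constant simply restates the limit of 30 given above.

```go
package main

import "fmt"

// MaxNexusOperationsPerWorkflow restates the Public Preview limit described
// in the text: 30 total Nexus Operations per Workflow Execution.
const MaxNexusOperationsPerWorkflow = 30

// OperationBudget is a hypothetical helper that counts scheduled Operations.
type OperationBudget struct {
	used int
}

// Reserve records one more scheduled Operation, or returns an error when
// the budget for this Workflow Execution is exhausted.
func (b *OperationBudget) Reserve() error {
	if b.used >= MaxNexusOperationsPerWorkflow {
		return fmt.Errorf("nexus operation limit of %d reached", MaxNexusOperationsPerWorkflow)
	}
	b.used++
	return nil
}

// Remaining reports how many Operations can still be scheduled.
func (b *OperationBudget) Remaining() int {
	return MaxNexusOperationsPerWorkflow - b.used
}

func main() {
	var budget OperationBudget
	for i := 1; i <= 31; i++ {
		if err := budget.Reserve(); err != nil {
			fmt.Printf("operation %d rejected: %v\n", i, err)
		}
	}
	fmt.Println("remaining:", budget.Remaining())
}
```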

Nexus Request Handler Timeout

Nexus Operation handlers have less than 10 seconds to process a single Nexus start or cancel request. Handlers should observe the context deadline and ensure they do not exceed it. This includes fully processing a synchronous Nexus Operation and starting an asynchronous Nexus Operation, for example one that starts a Workflow. If a handler doesn’t respond within the context deadline, a context deadline exceeded error is tracked in the caller Workflow’s pending Nexus Operations, and the Nexus Machinery retries the Nexus request with an exponential backoff policy.
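Observing the context deadline can be sketched with the Go standard library alone. The handleStart and runWithBudget functions below are hypothetical and only simulate work with a timer; a real handler would pass the context it receives to every downstream call. The millisecond durations are shortened for the example and stand in for the roughly 10 second budget.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// handleStart simulates a handler whose processing takes `work` time.
// It returns as soon as the context deadline passes instead of overrunning.
func handleStart(ctx context.Context, work time.Duration) error {
	timer := time.NewTimer(work)
	defer timer.Stop()
	select {
	case <-timer.C:
		return nil // finished within the deadline
	case <-ctx.Done():
		return ctx.Err() // deadline exceeded or request canceled
	}
}

// runWithBudget wraps handleStart in a context with the given time budget.
func runWithBudget(budget, work time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), budget)
	defer cancel()
	return handleStart(ctx, work)
}

func main() {
	// Fast handler: completes well inside its budget.
	fmt.Println("fast:", runWithBudget(200*time.Millisecond, 10*time.Millisecond))
	// Slow handler: gives up when the budget runs out.
	fmt.Println("slow:", runWithBudget(20*time.Millisecond, 500*time.Millisecond))
}
```

Returning promptly on ctx.Done() lets the retry with backoff begin from a clean state instead of leaving abandoned work running past the deadline.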

Nexus Operation Maximum Duration

Each Nexus Operation has a maximum ScheduleToClose duration of 60 days. This limit is most relevant to asynchronous Nexus Operations, which are completed with an asynchronous callback that uses a separate Nexus request from the handler back to the caller Namespace. The 60-day maximum may be increased in the future. The caller of a Nexus Operation can configure a ScheduleToClose duration shorter than 60 days, but cannot extend it beyond 60 days; longer values are capped by the server at 60 days.
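The capping rule can be expressed as a small Go function. This is a sketch of the arithmetic only, not server code: EffectiveScheduleToClose is a hypothetical name, and treating a zero or negative value as "not set, so use the maximum" is an assumption made for this example.

```go
package main

import (
	"fmt"
	"time"
)

// MaxScheduleToClose restates the 60-day server-side cap from the text.
const MaxScheduleToClose = 60 * 24 * time.Hour

// EffectiveScheduleToClose returns the duration that would apply after the
// cap. Assumption for this sketch: a non-positive value means "not set"
// and falls back to the maximum.
func EffectiveScheduleToClose(requested time.Duration) time.Duration {
	if requested <= 0 || requested > MaxScheduleToClose {
		return MaxScheduleToClose
	}
	return requested
}

func main() {
	for _, requested := range []time.Duration{0, time.Hour, 90 * 24 * time.Hour} {
		fmt.Printf("requested %v, effective %v\n", requested, EffectiveScheduleToClose(requested))
	}
}
```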

Secure Routing

Nexus Endpoints are only privately accessible from within Temporal Cloud, and mTLS is used for all Nexus communication, including across cloud cells and regions. Workers authenticate to their Namespaces through mTLS or an API key as allowed by their Namespace configuration.

See Nexus Secure Routing for additional details.

Payload Encryption

For payload encryption, the DataConverter works the same for a Nexus Operation as it does for other payloads sent between a Worker and Temporal Cloud.

See Nexus Payload Encryption & Data Converter for additional details.

Pricing

Starting or canceling a Nexus Operation in the caller Namespace results in one Action.

The underlying Temporal primitives such as Workflows, Activities, Signals created by a Nexus Operation handler (directly or indirectly) result in the normal Actions for those primitives. This includes retries for underlying Temporal primitives like Activities.

No Action results for handling or retrying the Nexus Operation itself. However, while the retry of the Nexus Operation incurs no charge, any billable action initiated by the handler (such as an Activity) will be charged if it fails and is subsequently retried.

See Pricing for additional details.

Getting Started

Calls across existing Namespaces can be enabled by creating a Nexus Endpoint in the Temporal Nexus Registry, creating a Nexus Service in a Worker in the handler Namespace, and then using the Nexus Service from a caller Workflow in a different Namespace.

Monolithic Namespaces can be decomposed into multiple Namespaces, by hiding service implementations behind a Nexus Endpoint in the monolithic Namespace, pointing all consumers at the new Nexus Endpoint, and then changing the Endpoint’s target Namespace to a different Namespace. Multiple Nexus Endpoints can target a single monolithic Namespace.

Cross-Namespace

Follow these steps to enable calls across existing Namespaces:

  1. Add Nexus Services to the same Workers as the Temporal primitives being abstracted.
  2. Add a Nexus Endpoint that:
    1. Targets the handler Namespace.
    2. Allows the caller Namespace.
  3. Make Nexus calls from a caller Workflow in a different Namespace.
    1. Use workflow.NewNexusClient(endpointName, serviceName).
    2. Execute a Nexus Operation with nexusClient.ExecuteOperation(...).

Decompose a Monolithic Namespace

Multiple Nexus Endpoints can target a single Namespace, and then each Endpoint can be updated, one at a time, to target separate Namespaces, for an incremental migration.

Once Nexus Endpoints are in place, targeting a new Namespace can be done with config changes and zero downtime. New Nexus requests will be routed to the new target Namespace, and existing Nexus requests will be completed in the previous Namespace.
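The routing behavior during a target change can be modeled with a simple lookup table in Go. The EndpointRegistry type and the Namespace names are hypothetical and exist only for this sketch; the point it illustrates is that the target is resolved when a request is routed, so requests routed before the change stay with the previous Namespace.

```go
package main

import "fmt"

// EndpointRegistry is a hypothetical model mapping Endpoint names to their
// current target Namespace.
type EndpointRegistry struct {
	targets map[string]string
}

func NewEndpointRegistry() *EndpointRegistry {
	return &EndpointRegistry{targets: make(map[string]string)}
}

// SetTarget creates the Endpoint or points an existing one at a new Namespace.
func (r *EndpointRegistry) SetTarget(endpoint, namespace string) {
	r.targets[endpoint] = namespace
}

// Route resolves the target Namespace at the moment a request is routed.
// The boolean is false when the Endpoint does not exist.
func (r *EndpointRegistry) Route(endpoint string) (string, bool) {
	ns, ok := r.targets[endpoint]
	return ns, ok
}

func main() {
	reg := NewEndpointRegistry()
	reg.SetTarget("payments-endpoint", "monolith-ns")

	// A request routed before the change is handled in the previous Namespace.
	before, _ := reg.Route("payments-endpoint")

	// Config-only change: point the Endpoint at the new Namespace.
	reg.SetTarget("payments-endpoint", "payments-ns")
	after, _ := reg.Route("payments-endpoint")

	fmt.Println("existing request handled in:", before)
	fmt.Println("new request routed to:", after)
}
```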

Follow these steps to decompose a large monolithic Namespace:

  1. Hide service implementations behind a Nexus Endpoint.
    1. Add Nexus Services to the same Workers as the Temporal primitives being abstracted.
    2. Add Nexus Endpoints to the Nexus Registry, with monolithic Namespace as the target.
  2. Service consumers use the Nexus Endpoint instead of the underlying implementation.
    1. This can be done incrementally, until there are no direct caller dependencies on the underlying service implementations, that is, on the underlying Temporal Workflow and Activity primitives.
  3. Move service implementations to a different Namespace.
    1. Create a new Namespace.
    2. Add a Worker deployment with the Nexus Service.
    3. Update the Nexus Endpoint target to the new Namespace.
    4. Configure the Endpoint allowlist to allow calls from the original monolithic Namespace.
    5. New Nexus Operations will be routed from callers in the monolithic Namespace to the new Namespace.
  4. Quiesce Nexus Operations on the previous target Namespace.
    1. Leave the previous Worker deployment running until all existing Nexus Operations in the previous Namespace have completed. This includes their underlying Workflows, if any.
    2. Previous Workers in the monolithic Namespace can have the service implementation removed, since it is now being served from the new Endpoint target Namespace.

Please note the following points:

  • For long-lived entity Workflows, new Nexus requests will be routed to the new target Namespace, while existing long-lived entity Workflows will still be running in the previous Namespace.
  • When using Signal-with-Start behind a Nexus Operation, a new entity-Workflow will start in the new target Namespace.
  • A cancel Nexus Operation request from a caller Workflow will be routed through the Nexus Endpoint to the new target Namespace. The new Namespace may not have the underlying Workflow.

In these scenarios, Temporal can provide guidance on different migration strategies.