Error Handling (onError)
onError defines what happens when a resource fails. Without it, any error stops the workflow immediately. With it, you can retry, substitute a fallback value, or log the error and continue.
Complete reference
# resources/example.yaml
onError:
  action: continue        # "continue" (use fallback), "retry", or "fail" (default)
  maxRetries: 3           # for action: retry -- total attempts after the first
  retryDelay: "1s"        # wait between retries
  fallback:               # for action: continue -- what get('resourceId') returns on failure
    status: "error"
    message: "Service unavailable"
  expr:                   # expressions that run when an error is caught
    - set('errorMessage', error.message)
    - set('errorLogged', true)
  when:                   # only apply onError if one of these is true
    - error.type == 'TIMEOUT'   # otherwise the error propagates normally
    - error.message contains 'connection refused'

| action | what happens |
|---|---|
| continue | downstream resources run; get('resourceId') returns the fallback |
| fail | workflow stops and returns the error (default when no onError block) |
| retry | resource is retried up to maxRetries times; fails after that |
Basic Usage
Continue with Fallback
Continue execution even if the resource fails, using a fallback value:
# resources/fetch-data.yaml
actionId: fetchData
httpClient:
  url: "https://api.example.com/data"
  method: GET
onError:
  action: continue
  fallback:
    data: []
    fromCache: false
    error: true

When the HTTP request fails, the resource returns the fallback value instead of stopping the workflow.
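How a downstream resource picks up that value can be sketched as follows. This is a minimal example, assuming a hypothetical `showData` resource; the `requires` and `apiResponse` fields are used the way they appear in the other examples on this page, and the `result` response key is an illustrative name.

```yaml
# resources/show-data.yaml (hypothetical downstream resource)
actionId: showData
requires:
  - fetchData
apiResponse:
  success: true
  response:
    # On success this is the HTTP result; on failure it is the fallback
    # object defined in fetchData's onError block.
    result: "{{ get('fetchData') }}"
```

Because the fallback carries `error: true`, a consumer can tell a degraded response apart from a real one without the workflow having stopped.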
Continue without Fallback
If no fallback is provided, the resource returns an error info object:
# resources/example.yaml
onError:
  action: continue

The output will be:

{
  "_error": {
    "message": "connection refused",
    "handled": true
  }
}

Retry with Backoff
Automatically retry failed operations:
# resources/unreliable-api.yaml
actionId: unreliableApi
httpClient:
  url: "https://flaky-api.example.com/data"
  method: GET
onError:
  action: retry
  maxRetries: 3
  retryDelay: "1s"

This will:
- Execute the resource
- On failure, wait 1 second
- Retry, up to 3 times after the first attempt
- If every retry fails, return an error
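The same retry policy can be narrowed with `when` so that only transient failures are retried. This is a sketch, assuming `when` applies to `action: retry` in the way the reference block describes it for onError in general, and assuming the wait between attempts stays fixed at `retryDelay`.

```yaml
# resources/unreliable-api.yaml
actionId: unreliableApi
httpClient:
  url: "https://flaky-api.example.com/data"
  method: GET
onError:
  action: retry
  maxRetries: 3        # up to 3 retries after the first attempt
  retryDelay: "1s"     # assumed fixed, so at most about 3s is spent waiting
  when:
    - error.type == 'TIMEOUT'   # other error types propagate without a retry
```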
Explicit Fail
Explicitly mark that errors should stop execution (useful for documentation):
# resources/example.yaml
onError:
  action: fail

This is the default behavior when no onError is configured.
Advanced Usage
Error-Specific Handling with when
Handle only specific types of errors:
# resources/example.yaml
onError:
  action: continue
  fallback:
    status: "timeout"
  when:
    - error.type == 'TIMEOUT'
    - error.message contains 'deadline exceeded'

If the error doesn't match any when condition, the error is NOT handled and propagates normally.
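Conditions can also test other fields of the error object. The sketch below assumes that numeric comparisons on `error.statusCode` are accepted in `when` (the `>=` operator does appear in the circuit breaker example on this page) and limits handling to server-side HTTP failures. The `upstream_error` status value is an illustrative name.

```yaml
# resources/example.yaml
onError:
  action: continue
  fallback:
    status: "upstream_error"
  when:
    - error.statusCode >= 500   # 4xx responses are not handled and propagate
```

Since `error.statusCode` is only set where an HTTP status applies, a condition like this is best kept to HTTP-backed resources.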
Execute Expressions on Error (expr)
Run expressions when an error occurs (useful for logging, metrics, etc.):
# resources/example.yaml
onError:
  action: continue
  expr:
    - set('lastError', error.message, 'session')
    - set('errorCount', get('errorCount', 'session') + 1, 'session')
    - set('errorTimestamp', info('timestamp'))
  fallback:
    error: true
    retryLater: true

The expressions have access to the error object:
- error.message - The error message string
- error.type - Error type/code (e.g., "TIMEOUT", "VALIDATION_ERROR")
- error.code - Error code (same as type)
- error.statusCode - HTTP status code (if applicable)
- error.details - Additional error details (if available)
Dynamic Fallback Values
Fallback values can include expressions:
# resources/example.yaml
onError:
  action: continue
  fallback:
    data: "{{ get('cachedData', 'session') }}"
    timestamp: "{{ info('timestamp') }}"
    error: true

Use Cases
Resilient API Calls
# resources/fetch-user-data.yaml
actionId: fetchUserData
httpClient:
  url: "https://api.example.com/users/{{ get('userId') }}"
  method: GET
  timeout: 5s
onError:
  action: retry
  maxRetries: 3
  retryDelay: "500ms"

Graceful Degradation
# resources/llm-enhancement.yaml
actionId: llmEnhancement
chat:
  prompt: "Enhance this text: {{ get('text') }}"
onError:
  action: continue
  fallback: "{{ get('text') }}"   # Return original text on failure

Circuit Breaker Pattern
# resources/external-service.yaml
actionId: externalService
# Check circuit breaker state first
validations:
  skip:
    - get('circuitOpen', 'session') == true
httpClient:
  url: "https://api.example.com/data"
  method: GET
onError:
  action: continue
  expr:
    # Increment failure count
    - set('failCount', default(get('failCount', 'session'), 0) + 1, 'session')
    # Open circuit after 5 failures
    - set('circuitOpen', get('failCount', 'session') >= 5, 'session')
  fallback:
    error: true
    circuitBreaker: "open"

Database Fallback
# resources/query-primary.yaml
actionId: queryPrimary
sql:
  connectionName: primary
  query: "SELECT * FROM users WHERE id = ?"
  params:
    - "{{ get('userId') }}"
onError:
  action: continue
  fallback: null
---
actionId: queryReplica
requires:
  - queryPrimary
# Only query replica if primary failed
validations:
  skip:
    - get('queryPrimary') != null
    - safe(get('queryPrimary'), '_error') == nil
sql:
  connectionName: replica
  query: "SELECT * FROM users WHERE id = ?"
  params:
    - "{{ get('userId') }}"

LLM with Model Fallback
# resources/primary-llm.yaml
actionId: primaryLLM
chat:
  prompt: "{{ get('q') }}"
onError:
  action: continue
  fallback: null
---
actionId: fallbackLLM
requires:
  - primaryLLM
validations:
  skip:
    - get('primaryLLM') != null
    - safe(get('primaryLLM'), '_error') == nil
chat:
  prompt: "{{ get('q') }}"
---
actionId: response
requires:
  - primaryLLM
  - fallbackLLM
apiResponse:
  success: true
  response:
    answer: "{{ default(get('primaryLLM'), get('fallbackLLM')) }}"

Error Object Reference
In onError.expr and onError.when expressions, the error object is available:
| Property | Type | Description |
|---|---|---|
| error.message | string | Human-readable error message |
| error.type | string | Error type code |
| error.code | string | Same as type |
| error.statusCode | number | HTTP status code (if applicable) |
| error.details | object | Additional error context |
Common Error Types
| Type | Description |
|---|---|
| execution_error | General execution failure |
| TIMEOUT | Request timed out |
| VALIDATION_ERROR | Input validation failed |
| NOT_FOUND | Resource not found |
| UNAUTHORIZED | Authentication required |
| RESOURCE_FAILED | Resource execution failed |
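These type codes are what `when` conditions compare against. As an illustration, assuming a lookup where a missing record is acceptable, the block below handles only NOT_FOUND and records the type for later inspection. The `lastErrorType` session key and the `found` fallback field are hypothetical names.

```yaml
# resources/example.yaml
onError:
  action: continue
  when:
    - error.type == 'NOT_FOUND'   # every other type still stops the workflow
  expr:
    - set('lastErrorType', error.type, 'session')
  fallback:
    found: false
```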
Best Practices
- Use retries for transient failures - Network issues, rate limits, temporary unavailability
- Use continue for non-critical operations - Enhancements, optional data, caching
- Use when conditions - Handle specific errors differently
- Log errors with expr - Store error info for debugging/monitoring
- Provide meaningful fallbacks - Return useful data even on failure
- Combine with validations.skip - Create fallback resource chains
See Also
- Validation - Input validation
- Expression Helpers - Helper functions
- Resources Overview - Resource types
