In a perfect world, every API call returns a 200 OK, networks never drop packets, and third-party services have 100% uptime. But as developers, we build for reality, not for perfection. The real world is messy. Transient network issues, temporary service outages, and rate limits are not edge cases; they are inevitable facts of building distributed systems.
Designing for the "happy path" is easy. The real challenge—and the mark of a truly robust system—is how gracefully your workflows handle the failures. Unhandled errors can lead to incomplete processes, corrupted data, and frantic midnight pages.
This is why Actions.do was built with resilience at its core. It’s not enough to simply execute a task; you must be able to execute it reliably. Let's explore how to use the built-in reliability features of Actions.do to build workflow automation that can withstand the chaos of the real world.
Imagine a simple workflow: a new user signs up, and you need to enrich their profile with data from a third-party CRM. The Action might look something like this:
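// A simple, "happy path" action
// crmApi and database stand in for your own application clients.
import { Action } from 'actions.do';

const enrichUserData = new Action({
  id: 'enrich-user-from-crm',
  handler: async ({ userId }) => {
    // Any error thrown by either call fails the whole Action.
    const crmData = await crmApi.fetchUser(userId);
    await database.updateUser(userId, { crmData });
    return { success: true };
  }
});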
This works perfectly... until it doesn't. What happens if the CRM API is briefly unavailable, a network request is dropped, or you hit a rate limit?
In all these cases, the workflow halts, the user's profile remains unenriched, and you might need manual intervention to fix it. This approach doesn't scale.
The Actions.do philosophy is that resilience shouldn't be an afterthought—it should be a configurable property of each fundamental building block. By defining reliability rules at the individual Action level, your entire workflow inherits that strength.
Here are the core features you can leverage to transform fragile tasks into resilient, self-healing operations.
The first line of defense against transient failures is to simply try again. But retrying immediately and indiscriminately can make a bad situation worse: many clients hammering an already struggling service at the same moment is a form of the "thundering herd" problem. Smarter retries wait between attempts, and wait longer after each failure.
Actions.do allows you to define a sophisticated retry strategy directly in your Action's configuration.
import { Action } from 'actions.do';

const enrichUserData = new Action({
  id: 'enrich-user-from-crm-v2',
  retry: {
    attempts: 5,             // Try up to 5 times
    strategy: 'exponential', // Use exponential backoff
    initialDelay: '2s',      // Start with a 2-second delay
    maxDelay: '60s'          // Cap delays at 60 seconds
  },
  handler: async ({ userId }) => {
    // ... same handler logic
  }
});
With this configuration, a temporary API glitch is no longer a workflow-killing event. It's a minor hiccup that the Action handles automatically.
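To make the backoff arithmetic concrete, here is a minimal sketch of how such a schedule could be computed. It is not the Actions.do implementation: `backoffDelays` and `withFullJitter` are hypothetical helpers, and the sketch assumes that the delay doubles after each failed attempt and that `attempts` counts the first try, so five attempts produce at most four waits.

```typescript
// Hypothetical helper, not part of Actions.do.
// Assumes the delay doubles after every failed attempt and is capped at maxDelayMs.
function backoffDelays(
  attempts: number,
  initialDelayMs: number,
  maxDelayMs: number
): number[] {
  const delays: number[] = [];
  // N attempts means at most N - 1 waits between them.
  for (let retry = 0; retry < attempts - 1; retry++) {
    const uncapped = initialDelayMs * Math.pow(2, retry);
    delays.push(Math.min(uncapped, maxDelayMs));
  }
  return delays;
}

// Optional "full jitter": wait a random time between 0 and the computed delay,
// so that many clients failing together do not all retry at the same instant.
function withFullJitter(
  delayMs: number,
  random: () => number = Math.random
): number {
  return Math.floor(random() * delayMs);
}

// Mirrors the retry settings above: 5 attempts, 2s initial delay, 60s cap.
const schedule = backoffDelays(5, 2000, 60000);
console.log(schedule); // [ 2000, 4000, 8000, 16000 ]
```

Under these assumptions the 60-second cap only comes into play from the sixth wait onward, since the uncapped sixth delay would be 64 seconds.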
Some operations don't fail loudly; they just hang, stuck waiting for a response that will never come. A single stuck Action can stall an entire workflow, consuming resources and delaying critical processes. Setting a timeout on the Action puts an upper bound on how long it is allowed to run.
import { Action } from 'actions.do';

const enrichUserData = new Action({
  id: 'enrich-user-from-crm-v3',
  retry: { /* ... */ },
  timeout: '45s', // Fail the action if it runs longer than 45 seconds
  handler: async ({ userId }) => {
    // ... same handler logic
  }
});
This timeout ensures that one bad call doesn't bring everything to a grinding halt.
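For intuition, a timeout like this can be thought of as a race between the work and a timer. The sketch below is an illustration only, not how Actions.do implements it; `withTimeout` and `TimeoutError` are hypothetical names introduced here.

```typescript
// Hypothetical helper, not the Actions.do implementation.
class TimeoutError extends Error {
  constructor(ms: number) {
    super(`Operation timed out after ${ms} ms`);
    this.name = "TimeoutError";
  }
}

// Rejects with TimeoutError if `work` has not settled within `ms` milliseconds.
// Note: this only stops waiting; it does not cancel the underlying work.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new TimeoutError(ms)), ms);
    work.then(
      (value) => {
        clearTimeout(timer); // Work finished first, so drop the timer.
        resolve(value);
      },
      (error) => {
        clearTimeout(timer); // Work failed first, so surface its own error.
        reject(error);
      }
    );
  });
}

// Usage: a call that never answers is turned into an explicit failure.
const neverSettles = new Promise<string>(() => {});
withTimeout(neverSettles, 50).catch((err: Error) => {
  console.error(err.message);
});
```

Because a timed-out call becomes an ordinary failure, it can then flow through the same retry rules as any other error.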
What happens after the final retry attempt fails? You still need a plan. Instead of letting the workflow crash, you can define a graceful fallback.
Actions.do enables you to invoke another Action as a failure handler.
import { Action } from 'actions.do';

// Define the fallback action first
const handleCrmFailure = new Action({
  id: 'notify-ops-on-crm-failure',
  handler: async ({ error, payload }) => {
    const message = `Failed to enrich user ${payload.userId}. Error: ${error.message}`;
    // This could be another Action that calls Slack, PagerDuty, etc.
    console.error(message);
  }
});

// Now, define the main action with the fallback
const enrichUserData = new Action({
  id: 'enrich-user-from-crm-final',
  retry: { attempts: 5, strategy: 'exponential' },
  timeout: '45s',
  onError: {
    // On final failure, execute this other Action
    actionId: 'notify-ops-on-crm-failure'
  },
  handler: async ({ userId }) => {
    const crmData = await crmApi.fetchUser(userId);
    await database.updateUser(userId, { crmData });
    return { success: true };
  }
});
Now, if the enrich-user-from-crm-final Action still fails after all its retries, it won't fail silently. It will trigger the notify-ops-on-crm-failure Action, creating an auditable and observable failure that your team can act on.
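The control flow behind "retry, then fall back" can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the Actions.do runner: `runWithFallback` is a hypothetical function, it omits the waits between attempts, and it assumes the failure handler receives the last error together with the original payload, as the fallback handler above does.

```typescript
type Handler<P, R> = (payload: P) => Promise<R>;
type FailureHandler<P> = (ctx: { error: Error; payload: P }) => Promise<void>;

// Hypothetical runner: tries the handler up to `attempts` times and
// invokes `onError` only after the final attempt has failed.
async function runWithFallback<P, R>(
  payload: P,
  handler: Handler<P, R>,
  attempts: number,
  onError: FailureHandler<P>
): Promise<R | undefined> {
  let lastError: Error = new Error("handler was never attempted");
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await handler(payload); // Success ends the loop immediately.
    } catch (err) {
      lastError = err instanceof Error ? err : new Error(String(err));
      // A real runner would wait here according to its backoff strategy.
    }
  }
  // Every attempt failed: hand the last error and the payload to the fallback.
  await onError({ error: lastError, payload });
  return undefined;
}

// Usage: a handler that always fails ends up in the fallback after 3 attempts.
runWithFallback(
  { userId: "user-123" },
  async () => {
    throw new Error("CRM unavailable");
  },
  3,
  async ({ error, payload }) => {
    console.error(`Failed to enrich user ${payload.userId}. Error: ${error.message}`);
  }
);
```

The fallback runs once per exhausted Action rather than once per failed attempt, which keeps notifications proportional to real incidents instead of to transient blips.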
By combining these features, you transform a simple task into a robust, stand-alone building block for any automated workflow. You've defined not just what the Action should do on success, but exactly how it should behave in the face of adversity.
This is the power of "Actions as Code." It moves reliability from an application-level concern to a fundamental property of your business logic itself. When you build your workflows with Actions.do, you're not just automating tasks—you're engineering reliability, scalability, and peace of mind into every step of the process.
Ready to build workflows that don't break? Get started with Actions.do and build your first resilient Action today.