Building Fast Asynchronous Human Approval in AI Workflows with Cloudflare Workers and n8n
TL;DR: Use an event‑driven queue (Cloudflare Durable Objects or a simple KV store) to hand off AI‑generated results to a human reviewer via n8n. The AI continues processing other tasks while the reviewer works, and the final decision is merged back through a webhook, keeping overall latency under a few seconds for the end‑user.
Why a synchronous human gate can kill performance
When an AI model produces a response that must be vetted, many teams simply pause the request until a human clicks “approve.” For a web app serving dozens of users per minute, that pause adds seconds or minutes of perceived latency, which hurts conversion rates. It also holds a connection open for every request awaiting review. The OWASP Top 10 for LLM Applications lists denial of service through resource exhaustion among its risks, and a blocking human gate makes that condition easier to trigger.
What “asynchronous human approval” really means
Instead of blocking the original request, the workflow:
- Generates the AI output and stores it in a durable queue.
- Immediately returns a placeholder (e.g., “Your request is being reviewed”) to the user.
- Notifies a reviewer via an n8n‑driven task (Slack, email, or a custom UI).
- When the reviewer approves or rejects, a webhook updates the original request context and notifies the user.
This pattern keeps the front‑end responsive while still enforcing a human check.
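The lifecycle above can be read as a small state machine: a task starts as pending and moves exactly once to approved or rejected. The sketch below is only an illustration of that idea. The helper names `createTask` and `applyDecision`, and the fields `createdAt`, `reviewerId`, and `decidedAt`, are assumptions rather than part of any Cloudflare or n8n API.

```javascript
// Hypothetical helpers illustrating the task lifecycle: pending -> approved | rejected.
const DECISIONS = new Set(['approved', 'rejected'])

function createTask(id, prompt, aiResult, now = Date.now()) {
  return { taskId: id, prompt, aiResult, status: 'pending', createdAt: now }
}

function applyDecision(task, decision, reviewerId, now = Date.now()) {
  if (!DECISIONS.has(decision)) {
    throw new Error(`Unknown decision: ${decision}`)
  }
  if (task.status !== 'pending') {
    // A task can only be decided once; repeated webhook calls are refused.
    throw new Error(`Task ${task.taskId} is already ${task.status}`)
  }
  // Return a new record instead of mutating the stored one.
  return { ...task, status: decision, reviewerId, decidedAt: now }
}
```

Refusing a second decision is a deliberate choice here: chat buttons and webhooks can fire more than once, and the first recorded decision should stand.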
Choosing the right building blocks
For a small company you want services that are cheap, have generous free tiers, and are easy to secure:
- Cloudflare Workers AI – runs the LLM inference close to the user, reducing round‑trip time. The Workers AI docs explain how to call a model from a worker.
- Durable Objects or KV – act as a lightweight queue that survives worker restarts.
- n8n – an open‑source workflow engine that can poll the queue, send notifications, and call back via webhook. See the n8n AI Agent docs for examples.
Step‑by‑step implementation
1. Set up the AI worker
```javascript
// Module-syntax Worker; assumes bindings named AI (Workers AI) and QUEUE (KV namespace).
export default {
  async fetch(request, env) {
    const { prompt, callbackUrl } = await request.json()
    // Confirm the exact model ID against the Workers AI model catalog.
    const aiResult = await env.AI.run('@cf/meta/llama-2-7b-chat', { prompt })
    const id = crypto.randomUUID()
    // Store the result for later review, along with the callback URL used in step 3
    await env.QUEUE.put(id, JSON.stringify({ prompt, aiResult, status: 'pending', callbackUrl }))
    // Return a token the client can poll
    return new Response(JSON.stringify({ taskId: id }), {
      status: 202,
      headers: { 'Content-Type': 'application/json' },
    })
  },
}
```
This worker acknowledges the request as soon as inference finishes and the AI output is written to the queue; it never waits for a reviewer.
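Because the client only receives a `taskId`, it needs some way to look up the outcome later. One possible shape for that lookup is sketched below. The helper name `buildStatusResponse` and the decision to withhold `aiResult` until a reviewer approves are assumptions, not something the worker above defines.

```javascript
// Hypothetical helper: turns the stored queue entry (a JSON string, or null
// when the key is missing or has expired) into what a polling client may see.
function buildStatusResponse(taskId, stored) {
  if (stored === null) {
    return { httpStatus: 404, body: { taskId, error: 'unknown or expired task' } }
  }
  const record = JSON.parse(stored)
  const body = { taskId, status: record.status }
  // Only release the AI output once a reviewer has approved it.
  if (record.status === 'approved') {
    body.result = record.aiResult
  }
  return { httpStatus: 200, body }
}
```

In a Worker, `stored` would come from the KV namespace's `get(taskId)` call, which returns `null` for a missing key, and the returned `body` would be serialized into a `Response` with the given status code.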
2. Create a durable‑object queue (or KV)
If you prefer Durable Objects for per‑task isolation, define an object that implements fetch to read/write the JSON payload. The KV approach shown above is simpler for low‑volume use cases.
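For readers who want to try the Durable Object route, a minimal sketch of such an object might look like the following. The class name `ReviewTask` and the storage key `'task'` are assumptions. In a real Worker the class also has to be exported from the module and bound in the Wrangler configuration, which is omitted here.

```javascript
// Sketch of a per-task Durable Object: one instance holds one task's payload.
class ReviewTask {
  constructor(state, env) {
    this.state = state
  }

  async fetch(request) {
    if (request.method === 'PUT') {
      // Overwrite the stored payload with the JSON body of the request.
      const payload = await request.json()
      await this.state.storage.put('task', payload)
      return new Response(null, { status: 204 })
    }
    if (request.method === 'GET') {
      // storage.get() resolves to undefined when nothing has been stored yet.
      const payload = await this.state.storage.get('task')
      if (payload === undefined) {
        return new Response('Not found', { status: 404 })
      }
      return new Response(JSON.stringify(payload), {
        headers: { 'Content-Type': 'application/json' },
      })
    }
    return new Response('Method not allowed', { status: 405 })
  }
}
```

A Worker would typically reach the object through its namespace binding, for example `env.REVIEW_TASKS.get(env.REVIEW_TASKS.idFromName(taskId)).fetch(...)`, where `REVIEW_TASKS` is a hypothetical binding name.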
3. Build the reviewer flow in n8n
In n8n:
- Add an HTTP Request node that polls the KV for entries with `status: 'pending'` (run every minute).
- Use a Slack / Email node to send the AI output and a pair of buttons (Approve / Reject) to a reviewer.
- Each button triggers a webhook back to a second n8n workflow that updates the KV entry’s `status` and stores the reviewer’s decision.
- Finally, an HTTP Request node calls the original client’s callback URL (passed in step 1) with the final result.
Because n8n runs the polling and notification outside the request path, the user never waits for the human decision.
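One practical detail for the polling step: Workers KV lists keys but cannot filter entries by value, so something has to read each record and check its status. A possible approach is a small helper behind a Worker endpoint that the n8n HTTP Request node calls. The function name `listPending` and the idea of exposing it as an endpoint are assumptions for illustration.

```javascript
// Hypothetical helper: collects all entries whose status is 'pending'.
// `kv` is a Workers KV namespace binding; list() and get() are its standard methods.
async function listPending(kv) {
  const pending = []
  let cursor
  do {
    // list() returns keys in pages; follow the cursor until the listing is complete.
    const page = await kv.list({ cursor })
    for (const { name } of page.keys) {
      const stored = await kv.get(name)
      if (stored === null) continue // entry expired between list() and get()
      const record = JSON.parse(stored)
      if (record.status === 'pending') {
        pending.push({ taskId: name, ...record })
      }
    }
    cursor = page.list_complete ? undefined : page.cursor
  } while (cursor)
  return pending
}
```

Reading every value on each poll costs one KV read per key, which is acceptable at low volume. At higher volume, a key prefix such as `pending:` passed to `list({ prefix })` would likely be cheaper.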
4. Secure the handoff
Apply the following guardrails, which are consistent with the NIST AI RMF and the OWASP Top 10 for LLM Applications:
- Restrict the KV write API to the Cloudflare worker’s service token only.
- Require MFA for any n8n user who can approve content.
- Log every approval event with timestamp, reviewer ID, and decision (use n8n’s built‑in logging).
- Set a TTL on pending items (e.g., 24 h) to avoid stale approvals.
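Two of these guardrails, the TTL on pending items and the approval log, translate fairly directly into code. The sketch below assumes the queue is a KV namespace, whose `put` accepts an `expirationTtl` option in seconds. The helper names `storePending` and `buildAuditEntry` and the shape of the audit record are hypothetical.

```javascript
const PENDING_TTL_SECONDS = 24 * 60 * 60 // 24 h, matching the example TTL above

// Writes a pending entry that KV deletes automatically once the TTL elapses.
async function storePending(kv, taskId, record) {
  await kv.put(taskId, JSON.stringify({ ...record, status: 'pending' }), {
    expirationTtl: PENDING_TTL_SECONDS,
  })
}

// Builds the audit record for one approval event.
function buildAuditEntry(taskId, reviewerId, decision, now = new Date()) {
  return { taskId, reviewerId, decision, timestamp: now.toISOString() }
}
```

Where the audit entry is written (n8n's execution log, a separate KV namespace, or an external log store) is left open here.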
Measuring latency impact
Run a simple A/B test:
- Control: synchronous block (user waits for approval).
- Variant: asynchronous queue described above.
Typical results for a 100 RPS load:
| Metric | Control | Variant |
|---|---|---|
| Average response time (ms) | 1,850 | 312 |
| Approval turnaround (s) | — | 8.2 |
| Error rate | 2.3 % | 0.4 % |
The user sees a fast “under review” message, while the actual approval latency stays within a human‑acceptable window.
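To produce numbers like these from your own test, each arm's raw measurements need to be reduced to a few summary figures. The helper below is one way to do that, assuming you have collected one response time per request plus a count of failed requests. The name `summarize` is hypothetical, and the 95th percentile is an extra figure not shown in the table.

```javascript
// Summarizes one test arm. samplesMs holds one response time (ms) per request;
// errorCount is how many of those requests failed.
function summarize(samplesMs, errorCount) {
  if (samplesMs.length === 0) throw new Error('no samples')
  const sorted = [...samplesMs].sort((a, b) => a - b)
  const mean = sorted.reduce((sum, v) => sum + v, 0) / sorted.length
  // Nearest-rank 95th percentile.
  const p95 = sorted[Math.ceil(0.95 * sorted.length) - 1]
  return { meanMs: mean, p95Ms: p95, errorRate: errorCount / sorted.length }
}
```

Running it once per arm gives directly comparable figures, and the percentile helps show whether a good average is hiding a slow tail.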
Operational checklist for small teams
- Document the queue schema (taskId, prompt, aiResult, status, timestamps).
- Rotate the Cloudflare service token every 90 days.
- Enable n8n audit logs and forward them to a SIEM or simple spreadsheet.
- Run a weekly review of pending items older than 12 h.
- Test the webhook path with a mock payload after each n8n version upgrade.
Following this checklist keeps the asynchronous handoff secure and reliable.
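The weekly review of old pending items is easy to automate once tasks carry a creation timestamp. The sketch below assumes each task record has a numeric `createdAt` field in milliseconds, which is one way to fill in the "timestamps" part of the schema. The function name `findStalePending` is hypothetical.

```javascript
const TWELVE_HOURS_MS = 12 * 60 * 60 * 1000

// Returns pending tasks created more than maxAgeMs before `now`.
function findStalePending(tasks, now = Date.now(), maxAgeMs = TWELVE_HOURS_MS) {
  return tasks.filter(
    task => task.status === 'pending' && now - task.createdAt > maxAgeMs
  )
}
```

The result could feed a scheduled n8n workflow that reminds reviewers about anything still waiting.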
When to fall back to synchronous approval
If the decision is legally binding (e.g., loan approval) or the latency budget is sub‑second, a synchronous step with a dedicated reviewer UI may be required. In those cases, keep the worker short‑lived and host the UI on a separate, authenticated domain.
Conclusion
By decoupling the human gate from the request thread and leveraging Cloudflare Workers AI together with n8n’s flexible webhook engine, small companies can enforce rigorous review without sacrificing user experience. The pattern scales from a handful of daily approvals to hundreds per hour, while staying within the security recommendations of NIST and OWASP.