Broken Automations: A Postmortem Pattern I See Repeatedly
Most broken automations do not fail loudly.
They fail quietly, repeatedly, and expensively.
By the time someone asks for help, the automation has usually:
- Been running for weeks or months
- Been worked around rather than fixed
- Become “too scary to touch”
- Quietly cost more time than it saved
This post is not a tutorial.
It is a postmortem pattern.
These are the same failure modes I see again and again when diagnosing broken Google Workspace, Zapier, and hybrid automations.
No blame. Just analysis.
What a Broken Automation Usually Looks Like
Rarely does someone arrive saying: “This automation is broken.”
They usually say:
- “It mostly works, but…”
- “We are getting duplicates sometimes”
- “We turned it off because it was doing weird things”
- “We do not know why it ran again”
The automation is still running.
It is just no longer trusted.
That is the real failure.
Postmortem Pattern 1: It Worked Once, So We Let It Run Forever
What Failed
The automation was designed to succeed on a single run, not to operate safely over time.
Common Symptoms
- Reprocessing the same emails or rows
- Daily triggers with no concept of completion
- Scripts that treat anything that exists as “new”
Root Cause
No explicit definition of:
- What counts as new
- What counts as done
- What should be ignored
Time-based triggers without state are the most common culprit.
How I Fix It
I add memory to the system.
- Explicit processed markers
- Durable state tracking
- Idempotent logic
Running it twice becomes harmless.
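Here is a minimal sketch of what that memory can look like, written in Python for brevity rather than Apps Script. The JSON state file, the item shape with an "id" key, and the handle callback are all illustrative assumptions, not details from any particular system.

```python
import json
from pathlib import Path


def load_processed(state_file: Path) -> set[str]:
    """Return the IDs already handled, or an empty set on the first run."""
    if state_file.exists():
        return set(json.loads(state_file.read_text()))
    return set()


def run_once(items: list[dict], state_file: Path, handle) -> list[str]:
    """Process only items that have no processed marker yet.

    `handle` is a hypothetical callback that does the real work for one item.
    Returns the IDs handled during this run.
    """
    processed = load_processed(state_file)
    handled_now = []
    for item in items:
        if item["id"] in processed:
            continue  # already done: skipping is what makes a re-run harmless
        handle(item)
        processed.add(item["id"])
        # Persist after every item so a crash mid-run keeps earlier markers.
        state_file.write_text(json.dumps(sorted(processed)))
        handled_now.append(item["id"])
    return handled_now
```

Calling run_once twice with the same items does the work once. The marker is written after the work, so a crash between the two can still cause one item to be handled again; if that matters, the handler itself needs to be safe to repeat.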
Postmortem Pattern 2: The Trigger and the Action Touched the Same Thing
What Failed
The automation modified the same object that triggered it.
Common Symptoms
- Google Sheets scripts that keep re-running
- Gmail labels reappearing
- “Why did this fire again?” confusion
Root Cause
A feedback loop that was never made visible.
The system is not broken. It is doing exactly what it was told to do.
How I Fix It
- Separate detection from processing
- Ensure triggers observe state rather than create it
- Introduce guard conditions that stop re-entry
This is usually a design correction, not a rewrite.
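The guard conditions can be sketched roughly like this. It is a Python simulation, not real trigger code: the sheet is a plain dictionary, the "status" column is a hypothetical column that only the automation writes, and the recursive call stands in for a platform that re-fires the trigger when the handler writes back.

```python
STATUS_COLUMN = "status"  # hypothetical column that only the automation writes


def on_edit(sheet: dict, row_id: str, column: str, log: list) -> None:
    """Handle one edit event on a row of the simulated sheet."""
    # Guard 1: ignore edits the automation made itself.
    if column == STATUS_COLUMN:
        return
    row = sheet[row_id]
    # Guard 2: ignore rows that are already marked as processed.
    if row.get(STATUS_COLUMN) == "done":
        return
    log.append(row_id)  # stands in for the real processing
    row[STATUS_COLUMN] = "done"
    # Simulate the trigger firing again because of our own write.
    on_edit(sheet, row_id, STATUS_COLUMN, log)
```

Without the first guard, the handler's own write would send it straight back into processing. The second guard is a design choice in this sketch: it also ignores later human edits to a finished row, which may or may not be what a given workflow wants.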
Postmortem Pattern 3: Abstraction Hid the Risk
What Failed
The automation was built in a tool that made complexity easy to assemble but hard to reason about.
Common Symptoms
- Zapier workflows nobody fully understands
- Branches added “just one more time”
- Fear of touching anything in case it breaks
Root Cause
Critical concepts were hidden:
- State
- Deduplication
- Retry behaviour
- Failure handling
The automation grew faster than the understanding of it.
How I Fix It
I flatten the logic.
- Make state explicit
- Reduce hidden behaviour
- Move critical paths into Apps Script when needed
The goal is not fewer tools. It is clearer systems.
Postmortem Pattern 4: No One Designed the Failure Path
What Failed
The automation assumed success.
Common Symptoms
- Silent failures
- Partial data
- “It skipped some things but we do not know which”
Root Cause
No answer to: “What should happen if this step fails?”
Retries without safeguards often make things worse: a step that partly succeeded gets repeated, and the duplicates pile up.
How I Fix It
I design failure intentionally.
- Explicit error states
- Logged failures
- Safe retries
- Clear alerts only when needed
Failures become visible and contained.
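A designed failure path can be as small as the sketch below, again in Python for brevity. The statuses dictionary, the step callback, and the "done" and "failed" labels are illustrative assumptions. In a real system the statuses would live somewhere durable.

```python
import logging

logger = logging.getLogger("automation")


def run_step(item_id: str, step, statuses: dict, max_attempts: int = 3) -> str:
    """Run one step with bounded retries and record an explicit outcome.

    `step` is a hypothetical callable that raises an exception on failure.
    """
    if statuses.get(item_id) == "done":
        return "done"  # safe retry: never redo work that already succeeded
    for attempt in range(1, max_attempts + 1):
        try:
            step(item_id)
        except Exception as exc:
            # Logged failure: every failed attempt leaves a trace.
            logger.warning("item %s attempt %d failed: %s", item_id, attempt, exc)
        else:
            statuses[item_id] = "done"
            return "done"
    statuses[item_id] = "failed"  # explicit error state instead of silence
    logger.error("item %s gave up after %d attempts", item_id, max_attempts)
    return "failed"
```

The retry loop is bounded, and the "failed" status answers the question of which items were skipped. Retrying is only safe here if the step itself can be repeated without side effects; a step that partly succeeds before raising needs its own guard.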
Postmortem Pattern 5: Ownership Was Never Clear
What Failed
No one knew who was responsible for the automation long term.
Common Symptoms
- “The person who built it left”
- “We are afraid to change it”
- “We do not know how it works anymore”
Root Cause
The automation was treated as a task, not a system.
How I Fix It
I make the system legible.
- Clear structure
- Readable logic
- Documented assumptions
- Predictable behaviour
If no one can understand it, it is already broken.
What All Broken Automations Have in Common
They were built to make something happen.
They were not built to keep behaving.
That distinction matters.
How I Approach Automation Postmortems
When I am asked to fix a broken automation, I do not start with code.
I start with questions:
- What triggers what?
- What state exists?
- What happens on a second run?
- What happens on failure?
Only then do I change anything.
Most fixes are small.
Most damage comes from not seeing the system clearly.
If You Are Dealing With One Right Now
You may be here because:
- An automation keeps looping
- You are seeing duplicates
- You turned something off to stop the damage
- You do not trust a system you rely on
This is not unusual.
It usually means the automation outgrew its original design.
That is fixable.
How I Can Help
I diagnose and rebuild broken Google Workspace automations.
That includes:
- Postmortems of existing systems
- Fixing loops and duplication
- Rebuilding fragile workflows safely
- Replacing “scary” automations with boring ones
Details are here:
https://empowerautomation.co.uk/services
Real breakdowns live here:
https://empowerautomation.co.uk/system-logs
Broken automations rarely fail because of bad intent.
They fail because nobody was asked to think about what happens next.
This post documents real automation failure modes. Similar failures often appear under labels such as automation loops, duplicate triggers, silent retries, and state loss.