【Control】🔁 16. (Safety Design) What Is Recovery Control?
Why AI Control Is Defined by Post-Failure Design
topics: ["control engineering", "AI", "safety design", "recovery", "FSM"]
⚠️ Introduction: AI Control Must Be Designed for Failure
Discussions about AI-based control often include claims like:
“If accuracy improves, it will be fine.”
“If learning continues, it will get smarter.”
From a control engineer’s perspective, reality is different.
AI will fail.
The real question is not how to avoid failure, but:
How do we safely return after failure?
This article introduces Recovery Control,
the final pillar of the AI Control Safety Package.
🛠️ What Is Recovery Control?
In one sentence, Recovery Control is:
“A design framework that guarantees a safe return after abnormal events.”
Its objectives are clear:
- Return safely
- Never jump back abruptly
- Prevent secondary damage
These are design guarantees, not runtime guesses.
🚨 Why Recovery Control Is Necessary
Even with a Safety Envelope,
violations will still occur. Typical causes:
- Sensor failures
- Environmental changes
- Model degradation
- AI output collapse
The danger lies in what happens next:
- Permanent stop
- Immediate return to normal operation
- Restart without understanding the cause
All of these are unsafe.
🧭 Core Principles of Recovery Control
Recovery Control is based on three principles.
🟦 ① Fall Back to Safety
- Limit outputs
- Reduce gains
- Slow down operation
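The three fallback actions above can be sketched as one function. This is a minimal illustration, not a reference implementation: `SafeLimits`, `fallback_output`, and all numeric values are hypothetical, and real limits would come from the plant's own safety analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SafeLimits:
    """Hypothetical fallback limits (normalized units, illustrative values)."""
    max_output: float = 0.2   # absolute actuator limit while in fallback
    gain_scale: float = 0.5   # factor applied to the nominal proportional gain
    max_rate: float = 0.05    # largest allowed output change per control step


def fallback_output(error: float, nominal_kp: float, previous_output: float,
                    limits: SafeLimits = SafeLimits()) -> float:
    """Compute a conservative proportional command for fallback operation."""
    # Reduce gains: scale the nominal gain down.
    raw = nominal_kp * limits.gain_scale * error
    # Slow down operation: bound how far the output may move in one step.
    step = max(-limits.max_rate, min(limits.max_rate, raw - previous_output))
    slowed = previous_output + step
    # Limit outputs: clamp to the fallback actuator range.
    return max(-limits.max_output, min(limits.max_output, slowed))
```

Even with a large error, the command can move by at most `max_rate` per step and never exceeds `max_output`, which keeps the behavior easy for a human to predict.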
🟧 ② Return Gradually
- Never jump directly to Normal
- Always pass through intermediate modes
🟨 ③ Never Return with Unresolved Causes
- “It seems fixed” is unacceptable
- Decisions are made by the FSM (finite state machine)
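One way to make this principle checkable is a gate that the FSM consults before allowing a return. The sketch below is an assumption about how such a gate could look; `ReturnGate` and its method names are hypothetical and do not come from the AI Control Safety Package.

```python
from dataclasses import dataclass, field


@dataclass
class ReturnGate:
    """Tracks named causes that must each be explicitly resolved before return."""
    open_causes: set[str] = field(default_factory=set)

    def report(self, cause: str) -> None:
        """Register a cause found during diagnosis."""
        self.open_causes.add(cause)

    def resolve(self, cause: str, evidence: str) -> None:
        """Close a cause, but only when some evidence is recorded with it."""
        # Empty evidence is rejected so "it seems fixed" cannot close a cause.
        if not evidence.strip():
            raise ValueError("a cause cannot be resolved without evidence")
        self.open_causes.discard(cause)

    def may_return(self) -> bool:
        """True only when no reported cause remains open."""
        return not self.open_causes
```

The FSM would treat `may_return()` as a necessary condition for leaving the diagnostic stage, so the decision rests on recorded evidence rather than on an impression.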
🧩 Core Components of Recovery Control
🟥 ① Safe Mode
The system always enters Safe Mode first.
- Minimal outputs
- Simple behavior
- Fully predictable for humans
AI does not intervene here.
🟪 ② Diagnostic Mode
Next, the system organizes the situation.
- Sensor failure?
- Model mismatch?
- External disturbance?
- AI decision collapse?
This is where an LLM may assist.
🟫 ③ Re-Initialization
If required:
- Reset PID gains
- Reset estimators
- Reinitialize FSM states
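As an illustration of the first item, the sketch below shows a textbook discrete PID whose gains and internal memory can be reset together. The `PID` class and its `reinitialize` method are hypothetical names; estimator and FSM resets would follow the same pattern but are not shown.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Textbook discrete PID controller with an explicit reset method."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def update(self, error: float, dt: float) -> float:
        """Advance the controller by one step of length dt and return the command."""
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

    def reinitialize(self, kp: float, ki: float, kd: float) -> None:
        """Load known-good gains and clear memory accumulated before the failure."""
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0    # stale integral would otherwise push the output
        self.prev_error = 0.0  # avoids a derivative kick from the old error
```

Clearing the integral and the previous error matters as much as restoring the gains: both carry state from the abnormal period into the recovery.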
🟩 ④ Gradual Return
Finally, the system returns step by step.
- Safe → Limited → Normal
- Any anomaly immediately triggers rollback
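The staged return can be sketched as a pure function over the current stage. The dwell requirement (`required_steps`) and the choice to roll back all the way to Safe are assumptions made for this example; the article only states that any anomaly triggers rollback.

```python
STAGES = ["Safe", "Limited", "Normal"]


def next_stage(current: str, anomaly: bool, stable_steps: int,
               required_steps: int = 100) -> str:
    """Return the stage to use on the next control step."""
    # Any anomaly triggers rollback (assumed here to mean back to Safe).
    if anomaly:
        return "Safe"
    # Stay in place until the stage has been stable for long enough.
    if stable_steps < required_steps:
        return current
    # Advance by exactly one stage; Normal is the last one.
    index = STAGES.index(current)
    return STAGES[min(index + 1, len(STAGES) - 1)]
```

Because the function can only advance one stage at a time, a direct jump from Safe to Normal is impossible by construction.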
🧠 FSM-Centered Recovery Design (Critical)
Recovery Control is FSM-driven.
Typical State Transitions
Normal → Warning → Safe → Diagnostic → Limited → Normal
The order and conditions
are fully defined by human designers.
AI must never decide when it is “safe to return.”
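A human-defined transition table makes this concrete. In the sketch below, the forward chain follows the typical sequence above; the rollback edges from Diagnostic and Limited back to Safe are an assumption added for illustration, and `Mode`, `ALLOWED`, and `RecoveryFSM` are hypothetical names.

```python
from enum import Enum


class Mode(Enum):
    NORMAL = "Normal"
    WARNING = "Warning"
    SAFE = "Safe"
    DIAGNOSTIC = "Diagnostic"
    LIMITED = "Limited"


# Fixed at design time by humans; nothing at runtime may modify this table.
ALLOWED = {
    Mode.NORMAL: {Mode.WARNING},
    Mode.WARNING: {Mode.SAFE},
    Mode.SAFE: {Mode.DIAGNOSTIC},
    Mode.DIAGNOSTIC: {Mode.LIMITED, Mode.SAFE},  # Safe edge: assumed rollback
    Mode.LIMITED: {Mode.NORMAL, Mode.SAFE},      # Safe edge: assumed rollback
}


class RecoveryFSM:
    """Accepts a transition request only if the design-time table allows it."""

    def __init__(self) -> None:
        self.mode = Mode.NORMAL

    def request(self, target: Mode) -> bool:
        """Apply the transition if allowed; otherwise keep the current mode."""
        if target in ALLOWED[self.mode]:
            self.mode = target
            return True
        return False
```

A request to go from Safe straight to Normal is simply refused, whoever or whatever issues it. In a real system, each edge would also carry a guard condition defined by the designers.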
🔗 Role of Recovery in PID × FSM × LLM Architecture
⚙️ PID
- Ensures stability in Safe / Limited modes
- Speed is secondary
- Predictability is the priority
🧾 FSM
- Fully controls transitions
- Can enforce stop or rollback
🧠 LLM
- Explains causes of failure
- Proposes redesign options
- Supports human decision-making
The LLM only thinks. It does not act.
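This separation can be enforced structurally rather than by convention. The sketch below is one possible arrangement under stated assumptions: `AdvisoryLog` and `authorize_return` are hypothetical names, and the explicit human sign-off is an assumption of this example.

```python
from dataclasses import dataclass, field


@dataclass
class AdvisoryLog:
    """Collects LLM explanations for human review; exposes no actuation methods."""
    entries: list[str] = field(default_factory=list)

    def add(self, text: str) -> None:
        """Store an explanation or proposal for a human to read later."""
        self.entries.append(text)


def authorize_return(checks_passed: bool, human_approved: bool) -> bool:
    """Decide on return from deterministic checks and a human sign-off only.

    The advisory log is deliberately not a parameter, so LLM text cannot
    influence the decision programmatically.
    """
    return checks_passed and human_approved
```

The LLM can write as much as it likes into the log, but the only path from that text to a mode change runs through a person.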
❌ Common Recovery Design Failures
🚫 AI Declares “All Clear”
- No solid evidence
- No reproducibility
- No accountability
🚫 Weak Safe Mode
- Too close to normal operation
- Failure reoccurs immediately
🚫 Direct Return to Normal
- The highest accident risk
🏁 Why Recovery Control Differentiates AI Control Systems
AI control systems:
- Look similar when successful
- Reveal true design quality when they fail
Systems with Recovery Control:
- Do not collapse in the field
- Can explain what happened
- Preserve trust
🧠 Summary: AI Control Is Defined by How It Returns
- AI will fail
- Post-failure design defines safety
- Recovery Control cannot be retrofitted
- FSM-driven staged return is essential
- LLM must remain advisory
The true value of AI control is measured not by
how often it succeeds, but by
how reliably it returns after failure.
📚 Trilogy Summary
- Why LLMs must not be placed inside control loops
- Safety Envelope as boundary design
- Recovery Control as return design
Only when all three are present
can AI control be deployed safely.
🔗 References
- AI Control Safety Package
https://samizo-aitl.github.io/ai-control-safety-package/