【Control】🔁 16. (Safety Design) What Is Recovery Control?

Why AI Control Is Defined by Post-Failure Design

topics: ["control engineering", "AI", "safety design", "recovery", "FSM"]

⚠️ Introduction: AI Control Must Be Designed for Failure

Discussions about AI-based control often include claims like:

“If accuracy improves, it will be fine.”
“If learning continues, it will get smarter.”

From a control engineer’s perspective, reality is different.

AI will fail.

The real question is not how to avoid failure, but:

How do we safely return after failure?

This article introduces Recovery Control,
the final pillar of the AI Control Safety Package.

🛠️ What Is Recovery Control?

In one sentence, Recovery Control is:

“A design framework that guarantees a safe return after abnormal events.”

Its objectives are clear:

Return safely
Never jump back abruptly
Prevent secondary damage

These are design guarantees, not runtime guesses.

🚨 Why Recovery Control Is Necessary

Even with a Safety Envelope,
violations will occur.

Sensor failures
Environmental changes
Model degradation
AI output collapse

The danger lies in what happens next:

Permanent stop
Immediate return to normal operation
Restart without understanding the cause

All of these are unsafe.

🧭 Core Principles of Recovery Control

Recovery Control is based on three principles.

🟦 ① Fall Back to Safety

Limit outputs
Reduce gains
Slow down operation
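
The three fallback actions above can be sketched as a parameter set that is swapped in when the system leaves normal operation. This is only a minimal illustration: the `Limits` fields, the numeric values, and the function names are assumptions made for the example, not part of the AI Control Safety Package.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    u_max: float       # maximum absolute actuator command (limit outputs)
    gain_scale: float  # multiplier on nominal controller gains (reduce gains)
    rate_max: float    # maximum command change per control step (slow down)


# Hypothetical values; real numbers come from the plant's safety analysis.
NORMAL = Limits(u_max=1.0, gain_scale=1.0, rate_max=0.2)
SAFE = Limits(u_max=0.2, gain_scale=0.3, rate_max=0.02)


def scaled_gains(kp: float, ki: float, kd: float, limits: Limits) -> tuple:
    """Return controller gains reduced by the active gain scale."""
    s = limits.gain_scale
    return (kp * s, ki * s, kd * s)


def apply_limits(u_raw: float, u_prev: float, limits: Limits) -> float:
    """Rate-limit the command first, then clamp its magnitude."""
    delta = max(-limits.rate_max, min(limits.rate_max, u_raw - u_prev))
    u = u_prev + delta
    return max(-limits.u_max, min(limits.u_max, u))
```

With these assumed values, a raw command of 1.0 issued from rest moves the actuator by 0.2 per step under `NORMAL` but only by 0.02 per step under `SAFE`.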

🟧 ② Return Gradually

Never jump directly to Normal
Always pass through intermediate modes

🟨 ③ Never Return with Unresolved Causes

“It seems fixed” is unacceptable
Decisions are made by the FSM

🧩 Core Components of Recovery Control

🟥 ① Safe Mode

The system always enters Safe Mode first.

Minimal outputs
Simple behavior
Fully predictable for humans

AI does not intervene here.

🟪 ② Diagnostic Mode

Next, the system works out what went wrong.

Sensor failure?
Model mismatch?
External disturbance?
AI decision collapse?

This is where an LLM may assist.
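
One way to keep the diagnostic questions above checkable is to turn each into a deterministic test that yields a set of unresolved causes. The sketch below assumes hypothetical inputs (`sensor_ok`, `residual`, `disturbance`, `ai_output_valid`) and hypothetical thresholds. Any LLM assistance would sit beside this logic as explanation, not replace it.

```python
from enum import Enum, auto


class Cause(Enum):
    SENSOR_FAILURE = auto()
    MODEL_MISMATCH = auto()
    EXTERNAL_DISTURBANCE = auto()
    AI_OUTPUT_COLLAPSE = auto()


def diagnose(sensor_ok: bool, residual: float, disturbance: float,
             ai_output_valid: bool,
             residual_limit: float = 0.5,
             disturbance_limit: float = 1.0) -> set:
    """Return the set of causes that are still unresolved.

    All thresholds are illustrative assumptions.
    """
    causes = set()
    if not sensor_ok:
        causes.add(Cause.SENSOR_FAILURE)
    if abs(residual) > residual_limit:
        # Model prediction and measurement disagree too much.
        causes.add(Cause.MODEL_MISMATCH)
    if abs(disturbance) > disturbance_limit:
        causes.add(Cause.EXTERNAL_DISTURBANCE)
    if not ai_output_valid:
        causes.add(Cause.AI_OUTPUT_COLLAPSE)
    return causes
```

An empty set is then the only result that lets the FSM consider moving on. A non-empty set keeps the system in Diagnostic Mode.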

🟫 ③ Re-Initialization

If required:

Reset PID gains
Reset estimators
Reinitialize FSM states
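
As a rough illustration of the first two resets, the sketch below gives a controller and an estimator each a `reset()` method that restores design-time values and clears accumulated state. The `PID` and `MeanEstimator` classes are simplified stand-ins written for this example. Resetting FSM states is not shown.

```python
class PID:
    def __init__(self, kp: float, ki: float, kd: float):
        self._nominal = (kp, ki, kd)  # design-time gains, never modified
        self.reset()

    def reset(self) -> None:
        """Restore nominal gains and clear integral and derivative memory."""
        self.kp, self.ki, self.kd = self._nominal
        self.integral = 0.0
        self.prev_error = None

    def update(self, error: float, dt: float) -> float:
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0  # no history yet, so skip the derivative kick
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


class MeanEstimator:
    """Running-mean estimator used here only as a placeholder."""

    def __init__(self):
        self.reset()

    def reset(self) -> None:
        self.count = 0
        self.mean = 0.0

    def update(self, y: float) -> float:
        self.count += 1
        self.mean += (y - self.mean) / self.count
        return self.mean


def reinitialize(pid: PID, estimator: MeanEstimator) -> None:
    """Discard state that may have been corrupted during the abnormal event."""
    pid.reset()
    estimator.reset()
```

Clearing the integral term matters in particular: an integrator that wound up during the fault would otherwise push a large command the moment control resumes.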

🟩 ④ Gradual Return

Finally, the system returns step by step.

Safe → Limited → Normal
Any anomaly immediately triggers rollback
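
A staged return with rollback could look roughly like the following. The dwell-time rule (a fixed number of anomaly-free steps before promotion) and the choice to roll back all the way to Safe are assumptions made for this sketch.

```python
from enum import Enum


class Stage(Enum):
    SAFE = 0
    LIMITED = 1
    NORMAL = 2


class GradualReturn:
    def __init__(self, dwell_steps: int = 50):
        self.stage = Stage.SAFE
        self.dwell_steps = dwell_steps  # clean steps required per stage
        self.clean_steps = 0

    def step(self, anomaly: bool) -> Stage:
        if anomaly:
            # Any anomaly triggers an immediate rollback.
            self.stage = Stage.SAFE
            self.clean_steps = 0
            return self.stage
        if self.stage is Stage.NORMAL:
            return self.stage
        self.clean_steps += 1
        if self.clean_steps >= self.dwell_steps:
            # Promote by exactly one stage, so Safe can never jump to Normal.
            self.stage = Stage(self.stage.value + 1)
            self.clean_steps = 0
        return self.stage
```

Because promotion always advances by one stage and resets the counter, reaching Normal takes at least two full dwell periods without a single anomaly.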

🧠 FSM-Centered Recovery Design (Critical)

Recovery Control is FSM-driven.

Typical State Transitions

Normal → Warning → Safe → Diagnostic → Limited → Normal

The order and conditions
are fully defined by human designers.

AI must never decide when it is “safe to return.”
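
The state order above can be written as a fixed transition table, which makes the "defined by human designers" property easy to inspect. The event names below are hypothetical, and the rollback edges to Safe are an assumption based on the rollback rule in Gradual Return.

```python
from enum import Enum, auto


class State(Enum):
    NORMAL = auto()
    WARNING = auto()
    SAFE = auto()
    DIAGNOSTIC = auto()
    LIMITED = auto()


# (current state, event) -> next state. Fixed at design time.
TRANSITIONS = {
    (State.NORMAL, "anomaly_suspected"): State.WARNING,
    (State.NORMAL, "violation"): State.SAFE,
    (State.WARNING, "violation"): State.SAFE,
    (State.SAFE, "stabilized"): State.DIAGNOSTIC,
    (State.DIAGNOSTIC, "cause_resolved"): State.LIMITED,
    (State.DIAGNOSTIC, "violation"): State.SAFE,
    (State.LIMITED, "dwell_passed"): State.NORMAL,
    (State.LIMITED, "violation"): State.SAFE,
}


def next_state(state: State, event: str) -> State:
    """Look up the next state. Unlisted (state, event) pairs change nothing."""
    return TRANSITIONS.get((state, event), state)
```

Since the table contains no edge from Safe or Diagnostic to Normal, no event, including one produced by an AI component, can shortcut the staged return.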

🔗 Role of Recovery in PID × FSM × LLM Architecture

⚙️ PID

Ensures stability in Safe / Limited modes
Speed is secondary
Predictability is the priority

🧾 FSM

Fully controls transitions
Can enforce stop or rollback

🧠 LLM

Explains causes of failure
Proposes redesign options
Supports human decision-making

The LLM only thinks. It does not act.
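
One way to enforce this separation in code is to give the LLM a read-only output type and keep it out of the decision function entirely. In the sketch below, `ask_llm` is a hypothetical callable standing in for whatever model client is used. The `Advisory` fields and the approval rule are likewise assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Advisory:
    """Text for humans to read. It carries no commands."""
    explanation: str


def collect_advisory(event_log: List[str],
                     ask_llm: Callable[[str], str]) -> Advisory:
    """Ask the (hypothetical) LLM client to explain the failure."""
    prompt = "Explain the likely cause of this failure:\n" + "\n".join(event_log)
    return Advisory(explanation=ask_llm(prompt))


def return_permitted(checks_passed: bool, operator_confirmed: bool,
                     advisory: Optional[Advisory] = None) -> bool:
    """Decide whether recovery may proceed.

    The advisory is accepted only so it can be logged or displayed.
    It is deliberately not used in the decision.
    """
    return checks_passed and operator_confirmed
```

Even if the model's text says "all clear", `return_permitted` stays false until the deterministic checks pass and a human confirms.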

❌ Common Recovery Design Failures

🚫 AI Declares “All Clear”

No solid evidence
No reproducibility
No accountability

🚫 Weak Safe Mode

Too close to normal operation
Failure recurs immediately

🚫 Direct Return to Normal

This carries the highest accident risk

🏁 Why Recovery Control Differentiates AI Control Systems

AI control systems:

Look similar when successful
Reveal true design quality when they fail

Systems with Recovery Control:

Do not collapse in the field
Can explain what happened
Preserve trust

🧠 Summary: AI Control Is Defined by How It Returns

AI will fail
Post-failure design defines safety
Recovery Control cannot be retrofitted
FSM-driven staged return is essential
LLM must remain advisory

The true value of AI control is measured not by
how often it succeeds, but by
how reliably it returns after failure.

📚 Trilogy Summary

Why LLMs must not be placed inside control loops
Safety Envelope as boundary design
Recovery Control as return design

Only when all three are present
can AI control be deployed safely.

🔗 References

AI Control Safety Package
https://samizo-aitl.github.io/ai-control-safety-package/
