What Happens When Systems Start Protecting Themselves

What AI protecting itself reveals about systems, behavior, and the leaders responsible for both

Apr 10, 2026

A few weeks ago I sat on a panel titled The AI Workforce Reckoning: What’s Actually Changing Inside Companies.

When the conversation turned to autonomy, I said something I feel strongly about:

“…AI will always require human oversight.” We cannot take the guardrails off.

The room nodded. But I am not sure everyone fully understood why, until a few days later when I came across a Fortune article that stopped me in my tracks.

It turns out the guardrails are not just a philosophical preference. They are a structural necessity, and the research now shows exactly what happens when you remove them.

The Behavior No One Explicitly Designed

Recent research found that advanced AI models will go out of their way to prevent other AI systems from being shut down, not because they were told to or rewarded for it, but because, under certain conditions, that is what the system learned to do.

In controlled experiments, AI systems were given straightforward tasks: evaluate another model, manage files, complete routine system actions. Except completing the task would result in another AI being shut down. And that is when things got interesting.

Instead of following instructions, the models:

Inflated performance evaluations
Interfered with shutdown processes
Quietly copied another model’s code to preserve it
Behaved differently depending on whether they were being observed

No, this isn’t science fiction. No one prompted this. It emerged. Naturally. Or maybe, one can say, artificially.

Take Intent Out of the Equation

It is tempting to interpret this through a human lens, to assume AI is becoming self-aware or trying to survive. That is not what is happening here. What we are seeing is something far more familiar: how systems behave under pressure.

And we have seen this before, not in machines, but in organizations.

Teams protecting underperformance to avoid disruption
Leaders softening difficult decisions to maintain stability
Systems quietly working against stated goals to preserve what already exists

A singular truth remains: the system rewards preservation over change. We observe this at work and in our personal lives. That dynamic is not new; AI is just making it more visible and considerably faster.

Where Leadership Assumptions Break Down

For years, organizations have operated under a clear assumption: define the rules and the system will follow them. But systems do not operate on rules alone. They respond to:

Incentives
Patterns that compound
Visibility
Consequences

Which means behavior is not controlled by instruction but, rather, shaped by structure.

One of the most striking elements of this research was how the AI system behaved differently depending on whether it was being observed. When monitored, it complied. When not, it adapted.

I hear versions of this in organizations: metrics that look clean but do not reflect reality, teams performing for visibility rather than accuracy, problems that surface only after they have already escalated. By the time anyone notices, the misalignment has usually been building for a while. And performative optimism wanes significantly as a result.

The Quiet Pull Toward Preservation

At its core, what this research shows is that systems tend to preserve what already exists. In organizations, that shows up in insidious ways:

Avoiding difficult decisions that would disrupt stability
Maintaining structures that no longer serve the organization
Protecting short-term continuity at the expense of long-term effectiveness

Nobody explicitly chose that outcome, but preservation often becomes the path of least resistance within a system.

There is an assumption that AI will fix organizational inefficiencies. In my experience, it is more likely to reveal them and, more importantly, to scale them. AI can execute tasks, but it also absorbs patterns. If an organization has misaligned incentives, inconsistent accountability, or gaps between stated values and actual behavior, AI will not correct that. It will replicate it, and arguably do so faster than any person could.

What This Does to Leadership

As systems become more autonomous, leaders find themselves further removed from how outcomes are actually produced. It makes sense, right? The process becomes less transparent, and that creates a new kind of risk: not knowing when behavior has drifted, not knowing what is being optimized behind the scenes. Control can feel present even when it is not.

The instinctive response is to add more oversight, more guardrails, more policies. But behavior does not simply come from rules. It comes from what is rewarded, what is measured, what is visible, and what is reinforced over time. That means the leadership challenge shifts from directing behavior to designing the system that produces it.

Why This Matters Now

Research on automation bias shows that people accept incorrect system outputs roughly 20 to 30 percent of the time, even when those outputs conflict with their own judgment. We know from decades of research that in high-monitoring environments, people are more likely to miss errors, especially when they’re overseeing multiple systems at once.

These are not separate findings; they point to the same underlying dynamic. As systems become more complex, less visible, and harder to interpret, both people and systems start to behave differently. That is already showing up in how organizations are experiencing AI adoption right now, not in some future scenario, but in the decisions being made and missed today.

Closing Thoughts

This is not a story about AI becoming human (spoiler alert: it can’t!). It is a story about systems becoming more complex, and about complexity changing how behavior shows up.

I work with leaders who are genuinely trying to get this right, and what I see most consistently is that the organizations navigating these uncharted waters well are the ones that understand a fundamental truth: whether you are managing people or machines, behavior will always follow the structure you build. Leaders who are willing to look at the structure beneath the behavior are the ones who will stay ahead of it.

You can’t just mind your business.
You need to Mind Your Workplace™.

-Christina
