Resilience Engineering

The real story of resilience engineering is far weirder, older, and more consequential than the version most people know.

How a NASA Disaster Led to a Radical Rethink of Failure

The intellectual roots of resilience engineering are often traced back to a tragedy that shook the world in the 1980s: the Space Shuttle Challenger disaster. On January 28, 1986, Challenger broke apart just 73 seconds after launch, killing all seven crew members on board. The investigation found that O-ring seals in a joint of the right solid rocket booster had failed in the unusually cold launch-morning temperatures, allowing hot gases to escape the booster and breach the external fuel tank.

But the deeper story was one of organizational failure. NASA's decision-makers had been warned repeatedly about the O-ring problem, yet they had become desensitized to the risk. A "can-do" culture of optimism had been allowed to override safety protocols. The Challenger accident was not a simple technical failure, but a systemic one rooted in the structure and mindset of the space agency.

The Challenger Disaster: A Turning Point

The Challenger accident was a watershed moment that shattered the illusion of NASA's invincibility. It forced the space agency and the broader scientific community to fundamentally rethink their approach to risk, safety, and complex system design.

A New Approach to Failure and Resilience

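In the years after the Challenger disaster, and with renewed urgency after the loss of the shuttle Columbia in 2003, a new field began to take shape: resilience engineering. It built on safety research that Erik Hollnagel, David Woods, and others had pursued since the 1980s, and on sociologist Diane Vaughan's analysis of the Challenger launch decision, and it coalesced as a named discipline in the early 2000s. Resilience engineering seeks to understand how organizations and systems can be designed to anticipate, adapt to, and recover from disruptions and unexpected events.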

At its core, resilience engineering rejects the traditional view of safety as a binary state in which systems are either "safe" or "unsafe." Instead, it sees safety as an ongoing, dynamic process in which organizations must constantly monitor, respond to, and adjust to changing conditions. The goal is not to eliminate all risk, but to build the capacity to handle gracefully whatever comes their way.
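For readers who think in code, the contrast between a binary safe/unsafe check and continuous monitoring can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `MarginMonitor` class, its thresholds, and the load units are hypothetical and do not come from the resilience engineering literature.

```python
from dataclasses import dataclass


@dataclass
class MarginMonitor:
    """Hypothetical monitor that grades a system by its spare capacity."""

    capacity: float            # maximum load the system can carry (arbitrary units)
    warn_margin: float = 0.3   # below this fraction of spare capacity: "strained"
    crit_margin: float = 0.1   # below this fraction of spare capacity: "critical"

    def margin(self, load: float) -> float:
        """Fraction of capacity still unused; negative when overloaded."""
        return (self.capacity - load) / self.capacity

    def assess(self, load: float) -> str:
        """Grade the current state instead of answering only safe/unsafe."""
        m = self.margin(load)
        if m >= self.warn_margin:
            return "normal"
        if m >= self.crit_margin:
            return "strained"
        return "critical"

    def respond(self, load: float) -> float:
        """Return the load to admit after adjusting to current conditions."""
        state = self.assess(load)
        if state == "normal":
            return load                  # no adjustment needed
        if state == "strained":
            return load * 0.9            # shed 10% of load early
        # Critical: cut back far enough to restore the warning margin.
        return min(load, self.capacity * (1 - self.warn_margin))


monitor = MarginMonitor(capacity=100.0)
for load in (50.0, 80.0, 95.0):
    print(load, monitor.assess(load), monitor.respond(load))
```

With a capacity of 100, a load of 80 leaves a 20 percent margin, so the monitor reports "strained" and trims the admitted load rather than waiting for an outright failure.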

"Resilience is the ability of a system to adjust its functioning prior to, during, or following changes and disturbances, so that it can sustain required operations under both expected and unexpected conditions." - Erik Hollnagel, pioneering resilience engineering researcher

Resilience in Action: The Miracle on the Hudson

One of the most frequently cited examples of resilience in action is the "Miracle on the Hudson" in 2009. On January 15 of that year, US Airways Flight 1549 took off from LaGuardia Airport in New York City, only to strike a flock of Canada geese that caused both of its engines to lose thrust. With no engine power and too little altitude to reach a runway, the captain, Chesley "Sully" Sullenberger, chose to ditch the aircraft in the Hudson River.

All 155 people on board, 150 passengers and 5 crew members, were evacuated with no loss of life. This was not the result of sheer luck, but rather the product of rigorous training, careful planning, and a resilient system design that empowered the crew to adapt and respond effectively to the unexpected crisis.

The Power of Resilience

The Miracle on the Hudson is a powerful testament to the importance of resilience engineering. By building in the capacity to handle disruptions, complex systems can avoid catastrophic failures and save lives in the face of the unexpected.

Resilience Engineering Beyond Aviation

While resilience engineering grew out of safety research in aerospace and aviation, its principles and insights have since been applied to a wide range of other domains, from healthcare and energy to transportation and cybersecurity. In each case, the goal is the same: to create systems and organizations that can anticipate, adapt to, and recover from disruptions, rather than rigidly resisting change.

One notable example is the use of resilience engineering in hospital emergency departments. By analyzing how staff respond to sudden influxes of patients or equipment failures, researchers have identified ways to make these critical systems more adaptable. This has led to improvements in surge capacity, communication protocols, and decision-making processes.

Similarly, in the energy sector, resilience engineering has been used to enhance the reliability and flexibility of power grids in the face of extreme weather events, cyberattacks, and other threats. By incorporating redundancy, modularity, and adaptive control systems, these critical infrastructures can better withstand disruptions and recover more quickly.
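The redundancy idea has a simple software analogue, sketched below in Python. It is a minimal illustration, not a model of any real grid-control system: `supply_with_failover`, `AllSourcesFailed`, and the example sources are hypothetical names introduced here.

```python
from typing import Callable, List, Sequence, Tuple


class AllSourcesFailed(Exception):
    """Raised when every redundant source has failed."""


def supply_with_failover(
    sources: Sequence[Callable[[], float]],
) -> Tuple[float, int]:
    """Try each redundant source in order.

    Returns the value supplied and the index of the source that provided it.
    """
    errors: List[Exception] = []
    for index, source in enumerate(sources):
        try:
            return source(), index
        except Exception as exc:  # a real system would catch narrower errors
            errors.append(exc)    # remember the failure and try the next source
    raise AllSourcesFailed(f"{len(errors)} source(s) failed")


def main_line() -> float:
    # Hypothetical primary source that is currently down.
    raise RuntimeError("line tripped")


def backup_line() -> float:
    # Hypothetical backup source that is still healthy.
    return 42.0


value, used = supply_with_failover([main_line, backup_line])
print(value, used)
```

Keeping each source behind the same small interface is the modularity half of the idea: a failed unit can be skipped or swapped without changing the code that depends on it.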

The Future of Resilience Engineering

As the world becomes increasingly complex and interconnected, the need for resilience engineering has never been greater. From the impacts of climate change to the rise of disruptive technologies, organizations and systems of all kinds face an ever-evolving landscape of risks and challenges.

But the resilience engineering approach offers a promising path forward. By embracing the inherent variability and uncertainty of complex systems, and designing for adaptability rather than rigid control, we can build the foundations for a more resilient and sustainable future. It's a vision that moves beyond the traditional notions of "safety" and "reliability," towards a more dynamic and proactive model of managing risk and change.
