What Product Teams Can Learn from the London Underground Fire of 1987
On the evening of November 18, 1987, a devastating fire broke out at the King's Cross St Pancras station on the London Underground. It started around 7:30pm when a passenger discarded a burning match that fell through a gap on a wooden escalator leading down to the Piccadilly Line.
The match ignited grease and litter that had accumulated in the escalator machinery space over decades of poor maintenance. A ticket collector noticed a small tissue fire and quickly extinguished it, but he returned to his duties without reporting it.
Minutes later, another passenger reported a wisp of smoke rising from the escalator to a different ticket collector. He investigated but didn't call the fire brigade, assuming it was a small issue that didn't require an emergency response. This proved to be a fatal mistake.
Anatomy of a Deadly Disaster
By 7:45pm, the fire had ignited the wooden escalator steps and was now visible to patrons and staff. The first call to the London Fire Brigade came at 7:52pm, a full 22 minutes after the initial tissue fire. But even more crucial time would be lost.
When fire crews arrived at 8:02pm, they found the ticket hall filled with dense smoke. Their efforts to reach the fire were hampered by a series of critical failures and oversights in the Underground's fire safety systems:
- The escalators had no sprinklers and were made of flammable wood
- Station staff had no training on emergency procedures or evacuations
- Fire alarms and response processes had not been properly tested
- Police and fire crews had no way to communicate underground
As the fire grew in intensity, it ignited the walls and ceiling, which were covered with multiple layers of old paint. The running trains created a deadly chimney effect, bringing fresh oxygen to fan the flames and spreading toxic smoke throughout the station.
At 8:08pm, just 6 minutes after the fire brigade arrived, the blaze reached a horrifying flashover point. A jet of flames exploded up the escalator shaft, engulfing the ticket hall in a 600°C fireball. The entire station was now an inferno.
The flashover was intensified by the trench effect of the escalator shaft and the lack of fire safety doors at the top of it. Thirty-one people, including a firefighter, would ultimately lose their lives in the devastating blaze. Another 100 were taken to hospital.
It took over 12 hours and 30 fire engines to fully extinguish the fire. The damage was catastrophic: the ticket hall was almost completely destroyed, the cost exceeded £5 million (equivalent to £25 million today), and the station would remain closed for repairs for 6 months.
Smoldering Issues Beneath the Surface
The public inquiry into the King's Cross fire, led by Sir Desmond Fennell QC, identified a startling array of safety failures that had made the Underground a tragedy waiting to happen:
- Staff had grown complacent about fire risks and safety procedures
- Unclear division of responsibilities and communication breakdowns between different Underground departments
- Inadequate staff training on fire safety equipment and emergency plans
- Lack of preventative maintenance allowing flammable debris to build up
- Infrequent fire safety testing of materials and infrastructure
These organizational and cultural failings allowed a perfect storm of technical hazards to build up over time – flammable escalators, layers of paint, no sprinklers, poor ventilation. It was a disaster waiting for a spark.
The Fennell Report would ultimately make 157 recommendations for sweeping changes and improvements to fire safety on the Underground. While many were procedural, like better staff training and regular safety drills, others required major investments in physical infrastructure.
Legacy wooden escalators were replaced with metal ones. Comprehensive sprinkler and ventilation systems were installed. Rigorous fire safety standards were applied to all materials used in stations. A dedicated Underground radio network was set up for emergency responders.
But perhaps the most significant changes were the ones made to the human systems and culture of the Underground. New staff roles were created focused solely on fire safety. Strict protocols were established for reporting and responding to potential fires immediately. Regular drills were instituted to keep safety top of mind.
Parallels to Product Development
At first glance, the King's Cross fire may seem like a distant tragedy with few implications for modern product development. But for those versed in the challenges of shipping software, the parallels are strikingly familiar.
We may not be in the business of public transport, but we are in the business of building enormously complex systems under intense pressure. We have to constantly balance speed of development with safety and reliability. We wrangle intricate legacy systems while continuously shipping new features and innovations.
The same organizational and cultural failure modes that made the Underground vulnerable to disaster are ones that every product team must guard against:
- Growing complacent about monitoring and testing software health
- Allowing silos to develop where information and issues don't flow freely
- Underinvesting in tooling and automation for quality and safety
- Letting technical debt and design cruft accumulate over time
- Not regularly stress testing systems and incident response plans
Just like a physical fire, software disasters often start small – an overlooked bug, a flaky test, a problematic design choice. It's easy to ignore these smoldering issues, assuming they're too minor to be worth sounding the alarm. But left unchecked, they can quietly build into a raging inferno.
"Poor visibility of problems is as bad as the problems themselves. If you can't see the smoke, you can't fight the fire." – Charity Majors, CTO of Honeycomb
Building More Fireproof Teams
So what can we learn from the tragedy of King's Cross to build more resilient and adaptable product teams? Here are some key principles:
1. Prioritize observability and monitoring
One of the London Underground's critical failures was a lack of early warning systems. Ticket collectors weren't empowered to raise alarms. Smoke detectors and sprinklers were absent. Issues that started small grew in the dark.
In the world of software, observability is key. We need monitoring systems that surface potential problems early, before they can spiral out of control. Robust logging, tracing and alerting are a must. Every team member should feel comfortable raising a flag if they spot a worrying pattern or anti-pattern.
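As a rough illustration of what a software "smoke detector" might look like, the sketch below tracks recent request outcomes and fires once the error rate over a sliding window crosses a threshold. The class name `ErrorRateAlert` and the parameters `window`, `threshold`, and `min_samples` are hypothetical names chosen for this example, not part of any particular monitoring product.

```python
from collections import deque


class ErrorRateAlert:
    """Hypothetical alert: fires when recent failures exceed a threshold."""

    def __init__(self, window: int = 100, threshold: float = 0.05,
                 min_samples: int = 5) -> None:
        # Only the most recent `window` outcomes are kept; True means failure.
        self.outcomes = deque(maxlen=window)
        self.threshold = threshold
        self.min_samples = min_samples

    def record(self, failed: bool) -> bool:
        """Record one request outcome; return True if the alert should fire."""
        self.outcomes.append(failed)
        if len(self.outcomes) < self.min_samples:
            return False  # too little data to judge; avoids a noisy first alarm
        error_rate = sum(self.outcomes) / len(self.outcomes)
        return error_rate > self.threshold


alert = ErrorRateAlert(window=10, threshold=0.2)
# Eight healthy requests followed by three failures.
fired = [alert.record(failed) for failed in [False] * 8 + [True] * 3]
```

In this run the alert stays quiet until the third consecutive failure pushes the windowed error rate above 20%. In practice the window and threshold would be tuned so that small sparks are surfaced without drowning the team in false alarms.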
2. Embrace a culture of safety and learning
The Underground had grown dangerously complacent about fire risks. Outdated training and inconsistent procedures bred a false sense of security. Post-incident reviews were focused more on assigning blame than truly understanding root causes.
To build more resilient systems, product teams need a strong culture of safety and continuous learning. Thorough postmortems should be blameless and focus on surfacing systemic issues. Failures should be treated as opportunities to strengthen the immune system. Regular fire drills and chaos testing keep responders sharp.
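A software fire drill can be as small as deliberately failing a dependency and checking that the fallback holds. The following is a minimal sketch of that idea, assuming a hypothetical `fetch_price` dependency and a cached fallback; the function names and the 50% failure rate are illustrative choices, and real chaos testing tools work at the infrastructure level rather than in a wrapper like this.

```python
import random
from typing import Callable


def with_injected_faults(func: Callable[[], str], failure_rate: float,
                         rng: random.Random) -> Callable[[], str]:
    """Wrap `func` so that a fraction of calls raise, as a drill."""
    def wrapper() -> str:
        if rng.random() < failure_rate:
            raise ConnectionError("injected fault (drill)")
        return func()
    return wrapper


def fetch_price() -> str:
    # Hypothetical dependency; a real one would make a network call.
    return "live price"


def price_with_fallback(fetch: Callable[[], str]) -> str:
    """Behavior being drilled: fall back to a cached value on failure."""
    try:
        return fetch()
    except ConnectionError:
        return "cached price"


# A seeded generator keeps the drill repeatable from run to run.
unreliable_fetch = with_injected_faults(fetch_price, failure_rate=0.5,
                                        rng=random.Random(42))
results = [price_with_fallback(unreliable_fetch) for _ in range(200)]
```

Every call returns either the live or the cached value, and no exception reaches the caller. That is the property the drill is meant to confirm before a real outage tests it.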
3. Invest in cross-functional collaboration
The Underground's fragmented structure and poor communication between departments critically undermined emergency response. Ticket collectors and firefighters were operating with different playbooks. Information on the fire's severity didn't reach decision makers in time.
Product teams must actively work against the perils of hyper-specialization and siloed knowledge. Developers, designers, QA and ops should be in constant collaboration. Cross-training and shared on-call rotations build collective ownership. Open communication channels and regular check-ins ensure essential information flows.
4. Double down on quality and testing
The Underground had underinvested in basic fire protection measures like sprinklers and non-flammable materials. Escalators hadn't been upgraded in decades. Layers of hazardous paint had accumulated without testing. Minor issues were left to fester into major risks.
In software, skimping on quality is like playing with fire. Automated testing, QA processes, and code review can feel like speed bumps, but they are essential for preventing catastrophic failures. Technical debt and design cruft are like smoldering embers, harmless at first but potentially explosive down the line. Proactive refactoring and maintenance keep them in check.
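One smoldering ember that is easy to wave away is the flaky test. A simple way to surface one, sketched below, is to rerun a test several times and flag it if the outcomes disagree. The helper `is_flaky` and both sample tests are hypothetical stand-ins, and the alternating test only simulates hidden shared state.

```python
from itertools import cycle
from typing import Callable


def is_flaky(test: Callable[[], bool], runs: int = 20) -> bool:
    """Rerun `test` and report whether it both passed and failed."""
    outcomes = {test() for _ in range(runs)}
    return len(outcomes) > 1


def stable_test() -> bool:
    return sum([1, 2, 3]) == 6


# Stand-in for a test with hidden shared state: it alternates pass and fail.
_outcomes = cycle([True, False])


def order_dependent_test() -> bool:
    return next(_outcomes)


stable_is_flaky = is_flaky(stable_test)
order_dependent_is_flaky = is_flaky(order_dependent_test)
```

A passing rerun does not prove a test is reliable, since rare failures can slip through twenty runs. A check like this at least turns "it passes if you retry" into a tracked signal rather than a habit.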
5. Automate incident response
When the King's Cross fire broke out, critical time was lost to manual steps and human decision loops. Emergency plans weren't well drilled and relied on people taking heroic actions in the heat of the moment. The lack of communications infrastructure left crews flying blind.
In the world of product incidents, automated response can make all the difference. Emergency runbooks can be codified and version-controlled, much as infrastructure-as-code does for servers. Monitoring tools can automatically trigger incident workflows. Communication channels like Slack can be integrated to keep all stakeholders informed. The fewer routine decisions people have to make amid a crisis, the better.
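To make the idea of a codified runbook concrete, here is a minimal sketch in which each severity level maps to an ordered list of steps that run without anyone having to remember them. The `Incident` type, the `RUNBOOKS` table, and the step functions are all hypothetical. The steps are stubs that only record what a real paging or chat integration would do, so no actual service API is called or implied.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Incident:
    title: str
    severity: str
    timeline: List[str] = field(default_factory=list)


# Hypothetical steps: each stub records the action instead of performing it.
def page_on_call(incident: Incident) -> None:
    incident.timeline.append(f"paged on-call for {incident.severity} incident")


def open_incident_channel(incident: Incident) -> None:
    incident.timeline.append("opened incident channel")


def post_status_update(incident: Incident) -> None:
    incident.timeline.append(f"posted status update: {incident.title}")


# The runbook itself: reviewed and versioned like any other code.
RUNBOOKS: Dict[str, List[Callable[[Incident], None]]] = {
    "high": [page_on_call, open_incident_channel, post_status_update],
    "low": [post_status_update],
}


def respond(incident: Incident) -> Incident:
    """Run every step of the runbook that matches the incident's severity."""
    for step in RUNBOOKS[incident.severity]:
        step(incident)
    return incident


incident = respond(Incident(title="checkout error rate above 20%",
                            severity="high"))
minor = respond(Incident(title="slow dashboard", severity="low"))
```

The recorded timeline doubles as raw material for the postmortem. Keeping the runbook in one reviewed table means a change to the response process goes through the same scrutiny as a change to the product.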
The Fire Next Time
No product team can eliminate risk entirely. We operate in a world of extraordinary complexity, chasing a rapidly evolving horizon. Bugs will always ship. Incidents will always happen. There will always be another fire to fight.
But by learning from past disasters like the King's Cross fire, we can progressively fireproof our teams and codebases. We can build habits and muscle memory for detecting risks early, responding quickly, and always improving. We can create resilient systems, both human and technological.
The key is to never grow complacent, never assume that the absence of a major failure means the presence of safety. Every smoldering issue is a warning sign, every incident a catalyst for positive change.
In 1987, a burning tissue was ignored until it became an inferno. In 2023 and beyond, let us resolve to always take the small sparks seriously and snuff them out long before the flashover ever occurs. The fires we prevent are the ones that no one ever needs to hear about. But they are quietly the most important ones of all.
"Only a fool learns from his own mistakes. The wise man learns from the mistakes of others." – Otto von Bismarck