Failures in '2001: A Space Odyssey'
Note: This article is a semi-satirical take on an important project management topic: risk management. It was prepared in fulfillment of an assignment for a project management class I was enrolled in.
The plot of Stanley Kubrick's film masterpiece "2001: A Space Odyssey" revolves around a mysterious monolith found buried on the moon. (Co-writer and author Arthur C. Clarke optimistically imagined in the 1960s that we'd be such an advanced spacefaring nation that we'd have bases on the moon by 2001.) The monolith is featureless, but its sides are in the ratio one by four by nine (the squares of the first three integers, as Clarke's novel spells out), suggesting that it was placed underground by an alien intelligence. Shortly after being unearthed (unmooned?), it sends a powerful radio signal directed at Jupiter. Humans, being the curious beings they are, mount a deep space mission to Jupiter to see with what or whom the monolith might be communicating.
Few human projects are more carefully planned than those involving manned spaceflight. We can assume that the flight of the Discovery, the ship dispatched to Jupiter, has been planned to the very last detail, with the exception of the most important aspect of the mission: finding whatever there is to find near Jupiter. That the mission's planners could not know what would be found there must have been considered a significant unquantifiable risk. The rest of the mission rested on known technology (the ship, its onboard systems and so on) and likely had well-understood, quantified risks derived from decades of spaceflight experience.
One aspect of the Discovery's infrastructure was considered especially low risk: its onboard computer. Although the movie addresses this only obliquely, it's safe to say that the computer was regarded as one of the most dependable parts of the entire mission. Its only purpose was to oversee the complex operations of the ship, relieving the human crew of many of the more tedious tasks associated with running a spaceship.
The computer had a unique feature: it operated in such a way as to emulate human emotions. This made it easier for the crew to interact with it using natural language. It was so lifelike that the crew shortened its official name, HAL 9000, to simply "HAL." They thought of it as another member of the crew, one that they could depend on, and apparently with good reason: HAL himself stated that "no 9000 computer has ever made a mistake or distorted information."
The mission's project plan had several major failures. The first was the difficulty of maintaining the ship's communications system. During the mission, HAL reports that a component of the communications system called the AE-35 unit is about to fail, and replacing it requires a spacewalk. A better design would have made this component (critical for maintaining communications with Earth) accessible from inside the ship, reducing physical risk to the astronauts whenever maintenance was needed. The second failure was in allowing HAL to remotely operate the space pod used during the spacewalk. A malfunction in HAL caused the computer to use the pod to kill the spacewalking astronaut working on the AE-35 unit. The third failure occurred when HAL killed all of the hibernating crew members by turning off their life support. The fourth failure, related to the second, came when HAL tried to rid the ship of the last surviving astronaut by locking him out of the Discovery after he took a pod outside to retrieve the body of the first astronaut.
The last project failure again centered on the ship's design. When HAL malfunctioned, there was no simple way to shut him off and retake control of the ship other than to directly access his Logic Memory Center, buried in a remote part of the ship. Once there, memory modules had to be physically removed in order to shut him down. Had the ship's designers anticipated the possibility of a seriously malfunctioning computer (no matter how unlikely), they would have installed a simple mechanism for disengaging computer control.
Invalid assumptions about risk
The system designers clearly never anticipated a computer malfunction so catastrophic that it would result in HAL attempting to murder the crew. Perhaps this was a failure of imagination on their part, or an overconfidence that led them to believe HAL was "incapable of error." Presumably, testing on the ground had led them to believe that the likelihood of a major failure of this type was so near zero as to be not worth considering. Since HAL was deeply integrated into the Discovery's infrastructure, any major failure would lead to mission failure. It's not clear from the movie whether the ship could be successfully operated without HAL, but it looks as if it would not have been possible.
Low probability, high impact
The biggest project management lesson of the movie is this: expect the unexpected. Though ground tests by engineers indicated a flawless computer system, the decision to imbue the system with a simulated "personality" led to catastrophic consequences. The test record led the engineers (erroneously) to believe that they fully understood how computers with human emotions would behave in all possible scenarios. This was a classic case of an unmanaged risk that had a low probability but a high impact.
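For readers who like to see the arithmetic, here is a minimal sketch of why a low-probability, high-impact risk slips through. It assumes the common textbook scoring of exposure as probability times impact. Every number below is invented for illustration (the movie gives no figures), and the names `Risk`, `needs_mitigation` and `CATASTROPHIC` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    name: str
    probability: float  # chance of occurring during the mission, 0..1 (invented)
    impact: int         # severity from 1 (minor) to 5 (loss of crew) (invented)

    @property
    def exposure(self) -> float:
        # Textbook score: probability multiplied by impact.
        return self.probability * self.impact


CATASTROPHIC = 5  # hypothetical cutoff: impacts this severe always get a plan


def needs_mitigation(risk: Risk, exposure_threshold: float = 0.5) -> bool:
    # Flag a risk if its exposure is high OR its impact alone is catastrophic.
    return risk.exposure >= exposure_threshold or risk.impact >= CATASTROPHIC


risks = [
    Risk("AE-35 unit needs external repair", 0.30, 3),
    Risk("Unknown findings at Jupiter", 0.50, 4),
    Risk("Onboard computer turns on crew", 0.001, 5),
]

# Ranked by exposure alone, the computer failure lands at the bottom of the list.
for r in sorted(risks, key=lambda r: r.exposure, reverse=True):
    verdict = "mitigate" if needs_mitigation(r) else "accept"
    print(f"{r.name:35} exposure={r.exposure:.3f} -> {verdict}")
```

Under these made-up numbers, ranking by exposure alone would wave the computer risk through. It is flagged only because of the separate rule that any catastrophic impact gets a mitigation plan no matter how improbable it seems, which is the off switch the Discovery's designers never built.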