Triple-Redundant Space Station Computers Crippled by Single-Point Connector Failure
IEEE Spectrum has published a short but fascinating post mortem on how the International Space Station’s triple-redundant environmental- and attitude-control computer system failed last June. Was it some subtle voltage perturbation in the newly refurbished solar power system? Nope. A subtle programming bug that took out all three computers simultaneously? Nope. A heretofore unknown fatal logic design flaw? Nope.
It was, of course, the thing that every experienced electrical engineer knows to check first: the connectors. It seems that space stations have a big problem with water condensate, especially if your dehumidifier exhausts directly onto your computer cables, which seems to be the case on the space station. A dehumidifier exhausted moisture-laden air right onto power-monitoring boxes wired in series between the computers and their power source. The moisture attacked and corroded the connector pins in the cables that connected those boxes to the computers. When the cosmonauts disconnected the affected cables, the connector pins were wet. Well, that’s kinda scary.
One of the wires routed through those boxes was a “power-off” command line that went to all three supposedly independent computers. Naturally, that was one of the signal lines affected by corrosion. One corrosion-induced short on that shared signal line took down the entire triple-redundant computer system in a single blow.
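To see why a shared line defeats redundancy, here is a toy Python sketch. It is not the station's actual logic: the `Computer` class, the `assert_power_off` helper, and the "at least one computer still running" survival rule are all assumptions made up for illustration. It simply contrasts an independent fault in one computer with a fault on a signal that fans out to all three.

```python
from dataclasses import dataclass


@dataclass
class Computer:
    """Hypothetical model of one redundant computer."""
    name: str
    healthy: bool = True   # internal fault state, independent per unit
    powered: bool = True   # controlled by the shared power-off line

    def running(self) -> bool:
        return self.healthy and self.powered


def system_up(computers) -> bool:
    """Assumed survival rule: the system is up if any computer still runs."""
    return any(c.running() for c in computers)


def assert_power_off(computers) -> None:
    """Model a shared power-off line: one asserted signal reaches every unit."""
    for c in computers:
        c.powered = False


# Independent fault: one computer dies, the other two carry on.
independent = [Computer("A"), Computer("B"), Computer("C")]
independent[0].healthy = False
independent_result = system_up(independent)

# Common-mode fault: a short asserts the line that all three share.
shared = [Computer("A"), Computer("B"), Computer("C")]
assert_power_off(shared)
shared_result = system_up(shared)

print("independent fault, system up:", independent_result)
print("shared-line fault, system up:", shared_result)
```

However many computers you add to the list, the second case still ends with nothing running, because the redundancy stops at the one wire they all share.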
There are lots of lessons in this article for the sharp system designer. Highly recommended reading.
For Paul Rako’s take on this, click here.