Two weeks after the North-South Line of Singapore’s metro system was severely disrupted because of flooding in a tunnel, there are still calls on social media for Desmond Kuek, the CEO of SMRT Corporation, which operates the line, to resign or be sacked. Public anger intensified after SMRT chairman Seah Moon Ming said at the 16 October 2017 press conference that the SMRT maintenance team would have their bonuses cut for failing to maintain the flood-prevention system. The public feeling was that Kuek himself should shoulder the blame and not point fingers at his staff, especially as he has been CEO for five years already.
Indeed, there is a growing perception that an entire cadre of military generals inserted to run various parts of public administration enjoys impunity whatever the failures on their watch. Kuek was previously a lieutenant general. It is time for a proper example to be set.
This post is not intended to dwell on this issue. I have stated my point above.
I didn’t come to this view immediately. Organisations do not always need to be decapitated each time a major problem surfaces. For example, whether CEOs of airlines ought to resign after an aircraft crash depends on the reasons for the crash. Sometimes it is pure bad luck, e.g. flying over a volcano that unexpectedly erupted. Other times, one can trace a history of slipshod maintenance of aircraft, an ingrained culture of ignoring safety checklists, or job rosters that are too demanding thus leading to fatigued staff despite protests — these are usually signs that a thorough personnel overhaul may be needed.
* * * * *
The rest of this article rests mainly on the media statement on the Land Transport Authority’s (LTA) website, giving a preliminary explanation for what happened on 7 October 2017. Christopher Tan of the Straits Times mentioned the salient points in his article of 16 October, but I know that not many people read the Straits Times nowadays, and even fewer think of checking government websites.
The general sense one gets from the LTA statement is that the neglect of the system was long-standing. You’d also notice that there were two separate failures that came together to cause the disruption, which indicates how bad the neglect was. One switch failure can happen. For two to happen — with the first failure not detected and not fixed before the second failure occurred — is a very serious lapse. For this reason, my view shifted to join the call for the top man to go.
Based on what I can glean from LTA’s statement, I drew the illustration above. Of course, I haven’t seen the pit myself and the illustration is merely diagrammatic.
To cope with heavy downpours, tunnel systems are supported by storm water catchment pits. The one serving the tunnel opening at Bishan “has the capacity of about two Olympic-sized swimming pools, and can contain about six hours of continuous heavy rain,” said the LTA. The pumps should normally be activated once the pit begins to fill, but in case the rain is so heavy that even the maximum pumping rate is insufficient, the pit has enough capacity to store the water instead of letting it overflow into the tunnel.
Here’s the interesting thing: While the rain that afternoon was heavy, it was not prolonged. Based on the data for the Bishan area, the total volume of rain water collected by the catchment area around the Bishan tunnel portal entrance was approximately 640 cubic metres, estimated the LTA. “Had the storm water pit been empty (at the start of the downpour), this volume of water would have only filled about 13% of the pit.”
Conclusion: The pit was already close to full when the rain began. This means the pumps had not been working for quite a while. The last time the pit was checked was in June (source: Christopher Tan’s article). It is not clear what that inspection found; LTA’s statement is silent on this. Either the pit and pumps were in good order, or they were not and no action was taken.
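For readers who want to check the arithmetic behind that conclusion, here is a rough sketch in Python. It uses only the two figures quoted from LTA’s statement (640 cubic metres and 13%); the variable names are my own, and the results are only as precise as LTA’s rounded numbers. It also assumes, as LTA reported, that no pump ran during the downpour.

```python
# Back-of-envelope check using the two figures quoted from LTA's statement.
rain_volume_m3 = 640.0   # rain collected around the Bishan portal (LTA estimate)
fraction_of_pit = 0.13   # share of an empty pit that this volume would fill (LTA)

# Implied total capacity of the storm water pit.
pit_capacity_m3 = rain_volume_m3 / fraction_of_pit

# For 640 cubic metres to overflow the pit with no pump running, the pit
# must already have held at least this much water before the rain started.
min_prefill_m3 = pit_capacity_m3 - rain_volume_m3
min_prefill_fraction = min_prefill_m3 / pit_capacity_m3

print(f"Implied pit capacity: {pit_capacity_m3:,.0f} cubic metres")
print(f"Water already in pit: at least {min_prefill_m3:,.0f} cubic metres "
      f"({min_prefill_fraction:.0%} full)")
```

The implied capacity of roughly 4,900 cubic metres is in line with “two Olympic-sized swimming pools” if one assumes about 2,500 cubic metres per pool (50 m by 25 m by 2 m, which is my assumption, not LTA’s). Put differently, the pit must have been at least 87% full before the rain began.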
To discharge water from the pit, there are three pumps. Each has its own float switch, similar in principle to the device in the cistern of your toilet. According to the LTA, “Subsequent checks found that the pumps in the storm water pit were all in working condition,” yet none of them activated that day.
This was because the system was designed to include an override switch. As Christopher Tan explained in his article:
“(This) switch detects low water levels to prevent the pumps from overheating when there is little or no water in the reservoir. This fourth switch, which overrides the rest, was the one that malfunctioned.”
Source: Straits Times, 16 October 2017, “SMRT, LTA have to get to root of the problem”, by Christopher Tan.
It appears that this fourth switch (with its float) failed to detect that water in the pit was rising. It then suppressed all three pumps.
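To make the logic concrete, here is a minimal sketch in Python of how I understand the arrangement. It is a toy model built from the descriptions above, not the actual control wiring, which I have not seen. The function name `pumps_running` and its parameters are my own inventions for illustration.

```python
def pumps_running(float_switch_up, override_sees_water):
    """Toy model: a pump runs only if its own float switch is lifted AND
    the shared low-water override reports that there is water in the pit.

    float_switch_up: one boolean per pump (True = float lifted by water).
    override_sees_water: reading of the single override switch.
    """
    return [up and override_sees_water for up in float_switch_up]


# Pit is nearly full, so all three pump float switches are lifted.
floats = [True, True, True]

# Healthy override: it also senses water, so every pump runs.
healthy = pumps_running(floats, override_sees_water=True)

# Faulty override: it wrongly reports "no water", so every pump is held off.
faulty = pumps_running(floats, override_sees_water=False)

print(healthy)  # [True, True, True]
print(faulty)   # [False, False, False]
```

In this model, three working pumps with three working float switches are all silenced by one wrong reading from the fourth switch.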
Another switch didn’t work either. This was the ‘Alarm switch’, whose function was to alert the Operations Control Centre if water reached a critical level in the pit, so that manual intervention could be summoned. As a result, the control centre had no forewarning of the crisis until water was already in the tunnel and a train was stranded with its undercarriage and all its electrical systems in water.
We were extremely lucky no one was electrocuted. That’s how close we came to a fatal disaster.
It is still not known why the override switch and the alarm switch malfunctioned. The fact that sludge and debris were found at the bottom of the pit could be a factor.
Nonetheless, the description from preliminary findings indicated a serious problem that could have been noticed weeks or months earlier, but was not. Two failures occurred, not one: the alarm switch and the override switch — which I presume were independent systems. Neglect has to be very severe for something like this to happen.
Christopher Tan also mentioned in his article an observation by transport minister Khaw Boon Wan. Failsafe systems are designed to have multiple independent parts, so that if one fails, the other parts can still take care of the problem. Redundancy gives a margin of security. Khaw said that two pumps (each with its own float switch) would be sufficient to empty the reservoir even in heavy rain, so design-wise that sounds right.
But then a brilliant engineer put in an override switch: just one override switch that can suppress all the pumps. “To have all three tied to one switch undermines this redundancy,” wrote Christopher Tan.
Amazingly, nobody seems to be asking what should be an obvious question. At the other tunnel openings in our MRT system, are they all designed the same way, with a single override switch able to suppress a whole bank of pumps? If so, then all it takes is for one switch to fail and none of the pumps will be activated. What do you think are the chances of this happening again?
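As a rough way of thinking about those chances, here is a small Python sketch. Every number in it is hypothetical: I have no data on how often these switches fail, and I am assuming that each switch fails independently with the same probability, which sludge in the pit could well make untrue. The function name `p_no_pump_starts` is my own.

```python
def p_no_pump_starts(p_float, p_override, n_pumps=3):
    """Probability that no pump starts when needed, in a toy model where
    the pumps stay off if the shared override switch fails OR every
    pump's own float switch fails. Failures are assumed independent."""
    p_all_floats_fail = p_float ** n_pumps
    return 1 - (1 - p_override) * (1 - p_all_floats_fail)


p = 0.01  # hypothetical chance that any one switch is faulty when needed

# Three independent float switches and no shared override.
without_override = p ** 3

# The same three float switches, plus one override that can veto them all.
with_override = p_no_pump_starts(p, p)

print(f"Without a shared override: {without_override:.6f}")
print(f"With a shared override:    {with_override:.6f}")
```

With these made-up figures, the chance of total pump failure rises from about one in a million to about one in a hundred. The exact numbers do not matter; the point is that the system becomes only about as reliable as its single shared switch.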
* * * * *
Earlier, there were reports that a maintenance team was scheduled to check on the sump pit and pumps some days (or a week?) prior to the incident. The reports said that the maintenance team could not get track access, and so the routine inspection was rescheduled to 12 October. However, the flooding happened before that, on 7 October.
I am always a little skeptical of such explanations. They sound a little too convenient, making it seem as if plans and intentions were good but “bad luck” got in the way. However, the thing that struck me about this explanation is not the matter of dates. It is that in order to reach the sump pit and the pumps, the maintenance crew had to go via track access. Why is that so? Why has it been designed this way? Why is there no separate entry to the pumps and sump pit to allow maintenance in the day when trains are running? And once again, is the design the same for all the other tunnel openings?
I hope the preliminary explanation from LTA won’t be the end of the matter. A proper commission of inquiry is warranted.