Seattle, Washington – Boeing engineers were nearly done redesigning software on the grounded 737 MAX last June when some pilots hopped into a simulator to test a few things. It didn’t go well.
A simulated computer glitch caused the plane to dive aggressively in a way that resembled the problem that had caused deadly crashes off Indonesia and in Ethiopia months earlier.
That led to an extensive redesign of the plane’s flight computers that has dragged on for months and repeatedly pushed back the date of its return to service, according to people briefed on the work.
The company, which initially expressed confidence it could complete its application to recertify the plane with the Federal Aviation Administration within months, now says it hopes to do that before the end of the year.
Changing the architecture of the jet’s twin flight computers, which drive autopilots and critical instruments, has proven far more laborious than patching the system directly involved in 737 Max crashes, said these people, who asked not to be named speaking about the issue.
The redesign has also sparked tensions between aviation regulators and the company. As recently as this week, the FAA and European Aviation Safety Agency asked for more documentation of the changes to the computers, said one of the people, potentially delaying the certification further.
Developing and testing software on airliners is an exacting process. Manufacturers may have to demonstrate with extensive testing that a software failure leading to a crash would be as rare as one in a billion.
“It’s really complicated,” John Hansman, an aeronautics and astronautics professor at the Massachusetts Institute of Technology who is not involved in the repair, said of revising aircraft software. “It totally makes sense why it’s taking longer.”
Compared with the initial redesign of the software system involved in the crashes – a feature known as Maneuvering Characteristics Augmentation System or MCAS – the work on the flight computers will likely create an exponential increase in the safety tests required before it’s approved, said Peter Lemme, a former Boeing engineer who worked on flight-control systems before leaving the company to become a consultant.
“Where before you may have had 10 scenarios to test, I could see that being 100,” Lemme said.
And that doesn’t account for the added time to design the software changes needed for the two computers, he said.
The work on the plane originally focused on MCAS, which repeatedly pushed down the nose in both accidents as a result of a malfunctioning sensor. Pilots eventually lost control and the crashes killed 346 people, prompting a worldwide grounding of the jet on March 13.
Within weeks of the first crash, a Lion Air flight off the coast of Indonesia on Oct. 29, 2018, Boeing announced that it was redesigning MCAS to make it less aggressive and to prevent it from activating more than once. The redesign was projected to be completed within months.
While the fix became more complex and politically charged after the second accident – the crash of an Ethiopian Airlines jet on March 10 – the changes to MCAS remained self-contained and relatively simple.
“I could have a bunch of graduate students and rewrite MCAS in a couple of days and be done,” Hansman said.
That, of course, wouldn’t pass muster with the FAA, he said. And it was far simpler than the extensive computer redesign that Boeing undertook.
Flight-Control Failure
In the original 737 Max design review, Boeing and its overseers at the FAA concluded pilots would react swiftly to flight-control failures, but that assumption has been called into question by Indonesia’s final report into the crash and recommendations issued by the U.S. National Transportation Safety Board.
FAA officials, stung by post-crash charges of laxity, had already begun a more rigorous review of systems on the plane.
Part of assessing an aircraft’s safety involves anticipating even the most remote potential failure.
As a result, Boeing in June simulated what would happen if gamma rays from space scrambled data in the plane’s flight-control computers.
In one scenario, the plane dove aggressively in a way that mimicked what happened in the crashes of the grounded jetliner, the people said. While such a failure had never occurred in the 737’s history, it was at least theoretically possible.
Response Time
Because at least one of the pilots who flew the scenario in a simulator found it difficult to respond in time to maintain control of the plane, the problem needed to be fixed, according to two people familiar with the results.
The answer was to modernize what was a relatively antiquated design on the 737.
Most modern, computerized aircraft, such as more recent Boeing models and Airbus’s jets, use three computer systems to monitor each other, Hansman and Lemme said.
By contrast, the 737 Max had two separate computers. One operated the flight systems and another was available if the first one failed, with the roles switching on each flight. But they interacted only minimally.
Boeing decided to make the two systems monitor each other so that each computer can halt an erroneous action by the other. This change is an important modernization that brings the plane more in line with the latest safety technology, but it raised highly complex software and hardware issues.
Short Circuit
Simply introducing a new wire that connects the two computers, for example, raises potential safety issues, Hansman said. If a short circuit in one computer occurs, could the wire cause it to disable the second computer?
And if flight data arrives in one computer a fraction of a second before or after it reaches the second one, that could create confusion for each system, according to Lemme.
As Boeing and the subcontractor that supplied the flight-control computer, the United Technologies Corp. division Collins Aerospace Systems, worked through these changes, the process has at times created tension.
Officials from the FAA and EASA (European Aviation Safety Agency) expressed frustration with Boeing at a meeting last summer when company representatives didn’t supply a detailed enough explanation of the changes.
Work Audit
A similar issue arose in early November when an audit describing work on the changes wasn’t complete and the agencies ordered Boeing and Collins to revise it, according to a person familiar with the matter.
Boeing, in a statement, said it provided technical documents to regulators “in a format consistent with past submissions.”
“Regulators have requested that the information be conveyed in a different form, and the documentation is being revised accordingly,” according to the statement. “While this happens we continue to work with the FAA and global regulators on certification of the software for the safe return of the MAX to service.”
The result has been to extend the jet’s grounding.
“It’s absolutely the right thing to do,” said Jeffrey Guzzetti, the former chief of accident investigations at the FAA. “But that is a big change to make.”