Automation Complacency

Tuesday, April 21st, 2015

Pablo Garcia suffered from a rare genetic disease called NEMO syndrome. When the 16-year-old went in for a colonoscopy, he started complaining of numbness and tingling all over his body:

At 9 o’clock that night, Pablo took all his evening medications, including steroids to tamp down his dysfunctional immune system and antibiotics to stave off infections. When he started complaining of the tingling, Brooke Levitt, his nurse for the night, wondered whether his symptoms had something to do with GoLYTELY, the nasty bowel-cleansing solution he had been gulping down all evening to prepare for the procedure. Or perhaps he was reacting to the antinausea pills he had taken to keep the GoLYTELY down.

Levitt’s supervising nurse was stumped, too, so they summoned the chief resident in pediatrics, who was on call that night. When the physician arrived in the room, he spoke to and examined the patient, who was anxious, mildly confused, and still complaining of being “numb all over.”

At first, he was perplexed. But then he noticed something that stopped him cold. Six hours earlier, Levitt had given the patient not one Septra pill — a tried-and-true antibiotic used principally for urinary and skin infections — but 38½ of them.

Levitt recalls that moment as the worst of her life. “Wait, look at this Septra dose,” the resident said to her. “This is a huge dose. Oh my God, did you give this dose?”

“Oh my God,” she said. “I did.”

The doctor picked up the phone and called San Francisco’s poison control center. No one at the center had ever heard of an accidental overdose this large — for Septra or any other antibiotic, for that matter — and nothing close had ever been reported in the medical literature. The toxicology expert there told the panicked clinicians that there wasn’t much they could do other than monitor the patient closely.

How did this happen?

As the pediatric clinical pharmacist, it was [Benjamin] Chan’s job to sign off on all medication orders on the pediatric service. The chain of events that led to Pablo’s catastrophic overdose unfolded quickly. The medication orders from Jenny Lucca, Pablo’s admitting physician, reached Chan’s computer screen moments after Lucca had electronically signed them.

Pablo had a rare genetic disease that causes a lifetime of infections and bowel inflammation, and as Chan reviewed the orders, he saw that Lucca had ordered 5 mg/kg of Septra, the antibiotic that Pablo took routinely to keep infections at bay.

Chan immediately noticed a problem with this Septra order: the dose of 193 mg the computer had calculated (based on the teenager’s weight) was 17 percent greater than the standard 160-mg Septra double-strength tablets. Because this discrepancy exceeded 5 percent, hospital policy did not allow Chan to simply approve the order. Instead, it required that he contact Lucca, asking her to enter the dose corresponding to the actual pill size: 160 mg. The pharmacist texted Lucca: “Dose rounded by >5%. Correct dose 160 mg. Pls reorder.”
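
To make the arithmetic concrete: 5 mg/kg × 38.6 kg ≈ 193 mg, and rounding that down to the 160-mg tablet changes the dose by about 17 percent, well past the 5 percent limit. Here is a minimal sketch of that rounding check in Python; the names and structure are invented for illustration and are not UCSF’s or Epic’s actual logic.

```python
# Hypothetical sketch of the 5 percent rounding policy described above.
# Names and thresholds are illustrative, not UCSF's or Epic's real code.

TABLET_MG = 160          # one double-strength Septra tablet
ROUNDING_LIMIT = 0.05    # hospital policy: reorder if rounding exceeds 5%

def check_rounding(weight_kg: float, dose_mg_per_kg: float) -> None:
    computed_mg = weight_kg * dose_mg_per_kg                  # 38.6 * 5 = 193 mg
    rounded_mg = round(computed_mg / TABLET_MG) * TABLET_MG   # nearest whole tablet: 160 mg
    discrepancy = abs(computed_mg - rounded_mg) / computed_mg
    if discrepancy > ROUNDING_LIMIT:
        # Pharmacist cannot simply approve; the prescriber must reorder in mg.
        print(f"Dose rounded by {discrepancy:.0%} (>5%). "
              f"Correct dose {rounded_mg:.0f} mg. Pls reorder.")
    else:
        print(f"Approved: {rounded_mg:.0f} mg")

check_rounding(weight_kg=38.6, dose_mg_per_kg=5)
# -> Dose rounded by 17% (>5%). Correct dose 160 mg. Pls reorder.
```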

Of the scores of medications that the resident would order — and the pharmacist would approve — that day, this was probably the simplest: an antibiotic pill dispensed by corner drugstores everywhere, being taken as a routine matter by a relatively stable patient. Neither the doctor nor the pharmacist could have anticipated that this text message, and the policy that demanded it, would be a lit match dropped onto a dry forest floor.

Both Chan and Lucca knew that Pablo weighed less than 40 kilograms (38.6 to be exact, or about 85 pounds). But here is where worlds — the worlds of policy, practice and computers — collided. The 40-kilogram policy required that Lucca’s original order be weight-based (in milligrams of medication per kilogram of body weight), but the 5 percent policy meant that Chan needed Lucca to reorder the medication in the correct number of milligrams. What should have been a simple order (one double-strength Septra twice daily) had now been rendered hopelessly complex, an error waiting to happen. And so one did.

After receiving Chan’s text message, Lucca reopened the medication-ordering screen in Epic, the electronic health record system used by UCSF. What she needed to do was trivial, and she didn’t give it much thought. She typed “160” into the dose box and clicked “Accept.” She then moved to the next task on her long checklist, believing that she had just ordered the one Septra tablet that she had wanted all along. But she had done something very different.

[...]

Since doses can be ordered in either milligrams or milligrams per kilogram, the computer program needs to decide which one to use as the default setting. (Of course, it could leave the unit [mg versus mg/kg] box blank, forcing the doctor to make a choice every time, which would actually require that the physician stop and think about it, but few systems do that because of the large number of additional clicks it would generate.)

In UCSF’s version of Epic, the decision was made to have the screen default to milligrams per kilogram for all kids weighing less than 40 kilograms, in keeping with the weight-based dosing policy. That seemingly innocent decision meant that, in typing 160, Lucca was actually ordering 160 mg per kg — not one double-strength Septra, but 38½ of them.
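
The trap is easier to see in code. Below is a hypothetical sketch of that default-unit behavior; the function names and data model are invented for illustration and do not reflect how Epic actually stores orders.

```python
# Hypothetical sketch of the default-unit trap; names are invented for
# illustration and do not reflect Epic's real order model.

WEIGHT_BASED_CUTOFF_KG = 40   # weight-based dosing policy for small children
TABLET_MG = 160               # one double-strength Septra tablet

def default_unit(weight_kg: float) -> str:
    # The dose box defaults to mg/kg for any child under 40 kg.
    return "mg/kg" if weight_kg < WEIGHT_BASED_CUTOFF_KG else "mg"

def total_dose_mg(entered_value: float, unit: str, weight_kg: float) -> float:
    return entered_value * weight_kg if unit == "mg/kg" else entered_value

weight_kg = 38.6
unit = default_unit(weight_kg)              # "mg/kg", since Pablo is under 40 kg
dose = total_dose_mg(160, unit, weight_kg)  # the physician meant 160 mg

print(dose, dose / TABLET_MG)               # 6176.0 38.6: roughly 38½ tablets' worth
```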

In a seminal 1983 article, Lisanne Bainbridge, a psychologist at University College London, described the irony of automation:

“The more advanced a control system is,” she wrote, “so the more crucial may be the contribution of the human operator.” In a famous 1995 case, the cruise ship Royal Majesty ran aground off the coast of Nantucket Island after a GPS-based navigation system failed due to a frayed electrical connection. The crew members trusted their automated system so much that they ignored a half-dozen visual clues during the more than 30 hours that preceded the ship’s grounding, when the Royal Majesty was 17 miles off course.

In a dramatic study illustrating the hazards of overreliance on automation, Kathleen Mosier, an industrial and organizational psychologist at San Francisco State University, observed experienced commercial pilots in a flight simulator. The pilots were confronted with a warning light that pointed to an engine fire, although several other indicators signified that this warning was exceedingly likely to be a false alarm. All 21 of the pilots who saw the warning decided to shut down the intact engine, a dangerous move. In subsequent interviews, two-thirds of these pilots who saw the engine fire warning described seeing at least one other indicator on their display that confirmed the fire. In fact, there had been no such additional warning. Mosier called this phenomenon “phantom memory.”

Computer engineers and psychologists have worked hard to understand and manage the thorny problem of automation complacency. Even aviation, which has paid so much attention to thoughtful cockpit automation, is rethinking its approach after several high-profile accidents, most notably the crash of Air France 447 off the coast of Brazil in 2009, that reflect problems at the machine–pilot interface. In that tragedy, a failure of the plane’s speed sensors threw off many of the Airbus A330’s automated cockpit systems, and a junior pilot found himself flying a plane that he was, in essence, unfamiliar with. His incorrect response to the plane’s stall — pulling the nose up when he should have pointed it down to regain airspeed — ultimately doomed the 228 people on board. Two major thrusts of aviation’s new approach are to train pilots to fly the plane even when the automation fails, and to prompt them to switch off the autopilot at regular intervals to ensure that they remain engaged and alert.

This bias grows over time as the computers demonstrate their value and their accuracy (in other words, their trustworthiness), as they usually do. Today’s computers, with all their humanlike characteristics such as speech and the ability to answer questions or to anticipate our needs (think about how Google finishes your thoughts while you’re typing in a search query), engender even more trust, sometimes beyond what they deserve.

The warnings in cockpits are now prioritized to reduce alarm fatigue:

“We work very hard to avoid false positives because false positives are one of the worst things you could do to any warning system. It just makes people tune them out.”

[...]

Because of this process, the percentage of flights that have any alerts whatsoever — warnings, cautions, or advisories — is low, well below 10 percent.
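
Here is a rough sketch of what such a prioritized alerting scheme might look like; the warning/caution/advisory tiers come from the passage above, but the confidence threshold and suppression rule are invented for illustration.

```python
# Illustrative only: the tiers mirror the warning/caution/advisory hierarchy
# quoted above, but the confidence filter and thresholds are invented.

from dataclasses import dataclass

@dataclass
class Alert:
    message: str
    severity: str      # "warning" > "caution" > "advisory"
    confidence: float  # estimated probability that the condition is real

SEVERITY_RANK = {"warning": 0, "caution": 1, "advisory": 2}
MIN_CONFIDENCE = 0.9   # suppress likely false positives rather than show them

def present(alerts: list[Alert]) -> list[Alert]:
    """Keep only high-confidence alerts, most severe first."""
    kept = [a for a in alerts if a.confidence >= MIN_CONFIDENCE]
    return sorted(kept, key=lambda a: SEVERITY_RANK[a.severity])

for alert in present([
    Alert("Cabin altitude", "warning", confidence=0.97),
    Alert("Door sensor fault", "caution", confidence=0.40),  # dropped: likely false positive
    Alert("Fuel imbalance", "advisory", confidence=0.95),
]):
    print(alert.severity.upper(), "-", alert.message)
# WARNING - Cabin altitude
# ADVISORY - Fuel imbalance
```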

The goal of root cause analysis is to concentrate on system flaws:

[James] Reason’s insight, drawn mainly from studying errors outside of healthcare, was that trying to prevent mistakes by admonishing people to be more careful is unproductive and largely futile, akin to trying to sidestep the law of gravity.

(From The Digital Doctor: Hope, Hype, and Harm at the Dawn of Medicine’s Computer Age, by Robert Wachter. Hat tip to T. Greer.)

Comments

  1. Ross says:

    Chasing RCA may be a good first step, but what will eventually solve these problems in complex systems — especially those systems involving human/computer interaction?

    Cribbing from Dekker and others working in resilience engineering, looking for a ‘root cause’ for a failure is as limiting and wrong-headed as looking for a root cause for a success. It’s simply not a good way to capture the interaction of all the parts of the system.
