An oil refinery in Europe is hit by lightning and a fire breaks out. Five hours later, a vessel overfills, causing liquid to enter a flare line that was designed for vapor usage only. When the line fails, twenty tons of flammable liquid hydrocarbons are released. The resulting explosion and fire cause 0 million in plant damages, 26 minor injuries and additional unspecified costs due to lost production.
What went wrong? The investigation into the incident identified many causes, including: a misleading indication of a control valve position; a plant modification that had not been properly assessed; control panel graphics that did not include overviews; and excessive alarms—two operators were confronted with 275 alarms during the 11 minutes before the explosion.
This real-life case was presented at a Honeywell Users’ Group meeting last June in Phoenix, as part of a discussion on managing abnormal situations. In their paper titled “You Too Can Use Abnormal Situation Management to Make a Difference at Your Site,” Bradley G. Houk of ExxonMobil and Jamie Errington of NOVA Chemicals presented an overview of Abnormal Situation Management (ASM). The pair cited examples in which ASM could have reduced primary causes of incidents and discussed the value their companies derive from membership in a group known as the ASM Consortium.
The ASM Consortium, which is led by Honeywell, comprises leading oil and chemical companies and other organizations with specialty expertise. (See sidebar for more information.) In fact, ASM and Abnormal Situation Management are registered trademarks of Honeywell Inc. This doesn’t mean, however, that ASM is limited to Honeywell and its customers in the oil and chemical industries. Industry analysts and others use terms such as Abnormal Condition Management (ACM) and Critical Condition Management (CCM) when discussing the management of plant incidents.
Whatever the acronym, the definition cited on the ASM Consortium Web site holds true: “An abnormal situation is a disturbance or series of disturbances in a process that causes plant operations to deviate from their normal operating state.” The disturbances may be minimal or catastrophic, and cause production losses or, in serious cases, endanger human life.
Abnormal conditions can occur in any processing environment, from oil refineries and chemical plants to food, pharmaceutical and biotechnology manufacturing. Based on research by the ASM Consortium, and as reported by the National Institute of Standards and Technology (NIST), process incidents resulting from abnormal situations cost U.S. manufacturers about 0 billion per year in lost production, and an additional 0 billion per year in associated costs such as equipment repair.
“Those figures may be conservative,” says Kevin Harris, director of the ASM Consortium and a marketing and technology director with Honeywell Industry Solutions in Phoenix. Process disruptions contribute to plant capacity losses of 3 percent to 8 percent per year, according to the Consortium. “We focus not just on the ‘big bangs’ but also on the minor disturbances and perturbations that, added up over the years, represent a huge chunk of the profit at a site,” adds Harris.
Three sources
The ASM Consortium identifies three principal sources of abnormal situations: people or work context factors; equipment factors; and process factors.
The people factor, which accounts for an average of 42 percent of incidents, is shaped by the training, skill and experience levels of the operations teams and by their stress levels when situations reach alarm conditions. The organizational structure, communications, environment and documented procedures and practices (or lack thereof) also play a role in operator response.
Equipment factors, which account for an average 36 percent of incidents, include degradation and failures in the process equipment, such as pumps, compressors and furnaces, and failures in the control equipment, such as sensors, valves and controllers.
Process factors account for an average of 22 percent of incidents. Influences include process complexity, types of materials and manufacturing (batch vs. continuous) and state of operation (steady state vs. startups, shutdowns and transitions).
Industry analyst Asish Ghosh, vice president of Manufacturing Advisory Services for ARC Advisory Group, Dedham, Mass., says critical conditions are caused by a combination of events that are not normally expected to happen at the same time. “There are layers of protection in a critical condition management system,” says Ghosh. The first layer is the process controls, usually a distributed control system (DCS) or programmable logic controller (PLC)-based system that has safety interlocks and exception logic. The next layer is the dedicated safety shutdown system, which shuts down the process unit completely if the process control system fails to manage the incident. All of the major vendors of large process control systems—including ABB, Emerson Process Management, Honeywell, Invensys, Siemens and Yokogawa—include safety solutions as part of their systems. The final layer of safety is the fire and gas protection system, which prevents and contains catastrophic incidents.
Explains Ghosh, “If the CCM, or Critical Condition Management system, does not work, for whatever reason, there are safety shutdown procedures that keep the plant safe, but can cost millions of dollars in lost production. The ideal CCM should prevent a system shutdown by giving the operators guidance through alarm filters and deductive alarm notification, and supporting the operators in the recovery process.”
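To make the idea of an alarm filter concrete, here is a minimal sketch in Python. It is not taken from any of the products or vendors named in this article; the tag names, priority levels, 60-second repeat window and display limit are all hypothetical choices made for illustration. It suppresses repeats of the same alarm inside the window, then ranks what remains by priority so the operator sees a short list.

```python
from dataclasses import dataclass

# Hypothetical priority ranking; a real site would define its own scheme.
PRIORITY_RANK = {"critical": 0, "high": 1, "low": 2}


@dataclass(frozen=True)
class Alarm:
    tag: str        # instrument or point name (illustrative only)
    priority: str   # one of the PRIORITY_RANK keys
    time_s: float   # seconds since the start of the event


def filter_alarms(alarms, repeat_window_s=60.0, max_shown=5):
    """Suppress repeats of a tag inside the window, then rank by priority."""
    last_kept = {}  # tag -> time of the most recent alarm we kept
    kept = []
    for alarm in sorted(alarms, key=lambda a: a.time_s):
        previous = last_kept.get(alarm.tag)
        if previous is not None and alarm.time_s - previous < repeat_window_s:
            continue  # repeat of an alarm already shown: suppress it
        last_kept[alarm.tag] = alarm.time_s
        kept.append(alarm)
    # Most urgent first; ties are broken by time of occurrence.
    kept.sort(key=lambda a: (PRIORITY_RANK[a.priority], a.time_s))
    return kept[:max_shown]


if __name__ == "__main__":
    sample = [
        Alarm("LT-101", "high", 0),
        Alarm("PT-7", "low", 5),
        Alarm("LT-101", "high", 10),   # repeat within 60 s, suppressed
        Alarm("FV-3", "critical", 20),
        Alarm("LT-101", "high", 90),   # outside the window, shown again
    ]
    for alarm in filter_alarms(sample):
        print(alarm.priority, alarm.tag, alarm.time_s)
```

A production system would add much more, such as state-based suppression during startups and shutdowns, but even this simple pass shows how filtering can shrink what an operator has to read.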
Software + strategy
Ghosh says the concept of CCM, which runs across all plant protection layers, is relatively new and fills a gap between control and shutdown systems. Once ruled by proprietary solutions, CCM has seen recent announcements of off-the-shelf software packages to manage critical events. Ghosh cites Danville, Calif.-based Control Arts, with its alarm enforcer and history analysis package; Burlington, Mass.-based Gensym’s Optegrity, which uses its G2 real-time expert system to develop and deploy CCM solutions; Honeywell’s Asset Manager PKS and other solutions to minimize incidents and maximize asset availability; Edmonton, Alberta, Canada-based Matrikon, with its ProcessGuard alarm and event management tool; Kingwood, Texas-based Nexus Engineering’s Nexus OZ, which also uses expert systems to provide reliability and diagnostic information to existing operator consoles; and Houston-based Plant Automation Services’ AMO Plus for alarm management across many DCS, PLC and panel-mounted alarm systems.
Software concepts and strategies are part of the deliverables of the ASM Consortium. In 1994, the Consortium received 6.6 million in funding from NIST for researching and prototyping ASM solutions. Says director Harris, “The Consortium developed a concept for a suite of software that would allow operators to manage abnormal situations. A lot of those ideas have now started to come forward in new Honeywell products and services, such as the Operator Performance Solution suite and alarm management services.”
Consortium member companies, many of which have been impacted by merger and acquisition activity, need to apply ASM across a variety of process control systems, not just those from Honeywell. The Consortium supports this by developing solutions that are vendor-independent and employ software standards.
One example is the guidelines documents developed by the Consortium. Says Harris, “The display guidelines document recommends design principles in a generic, non-equipment specific sense. It outlines about 85 guidelines on how operator displays should work, starting at the top with hierarchy and philosophy recommendations and going down to details such as what color lines should be on schematics.”
The ASM application packages use open standards, such as OLE for Process Control (OPC), and can work with any system that supports the open interface. “A large part of what we do is applicable to whatever system is used,” Harris explains.
Consortium members pay an initiation fee and annual dues. These monies are used to fund ongoing research. Meaningful research in this area is very expensive, in terms of dollars and resources for people and site access. “By collaborating,” says Harris, “we can benchmark across different companies and plant sites to determine best practices and root causes for incidents. The fundamental approach is a pooling of knowledge so everyone ends up with a lot more than if they had done this by themselves.”
The proof is in the practice. In the European oil refinery incident presented at the Honeywell Users’ Group meeting, the paper’s authors contend that the flood and mis-prioritization of alarms (over 87 percent in high alarm) could have been prevented with proper overview displays and ASM alarm management services.
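A little arithmetic shows how heavy that alarm flood was. The short Python sketch below uses only the figures reported for the incident: 275 alarms, 11 minutes and two operators. The even split of alarms between the two operators is an assumption made here for illustration; the incident report as quoted does not say how the load was shared.

```python
def alarm_load(alarms, minutes, operators):
    """Return (alarms per minute, alarms per operator per minute,
    seconds between alarms for each operator).

    Assumes the alarms are shared evenly among the operators, which is
    an illustrative simplification rather than a reported fact.
    """
    per_minute = alarms / minutes
    per_operator = per_minute / operators
    seconds_between = 60 / per_operator
    return per_minute, per_operator, seconds_between


if __name__ == "__main__":
    per_minute, per_operator, seconds_between = alarm_load(275, 11, 2)
    print(f"{per_minute:.1f} alarms per minute overall")
    print(f"{per_operator:.1f} alarms per minute per operator")
    print(f"one alarm every {seconds_between:.1f} seconds per operator")
```

Under that assumption, each operator faced a new alarm roughly every five seconds, with most of them flagged as high priority.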
Concludes Harris, “I’ve been around the industry for 29 years, seen a lot of control rooms and talked to a lot of operators. Recently, I visited the large control room of one of our member companies’ world-class sites for ASM. There was not a single alarm on their screens. I had never encountered that before. Their operators do not manage the process by reacting to alarms; they are way out ahead.”
See sidebar to this article: Dashboard drives down road to profits