Back in the days of analog indicators and pneumatic control on the plant floor, Harvey Ivey entered the energy industry as the low man on the totem pole, shoveling coal and sweeping floors before moving into the role of operator. Today, 36 years later, he is the manager of instrumentation and control systems and support at Southern Company, the Atlanta-based producer of electricity, where he oversees more than 100 units in the field.
Over the years, Ivey has seen many changes, most notably the move to distributed control systems (DCS) that deliver data in a digital format. From a safety standpoint, the digital displays gave operators more information about what was happening in the plant, Ivey says. In addition, in the digital world, it was easier to add numerous alerts and alarms. Armed with all of this new information, operators could quickly react to developing incidents, right?
Wrong. Ivey noticed that over the years operators were making more mistakes than they had in the past. But the human error was not due to negligence. Rather, it was a symptom of a deeper problem.
“We flooded the operator with information and increased his accountability for more equipment, but we didn’t give him the right toolset to manage it all,” Ivey says.
Specifically, the human-machine interface (HMI), which is the window into every plant floor process controlled by the DCS, has been clouding the operator’s view. Either older, customized HMIs were not updated when a new DCS was deployed, limiting the data that was collected and displayed, or, more likely, the HMI was designed by engineers focused on putting as much data as possible in front of people without considering the impact on the operator.
“We upgraded everything in the control room except the operator, who is the same human with the same potential and liabilities,” says Ivey.
Indeed, industry observers say most industrial accidents result from human error, typically because the data the HMI displays is difficult to assess. During the analog days, all it took was a glance at a pressure or temperature gauge to know intuitively if all was okay. But in today’s digital world, operators are inundated by numbers that require a bit of “mental gymnastics” to process what’s happening, Ivey says. “We have almost come to the point of unmanageable complexity. We’ve thrown so much at an operator that it is hard for him to process it all.”
To stem this deluge of data, the industry is developing operator-effective HMIs that present information in a manner humans can absorb quickly. Next-generation HMIs play off a person’s innate pattern recognition ability, industry experts say, and make strategic use of color and of graphical representations of controls and instruments.
“Regardless of how great a control system is, if you don’t have the proper front end to show information to operators in context and in an ergonomically correct way, you are compromising the operator’s ability to safely manage production,” says Eddie Habibi, CEO of Houston-based PAS, a provider of process automation, alarm management and HMI-related products.
In addition, a new approach to alarm management is needed to offset the daily alarm floods that overwhelm, distract and fatigue operators, impairing their ability to manage a situation.
“It is crazy how many alarms go off in a control room,” says Tom Williams, program manager for operator effectiveness at Honeywell Process Solutions and the director of the ASM Consortium, an R&D group defining best practices, products and tools for effectively managing abnormal situations.
According to ASM, there are a number of factors that can contribute to the onset and escalation of an abnormal situation. A key component of managing abnormal situations in initial stages is the intervention activities of the operations team. But if alarms don’t locate the root cause of a situation, they are contributing to the human error that can develop into a disaster.
Incident update
The general public is aware of high-profile accidents, such as the 2010 explosion on the BP Deepwater Horizon oil rig in the Gulf of Mexico that killed 11 people and released, according to BP, an estimated 3.26 million barrels of oil into the ocean. But almost every day there are reports of fires, explosions or chemical leaks at process plants around the globe. Last year, the ASM Consortium recorded over 1,000 incidents on its website.
Chemical company BASF, which maintains a database of all of its own plant incidents, identified that these events are often triggered by an operator’s delayed reaction, notes Keith Dicharry, BASF North America’s automation director and principal expert of process control. As an engineer, Dicharry believed that putting as much information as possible in front of the operator would help him or her to make better decisions.
“In the last few years, we’ve realized that is not true,” Dicharry says. “That’s when we began to look at what we were doing wrong from an automation standpoint.” And to really understand the process, “I needed to get my mind around being an actual operator,” he says.
Similarly, Southern Company enlisted the help of its control system operators to work with engineers on the internal design of a high-performance HMI, dubbed PowerGraphiX, which is now live in the control room of a 750 MW unit. The operators provided insight into the type of information and graphics they needed, and then Ivey worked closely with PAS’s Habibi to incorporate the right context and visualization into the display. (Habibi co-authored The High Performance HMI Handbook and The Alarm Management Handbook.) They also organized the screens and reduced their number, whittling down the graphics associated with the control system, equipment and diagnostics from about 600 to 80.
Now, Southern Company operators can perform about 80 percent of their daily tasks from six graphics, Ivey says, and it takes just two clicks to navigate anywhere they need to go in the system. More importantly, operators are empowered to make better decisions.
“The operator is a knowledge worker…consuming so much information all day,” says Habibi. Not only do they need to be empowered, but there needs to be a culture shift in companies to recognize the critical role operators play in the entire organization, he says. “The CEO is important, the engineers are important, but the most important person whose minute-by-minute decisions can define the difference between profitability and loss, and between safe operations and accidents, is the operator.”
The abnormal situation matrix
Before launching a new HMI initiative, organizations need a strategy. It starts by assessing the existing environment and taking inventory of machines, instruments, controls and alarms tied to each aspect of the plant floor.
Each time BASF embarks upon an HMI/alarm management upgrade, the company conducts a control room effectiveness study, drilling down into every tool given to the operators. They then go through a documentation and rationalization exercise, analyzing every process to determine if the alarm associated with it is needed. When the procedure is finished, only processes that require an operator to physically do something, like a visual check, will get an alarm, Dicharry says.
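The rationalization rule described above, in which an alarm survives only if it requires the operator to act, can be illustrated with a minimal Python sketch. The tag names, descriptions and the `requires_operator_action` flag are hypothetical and stand in for whatever a real documentation exercise would record; this is not BASF’s actual procedure or tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateAlarm:
    tag: str                        # hypothetical instrument tag
    description: str
    requires_operator_action: bool  # assumed outcome of the review


def rationalize(candidates):
    """Keep only alarms whose condition requires the operator to do something."""
    return [a for a in candidates if a.requires_operator_action]


# Illustrative data only; not taken from any real plant.
candidates = [
    CandidateAlarm("PT-101", "Reactor pressure high", True),
    CandidateAlarm("FT-205", "Pump started (status only)", False),
    CandidateAlarm("LT-310", "Tank level low, check sight glass", True),
]

kept = rationalize(candidates)
kept_tags = [a.tag for a in kept]
print(kept_tags)
```

In this sketch the status-only event drops out of the alarm list, while the two conditions that call for a response remain.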
Dicharry took ASM Consortium guidelines into consideration and worked with PAS during the rationalization process and development of a high-performance HMI. In the high-performance graphics, an operator sees an image that, ironically, resembles an old panel board, with measurement needles and red lines against a grayscale background that show operating limits at a glance.
Digital displays often have every color of the rainbow flashing a bunch of numbers on the computer screen—resulting in a real headache. “Now we are seeing the adoption of grayscale graphics, providing a simple view of the manufacturing plant,” says Roy Tanner, ABB’s global marketing manager for the 800xA DCS. “The only time you see color is when there is a problem that needs to be addressed.”
The same simplified philosophy applies to rationalizing alarms through a hierarchy of four levels: Level 1 process area overview; Level 2 process unit control; Level 3 process equipment detail; and Level 4 unit diagnostics. And many of the DCS vendors are focusing on new alarm analytics that will help engineers and operators identify duplicate, bad or stale alarms that contribute to alarm floods.
Honeywell Process Solutions, for example, has built an alarm tracker that will prioritize and group alarms so that, when an alarm flood occurs, there is a visual track back through the stream to where the alarm originated. In addition, built into Honeywell’s HMI is a visual thesaurus that provides a structured approach to the four-level hierarchy. “We are starting to look more carefully at how we are grouping things and presenting it in ways that people already understand,” says Williams.
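As a rough illustration of the kind of alarm analytics mentioned above, the following Python sketch flags duplicate and stale alarms in a list of active alarms. It is not any vendor’s implementation. The record layout (tag plus the time the alarm became active), the definition of a duplicate as a tag that appears more than once, and the 24-hour staleness threshold are all assumptions made for the example.

```python
from collections import Counter
from datetime import datetime, timedelta


def find_duplicates(active_alarms):
    """Return tags that appear more than once among the active alarms."""
    counts = Counter(tag for tag, _ in active_alarms)
    return sorted(tag for tag, n in counts.items() if n > 1)


def find_stale(active_alarms, now, max_age=timedelta(hours=24)):
    """Return tags of alarms that have been active longer than max_age.

    The 24-hour default is an illustrative assumption.
    """
    return sorted({tag for tag, since in active_alarms if now - since > max_age})


# Illustrative data only: (tag, time the alarm became active).
now = datetime(2024, 1, 2, 12, 0)
active = [
    ("TI-100", datetime(2024, 1, 2, 11, 55)),
    ("TI-100", datetime(2024, 1, 2, 11, 56)),   # same tag raised twice
    ("LI-200", datetime(2023, 12, 30, 8, 0)),   # active for several days
    ("PI-300", datetime(2024, 1, 2, 9, 0)),
]

duplicates = find_duplicates(active)
stale = find_stale(active, now)
print(duplicates, stale)
```

A report like this gives engineers a short list of alarms to review, rather than leaving operators to filter the noise during a flood.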
By building an HMI around patterns, recognizable symbols and priority levels, there is an inherent standardization involved as well. Most organizations have a mix of DCSs in the plants, and the goal is to deliver the same information to the operator regardless of the platform.
“It would be fantastic if I could have my reliability engineers across the globe comparing data and specs for different systems,” says BASF’s Dicharry. “This is about global visibility so that I can have my best resources across the globe looking at the same data at the same time in the same format, which is allowing them to troubleshoot.”
A standard approach would also put companies in a better position to incorporate predictive analytics. Now that Southern Company has mapped out an upgrade path to high-performance HMI that it will roll out to all of its plants over the next five years, the company is proactively looking at ways to stop an incident before it ever starts. According to Ivey, the company is working with Emerson Process Management on a prognostic control technology, which includes embedding a process simulator into a control system.
“You can tell an operator where you are now, and based on the prognostic control, if things stay the same, where you will be in 15 minutes,” Ivey says.
It is an early indicator of trouble ahead. Emerson is applying predictive technology to specialized areas as well, such as vibration, an area that typically needs a skilled subject matter expert because not every operator understands the difference between good and bad vibration on a machine. Emerson’s PeakVue signal, however, boils it down by using the “zero principle” and the “Rule of 10s.” In this setup, 10 indicates a problem has developed on a machine, 20 means the problem has become serious, 40 means the problem is critical, and so on.
“It is now possible for operators with little or no understanding of vibration analysis to distinguish between machines which are in good condition and machines in distress,” says Robert Skeirik, senior product manager at Emerson’s machinery health group. “This creates a new paradigm between operations and maintenance.”
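The Rule of 10s levels quoted above lend themselves to a simple lookup. The Python sketch below maps a reading to a severity label using only the three levels named in the article (10, 20 and 40). Treating each level as a lower threshold, the label used below 10 and the function name `rule_of_10s_severity` are assumptions for illustration, not Emerson’s published logic.

```python
def rule_of_10s_severity(value):
    """Map a reading to a severity label using the 10/20/40 levels.

    Assumption: each level acts as a lower bound for its label.
    """
    if value < 0:
        raise ValueError("reading must be non-negative")
    if value >= 40:
        return "critical"
    if value >= 20:
        return "serious"
    if value >= 10:
        return "problem developed"
    return "no problem indicated"  # assumed label for readings below 10


readings = [0, 10, 25, 40]
labels = [rule_of_10s_severity(v) for v in readings]
print(labels)
```

The point of the scheme is that an operator needs to remember only a handful of round numbers, not the details of vibration analysis.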
Similarly, PAS is pioneering an area it calls “boundary management,” which follows the principle that every process in a plant has mechanical, operational and safety boundaries, all of which are defined in the original design of the plant. There is a co-dependency among all of the equipment and, if something is out of sync, an operator needs to have visibility into the boundaries of the plant. PAS worked with some of its key customers to develop an algorithm that detects when a boundary has been crossed and, integrated with alarm management, sends a notification to the right people.
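The boundary idea can be sketched in a few lines of Python. This is not the PAS algorithm, whose details the article does not describe; it only shows the general pattern of checking a value against mechanical, operational and safety limits and routing a message to whoever is responsible. The limit values, the tag name and the recipient mapping are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Boundary:
    kind: str    # "mechanical", "operational" or "safety"
    low: float
    high: float


# Hypothetical routing of each boundary type to a recipient.
RECIPIENTS = {
    "operational": "operator",
    "safety": "safety engineer",
    "mechanical": "maintenance",
}


def crossed_boundaries(value, boundaries):
    """Return the kinds of boundaries that the value falls outside."""
    return [b.kind for b in boundaries if not (b.low <= value <= b.high)]


def build_notifications(tag, value, boundaries):
    """Return (recipient, message) pairs for every crossed boundary."""
    return [
        (RECIPIENTS[kind], f"{tag}={value} crossed {kind} boundary")
        for kind in crossed_boundaries(value, boundaries)
    ]


# Illustrative pressure limits for a single vessel (arbitrary units).
vessel_limits = [
    Boundary("operational", 2.0, 8.0),
    Boundary("safety", 0.0, 10.0),
    Boundary("mechanical", 0.0, 12.0),
]

notifications = build_notifications("PT-101", 10.5, vessel_limits)
print(notifications)
```

In this example a reading of 10.5 is outside the operational and safety limits but still inside the mechanical limit, so only the operator and the safety engineer are notified.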
Ultimately, it is up to the operator to make the right decision in order to avert an incident. The operator, Habibi says, is critical to the entire process. “We monitor and feed the operator the information he needs in real time. It is about helping him make the right decision, because his reliability is just as important as equipment reliability.”