Historian Best Practices: Distilling the Truth

Historians enable deep insight into the behavior of a process at a given point in time, but depending on which member of the automation team you are, you may not know enough about the data you’re capturing or how to optimize its use.

Renee Bassett

Aug. 20, 2012

It was the case of the missing product. A full tank of product literally disappeared from its batch tank and supervisors had no idea where it went. Luckily, a data historian was in place. The system had documented all digital and analog measurement and control device values associated with the process, and by examining the data, management could quickly pinpoint when the level in the tank started dropping. Then, they overlaid historic trend data on all the on/off valves around the tank. From this, it was clear that the tank drain valve had been activated on a specific date and time, sending a full tank’s worth of product down the drain. Referencing separate data for that device at that time, supervisors were able to identify which operator had manually activated the drain valve and caused the product loss.

Enabling such deep insight into the behavior of a process at any given point in time is the purpose of a historian, which is a software application with three main functions: acquiring real-time data, storing that data, and making it accessible to other applications for presentation to the end user. Many process plants installed such electronic data recording software when they installed their control system. But, depending on which member of the automation team you are, whether plant manager, engineer, operator or information technology (IT) specialist, you may not know enough about the data you’re capturing or how to optimize its use.

“Many plant managers know they need a historian, but never see it in use,” says Marc Leroux, global marketing for Collaborative Production Management technology at automation systems vendor ABB (www.abb.com), Columbus, Ohio. “The data is often used by engineers or quality control, but not by management (other than through preconfigured reports). One part of the challenge is that the delivery mechanism, or how management is supposed to interact with the data, is not defined. It needs to be something easy to use, and end user (non-IT) configurable.”

Jim Kline, product line manager for Collaborative Production Management technology, says ABB typically sees three types of users of historians:

• Those who buy a historian to use some pre-configured application, such as alarm analysis, overall equipment effectiveness (OEE) or energy management. “In these cases, the historian design and structures are preconfigured to meet the application requirements,” Kline says.

• Those who implement a historian to solve a specific problem, often regulatory. “They collect the information that is needed for compliance, then will often go back to add additional data points when someone asks specific questions,” he says.

• The other type of user collects all data that they can easily configure, and waits until they have a question to answer. The belief is that, no matter what the case, the data is there.

“What we often see, in the latter two cases, is that the data is there, but without context. And data without context is just data,” says Kline.

How much data
To manage real-time process data so it becomes information, you need to know what you have coming in, and what you want to do with it. Colin More, applications engineer with global engineering and construction company M+W Group (formerly Global Automation Partners or GAP, www.mwgroup.net), says, “A historian can acquire different types of data from analog (floating point and integer), discrete (on/off, open/close, 0/1), string information, batch information, alarms and events. Historians need to serve as a platform to link disparate, real-time data sources into a single application. The application can of course render the information in the form of a trend, but historians can also provide the perfect platform to support applications that create [information] dashboards, compare batches, create reports, and more.”

Systems integrators are often responsible for writing the detailed business intelligence specifications that serve as the foundation for knowing what data to collect and how they intend to use it. “Our applications vary, but in general they are factory floor automation projects in the food, beverage and dairy industries. As we are typically much more experienced in [data collection and use] than our clients, we bring lots of examples from their industry,” says David McCarthy, president and chief executive officer of TriCore, Inc. (www.tricore.com), Racine, Wis. “A dairy will be interested in different things than a soft drink bottler. We bring that perspective to our clients. We are normally installing new historians or upgrading old ones and as a general rule, we historize all analog points, and all critical discrete points.”

200 Mbytes/day
Just how much data can you expect to store? The answer is “a lot” and possibly too much or not enough. “Most data is being analyzed every five seconds,” says McCarthy, generalizing about his clients. “Data is logged only on change in value, to prevent the same value being written over and over. This helps greatly with data compression. Still, vast amounts of data tend to be logged, typically 2,000 to 5,000 data points at varying intervals. Actual data accumulation ranges from about 70 to 200 Mbytes/day.”
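
The log-on-change approach McCarthy describes can be illustrated with a minimal Python sketch. The function name `log_on_change`, the `deadband` parameter and the sample values are hypothetical, chosen only for illustration; commercial historians use their own compression algorithms, which are typically more sophisticated than this.

```python
from typing import Iterable, List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, value)


def log_on_change(samples: Iterable[Sample], deadband: float = 0.0) -> List[Sample]:
    """Keep a sample only when its value differs from the last stored
    value by more than `deadband` (a hypothetical tuning parameter)."""
    stored: List[Sample] = []
    for timestamp, value in samples:
        if not stored or abs(value - stored[-1][1]) > deadband:
            stored.append((timestamp, value))
    return stored


# Illustrative data: a tank level scanned every five seconds.
scans = [(0, 50.0), (5, 50.0), (10, 50.0), (15, 50.4), (20, 50.4), (25, 51.0)]
logged = log_on_change(scans)
print(logged)  # [(0, 50.0), (15, 50.4), (25, 51.0)]
```

In this made-up example, six scans shrink to three stored values. Widening the deadband stores even less, at the cost of resolution, which is the trade-off behind any compression setting.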

M+W’s More says, “A typical system that would use a historian could range from a few hundred data points to hundreds of thousands of points. Historians are built to acquire massive amounts of data for these systems on a real-time basis (in the second/millisecond range). Some historians can also collect some relational data.”

ABB’s Kline says, “We hear a lot about ‘Big Data’ and the presumption is that all data is the same. It is important to understand the context of the data, and how it applies to the process. We have seen, for the past several decades, that it is difficult for IT-centric analysts to understand the process environment. That does not change when data is put into a historian.”

Real-time process data decisions typically are made by the engineers in the operation. “These decisions are often made initially based on process variable trending requirements, but as more advanced applications are implemented, [these applications] often require historized data as an input, and those requirements may determine additional process data to be historized,” says Peter Martin, vice president of business value consulting for automation vendor Invensys Operations Management (iom.invensys.com) in Foxboro, Mass. (For a discussion of the changing role of plant historians, see the Invensys whitepaper, “Performance-Based Industrial Infrastructures.”)

Like many historian applications, the Wonderware Historian database is preconfigured, explains Elliott Middleton, product manager for Invensys’s Wonderware historian and clients. “The typical historian application is a simple extension to a supervisory control function. Historization then requires only setting the desired tags. Wonderware Intelligence applications, however, do require more detailed design effort,” he adds.

“One theme we have seen again and again,” says ABB’s Kline, “is that there is a specific amount of capital assigned to a historian project, and the implementation goes as far as the money permits. If there was a good infrastructure in place, with everything already connected to a DCS [distributed control system], then it was simple to implement the project. But this is seldom the case, so the decision is often made to collect what is readily available and tackle the others when funding permits. This leads to incomplete data used in analysis.”

Collecting data
“The challenge of collecting information is pretty well solved,” contends Mike Bowbyes, PMP, product marketing manager for Honeywell Process Solutions’ Advanced Solutions Group (www.honeywellprocess.com). “The standard for archiving the data has been around for 15 or 20 years in pretty good form and includes things like compression algorithms that allow you to record everything that happens without storing everything in the database. But companies are moving to a central repository that may push the technical limits.”

Bowbyes says Honeywell’s historian, which has automatic tag synchronization with its Experion control system, supports “up to 2 million tags in a single database, but you could have 2 million tags in a single plant. It’s uncommon for people to want to store data for the life of a plant, but it is a regulatory requirement in some countries. Some have to keep millions of data points gathered at a five-second frequency for 30 years.”
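
To get a feel for the scale Bowbyes describes, the raw sample count can be worked out directly. The sketch below is back-of-the-envelope arithmetic only. The function name is hypothetical, the 2 million tag count is borrowed from the database limit quoted above rather than from any specific plant, and the result counts raw scans before any compression is applied.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def raw_sample_count(tags: int, interval_s: float, years: float) -> int:
    """Number of raw scans for `tags` points sampled every `interval_s`
    seconds over `years` years (365-day years, no compression)."""
    samples_per_tag = years * 365 * SECONDS_PER_DAY / interval_s
    return int(tags * samples_per_tag)


# Assumed scenario: 2 million tags, five-second scans, 30 years of retention.
total = raw_sample_count(tags=2_000_000, interval_s=5, years=30)
print(f"{total:.3e}")  # 3.784e+14
```

That is several hundred trillion raw scans, which is why long retention periods depend so heavily on compression.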

GE Energy’s (www.ge-energy.com) iCenter is an example of a historian application pushing the limits. The iCenter is a 24x7 operation that manages an enterprise-size historian implementation of more than 2 million tags logged every second from over 700 assets at 25 sites around the world. This data is the foundation for advanced analytics used in alarm detection and dispositioning, serves as the primary source for ongoing product development engineering, and supports contracted services agreements that include everything from spare parts to performance guarantees. Proficy Historian from automation vendor GE Intelligent Platforms (www.ge-ip.com) is the foundation, and it’s connected to Proficy SOA, which models the fleet and provides a single point of contact for Proficy Workflow to manage back-end operations, interact with reporting services and visualization, and support connectivity to data warehouses and relational databases used throughout GE Energy.

According to Dan McGuire, program leader of global professional services for GE Intelligent Platforms, “Tag namespace and compression settings were identified early as key design elements of a stable and scalable system. Defining and managing the namespace has made the system predictable and has avoided a chaotic naming scheme that can happen in a multimillion-tag system.

“At the start of the design, we felt we had a good methodology around determining the right compression setting for tags by breaking them into signal types, then using historical data and end-user expertise to select the best settings. Experience has taught us that this needs to be an ongoing process that must become part of regular operations. Business and engineering demands can sometimes require high-resolution data across specific signals in every asset in the fleet, for example, and if this is not controlled through a lifecycle process, then storage and overhead to maintain the data for years can be unpredictable and costly.”
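
The two design elements McGuire names, a managed tag namespace and compression settings chosen per signal type, might be sketched in Python as follows. Everything specific here is an assumption made for illustration: the SITE.ASSET.SIGNAL naming pattern, the signal types, the setting names and their values are all hypothetical and do not reflect the iCenter's actual scheme or any Proficy Historian API.

```python
import re
from dataclasses import dataclass

# Hypothetical naming convention: SITE.ASSET.SIGNAL, e.g. "HOU01.GT07.EXHAUST_TEMP".
TAG_PATTERN = re.compile(r"^[A-Z]{3}\d{2}\.[A-Z]{2}\d{2}\.[A-Z][A-Z0-9_]*$")


@dataclass(frozen=True)
class CompressionSetting:
    deadband_pct: float  # change (percent of span) needed before a value is stored
    max_interval_s: int  # force a stored value at least this often


# Hypothetical defaults per signal type; in practice these would be derived
# from historical data and end-user expertise, then reviewed regularly.
DEFAULTS = {
    "temperature": CompressionSetting(deadband_pct=0.5, max_interval_s=600),
    "vibration": CompressionSetting(deadband_pct=0.1, max_interval_s=60),
    "discrete": CompressionSetting(deadband_pct=0.0, max_interval_s=3600),
}


def register_tag(name: str, signal_type: str) -> CompressionSetting:
    """Reject tags that break the naming convention and return the
    default compression setting for the tag's signal type."""
    if not TAG_PATTERN.match(name):
        raise ValueError(f"tag name does not follow SITE.ASSET.SIGNAL: {name!r}")
    if signal_type not in DEFAULTS:
        raise ValueError(f"unknown signal type: {signal_type!r}")
    return DEFAULTS[signal_type]


setting = register_tag("HOU01.GT07.EXHAUST_TEMP", "temperature")
print(setting.deadband_pct, setting.max_interval_s)  # 0.5 600
```

Routing every new tag through one gate like this keeps names predictable, and keeping the per-type defaults in one table makes a periodic review of compression settings easier to run as part of regular operations.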

Which data the iCenter collects is very dependent on the project objectives, says McGuire. “If our end goal is to optimize and have visibility into a power cogeneration KPI, we need to historize and collect raw data for energy consumption, energy generated, energy exported to the grid, and with that, we can generate different views, summarize data, and other reports.”

Typically, critical quality measurements and ambient properties important to the process are collected, such as temperature, pressure and humidity. “Tags are almost always added later, after the value of the historian is established. The sampling rate is usually taken as ‘as fast as possible,’ but this also depends on the industry, the kind of information and the goal to be accomplished as the end result,” McGuire adds.

Making connections
The infrastructure in place is often a key factor when deciding how to collect data. McGuire says, “We collect data via OPC, Modbus, File Collectors, API, SDK, and more. And many times extra network drops are required to collect additional data.”

M+W’s More, who specializes in FactoryTalk Historian from Rockwell Automation, says the application can connect to almost any type of real-time control system. “Today, it is most common to use OPC servers for interfacing, but the system will support almost anything: ODBC, Batch executive, Modbus, etc. The data gathered from FactoryTalk Historian then can be accessed by standard Rockwell FactoryTalk tools such as, but not limited to, ProcessBook, Datalink, VantagePoint, and BatchView.”

Sometimes getting data out of legacy control systems is hard, but “OPC has helped us with that tremendously,” says Honeywell’s Bowbyes. “There are still issues with moving data across networks and firewalls with OPC, and OPC-UA will help with that, but the standard is there for real-time and historical data.” OPC is a series of data access standards specifications for connecting industrial automation and enterprise systems. There are currently seven OPC standards specifications completed or in development.

No matter how you get your real-time data or how you store it, the key thing to remember is to design in flexibility.

“We are seeing new use cases come up that a historian needs to support,” says ABB’s Leroux. “They involve unstructured networks of disconnected data (such as data collected on ships that may not be synced for several days or weeks), data from exponentially large sources (such as Smart Grids that start at the residence, then go to the substation, then aggregated at the utility), geographically diverse areas (wellheads in disparate oil fields), and the like.

“Finally, a number of our customers are exploring decentralized monitoring and control, where a virtual control room may be in India for an oil platform in the North Sea. The bottom line is that conditions change rapidly, and the use of the historian is going to have to be able to react to these changes.”
