The concept of machine learning is becoming better understood as we increasingly interact with it every day. From Netflix and Amazon recommendations, to Siri and Cortana voice recognition, to Google Maps travel time calculations, we’re all becoming more familiar with machine learning technologies—even if we don't quite realize it yet.
Applying machine learning in industry, however, is a different story. Though several companies are doing it, it’s not nearly as ubiquitous as the consumer-oriented applications mentioned above. And that’s what made a presentation by Kathy Applebaum and Kevin McClusky of Inductive Automation during the Ignition Community Conference 2018 so interesting. In their presentation, they explained the main branches of machine learning, the different types of algorithms applied and—most importantly—what steps industrial users can take to begin using machine learning in their facilities.
Looking at the current state of machine learning use in industry today, Applebaum said predictive maintenance is the principal application for the technology, followed closely by quality control, demand forecasting and robot training.
Given the clear and growing interest in machine learning for industrial applications, McClusky pointed out that Inductive Automation’s Ignition software can now be applied here. With the release of Ignition 7.9.8 this past May, Ignition’s libraries now contain machine learning algorithms that cover a variety of use cases, he said.
Types of machine learning
Applebaum said it's important to understand first that there are three main technology types referenced as machine learning—and no shortage of arguments about the overlap among these three. The three types are:
- Analytics is knowledge discovery, Applebaum said. “You’re probably already doing descriptive analytics (i.e. running reports from databases). Diagnostic analytics adds a ‘why’ component to descriptive analytics to identify the cause of the issue (e.g. Why did the machine break down?). Predictive analytics looks at what could happen in the future, and is not usually very specific, but is based on what’s happened before. Finally, prescriptive analytics builds on predictive analytics by recommending a next step to address the issue.”
- Machine learning itself refers to the automated use of data to learn and improve from experience.
- Artificial intelligence encompasses computational tasks that simulate human intelligence, Applebaum said.
To illustrate how different algorithms work to enable machine learning in any of its three variations, McClusky began by showing how Ignition’s machine learning application can be used to group various data points together. For this demonstration he used quality data measurements (such as temperature and humidity) from individual pieces of a production process. By displaying these data on a graph in Ignition, McClusky showed how a user can view the data from multiple aspects in 2D space. “Ignition categorizes the incoming data from sensors on the fly for use in the graphic comparisons so that you can see how each group [of data] acts compared to each other,” he said.
The machine learning algorithm employed by McClusky in this example is known as K-means, which clusters data points. “K-means doesn't know what the categories represent; it just calculates centers by figuring out where each data point is and how far apart they are,” he explained. This capability makes K-means well suited to categorizing data for defect analysis. For example, as a new part comes through [production], you can use its data to see if it fits into established ‘good quality’ parameters to determine if the part passes or requires further analysis.
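The clustering idea can be sketched in a few lines of plain Python. The (temperature, humidity) readings below are invented for illustration, not data from the presentation, and this is a minimal version of the algorithm rather than Ignition's actual implementation:

```python
import random

def kmeans(points, k, iters=100, seed=0):
    """Minimal K-means (Lloyd's algorithm): alternately assign points
    to their nearest center, then move each center to its cluster mean."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign p to the center with the smallest squared distance.
            nearest = min(range(k), key=lambda i: (p[0] - centers[i][0]) ** 2
                                                + (p[1] - centers[i][1]) ** 2)
            clusters[nearest].append(p)
        new_centers = []
        for i, cluster in enumerate(clusters):
            if cluster:  # move the center to the mean of its cluster
                new_centers.append((sum(p[0] for p in cluster) / len(cluster),
                                    sum(p[1] for p in cluster) / len(cluster)))
            else:        # keep a center that attracted no points
                new_centers.append(centers[i])
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return centers, clusters

# Hypothetical (temperature, humidity) samples from two operating regimes.
readings = [(70.1, 30.2), (70.4, 29.8), (69.9, 30.5),
            (85.0, 55.1), (84.7, 54.8), (85.3, 55.4)]
centers, clusters = kmeans(readings, k=2)
```

As McClusky notes, the algorithm never learns what the two groups mean; it only finds their centers. Deciding that one cluster represents "good quality" parts is the domain expert's job.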
Another machine learning algorithm type is known as a decision tree. This algorithm is very powerful as it can lead you step-by-step to determine categories or formulas for the data, said Applebaum. Decision trees are useful for predictive maintenance because you can use them to see how decisions were made. Plus, you can use them in conjunction with other algorithms.
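To illustrate the interpretability Applebaum describes, here is a sketch of the simplest possible decision tree, a single-split "stump", in plain Python. The vibration and temperature readings and outcome labels are invented for the example:

```python
def best_stump(samples, labels):
    """Fit a one-level decision tree: find the single feature threshold
    that best separates the labels, measured by misclassifications."""
    def majority(ys):
        return max(set(ys), key=ys.count)

    best = None  # (errors, feature_index, threshold)
    for f in range(len(samples[0])):
        values = sorted(set(s[f] for s in samples))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate split between adjacent values
            left = [y for s, y in zip(samples, labels) if s[f] <= t]
            right = [y for s, y in zip(samples, labels) if s[f] > t]
            # Each side predicts its majority label; count the misses.
            errors = sum(y != majority(left) for y in left) \
                   + sum(y != majority(right) for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t)
    return best

# Hypothetical (vibration mm/s, bearing temp C) readings with outcomes.
X = [(1.2, 60), (1.4, 62), (1.1, 61), (4.8, 63), (5.1, 65), (4.9, 64)]
y = ['ok', 'ok', 'ok', 'fail', 'fail', 'fail']
errors, feature, threshold = best_stump(X, y)
```

A full decision tree repeats this split recursively on each side, but the appeal for predictive maintenance is already visible here: the learned rule ("if vibration is at or below the threshold, the part is ok") can be read and audited by a person.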
Regression analysis is a machine learning algorithm that can be used for process tuning and production forecasting. “For process tuning, using a regression analysis can take a manual process and have the system generate recommended setpoints [for manual application] or have the setpoints go straight into the PLC,” said McClusky. “You can even use this for advanced process control for ongoing tuning.”
He added that regression analysis can be used for production forecasting based on a current set of variables (i.e. any data point) to determine, for example, what will be produced on a single line or overall by the end of a shift. “Longer term projections based on past experiences for all variables can also be done, such as: How does production look a week or a month from now? You can even input variables from other systems, like SAP. For machine learning [applications of regression analysis], the more variables you use the better your results—provided the data is good to begin with, of course,” he said.
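The forecasting side can be sketched with a one-variable ordinary least-squares fit. The history below relating line speed to per-shift output is made up for illustration; a real project would use many more variables, as McClusky notes:

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    a = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    b = mean_y - a * mean_x
    return a, b

# Hypothetical history: line speed (units/min) vs. units produced per shift.
speed = [10, 12, 14, 16, 18]
output = [4700, 5650, 6600, 7550, 8500]
a, b = fit_line(speed, output)
forecast = a * 15 + b  # projected end-of-shift output at 15 units/min
```

The same fitted relationship works in both directions McClusky describes: read it forward to forecast output from a planned setpoint, or invert it to recommend the setpoint that hits a production target.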
Neural network algorithms simulate the way we think our brains work, Applebaum said. One way in which neural networks are commonly used in industry is in vision systems. With neural networks, you can look at specific items in lines or processes and use existing sensors to infer data [from those areas] to simplify processes.
Referencing a vision system machine learning application using Ignition at Frito-Lay, McClusky said the company applied it to an area of a line where a scale was located to weigh the potatoes. Frito-Lay wanted to use the vision system to determine the density of the potatoes on the line so that cook times could be adjusted accordingly for each batch. They were able to do this successfully, allowing them to eliminate dumping a portion of potatoes for weighing.
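The computation underneath such a network can be sketched briefly: each artificial neuron takes a weighted sum of its inputs and squashes it through an activation function, and layers of neurons feed into one another. The weights below are arbitrary placeholders; a real vision network would learn them from labeled images:

```python
import math

def neuron(inputs, weights, bias):
    """One neuron: weighted sum of inputs through a sigmoid activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 / (1 + math.exp(-z))

def forward(inputs, layers):
    """Feed inputs through successive layers of (weights, bias) neurons."""
    activations = inputs
    for layer in layers:
        activations = [neuron(activations, w, b) for w, b in layer]
    return activations

# Hypothetical two-layer network with placeholder (untrained) weights,
# mapping two sensor inputs to one "density looks high" score.
layers = [
    [([0.5, -0.4], 0.1), ([0.3, 0.8], -0.2)],  # hidden layer: 2 neurons
    [([1.2, -0.7], 0.0)],                      # output layer: 1 neuron
]
score = forward([0.9, 0.2], layers)[0]  # a value between 0 and 1
```

Training amounts to adjusting those weights until the network's output matches known examples; the sigmoid keeps each score between 0 and 1, which is convenient for thresholding decisions like a pass/fail call.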
Regardless of which kind of machine learning application you plan to use or which algorithm you apply, you need good data from the start, which means you need a strategy for finding the right data and handling it so that you’re assured of its quality. “Use statistics to sample data to tell if it's good,” said McClusky. “You need to know the whole universe of what you’re dealing with to get good results. So you can’t just take historian data on its own; you have to look at sampling techniques and correlation versus causation. Then, consider how good your results are. This is where domain knowledge and knowledge of processes is important. Domain experts—not data scientists—know what types of data are promising and when results don't make sense.”
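The statistical checks McClusky mentions can start very simply. The sketch below, using invented historian series, computes a Pearson correlation coefficient; a value near 1 shows two tags move together, but it never by itself proves one causes the other:

```python
import statistics

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / len(xs)
    return cov / (statistics.pstdev(xs) * statistics.pstdev(ys))

# Hypothetical historian series: ambient temperature and a reject-rate tag.
temp = [20.0, 21.5, 23.0, 24.5, 26.0]
rejects = [1.0, 1.3, 1.6, 1.9, 2.2]
r = pearson(temp, rejects)  # near 1.0: the tags are strongly correlated
```

A correlation like this could reflect causation, a shared hidden cause, or coincidence in the sampled window, which is exactly the judgment call that the domain expert, not the algorithm, has to make.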
With a good understanding of the types of machine learning and the algorithms that support them, the next step is to start thinking about applications. Applebaum outlined the following five steps to machine learning implementation success: identify the problem(s) you want to address, gather the data, create the model, deploy the model and monitor for success.
To identify the problem, Applebaum said it’s best to pick a question you want to answer. For example, do you want to improve a specific process, decrease defects, etc.? When doing this, beware of the high-value versus easy decision, she cautioned. “Go with an easy implementation first, because high-value [projects] can be a hard place to start,” she advised.
With that said, Applebaum stressed that it’s still important, even with easy machine learning projects, to ensure there is some value to be obtained and not have it be a project that exists solely for the sake of demonstrating the technology. “Understand the cost function—the difference between the prediction and actual results based on what you're trying to do.”
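One common way to make that cost function concrete is mean squared error, the average squared gap between predicted and actual results. A minimal sketch, with invented fill-weight numbers:

```python
def mse(predictions, actuals):
    """Mean squared error between model predictions and observed results."""
    return sum((p - a) ** 2 for p, a in zip(predictions, actuals)) / len(actuals)

# Hypothetical predicted vs. measured fill weights for three batches.
predicted = [98.0, 101.5, 99.0]
observed = [100.0, 101.0, 98.5]
cost = mse(predicted, observed)
```

Because the errors are squared, one large miss hurts the score far more than several small ones, which is often the right emphasis for quality problems; other cost functions weight the gaps differently depending on what you're trying to do.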
This is another area where domain knowledge can play a big role in machine learning applications in terms of selecting useful data for the project, acquiring missing data, ensuring quality data input and identifying dependent variables (i.e. data points that are linked to each other, such as temperature and time of day).
McClusky added that, when implementing a machine learning project, be sure to use an extract, transform, load (ETL) process to acquire the data rather than querying the production database directly. Automate the data acquisition process [with an ETL] so the data are acquired, cleaned and have missing values handled automatically.
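A toy sketch of the transform step, with invented raw values, shows the kind of cleaning this implies: bad or missing readings are caught and filled automatically before the data ever reach a model. A production pipeline would write the result to a separate analytics store rather than return a list:

```python
def etl(raw_values):
    """Extract-transform-load sketch for one numeric column."""
    # Extract: coerce the raw values pulled from the source system.
    parsed = []
    for value in raw_values:
        try:
            parsed.append(float(value))
        except (TypeError, ValueError):
            parsed.append(None)  # flag unusable readings as missing
    # Transform: fill missing readings with the mean of the valid ones.
    valid = [v for v in parsed if v is not None]
    mean = sum(valid) / len(valid)
    cleaned = [v if v is not None else mean for v in parsed]
    # Load: a real pipeline would insert these rows into the analytics store.
    return cleaned

raw = ["21.5", "22.0", None, "bad-read", "22.5"]
clean = etl(raw)
```

Mean imputation is only one choice; depending on the process, carrying the last good value forward or dropping the row entirely may be more appropriate, and that decision again calls for domain knowledge.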
Then start visualizing the data in Ignition to help understand it, so you can see which type of algorithm you want to apply, Applebaum advised, adding “And don't be afraid to try more than one [algorithm].” Ignition offers K-means, DBSCAN, neural networks and simple regressions. Amazon Web Services, Microsoft Azure and Google Cloud offer other tools for further analysis.
“Lots of people try to skimp on this testing process,” Applebaum noted. “But don't do that. Make sure your models actually predict things you haven’t seen before. Go back and re-test as much as needed to get a useful model. It’s better to spend time here getting it right than having to fix problems later.”