Every automation system eventually runs into a situation that requires advanced engineering support. This type of break-fix support could be due to any number of causes: power outages, server maintenance, operator error, and so on. But no matter what the root issue turns out to be, sooner or later every system will need that support. And that's why it's equally certain that, here at Avanceon, all our engineers will at some point find themselves helping customers keep their manufacturing processes running.
Troubleshooting, like coding, is a distinct set of skills, and each person might have a slightly different approach to resolving an issue. When I find myself in a break-fix situation, I tend to follow a regular procedure, not only to fix the problem but also to determine the root cause of the issue.
Step 1: Ask questions
Begin by discussing the symptoms of the issue with the person reporting it. If you think about it, how can you solve a problem if you don't know what the problem is? Asking the right questions in this first phase of the support process is vital to enabling a successful resolution.
Step 2: Replicate the issue yourself
Sometimes the information you've gathered in the first step might not quite paint the full picture of the situation. When I try to replicate the issue, I often gain insight into what the user is actually reporting.
Step 3: Check the log files
A well-built system will provide evidence of what is happening in the event something is not working properly. If you're lucky, error messages will provide the context for understanding the actual problem. Even if the system hasn't generated any error messages, the system logs can often provide details regarding behind-the-scenes issues in a script or database transaction. Analyzing these messages can often reveal the issue at hand.
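As a rough illustration of this step, the sketch below pulls warnings and errors out of a log and counts entries by severity. The log line layout (date, time, level, message), the sample messages, and the function name `summarize_log` are all assumptions made for the example; real systems will use their own formats.

```python
import re
from collections import Counter

# Assumed (hypothetical) line layout: "<date> <time> <LEVEL> <message>"
LINE_RE = re.compile(r"^(?P<ts>\S+ \S+)\s+(?P<level>[A-Z]+)\s+(?P<msg>.*)$")


def summarize_log(lines, levels=("ERROR", "WARNING")):
    """Count entries per level and collect the entries worth reading first."""
    counts = Counter()
    hits = []
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match is None:
            continue  # skip lines that do not follow the assumed layout
        counts[match["level"]] += 1
        if match["level"] in levels:
            hits.append((match["ts"], match["level"], match["msg"]))
    return counts, hits


# Made-up sample entries, for illustration only
sample = [
    "2024-05-01 13:02:10 INFO Batch started",
    "2024-05-01 13:02:11 WARNING Query took longer than expected",
    "2024-05-01 13:02:15 ERROR Database transaction rolled back",
]

counts, hits = summarize_log(sample)
for ts, level, msg in hits:
    print(ts, level, msg)
```

In practice the same function could be fed an open file object instead of the sample list, since it only iterates over lines.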
Step 4: Trace backwards
Start at the point in the system where the issue has been reported and trace backwards. For example, let's assume the user is experiencing an issue on a specific application screen. Begin drilling down into the specific elements of the screen that are not working (a button, for example). Dig into the code/function behind the button to see how it's supposed to work. Perhaps the button triggers a script that queries a database for data, but that data isn't displaying on the screen. Tracing through these individual elements/functions can often help to understand where in the process the malfunction occurs.
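One way to picture the backward trace is as a chain of checks, ordered from the reported symptom back toward the data source. The sketch below is only an illustration: the stage names and the hard-coded check results are hypothetical stand-ins for whatever manual or scripted checks apply to a given system.

```python
def trace_backwards(stages):
    """Walk from the symptom back toward the source.

    `stages` is a list of (name, check) pairs, ordered from the reported
    symptom backwards. Each check returns True if that stage behaves as
    expected. The stage just downstream of the first healthy stage is the
    most likely location of the fault.
    """
    suspect = None
    for name, check in stages:
        if check():
            return suspect  # first healthy stage found; fault is just downstream
        suspect = name
    return suspect  # every stage failed; the earliest stage is the suspect


# Hypothetical results for the screen/button/script/database example
stages = [
    ("screen shows data", lambda: False),
    ("button script returns rows", lambda: False),
    ("database query returns rows", lambda: True),
]

print(trace_backwards(stages))
```

With these assumed results the database returns data but the script does not hand it on, so the script behind the button is where to look first.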
Step 5: Restart/redeploy the system
Usually, it's not going to be possible to restart servers in a manufacturing system without taking down other, still functional parts. However, I find it amazing how often simply turning it off and on again will fix a system when some underlying aspect gets out of sync.
Step 6: Document the findings
It's always good practice to document the issue, both for the customer's benefit and to provide insight to the support team. One of the main benefits of documentation in a support situation is to provide some guidance should the same situation reoccur. You don't want to spend valuable time trying to reanalyze an issue if you don't have to.
There's nothing revolutionary in my six-step process, but I find it's a workable model for helping me find, analyze and correct system issues. If you have a similar best practice, please share it with us!
Find more information about how Avanceon approaches engineering projects.
Ed Miller is an engineer at Avanceon, a certified member of the Control System Integrators Association (CSIA). For more information about Avanceon, visit its profile on the Industrial Automation Exchange.
About the Author
Ed Miller
Engineer, Avanceon

