Exclusive: Experts Weigh In On the Impact of Data

In an exclusive roundtable discussion with Automation World, Georgia Tech engineering, science and research experts shared their insights on the trajectories and effects of artificial intelligence and machine learning, and how the intersection of data and technology is changing industry.

David Greenfield

Nov. 26, 2018

The discussion underway at Georgia Tech. Photo: River West Photography www.riverwest.co

You may have missed it, but in February 2017 the European Parliament adopted a resolution on the “Civil Law Rules on Robotics.” The resolution proposed that the European Commission create a specific legal status for robots “so that at least the most sophisticated autonomous robots could be established as having the status of electronic persons responsible for making good any damage they may cause, and possibly applying electronic personality to cases where robots make autonomous decisions or otherwise interact with third parties independently.”

To date, a group of nearly 300 European Union (EU) political leaders, artificial intelligence/robotics researchers and industry leaders, physical and mental health specialists, and law and ethics experts have expressed their concerns with this approach. Their response indicates they agree that the economic, legal, societal and ethical impact of artificial intelligence (AI) and robotics must be considered “without haste or bias;” but their response also shows they find the resolution, as proposed, has some clear gaps. In particular, the group noted that, from a technical perspective, the resolution overestimates the “actual capabilities of even the most advanced robots” and displays “a superficial understanding of unpredictability and self-learning capacities” as well as a perception of robots “distorted by science fiction and a few recent sensational press announcements.”

As intriguing as this ongoing EU debate into the legal status of robots and their ability to hold human rights is, it is far removed from industry’s day-to-day use of robotics and AI. But the question at the root of the EU debate, how data is managed, is clearly relevant to how industry approaches its handling of data and its integration of devices and systems as part of an Internet of Things (IoT) or Industry 4.0 initiative.

To help me better understand how the computer science and data research being conducted today could impact the future of industry, Alain Louchez, managing director of Georgia Tech’s Center for the Development and Application of IoT Technologies (CDAIT), invited me to moderate a roundtable discussion with some of Georgia Tech’s leading experts. In addition to Louchez, participants in the roundtable included Jeff Evans, principal research engineer at the Georgia Tech Research Institute (GTRI) and director of the Digital Transformation of Things at the college's Center for Advanced Communications Policy; Dr. Haesun Park of the School of Computational Science and Engineering; Dr. Umakishore Ramachandran of the School of Computer Science; Dr. Yan Wang of the School of Mechanical Engineering; Dr. Margaret Loper, chief scientist, Information and Communications Laboratory at GTRI; Dr. Elizabeth Whitaker, principal research engineer at GTRI's Information and Communications Laboratory; and Barry Drake, senior research scientist at GTRI's Information and Communications Laboratory.

No one in the middle
“You’ve probably heard the phrase ‘data is the new oil,’” said Evans. “And it’s true, which is why control of data has become as important to business as [any other kind of] intellectual property. The big challenge that’s emerging is that, as everything is integrated, you naturally want to share the data to get the most value out of it. But you have to understand that, in these integrated data sharing and analysis environments, the overwhelming majority of the data will be handled automatically, machine to machine, without a person involved. Only a very small percentage of all that data has to be in a format that’s friendly for human users.”

These comments underscore the connection between the EU’s developing approach to robots and industry’s handling of data. In both cases, human decisions and activities will increasingly be driven by analyses first conducted by machines. And it’s important to realize that reliance on these machine-driven decisions is not optional to industry’s future. The fact that so much data is generated by industrial devices, machines and systems necessitates our use of automated analyses. After all, there’s simply too much data for humans to process on their own.

“Users are accumulating a lot of data without knowing what they’re getting and why they’re getting it,” said Ramachandran. “That’s why pruning data at the source is such a challenge.”

“Big data doesn't mean all relevant data,” added Park, concurring with Ramachandran’s comment. She said you first have to understand other aspects of data, such as the four V's: volume, variety, velocity and veracity. She cautioned that any pruning of data must be done judiciously. “Because data exists in so many different formats and from so many different sources and their values differ depending on the end goals, there are pros and cons to sorting data at an early stage,” she said.

“Of course, it bears repeating that data security and privacy are of the utmost importance, and need to be provided at the very start by design,” said Loper. “Tremendous achievements have been made in the last few years in industry, academia and standards organizations to ensure data confidentiality, integrity and availability, but these areas will always remain a work in progress.”

Wang added that to understand how data should be handled to have a meaningful impact for industry, there must be ongoing conversations and collaborations across three communities of knowledge relevant to industrial data—computer scientists, data scientists and domain experts. “If you have a limited amount of data, it can be difficult to make a decision; but with a large amount of data, you can overload people and systems,” he said. Therefore, there has to be domain knowledge involvement at an early stage to “collect and process data at a level that is meaningful to the decision-maker.”

And that’s where operations technology (OT) professionals will play a significant role in the development of automated data analytics.

“IT needs OT input,” stressed Ramachandran. “They need domain experts to find the area of focus. Companies have to transform themselves around domain expertise to solve problems at a level applicable across domains.”

Data digestion
Describing a yield problem recently experienced by a capacitor manufacturing firm, Ramachandran said the company looked to analysis of its production data as a potential way of resolving the issue. Ramachandran noted that the manufacturer had identified specific sets of data being produced at each step of the production process, but because these data were not being used in specific ways, no data was being passed from one stage of production to the next.

“Having a common data plane to get data into a format where it can be shared and analyzed is critical,” he said. Putting data into such a format, with the help of domain experts in the plant, enabled the capacitor manufacturer to resolve its yield issue and led to a huge improvement in production.

“You have to get data to where it can be digested by the next phase in the pipeline,” Ramachandran added. “But without domain experts to help you understand the ultimate use of the data, it’s hard to know how to design the system. That’s why data scientists should always begin a project by working with the end users.”

“It’s important to design features [of data analytics] to only process the information that end users need to understand and exclude the noise,” said Drake. “The biggest hurdle to this is recognizing the ultimate value of the data and how it will be used.”

To help with this process, Evans said it’s important to break data down into discrete elements before writing an analytics algorithm. Relevancy of data to the analytics will change according to the application, such as with latency issues related to control programs or analyses being conducted at the edge and in the cloud. AI algorithms can further drive the sortation of data for use after it’s been broken down with the help of domain experts, he added.

Even with domain experts involved in the data analytics process, Whitaker noted that many users still struggle to understand what the eventual use of any data analysis will be. That’s why it’s imperative to make human interaction with these systems as easy as possible. This need for ease of interaction is a big reason why the use of clipboards for data capture in manufacturing remains so widespread. The systems have to be as easy to interact with as a clipboard, she said.

Before any of these changes can occur—from the use of plant floor experts to compartmentalize data for the development of algorithms to moving workers away from the clipboard and toward the use of automated systems—executive involvement is required. The problem is that there currently “seems to be a chasm between the C-suite and the plant floor,” said Louchez. “There are numerous examples of engineers producing a lot of data that could be used to extract value one way or the other, but no vision at the top in terms of how to do it. For any of this to ultimately work, there needs to be an overarching plan at the business level,” he said.

Wang suggested that part of this disconnect is caused by many companies still being in the “wait-and-see” stage when it comes to IoT and Big Data.

Evans added that the current “Wild West” environment in communications is contributing to this wait-and-see problem. “We know that everything is going to be a sensor of some kind, but how the communication of that sensor data will be handled remains a big question,” he said. “How do you leverage more efficient operations when these communication standards are still evolving?”

Moving forward
When it comes to sharing data, trust has always been a major issue, whether you're talking about machine-to-machine, machine-to-human or even among humans, said Drake. Considering this reality, he said we have to be able to answer the question: Why should we trust AI? He noted that there is currently a big push in machine learning and AI to “open up the black box of machine-to-human and human-to-machine connections to engage people in the process of analyzing data to build this trust.” “It’s important to engage people in a way that they can visualize it and understand it, so that it’s not just a matter of trusting the black box,” he said.

“An important component of an IoT system’s trustworthiness, which the National Institute of Standards and Technology (NIST) views as encompassing cybersecurity, privacy, safety, reliability and resilience, is ensuring there are adaptive algorithmic rules that machines can use to determine how to trust the data being exchanged and the decisions being made by other machines,” pointed out Loper.

“Reputation and past experience are big factors in building that understanding of trust,” added Whitaker, and they will play significant roles in growing the level of trust between people and systems. “For there to be humans in the loop, we need an easy way for them to interact with these systems and to have a knowledge-based approach to work so they can work, in a hybrid way, with data analytics. To date, there has not been a lot of work done on a top-down, human-centered approach to this.”

The participants in the roundtable agreed that the education system will have to play a big role in preparing people to be part of the data analytics-enabled workforce of the future, which is coming at us fast.

The problem is that “the education system we have today is a bit broken, we are too siloed,” said Ramachandran. “To future-proof human resources, we should not produce siloes of knowledge, but train people to be able to grasp across disciplines.”

Evans added that he is part of a group at Georgia Tech that is “integrating workforce development concepts as part of a program optimizing base operations for a U.S. Department of Defense customer, and data analytics is a key part of what we’re doing. What we’ve learned early on is that, for data analyses to be successful, people need to understand communication protocols in order to have the ability to assess the systems they're interacting with and create dashboards for broad use.” This is such a critical need (with no clear fix on the near-term horizon, as everyone in industry who has worked with the various industry protocols is well aware) that Evans contends specific workforce development training should be focused here through two-year education programs.

Viewing these disconnects from a higher level, Louchez said that when he looks at some failing IoT initiatives, he sees that “they’re led as insulated activities and not seamlessly blended into the whole company. IoT, by its nature, is an integration—internally and externally. So if it’s not part of a bigger picture within the company to begin with, it’s going to fail.”

He added that another common IoT failure issue occurs when companies try to bite off more than they can chew. “They embark upon massive changes and expect rapid results, but they need to realize that an IoT-centered transformation is a long-term and inherently complex undertaking that will likely take much more time than they initially assumed.”