After years of gathering reams of sensor data from machines and instruments throughout the industrial world, the focus is now shifting to actually getting some high-level benefit from that data. But what about where that data resides and how it gets from here to there?
Data virtualization is not a new concept. For years, organizations have needed to do more than simply rely on one huge data set tucked away in an enterprise data warehouse for historical sales and manufacturing trends, and data virtualization has been instrumental in getting the specific data that’s needed into the hands that need it.
Data virtualization has been a key part of Cisco’s product portfolio since it acquired Composite Software about 15 months ago. At its Data Virtualization Day this week in New York City, Cisco unveiled a major update to its flagship data virtualization platform, Cisco Information Server (CIS) 7.0. The release includes tools that make it easier for IT to deploy data services to the rest of the organization, and for manufacturing to make use of the plethora of data sources now in play.
Before data virtualization became common, organizations would take a megabyte of data from one source and another megabyte from another, copying each of those chunks into a data warehouse. Data virtualization became necessary as the number of data sources grew, along with the volume and complexity of the data itself. “Now we talk about not megabytes of data, but petabytes of data,” says Mike Flanagan, general manager of Cisco’s Data Analytics Business Group. (A petabyte, by the way, is 10^15 bytes: a 1 followed by 15 zeros.)
Where once the data came from Oracle and SAP and maybe one or two others, all lined up in neat rows and columns, manufacturers now gather data from a huge number of sources: Amazon Web Services, private clouds, on-premises data centers, big data stores such as Hadoop, MongoDB and Apache Cassandra, and the list goes on. “Now with big data, there are all kinds of different ways to store that data, and they’re not compatible,” Flanagan adds.
For data virtualization software to work, Cisco previously had to build a specific adapter for each new data source that came out. “Historically, that was not a problem, since it was maybe once a year,” Flanagan says. “Now it’s more like once a week.”
CIS 7.0’s Data Source Software Development Kit (SDK) tackles that problem by enabling system integrators and customers to quickly build their own data virtualization adapters for the new data sources they want to use. Cisco can then certify an adapter to confirm it follows best practices and query optimization techniques. “For all these new database companies, rather than waiting for us, they can now do their own integration,” Flanagan says.
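To make the idea concrete, the sketch below shows the general shape of such an adapter: connect to a source, expose its metadata, and hand rows back to the platform. Every name in it is hypothetical; it is not the actual Cisco Data Source SDK API, just an illustration of what an adapter has to do.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: these interfaces and names are illustrative,
// not the actual Cisco Data Source SDK API.
interface DataSourceAdapter {
    void connect(Map<String, String> connectionProperties);
    List<String> listTables();              // metadata the platform introspects
    List<Object[]> fetchRows(String table); // rows handed back to the query engine
    void close();
}

// Skeleton adapter for an imaginary key-value store. A real adapter would
// wrap the vendor's driver here instead of an in-memory map.
class ExampleKeyValueAdapter implements DataSourceAdapter {
    private Map<String, List<Object[]>> store; // stand-in for a remote connection

    @Override public void connect(Map<String, String> props) {
        // A real adapter would open a network connection using props
        // such as host, port, and credentials.
        store = new LinkedHashMap<>();
        store.put("sensor_readings", List.of(
                new Object[]{"machine-01", 71.4},
                new Object[]{"machine-02", 68.9}));
    }
    @Override public List<String> listTables() { return List.copyOf(store.keySet()); }
    @Override public List<Object[]> fetchRows(String table) { return store.get(table); }
    @Override public void close() { store = null; }

    public static void main(String[] args) {
        DataSourceAdapter adapter = new ExampleKeyValueAdapter();
        adapter.connect(Map.of("host", "example.local", "port", "9042"));
        for (Object[] row : adapter.fetchRows("sensor_readings")) {
            System.out.println(Arrays.toString(row));
        }
        adapter.close();
    }
}
```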
With data increasingly distributed, the release’s new Deployment Manager automates the transfer of views, data services, caches, policies and more across multiple CIS instances.
This all makes it easier for IT to put data-gathering schemes into action, Flanagan explains. “Customers will put it in some kind of proof-of-concept environment while they figure out how they want to deploy it for their needs internally,” he says. “Deployment Manager will help you effectively graduate the non-production environment into a production environment.”
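Conceptually, that promotion amounts to bundling a set of named resources from the non-production instance and replaying them into the production one. The minimal sketch below models only that workflow; none of these types exist in Cisco’s product, and they assume nothing about Deployment Manager’s actual interface.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical model of an environment promotion: every name here is
// illustrative, not part of Deployment Manager's real interface.
record Resource(String kind, String name, String definition) {}

class PromotionPlan {
    private final List<Resource> resources;
    PromotionPlan(List<Resource> resources) { this.resources = resources; }

    // Apply every bundled view, cache and policy to the target environment.
    void applyTo(Consumer<Resource> targetInstance) {
        resources.forEach(targetInstance);
    }

    public static void main(String[] args) {
        PromotionPlan plan = new PromotionPlan(List.of(
                new Resource("view", "plant_throughput",
                        "SELECT machine_id, units FROM line_ops"),
                new Resource("cache", "plant_throughput_cache", "refresh=hourly"),
                new Resource("policy", "restrict_to_ops", "role=ops")));
        // Stand-in for the production instance: just log what would be created.
        plan.applyTo(r -> System.out.println("create " + r.kind() + " " + r.name()));
    }
}
```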
In fact, much of what CIS 7.0 is geared toward is simplifying data virtualization to make it more accessible to non-IT parts of a business. “The biggest problem with data virtualization today is it’s still an IT-centric solution,” Flanagan says. “People still have to partner with IT.”
So a big part of the new release is a self-service component called Business Directory. Users can search and browse categories to quickly find the data they’re looking for, and then query it with whichever business intelligence (BI) tool they choose. That gives users more direct access to the data, which they can analyze based on their own insight, while IT still manages security profiles, ensuring data is visible only to authorized users.
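Because published views behave like ordinary tables, any JDBC-capable BI tool can query them. Here is a minimal sketch using only the standard java.sql API; the connection URL, credentials and view name are placeholders, not Cisco’s actual driver details.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Minimal sketch: querying a published, virtualized view over JDBC.
// The URL, credentials, and view name below are placeholders; a real
// deployment would use the vendor's JDBC driver and connection details.
public class QueryVirtualView {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:example://dv-server.example.com:9401/published_views";
        try (Connection conn = DriverManager.getConnection(url, "analyst", "secret");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                     "SELECT machine_id, avg_temp FROM sensor_summary")) {
            while (rs.next()) {
                System.out.printf("%s: %.1f%n",
                        rs.getString("machine_id"), rs.getDouble("avg_temp"));
            }
        }
        // Security profiles enforced on the server decide which views this
        // user can see at all; the client code does not change.
    }
}
```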