With growing interest in big data analytics and a shortage of affordable experts and skilled practitioners to deploy and manage big data initiatives, many organizations are looking to managed services to fill the gap. More managed service providers (MSPs) continue to develop sophisticated services for big data processing, along with all other aspects of data management and analytics.
Big data processing can function well as a cloud-based, end-to-end managed service that integrates bi-directionally with on-premises data centers and systems. End-to-end services can include streaming data flows, data management activities like data governance, and data visualization and analytics. Big data analytics on cloud infrastructure requires top-notch management of security and compliance, areas that MSPs are already improving as they move proactively into the new technology areas that many companies demand.
Remember: big data doesn't just refer to high-volume or fast-moving data that is important to large corporations. More companies of all kinds need to extract value from a wide variety of data sources, and these sources are often difficult and messy for traditional data processing to handle. They are also part of the "big data" analytics story and often yield valuable insight. "Messy" data sources include multi-structured or unstructured data with highly variable formats and semantics, like social media content, log files, and e-mail, as well as machine-generated data from medical devices, industry sensors, automated machinery and systems, and GPS.
Metadata Management Services
A must-have capability for big data processing services is comprehensive metadata management. Metadata provides information about a data item, such as a product, that uniquely describes that item. Product metadata can include product ID, product category, supplier ID, size or dimensions, and so on. A field like product ID is also a means of linking to other data sources for integration purposes. Through metadata descriptors, we can talk about data items in common terms, outside the actual item, and take advantage of metadata to integrate and better understand disparate data sources.
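To make the linking idea concrete, here is a minimal Python sketch of product metadata used as a join key between two sources. Everything in it is an assumption for illustration: the `ProductMetadata` class, the `link_by_product_id` helper, the field names, and the sample values are hypothetical and do not come from any particular metadata tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProductMetadata:
    # Hypothetical descriptor fields, mirroring the examples in the text.
    product_id: str
    category: str
    supplier_id: str
    dimensions: str


def link_by_product_id(catalog, events):
    """Attach catalog metadata to each event that carries a known product_id."""
    by_id = {meta.product_id: meta for meta in catalog}
    linked = []
    for event in events:
        meta = by_id.get(event.get("product_id"))
        if meta is None:
            continue  # no metadata available, so this event cannot be linked
        linked.append(
            {**event, "category": meta.category, "supplier_id": meta.supplier_id}
        )
    return linked


# Made-up sample data: a small catalog and a second source that only knows IDs.
catalog = [
    ProductMetadata("P-100", "footwear", "S-7", "30x20x12 cm"),
    ProductMetadata("P-200", "outerwear", "S-9", "60x40x5 cm"),
]
events = [
    {"product_id": "P-100", "units": 3},
    {"product_id": "P-999", "units": 1},
    {"product_id": "P-200", "units": 2},
]

linked = link_by_product_id(catalog, events)
```

The event for `P-999` is dropped because the catalog holds no metadata for it, which is the practical cost of missing metadata: the record cannot be placed in context.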
Metadata management is one of the key areas of comprehensive information and data governance. Agile data governance tools are in demand and all sorts of data management vendors are chasing this business. As MSPs expand their presence in data management and analytics services, metadata management and data governance should be of great interest to them. Hopefully the availability of managed services in these areas will make comprehensive data management more affordable and available to organizations of all sizes.
What Metadata Does for Big Data Analytics
Metadata is not just about data integration and enterprise data warehouses. For other enterprise needs, metadata helps find data during data discovery and points the way to interpreting and using data correctly. Metadata can greatly streamline and enhance the processes used to collect, integrate, and analyze big data sources. When working with messy big data like multi-structured sources, metadata is critical to understanding the data and connecting it to other data sources. Multi-structured data comes with many difficulties: information can be mined, but for it to have meaning and value, attributes like sentiment, purpose, and context must be determined as well, and correlated with data sources for customers, products, and so on.
Traditional data sources, such as relational data, provide plenty of logical structure through more easily obtained metadata. However, big data often does not carry a lot of 'native' metadata, so metadata from external sources is essential to unlock meaning. Big data may have to go through certain analytics processes to construct the beginnings of metadata. Then this metadata is correlated with the metadata from other data sources to derive the most useful logical model. As big data processing evolves, new types of metadata may arise to meet the special circumstances of different kinds of big data.
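The two steps described above, constructing initial metadata from raw content and then correlating it with metadata from another source, can be sketched roughly as follows. This is a toy illustration under stated assumptions: the keyword lists, the `P-<digits>` product ID pattern, and the `derive_metadata` and `correlate` helpers are all hypothetical, and a real service would use far more capable text analytics than keyword matching.

```python
import re

# Hypothetical keyword lists; a stand-in for real sentiment analysis.
POSITIVE = {"love", "great", "excellent"}
NEGATIVE = {"broken", "late", "terrible"}

# Assumed product ID format for this sketch, e.g. "P-100".
PRODUCT_ID = re.compile(r"\bP-\d+\b")


def derive_metadata(text):
    """Build rough metadata (product mentions, sentiment) from free text."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    if score > 0:
        sentiment = "positive"
    elif score < 0:
        sentiment = "negative"
    else:
        sentiment = "neutral"
    return {"product_ids": PRODUCT_ID.findall(text), "sentiment": sentiment}


def correlate(derived, catalog):
    """Join derived metadata with catalog metadata on product ID."""
    return [
        {"product_id": pid, "category": catalog[pid], "sentiment": derived["sentiment"]}
        for pid in derived["product_ids"]
        if pid in catalog
    ]


# Made-up catalog metadata and a made-up social media post.
catalog = {"P-100": "footwear", "P-200": "outerwear"}
post = "Love my new P-100 boots, great fit!"

derived = derive_metadata(post)
rows = correlate(derived, catalog)
```

The post itself carries no native metadata. Only after the derived product ID is matched against the catalog does the sentiment become attributable to a product category.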
Image source: Princeton Tutoring
This post was brought to you by IBM for MSPs and opinions are my own.