It’s Time to Demystify Big Data Tools and Platforms

El Umbel
H.O. Maycotte Founder, CEO, Umbel

Posted on April 11th 2014

First things first: big data isn’t only difficult for people to wrap their heads around, it is also difficult for computers to process. It’s why Hadoop exists (the platform that spreads data storage and batch queries across a cluster of computers). It’s why data scientists exist, not to mention programs like the Insight Data Science Fellows Program that turn previously academia-minded Ph.D.s into the new media world’s foremost mathematicians. It’s also why big data startups have been getting remarkable amounts of venture funding. Though there has certainly been a fair share of articles claiming that VCs are now bored with anything big data, the recent $100M and $900M funding rounds of Hortonworks and Cloudera prove otherwise.

Add to that the fact that the big data field is chock-full of some of the smartest minds not just from the IT or digital media industries, but from neuroscience, linguistics, mathematics, bond trading and bio-engineering (the list goes on), and it’s no surprise that this field is an intimidating one for the rest of us. Not that we aren’t just as smart. After all, many of the executives leading digital companies these days earned their wings during the dot com bubble, and many of them are to thank for the current digital age.

The intimidation comes not from lack of experience, but from being tasked with the challenge (as executives always are) of making this new asset actionable, without letting big ideas and often big talk undermine that mission.

Truth is, big data is only useful if you ask it the right questions. And not every big data tool will properly decipher the right answer, either. Here are the four big data platforms, tools and terms currently dominating the industry — and likely being misused or confused.

It's time to close the information loop and make big data work for whoever uses it — Ph.D. or not.

Hadoop

Hadoop is what you get when you mix billions of pieces of data with the need to test queries and algorithms in a reasonable amount of time. It’s batch processing spread across a cluster of computers, often rented in the cloud, which is complicated and quite impressive.

Batch processing lets a development team test out new queries and algorithms and catch the details where tests go a bit off course. As companies like Yahoo! (where Doug Cutting was working when he co-created Hadoop) grew, they began collecting data sets too big to fit on one computer. Computer clustering was the solution: a set of connected computers works together in such a way that it can be considered a single system.

That said, Hadoop puts the power of a computer cluster within reach, in the cloud or in your own data center, and allows a dev team to batch process queries across massive amounts of data (more than could ever be stored on one computer alone). But what Hadoop spits out generally needs a data scientist to decipher. It doesn’t come in pretty visualizations, though distributions and tools from vendors like Hortonworks or Cloudera can sit on top of Hadoop to make understanding your data easier.
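To make the cluster batching idea a little more concrete, here is a minimal sketch of the split, map, shuffle and reduce pattern that Hadoop-style batch jobs are built around. It runs in a single Python process and does not use Hadoop's actual API; the function names and the word-count task are assumptions chosen purely for illustration.

```python
from collections import defaultdict


def split_into_partitions(records, num_partitions):
    """Deal records out round-robin, as if each partition lived on its own machine."""
    partitions = [[] for _ in range(num_partitions)]
    for i, record in enumerate(records):
        partitions[i % num_partitions].append(record)
    return partitions


def map_partition(partition):
    """Map step: each 'machine' emits (word, 1) pairs for its own slice only."""
    pairs = []
    for line in partition:
        for word in line.lower().split():
            pairs.append((word, 1))
    return pairs


def shuffle(mapped_outputs):
    """Shuffle step: gather every value emitted for the same key."""
    grouped = defaultdict(list)
    for pairs in mapped_outputs:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped


def reduce_counts(grouped):
    """Reduce step: collapse each key's values into a single total."""
    return {key: sum(values) for key, values in grouped.items()}


def run_batch_job(records, num_partitions=3):
    """Run the whole pipeline; a real cluster would run the map calls in parallel."""
    partitions = split_into_partitions(records, num_partitions)
    mapped = [map_partition(p) for p in partitions]
    return reduce_counts(shuffle(mapped))
```

Calling `run_batch_job(["big data", "big cluster"], num_partitions=2)` counts words across both partitions. The answer does not depend on how many partitions the data is split into, which is what lets a real cluster add machines without changing the query.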

Do you need Hadoop? Not unless your business handles data at the scale of Google or Yahoo!

Data Management Platform

A data management platform, or DMP, is a marketing tool primarily designed to feed data into your ad server so that it can better target audiences across the web. For the most part, DMPs including BlueKai, x+1 and Adobe AudienceManager benefit brands and agencies, and are generally buy-side biased. They track user cookies across the web, collecting behavioral data from the sites a user visits before and after your target site. From that information, ads are targeted to reach the right customer at the right time: for instance, serving a user an ad for Paul Krugman’s newest book after that user has read one of his articles in the New York Times.

In other words, you can thank DMPs every time you share content and then start to see ads related to what you shared. Share a health care article, for example, and the DMP gets a signal to target health-related ads to you.

A DMP basically spits out the data a user gives it in an easier-to-understand format so that marketing teams can standardize across particular customer behaviors. Why is this useful? Because if you ask a DMP the right questions, you can customize your ads or the items on your shelf in ways that encourage your audience or customers to convert (whether a conversion means sharing, commenting or actually purchasing is up to you).
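As a rough illustration of the cookie-to-ad flow described above, the sketch below groups page views by cookie ID and picks an ad segment from the topic a visitor reads most. This is not how BlueKai or any other DMP is actually implemented; the topic names, the `TOPIC_TO_SEGMENT` table and both functions are hypothetical.

```python
from collections import defaultdict

# Hypothetical lookup from content topics to ad segments (an assumption for this sketch).
TOPIC_TO_SEGMENT = {
    "economics": "business-books",
    "health care": "health",
}


def build_profiles(page_views):
    """Count topic views per cookie. page_views is an iterable of (cookie_id, topic)."""
    profiles = defaultdict(lambda: defaultdict(int))
    for cookie_id, topic in page_views:
        profiles[cookie_id][topic] += 1
    return profiles


def pick_segment(profile):
    """Return the ad segment for the most-viewed topic we can target, or None."""
    targetable = {t: n for t, n in profile.items() if t in TOPIC_TO_SEGMENT}
    if not targetable:
        return None
    top_topic = max(targetable, key=targetable.get)
    return TOPIC_TO_SEGMENT[top_topic]
```

A visitor whose cookie shows mostly economics articles would land in the "business-books" segment here, while a visitor with no targetable history gets no segment at all. Asking "the right question" in this toy version comes down to choosing which topics belong in the lookup table.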

Customer Relationship Management

Customer Relationship Management (CRM) has moved to the big data space. Typically, a CRM platform helps you to craft your messaging to your consumer through multiple interaction channels including social media, newsletters and so on. But big data changes the game when it comes to social media. No longer do you need to keep records of how often something was retweeted. The system will do that for you -- plus, it’ll show you what times of the day your customers engage most, where they are coming from, when they are most likely to buy, which other brands they like (and would support you having a partnership with), and so on.
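One of the questions mentioned above, what time of day customers engage most, can be sketched in a few lines. The event format (a customer ID paired with an ISO 8601 timestamp) and the function names are assumptions for illustration, not the export format of any particular CRM product.

```python
from collections import Counter
from datetime import datetime


def engagement_by_hour(events):
    """Count engagement events per hour of day. events is an iterable of
    (customer_id, iso_timestamp) pairs, e.g. ("u1", "2014-04-11T20:15:00")."""
    counts = Counter()
    for _customer_id, timestamp in events:
        counts[datetime.fromisoformat(timestamp).hour] += 1
    return counts


def peak_hour(events):
    """Return the hour (0-23) with the most engagement, or None if there are no events."""
    counts = engagement_by_hour(events)
    if not counts:
        return None
    return counts.most_common(1)[0][0]
```

The same counting approach extends to the other questions in the list, such as where customers come from, by swapping the hour for a referrer or a location field.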

In other words, today’s CRM platform lets you tailor the experience, on any medium, to individual users based on their digital identities. It’s kind of like the old-school philosophy of treating every customer as if they were the only one. No longer is that mantra found solely in offline boutiques.

Smart Data Platform

A smart data platform is often a mixing board of platforms (usually including a DMP, CRM, ticketing platform, media ratings, among others) that integrates essential data solutions generally fragmented across a company and aggregates them into a single view -- from which immediate actions can be taken.

For instance, on a smart data platform, you can learn that 25% of your users who regularly log in "like" General Electric on Facebook. Better yet, you can email that set of people with customized content or customized newsletters with GE advertising -- and ensure a higher return for your own brand. Smart data platforms are actionable and have clear ROI.
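The General Electric example can be sketched as a small join: merge CRM records and social "likes" into one profile per user, then measure what share of regular users like a brand and collect their emails. The field names (`user_id`, `email`, `logins_last_30d`, `likes`) and the `min_logins` threshold are hypothetical, and a real platform would pull these records from the separate systems rather than from in-memory lists.

```python
def build_single_view(crm_rows, social_rows):
    """Merge per-source records into one profile dict per user ID."""
    view = {}
    for row in crm_rows:
        view.setdefault(row["user_id"], {}).update(row)
    for row in social_rows:
        profile = view.setdefault(row["user_id"], {"user_id": row["user_id"]})
        profile.setdefault("likes", set()).update(row["likes"])
    return view


def regular_users_who_like(view, brand, min_logins=4):
    """Return (share of regular users who like the brand, their sorted emails).

    A 'regular' user is assumed to be one with at least min_logins logins
    in the last 30 days; that threshold is an illustrative choice.
    """
    regulars = [p for p in view.values() if p.get("logins_last_30d", 0) >= min_logins]
    fans = [p for p in regulars if brand in p.get("likes", set())]
    share = len(fans) / len(regulars) if regulars else 0.0
    emails = sorted(p["email"] for p in fans if "email" in p)
    return share, emails
```

The returned share is the kind of number quoted above, and the email list is the audience you would send the customized newsletter to.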

In other words, if you want to better understand your audience, use a smart data platform with social login to see, in visualized form, each person’s full social or professional identity and how they interact with your content.

In all, a smart data platform converts the data points you give it into an easier-to-understand format so that you can connect dots across various customer identities that you didn’t previously know were related. Why is this useful? Because if you ask a smart data platform to use CRM data to inform prospecting efforts or DMP data to inform ad targeting efforts, you will see an instant return.

Big data is supposed to make our lives easier -- and we’re getting to the point where it truly does. But we all need to understand the actionability of the platforms we use, and allow big data to truly change the way we do business. It is our newest asset, and, sooner than you think, we’ll be putting it on the balance sheet with a large, green number next to it.

(Demystifying Big Data / shutterstock)

H.O. Maycotte

Founder, CEO, Umbel

As the great-grandson of the Mexican General who shot off Pancho Villa's leg, H.O. embodies the spirit of a true revolutionary. His background is in starting up innovative companies from scratch, such as RateGenius, Flightlock (a travel safety company purchased by Control Risks), Finetooth (a contract management solution, now doing business as Mumboe), the Texas Tribune (a non-profit, nonpartisan public media organization) and now Umbel, a smart data company focused on fueling the monetization of digital media. And despite his ancestor's sharpshooting tendencies, he's one of the friendliest guys you'll ever meet.
