It's no secret that we get pretty excited about social data here at Mass Relevance. We live and breathe analytics - whether it's measuring ROI for our client campaigns, predicting trends at CES, or measuring the popularity of UK Christmas adverts. If there is an opportunity to analyze real-world events with social metrics, count us in.
The Dresses, The Celebs...The Data
So of course I jumped at the opportunity to merge the ever-sexy world of social analytics with the star-studded land of the Golden Globes. For today's study, I wanted to learn how data can give us a view behind the Hollywood curtain, and see if we can reveal the power players behind the scenes.
After doing some digging, I found that thirty of the 2014 Golden Globe nominees are active on Twitter. If we just take a quick look at the high-level data associated with their accounts, we can see simple things like who has the most followers (Leonardo DiCaprio at 8.5M) and who is following the most people (Idris Elba with 607).
But, c'mon, we can do better than that.
Uncovering Hollywood's Social Network
To dive deeper, we'll have to get our hands dirty. There's a field of study called "social network analysis" (SNA) that maps connections between people and tries to make sense of what those connections can tell us. There are a lot of great tools available to help with this process. For my study, I used the Twitter REST API to collect data, a series of Python scripts to process the results, and an open-source tool called Gephi that helps visualize social networks for analysis.
Using the above methods, I discovered which nominees follow each other on Twitter and mapped out a network for the 2014 Golden Globe Nominees. Here's what that network looks like:
Lines between circles are called "edges", and they show a one-way connection between two users. The circles (called "nodes") are sized according to their calculated influence in the network.
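For the curious, here's a rough sketch of how a list of "who follows whom" can be turned into edges. It's not the actual script from this study: the account names and the `following` dictionary are made-up toy data, and `build_edges` and `edges_to_csv` are hypothetical helper names. The output is a plain CSV edge list with `Source` and `Target` columns, a layout Gephi's spreadsheet import should accept.

```python
import csv
import io


def build_edges(following, nominees):
    """Return sorted (follower, followed) pairs, keeping only nominee-to-nominee follows."""
    edges = []
    for follower in sorted(following):
        if follower not in nominees:
            continue
        for followed in sorted(following[follower]):
            # Drop accounts outside the nominee list and any self-references.
            if followed in nominees and followed != follower:
                edges.append((follower, followed))
    return edges


def edges_to_csv(edges):
    """Serialize directed edges as CSV text with Source,Target headers."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target"])
    writer.writerows(edges)
    return buffer.getvalue()


# Toy data (hypothetical handles, not the real nominee accounts).
nominees = {"nominee_a", "nominee_b", "nominee_c"}
following = {
    "nominee_a": {"nominee_b", "some_other_account"},
    "nominee_b": {"nominee_c"},
    "nominee_c": {"nominee_b"},
}

edges = build_edges(following, nominees)
csv_text = edges_to_csv(edges)
print(csv_text)
```

Each row is one one-way edge, so a mutual follow shows up as two rows, one in each direction.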
SNA can tell us some interesting things about networks using a few key metrics to look at influence. For example, the nominee following the most other nominees is Tatiana Maslany. The Orphan Black nominee is clearly interested in what the other nominees are up to. But anyone on Twitter can click a link to follow other people, so that's not a great metric if we're looking to measure influence. What we really want to know is who is being followed, not the other way around. So how many of the other nominees are following Tatiana? Checking the data, it turns out it's zero. Hmm.
So let's look at the other side - who is being followed the most. The great Tom Hanks, who has two Oscars and four Golden Globes to his name, is leading the pack and being followed by ten other nominees. More surprisingly, the second most followed by nominees is relative newcomer Lena Dunham with nine. Now we're getting somewhere.
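In SNA terms, "how many nominees you follow" is your out-degree and "how many nominees follow you" is your in-degree. Here's a minimal sketch of counting both from an edge list. The single-letter names are toy placeholders rather than real nominees, and `degree_counts` is a hypothetical helper.

```python
from collections import Counter


def degree_counts(edges):
    """Count in-degree (followers) and out-degree (following) per node."""
    out_degree = Counter(follower for follower, _ in edges)
    in_degree = Counter(followed for _, followed in edges)
    return in_degree, out_degree


# Toy edges as (follower, followed) pairs: "a" follows everyone, nobody follows "a".
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("d", "c")]

in_degree, out_degree = degree_counts(edges)
most_followed = in_degree.most_common(1)[0]

print("out-degree of a:", out_degree["a"])  # a follows three accounts
print("in-degree of a:", in_degree["a"])    # nobody follows a, so Counter reports 0
print("most followed:", most_followed)
```

In this toy network, "a" plays the same role as the nominee who follows lots of people but isn't followed back: high out-degree, zero in-degree.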
Influence isn't all about sheer number of followers; it's also about people who act as bridges and connectors between networks. Malcolm Gladwell's great book The Tipping Point investigates this theory in depth, and talks about the unique influence connectors wield. In SNA, a measure known as "betweenness" is used to find connectors. Within the Golden Globes nominee network, Lena Dunham tops the group (again). Lena bridges the most gaps between diverse groups, giving her a unique form of power in this small network.
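If you want to see what betweenness actually counts, here's a small sketch: for every pair of accounts, it looks at the shortest chains of follows between them and credits each account that sits in the middle. This version uses Brandes' algorithm on a directed, unweighted graph. It's an illustration only, not necessarily how Gephi or my scripts computed it, and the node names are toy placeholders.

```python
from collections import deque


def betweenness(nodes, edges):
    """Unnormalized betweenness centrality for a directed, unweighted graph."""
    adjacency = {n: [] for n in nodes}
    for source, target in edges:
        adjacency[source].append(target)

    score = {n: 0.0 for n in nodes}
    for start in nodes:
        # Breadth-first search from `start`, counting shortest paths (sigma).
        visited_order = []
        predecessors = {n: [] for n in nodes}
        sigma = {n: 0 for n in nodes}
        distance = {n: -1 for n in nodes}
        sigma[start] = 1
        distance[start] = 0
        queue = deque([start])
        while queue:
            v = queue.popleft()
            visited_order.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    sigma[w] += sigma[v]
                    predecessors[w].append(v)

        # Walk back from the farthest nodes, accumulating each node's share.
        dependency = {n: 0.0 for n in nodes}
        while visited_order:
            w = visited_order.pop()
            for v in predecessors[w]:
                dependency[v] += sigma[v] / sigma[w] * (1 + dependency[w])
            if w != start:
                score[w] += dependency[w]
    return score


# Toy network: two pairs of mutual follows, joined only through "bridge".
nodes = ["a", "b", "bridge", "c", "d"]
edges = [
    ("a", "b"), ("b", "a"),
    ("a", "bridge"), ("bridge", "c"),
    ("c", "d"), ("d", "c"),
]

score = betweenness(nodes, edges)
top_connector = max(score, key=score.get)
print(top_connector, score)
```

In this toy network "bridge" has only one follower, yet it scores highest because every path from the first pair to the second runs through it.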
But an even better measure of influence is called Eigenvector Centrality, which works a bit like Google's PageRank to weigh connections. The resulting ranking is based not just on the number of people connecting to you, but also on how influential those people connecting to you are. So who rules this network based on Eigenvector Centrality? Tom Hanks, Lena Dunham, Aaron Paul, and Zooey Deschanel are the top four. Can this metric predict the outcome of the awards? Well, I tried to do that on Friday and it didn't work out so well. Given the small sample size of the network, the fact that the Hollywood Foreign Press is voting (and not the actors), and that many of the nominees that won aren't even on Twitter, it's not surprising that this data didn't turn out to be predictive. But, hey, I had to try.
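To make the eigenvector centrality idea a little more concrete, here's a minimal sketch using power iteration: each account's score is repeatedly recomputed from the scores of the accounts following it until the numbers settle. The network below is toy data with placeholder names, `eigenvector_centrality` is a hypothetical helper, and adding a node's own previous score at each step is a convergence aid I'm assuming here, not a description of what Gephi does internally.

```python
import math


def eigenvector_centrality(nodes, edges, iterations=100, tolerance=1e-9):
    """Power iteration where a node's score comes from the scores of its followers."""
    followers = {n: [] for n in nodes}
    for follower, followed in edges:
        followers[followed].append(follower)

    score = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Keep the previous score (a shift that aids convergence) and add follower scores.
        updated = {n: score[n] + sum(score[f] for f in followers[n]) for n in nodes}
        norm = math.sqrt(sum(value * value for value in updated.values()))
        updated = {n: value / norm for n, value in updated.items()}
        change = sum(abs(updated[n] - score[n]) for n in nodes)
        score = updated
        if change < tolerance:
            break
    return score


# Toy edges as (follower, followed) pairs.
# "b", "c", and "d" each have exactly one follower, but b and c are followed by
# the well-followed "a", while d is followed only by "c".
nodes = ["a", "b", "c", "d"]
edges = [
    ("b", "a"), ("c", "a"), ("d", "a"),
    ("a", "b"), ("a", "c"),
    ("c", "d"),
]

score = eigenvector_centrality(nodes, edges)
ranking = sorted(nodes, key=score.get, reverse=True)
print(ranking[0], score)
```

Notice that "b" and "d" have the same number of followers, but "b" ends up with the higher score because its one follower is the most influential account in the network. That's the whole trick behind this metric.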
What Have We Learned?
As you can see, I've just been having a bit of fun here, but hopefully I got across a few key concepts of SNA. While the tactics I used on the Golden Globes nominee network didn't give us insight into the winners, they did give us a better understanding of who some of the power players are in Hollywood from a social standpoint. Remember: the value in social data is not only in the conversations people are having - you can find additional value by analyzing influence and connections to understand the full story. We'll dive into more SNA in future blog posts and maybe even try a revised methodology for predicting some upcoming contests. I've heard there's another little award show coming up in a few weeks...