Can social media data predict the outcome of an election?
It's a key question many have asked, and have worked to decipher from the troves of user data available. But thus far, there's no definitive answer, especially when it comes to modern US politics.
Part of that is likely because of the celebrity status of US President Donald Trump, and the way in which he has used social media to connect with his constituents. Trump's massive social media reach seems to somewhat skew the data - for example, previous academic research has suggested that sheer mention volume is the best indicator of a candidate's performance, and likelihood of winning.
A 2011 study from Dublin City University found that tweet volume was "the single biggest predictive variable" in election results, a finding that was echoed in another study conducted by the Technical University of Munich:
"The mere number of tweets reflects voter preferences and comes close to traditional election polls."
Tweet volume, reflecting relative discussion and popularity, has been a consistent indicator of subsequent candidate performance - however, that wasn't the case in the 2016 US Presidential election.
President Trump was still able to claim victory through the Electoral College system, but the final results showed that even though Trump dominated the social media discussion, that dominance did not translate directly into voting behavior - he did not win the popular vote.
That, as noted, could indicate that Trump's status shifts the scales in terms of predictive metrics, so we can't say, for sure, what's a great indicator of the probable election outcome. But, for context, here's a look at some of the current data points, and how the two US Presidential candidates are tracking on key social metrics.
First off, on mentions - according to data from Facebook's analytics dashboard CrowdTangle, Trump is trouncing Biden in overall engagement across The Social Network over the last three months.
The shares here may be most important - while direct engagement with your posts is a good indicator of popularity and message resonance, shares are essentially message spread, and indicate that people are looking to pass on your messaging to other people in their own networks.
Reach is the key strength of The Social Network, and shares are a key element of this - and as you can see, on this front, Trump is seeing more than 5x the share activity on the platform.
Of course, Trump is also starting from a larger base - Trump has 32.5 million followers on Facebook, in comparison to Biden's 3.7 million. That could skew the data, while it's also not clear why people are sharing Trump's messages.
Many of Trump's comments, like his recent statement about the #BlackLivesMatter protests, have been shared in criticism, which again skews the data. But in a direct comparison, Trump is clearly leading the discussion on the most influential social network.
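To make the follower-base caveat concrete, here's a rough sketch in Python that scales the share gap by audience size. The follower counts are the ones cited above. The 5.0 share ratio is an illustrative assumption standing in for "more than 5x", and the function name is hypothetical. Shares also come from people who don't follow the Page, and some are shared in criticism, so treat the output as a loose indication only.

```python
# Facebook follower counts cited in the article (in millions).
TRUMP_FOLLOWERS_M = 32.5
BIDEN_FOLLOWERS_M = 3.7

# Assumption: "more than 5x" is taken as exactly 5.0 for illustration.
ASSUMED_SHARE_RATIO = 5.0


def per_follower_share_ratio(share_ratio, followers_a, followers_b):
    """Ratio of A's shares-per-follower to B's shares-per-follower."""
    follower_ratio = followers_a / followers_b
    return share_ratio / follower_ratio


follower_ratio = TRUMP_FOLLOWERS_M / BIDEN_FOLLOWERS_M
adjusted = per_follower_share_ratio(
    ASSUMED_SHARE_RATIO, TRUMP_FOLLOWERS_M, BIDEN_FOLLOWERS_M
)

print(f"Follower base ratio (Trump/Biden): {follower_ratio:.2f}x")
print(f"Shares per follower (Trump/Biden): {adjusted:.2f}x")
```

A per-follower figure below 1 would mean that, relative to audience size, Biden's posts are being shared at a higher rate. The result depends entirely on the assumed share ratio, so it should be rerun with the actual CrowdTangle totals.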
As reported by The New York Times, Trump has also seen nearly twice as many likes and comments on Instagram as Biden over the past month, underlining his presence, in pure volume metrics.
But, at the same time, Biden has been gaining momentum. According to social media analytics company Socialbakers, Biden's Twitter account has seen significant growth in 2020:
"In January 2020, Biden had 2,657,870 total interactions, just 8.2% of Trump’s monthly average of 60,518,463. After only 7 months, Biden peaked at 32,283,027 total interactions in August - a whopping 50.34% of Trump’s monthly average."
So while Biden is not at the same level as Trump in terms of overall mentions or engagement, the data shows that he has gained, in relative terms - which, given Trump's celebrity, could be indicative.
Maybe it's simply not possible to expect any candidate to catch up to Trump on volume, given his social media dominance, and as such, relative gains could be the best indicator of performance. It'll be impossible to say, of course, until after the poll.
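As a quick illustration of what "relative gains" means here, the sketch below computes how many times Biden's monthly Twitter interactions grew between the two months in the Socialbakers quote. The interaction totals are taken from that quote. The function name is hypothetical, and this is just one simple way to express momentum, not a method used by Socialbakers.

```python
# Monthly interaction totals for Biden's Twitter account, as quoted from Socialbakers.
BIDEN_JAN_2020 = 2_657_870
BIDEN_AUG_2020 = 32_283_027


def growth_multiple(start, end):
    """Return how many times larger `end` is than `start`."""
    if start <= 0:
        raise ValueError("start must be positive")
    return end / start


multiple = growth_multiple(BIDEN_JAN_2020, BIDEN_AUG_2020)
print(f"Biden's monthly interactions grew {multiple:.1f}x from January to August")
```

On the quoted totals, that works out to roughly a twelvefold increase over seven months.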
The next step, then, is to try to glean insight into how each candidate is being mentioned.
Using the Twitter analytics tool HappyGrumpy, a basic analysis suggests that sentiment around Trump's tweets is 27% positive and 38% negative, compared to Biden, who sees a 20% positive rate and 40% negative. So Trump is seeing more positive responses, but the gap between the two is fairly narrow.
Yadav's more specific methodology actually found the opposite - that Biden is seeing more positive replies, in comparison to negative reviews. But overall, the gap is fairly close, there's nothing definitive here, and no clear winner in overall sentiment.
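One way to put the HappyGrumpy percentages on a single scale is a net sentiment score, which is positive share minus negative share. The sketch below uses only the four figures quoted above. The function names are hypothetical, and net score is a common convention rather than anything HappyGrumpy itself reports.

```python
# Sentiment percentages reported from HappyGrumpy in the article.
SENTIMENT = {
    "Trump": {"positive": 27, "negative": 38},
    "Biden": {"positive": 20, "negative": 40},
}


def net_sentiment(scores):
    """Positive share minus negative share, in percentage points."""
    return scores["positive"] - scores["negative"]


def positive_to_negative(scores):
    """Positive share divided by negative share."""
    return scores["positive"] / scores["negative"]


for name, scores in SENTIMENT.items():
    net = net_sentiment(scores)
    ratio = positive_to_negative(scores)
    print(f"{name}: net {net:+d} pts, positive/negative ratio {ratio:.2f}")
```

On these figures, both candidates sit in net-negative territory, with Trump at -11 points and Biden at -20.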
Yadav also notes the limitation in analysis due to sarcasm, which is not generally picked up by automated analytics systems:
"So, If a sentence contains a large number of positive words like “greatest”, “excellent” in a negative comment which is written in a sarcastic way. So, it will definitely classify it as a positive sentiment."
That does make sentiment a difficult element to measure reliably.
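The sarcasm problem is easy to reproduce with a deliberately naive word-counting classifier. The sketch below is not the method used by HappyGrumpy or by Yadav. The word lists, the function name and the sample reply are all made up for illustration.

```python
import re

# Hypothetical, tiny word lists for illustration only; real tools use far larger lexicons.
POSITIVE_WORDS = {"greatest", "excellent", "great", "good"}
NEGATIVE_WORDS = {"worst", "terrible", "bad", "awful"}


def naive_sentiment(text):
    """Label text by counting positive vs negative words, ignoring context."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(word in POSITIVE_WORDS for word in words)
    negatives = sum(word in NEGATIVE_WORDS for word in words)
    score = positives - negatives
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# A made-up sarcastic reply: the intent is critical, the vocabulary is glowing.
sarcastic = "Oh sure, the greatest plan ever, truly excellent work."
print(naive_sentiment(sarcastic))
```

Because the classifier only counts words, the sarcastic reply is labeled positive, which is exactly the failure described in the quote.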
That then leaves one other key factor to consider - audience growth, and who's gaining more followers leading into the poll, which could suggest relative popularity with voters.
In terms of Facebook Page Likes, Biden has seen a lot more growth in the last three months.
So again, a direct comparison on overall numbers is not going to provide much insight, given Trump's pre-existing celebrity status. But in relative terms, Biden is gaining momentum.
That same trend is also reflected on Twitter - using the Wayback Machine, the screengrabs show that:
Volume-wise, in terms of engagement and audience, Trump is by far the leader. But the trends show that Biden is winning, on both fronts, in relative terms in the most recent period.
So, what does that mean in terms of an overall prediction? As noted, several past studies have shown that volume alone is the best predictor, but Trump's status changes that, and may shift the results.
With this in mind, and knowing that a direct volume comparison is not effective, it could be that you need to look at recent growth in isolation, which shows that Biden is gaining significant momentum on Twitter engagement, Twitter followers, and Facebook Likes. But Trump still dominates the space, and his huge, established social media presence gives him significant capacity to spread his messaging.
Which is the best indicator of success? We won't know till early next month, but these data trends could provide some new insight into the predictive capacity of social media for election results.