Twitter Analytics: These words may be affecting your popularity

Text Mining techniques can identify specific words that are correlated with Twitter accounts having high or low popularity. This can be done in two ways: (1) by analyzing the text of each user's Tweets and (2) by analyzing the text of each user's biography.

Let's start with the results of the first type of analysis, which uses data from user Tweets. Pay attention only to the cells highlighted in red, their corresponding category column (LOWFOLLOWERS, HIGHFOLLOWERS) and the word at the beginning of each corresponding row. The results show which words appear to be important. Because the affinity shown here is only moderate, use results as possible clues only.

The results so far show us that:

hate, bed: are found to be correlated with low popularity
top, online, send, list, web, media, join: are found to be correlated with high popularity

Here is another portion of the results table:

The pattern should be evident by now: words that express a negative attitude appear to influence a user's follower count negatively. As also shown above, foul language appears to work negatively as well. Several other insights were found, such as specific phrases that are correlated with low popularity ("watching TV") and other phrases ("stay tuned") that are correlated with popular accounts. The number shown in parentheses quantifies the magnitude of each word's association and thus lets us order words by importance.
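The post does not say which measure produces the number shown next to each word. As a minimal sketch of how a word-level association could be quantified and ranked, the Python below uses a smoothed log-odds ratio between the two follower categories. The measure, the function name `word_association`, and the toy Tweets are all assumptions made for illustration; they are not the data or the method behind the results above.

```python
import math
import re
from collections import Counter

def word_association(tweets_by_group, smoothing=0.5):
    """Return {word: log-odds ratio}; positive leans HIGH, negative leans LOW.

    The log-odds ratio is an assumed stand-in for the article's
    unspecified association measure.
    """
    groups = ("HIGH", "LOW")
    totals = {g: len(tweets_by_group[g]) for g in groups}
    doc_counts = {g: Counter() for g in groups}
    for g in groups:
        for tweet in tweets_by_group[g]:
            # Count each word once per Tweet (document frequency).
            doc_counts[g].update(set(re.findall(r"[a-z']+", tweet.lower())))

    scores = {}
    for word in set(doc_counts["HIGH"]) | set(doc_counts["LOW"]):
        # Smoothed counts of Tweets with and without the word, per group.
        hi_with = doc_counts["HIGH"][word] + smoothing
        hi_without = totals["HIGH"] - doc_counts["HIGH"][word] + smoothing
        lo_with = doc_counts["LOW"][word] + smoothing
        lo_without = totals["LOW"] - doc_counts["LOW"][word] + smoothing
        scores[word] = math.log((hi_with / hi_without) / (lo_with / lo_without))
    return scores

# Hypothetical toy data, invented for this example only.
toy = {
    "HIGH": ["join our online list", "top media list", "stay tuned for more"],
    "LOW": ["i hate homework", "stuck in bed", "i hate this boring class"],
}
scores = word_association(toy)

# Order words by the magnitude of their association, strongest first.
ranked = sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)
for word, score in ranked[:5]:
    print(f"{word} ({score:+.2f})")
```

On real data the two groups would come from a follower-count threshold, and a significance test would be needed before treating any score as more than a clue.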

Some of the words (and their synonyms) that were found to be associated with very low follower counts are:

- Sleep, Hate, Damn, Feeling, Homework, Class, Boring, Stuck

A total of 63 words and 25 phrases were found to have either a positive or a negative association with follower count. Interestingly, phrases that communicate any kind of opportunity are also associated with a high number of followers. "Thank you" is strongly associated with high popularity.

Here comes the interesting part: once the Text Mining analysis is completed, a predictive model can be generated and used to score future Tweets. Let's assume that you are about to send the following 2 Tweets:

1) 'Today i feel like sleeping all day. Yawn...'
2) '@xyz Your website traffic can be increased with good marketing'

Before you post, however, you decide to feed these 2 sentences to a predictive model. For every Tweet, the model returns the predicted result (GOOD or BAD) and the associated probability. Here are the results for these 2 examples from an actual run:

In other words:

1) The first Tweet may have a negative effect, with a probability of 83.5%
2) The second Tweet may have a positive effect, with a probability of 99.9%

Note that:

  • A predictive model is able to consider combinations of words, not just single words. This considerably raises the accuracy of any prediction.
  • In any real-world application of Text Mining, 100% prediction accuracy cannot be achieved. Although the figure is application-specific, 72-78% accuracy may be achieved, with considerable effort. Of course, many more things are important for achieving high popularity, and the example above is given merely to discuss what techniques currently exist. A combination of analytical techniques is the best option, and this will be discussed in a future post.
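The post does not name the predictive model it used. As a rough sketch of the idea, the Python below trains a small multinomial Naive Bayes scorer (an assumed stand-in) on single words plus adjacent word pairs, so that word combinations contribute to the prediction. The class name `TweetScorer`, the `features` helper, and the six training Tweets are hypothetical and invented for illustration, so the probabilities it returns will not match the figures quoted above.

```python
import math
import re
from collections import Counter

def features(text):
    """Lowercased words plus adjacent word pairs, so phrases count too."""
    words = re.findall(r"[a-z']+", text.lower())
    return words + [" ".join(pair) for pair in zip(words, words[1:])]

class TweetScorer:
    """Tiny multinomial Naive Bayes; assumes both labels appear in training."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing strength
        self.feature_counts = {"GOOD": Counter(), "BAD": Counter()}
        self.doc_counts = Counter()
        self.vocab = set()

    def fit(self, labelled_tweets):
        for text, label in labelled_tweets:
            feats = features(text)
            self.feature_counts[label].update(feats)
            self.doc_counts[label] += 1
            self.vocab.update(feats)
        return self

    def predict(self, text):
        """Return (label, probability) for the more likely label."""
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for label, counts in self.feature_counts.items():
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(self.doc_counts[label] / total_docs)
            for feat in features(text):
                if feat in self.vocab:  # ignore features never seen in training
                    score += math.log((counts[feat] + self.alpha) / denom)
            log_scores[label] = score
        # Turn log scores into probabilities that sum to 1.
        top = max(log_scores.values())
        weights = {lab: math.exp(s - top) for lab, s in log_scores.items()}
        best = max(weights, key=weights.get)
        return best, weights[best] / sum(weights.values())

# Hypothetical training set; a real one would hold many thousands of Tweets.
training = [
    ("thank you for the great list", "GOOD"),
    ("join us online and stay tuned", "GOOD"),
    ("good marketing can increase website traffic", "GOOD"),
    ("i hate homework so boring", "BAD"),
    ("stuck in bed feeling sleepy", "BAD"),
    ("watching tv all day", "BAD"),
]
scorer = TweetScorer().fit(training)

for tweet in (
    "Today i feel like sleeping all day. Yawn...",
    "@xyz Your website traffic can be increased with good marketing",
):
    label, prob = scorer.predict(tweet)
    print(f"{label} ({prob:.1%}): {tweet}")
```

With so little training data the probabilities are not meaningful in themselves; the point is only the shape of the workflow: fit once on labelled Tweets, then score a draft before posting.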

Several other types of analysis can extract similarly interesting insights. Let's not forget that Tweets contain the emotions, beliefs and values of users. They contain what people want and what they don't want. See "Clustering the thoughts of Twitter Users" and "Know your customers the Twitter way" for further discussion.

There will be more to say shortly, with practical examples, about Text Mining and how it can be put to use by PR agencies and marketing companies.

Join The Conversation

  • HankWasiak (Dec 10, posted 7 years ago)
    Thanks for the information. Very useful and thought provoking. It does seem to confirm a truth in communications that we have learned in the non-profit sector and that is becoming even more relevant today with the prevalence of social media. Studies show that while negative communications, built on fear for example, may generate some initial awareness and get noticed, they generally do not produce sustainable changes in behavior. Positive communications built on attainable rewards and benefits work best in this regard all the time. The Twitter analytics you discussed seem to indicate that on an initial surface level people would rather engage with positive proactive peeps and tweets...something that asset-based thinkers know as the "laws of attraction". Look forward to more of your posts. @hankwasiak




  • ThemosKalafatis (Dec 7, posted 7 years ago)
    @Carl Vervisch

    There are many problems with Twitter accounts: spam accounts, and celebrities and politicians who have many thousands of followers because of who they are, to name a few. These users were not part of the analysis, although in some cases it is hard to identify which Twitter users have auto-follow bots. Most users tend to un-follow users that do not give valuable information. You could also analyze this information based on the follower-to-following ratio rather than strictly follower count (which is also discussed in one of my other posts).

    Note also that I do not imply that by just using these words you are going to draw many followers. To do this you need many other things, and this is explicitly stated towards the end of this post.



  • ThemosKalafatis (Dec 7, posted 7 years ago)
    @Ari Herzog

    Good point. This text mining exercise did not take into account the fact that 70% of Twitter users quit shortly after they create an account, but this problem could possibly be solved by checking the date of the last Tweet for each of the followers.

  • ThemosKalafatis (Dec 7, posted 7 years ago)
    @joel foner

    Any causal relationship is by no means implied (notice how many times I use the word "appears" in the text). Please also notice the phrase "Use as possible clues only".

    What you say is correct. This is merely an example of what kind of insights could possibly be extracted from Twitter. Please check lifeanalytics for more examples.

  • ariherzog (Dec 6, posted 7 years ago)
    Great research, introducing me to some concepts I hadn't considered, but isn't it a fallacy to equate popularity with followers at a time when 70% of new Twitter users quit in the first 30 days? Who cares how many followers one has if x% are inactive?

    A better metric is the number of @ replies and/or retweets within a certain time frame.
