But how accurate are these tools at identifying fake accounts? Probably not as accurate as you might think.
What's in a number?
At first glance, there's a lot of consistency between the tools. Consider two assessments of the @leaderswest Twitter account:
Each tool determined that of my 20,000 followers, about 600 may be "fake." But the SocialBakers tool goes one step further and proposes "fake" accounts to block:
To SocialBakers' credit, they list the criteria for inclusion in this list and say specifically:
"We understand that these criteria, number 6 in particular, don't necessarily define fake followers. However these kinds of followers can be considered empty or inactive and therefore not helpful to you in terms of reach."
But let's look at the first "fake" follower @tweetcaroline:
Caroline appears to have been miscategorized as a "fake" account because she uses the paper.li content aggregator. The standard paper.li tweet repeats the same phrase in multiple tweets. But @carolinetweets most certainly isn't a fake account. And this is a common reason that some legitimate accounts were identified as "fake" with the SocialBakers tool.
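To see how a rule like this can trip up a real person, here is a minimal sketch in Python. It is not SocialBakers' actual method: the function names, the four-word prefix, the 50% threshold, and the sample tweets are all made up for illustration. It flags a timeline as "fake" when most tweets open with the same phrase, which is exactly what an aggregator's automated tweets look like.

```python
from collections import Counter


def repeated_phrase_share(tweets, prefix_words=4):
    """Return the share of tweets that start with the most common opening phrase."""
    if not tweets:
        return 0.0
    # Compare only the first few words, lowercased, so small differences
    # at the end of a templated tweet do not hide the repetition.
    openings = Counter(
        " ".join(tweet.lower().split()[:prefix_words]) for tweet in tweets
    )
    most_common_count = openings.most_common(1)[0][1]
    return most_common_count / len(tweets)


def looks_fake(tweets, threshold=0.5):
    """Hypothetical rule: flag the account if repetition exceeds the threshold."""
    return repeated_phrase_share(tweets) > threshold


# Made-up timeline of a real person who also uses a content aggregator.
aggregator_user = [
    "The Example Daily is out! Top stories today via @alice",
    "The Example Daily is out! Top stories today via @bob",
    "The Example Daily is out! Top stories today via @carol",
    "Great panel at the conference this morning",
]

print(looks_fake(aggregator_user))
```

Three of the four tweets share an opening, so the rule flags the account even though a person is clearly behind it. Any single-signal rule of this kind will catch legitimate accounts along with the fake ones.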
Assumptions for fake Twitter followers
Anytime you make assumptions about a large population, slight imprecision can produce hugely inaccurate results. A recent (more serious) example of this is the Excel errors found in two renowned economists' calculations for European austerity measures. The austerity measures that have been implemented over the last few years were based on small miscalculations that produced huge errors.
The percentage of fake Twitter followers will never approach the seriousness of austerity, but the same principle applies. Assumptions about a group of two hundred million people, no matter how slight, can result in big errors.
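A quick back-of-the-envelope calculation makes the scale concrete. The 600-in-20,000 figure and the two-hundred-million population come from the numbers above. The misclassification rates are purely hypothetical, chosen only to show how a small rate grows, and the function name is my own.

```python
FOLLOWERS = 20_000          # followers of the account assessed above
FLAGGED = 600               # followers the tools called "fake"
POPULATION = 200_000_000    # rough size of the group being generalized about


def misclassified_accounts(population, error_rate):
    """Number of accounts wrongly labeled if the tool errs at the given rate."""
    return round(population * error_rate)


flagged_share = FLAGGED / FOLLOWERS
print(f"Share flagged as fake: {flagged_share:.1%}")

# Hypothetical error rates, not measured values.
for error_rate in (0.001, 0.005, 0.01):
    wrong = misclassified_accounts(POPULATION, error_rate)
    print(f"Error rate {error_rate:.1%}: about {wrong:,} accounts mislabeled")
```

Even at a hypothetical error rate of one-tenth of one percent, hundreds of thousands of accounts would end up with the wrong label.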
What do you think? Are these tools useful? Are they accurate?