Twitter Says That It's Getting Better at Detecting and Removing Bots, Outlines Common Misinterpretations

As many, including myself, have noted many times over the years, Twitter has a problem with bot profiles.

For years, users have complained about the impact of bots and fake accounts on the platform, and while various research reports have pegged Twitter's fake profile levels at between 5% and 15%, their presence is seemingly even more significant than that. Researchers have repeatedly pointed to massive swarms of bot accounts being used for malicious purposes, with the most concerning use being the amplification of political messages in order to drown out opposing views.

But Twitter now says that a lot of those stats are incorrect - and not only are they incorrect, but they're also, potentially, dangerous to public discourse.

Twitter says that it's improved its detection and removal systems for problematic bot profiles over time, and the insinuation of these reports - that Twitter enables mass-retweeting on such a scale as to influence public opinion - is wrong.

In a post co-written by Yoel Roth, Twitter's Head of Integrity, and Nick Pickles, the company's Director of Global Public Policy Strategy, they outline the various limitations of third-party Twitter bot analysis, and point to their own results as a more accurate indicator of performance.

As per the post:

"Going back a few years, automated accounts were a problem for us. We focused on it, made the investments, and have seen significant gains in tackling them across all surfaces of Twitter."

Twitter says that its Transparency Report is a good indicator of its progress in this respect. In its most recent update, the total count of challenges issued to suspected spam accounts, which include malicious bots, was 15.4 million, covering June 2019.

Twitter had 139 million mDAU at that time, so as a very rough measure, those 15.4 million challenges would mean that, at most, 11.08% of its daily active users were flagged as possible fake profiles in that period. But that figure also includes every other possible violation, which would suggest that Twitter's actual spam account figure, based on its own actions taken, is in the low single digits - and that those accounts are indeed being both detected and removed systematically.
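For anyone who wants to check that percentage, the calculation can be sketched in a few lines of Python. This is only an illustration of the rough estimate above: the variable and function names are made up for this example, and it assumes, as the estimate itself does, that the challenge count can be compared directly with a daily user figure, which is a loose comparison at best.

```python
# Figures quoted in the article (Twitter's Transparency Report and mDAU count).
challenges_issued = 15_400_000   # challenges issued to suspected spam accounts
mdau = 139_000_000               # monetizable daily active users at the time


def share_of_mdau(challenges: int, daily_users: int) -> float:
    """Return challenges as a percentage of daily users (hypothetical helper)."""
    return challenges / daily_users * 100


# Rough ceiling on the share of accounts flagged as possibly fake.
upper_bound_pct = share_of_mdau(challenges_issued, mdau)
print(f"{upper_bound_pct:.2f}%")  # prints 11.08%
```

Because the challenge count covers every type of suspected spam, not just malicious bots, this number is best read as a ceiling rather than an estimate of actual bot prevalence.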

Given that this number is also improving over time (as shown in the chart), Twitter says that it is mostly getting rid of deceptive bots. Which leads to another key distinction that Twitter wants to make.

"It’s important to note, not all forms of automation are necessarily violations of the Twitter Rules. We've seen innovative and creative uses of automation to enrich the Twitter experience - for example, accounts like @pentametron and @tinycarebot."

Twitter also notes that business profiles can use bots and automation to improve customer service interactions, so you can't blanket tag all bots as bad, which further clouds the figures.

So which bots does Twitter consider to be breaking its rules?

Twitter lays out various types of misuse that will get a bot profile banned.

Malicious use of automation to undermine and disrupt the public conversation, like trying to get something to trend
Artificial amplification of conversations on Twitter, including through creating multiple or overlapping accounts
Generating, soliciting, or purchasing fake engagements
Engaging in bulk or aggressive tweeting, engaging, or following
Using hashtags in a spammy way, including using unrelated hashtags in a tweet (aka "hashtag cramming")

So the main focus of bot criticism - manipulating conversation - is indeed a violation, and Twitter will remove bots that break this rule. So long as it detects them, which it says, for the most part, it does.

"Our technological power to proactively identify and remove these behaviors across our service is more sophisticated than ever. We permanently suspend millions of accounts every month that are automated or spammy, and we do this before they ever reach an eyeball in a Twitter Timeline or Search."

So Twitter is removing bots, and its advanced detection processes suggest that the problem is not as pronounced as some reports have suggested.

Whether that rings true with you or not is largely a matter of opinion - or of how much weight you give to the data provided in academic studies of the offending tweets.

That's where it gets more difficult to accept Twitter's counter-argument. The studies it's referring to actually have the tweets as data, and they show that tweets are often repeated, word for word, and re-posted in rapid succession amid trending discussion periods. We know this happens - but according to Twitter, these are either being tweeted by real people, or the pattern is a misinterpretation of the broader trend.

Twitter specifically notes that tools like Botometer and Bot Sentinel, which aim to identify bot activity based on a range of parameters, are generally not accurate.

"These tools start with a human looking at an account - often using the same public account information you can see on Twitter - and identifying characteristics that make it a bot. So in essence, the account name, the level of Tweeting, the location in the bio, the hashtags used etc. This is an extremely limited approach."

Again, Twitter says that its own detection systems are now far more developed than these tools account for, and as such, the numbers indicated by such tools are not only wrong, but can be damaging to related discussion.

"Binary judgments of who’s a “bot or not”, which have real potential to poison our public discourse - particularly when they are pushed out through the media."

In summary, Twitter's saying that reports which point to widespread bot manipulation are likely misleading about the actual impact of that activity. That's not to say Twitter has every angle covered, nor that it's free of bot manipulation entirely. But running a profile through a bot checker is probably not enough evidence to point the finger and accuse any user of coordinated manipulation.

Based on the available data, that could be true, but it's very difficult to say - because nobody knows, for sure, how many Twitter profiles are, indeed, bots. A lot of supposed bot activity, as Twitter says, may well be real people who simply share similar messages, or engage heavily with certain topics. Some suggested bot campaigns could also be tagged as such by people who simply don't want to accept the actual results or trends, so they dismiss them as artificial.

Twitter's definitely best placed to understand such activity, with access to the raw data, and according to Twitter, most bot activity is being addressed. It's worth noting, also, that Twitter is considering new tags for bot accounts, like verified ticks for bots, which could help to better understand the origins of specific trends.

Again, how you take that will come down to your personal feeling, but the numbers suggest that Twitter is working to address its bot problem - and that the problem itself may not be as pronounced as it may seem.

UPDATE (5/21): Christopher Bouzy, the creator of Bot Sentinel, one of the apps specifically noted in the post from Twitter, has objected to the suggestion that the system Bot Sentinel uses is inaccurate. Bouzy says that it's an insult to suggest that Bot Sentinel uses "inferior methodology" and claims that its process is highly accurate in detecting problematic accounts.

As per Bouzy:

"There are enough people already spreading misinformation and disinformation on this platform, we don't want people who represent publicly traded companies posting inaccurate and misleading information that can be easily disproven by readily accessible public data."