Facebook Data Use: Myths and Misconceptions

One of the more interesting aspects of the ongoing investigation into Facebook data misuse has been the various revelations as to how, exactly, Facebook is tracking and storing data, some of which have confirmed – or at least solidified – long-standing myths about how The Social Network operates.

To clarify, and to get some perspective on the various ways in which Facebook is keeping tabs on its users, here’s a list of some of the findings uncovered as part of the latest investigation.

1. Facebook has been tracking call and SMS data on Android phones

This was a big one. With users now going through their downloaded Facebook data files with a fine-tooth comb (you can check yours by following these steps), some have found that The Social Network has logs of their incoming and outgoing phone calls and of the SMS messages they’d sent, totally independent of Facebook.

Downloaded my facebook data as a ZIP file

Somehow it has my entire call history with my partner's mum pic.twitter.com/CIRUguf4vD
— Dylan McKay (@dylanmckaynz) March 21, 2018

Facebook acknowledged this in an official post, explaining that:

“Call and text history logging is part of an opt-in feature for people using Messenger or Facebook Lite on Android. This helps you find and stay connected with the people you care about, and provides you with a better experience across Facebook. People have to expressly agree to use this feature.”

Why Facebook would need such data has been a question – but really, the fault also lies with Android, which made the information available.

Facebook has stated that:

“This feature does not collect the content of your calls or text messages. Your information is securely stored and we do not sell this information to third parties.”

The company has also now said that they have re-examined the feature to confirm they are not collecting any content from user messages, and they’ve implemented a new policy to delete all logs older than one year.

The case for collecting and keeping such data seems fairly loose – and really, Facebook doesn’t need it, even if they may have found a use for it at an earlier stage. As such, it’s good to see the company implementing a change in response to user concerns about this element.

2. Facebook has been keeping video content you recorded but never posted

Ever recorded a video via the Facebook camera then deleted it? As it turns out, you probably didn’t actually get rid of your bad take – in another discovery from the user data files, people have found that Facebook has been keeping and storing deleted videos on their servers.

In this case, Facebook has blamed a bug, saying that they never intended to keep these outtakes, but have inadvertently done so.

The Social Network now says it will remove these unpublished videos from their servers.

“We investigated a report that some people were seeing their old draft videos when they accessed their information from our Download Your Information tool. We discovered a bug that prevented draft videos from being deleted. We are deleting them and apologize for the inconvenience.”

There’s not really a heap of value in these deleted videos, you’d think, though Facebook, with its advanced video object recognition tools, could have kept them with a view to utilizing such content for enhanced data purposes in future.

Either way, the company says they’re now being deleted – how much trust you put in their word on this is down to your personal viewpoint.

3. Facebook scans your private messages, which are monitored when flagged

Yes, Facebook reads your private messages on Messenger.

Well, not all of them – Facebook has also admitted that in some cases, when conversations are flagged by their systems, they do read people’s messages in order to combat misuse.

As explained by Bloomberg:

“While Messenger conversations are private, Facebook scans them and uses the same tools to prevent abuse there that it does on the social network more generally. All content must abide by the same ‘community standards.’ People can report posts or messages for violating those standards, which would prompt a review by the company’s ‘community operations’ team. Automated tools can also do the work.”

Facebook also scans photos and links shared to detect potential violations of their policies.

This is a big one, particularly considering the growth of Messenger – and, worth noting, Facebook doesn’t scan the messages shared on WhatsApp, the company’s other messaging platform.

If you’re concerned about Facebook reading your messages, you can choose to encrypt them, but this is opt-in.

Facebook says that your message data is not scanned for advertising purposes (a long-speculated myth); it’s only scanned for the purposes of potential moderation.

Again, how much you trust the company’s official line on this is down to your individual perspective.

4. Facebook’s not listening in to your private conversations

And the last one, which keeps coming up time and time again – and has been dismissed by Facebook many times – is the claim that the platform is listening in to your private conversations via your phone’s internal microphone.

Theoretically, this one is possible, but as explained by Wired, it’s not entirely feasible.

The two problems with this theory are data load and accuracy.

On data load, as explained by Wired:

“To make it happen, Facebook would need to record everything your phone hears while it's on. This is functionally equivalent to an always-on phone call from you to Facebook. Your average voice-over-internet call takes something like 24kbps one way, which amounts to about 3 kBs of data per second. Assume you've got your phone on half the day, that's about 130 MBs per day, per user. There are around 150 million daily active users in the US, so that's about 20 petabytes per day, just in the US.”

That’s a lot of data to be consuming, for what you would assume would be minimally valuable insight, considering Facebook can track your interests in a range of other ways.
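For anyone who wants to check Wired’s back-of-the-envelope math, here’s a minimal sketch in Python. The input figures (24kbps, half a day of listening, 150 million US daily users) come straight from the quote; the function names are made up for illustration, and the sketch assumes decimal units (1 kB = 1,000 bytes, 1 PB = 1 billion MB).

```python
# Rough check of the data-load figures quoted from Wired.
# Assumption: decimal units (1 kB = 1,000 bytes, 1 PB = 1e9 MB).
# Function names are illustrative only, not from any real tool.

BITRATE_KBPS = 24               # one-way voice call bitrate, per the quote
SECONDS_ACTIVE = 12 * 60 * 60   # "half the day" = 12 hours
US_DAILY_USERS = 150_000_000    # US daily active users, per the quote


def kilobytes_per_second(bitrate_kbps):
    """Convert kilobits per second to kilobytes per second (8 bits per byte)."""
    return bitrate_kbps / 8


def megabytes_per_user_per_day(bitrate_kbps, seconds_active):
    """Audio data one always-listening phone would send in a day, in MB."""
    return kilobytes_per_second(bitrate_kbps) * seconds_active / 1000


def petabytes_per_day(mb_per_user, users):
    """Total daily upload across all users, in PB."""
    return mb_per_user * users / 1e9


if __name__ == "__main__":
    kb_s = kilobytes_per_second(BITRATE_KBPS)
    mb_user = megabytes_per_user_per_day(BITRATE_KBPS, SECONDS_ACTIVE)
    pb_total = petabytes_per_day(mb_user, US_DAILY_USERS)
    print(f"{kb_s:.0f} kB per second")
    print(f"{mb_user:.1f} MB per user per day")
    print(f"{pb_total:.2f} PB per day across the US")
```

Under those assumptions the numbers land at 3 kB per second, roughly 130 MB per user per day and a little over 19 petabytes per day in total, which is in line with the rounded figures in the quote.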

The other issue is the accuracy of speech-to-text systems. While speech-to-text tools generally work pretty well when trained on a single voice over time, they’re not so great at identifying different voices, or at separating a person’s speech from background noise and other distractions.

Of course, the counter-argument is Shazam – the song-identifying app is able to give you accurate audio matches almost instantaneously, which proves that audio recognition is fairly well advanced. And that’s true, it is far more advanced than it used to be, but Shazam works on specific musical cues which have already been uploaded to its servers. If Facebook had access to every user’s voice clips as a training model, then maybe that would work the same – and again, if they were to go down that path, that would take a huge amount of data and processing work, which Facebook really doesn’t need to do.

Again, a lot of people are convinced Facebook's doing this – check out Reddit for the many stories of people who will swear that Facebook must be listening in to their conversations. But it doesn’t seem entirely likely – and Facebook, as noted, has repeatedly denied they do this.

It’s interesting to see the varying perspectives and questions being raised, and to get a measure of the validity of each, based on Facebook’s own data.

Really, it’s good to have more transparency on how Facebook operates in this regard, and you can expect even more disclosure on this front in coming weeks.