At one of his semi-regular Q&A sessions earlier this year, Facebook CEO Mark Zuckerberg answered a question about what he thinks the world will look like in ten years' time. Zuckerberg's answer covered Facebook's efforts to connect the world through internet.org and the evolution of virtual reality through Oculus. But a key element of Zuckerberg's focus was artificial intelligence, and his response highlights, at least in part, why Facebook is going to such effort, and spending such huge amounts, on AI research and development.
"...we're working on AI because we think more intelligent services will be much more useful for you to use. For example, if we had computers that could understand the meaning of the posts in News Feed and show you more things you're interested in, that would be pretty amazing. Similarly, if we could build computers that could understand what's in an image and could tell a blind person who otherwise couldn't see that image, that would be pretty amazing as well. This is all within our reach and I hope we can deliver it in the next 10 years."
Zuckerberg's view gives some insight into Facebook's wider push on AI, but not the whole picture, as there are obviously also significant financial benefits to such a system. An image recognition system that could show you more things you're interested in would greatly expand Facebook's understanding of audience behaviors and interests, and would thus further expand the platform's capacity to deliver what each individual user wants. That's great for users, but it also has major implications for advertisers.
Learning to See
In line with this, Facebook has today released a new blog post detailing advances made by the Facebook AI Research (FAIR) team and showing just how close they are to taking their AI capabilities to the next level - and developing smarter, more attuned systems as a result. In the post, titled 'New Milestones in Artificial Intelligence', Facebook's Chief Technology Officer Mike Schroepfer first notes how their efforts on visual recognition are evolving:
"Next month FAIR will be presenting a new paper which details a state-of-the-art system that segments, or distinguishes between, objects in a photo. This new system segments images 30 percent faster, using 10x less training data, than previous industry benchmarks."
To add further context, FAIR's Director of AI Research Yann LeCun released a video earlier this year which explains how their AI system works to segment images and sort the various elements into different categories in order to determine their content.
The fact that Facebook's system is now able to do this 30% faster is significant - but the secondary point, regarding training data, is far more important.
Training data, in basic terms, is human input, the manual intervention required to help the system analyze and interpret the content. The more input the system has, the more it can attune its models and deliver the right result. One of the best examples of this is another Facebook AI project, their 'M' personal assistant. At the recent Web Summit in Dublin, Schroepfer discussed how M works, and how the system is learning from the responses of its team of human assistants. Schroepfer detailed how M's functions are currently largely served by real people - when a query is sent to M, the AI comes back with its best-match responses, then a human takes over and performs the requested task. But M is listening. M is learning from every interaction conducted, and over time, it's getting better at understanding how to respond.
An example Schroepfer highlighted was buying flowers - originally, if you'd asked M to buy some flowers, the system would come back with some partial responses but no real refinement. Now, when given the same query, M will come back with two questions: "What's your budget?" and "Where do you want them sent?" These two questions have been 'learned' by M after analyzing the way the human operators interact with users.
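The hand-off described above, where a human handles a request until the system has seen enough examples to answer on its own, can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the class name `AssistantSketch`, the confidence threshold, and the rule for raising confidence are all hypothetical and are not taken from Facebook's description of M.

```python
from dataclasses import dataclass, field


@dataclass
class AssistantSketch:
    """Hypothetical human-in-the-loop assistant; not Facebook's actual design."""

    threshold: float = 0.8                       # assumed confidence needed to answer alone
    learned: dict = field(default_factory=dict)  # query -> (reply, confidence)
    ai_answers: int = 0
    total: int = 0

    def handle(self, query: str, human_reply: str) -> tuple:
        """Answer from memory when confident; otherwise defer to the human and learn."""
        self.total += 1
        reply, confidence = self.learned.get(query, (None, 0.0))
        if reply is not None and confidence >= self.threshold:
            self.ai_answers += 1
            return reply, "ai"
        # The human operator handles the request. Confidence grows only when
        # the operator repeats the reply the system had already stored.
        if reply == human_reply:
            confidence = min(1.0, confidence + 0.5)
        else:
            confidence = 0.5
        self.learned[query] = (human_reply, confidence)
        return human_reply, "human"

    def ai_share(self) -> float:
        """Fraction of all requests answered without a human."""
        return self.ai_answers / self.total if self.total else 0.0
```

Calling `handle("buy flowers", "What's your budget?")` repeatedly on one instance returns the human's reply for the first two calls and the stored reply from the third call onward, so `ai_share()` rises as more interactions are observed.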
"There is some percentage of responses that is coming straight from the AI, and we will increase that percentage over time."
In this context, the learning noted in the latest FAIR image recognition tests - a tenfold reduction in the training data required - highlights that the system is getting significantly better, and improving at an impressive rate.
To further underline the capacity of its AI systems to interpret natural language queries and match them against image content, Facebook has released a video of its Memory Networks tool, which can answer questions about what's in an image.
Visual Question and Answering Demo
Earlier this year, we showed some of our work on natural language understanding - specifically, a system called Memory Networks (MemNets) that can read and then answer questions about short texts. In this demo of a new system we call VQA, or visual Q&A, MemNets are combined with our image recognition technology, making it possible for people to ask the machine what's in a photo. (Posted by Facebook Engineering on Tuesday, November 3, 2015)
The immediate application of this capacity would be to help visually impaired people understand what's in an image, opening up a whole new world of communicative possibilities.
Facebook has also shared detail on their advancements in AI predictive learning and understanding video content - in this next video, Facebook's AI system has been asked to predict what will happen to each stack of blocks, with a percentage value placed on the likelihood of each one toppling over.
Unsupervised Learning: Predicting Falling Blocks
Unsupervised or predictive learning is the ability to understand what will happen in the future by learning from observation. To try to give computers this ability, Facebook's artificial intelligence research team has developed a system that can "watch" a series of visual tests - in this case, sets of precariously stacked blocks that may or may not fall - and predict the outcome. After just a few months' work, the system can now predict correctly 90 percent of the time, which is better than most humans. (Posted by Facebook Engineering on Tuesday, November 3, 2015)
That may not seem overly impressive, but the system has learned the likely outcome through observation, by developing its own understanding of what's happening in the video.
"After just a few months' work the system can now predict correctly 90 percent of the time, which is better than most humans."
Developing the Future
Artificial intelligence is a term that sparks a range of reactions. And while there are some concerns, most see it as a natural progression, a necessary advancement to help us understand and develop better systems.
"Intelligence is what makes us uniquely human. If we want to understand ourselves, we have to understand what intelligence is all about"
In essence, Facebook's already utilizing a form of AI - the Facebook News Feed algorithm 'learns' about your interests and behaviors based on your on-platform activities, then seeks to serve you similar content based on patterns from other user actions. That, at core, is what machine learning is all about, but the next level is understanding how we develop our thoughts, and how we learn, in order to provide more accurate, more responsive data matching and content.
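The idea of serving content "based on patterns from other user actions" resembles a technique commonly called collaborative filtering. The sketch below is a minimal Python illustration of that general technique, not Facebook's News Feed algorithm: the function names, the use of Jaccard similarity, and the sample data are all assumptions made for the example.

```python
def jaccard(a: set, b: set) -> float:
    """Overlap between two sets of interests, from 0.0 (none) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user: str, likes: dict, top_n: int = 3) -> list:
    """Suggest topics liked by users whose interests overlap with `user`'s."""
    mine = likes[user]
    scores = {}
    for other, theirs in likes.items():
        if other == user:
            continue
        similarity = jaccard(mine, theirs)
        if similarity == 0:
            continue  # no shared interests, so this user's likes carry no signal
        for topic in theirs - mine:
            # Topics from more similar users accumulate a higher score.
            scores[topic] = scores.get(topic, 0.0) + similarity
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]


# Hypothetical sample data: each user's set of liked topics.
likes = {
    "ana": {"cooking", "travel"},
    "ben": {"cooking", "travel", "hiking"},
    "cy": {"gaming", "travel"},
}
suggestions = recommend("ana", likes)
```

Here "ben" shares two of three topics with "ana" while "cy" shares only one, so topics that "ben" likes rank above those that "cy" likes. A production system would draw on far richer signals than a set of liked topics, but the underlying pattern matching is similar in spirit.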
These latest findings show that Facebook's AI team is making significant advances on this front - and while all of these little steps may seem minor in isolation, they all add to the wider development of artificial neural networks. If a neural system can learn by itself, and develop accurate responses based on inputs, it would have the capacity to accelerate the development of new technologies and understandings at a much faster rate than a human mind.
In immediate and practical terms, such capacity would enable Facebook to create more intelligent systems to help find more relevant content and improve its search capacity by utilizing more contextual inputs beyond text matching alone (search by image is something we're very likely to see in the near future). That, in itself, is significant, but as highlighted in this video, the broader possibilities of such findings reach far beyond those familiar roots.