MOUNTAIN VIEW – Scientists from Google and Stanford University, working independently, have created artificial intelligence software capable of recognizing and describing the content of photographs and videos with far greater accuracy than previous systems. The software's descriptions of pictures were similar to those written by humans.
The new software, described on Monday by researchers at Google and at Stanford University, teaches itself to identify entire scenes: a group of young men playing Frisbee, for example, or a herd of elephants marching on a grassy plain.
The innovation could make it easier to search for images on Google, help visually impaired people understand image content and provide alternative text for images when Internet connections are slow.
The machine-learning software developed by Google used two neural networks: one that deals with image recognition, the other with natural language processing. A neural network is a computational model that mimics some of the architecture of the brain. Such systems consist of interconnected artificial neurons that can take in information from a variety of sources and are capable of learning.
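The two-network idea described above can be sketched in a few lines of Python. To be clear, everything here is a toy stand-in invented for illustration, not Google's actual system: `encode_image` crudely mimics an image-recognition network by reducing pixels to a short feature vector, and `decode_caption` mimics a language network with a greedy word-by-word decoder. The vocabulary, word vectors, and state-update rule are all hypothetical; a real system learns these from millions of training examples.

```python
def encode_image(pixels):
    """Stand-in for the image-recognition network: collapse a 2-D pixel
    grid into a small fixed-length feature vector (mean, max, variance)."""
    flat = [p for row in pixels for p in row]
    n = len(flat)
    mean = sum(flat) / n
    var = sum((p - mean) ** 2 for p in flat) / n
    return [mean, max(flat), var]

# Hypothetical word embeddings; in a real system these are learned.
WORD_VECTORS = {
    "a": [1.0, 0.1, 0.1],
    "herd": [0.1, 1.0, 0.1],
    "of": [0.5, 0.5, 0.5],
    "elephants": [0.1, 0.1, 1.0],
    "<end>": [0.0, 0.0, 0.0],
}

def decode_caption(features, word_vectors, max_len=6):
    """Stand-in for the language network: greedily emit the word whose
    vector best matches the current state, then fold that word back in."""
    state = list(features)
    caption = []
    for _ in range(max_len):
        word = max(word_vectors,
                   key=lambda w: sum(a * b for a, b in zip(word_vectors[w], state)))
        if word == "<end>":
            break
        caption.append(word)
        # Toy recurrence: subtracting the emitted word's vector discourages
        # repeating it, loosely mimicking a recurrent decoder's state update.
        state = [s - v for s, v in zip(state, word_vectors[word])]
    return " ".join(caption)

caption = decode_caption(encode_image([[0.2, 0.8], [0.5, 0.5]]), WORD_VECTORS)
print(caption)
```

The essential shape matches the article's description: one component turns an image into numbers, a second turns those numbers into a sequence of words, and the two are connected only through the feature vector passed between them.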
The neural network developed by Google was the work of four scientists – Oriol Vinyals, Alexander Toshev, Samy Bengio and Dumitru Erhan.
“A picture may be worth a thousand words, but sometimes it’s the words that are the most useful – so it’s important we figure out ways to translate from images to words automatically and accurately,” they wrote on the Google Research Blog.
Two years ago, Google researchers created image-recognition software and showed it 10 million images taken from YouTube videos. After three days, the programme had taught itself to pick out pictures of cats.
Sources: BBC, Times of India, MIS Asia