The race to exascale isn’t the only rivalry stirring up the advanced computing space. Artificial intelligence sub-fields, like deep learning, are also inspiring heated competition from tech conglomerates around the globe.
When it comes to image recognition, computers have already passed the threshold of average human competency, leaving tech titans, like Baidu, Google and Microsoft, vying to outdo each other.
The latest player to up the stakes is Chinese search company Baidu. Using the ImageNet object classification benchmark in tandem with Baidu’s purpose-built Minwa supercomputer, the search giant achieved an image identification error rate of just 4.58 percent, besting humans, Microsoft and Google in the process.
An updated paper [PDF] from a team of Baidu engineers describes the latest accomplishment of Baidu’s image recognition system, Deep Image, which consists of “a custom-built supercomputer dedicated to deep learning [Minwa], a highly optimized parallel algorithm using new strategies for data partitioning and communication, larger deep neural network models, novel data augmentation approaches, and usage of multi-scale high-resolution images.”
The ImageNet classification challenge has become the de facto standard benchmark for large-scale object classification. Software is trained on 1.5 million images spanning a predefined set of 1,000 categories. Several tests are used to evaluate performance; one of these, “top-5” error, measures how often the correct label fails to appear among the software’s five highest-ranked guesses.
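The top-5 metric described above is straightforward to compute. The sketch below is illustrative, not Baidu’s evaluation code: given a matrix of per-class scores and the true labels, it counts how often the correct class misses the five highest-scoring guesses.

```python
import numpy as np

def top5_error(scores, labels):
    """Fraction of samples whose true label is NOT among the
    five highest-scoring predictions.

    scores -- array of shape (n_samples, n_classes)
    labels -- array of shape (n_samples,), true class indices
    """
    # Indices of the 5 largest scores per row (order within the 5 is irrelevant)
    top5 = np.argsort(scores, axis=1)[:, -5:]
    hits = np.any(top5 == labels[:, None], axis=1)
    return 1.0 - hits.mean()

# Toy check with 6 classes: sample 0's true class (5) scores highest,
# sample 1's true class (0) scores lowest and falls outside the top 5.
scores = np.array([[0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
                   [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]])
labels = np.array([5, 0])
print(top5_error(scores, labels))  # 0.5: one hit, one miss
```

A 4.58 percent top-5 error thus means the correct object class was absent from all five of the system’s best guesses on fewer than one image in twenty.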
“Our system has achieved the best result to date, with a top-5 error rate of 4.58% and exceeding the human recognition performance, a relative 31% improvement over the ILSVRC 2014 winner,” state the report’s authors.
The Baidu researchers add that this is significantly better than the latest results from both Google, which reported a 4.82 percent error rate, and Microsoft, which days prior had declared victory over the average human error rate (of 5.1 percent) when it achieved a 4.94 percent score. Both companies were also competing in the ImageNet Large Scale Visual Recognition Challenge.
The Beijing-based Minwa supercomputer that achieved the low error rate has 36 server nodes, each with two six-core Intel Xeon E5-2620 processors and four NVIDIA Tesla K40m GPUs, connected by an FDR InfiniBand (56Gb/s) fabric that supports RDMA. Each GPU comes with 12GB of memory and offers 4.29 teraflops of peak single-precision floating point performance. With GPUDirect RDMA, the InfiniBand interface can access GPU memory without involvement from the CPU. The system runs Linux with CUDA 6.0 and the MVAPICH2 MPI library, which supports GPUDirect RDMA. Figure 1 of the paper shows the system architecture.
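The figures above are enough for a back-of-the-envelope estimate of Minwa’s aggregate peak throughput, which is the basis for the Top500 comparison Wu makes below. This is simple arithmetic on the quoted specs, not an official rating:

```python
# Aggregate peak single-precision throughput for Minwa,
# using the numbers quoted in the article.
nodes = 36
gpus_per_node = 4          # Tesla K40m per node
tflops_per_gpu = 4.29      # peak single precision per K40m

total_tflops = nodes * gpus_per_node * tflops_per_gpu
print(f"{total_tflops:.2f} TFLOPS peak single precision")  # 617.76 TFLOPS
```

Note this counts GPU single-precision peak only; Top500 rankings are based on measured double-precision Linpack performance, so the "top 300" comparison is loose.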
“Our company is now leading the race in computer intelligence,” said Baidu scientist Ren Wu, speaking at the Embedded Vision Summit on Tuesday, as reported by MIT Technology Review. It’s enough computational power to rank the machine within the top 300 fastest computers in the world if it weren’t relegated solely to deep learning, Wu added, boasting: “I think this is the fastest supercomputer dedicated to deep learning. We have great power in our hands — much greater than our competition.”
An even larger machine is in the works, according to reporting by the WSJ, one that can pump out about 7 petaflops — but since the workload is deep learning, these aren’t the double-precision variety.
Not surprisingly, Facebook is also pushing the envelope of deep learning, but the social media company isn’t talking up its stats. In the WSJ piece, computer vision pioneer and Facebook Director of AI Research Yann LeCun took a shot at the ImageNet test, suggesting it’s on its way out as a benchmark. “People are focusing on much larger data sets and more challenging tasks that involve object recognition, such as object detection and localization,” he said.