The 180 TFLOPS Cloud TPU card. (credit: Google)
Google has developed a 45-teraflops chip for machine learning and artificial intelligence, its second-generation tensor processing unit (TPU), and the company is bringing it to the cloud. Google says the custom chips are 15-30 times faster and 30-80 times more power efficient than CPUs and GPUs on these workloads, and it has already been using them for its AlphaGo Go-playing computer and its search results.
Starting today, Google will be offering the TPUs to users of Google Compute Engine. The chips are arranged into modules of four, for 180 TFLOPS per card. Sixty-four of the cards can be linked into what Google calls a pod, with 11.5 petaflops in total; one petaflops is 10¹⁵ floating point operations per second.
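The arithmetic behind those headline figures checks out, as a quick back-of-the-envelope calculation shows (all numbers are from the article; the variable names are just for illustration):

```python
# Per-chip and per-pod throughput as described in the article.
TFLOPS_PER_CHIP = 45
CHIPS_PER_CARD = 4
CARDS_PER_POD = 64

tflops_per_card = TFLOPS_PER_CHIP * CHIPS_PER_CARD    # 4 x 45 = 180 TFLOPS
pod_pflops = tflops_per_card * CARDS_PER_POD / 1000   # 11.52 PFLOPS, quoted as 11.5

print(tflops_per_card, pod_pflops)
```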
Typically in machine learning workloads there is a division between the initial training and model building, and the subsequent pattern matching against the model. The former workload is the one that is most heavily dependent on massive compute power, and it's this that has generally been done on GPUs. The company's first-generation TPUs were used for the second part—making inferences based on the model, to recognize images, language, or whatever. The new TPUs are optimized for both workloads, allowing the same chips to be used for both training and making inferences.
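The training-versus-inference split described above can be sketched in a few lines. This is not TPU code, just a hypothetical NumPy toy that fits a one-parameter model; the compute-heavy part is the iterative gradient loop (training), while inference is a single cheap forward pass with the fitted weight:

```python
import numpy as np

# Toy data for the hypothetical task of learning y = 3x.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 256)
y = 3.0 * x

# Training: repeated gradient-descent steps over the data set.
# This loop is the expensive phase that has typically run on GPUs.
w = 0.0
lr = 0.5
for _ in range(100):
    grad = np.mean(2.0 * (w * x - y) * x)  # gradient of mean squared error
    w -= lr * grad

# Inference: one forward pass with the already-trained weight,
# the phase the first-generation TPUs were built for.
def infer(x_new):
    return w * x_new

print(round(w, 3))
```

The point of the sketch is that the two phases place very different demands on hardware, which is why a chip that handles both is notable.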
