Google claims[0] the TPU is many times faster for the workloads they've designed it for.
> On our production AI workloads that utilize neural network inference, the TPU is 15x to 30x faster than contemporary GPUs and CPUs.
As far as I know, this will be the public's first opportunity to test those claims, since until now TPUs haven't been available on GCP. I don't mean to sound skeptical; I'm quite confident they're not exaggerating.
Keep in mind that what you linked refers to TPUv1, which was built for quantized 8-bit inference. TPUv2, announced in this blog post, targets general-purpose training and uses 32-bit weights, activations, and gradients, so it will have very different performance characteristics.
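For anyone unfamiliar with the distinction, here's a minimal sketch of what 8-bit inference implies, assuming simple symmetric per-tensor linear quantization (my own illustration, not Google's actual TPU pipeline):

```python
import numpy as np

def quantize_int8(w):
    """Map float32 weights into int8 with a per-tensor scale factor."""
    scale = np.max(np.abs(w)) / 127.0              # largest magnitude maps to 127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float32 values from the int8 representation."""
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_int8(w)
# Quantization error is bounded by about scale/2 per element.
print(np.max(np.abs(w - dequantize(q, scale))))
```

The point is that inference can tolerate this precision loss, while training gradients generally can't, which is why TPUv2 had to move to 32-bit arithmetic.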
The "Reserve TPU" button has been available on the dashboard for the last few months, but I assume instances have been prioritized for large customers such as Two Sigma.
From the paper, "In-Datacenter Performance Analysis of a Tensor Processing Unit":
> Despite low utilization for some applications, the TPU is on average about 15X - 30X faster than its contemporary GPU or CPU, with TOPS/Watt about 30X - 80X higher. Moreover, using the GPU's GDDR5 memory in the TPU would triple achieved TOPS and raise TOPS/Watt to nearly 70X the GPU and 200X the CPU.
It will be interesting to see benchmarks that compare TPUs to the V100, since all previously published comparisons from Google pit the TPU against the K80 (three GPU architectures ago).
They're closely related, though: the higher the perf per watt, the fewer dollars per unit of performance Google needs to charge. The price they charge you is ultimately tied to their operating cost.
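A back-of-the-envelope sketch of that relationship, using made-up numbers (none of these figures come from Google):

```python
# Hypothetical accelerator and electricity figures, for illustration only.
tops = 90.0          # sustained throughput, tera-ops/second
watts = 75.0         # board power draw
usd_per_kwh = 0.06   # datacenter electricity price

tops_per_watt = tops / watts
energy_cost_per_hour = (watts / 1000.0) * usd_per_kwh   # kW * $/kWh
cost_per_tops_hour = energy_cost_per_hour / tops
print(f"{tops_per_watt:.1f} TOPS/W, ${cost_per_tops_hour:.6f} per TOPS-hour")

# Doubling perf/watt at fixed power halves the energy cost per unit of work,
# which (ignoring capex) is headroom to charge less per unit of performance.
```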
[0]: https://cloudplatform.googleblog.com/2017/04/quantifying-the...