NVIDIA Volta Tensor Core GPU Achieves
New AI Performance Milestones
GPU-POWERED DEEP LEARNING IS TRANSFORMING
EVERY INDUSTRY, SOLVING CHALLENGES ONCE
THE IDEAL AI COMPUTING PLATFORM NEEDS TO
PROVIDE IMPROVED PERFORMANCE, SCALABILITY
AND PROGRAMMABILITY TO ADDRESS THE
DIVERSITY OF MODEL ARCHITECTURES.
NVIDIA’S VOLTA TENSOR CORE GPU ACHIEVED
RECORD-SHATTERING RESNET-50 PERFORMANCE
FOR A SINGLE CHIP, SINGLE NODE, AND SINGLE CLOUD INSTANCE.
FASTEST SINGLE CHIP
A single V100 Tensor Core GPU achieves
1,075 images/second when training
ResNet-50, a 4X performance increase
compared to the previous generation
“New figures from NVIDIA illustrate the contribution hardware improvements can make to progress in machine learning: the AlexNet model that won
ImageNet in 2012 took six days to train, can now be done in 18 minutes — a 500x speedup.” - Tom Simonite, WIRED
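
The speedup in the quote above can be checked with simple arithmetic. The following is a minimal sketch, assuming "six days" means exactly 6 x 24 hours; the function name `speedup` is hypothetical and is not taken from the source.

```python
def speedup(old_minutes: float, new_minutes: float) -> float:
    """Return how many times faster the new run is than the old one."""
    return old_minutes / new_minutes


# Assumption: "six days" is taken as exactly 6 * 24 * 60 minutes.
alexnet_2012_minutes = 6 * 24 * 60  # 8,640 minutes
alexnet_now_minutes = 18            # figure quoted in the text

alexnet_speedup = speedup(alexnet_2012_minutes, alexnet_now_minutes)
print(f"AlexNet training speedup: {alexnet_speedup:.0f}x")
```

Under that assumption the ratio works out to 480x, which the quote rounds to 500x.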
FASTEST SINGLE NODE
A single DGX-1 server powered by eight
Tensor Core V100s achieves
7,850 images/second, almost 2X the 4,200
images/second from a year ago on the
“I feel like it’s important to note that these performance improvements [by NVIDIA] are more important than they immediately appear, because while
these gains dramatically impact today’s workloads, they’re effectively preempting even more complex workloads of the future.”
- Rob Williams, TechGage
FASTEST SINGLE CLOUD INSTANCE
A single AWS P3 cloud instance powered
by eight Tensor Core V100 GPUs can train
ResNet-50 in less than three hours, 3X
faster than a TPU instance.
“4 #TPU chips in a ‘Cloud TPU’ deliver 180 teraFLOPS of performance; by comparison, four V100 chips deliver 500 teraFLOPS. #NVIDIAwins.”
- Karl Freund, Moor Insights
NVIDIA TENSOR CORE GPU ARCHITECTURE ALLOWS US TO SIMULTANEOUSLY PROVIDE GREATER
PERFORMANCE THAN SINGLE-FUNCTION ASICS, YET BE PROGRAMMABLE FOR DIVERSE WORKLOADS.
EACH TESLA V100 TENSOR CORE GPU DELIVERS 125 TERAFLOPS OF PERFORMANCE FOR DEEP
LEARNING COMPARED TO 45 TERAFLOPS BY A GOOGLE TPU CHIP.
4 TPU CHIPS IN A ‘CLOUD TPU’ V2 DELIVER 180 TERAFLOPS OF PERFORMANCE.
BY COMPARISON, 4 NVIDIA V100 CHIPS DELIVER 500 TERAFLOPS OF PERFORMANCE.
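
The four-chip totals above follow from the per-chip figures by multiplication. The sketch below reproduces that arithmetic, assuming peak throughput adds linearly across chips (real workloads may scale less than perfectly); the constant and function names are hypothetical and chosen only for illustration.

```python
# Per-chip peak deep learning throughput quoted in the text, in teraFLOPS.
V100_TFLOPS = 125
TPU_V2_CHIP_TFLOPS = 45
CHIPS = 4


def aggregate_tflops(per_chip: float, chips: int) -> float:
    """Peak throughput of `chips` devices, assuming perfect linear scaling."""
    return per_chip * chips


v100_total = aggregate_tflops(V100_TFLOPS, CHIPS)        # four V100 GPUs
tpu_total = aggregate_tflops(TPU_V2_CHIP_TFLOPS, CHIPS)  # one 'Cloud TPU' v2
ratio = v100_total / tpu_total

print(f"4 x V100: {v100_total} teraFLOPS")
print(f"4 x TPU chips: {tpu_total} teraFLOPS")
print(f"Peak ratio: {ratio:.2f}x")
```

These are peak figures; the ratio of the two totals is the same as the per-chip ratio of 125 to 45, roughly 2.8x.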