
Whitepaper


NVIDIA Tesla P100
The Most Advanced Datacenter Accelerator Ever Built
Featuring Pascal GP100, the World's Fastest GPU




NVIDIA Tesla P100 WP-08019-001_v01 | 1
GP100 Pascal Whitepaper




Table of Contents
Introduction
Tesla P100: Revolutionary Performance and Features for GPU Computing
Extreme Performance for High Performance Computing and Deep Learning
NVLink: Extraordinary Bandwidth for Multi-GPU and GPU-to-CPU Connectivity
HBM2 High-Speed GPU Memory Architecture
Simplified Programming for Developers with Unified Memory and Compute Preemption
GP100 GPU Hardware Architecture In-Depth
Exceptional Performance and Power Efficiency
Pascal Streaming Multiprocessor
Designed for High-Performance Double Precision
Support for FP16 Arithmetic Speeds Up Deep Learning
Better Atomics
L1/L2 Cache Changes in GP100
GPUDirect Enhancements
Compute Capability
Tesla P100: World's First GPU with HBM2
Memory Resilience
Tesla P100 Design
NVLink High Speed Interconnect
NVLink Configurations
GPU-to-GPU NVLink Connectivity
CPU-to-GPU NVLink Connectivity
NVLink Interface to the Tesla P100
Unified Memory
Unified Memory History
Pascal GP100 Unified Memory
Benefits of Unified Memory
Compute Preemption
NVIDIA DGX-1 Deep Learning Supercomputer
250 Servers in a Box
12X DNN Speedup in One Year
DGX-1 Software Features
NVIDIA DGX-1 System Specifications
Conclusion
Appendix A: NVLink Signaling and Protocol Technology
NVLink Controller Layers
Physical Layer (PL)
Data Link Layer (DL)
Transaction Layer





Appendix B: Accelerating Deep Learning and Artificial Intelligence with GPUs
Deep Learning in a Nutshell
NVIDIA GPUs: The Engine of Deep Learning
Tesla P100: The Fastest Accelerator for Training Deep Neural Networks
Comprehensive Deep Learning Software Development Kit
Big Data Problem Solving with NVIDIA GPUs and DNNs
Self-driving Cars
Robots
Healthcare and Life Sciences




Introduction



Nearly a decade ago, NVIDIA