Why Amazon EC2 P5 Instances?
Amazon Elastic Compute Cloud (Amazon EC2) P5 instances, powered by NVIDIA H100 Tensor Core GPUs, and P5e and P5en instances, powered by NVIDIA H200 Tensor Core GPUs, deliver the highest performance in Amazon EC2 for deep learning (DL) and high performance computing (HPC) applications. They help you accelerate your time to solution by up to 4x compared to previous-generation GPU-based EC2 instances, and reduce the cost to train ML models by up to 40%. These instances help you iterate on your solutions at a faster pace and get to market more quickly. You can use P5, P5e, and P5en instances for training and deploying increasingly complex large language models (LLMs) and diffusion models powering the most demanding generative artificial intelligence (AI) applications. These applications include question answering, code generation, video and image generation, and speech recognition. You can also use these instances to deploy demanding HPC applications at scale for pharmaceutical discovery, seismic analysis, weather forecasting, and financial modeling.
To deliver these performance improvements and cost savings, P5 and P5e instances complement NVIDIA H100 and H200 Tensor Core GPUs with 2x higher CPU performance, 2x higher system memory, and 4x higher local storage compared to previous-generation GPU-based instances. P5en instances pair NVIDIA H200 Tensor Core GPUs with high-performance Intel Sapphire Rapids CPUs, enabling Gen5 PCIe between the CPU and GPUs. P5en instances provide up to 2x the bandwidth between CPU and GPU and lower network latency compared to P5e and P5 instances, thereby improving distributed training performance. P5 and P5e instances provide up to 3,200 Gbps of networking using second-generation Elastic Fabric Adapter (EFA). P5en instances, which use third-generation EFA with Nitro v5, show up to a 35% improvement in latency compared to P5 instances, which use the previous generation of EFA and Nitro. This helps improve collective communications performance for distributed training workloads such as deep learning, generative AI, real-time data processing, and HPC applications. To deliver large-scale compute at low latency, these instances are deployed in Amazon EC2 UltraClusters that enable scaling up to 20,000 H100 or H200 GPUs interconnected with a petabit-scale nonblocking network. P5, P5e, and P5en instances in EC2 UltraClusters can deliver up to 20 exaflops of aggregate compute capability, performance equivalent to a supercomputer.
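If you want to experiment with these instances programmatically, the sketch below shows one way to launch a single p5.48xlarge into a cluster placement group with an EFA-enabled network interface using boto3. The AMI ID, subnet, security group, and key pair are placeholders to replace with values from your own account; production deployments typically obtain capacity through EC2 Capacity Reservations or Capacity Blocks for ML and attach multiple EFA interfaces, which this minimal sketch omits.

```python
# Minimal sketch: launch one p5.48xlarge with an EFA network interface in a
# cluster placement group. All resource IDs below are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# A cluster placement group keeps instances physically close for
# low-latency, high-throughput node-to-node communication.
ec2.create_placement_group(GroupName="p5-cluster", Strategy="cluster")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # e.g. a Deep Learning AMI (placeholder)
    InstanceType="p5.48xlarge",
    MinCount=1,
    MaxCount=1,
    KeyName="my-key-pair",             # placeholder key pair
    Placement={"GroupName": "p5-cluster"},
    NetworkInterfaces=[
        {
            "DeviceIndex": 0,
            "InterfaceType": "efa",    # attach an Elastic Fabric Adapter
            "SubnetId": "subnet-0123456789abcdef0",  # placeholder
            "Groups": ["sg-0123456789abcdef0"],      # placeholder
            "DeleteOnTermination": True,
        }
    ],
)
print(response["Instances"][0]["InstanceId"])
```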
Customer testimonials
Here are some examples of how customers and partners have achieved their business goals with Amazon EC2 P5 instances.
- Anthropic builds reliable, interpretable, and steerable AI systems that will have many opportunities to create value commercially and for public benefit.
- Cohere, a leading pioneer in language AI, empowers every developer and enterprise to build incredible products with world-leading natural language processing (NLP) technology while keeping their data private and secure.
- Hugging Face is on a mission to democratize good ML.
Product details
Instance Size | vCPUs | Instance Memory (TiB) | GPU | GPU Memory | Network Bandwidth | GPUDirect RDMA | GPU Peer to Peer | Instance Storage (TB) | EBS Bandwidth (Gbps) |
---|---|---|---|---|---|---|---|---|---|
p5.48xlarge | 192 | 2 | 8 x H100 | 640 GB HBM3 | 3,200 Gbps EFA | Yes | 900 GB/s NVSwitch | 8 x 3.84 NVMe SSD | 80 |
p5e.48xlarge | 192 | 2 | 8 x H200 | 1,128 GB HBM3e | 3,200 Gbps EFA | Yes | 900 GB/s NVSwitch | 8 x 3.84 NVMe SSD | 80 |
p5en.48xlarge | 192 | 2 | 8 x H200 | 1,128 GB HBM3e | 3,200 Gbps EFA | Yes | 900 GB/s NVSwitch | 8 x 3.84 NVMe SSD | 100 |
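The same specifications can also be retrieved programmatically. A small sketch, assuming boto3 and valid credentials, that queries the EC2 DescribeInstanceTypes API for the three sizes above (availability of each size varies by Region, so a given Region may not return all three):

```python
# Sketch: look up P5-family specifications via the DescribeInstanceTypes API.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

resp = ec2.describe_instance_types(
    InstanceTypes=["p5.48xlarge", "p5e.48xlarge", "p5en.48xlarge"]
)
for it in resp["InstanceTypes"]:
    gpu = it["GpuInfo"]["Gpus"][0]
    print(
        it["InstanceType"],
        it["VCpuInfo"]["DefaultVCpus"], "vCPUs,",
        gpu["Count"], gpu["Manufacturer"], gpu["Name"], "GPUs,",
        it["GpuInfo"]["TotalGpuMemoryInMiB"], "MiB GPU memory,",
        it["NetworkInfo"]["NetworkPerformance"],
    )
```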
Getting started with ML use cases
Getting started with HPC use cases
P5, P5e, and P5en instances are an ideal platform for running engineering simulations, computational finance, seismic analysis, molecular modeling, genomics, rendering, and other GPU-based HPC workloads. HPC applications often require high network performance, fast storage, large amounts of memory, high compute capabilities, or all of the above. All three instance types support EFA, which enables HPC applications using the Message Passing Interface (MPI) to scale to thousands of GPUs. AWS Batch and AWS ParallelCluster help HPC developers quickly build and scale distributed HPC applications.
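As one concrete, hypothetical example of the AWS Batch path, the sketch below submits a multi-node parallel job, the Batch job type typically used for MPI workloads, to a job queue backed by P5 instances. The queue and job definition names are placeholders; creating them (compute environment, EFA-enabled launch template, container image with the MPI stack) is outside the scope of this snippet.

```python
# Sketch: submit a multi-node parallel (MPI-style) job to an AWS Batch job
# queue backed by P5 instances. "p5-hpc-queue" and "mpi-simulation:1" are
# placeholder names for resources you would create beforehand.
import boto3

batch = boto3.client("batch", region_name="us-east-1")

response = batch.submit_job(
    jobName="seismic-analysis-run-001",
    jobQueue="p5-hpc-queue",           # placeholder job queue
    jobDefinition="mpi-simulation:1",  # placeholder multi-node job definition
    nodeOverrides={
        "numNodes": 4,                 # run the simulation across 4 nodes
    },
)
print("Submitted job:", response["jobId"])
```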
Learn more