Amazon EC2 G7e Instances with NVIDIA RTX Pro 6000 Blackwell GPUs
Today, we’re announcing the general availability of Amazon Elastic Compute Cloud (Amazon EC2) G7e instances that deliver cost-effective performance for generative AI inference workloads and the highest performance for graphics workloads.
G7e instances are accelerated by the NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs and are well suited for a broad range of GPU-enabled workloads including spatial computing and scientific computing workloads. G7e instances deliver up to 2.3 times the inference performance of G6e instances.
Improvements compared to the previous generation:
- NVIDIA RTX PRO 6000 Blackwell GPUs – NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs offer two times the GPU memory and 1.85 times the GPU memory bandwidth compared to G6e instances. By using the higher GPU memory offered by G7e instances, you can run medium-sized models of up to 70B parameters with FP8 precision on a single GPU.
- NVIDIA GPUDirect P2P – For models that are too large to fit into the memory of a single GPU, you can split the model or computations across multiple GPUs. G7e instances reduce the latency of your multi-GPU workloads with support for NVIDIA GPUDirect P2P, which enables direct communication between GPUs over the PCIe interconnect. These instances offer the lowest peer-to-peer latency for GPUs on the same PCIe switch. Additionally, G7e instances offer up to four times the inter-GPU bandwidth compared to the L40S GPUs featured in G6e instances, boosting the performance of multi-GPU workloads. These improvements mean you can run inference for larger models across multiple GPUs, with up to 768 GB of GPU memory in a single node.
- Networking – G7e instances offer four times the networking bandwidth compared to G6e instances, which means you can use the instance for small-scale multi-node workloads. Additionally, multi-GPU G7e instances support NVIDIA GPUDirect Remote Direct Memory Access (RDMA) over Elastic Fabric Adapter (EFA) for low-latency communication between nodes.
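The claim above that a 70B-parameter model at FP8 precision fits on a single GPU follows from simple arithmetic: weight memory is roughly parameter count times bytes per parameter. A minimal sketch of that estimate (it deliberately ignores KV cache, activations, and framework overhead, which reduce the usable headroom; the 96 GB per-GPU figure comes from the instance specifications):

```python
def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Estimate raw model weight memory in GB (ignores KV cache and activations)."""
    return n_params * bytes_per_param / 1e9

# 70B parameters at FP8 (1 byte each) against a single 96 GB GPU
needed = weight_memory_gb(70e9, 1.0)  # 70.0 GB
print(f"{needed:.0f} GB of weights; fits on one 96 GB GPU: {needed < 96}")

# The same model at FP16 (2 bytes each) would need ~140 GB and require multiple GPUs
print(f"{weight_memory_gb(70e9, 2.0):.0f} GB at FP16")
```

At FP16 the same model overflows a single GPU, which is where the GPUDirect P2P multi-GPU path described above becomes relevant.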
| Instance size | GPUs | GPU memory (GB) | vCPUs | Memory (GiB) | Storage (TB NVMe SSD) | EBS bandwidth (Gbps) | Network bandwidth (Gbps) |
|---|---|---|---|---|---|---|---|
| g7e.4xlarge | 1 | 96 | 16 | 128 | 1.9 x 1 | 8 | 50 |
| g7e.8xlarge | 1 | 96 | 32 | 256 | 1.9 x 1 | 16 | 100 |
| g7e.12xlarge | 2 | 192 | 48 | 512 | 3.8 x 1 | | |

What are Amazon EC2 G7e Instances?
Amazon EC2 G7e instances are the latest generation of GPU instances designed for machine learning, graphics, and high-performance computing workloads from Amazon Web Services (AWS).
These instances feature NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs, offering significant performance improvements over previous generations. They are optimized for demanding tasks like large language models (LLMs), generative AI, and scientific simulations. G7e instances are available in various sizes to accommodate different workload requirements, and they integrate with the broader AWS ecosystem.
According to the AWS News Blog, G7e instances deliver up to 2.3 times the inference performance of previous-generation G6e instances.
Key Features of G7e Instances
G7e instances provide a combination of powerful hardware and software features to accelerate demanding workloads.
- NVIDIA RTX PRO 6000 Blackwell GPUs: Each instance is equipped with NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs, offering considerable compute power.
- AMD EPYC Processors: They utilize 3rd Gen AMD EPYC processors, providing a strong CPU foundation.
- GDDR7 Memory with ECC: The GPUs' GDDR7 memory delivers fast data access for GPU-intensive applications.
- Elastic Fabric Adapter (EFA): EFA enables low-latency, high-throughput networking for distributed training.
- AWS Nitro System: The AWS Nitro System provides virtualization and security features.
The AWS documentation details that G7e instances offer up to 8 NVIDIA RTX PRO 6000 Blackwell GPUs per instance, with up to 768 GB of total GPU memory.
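To actually route distributed-training traffic over EFA, NCCL-based frameworks typically rely on the aws-ofi-nccl plugin plus a few environment variables. A hedged sketch of a common setup, assuming the plugin and EFA drivers are installed (variable names are taken from AWS's EFA documentation; verify them against your AMI, since defaults vary):

```shell
# Select the EFA libfabric provider so NCCL traffic uses EFA (via aws-ofi-nccl)
export FI_PROVIDER=efa
# Enable GPUDirect RDMA between GPU memory and the EFA device where supported
export FI_EFA_USE_DEVICE_RDMA=1
# Log NCCL initialization so you can confirm EFA was actually selected
export NCCL_DEBUG=INFO
```

Checking the NCCL init logs for the EFA provider is the usual way to confirm traffic is not silently falling back to TCP.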
Use Cases for G7e Instances
G7e instances are well-suited for a variety of applications requiring significant GPU acceleration.
- Large Language Models (LLMs): Training and inference of LLMs like those used in chatbots and content generation.
- Generative AI: Creating images, videos, and other content using generative AI models.
- High-Performance Computing (HPC): Running complex simulations in fields like weather forecasting, financial modeling, and drug discovery.
- Machine Learning Training: Accelerating the training of various machine learning models.
- Deep Learning Inference: Deploying and scaling deep learning models for real-time predictions.
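For the multi-GPU inference cases above, the simplest way to think about splitting a model that exceeds one GPU's memory is assigning contiguous blocks of layers to each device (pipeline-style). A framework-agnostic toy sketch of that partitioning logic; real deployments would use an inference runtime's own parallelism support rather than hand-rolling this:

```python
def partition_layers(n_layers: int, n_gpus: int) -> list[range]:
    """Assign contiguous layer ranges to GPUs, spreading any remainder
    over the first few devices so sizes differ by at most one."""
    base, extra = divmod(n_layers, n_gpus)
    ranges, start = [], 0
    for gpu in range(n_gpus):
        size = base + (1 if gpu < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges

# 80 transformer layers split across an 8-GPU instance: 10 layers per GPU
print(partition_layers(80, 8))
```

Each GPU then only needs to hold its own slice of the weights, which is how a node's aggregate GPU memory (up to 768 GB here) becomes usable for a single model.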
How to Access G7e Instances
You can access Amazon EC2 G7e instances through the AWS Management Console, AWS Command Line Interface (CLI), or AWS SDKs.
For a managed experience, you can use G7e instances with Amazon Elastic Container Service (Amazon ECS) or Amazon Elastic Kubernetes Service (Amazon EKS) to simplify deployment and management. AWS Marketplace also offers pre-configured AMIs with popular machine learning frameworks.
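Launching from the SDK comes down to passing a `g7e.*` instance type to the EC2 `RunInstances` API. A minimal parameter sketch, with the AMI ID as a placeholder (the boto3 call itself is shown in a comment rather than executed, since it requires AWS credentials and charges apply):

```python
# Parameters for EC2 RunInstances; the ImageId below is a placeholder, not a real AMI.
params = {
    "ImageId": "ami-0123456789abcdef0",  # e.g. a Deep Learning AMI ID for your Region
    "InstanceType": "g7e.4xlarge",       # single-GPU size from the table above
    "MinCount": 1,
    "MaxCount": 1,
}

# With credentials configured, the actual launch would be:
#   import boto3
#   boto3.client("ec2").run_instances(**params)
print(params["InstanceType"])
```

The same parameter shape works through the AWS CLI (`aws ec2 run-instances`) with the keys as `--image-id`, `--instance-type`, and `--count` flags.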
As of January 22, 2026, on-demand pricing for G7e instances starts at $32.77 per hour for the g7e
