Revolutionizing AI: NVIDIA Blackwell Smashes Records in MLPerf Inference Test Debut, Redefining Generative AI Capabilities

September 2, 2024 | Catherine Williams, Chief Editor | Tech

Unlocking the Power of Generative AI: NVIDIA’s Leading Performance in MLPerf Inference

As enterprises rapidly adopt generative AI and bring new services to market, the demand for robust data center infrastructure has never been greater. While training giant language models is a significant challenge in itself, serving LLM-based services in real time poses another hurdle altogether.

In the latest round of the MLPerf industry benchmark, Inference v4.1, NVIDIA platforms demonstrated leading performance across all data center tests. The upcoming NVIDIA Blackwell platform, featuring the second-generation Transformer Engine and FP4 Tensor Cores, made its debut: on Llama 2 70B, MLPerf’s largest LLM workload, Blackwell delivered up to 4x the performance of the NVIDIA H100 Tensor Core GPU.
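For readers unfamiliar with 4-bit precision, the toy NumPy sketch below quantizes a weight matrix to 16 levels and measures the reconstruction error. It is a simplified, integer-style simulation meant only to illustrate the memory/precision trade-off, not the actual FP4 (E2M1) format or Transformer Engine used by Blackwell’s Tensor Cores.

```python
import numpy as np

# Toy illustration of 4-bit weight quantization (16 levels).
# NOT the FP4 hardware format; it only shows the trade-off between
# memory footprint and numerical precision.

rng = np.random.default_rng(0)
w = rng.standard_normal((4096, 4096)).astype(np.float32)  # stand-in FP32 weights

levels = 2 ** 4                                   # 4 bits -> 16 representable codes
scale = np.abs(w).max() / (levels / 2 - 1)        # map weights into the code range

q = np.clip(np.round(w / scale), -(levels // 2), levels // 2 - 1)  # 4-bit codes
w_hat = q * scale                                 # dequantized approximation

print("memory vs FP32:", 4 / 32)                  # 8x smaller
print("mean abs error:", np.mean(np.abs(w - w_hat)))
```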

The NVIDIA H200 Tensor Core GPU achieved outstanding results across every data center benchmark, including the Mixtral 8x7B MoE LLM, which activates 12.9 billion parameters per token out of 46.7 billion total. MoE models are gaining popularity because they bring more versatility to LLM deployments, answering a wider variety of questions and performing a broader range of tasks in a single deployment.
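The reason an MoE model such as Mixtral activates only a fraction of its parameters per token is its router, which sends each token to a small subset of experts. The minimal sketch below shows top-2 routing as a general technique; it is an illustration only, not Mixtral’s actual implementation.

```python
import numpy as np

# Minimal top-2 mixture-of-experts routing (illustrative only).
# Each token is processed by just 2 of the 8 experts, so only a
# fraction of the total parameters is used per token.

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 64, 8, 2

experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]
gate = rng.standard_normal((d_model, n_experts))          # router weights

def moe_forward(x):
    logits = x @ gate                                      # per-expert scores
    idx = np.argsort(logits)[-top_k:]                      # indices of the top-2 experts
    w = np.exp(logits[idx]) / np.exp(logits[idx]).sum()    # softmax over the chosen 2
    return sum(wi * (x @ experts[i]) for wi, i in zip(w, idx))

token = rng.standard_normal(d_model)
print(moe_forward(token).shape)                            # (64,) -- used only 2/8 experts
```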

The continued growth of LLMs demands more compute to handle the large volume of inference requests. Multi-GPU computing is essential for serving as many users as possible while meeting the tight real-time latency requirements of state-of-the-art LLMs. NVIDIA NVLink and NVSwitch, combined with the NVIDIA Hopper architecture, provide high-bandwidth communication between GPUs, which pays off in real-time, cost-effective, large-scale model inference.
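The article does not name a serving stack, but as one hedged example, the open-source vLLM library (an assumption here, not mentioned above) exposes multi-GPU tensor parallelism through a single parameter; the inter-GPU traffic this creates is exactly what NVLink and NVSwitch are designed to carry.

```python
# Hypothetical multi-GPU serving sketch using the open-source vLLM library
# (not mentioned in the article). tensor_parallel_size=8 shards the model
# across 8 GPUs; the resulting inter-GPU traffic rides NVLink/NVSwitch.
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Llama-2-70b-chat-hf",  # assumed model id for illustration
    tensor_parallel_size=8,                  # split the weights over 8 GPUs
)

params = SamplingParams(max_tokens=128, temperature=0.7)
outputs = llm.generate(["Summarize MLPerf Inference in one sentence."], params)
print(outputs[0].outputs[0].text)
```

Tensor parallelism splits each weight matrix across GPUs, so every generated token requires collective communication between them, which is why interconnect bandwidth directly affects real-time latency.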

In addition to NVIDIA, ten NVIDIA partners submitted MLPerf inference results, highlighting the broad availability of NVIDIA’s platforms. These partners include ASUSTeK, Cisco, Dell Technologies, Fujitsu, Giga Computing, Hewlett Packard Enterprise (HPE), Juniper Networks, Lenovo, Quanta Cloud Technology, and Supermicro.

Continuous Software Innovation

NVIDIA platforms gain performance and features every month through ongoing software development. The NVIDIA Hopper architecture, NVIDIA Jetson platform, and NVIDIA Triton Inference Server all showed dramatic performance improvements in the latest round of inference tests.

NVIDIA H200 GPUs delivered up to 27 percent higher AI inference performance than in the previous round, highlighting the added value customers gain over time from their investment in the NVIDIA platform. NVIDIA Triton Inference Server, a fully featured open-source inference server included with NVIDIA AI Enterprise software, helps consolidate framework-specific inference servers into a single, unified platform.
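As a concrete, hypothetical illustration of that "single, unified platform" idea, the sketch below sends one request to a running Triton server through its Python HTTP client. The model name, tensor names, and shapes are made up for the example; a real deployment defines them in the served model’s configuration.

```python
# Minimal Triton Inference Server client sketch. The server address,
# model name ("my_llm"), and tensor names/shapes are hypothetical.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

tokens = np.array([[101, 2023, 2003, 1037, 3231, 102]], dtype=np.int64)
inp = httpclient.InferInput("input_ids", list(tokens.shape), "INT64")
inp.set_data_from_numpy(tokens)

result = client.infer(model_name="my_llm", inputs=[inp])
print(result.as_numpy("logits").shape)   # output tensor name is also hypothetical
```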

Going to the Edge

Generative AI models deployed at the edge can transform sensor data such as images and video into actionable, real-time insights with strong contextual awareness. The NVIDIA Jetson platform for edge AI and robotics has the performance to run all of these model types locally, including LLMs, vision transformers, and Stable Diffusion.

In this round of MLPerf benchmarks, the NVIDIA Jetson AGX Orin system-on-module delivered more than a 6.2x throughput improvement and a 2.4x latency improvement over the previous round on the GPT-J LLM workload. This general-purpose 6-billion-parameter model can seamlessly interface with human language, revolutionizing generative AI at the edge.
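Because GPT-J is an openly available checkpoint, the benchmark workload can be approximated in a few lines of the Hugging Face transformers API. The sketch below is a generic example for illustration only, not the optimized software stack behind the MLPerf submission.

```python
# Minimal GPT-J 6B text generation with Hugging Face transformers.
# Generic illustration only; not the optimized path used for the
# MLPerf Jetson AGX Orin submission.
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6B")

prompt = "Generative AI at the edge can"
ids = tok(prompt, return_tensors="pt")
out = model.generate(**ids, max_new_tokens=40, do_sample=False)
print(tok.decode(out[0], skip_special_tokens=True))
```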

Proven Performance Leadership Across All Sectors

This MLPerf Inference round demonstrates the versatility and leading performance of the NVIDIA platform, from the data center to the edge, across all benchmark workloads, supercharging the most innovative AI-powered applications and services. H200 GPU-based systems are available today from the first cloud service provider to announce general availability, as well as from server manufacturers ASUS, Dell Technologies, HPE, QCT, and Supermicro.
