
Launch of NVIDIA's Rubin Architecture
aitech.pt
NVIDIA Launches Powerful New Rubin Chip Architecture

Advances in AI continue to drive demand for more capable hardware. NVIDIA, a leader in graphics and AI technology, has unveiled its latest platform: the Rubin chip architecture. NVIDIA positions Rubin as a major step forward in AI supercomputing, with higher performance and efficiency for large-scale AI applications.

Overview of the Rubin Architecture
The NVIDIA Rubin chip architecture is a rack-scale platform for AI workloads, built from six custom chips that are designed to operate together as a single AI supercomputer. NVIDIA claims up to a 10x reduction in cost per token compared with its predecessor, Blackwell. If that figure holds in practice, it would substantially lower the cost of AI model training and inference.
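To make the "up to 10x" cost-per-token claim concrete, the following is a minimal sketch of the arithmetic. The baseline price and the budget are hypothetical placeholders chosen for illustration, not NVIDIA figures, and the 10x factor is the best case quoted above rather than a measured result.

```python
# Illustrative only: the baseline price below is a made-up number.
BASELINE_USD_PER_MILLION_TOKENS = 2.00  # hypothetical Blackwell-era cost
CLAIMED_MAX_REDUCTION = 10.0            # "up to 10x" from the article (best case)


def reduced_cost(baseline_usd: float, reduction_factor: float) -> float:
    """Return the cost per million tokens after an N-fold reduction."""
    if reduction_factor <= 0:
        raise ValueError("reduction_factor must be positive")
    return baseline_usd / reduction_factor


def tokens_for_budget(budget_usd: float, usd_per_million_tokens: float) -> float:
    """Return how many tokens a budget buys at a given price per million tokens."""
    return budget_usd / usd_per_million_tokens * 1_000_000


rubin_usd_per_million = reduced_cost(
    BASELINE_USD_PER_MILLION_TOKENS, CLAIMED_MAX_REDUCTION
)
baseline_tokens = tokens_for_budget(100.0, BASELINE_USD_PER_MILLION_TOKENS)
rubin_tokens = tokens_for_budget(100.0, rubin_usd_per_million)

print(f"Baseline: ${BASELINE_USD_PER_MILLION_TOKENS:.2f} per million tokens")
print(f"Best case: ${rubin_usd_per_million:.2f} per million tokens")
print(f"Tokens per $100: {baseline_tokens:,.0f} -> {rubin_tokens:,.0f}")
```

Because the claim is an upper bound, passing a smaller `reduction_factor` (for example 3.0) shows how sensitive the savings are to the factor actually achieved on a given workload.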

Architecture Details

The architecture includes several cutting-edge components that work in tandem to optimize complex AI model processing:

- NVIDIA Rubin GPU: Features 224 Streaming Multiprocessors and sixth-generation Tensor Cores optimized for low-precision NVFP4 and FP8 execution, delivering up to 50 petaFLOPS of NVFP4 inference performance.
- NVIDIA Vera CPU: A custom 88-core processor with up to 1.2 TB/s of memory bandwidth, designed for data movement and agentic reasoning.
- NVIDIA NVLink 6 Switch: Provides up to 3.6 TB/s of bandwidth per GPU, for a total of 260 TB/s per rack.
- NVIDIA ConnectX-9 SuperNIC, BlueField-4 DPU, and Spectrum-6 Ethernet Switch: Form the networking layer, covering network interfaces, infrastructure offload, and Ethernet switching.

Performance Specifications
Each GPU within the Rubin architecture features advanced capabilities, including:
- Up to 288 GB of HBM4 memory
- Aggregate bandwidth of up to 22 TB/s
- More efficient decoding and processing of transformer-based workloads
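One way to read the per-GPU compute and bandwidth figures together is a simple roofline estimate. The sketch below assumes the 22 TB/s figure is HBM4 memory bandwidth and treats the 50 petaFLOPS NVFP4 number as a theoretical peak. Both are simplifications, and the function names are illustrative.

```python
PEAK_NVFP4_FLOPS = 50e15            # 50 petaFLOPS per GPU (from the article)
MEMORY_BANDWIDTH_BYTES_PER_S = 22e12  # 22 TB/s per GPU (assumed to be HBM4)


def ridge_point(peak_flops: float, bandwidth: float) -> float:
    """Arithmetic intensity (FLOPs per byte) where compute and memory limits meet."""
    return peak_flops / bandwidth


def attainable_flops(intensity: float, peak_flops: float, bandwidth: float) -> float:
    """Roofline bound: the lower of the compute roof and the memory roof."""
    return min(peak_flops, intensity * bandwidth)


ridge = ridge_point(PEAK_NVFP4_FLOPS, MEMORY_BANDWIDTH_BYTES_PER_S)
print(f"Ridge point: about {ridge:.0f} FLOPs per byte")

# Sweep a few arithmetic intensities to see which roof applies.
for intensity in (1.0, 100.0, 10_000.0):
    bound = attainable_flops(
        intensity, PEAK_NVFP4_FLOPS, MEMORY_BANDWIDTH_BYTES_PER_S
    )
    regime = "memory-bound" if intensity < ridge else "compute-bound"
    print(f"{intensity:>8.0f} FLOPs/byte -> {bound / 1e15:.3f} PFLOPS ({regime})")
```

Under these assumptions, a kernel needs roughly 2,270 FLOPs per byte moved to reach peak NVFP4 throughput. Workloads below that intensity are limited by memory bandwidth rather than by the Tensor Cores.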
The Vera Rubin NVL72 rack combines 72 Rubin GPUs with 36 Vera CPUs, delivering 3,600 petaFLOPS (3.6 exaFLOPS) of NVFP4 inference performance and 260 TB/s of NVLink bandwidth.
| Component | Specifications |
|---|---|
| NVIDIA Rubin GPU | 224 Streaming Multiprocessors, 50 petaFLOPS NVFP4 |
| NVIDIA Vera CPU | 88 cores, up to 1.2 TB/s memory bandwidth |
| Total NVLink bandwidth per rack | 260 TB/s |
| HBM4 memory per GPU | Up to 288 GB |
| Aggregate bandwidth per GPU | Up to 22 TB/s |
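The rack-level numbers can be cross-checked from the per-GPU figures. The sketch below multiplies out the values quoted in this article. The `RackSpec` class is a hypothetical helper written for this check, and the total HBM4 capacity is derived here rather than quoted from NVIDIA.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RackSpec:
    """Per-rack configuration built from the per-GPU figures in the article."""
    gpus: int
    cpus: int
    pflops_per_gpu: float       # NVFP4 inference petaFLOPS
    hbm_gb_per_gpu: int         # HBM4 capacity in GB
    nvlink_tbps_per_gpu: float  # NVLink bandwidth in TB/s

    def total_pflops(self) -> float:
        return self.gpus * self.pflops_per_gpu

    def total_hbm_gb(self) -> int:
        return self.gpus * self.hbm_gb_per_gpu

    def total_nvlink_tbps(self) -> float:
        return self.gpus * self.nvlink_tbps_per_gpu

    def gpus_per_cpu(self) -> float:
        return self.gpus / self.cpus


rack = RackSpec(
    gpus=72,
    cpus=36,
    pflops_per_gpu=50.0,
    hbm_gb_per_gpu=288,
    nvlink_tbps_per_gpu=3.6,
)

print(f"NVFP4 inference: {rack.total_pflops():,.0f} PFLOPS")
print(f"HBM4 capacity (derived): {rack.total_hbm_gb():,} GB")
print(f"NVLink bandwidth: {rack.total_nvlink_tbps():.1f} TB/s")
print(f"GPUs per CPU: {rack.gpus_per_cpu():.0f}")
```

The compute total matches the quoted 3,600 petaFLOPS exactly (72 × 50). The NVLink product is 259.2 TB/s (72 × 3.6), which suggests the quoted 260 TB/s is a rounded figure.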
Manufacturing and Timeline
The Rubin chips are produced by TSMC using a 3nm process, and the official launch is anticipated for the third quarter of 2026. An upgraded version, known as Rubin Ultra, promising performance of up to 100 petaFLOPS, is expected to roll out in 2027.
Advanced Features
The Rubin architecture introduces several innovative features:
- Rack-Scale Resilience: Equipped with a second-generation RAS engine allowing for proactive maintenance.
- Modular Cableless Tray Designs: Enable assembly up to 18 times faster than Blackwell.
- Rack-Scale Confidential Computing: Enables secure computation across CPU, GPU, and NVLink domains.
- Transformer Engine: Uses hardware-accelerated adaptive compression to boost NVFP4 performance while preserving accuracy.
“The Rubin platform marks a transformative leap in AI infrastructure, meeting the demands of future model processing.” — Jensen Huang, CEO of NVIDIA.
Conclusion
The introduction of the NVIDIA Rubin chip architecture represents a pivotal advancement in the AI landscape, propelling us towards what NVIDIA calls “AI factories.” With a robust architecture, custom chipsets, and extensive innovations in performance and efficiency, Rubin not only minimizes operational costs but also helps meet the escalating demand for large-scale model processing.
For more details on this groundbreaking innovation, visit NVIDIA’s official blog.