
Advancements in AI Data Centers by Meta and Oracle

Both Meta and Oracle are upgrading their AI data centers with NVIDIA's Spectrum-X Ethernet switches, a technology designed to meet the networking demands of large-scale AI systems. The two companies are adopting Spectrum-X as part of an open networking framework aimed at improving AI training efficiency and accelerating deployment across massive computing clusters.

Transforming Data Centers into Massive AI Factories

Jensen Huang, founder and CEO of NVIDIA, highlighted that models with trillions of parameters are transforming data centers into "massive AI factories." He added that Spectrum-X acts as the nervous system connecting millions of GPUs to train the largest models ever built.

Oracle plans to pair Spectrum-X Ethernet with the NVIDIA Vera Rubin architecture to build large-scale AI factories. Mahesh Thiagarajan, Executive Vice President of Oracle Cloud Infrastructure, confirmed that the new setup will enable the company to connect millions of GPUs more efficiently, helping customers train and deploy new AI models faster.

Expanding AI Infrastructure at Meta

Meta is expanding its AI infrastructure by integrating Spectrum-X Ethernet switches into the Facebook Open Switching System (FBOSS), its internal platform for managing network switches at scale. According to Gaya Nagarajan, Meta's Vice President of Network Engineering, the company's next-generation network must be open and efficient to support ever-larger AI models and serve billions of users.

Building Flexible AI Systems

According to Joe Delaire, who leads NVIDIA’s accelerated computing solutions for data centers, flexibility is crucial as data centers become more complex. He explained that NVIDIA’s MGX system offers a modular design that allows partners to integrate CPUs, GPUs, storage, and networking components as needed.

The system is also designed for compatibility across multiple generations of hardware, enabling organizations to carry the same design forward as components evolve. "It provides flexibility, speed to market, and future readiness," Delaire told the media.

Challenges and Opportunities in Energy Efficiency

As AI models grow larger, energy efficiency has become a central challenge for data centers. Delaire emphasized that NVIDIA is working “from the chip to the network” to improve energy use and scalability, collaborating closely with power and cooling suppliers to maximize performance per watt.

One example is the shift to 800-volt DC power delivery, which reduces heat loss and improves efficiency. The company also offers power-smoothing technology that reduces peaks on the electrical grid; this can lower maximum power needs by up to 30 percent, freeing capacity for more computing within the same footprint.
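As a back-of-envelope illustration of that last point (the 30 percent figure comes from the article; the rest is an assumed, simplified model): if power smoothing trims each system's peak draw by 30 percent, a facility provisioned against a fixed grid limit can in principle host roughly 1/0.7 ≈ 1.43 times as much compute in the same power envelope.

```python
# Back-of-envelope sketch: how a reduction in peak power draw translates
# into extra deployable compute under a fixed grid limit.
# Only the 30% figure comes from the article; the model is illustrative.

def capacity_gain(peak_reduction: float) -> float:
    """Multiplier on deployable compute when each unit's peak draw
    drops by `peak_reduction` (a fraction, e.g. 0.30 for 30%)."""
    return 1.0 / (1.0 - peak_reduction)

gain = capacity_gain(0.30)
print(f"Capacity multiplier: {gain:.2f}x")  # ~1.43x in the same power budget
```

This ignores real-world constraints such as cooling headroom and power distribution losses, so it is an upper-bound estimate rather than a deployment figure.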

Conclusion

NVIDIA's Spectrum-X technology represents a crucial step toward more efficient AI infrastructure at scale. It delivers high network performance for both training and inference, helping major companies like Meta and Oracle meet the increasing demands of AI. With ongoing advances in technology and partnerships, the future of AI data centers looks promising.