NVIDIA Partners with Microsoft to Build Large-Scale Cloud AI Computer

Santa Clara, California, USA: NVIDIA today announced a multi-year collaboration with Microsoft to build one of the most powerful AI supercomputers in the world. The system combines Microsoft Azure’s advanced supercomputing infrastructure with NVIDIA GPUs, networking and a full stack of AI software to help enterprises train, deploy and scale AI, including large, state-of-the-art models.


Azure’s cloud-based AI supercomputer includes powerful, scalable ND- and NC-series virtual machines optimized for distributed AI training and inference. It is the first public cloud to incorporate NVIDIA’s advanced AI stack, adding tens of thousands of NVIDIA A100 and H100 GPUs, NVIDIA Quantum-2 400Gb/s InfiniBand networking and the NVIDIA AI Enterprise software suite to its platform.

As part of the collaboration, NVIDIA will use Azure’s scalable virtual machine instances to research and further accelerate advances in generative AI. Generative AI is a rapidly emerging area of AI in which foundation models such as Megatron Turing NLG 530B serve as the basis for unsupervised, self-learning algorithms that create new text, code, digital images, video or audio.

The two companies will also collaborate to optimize Microsoft’s DeepSpeed deep learning optimization software. NVIDIA’s full stack of AI workflows and software development kits, optimized for Azure, will be made available to Azure enterprise customers.

Manuvir Das, vice president of enterprise computing at NVIDIA, said: “Advances in AI technology and industry adoption are both accelerating. Breakthroughs in foundation models have triggered a wave of research, fostered new startups and enabled new enterprise applications. Our collaboration with Microsoft will provide researchers and enterprises with state-of-the-art AI infrastructure and software, so that they can take full advantage of the transformative power of AI.”

Scott Guthrie, executive vice president of the Cloud + AI Group at Microsoft, said: “AI is fueling the next wave of automation across enterprises and industrial computing, helping organizations do more with less as they navigate economic uncertainty. Our collaboration with NVIDIA creates the world’s most scalable supercomputer platform, which delivers state-of-the-art AI capabilities for every enterprise on Microsoft Azure.”

Microsoft Azure’s AI-optimized virtual machine instances are built on NVIDIA’s most advanced data center GPUs and are the first public cloud instances to incorporate NVIDIA Quantum-2 400Gb/s InfiniBand networking. Customers can deploy thousands of GPUs in a single cluster to train the largest language models, build the most complex recommender systems and run generative AI at scale.

The current Azure instances feature NVIDIA Quantum 200Gb/s InfiniBand networking with NVIDIA A100 GPUs. Future instances will integrate NVIDIA Quantum-2 400Gb/s InfiniBand networking and NVIDIA H100 GPUs. Combined with Azure’s advanced compute cloud infrastructure, networking and storage, these AI-optimized offerings will provide scalable peak performance for AI training and deep learning inference workloads of any size.
