NVIDIA announced a multi-year collaboration with Microsoft to build one of the most powerful AI supercomputers in the world. Microsoft Azure’s advanced supercomputing infrastructure, combined with NVIDIA GPUs, networking and a full stack of AI software, will help enterprises train, deploy and scale AI, including large, state-of-the-art models.
Azure’s cloud-based AI supercomputer includes powerful and scalable ND- and NC-series virtual machines, which are optimized for distributed AI training and inference. Azure is the first public cloud to incorporate NVIDIA’s advanced AI stack, adding tens of thousands of NVIDIA A100 and H100 GPUs, NVIDIA Quantum-2 400Gb/s InfiniBand networking and the NVIDIA AI Enterprise software suite to its platform.
As part of the collaboration, NVIDIA will use Azure’s scalable virtual machine instances to research and further accelerate advances in generative AI. Generative AI is a rapidly emerging area of AI in which foundation models such as Megatron Turing NLG 530B serve as the basis for unsupervised, self-learning algorithms that create new text, code, digital images, video or audio.
The two companies will also collaborate to optimize Microsoft’s DeepSpeed deep learning optimization software. NVIDIA’s full stack of AI workflows and software development kits, optimized for Azure, will be made available to Azure enterprise customers.
Manuvir Das, vice president of enterprise computing at NVIDIA, said: “AI technology is advancing faster, and industry adoption is accelerating with it. Breakthroughs in foundation models have triggered a wave of research, fostered new startups and enabled new enterprise applications. We will work with Microsoft to provide researchers and enterprises with state-of-the-art AI infrastructure and software, so that they can take full advantage of the transformative power of AI.”
Scott Guthrie, executive vice president of the Cloud + AI Group at Microsoft, said: “AI is fueling the next wave of automation across enterprise and industrial computing, helping organizations do more with less in an uncertain economic environment. We are collaborating with NVIDIA to build the most scalable supercomputer platform in the world and to bring state-of-the-art AI capabilities to every enterprise on Microsoft Azure.”
Scalable peak performance with NVIDIA compute and Quantum-2 InfiniBand on Azure
Microsoft Azure’s AI-optimized virtual machine instances are built on NVIDIA’s most advanced data center GPUs and are the first public cloud instances to incorporate NVIDIA Quantum-2 400Gb/s InfiniBand networking. Customers can deploy thousands of GPUs in a single cluster to train the largest large language models, build the most complex recommender systems and run generative AI at scale.
The current Azure instances use NVIDIA Quantum 200Gb/s InfiniBand networking and NVIDIA A100 GPUs. Future instances will integrate NVIDIA Quantum-2 400Gb/s InfiniBand networking and NVIDIA H100 GPUs. Combined with Azure’s advanced compute cloud infrastructure, networking and storage, these AI-optimized offerings will provide scalable peak performance for AI training and deep learning inference workloads of any size.