
New Tools and Platforms To Accelerate GenAI

At the GPU Technology Conference (GTC) this week, NVIDIA made a slew of announcements highlighting how the company is making it easier than ever for developers to build and deploy generative AI applications at scale. New offerings include powerful computing platforms optimized for AI workloads, cloud services to access NVIDIA infrastructure and software, and microservices and APIs to streamline development.

“Generative AI is the defining technology of our time. Blackwell is the engine to power this new industrial revolution,” said Jensen Huang, founder and CEO of NVIDIA. “Working with the most dynamic companies in the world, we will realize the promise of AI for every industry.”

Blackwell GPU Architecture Powers Next-Generation AI Computing

Headlining the announcements was the new Blackwell GPU architecture, NVIDIA’s next-generation platform for accelerated computing and generative AI. Blackwell introduces several innovations to enable trillion-parameter AI models, including a unified 208 billion transistor GPU, a second-generation Transformer Engine, and the fifth generation of NVIDIA NVLink for high-speed interconnects between GPUs.

The Blackwell architecture delivers 2.5x the FP8 training performance of NVIDIA’s previous-generation Hopper GPUs. For inference and content generation, Blackwell provides up to 30x faster performance on large language models. This leap in performance will let developers create and run far more sophisticated AI models than before.

“Blackwell offers massive performance leaps, and will accelerate our ability to deliver leading-edge models,” said Sam Altman, CEO of OpenAI. “We’re excited to continue working with NVIDIA to enhance AI compute.”

DGX Supercomputer Delivers an Exaflop of AI Performance

To showcase Blackwell’s capabilities, NVIDIA announced its new DGX supercomputer powered by Blackwell GPUs. A single rack of the new DGX delivers an exaflop of AI performance, equivalent to the world’s top 5 supercomputers. With 576 Blackwell GPUs connected as one system via NVLink, NVIDIA touts it as an “AI factory” for generative AI.

NVIDIA NIM Microservices Streamline Deployment

To make Blackwell’s power accessible, NVIDIA announced dozens of NVIDIA NIM inference microservices. Built on the NVIDIA CUDA platform, these cloud-native microservices provide optimized inference with industry-standard APIs for more than two dozen popular AI models from NVIDIA and its partners.

NIM microservices ship with all necessary dependencies, such as CUDA, cuDNN, and TensorRT, to eliminate configuration hassles. Delivered as containers, they provide optimized AI inference through NVIDIA software such as Triton Inference Server.

Developers can easily deploy these microservices on any NVIDIA-accelerated computing platform, from cloud instances to on-premises servers to edge devices. Major cloud providers like AWS, Azure, and Google Cloud will offer NIM microservices, as will NVIDIA DGX Cloud and NVIDIA-Certified Systems from server vendors.  
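To give a concrete feel for what consuming one of these services could look like, here is a minimal sketch that queries a locally deployed NIM container through an OpenAI-compatible chat endpoint using the openai Python client. The base URL, port, API key, and model name are placeholders, not values from the announcement.

```python
# Minimal sketch: querying a locally deployed NIM inference microservice
# through an OpenAI-compatible REST endpoint. The URL and model name below
# are placeholders; substitute the values reported by your running container.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",          # assumed local NIM endpoint
    api_key="not-needed-for-local-deployments",   # placeholder credential
)

response = client.chat.completions.create(
    model="meta/llama3-8b-instruct",  # placeholder model identifier
    messages=[{"role": "user", "content": "Summarize the Blackwell architecture in two sentences."}],
    max_tokens=128,
)

print(response.choices[0].message.content)
```

The same request pattern would work against a cloud-hosted endpoint by swapping in that service’s base URL and credentials.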

“Created with our partner ecosystem, these containerized AI microservices are the building blocks for enterprises in every industry to become AI companies,” Huang explained. “Established enterprise platforms are sitting on a goldmine of data that can be transformed into generative AI copilots.”

Omniverse and CUDA-X Microservices Accelerate Development

Beyond compute and deployment services, NVIDIA announced new SDKs and APIs to accelerate AI development across industries. Omniverse Cloud APIs enable developers to integrate core Omniverse technologies into existing design and simulation applications. These APIs provide physically accurate 3D simulation and visualization capabilities for digital twins.
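Since Omniverse is built on OpenUSD, the scene data these APIs operate on can be pictured as a USD stage. The sketch below authors a trivial stage with the open-source pxr Python bindings; the file name and placeholder asset are invented for illustration, and the Omniverse Cloud API calls that would render or sync such a stage are not shown.

```python
# Minimal sketch: authoring an OpenUSD stage with the open-source pxr bindings.
# Omniverse tools consume scene data in this format; the Omniverse Cloud API
# calls themselves are omitted and depend on the specific service.
from pxr import Usd, UsdGeom, Gf

stage = Usd.Stage.CreateNew("factory_twin.usda")  # hypothetical file name
world = UsdGeom.Xform.Define(stage, "/World")

# A simple placeholder asset standing in for a piece of factory equipment.
machine = UsdGeom.Cube.Define(stage, "/World/Machine")
machine.GetSizeAttr().Set(2.0)
UsdGeom.XformCommonAPI(machine.GetPrim()).SetTranslate(Gf.Vec3d(0.0, 1.0, 0.0))

stage.SetDefaultPrim(world.GetPrim())
stage.GetRootLayer().Save()
```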

Industrial software giants like Ansys, Autodesk, Bentley, and Siemens are integrating Omniverse Cloud APIs into their product design and engineering platforms. Omniverse enables users of these tools to seamlessly collaborate on 3D models and apply generative AI to computer-aided engineering workflows.

In telecommunications, NVIDIA also introduced a 6G research cloud platform that pairs Omniverse-based simulation with AI tools for wireless research. “The future convergence of 6G and AI holds the promise of a transformative technological landscape,” said Charlie Zhang, SVP at Samsung Research America. “This will bring seamless connectivity and intelligent systems that will redefine our interactions with the digital world.”

CUDA-X microservices offer end-to-end building blocks for data preparation, training, and deployment for common AI workflows. These include NVIDIA Riva for customizable speech AI, cuOpt for routing optimization, Earth-2 APIs for global climate simulations, and NeMo Retriever services for knowledge retrieval and language understanding.
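As a purely illustrative sketch of the microservice pattern these building blocks share, the snippet below posts a query to a hypothetical retrieval endpoint over HTTP. The URL, route, parameters, and response shape are all invented for the example and do not document the actual NeMo Retriever API.

```python
# Illustrative only: the general request/response pattern for calling an AI
# microservice over HTTP. The endpoint URL and JSON fields below are invented
# placeholders, not a documented NeMo Retriever interface.
import requests

ENDPOINT = "http://localhost:9000/v1/retrieval"  # hypothetical service address

payload = {
    "query": "What maintenance is scheduled for assembly line 3?",
    "top_k": 5,  # hypothetical parameter: number of passages to return
}

resp = requests.post(ENDPOINT, json=payload, timeout=30)
resp.raise_for_status()

for passage in resp.json().get("passages", []):  # hypothetical response shape
    print(passage)
```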

SAP Partnership Brings Generative AI to Enterprises

NVIDIA is bringing generative AI capabilities to key industries like healthcare and life sciences through targeted microservice suites and partnerships. A standout collaboration is with enterprise software leader SAP. SAP and NVIDIA are working to integrate generative AI with SAP’s portfolio of enterprise applications and the SAP AI Core platform.  

Using NVIDIA’s AI foundations and NeMo customization tools, SAP will build generative AI assistants embedded across its product lines. This includes an AI copilot for its enterprise resource planning suite and AI-augmented capabilities in its SAP SuccessFactors HR software and SAP Signavio business process intelligence solutions.

“Strategic technology partnerships, like the one between SAP and NVIDIA, are at the core of our strategy to invest in technology that maximizes the potential and opportunity of AI for business,” said SAP CEO Christian Klein. “NVIDIA’s expertise in delivering AI capabilities at scale will help SAP accelerate the pace of transformation and better serve our customers in the cloud.”

NVIDIA AI Powers Next-Gen Robotics and Quantum Computing

In robotics, NVIDIA unveiled Project GR00T, a foundation model for teaching and training humanoid robots general skills. It leverages the new Jetson Thor robotics computer and updates to the Isaac robotics platform to create what Huang called “artificial general robotics.”

GR00T aims to enable robots to understand natural language and emulate human actions simply by observing examples. The model takes multimodal inputs spanning video, audio, and sensor data to learn tasks, then outputs motor actions for the robot to execute. Skills can be developed and validated in NVIDIA’s Isaac robotics simulation tools before being transferred to physical robots.
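NVIDIA has not published GR00T’s architecture, but the description above maps onto a familiar pattern: per-modality encoders feeding a shared trunk that emits motor commands. The PyTorch sketch below is a schematic illustration of that pattern only; every module name and dimension is an invented placeholder, not part of GR00T.

```python
# Schematic illustration only: a multimodal observation-to-action policy of the
# general shape described above. This is NOT NVIDIA's GR00T model; every
# dimension and module here is an invented placeholder.
import torch
import torch.nn as nn


class MultimodalPolicy(nn.Module):
    def __init__(self, video_dim=512, audio_dim=128, sensor_dim=64, action_dim=32):
        super().__init__()
        # Per-modality encoders project each input stream to a shared width.
        self.video_enc = nn.Linear(video_dim, 256)
        self.audio_enc = nn.Linear(audio_dim, 256)
        self.sensor_enc = nn.Linear(sensor_dim, 256)
        # A shared trunk fuses the modalities and emits motor commands.
        self.trunk = nn.Sequential(
            nn.Linear(3 * 256, 512), nn.ReLU(),
            nn.Linear(512, action_dim),
        )

    def forward(self, video_feat, audio_feat, sensor_feat):
        fused = torch.cat(
            [self.video_enc(video_feat), self.audio_enc(audio_feat), self.sensor_enc(sensor_feat)],
            dim=-1,
        )
        return self.trunk(fused)  # continuous motor command vector


policy = MultimodalPolicy()
actions = policy(torch.randn(1, 512), torch.randn(1, 128), torch.randn(1, 64))
print(actions.shape)  # torch.Size([1, 32])
```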

Finally, for quantum computing, NVIDIA debuted Quantum Cloud, a cloud service based on the open-source CUDA-Q platform to let researchers develop quantum algorithms and applications. It features powerful new capabilities developed with the quantum ecosystem, including a generative model for quantum machine learning and integrations with software from QC Ware and Classiq.
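CUDA-Q is already available as an open-source Python package, so a minimal kernel gives a sense of the programming model researchers would bring to Quantum Cloud. The sketch below defines and samples a two-qubit Bell state on the default simulator target; how a kernel is submitted to the hosted service is not shown here.

```python
# Minimal CUDA-Q sketch: define and sample a two-qubit Bell-state kernel on the
# default simulator target. Requires the open-source cudaq Python package.
import cudaq


@cudaq.kernel
def bell():
    qubits = cudaq.qvector(2)     # allocate two qubits
    h(qubits[0])                  # put the first qubit in superposition
    x.ctrl(qubits[0], qubits[1])  # entangle via a controlled-X gate
    mz(qubits)                    # measure both qubits


counts = cudaq.sample(bell, shots_count=1000)
print(counts)  # expect roughly equal counts of '00' and '11'
```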

“Quantum computing presents the next revolutionary frontier of computing and it’s going to require the world’s most brilliant minds to bring this future one step closer,” said Tim Costa, director of HPC and quantum computing at NVIDIA. “NVIDIA Quantum Cloud breaks down the barriers to explore this transformative technology.”

Comprehensive Platform Simplifies Generative AI Development

From chips to cloud services to AI microservices, NVIDIA’s GTC announcements showcase how the company is providing developers with an end-to-end platform to simplify and accelerate building state-of-the-art generative AI applications across industries. With these new tools, developers can focus on deploying transformative AI innovations faster than ever before.

