NVIDIA has announced a generative AI microservice that helps enterprises connect custom large language models to enterprise data, enabling their AI applications to deliver highly accurate responses.
NVIDIA NeMo Retriever, a new offering in the NVIDIA NeMo family (a framework and set of tools for building, customizing, and deploying generative AI models), helps organizations strengthen their generative AI applications with enterprise-grade retrieval-augmented generation (RAG) capabilities.
As a semantic retrieval microservice, NeMo Retriever uses NVIDIA-optimized algorithms to help generative AI applications deliver more accurate answers. Developers using the microservice can connect their AI applications to business data residing in various clouds and data centers. The service adds NVIDIA-optimized RAG capabilities to AI foundries and is available as part of the NVIDIA AI Enterprise software platform on AWS Marketplace.
Companies such as Cadence, Dropbox, SAP, and ServiceNow are among the first to work with NVIDIA to build production-ready RAG capabilities into their custom generative AI applications and services.
NVIDIA founder and CEO Jensen Huang said, “Generative AI applications with RAG capabilities are the next killer application for enterprises. With NVIDIA NeMo Retriever, developers can create custom generative AI chatbots, AI assistants, and summarization tools. They can access enterprise business data and improve productivity through accurate and valuable generative AI intelligence.”
Leading global companies use NeMo Retriever to improve the accuracy of large language models (LLMs)
Cadence, a leading electronic system design firm, serves enterprises in the hyperscale computing, 5G communications, automotive, mobile, aerospace, consumer, and healthcare markets. The company is working with NVIDIA to develop RAG capabilities for generative AI applications in industrial electronic design.
Cadence President and CEO Anirudh Devgan said, “Generative AI introduces innovative methods to meet customer needs, such as tools that can detect potential defects early in the design process. Our researchers are collaborating with NVIDIA to further improve the accuracy and relevance of generative AI applications using NeMo Retriever to identify issues and help customers bring high-quality products to the market faster.”
Unlike open-source RAG toolkits, NeMo Retriever supports production-ready generative AI with commercially viable models, API stability, security patches, and enterprise-grade support.
NVIDIA-optimized algorithms enable NeMo Retriever’s embedding models to deliver highly accurate results. Optimized embedding models capture the relationships between words, enabling an LLM to process and analyze text data.
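The core idea behind embedding-based retrieval can be illustrated with a minimal sketch: text is mapped to numeric vectors, and semantically similar texts end up with similar vectors, typically compared via cosine similarity. The vectors and dimensions below are toy values for illustration only, not NeMo Retriever's actual API or output:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings"; real embedding models produce
# hundreds or thousands of dimensions.
query_vec = [0.9, 0.1, 0.3]
doc_close = [0.8, 0.2, 0.4]   # semantically similar document
doc_far   = [0.1, 0.9, 0.0]   # unrelated document

print(cosine_similarity(query_vec, doc_close))  # high, near 1.0
print(cosine_similarity(query_vec, doc_far))    # noticeably lower
```

In a RAG pipeline, documents whose embeddings score highest against the query embedding are the ones handed to the LLM as context.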
Enterprises can use NeMo Retriever to connect their LLMs to multiple data sources and knowledge bases, so users can easily interact with data and obtain accurate, up-to-date answers through simple conversational prompts. Applications powered by Retriever let businesses securely access information in multiple data formats, such as text, PDFs, images, and video.
With NeMo Retriever, companies can achieve more accurate results with less training, accelerate time to market, and reduce the energy consumption of generative AI application development.
Reliable, simple, and secure deployment through NVIDIA AI Enterprise
Enterprises can deploy applications powered by NeMo Retriever for inference on NVIDIA-accelerated computing in virtually any data center or cloud. NVIDIA AI Enterprise supports accelerated, high-performance inference through NVIDIA Triton Inference Server, NVIDIA TensorRT, NVIDIA TensorRT-LLM, and other NVIDIA AI software.