IBM and Groq announced a strategic go-to-market and technology partnership that gives customers immediate access to Groq’s inference platform, GroqCloud, through watsonx Orchestrate, delivering high-speed AI inference at lower cost and accelerating the deployment of agentic AI.
As part of the collaboration, Groq and IBM plan to integrate and enhance Red Hat’s open-source vLLM technology with Groq’s LPU architecture. IBM also plans to make its Granite models available on GroqCloud for IBM customers.
As enterprises move AI agents from pilot to production, they still face challenges of speed, cost, and reliability, which are especially acute in critical sectors such as healthcare, finance, government, retail, and manufacturing. This collaboration pairs Groq’s inference speed, cost efficiency, and access to the latest open-source models with IBM’s agentic orchestration technology, providing the infrastructure enterprises need to achieve scale.
GroqCloud is powered by Groq’s custom LPU, which delivers inference more than five times faster than traditional GPU systems at lower cost. Even as workloads scale globally, it sustains low latency and reliable performance, which is particularly important for agentic AI in regulated industries.
For example, IBM’s healthcare clients can receive thousands of complex patient inquiries at once. With Groq, IBM’s AI agents can analyze information in real time and respond immediately with accurate answers, improving the patient experience and enabling organizations to make faster, better-informed decisions.
The technology is also being applied in non-regulated industries. IBM customers in retail and consumer packaged goods are using Groq to power HR agents, helping to automate human-resources processes and boost employee productivity.
Rob Thomas, Senior Vice President, IBM Software and Chief Commercial Officer, said: “Many large enterprises have plenty of options when experimenting with AI inference, but when they want to move into production, they must ensure complex workflows deploy successfully and deliver a high-quality experience. Our partnership with Groq demonstrates IBM’s commitment to bringing customers the most advanced technology to enable AI deployment and drive business value.”
Groq founder and CEO Jonathan Ross said: “With Groq’s speed and IBM’s enterprise expertise, we are making agentic AI a reality for businesses. Together we are enabling organizations to unleash the full potential of AI-driven responsiveness and achieve the performance required to scale. Beyond speed and resilience, this collaboration is about changing how businesses use AI, giving them the confidence to move from experimentation to enterprise-wide adoption and opening the door to new models in which AI can act instantly and learn continuously.”
IBM will begin providing access to GroqCloud capabilities immediately, and the companies’ joint team will focus on delivering the following to IBM customers: high-speed, high-performance inference that unlocks the full potential of AI models and agentic AI to power customer care, employee support, and productivity gains; security- and privacy-focused AI deployment designed to meet the strictest regulatory and security requirements while executing complex workflows effectively; and seamless integration with IBM’s watsonx Orchestrate, giving customers the flexibility to adopt specialized agent patterns tailored to different use cases.
The collaboration also plans to integrate and enhance Red Hat’s open-source vLLM technology with Groq’s LPU architecture, offering a different approach to common challenges developers face in AI inference. This is expected to let watsonx use these capabilities in a familiar way, keeping customers in their preferred tools while accelerating inference through GroqCloud. The integration will address key needs of AI developers, including inference orchestration, load balancing, and hardware acceleration, ultimately simplifying the inference workflow.
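What makes this kind of backend swap plausible is that both GroqCloud and vLLM servers expose OpenAI-compatible chat-completion endpoints. As a minimal sketch (the vLLM URL and model names are illustrative assumptions, and this is not a confirmed detail of the IBM integration), a client could target either backend by changing only the endpoint:

```python
import json
import urllib.request

# OpenAI-compatible chat-completions endpoints. The GroqCloud path follows
# Groq's public API; the vLLM URL assumes a locally hosted server. Neither
# reflects confirmed details of the IBM/Groq integration.
BACKENDS = {
    "groqcloud": "https://api.groq.com/openai/v1/chat/completions",
    "vllm": "http://localhost:8000/v1/chat/completions",
}

def build_request(backend: str, model: str, user_message: str,
                  api_key: str = "") -> urllib.request.Request:
    """Build (but do not send) a chat-completion request for a backend."""
    payload = {
        "model": model,  # model name is an illustrative assumption
        "messages": [{"role": "user", "content": user_message}],
        "temperature": 0.2,
    }
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        BACKENDS[backend],
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )

# The same payload shape works against either backend; only the URL differs.
req = build_request("groqcloud", "llama-3.1-8b-instant", "Summarize this claim.")
```

Because the request body is identical across backends, an orchestration layer such as watsonx Orchestrate could, in principle, route traffic between a vLLM deployment and GroqCloud without changing application code.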
Together, IBM and Groq offer enterprises an accelerated path to unlocking the full potential of enterprise-grade AI that is fast, intelligent, and built for real-world impact.