Scaling Generative AI with Confidence: LLM-d and OpenShift for Distributed Inference
As large language models grow in capability, they also grow in complexity—requiring GPU memory and compute beyond what most single systems can provide. For infrastructure and operations teams, this creates new challenges around deployment, scheduling, cost management, and reliability.
In this session, we’ll introduce LLM-d, an open, Kubernetes-native framework for distributed inference. You’ll learn how Red Hat is leading efforts across the community to shape LLM-d into a scalable, operator-friendly platform for production GenAI.
We’ll demonstrate how LLM-d integrates into OpenShift AI and supports multi-GPU workloads in practice.
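To give a feel for what this looks like from the client side, here is a minimal sketch of calling a model served through LLM-d. LLM-d builds on vLLM, which exposes an OpenAI-compatible HTTP API, so a deployed model is reachable like any other OpenShift service. The route URL and model name below are hypothetical placeholders, not part of the session materials; substitute the values from your own deployment.

```python
# Minimal client sketch for a model served by an LLM-d deployment on
# OpenShift, using the OpenAI-compatible API that vLLM-based servers expose.
# ENDPOINT and MODEL are hypothetical placeholders.
import requests

ENDPOINT = "https://llm-d-demo.apps.example.com/v1/chat/completions"  # hypothetical route
MODEL = "meta-llama/Llama-3.1-8B-Instruct"  # example model name

payload = {
    "model": MODEL,
    "messages": [
        {"role": "user", "content": "Summarize the benefits of distributed inference."}
    ],
    "max_tokens": 128,
}

# POST the chat completion request and print the generated reply.
resp = requests.post(ENDPOINT, json=payload, timeout=60)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```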
Event details
Date: Thursday, 11 September 2025
Time: 10:30 AM IST | 1 PM SGT | 3 PM AEST
Speakers