IBM TechXchange Community Group Library

3412 - KServe Deep Dive: Evolving Model Serving for the Generative AI Era 

Mon October 06, 2025 12:00 AM

As generative AI continues to transform the AI landscape, the demand for scalable, efficient, and interoperable model serving infrastructure is growing rapidly. This session explores the evolution from early, custom-built deployment patterns to today's Kubernetes-native model serving solutions. We'll unpack the unique challenges of serving large language models (LLMs), including inference efficiency, distributed execution, KV-cache management, and cost optimization. We're also excited to announce the release of KServe v0.17, a significant milestone that brings native support for generative AI workloads: a purpose-built LLMInferenceService CRD designed for advanced LLM-serving patterns such as disaggregated serving, enhanced model and KV caching, and seamless integration with Envoy AI Gateway.
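To illustrate the Kubernetes-native serving pattern the session builds on, here is a minimal sketch of a classic KServe InferenceService manifest. The service name and model reference are hypothetical, and the new LLMInferenceService CRD announced for v0.17 defines its own, LLM-specific schema that is not shown here.

```yaml
# Minimal sketch of KServe's Kubernetes-native serving pattern using the
# established InferenceService CRD. The metadata.name and storageUri are
# hypothetical; v0.17's LLMInferenceService CRD is a separate resource.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: llama-demo                # hypothetical service name
spec:
  predictor:
    model:
      modelFormat:
        name: huggingface         # serving runtime chosen by model format
      storageUri: "hf://meta-llama/Llama-3.1-8B-Instruct"  # hypothetical model
      resources:
        limits:
          nvidia.com/gpu: "1"     # one GPU for the predictor pod
```

Applied with `kubectl apply -f`, a manifest like this yields a managed inference endpoint behind the cluster's ingress; the LLM-focused patterns covered in the session, such as disaggregated serving and KV-cache management, extend this base.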

Session Topic: Open Source
Industry: Cross Industry
Speaker(s): Yuan Tang

Attachment(s)
3412 KServe Deep Dive Evolving Model Serving for the Gene....pdf (PDF, 3.60 MB)
Uploaded Fri October 24, 2025