As generative AI continues to transform the AI landscape, the demand for scalable, efficient, and interoperable model serving infrastructure is growing rapidly. This session explores the evolution from early, custom-built deployment patterns to today’s Kubernetes-native model serving solutions. We'll unpack the unique challenges of serving large language models (LLMs), including inference efficiency, distributed execution, KV-cache management, and cost optimization. We’re also excited to announce the release of KServe v0.17, a significant milestone that brings native support for generative AI workloads, including a purpose-built LLMInferenceService CRD designed for advanced LLM-serving patterns such as disaggregated serving, enhanced model and KV caching, and seamless integration with Envoy AI Gateway.
Session Topic: Open Source
Industry: Cross Industry
Speaker(s): Yuan Tang