IBM Storage Defender

Early threat detection and secure data recovery

Day3-KubeCon2025-North-America

By Tony Pearson posted 4 hours ago

  

Day 3 of KubeCon2025 started with keynote sessions, followed by breakout sessions. This is the final day, and it ends in the early afternoon for those who have to (or want to) travel home today.

Keynote: Scaling Smarter: Simplifying Multicluster AI with KAITO and KubeFleet
Jorge Palma, Microsoft Azure Kubernetes Service, presented KubeFleet, a new CNCF project, in combination with the Kubernetes AI Toolchain Operator (KAITO) that I discussed on Day 2. KubeFleet lets you distribute workloads across multiple Kubernetes clusters, which can be useful for AI inferencing.
Keynote: Predictive Scaling and Capacity Planning at Amazon.com
Chunpeng Wang and Artur Souza of Amazon talked about how Amazon (the retail e-commerce site) uses Amazon (Web Services) to run its business. There are two days that have peak sales. The first is Black Friday, the day after Thanksgiving, which has been the busiest shopping day in the United States since the 1980s.
The second is Amazon's Prime Day, a self-imposed peak typically held in July, which has a similar spike in computer traffic. One of the advantages of Prime Day is that it acts like a "dress rehearsal" for Black Friday, and helps them tune their agility, capacity planning and dynamic scaling in response to e-commerce activity.
Did you know other countries also have their busiest retail sales on Black Friday? For example, Brazil has adopted this tradition, even though they do not have a Thanksgiving Thursday to go with it. The speakers talked about how they had planned for the spike in traffic, but were also aware that sports games many Brazilians would be watching would alter the traffic patterns.
AI may be the lead singer, but you still need the band
Clyde Seepersad, Linux Foundation Education, described the common trend of management going after the latest "shiny new object". In this case, a lot of people consider AI, and GenAI in particular, as the shiny new object. However, Clyde warns, don't forget that we have a lot of Linux and Kubernetes infrastructure to work on, which might be needed to support new AI initiatives.
Feature Flags Suck! The problems and how to avoid them
Pete Hodgson, independent software delivery consultant, explained what "Feature Flags" are, and the problems that come with them. There are times when you need an if-then-else piece of code to turn a small feature on or off. To help keep track of these, some people have created Feature Flag databases. Now, some code has thousands of feature flags, and failing to manage them can cause disaster.
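To make the idea concrete, here is a minimal sketch of a feature flag and a tiny flag registry. This is not from Pete's talk: the `FeatureFlag` class, the flag names, the `owner` and `remove_by` fields, and the `stale_flags` helper are all hypothetical, chosen only to illustrate the if-then-else pattern and one way a flag database might help you find flags that have outlived their purpose.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FeatureFlag:
    name: str
    enabled: bool
    owner: str       # hypothetical field: who to ask before deleting the flag
    remove_by: date  # hypothetical field: when the flag should be cleaned up


# Hypothetical in-memory registry standing in for a feature flag database.
FLAGS = {
    "new_checkout": FeatureFlag("new_checkout", True, "team-payments", date(2025, 6, 30)),
    "beta_search": FeatureFlag("beta_search", False, "team-search", date(2026, 3, 31)),
}


def is_enabled(name: str) -> bool:
    """Return the flag's state; unknown flags default to off."""
    flag = FLAGS.get(name)
    return flag.enabled if flag else False


def stale_flags(today: date) -> list[str]:
    """List flags whose clean-up date has passed, sorted by name."""
    return sorted(f.name for f in FLAGS.values() if f.remove_by < today)


def checkout_page() -> str:
    # The if-then-else that a feature flag boils down to.
    if is_enabled("new_checkout"):
        return "new checkout flow"
    return "old checkout flow"
```

Calling `stale_flags(date.today())` on a regular schedule is one simple way to keep the list from growing into the thousands unnoticed.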
Evicted! All the ways Kubernetes kills your pods and how to avoid them
I have experienced this first-hand. My GenAI application container would get killed while it was waiting for a response from the IBM watsonx API. Why? I finally figured it out, and had to build a queue processor for long-running tasks. The speaker, Ahmet Alp Balkan, is a Senior Staff Software Engineer at LinkedIn, where they run Kubernetes on over 500,000 bare-metal nodes! Ahmet explained all the ways that evictions happen, and how this relates to the "Pod Disruption Budget" (PDB).
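For readers wondering what a queue processor for long-running tasks might look like, here is a minimal Python sketch. It is not my actual implementation: `slow_model_call` is a hypothetical stand-in for a slow GenAI API request, and the in-process `queue.Queue` is an assumption made to keep the example self-contained. The point is only the shape: the caller gets a job ID back immediately, and a background worker does the waiting.

```python
import queue
import threading
import time
import uuid

# Pending (job_id, prompt) pairs, and finished results keyed by job_id.
jobs: "queue.Queue[tuple[str, str]]" = queue.Queue()
results: dict[str, str] = {}


def slow_model_call(prompt: str) -> str:
    """Hypothetical stand-in for a slow GenAI API request."""
    time.sleep(0.1)
    return f"answer to: {prompt}"


def submit(prompt: str) -> str:
    """Enqueue the work and return a job ID right away."""
    job_id = uuid.uuid4().hex
    jobs.put((job_id, prompt))
    return job_id


def worker() -> None:
    """Drain the queue, one long-running call at a time."""
    while True:
        job_id, prompt = jobs.get()
        try:
            results[job_id] = slow_model_call(prompt)
        finally:
            jobs.task_done()


# Daemon thread so the worker does not block interpreter shutdown.
threading.Thread(target=worker, daemon=True).start()
```

A caller would run `job_id = submit("my prompt")`, then check `results` later (or call `jobs.join()` to wait for everything queued so far). In a real cluster the queue would more likely live outside the pod, so that queued work survives a pod being evicted.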
A few years ago, IBM sent an expert to help with a sales call to LinkedIn, but he came dressed in coat and tie. He was kicked out of the building. Then, they asked me to fly out to California to try again, and I wore an IBM-logo tee-shirt, jeans and sneakers. I was warmly received and the call went well. I could tell the LinkedIn employees enjoyed working there.
From Code to Cluster: Orchestrating 100,000+ Kubernetes Deployments with 1 Pipeline
Andrada Raducanu, DevOps Engineer at ING Hubs Romania, presented their success story using Red Hat OpenShift on Azure and on-premises. Their 1,400 in-house APIs reached 100K+ deployments in six months, using a single CI/CD pipeline.
How did they do this? They combined Red Hat Ansible, Open Policy Agent, Horizontal Pod Autoscaler (HPA), Helm rollback mechanisms, CertManager, and Prometheus monitoring, and developed their own QuotaAutoscaler. The advantage is that this is "agnostic" to the target system, so developers don't have to know whether something is deployed on-premises or with a cloud provider.

As predicted, the weather has warmed up, and we are now back in the 60s and 70s Fahrenheit. The sun is shining, and it looks to be nice the rest of the week!

#Kubecon
