How to Get Cloud Optimization Done Right, and Done Continuously

By Dina Henderson posted Tue May 31, 2022 12:52 PM

[Republished from Turbonomic.com, Authored by Rick Ochs]

The public cloud is at the heart of nearly every digital transformation. Organizations are modernizing their mission-critical applications and running them in the cloud to achieve elasticity and business value. Whether these applications are net new, re-architected, re-platformed, or simply re-hosted, achieving your business goals in the cloud requires a firm commitment to changing mindsets and culture within your IT and Engineering organizations.

But cloud optimization is not easy. Embracing the opportunity to use cloud as a living, breathing ecosystem that scales, grows, and shrinks can add tremendous value to your company’s business. While the public cloud providers advertise lower entry costs for agility, too often enterprises get caught in the trap of using the legacy fixed-provisioning models that we all grew up with in our careers. Over-provisioning resources for a 5-year fixed lifespan on hardware orders used to be one of the best tricks in the book to protect our applications and our product revenue streams. With the public cloud’s OpEx billing models, the risk of paying per minute for vastly under-utilized resources competes with the daily need to ensure our applications are fulfilling their business-value commitments. The not-so-shocking result: organizations over-allocate resources as a method of risk mitigation. After all, performance and great end-user experiences are paramount.

Optimizing your cloud estate is more than just scaling workloads, buying reservations, suspending workloads, and cleaning up unused objects. If we, as an industry, commit to using cloud to its true elastic potential, we can implement an ongoing model where automation and analytics continuously ensure our cloud environment is always performant and always free of waste. Optimization is not a one-time activity, but a new way of running your IT business, day in and day out. We no longer need to run workloads at 4x the size they need to be for a two-week holiday spike in demand. The cloud gives us the ability to size resources on the fly to achieve our business goals. It also provides the ability to serve your needs without waste: you can pay for what you need, not just what you (over) allocate.

Using automation and analytics is critical to bringing this elasticity to bear—not just to one application, but to your entire cloud portfolio: your VMs, your individual disks, your database instances, your Kubernetes clusters. Even though manually rightsizing a few VMs at a time can provide some short-term satisfaction for your business, lasting success in the cloud means utilizing the opportunity for a true software-defined approach: everything can be managed via automation, pipelines, approval processes, and hyper-intelligent software. 

We hope you’ll join us on this journey to achieving cloud elasticity with automation—and we’d like to show you how we do it: continuously, putting application performance and the end-user experience first, and without cloud waste.  

Cloud Compute Optimization  

Turbonomic’s cloud compute optimization capabilities automatically determine and scale to the correct VM instance type and family for your cloud application workloads. Compute families differ in processor models, features, capacity limits, quotas, support for network or storage technologies, and more. Knowing when to pick between those families can be extremely difficult, especially when it is critical to find the optimal combination of CPU and memory capacity while satisfying all of the other constraints on sizing a VM.

Figure 1. Scale Virtual Machine action.

IOPS and Throughput Aware Scaling 

To scale cloud compute most effectively for performance at minimal cost, you must account for the different IOPS and throughput limits across cloud virtual machine sizes and families. Some families have very different IOPS and throughput capacities that could limit scaling success or hurt performance, so understanding these limits along with their historical utilization is critical to lasting success with scaling workloads.

Figure 2. An IOPS- and throughput-aware scaling action.
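To make the constraint concrete, here is a minimal sketch of IOPS- and throughput-aware sizing: pick the cheapest instance whose CPU, memory, IOPS, and throughput limits all cover observed demand plus some headroom. This is an illustration only, not Turbonomic's algorithm. The catalog entries, prices, limits, and the `pick_instance` helper are hypothetical placeholders, not real cloud provider figures.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class InstanceType:
    name: str                  # hypothetical size name
    vcpus: int
    memory_gib: float
    max_iops: int              # storage IOPS ceiling for this size
    max_throughput_mbps: int   # storage throughput ceiling for this size
    hourly_cost: float


@dataclass(frozen=True)
class Demand:
    vcpus: float
    memory_gib: float
    iops: float
    throughput_mbps: float


def pick_instance(catalog: Sequence[InstanceType], demand: Demand,
                  headroom: float = 1.2) -> Optional[InstanceType]:
    """Return the cheapest instance that satisfies every limit, or None."""
    def fits(t: InstanceType) -> bool:
        return (t.vcpus >= demand.vcpus * headroom
                and t.memory_gib >= demand.memory_gib * headroom
                and t.max_iops >= demand.iops * headroom
                and t.max_throughput_mbps >= demand.throughput_mbps * headroom)

    candidates = [t for t in catalog if fits(t)]
    return min(candidates, key=lambda t: t.hourly_cost, default=None)


# Made-up catalog: "medium-a" is cheapest on CPU and memory alone,
# but its IOPS ceiling is too low for the workload below.
CATALOG = [
    InstanceType("small-a", 2, 8, 3000, 100, 0.10),
    InstanceType("medium-a", 4, 16, 6000, 200, 0.20),
    InstanceType("medium-io", 4, 16, 16000, 500, 0.26),
    InstanceType("large-a", 8, 32, 12000, 400, 0.40),
]

workload = Demand(vcpus=2.5, memory_gib=10, iops=9000, throughput_mbps=250)
choice = pick_instance(CATALOG, workload)
print(choice.name if choice else "no instance fits")
```

With these made-up numbers, a CPU- and memory-only decision would land on `medium-a` and starve the workload of IOPS; adding the storage limits to the check moves the choice to `medium-io`.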

Reservation and Discount-Aware Scaling 

Controlling cloud costs is top of mind for customers. It’s why so many turn to reserved instances or discounts from cloud providers for the promise of up to 72% savings. Reservations might look complex at first sight, but they are the easiest way to save in the cloud. Reservations can be bought at any stage of cloud adoption and are frequently used as first aid for growing costs. When buying reservations, customers face different financial models, multiple configurations, and pressing demands from the business side to have a strategic plan. However, managing reservations does not have to be a time-consuming manual process. Our platform can natively manage reservations from first purchase, through ongoing daily optimization, to re-purchase of soon-to-expire reservations. The platform analyzes the on-demand virtual machine estate, suspension schedule, real-time scaling, virtual machine size flexibility, reservation scope, and reservation inventory to give users best-in-class recommendations. Automating reservation management saves valuable time and results in significant cost optimization.

With automated scaling and purchase recommendations that are aware of reserved instances (RIs), Turbonomic helps in the following ways:

  1. Helps you purchase the first order of reservations 
  2. Optimizes the use of your current reservations 
  3. Helps you re-purchase soon-to-be-expired reservations 
  4. Continues reservation purchases for growing environments

Figure 3. A reservation-aware scaling action.



Figure 4. An RI purchasing action, increasing RI coverage to 98%!
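As a worked example of what a coverage figure means, the sketch below assumes the common definition of RI coverage: the share of running instance hours that are billed against a reservation rather than on demand. The `ri_coverage` helper and the hour counts are hypothetical and are not taken from the figure.

```python
def ri_coverage(reserved_hours: float, on_demand_hours: float) -> float:
    """Percentage of running instance hours covered by reservations."""
    total = reserved_hours + on_demand_hours
    if total == 0:
        return 0.0  # nothing ran, so there is nothing to cover
    return 100.0 * reserved_hours / total


# Made-up month: 980 instance hours matched a reservation, 20 ran on demand.
print(ri_coverage(reserved_hours=980, on_demand_hours=20))
```

Under this definition, coverage moves whenever the fleet is rescaled or suspended, not only when reservations are bought, which is why scaling and purchasing decisions need to be evaluated together.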

Cloud Storage Optimization 

Storage optimization is often overlooked, resulting in cloud bills that could have been avoided, as well as risks to application performance from IOPS and throughput starvation. Enterprise environments often have 3-4 disks per VM or more, making optimization at the storage layer of your ecosystem overwhelming to tackle. Turbonomic’s analytics find performance improvements by moving to storage tiers that fit the workload better, whether it is IOPS-intensive or throughput-intensive, while also reducing cost in the same action. Turbonomic evaluates throughput and IOPS demand to scale to the right storage solution for each individual disk:

  • Scale between storage tiers 
  • Scale within storage tiers 
  • Size up volumes for higher IOPS capacity 
  • Modify IOPS or throughput capacity on io1/io2/Azure Ultra, with no downtime 

Figure 5. An action to scale a volume that improves performance, reduces cost, and is non-disruptive. 🤯

Automatically Delete Unused Volumes 

As your enterprise cloud environment matures, VMs will be deleted when projects end or workloads are no longer needed. An unfortunate side effect of managing the cloud environment at the VM level is that the disks attached to these VMs are often not deleted when the VM is, and charges continue to accrue on these unused disks. Sensitive application or customer data might also reside on these unattached disks. Turbonomic detects these disks and can automate deleting them. You can also see the age of each unattached disk, as well as the last VM it was attached to, for compliance tracking purposes.
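The detection step can be sketched as a simple filter over a volume inventory: keep volumes with no attached VM that have been detached longer than a grace period, and report their age and last VM. This is a minimal illustration under assumed data. The `Volume` record, its field names, and the 30-day grace period are hypothetical, and a real inventory would come from the cloud provider's API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Volume:
    volume_id: str
    attached_vm: Optional[str]        # None when the volume is unattached
    last_attached_vm: Optional[str]   # kept for compliance tracking
    detached_at: Optional[datetime]   # None while still attached
    monthly_cost: float


def deletion_candidates(volumes: List[Volume], now: datetime,
                        min_age: timedelta = timedelta(days=30)
                        ) -> List[Tuple[str, Optional[str], int, float]]:
    """Return (volume_id, last_vm, age_in_days, monthly_cost) per candidate."""
    report = []
    for v in volumes:
        if v.attached_vm is not None or v.detached_at is None:
            continue  # still in use, or detach time unknown
        age = now - v.detached_at
        if age >= min_age:
            report.append((v.volume_id, v.last_attached_vm, age.days,
                           v.monthly_cost))
    return report


NOW = datetime(2022, 5, 31, tzinfo=timezone.utc)
VOLUMES = [
    Volume("vol-a", "vm-1", "vm-1", None, 40.0),                    # attached
    Volume("vol-b", None, "vm-2", NOW - timedelta(days=90), 25.0),  # stale
    Volume("vol-c", None, "vm-3", NOW - timedelta(days=3), 10.0),   # recent
]

for volume_id, last_vm, age_days, cost in deletion_candidates(VOLUMES, NOW):
    print(volume_id, last_vm, age_days, cost)
```

The grace period is a design choice: it keeps recently detached disks, which may be about to be re-attached, out of the candidate list.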

PaaS Optimization 

Scaling DBaaS for Performance and Cost 

Cloud databases are among the most widely adopted PaaS services, because our world runs on data. Cloud-based PaaS database services are incredibly powerful, with redundancy and fault tolerance built in natively. The side effect of this power is that the average cost per database is significant: individual databases can be priced as high as $20,000 a month. Correctly sizing these instances is crucial, as they drive a critical component of your business revenue and application value. The ability to scale databases, often with zero downtime, allows us to take advantage of PaaS more completely. We can scale database resources exactly when the application needs them, without paying for unused database capacity when it doesn’t. When we consider the hundreds or thousands of databases in an enterprise environment, the value of elastic scaling becomes mind-boggling.

Azure SQL Database Scaling 

Turbonomic optimizes Azure SQL Databases, continuously evaluating real-time demand and generating actions to: 

    • Scale Between Azure Database Tiers 
    • Scale Within Azure Database Tiers 
    • Size Up/Down Database Volumes 

Figure 6. An Azure SQL database scaling action. 

AWS RDS Scaling—coming soon! 

Support for AWS RDS scaling will be out later this year. Turbonomic will continuously analyze vCPU, vMem, DB Cache Hit Rate, Storage Amount, and IOPS, in order to generate specific scale up or scale down actions, which include: 

    • A change in the compute tier 
    • A change in the storage tier 
    • A change in the Storage Amount 
    • A change in the Provisioned IOPS (for the io1 storage type) 
    • Any combination of these actions, with only a single period of downtime 

Figure 7. An Amazon RDS scaling action.

Kubernetes Optimization—Any Cloud, Any Flavor 

Our 2021 State of Multicloud Report found that for 61% of organizations, containerization will play a strategic role within 18 months; today it is already strategic for nearly 20%. Customers are seeing transformational results when modernizing to cloud native applications running on Kubernetes: speed to market and agility are the most immediate gains. But past the first few applications, operating at scale is extremely complex. Here too, Turbonomic provides the intelligence and automation to dynamically scale resourcing for performance and efficiency. Actions include:

  • Container Rightsizing: Scale container limits/requests up or down based on application demand—execute in real-time or with a DevOps workflow.  
  • Continuous Pod Moves: Automatically (and non-disruptively) move pods to avoid resource congestion and defragment the cluster.  
  • Intelligent Cluster Scaling: When pods have too little or too much cluster capacity, Turbonomic generates actions to provision or suspend nodes. 
  • Container Planning: Simulate how to optimize the existing environment to unlock capacity for growth; onboard more applications faster! 

Turbonomic supports all upstream versions of Kubernetes. Learn more about managing Kubernetes at production scale.

Automate Cloud Optimization for Cloud Elasticity  

Turbonomic will optimize cloud compute, cloud storage, cloud databases, as well as any upstream version of Kubernetes in the cloud. In addition to offering the smartest cloud optimization, Turbonomic uniquely gives our customers ways to mobilize their organization to drive greater automation. 

Application-Awareness—Bridge the Gap Between LOB and Cloud Ops 

One of the biggest challenges cloud teams face when they want to automate cloud optimization is demonstrating to Application and Product Owners that what they’re doing won’t hurt their applications and will instead ensure great customer experiences. With application awareness, customers can see that as demand fluctuates, response time stays low while Turbonomic continuously optimizes cloud resources.

Percentile-Based Scaling  

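Performance degradation or cloud cost overruns are more likely to occur when scaling to averages or peaks. Sizing to the average leaves no room for bursts and risks degraded performance, while sizing to the peak pays for capacity that sits idle most of the time. Cost overruns are the more common outcome because, with performance paramount, teams tend to size for peaks. Turbonomic uses percentile-based scaling to help customers achieve true cloud elasticity.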
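The difference between the three sizing targets is easy to see on a small data set. The sketch below uses a plain nearest-rank percentile on made-up CPU utilization samples. It illustrates the general technique only; the choice of the 95th percentile and the sample values are assumptions, not Turbonomic's implementation.

```python
import math
from statistics import mean
from typing import Sequence


def percentile(samples: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of
    all samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Made-up utilization (%): mostly steady, some bursts, three brief spikes.
cpu = [30.0] * 90 + [45.0] * 7 + [95.0] * 3

print("average:", mean(cpu))
print("peak:   ", max(cpu))
print("p95:    ", percentile(cpu, 95))
```

For this sample the average is 33, the peak is 95, and the 95th percentile is 45. Sizing to the percentile covers the recurring bursts without paying continuously for three momentary spikes, and the percentile itself is the knob that trades performance headroom against cost.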

Customizable Observation Periods 

Dynamic, fluctuating demand is true for just about every application. With Turbonomic, customers can configure observation periods to ensure that Turbonomic is analyzing data that accounts for their unique business cycles.  
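To show why the length of the observation period matters, the sketch below filters timestamped utilization samples to a look-back window before taking the maximum. The data, the 7-day and 30-day windows, and the `samples_in_window` helper are hypothetical; they only illustrate how a short window can miss a business cycle that a longer one captures.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Sample = Tuple[datetime, float]  # (timestamp, utilization %)


def samples_in_window(samples: List[Sample], now: datetime,
                      period: timedelta) -> List[float]:
    """Keep only the values observed within `period` before `now`."""
    cutoff = now - period
    return [value for ts, value in samples if cutoff <= ts <= now]


NOW = datetime(2022, 5, 31)

# One made-up sample per day for 30 days: steady at 20%, except a
# month-end batch run 20 days ago that drove utilization to 90%.
SAMPLES = [(NOW - timedelta(days=d), 90.0 if d == 20 else 20.0)
           for d in range(30)]

week = samples_in_window(SAMPLES, NOW, timedelta(days=7))
month = samples_in_window(SAMPLES, NOW, timedelta(days=30))

print("7-day max: ", max(week))
print("30-day max:", max(month))
```

In this example the 7-day window sees only the steady 20% load, while the 30-day window also sees the 90% batch run, so a workload with a monthly cycle needs an observation period at least that long.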

So that’s what we do in the cloud—continuously, putting application performance and the end-user experience first, and without cloud waste. If you’re ready to make Turbonomic part of your digital transformation, check out Try Turbonomic, a self-service demo that you can explore at your own pace.