![](https://dw1.s81c.com//IMWUC/MessageImages/e44ea51a57ac43728a9872beed145246.jpg)
The content on this page was originally published on Instana.com and has been migrated to the community as a historical asset. As such, it may contain outdated information on our products and features. Please comment if you have questions about the content.
Scaling Microservices: General Strategies for Design and Optimization
When designing distributed systems, it’s important to understand that explicit design decisions must be made to enable scalability within components. These applications must be engineered from the beginning to meet anticipated needs, with options that facilitate future growth. We build our systems in anticipation of scaling because we expect the platform to grow, which means more users, features, and data.
This is the first article in a series of posts where we will discuss topics which include:
- Identifying bottlenecks and refactoring
- Measuring results
- Being proactive
- Scaling up
- The AKF Scale Cube
- And more…
![](https://dw1.s81c.com//IMWUC/MessageImages/8357127a7b2b4adbb8ecbe670bd9f11d.png)
Ensuring the performance of our applications is critical. If we build systems which do not scale, or which slow down as usage increases, our users will go to our competitors. Google discovered that an artificial delay of as little as 400 ms added to their search response caused users to conduct 0.2 to 0.6 percent fewer searches. For most businesses, this behavior translates directly into lost revenue.
It may seem counter-intuitive, but a slow web service has proven to be a more frustrating experience for users than one which is down. If a service is down, users see this immediately and typically come back later to complete their task. If the service is repeatedly slow, they often leave frustrated and seek out a competitor who can solve their problem without sluggish or unresponsive behavior.
Design for Scale: Building Observability
During the design process, careful consideration must be given to how you will measure performance and scale. Building observability into your application is a deliberate process which requires open source tooling or vendor solutions. The most common metrics measured are transactions per second, transaction latency, and number of users. One or more of these metrics may be used to decide when to scale out and to measure the effectiveness of your scaling operations. If any one of these indicators begins to plateau, that may be a sign that your service must be redesigned or refactored to continue scaling linearly.
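As a minimal sketch of how these metrics can drive a scale-out decision, the snippet below derives throughput and tail latency from a window of request samples. The function names, thresholds, and data shapes here are illustrative assumptions, not from the original post:

```python
import statistics

def scaling_signals(request_log, window_seconds):
    """Summarize a window of (timestamp, latency_seconds) request samples
    into the two most common scaling metrics: throughput and tail latency."""
    latencies = [latency for _, latency in request_log]
    tps = len(request_log) / window_seconds          # transactions per second
    p95 = statistics.quantiles(latencies, n=20)[-1]  # 95th-percentile latency
    return {"tps": tps, "p95_latency": p95}

def should_scale_out(signals, tps_limit=500, latency_slo=0.4):
    # Scale out when throughput nears capacity or latency breaches the SLO.
    return signals["tps"] > tps_limit or signals["p95_latency"] > latency_slo
```

In practice these thresholds would come from load testing; the point is that the trigger is an explicit, measured signal rather than a guess.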
The most common strategy when building systems which can scale is to avoid design patterns which prevent scaling out in the future. These patterns include stateful designs which rely on local disk access to perform operations, or operations which load large data sets to run complex sorting algorithms. It’s critical for designers to foresee scaling issues during initial development; however, this should never come at the detriment of delivering a working solution to the customer or business. Spending too much time addressing scaling problems which may or may not happen is premature optimization, and this should be avoided at all costs.
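To illustrate the stateless alternative to those patterns, the sketch below keeps session data in a shared external store rather than in process memory or on local disk, so any replica can serve any request. The dict here is a hypothetical stand-in for a shared cache such as Redis:

```python
# Stand-in for an external shared store (e.g. Redis or memcached).
# Because no state lives in the process, replicas behind a load
# balancer are interchangeable and the service scales horizontally.
SESSION_STORE = {}

def handle_request(session_id, item):
    # Fetch session state from the shared store, not local memory/disk.
    session = SESSION_STORE.get(session_id, {"items": []})
    session["items"].append(item)
    SESSION_STORE[session_id] = session  # write back; nothing kept locally
    return len(session["items"])
```

With this shape, adding capacity is a matter of running more identical replicas; nothing pins a user to a particular instance.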
![](https://dw1.s81c.com//IMWUC/MessageImages/c473ccc78f1b4b9dacc8abc71fb49360.png)
Service Performance Analysis – Requests, Latency, Error Rate
Once our service is live, we can analyze its performance with monitoring tools to determine areas where it can be improved. Improving performance is an ongoing process which requires precise instrumentation and a deep understanding of the interactions our services have with other systems. In later posts we’ll discuss how to begin identifying bottlenecks, refactoring, and measuring results.
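The three quantities in the chart above (requests, latency, error rate) can be computed from raw request records. A minimal sketch, assuming each record is a `(latency_seconds, status_code)` pair and that status codes of 500 and above count as errors:

```python
def service_performance(requests, window_seconds):
    """Reduce a window of (latency_seconds, status_code) samples to
    request rate, error rate, and average duration."""
    total = len(requests)
    errors = sum(1 for _, status in requests if status >= 500)
    return {
        "rate": total / window_seconds,                     # requests/sec
        "error_rate": errors / total,                       # fraction failed
        "avg_duration": sum(lat for lat, _ in requests) / total,
    }
```

Watching these three numbers together is what lets you distinguish a capacity problem (rate up, latency up) from a defect (error rate up), which is where bottleneck identification begins.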
Be sure to subscribe to our newsletter to be updated about future posts in this series.
#SRE