...
spec:
  config:
    env:
      - name: ENABLE_PROFILING_SERVER
        value: "true"
Save the changes. This will trigger the creation of a new BTS-Operator pod with the profiling endpoints enabled. The process is also documented in the BTS Knowledge Center in [1].
The profiling endpoints are now available on the BTS-Operator pod and must be exposed to be accessible from outside the cluster. An easy way to do this is port-forwarding. Use the following command to forward the profiling port from the pod to your local machine:
$ oc port-forward pod/ibm-bts-operator-controller-manager-6bcc8d5cfb-45ztt 8082
Forwarding from 127.0.0.1:8082 -> 8082
Forwarding from [::1]:8082 -> 8082
The profiling endpoints are now available on port 8082 of localhost on your local machine.
Generating and Collecting Profiling Data
In order to generate CPU and memory profiles from the running BTS-Operator, the pprof APIs must be called.
Here is an example of creating a CPU and Memory profile using curl:
$ curl -s "http://127.0.0.1:8082/debug/pprof/heap" > ~/heap-profile.out
$ curl -s "http://127.0.0.1:8082/debug/pprof/profile" > ~/cpu-profile.out
These commands fetch the profiling data from the pprof server and save it locally into a file. Note that the CPU profile endpoint samples for 30 seconds by default before it returns; the duration can be adjusted with the seconds query parameter (for example /debug/pprof/profile?seconds=60).
To diagnose memory leaks and out-of-memory (OOM) errors, it makes sense to generate these files periodically and keep them for later analysis: as the heap slowly grows, it is impossible to predict exactly when the OOM error will occur.
The following shell script takes a CPU and memory dump at a pre-defined interval:
#!/bin/bash

# Check if period and target directory are provided
if [ -z "$1" ] || [ -z "$2" ]; then
  echo "Usage: $0 <period-in-seconds> <target-directory>"
  exit 1
fi

PERIOD="$1"
LOG_DIR="$2"

# Create target directory if it doesn't exist
mkdir -p "$LOG_DIR"

echo "Starting profiling loop with a period of $PERIOD seconds..."
echo "Profiles will be saved to $LOG_DIR"

while true; do
  TIMESTAMP=$(date +"%Y%m%d-%H%M%S")
  echo "Collecting profiles at $TIMESTAMP..."
  curl -s "http://127.0.0.1:8082/debug/pprof/heap" > "$LOG_DIR/heap-profile-$TIMESTAMP.out"
  curl -s "http://127.0.0.1:8082/debug/pprof/profile" > "$LOG_DIR/cpu-profile-$TIMESTAMP.out"
  sleep "$PERIOD"
done
It can be used like this:
$ ./go-profile.sh 60 ~/bts-operator-profile
Starting profiling loop with a period of 60 seconds...
Profiles will be saved to /Users/bwende/logs/bts-operator-profile
Collecting profiles at 20250627-153532...
Collecting profiles at 20250627-153704...
This will take a CPU and memory dump every 60 seconds until an issue occurs, for example the pod running into an OOM error. The latest profile data can then be used to analyze the OOM issue. It is also possible to compare the dumps with each other to see whether certain objects accumulate over time.
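pprof supports this kind of comparison directly: the -diff_base flag subtracts an older profile from a newer one, so the UI only shows the allocation growth between the two snapshots. A sketch using two of the heap dumps collected above:

```
$ go tool pprof -http=:8080 -diff_base=heap-profile-20250627-153532.out heap-profile-20250627-153704.out
```

Objects that keep growing between snapshots, a typical leak signature, stand out clearly in this diff view.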
Analyzing Profile Data
When optimizing the performance of a Go application, profiling is one of the most powerful tools available. Go makes this especially developer-friendly through its built-in pprof tool, which allows you to visualize CPU and memory (heap) usage in both text-based and graphical formats. This chapter focuses on the graphical user interface (GUI) provided by the go tool pprof -http command, which brings deep insights to life through interactive visualizations.
So let's jump right into opening the Profiling User Interface:
go tool pprof -http=:8080 ~/logs/bts-operator-profile/heap-profile-20250630-083751.out
This command will open a new browser window and show the pprof UI:
The screenshot shows a heap profile call graph generated by Go's pprof web UI, providing a visual breakdown of memory allocations in the BTS-Operator executable. Each box represents a function, with its size and color intensity indicating how much memory it has allocated; darker red means higher memory usage. At the center, we see that (*ConfigMap).Unmarshal is the top memory consumer, responsible for over 31% of total heap allocations, making it a primary candidate for optimization. The arrows between nodes illustrate function call relationships, showing how memory usage propagates through the call stack. This graph makes it easy to trace high-memory paths and pinpoint the exact sources of inefficient allocations in the code.
This screenshot shows a CPU profile call graph from the pprof web UI for the BTS-Operator, visualizing how the application consumed processor time during execution. Each box represents a function, with arrows indicating call relationships and labels showing how much CPU time (in milliseconds) was spent in each function. The highlighted box controller.(*ConfigMapReconciler).Reconcile is at the root, showing that it initiated most of the observed activity. Functions like runtime.futex, json stateT, and client.(*Client).Do appear as notable contributors to CPU time, indicating possible targets for optimization or deeper inspection. This graphical view makes it easy to identify the hottest code paths, understand execution flow, and optimize CPU-heavy logic within the operator.
The memory flame graph in Go's pprof web interface is one of the most powerful visual tools for analyzing heap usage and memory allocations in your application. It helps developers quickly identify which functions are responsible for the majority of memory consumption and how allocations flow through the call stack.
Each horizontal bar in the flame graph represents a function, and its width indicates the amount of memory allocated by that function and its callees. The wider the bar, the more memory it consumed. Bars are stacked vertically to show the call hierarchy, meaning functions at the top called those below. This layout gives you a top-down view of how memory usage propagates through the application.
By default, the graph is sorted so that the most memory-intensive call paths appear in the center. You can hover over any bar to see exact byte counts and percentages, and click to zoom into specific branches of the call tree. This makes it easy to drill down into complex allocation paths and isolate problems like memory bloat, excessive allocations, or potential leaks.
An in-depth discussion of (CPU) flame graphs can be found in [2].
Summary
The article introduced CPU and memory profiling for Go-based Kubernetes operators, focusing on the BTS-Operator. It explained how profiling was enabled using pprof, how profiling data was collected via HTTP, and how a shell script was used to automate periodic dumps for diagnosing memory leaks and performance issues. The analysis section demonstrated how the pprof web interface was used to examine heap call graphs, CPU execution paths, and memory flame graphs to identify and understand optimization opportunities.
References
[1] https://www.ibm.com/docs/en/cloud-paks/foundational-services/4.13.0?topic=service-troubleshooting
[2] https://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html