IBM Storage Insights (SI) is a cloud service that monitors, analyzes, and optimizes storage across heterogeneous environments. It exposes a REST API so teams can automate inventory, capacity, performance, and alert workflows at scale. In this post we’ll walk through how to turn the SI REST API into runnable workflows.
Why workflows (and not just endpoints)?
Real tasks are multi-step: identify systems → filter by health/capacity → fetch details → take action. Encoding those steps as a reusable workflow makes them testable, documentable, and shareable across teams, instead of living as fragile one-off scripts. Workflows also create a stable contract between product, platform, and ops: “this is how we accomplish X, end-to-end.”
OpenAPI is a machine-readable contract that describes your API—paths, parameters, schemas, auth, and errors. An SDK turns HTTP details into ergonomic, typed methods and models for your language. It accelerates adoption (fewer curl snippets, fewer auth/header bugs) and makes refactors safer because types and method names change in lockstep with the spec. For SI, a Python SDK means analysts and SREs can script storage use-cases quickly and predictably.
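As a sketch of the difference, compare a raw response dict to the kind of typed model a generated SDK gives you. `StorageSystem` here is a hypothetical stand-in for a Speakeasy-generated model class, not the actual generated code:

```python
# Hypothetical sketch: what "typed methods and models" buys over raw dicts.
from dataclasses import dataclass

@dataclass
class StorageSystem:
    system_id: str
    name: str
    vols_count: int

# Raw-HTTP style: a dict whose keys are only checked at runtime.
raw = {"systemId": "sys-9a2f", "name": "ProdArray-A1", "volsCount": 32821}

# SDK style: wire names map to attributes in one place; typos fail loudly
# and type checkers can verify every downstream access.
system = StorageSystem(
    system_id=raw["systemId"],
    name=raw["name"],
    vols_count=raw["volsCount"],
)
print(system.vols_count)
```

With a generated SDK this mapping (and the HTTP call, auth headers, and error handling around it) is produced from the spec, so it stays in lockstep when the spec changes.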
Arazzo is an OAI specification for describing multi-step workflows that call one or more API operations. You declare inputs, ordered steps bound to OpenAPI operations (via operationId), success criteria, and outputs—so the “story” of a task is explicit and executable. In this blog, we’ll capture SI scenarios (e.g., rank block systems by volume count, surface alert hotspots) as Arazzo files that live next to the spec.
Speakeasy consumes your OpenAPI to generate idiomatic SDKs (here: Python) and can also generate runnable code/tests from Arazzo workflows. This gives you two automation layers: the client (SDK) for day-to-day coding and the workflows/tests for validation and smoke checks.
TL;DR — Setup & Quickstart
1. Install the tools
i. OpenAPI CLI (validates OpenAPI & Arazzo)
brew install openapi
openapi --version
ii. Speakeasy CLI (generates SDKs; can run Arazzo tests)
curl -fsSL https://go.speakeasy.com/cli-install.sh | sh
~/.speakeasy/bin/speakeasy --version
~/.speakeasy/bin/speakeasy auth login
Now you can:
- validate the spec and workflows: openapi spec validate …, openapi arazzo validate …
- generate the Python SDK: speakeasy run --schema openapi/openapi.yaml --target sdk --lang python --out sdks/python
Follow these steps from the GitHub repo.
2. Create a Python env & install deps
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -U pip
pip install -r requirements.txt
3. Configure environment
Create a .env file:
SI_BASE_URL="https://insights.ibm.com"
SI_TENANT_ID="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
SI_API_KEY="***your long-lived API key***" # used to mint x-api-token
The repo has src/auth/token_manager.py and src/config.py; wrappers import from these so calls automatically send x-api-token.
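For illustration, here is a minimal sketch of the token-manager idea: exchange the long-lived SI_API_KEY for a short-lived x-api-token and cache it until it expires. The token endpoint path and x-api-key header follow IBM’s published pattern, but verify them against your copy of the spec; the repo’s actual implementation may differ. The HTTP call is injected as a callable so the logic stays testable:

```python
import time

class TokenManager:
    def __init__(self, base_url: str, tenant_id: str, api_key: str, ttl_s: int = 840):
        self.base_url = base_url.rstrip("/")
        self.tenant_id = tenant_id
        self.api_key = api_key
        self.ttl_s = ttl_s            # refresh a bit before the ~15-minute expiry
        self._token = None
        self._expires_at = 0.0

    def token_request(self) -> tuple:
        """URL and headers for minting a token (pure, so it is easy to test)."""
        url = f"{self.base_url}/restapi/v1/tenants/{self.tenant_id}/token"
        return url, {"x-api-key": self.api_key}

    def get(self, fetch) -> str:
        """Return a cached token, minting a new one via fetch(url, headers)."""
        if self._token is None or time.time() >= self._expires_at:
            url, headers = self.token_request()
            self._token = fetch(url, headers)   # e.g. POST and pull the token field
            self._expires_at = time.time() + self.ttl_s
        return self._token
```

The wrappers then attach the result as the x-api-token header on every call, which is exactly what importing from src/auth/token_manager.py gives you in the repo.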
4. Get the OpenAPI spec and validate
# update/replace if you need a fresh copy
curl -sSf "$SI_BASE_URL/restapi/openapi.yaml" -o openapi/openapi.yaml
# optional but recommended
openapi spec validate openapi/openapi.yaml
5. Author or tweak a workflow (Arazzo)
Workflows live in workflows/, and example files are already present.
Validate:
openapi arazzo validate workflows
6. Generate the Python SDK (Speakeasy)
# install CLI once
curl -fsSL https://go.speakeasy.com/cli-install.sh | sh
~/.speakeasy/bin/speakeasy auth login
# generate SDK from your OpenAPI
~/.speakeasy/bin/speakeasy run \
--schema ./openapi/openapi.yaml \
--target sdk --lang python \
--out ./sdks/python
7. Use the wrappers (mapped to workflows)
Wrappers are under src/workflows/ (e.g., block_storage.py, block_alerts.py). A typical call shape looks like:
# example usage pattern
from src.workflows.block_storage import get_block_storage_volume_leaders

top = get_block_storage_volume_leaders(limit=5)
for s in top:
    print(s["systemId"], s["volsCount"])
Run your tests to sanity-check:
pytest -q
8. Execute workflows as runnable tests from Arazzo
If you want executable flows alongside docs:
~/.speakeasy/bin/speakeasy lint arazzo -f workflows
~/.speakeasy/bin/speakeasy test
By default these use a mock; set SI_BASE_URL and a valid token (the repo’s TokenManager can mint one) to hit real endpoints.
What you have now: the SI OpenAPI under version control, Arazzo files that describe real multi-step tasks, a generated Python SDK, and lightweight wrappers that align code with your workflows—ready to extend with more storage scenarios.
Arazzo workflows for IBM Storage Insights
Single-step workflow — Block Storage: Volume Leaders
We’ll start with a single-step workflow: list all block storage systems, rank them by volume count, return the top N, and shape the response into a clean, reusable output. This shows the smallest useful Arazzo file—one input, one operationId, one output—and how it maps directly to a Python wrapper.
Steps (bound to OpenAPI)
- listBlockSystems → GET /restapi/v1/tenants/{tenant_uuid}/storage-systems?storage-type=block
  - operationId: listBlockStorageSystems (replace with the exact one from your spec)
  - successCriteria: $statusCode == 200
  - Output extraction: systems array → normalize storageSystems vs data.
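That normalization plus the ranking is the whole core of the wrapper, so it is worth sketching. The field names (storageSystems, data, volsCount) are the ones shown in this post; `extract_systems` and `top_by_volumes` are illustrative helpers, not necessarily the repo’s exact functions:

```python
def extract_systems(body: dict) -> list:
    """Return the systems array regardless of which envelope key the API used."""
    return body.get("storageSystems") or body.get("data") or []

def top_by_volumes(body: dict, limit: int = 5) -> list:
    """Rank systems by volume count, descending, and keep the top N."""
    systems = extract_systems(body)
    return sorted(systems, key=lambda s: s.get("volsCount", 0), reverse=True)[:limit]
```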
Arazzo (workflows/block-storage-volume-leaders.arazzo.yaml):
arazzo: "1.0.1"
info:
  title: "Block Storage: Volume Leaders"
  version: "1.0.0"
sourceDescriptions:
  - name: si
    url: ../openapi/openapi.yaml
    type: openapi
workflows:
  - workflowId: blockVolumeLeaders
    summary: "List block storage systems and return the top N by volumes"
    inputs:
      type: object
      properties:
        limit: { type: integer, minimum: 1, default: 5 }
      required: [limit]
    steps:
      - stepId: listBlockSystems
        operationId: listBlockStorageSystems   # adjust to your spec
        parameters:
          - name: storage-type
            in: query
            value: block
        successCriteria:
          - condition: $statusCode == 200
        outputs:
          systems: $response.body.storageSystems ?? $response.body.data
    outputs:
      topN: $steps.listBlockSystems.outputs.systems
Run it:
# env
source .venv/bin/activate
set -a && source .env && set +a
# python wrapper call
python - <<'PY'
from src.workflows.block_storage import get_block_storage_volume_leaders
top = get_block_storage_volume_leaders(limit=5)
print({"count": len(top), "top": top})
PY
Example output:
{
  "count": 5,
  "top": [
    {"systemId": "sys-9a2f", "name": "ProdArray-A1", "volsCount": 32821},
    {"systemId": "sys-41d7", "name": "ProdArray-B2", "volsCount": 32470},
    {"systemId": "sys-72cc", "name": "ProdArray-C3", "volsCount": 32202},
    {"systemId": "sys-8f30", "name": "ProdArray-D4", "volsCount": 31882},
    {"systemId": "sys-61b9", "name": "ProdArray-E5", "volsCount": 30916}
  ]
}
Multi-step workflow — Block Storage: Alert Hotspots
Next, let's find noisy systems over a time window by counting alerts (weighted critical > warning > info), return a ranked list, and fetch the latest alert feed for the top system.
Steps (bound to OpenAPI)
- listBlockSystems → GET /restapi/v1/tenants/{tenant_uuid}/storage-systems?storage-type=block
- listAlerts → GET /restapi/v1/tenants/{tenant_uuid}/alerts?severity=critical&severity=warning&severity=info&duration={duration}
- listSystemAlerts → GET /restapi/v1/tenants/{tenant_uuid}/storage-systems/{storage_system_uuid}/alerts?duration={duration}
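Ranking "noisy" systems needs a severity-weighted score so a single critical outweighs a pile of infos. A sketch with illustrative weights (the repo's actual weighting may differ):

```python
from collections import Counter

# Illustrative weights only; tune to your own operational priorities.
WEIGHTS = {"critical": 100, "warning": 10, "info": 1}

def rank_systems(alerts: list, limit: int = 5) -> list:
    """Group alerts by system, count per severity, and sort by weighted score."""
    counts = {}
    for a in alerts:
        sid = a.get("systemId", "unknown")
        counts.setdefault(sid, Counter())[a.get("severity", "info").lower()] += 1
    ranked = [
        {"systemId": sid,
         "counts": dict(c),
         "score": sum(WEIGHTS.get(sev, 0) * n for sev, n in c.items())}
        for sid, c in counts.items()
    ]
    ranked.sort(key=lambda r: r["score"], reverse=True)
    return ranked[:limit]
```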
Arazzo (workflows/block-alert-hotspots.arazzo.yaml):
arazzo: "1.0.1"
info:
  title: "Block Storage: Alert Hotspots"
  version: "1.0.0"
sourceDescriptions:
  - name: si
    url: ../openapi/openapi.yaml
    type: openapi
workflows:
  - workflowId: blockAlertHotspots
    summary: "Rank block systems by recent alerts and fetch feed for the noisiest system"
    inputs:
      type: object
      properties:
        duration: { type: string, default: "7d" }   # adapt to your API's time filter
        limit: { type: integer, minimum: 1, default: 5 }
      required: [duration, limit]
    steps:
      - stepId: listBlockSystems
        operationId: listBlockStorageSystems
        parameters:
          - name: storage-type
            in: query
            value: block
        successCriteria:
          - condition: $statusCode == 200
        outputs:
          systems: $response.body.storageSystems ?? $response.body.data
      - stepId: listAlerts
        operationId: listAlerts
        parameters:
          - { name: severity, in: query, value: critical }
          - { name: severity, in: query, value: warning }
          - { name: severity, in: query, value: info }
          - { name: duration, in: query, value: $inputs.duration }
        successCriteria:
          - condition: $statusCode == 200
        outputs:
          alerts: $response.body.alerts ?? $response.body.data
      - stepId: listTopSystemAlerts
        operationId: listStorageSystemAlerts
        parameters:
          - { name: storage_system_uuid, in: path, value: $steps.listBlockSystems.outputs.systems[0].systemId }   # placeholder; compute from the ranked list
          - { name: duration, in: query, value: $inputs.duration }
        successCriteria:
          - condition: $statusCode == 200
        outputs:
          topSystemAlerts: $response.body.alerts ?? $response.body.data
    outputs:
      rankedSystems: $steps.listAlerts.outputs.alerts
      topSystemAlerts: $steps.listTopSystemAlerts.outputs.topSystemAlerts
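The chaining logic of this workflow can be sketched independently of HTTP. `fetch_alerts` and `fetch_system_alerts` below are hypothetical callables standing in for the SDK/wrapper calls, so the step-to-step data flow is the only thing on display:

```python
def alert_hotspots(fetch_alerts, fetch_system_alerts, duration="7d", limit=5):
    """Rank systems by recent alert volume, then fetch the noisiest system's feed."""
    # Steps 1-2: pull the tenant-wide alert list for the window and count
    # alerts per system (swap in severity weighting as needed).
    alerts = fetch_alerts(duration)
    per_system = {}
    for a in alerts:
        per_system[a["systemId"]] = per_system.get(a["systemId"], 0) + 1
    ranked = sorted(per_system.items(), key=lambda kv: kv[1], reverse=True)[:limit]
    # Step 3: fetch the detailed feed only for the top-ranked system.
    top_feed = fetch_system_alerts(ranked[0][0], duration) if ranked else []
    return {"ranked": ranked, "topSystemAlerts": top_feed}
```

Injecting the fetchers like this is also how the repo's tests can run the same flow against a mock or the real API.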
Run it:
python - <<'PY'
from src.workflows.block_alerts import analyze_block_alert_hotspots

res = analyze_block_alert_hotspots(duration="7d", limit=5)
print("Ranked systems (top 5):")
for s in res.ranked_systems:
    c = s.counts.to_dict()
    print(s.system_id, s.name, c, "score=", s.score)
print("\nTop system latest alerts (truncated):")
for a in res.top_system_alerts[:5]:
    print(a.get("timestamp"), a.get("severity"), a.get("type"), a.get("summary"))
PY
Example output:
Ranked systems (top 5):
sys-9a2f ProdArray-A1 {'critical': 3, 'warning': 11, 'info': 2, 'total': 16} score= 341
sys-41d7 ProdArray-B2 {'critical': 2, 'warning': 7, 'info': 1, 'total': 10} score= 221
sys-72cc ProdArray-C3 {'critical': 1, 'warning': 5, 'info': 0, 'total': 6 } score= 150
sys-8f30 ProdArray-D4 {'critical': 1, 'warning': 2, 'info': 0, 'total': 3 } score= 120
sys-61b9 ProdArray-E5 {'critical': 0, 'warning': 6, 'info': 2, 'total': 8 } score= 62
Top system latest alerts (truncated):
2025-11-12T08:15:33Z CRITICAL CAPACITY "Pool XYZ at 92% utilization"
2025-11-12T07:50:12Z WARNING GENERAL "Background scrub delayed"
2025-11-12T07:45:05Z WARNING CAPACITY "Volume V-1234 approaching limit"
2025-11-12T07:22:11Z INFO GENERAL "Config checkpoint complete"
2025-11-12T07:05:43Z WARNING PERFORMANCE "High latency on host H-77"
Conclusion
By treating the IBM Storage Insights REST API as our system of record, we get a single, reliable contract to build on. From there, the path is straightforward:
- OpenAPI as the source of truth (validated and versioned)
- Arazzo to capture real, multi-step workflows in plain YAML
- Speakeasy to turn the contract into an ergonomic Python SDK (and optionally generate runnable workflow code/tests)
This stack replaces scattered scripts with repeatable, reviewable, and executable flows. Teams gain faster onboarding (typed SDK), clearer intent (Arazzo workflows), and safer change management (spec-driven generation).
Next steps
- Add more SI workflows (capacity hotspots, performance outliers, inventory diffs).
- Evolve wrappers to match your Arazzo inputs/outputs and return typed results.
- Expand testing with mock + real runs; use severity-weighted scoring or domain rules where it helps.
- Layer in automation later to validate specs/workflows and refresh SDKs on change.
All code and examples from this post are in the GitHub repo, vibe-coded using Codex:
👉 Randhir123/storage-insights-workflows