Global Storage Forum

OpenAPI Spec to Workflows for IBM Storage Insights: Design with Arazzo, Generate with Speakeasy

By Randhir Singh posted 15 hours ago

  

IBM Storage Insights (SI) is a cloud service that monitors, analyzes, and optimizes storage across heterogeneous environments. It exposes a REST API so teams can automate inventory, capacity, performance, and alert workflows at scale. In this post we’ll walk through how to turn the SI REST API into runnable workflows.

Why workflows (and not just endpoints)?

Real tasks are multi-step: identify systems → filter by health/capacity → fetch details → take action. Encoding those steps as a reusable workflow makes them testable, documentable, and shareable across teams, instead of living as fragile one-off scripts. Workflows also create a stable contract between product, platform, and ops: “this is how we accomplish X, end-to-end.”

OpenAPI is a machine-readable contract that describes your API—paths, parameters, schemas, auth, and errors. An SDK turns HTTP details into ergonomic, typed methods and models for your language. It accelerates adoption (fewer curl snippets, fewer auth/header bugs) and makes refactors safer because types and method names change in lockstep with the spec. For SI, a Python SDK means analysts and SREs can script storage use-cases quickly and predictably.

Arazzo is an OAI specification for describing multi-step workflows that call one or more API operations. You declare inputs, ordered steps bound to OpenAPI operations (via operationId), success criteria, and outputs—so the “story” of a task is explicit and executable. In this blog, we’ll capture SI scenarios (e.g., rank block systems by volume count, surface alert hotspots) as Arazzo files that live next to the spec.

Speakeasy consumes your OpenAPI to generate idiomatic SDKs (here: Python) and can also generate runnable code/tests from Arazzo workflows. This gives you two automation layers: the client (SDK) for day-to-day coding and the workflows/tests for validation and smoke checks. 

TL;DR — Setup & Quickstart

1. Install the tools

i. OpenAPI CLI (validates OpenAPI & Arazzo)

brew install openapi
openapi --version

ii. Speakeasy CLI (generates SDKs; can run Arazzo tests)

curl -fsSL https://go.speakeasy.com/cli-install.sh | sh
~/.speakeasy/bin/speakeasy --version
~/.speakeasy/bin/speakeasy auth login

Now you can:

  • validate spec/workflows: openapi spec validate …, openapi arazzo validate …

  • generate the Python SDK: speakeasy run --schema openapi/openapi.yaml --target sdk --lang python --out sdks/python

Follow these steps from the GitHub repo.

2. Create a Python env & install deps

python -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install -U pip
pip install -r requirements.txt

3. Configure environment

Create .env :

SI_BASE_URL="https://insights.ibm.com"       
SI_TENANT_ID="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
SI_API_KEY="***your long-lived API key***"   # used to mint x-api-token

The repo has src/auth/token_manager.py and src/config.py; wrappers import from these so calls automatically send x-api-token.
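A minimal sketch of what such a token manager might do (the class shape and method names below are assumptions for illustration; the repo’s src/auth/token_manager.py is the authoritative version). The HTTP call that exchanges SI_API_KEY for a short-lived token is injected as a callable, since its exact endpoint varies by deployment:

```python
import time
from typing import Callable, Tuple


class TokenManager:
    """Caches a short-lived x-api-token minted from the long-lived API key.

    `mint` performs the actual exchange (API key -> token) and returns
    (token, expires_at_epoch_seconds); keeping it injectable makes this
    sketch testable without network access.
    """

    def __init__(self, mint: Callable[[], Tuple[str, float]]):
        self._mint = mint
        self._token = None
        self._expires_at = 0.0

    def get_token(self) -> str:
        # Refresh slightly early to avoid using a token at its expiry edge.
        if self._token is None or time.time() > self._expires_at - 60:
            self._token, self._expires_at = self._mint()
        return self._token

    def auth_headers(self) -> dict:
        # Wrappers merge this into every request's headers.
        return {"x-api-token": self.get_token()}
```

Wrappers then only call `auth_headers()`; token refresh happens transparently when the cached token nears expiry.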

4. Get the OpenAPI spec and validate

# update/replace if you need a fresh copy
curl -sSf "$SI_BASE_URL/restapi/openapi.yaml" -o openapi/openapi.yaml

# optional but recommended
openapi spec validate openapi/openapi.yaml

5. Author or tweak a workflow (Arazzo)

Workflows live in workflows/. Example files already present:

  • workflows/block-storage-volume-leaders.arazzo.yaml

  • workflows/block-alert-hotspots.arazzo.yaml

Validate:

openapi arazzo validate workflows

6. Generate the Python SDK (Speakeasy)

# install CLI once
curl -fsSL https://go.speakeasy.com/cli-install.sh | sh
~/.speakeasy/bin/speakeasy auth login

# generate SDK from your OpenAPI
~/.speakeasy/bin/speakeasy run \
  --schema ./openapi/openapi.yaml \
  --target sdk --lang python \
  --out ./sdks/python

7. Use the wrappers (mapped to workflows)

Wrappers are under src/workflows/ (e.g., block_storage.py, block_alerts.py). A typical call shape looks like:

# example usage pattern
from src.workflows.block_storage import get_block_storage_volume_leaders

top = get_block_storage_volume_leaders(limit=5)
for s in top:
    print(s["systemId"], s["volsCount"])

Run your tests to sanity-check:

pytest -q

8. Execute workflows as runnable tests from Arazzo

If you want executable flows alongside docs:

~/.speakeasy/bin/speakeasy lint arazzo -f workflows
~/.speakeasy/bin/speakeasy test

By default these use a mock; set SI_BASE_URL and a valid token (the repo’s TokenManager can mint one) to hit real endpoints.

What you have now: the SI OpenAPI under version control, Arazzo files that describe real multi-step tasks, a generated Python SDK, and lightweight wrappers that align code with your workflows—ready to extend with more storage scenarios.

Arazzo workflows for IBM Storage Insights

Single-step workflow — Block Storage: Volume Leaders

We’ll start with a single-step workflow: list all block storage systems, rank them by volume count, return the top N, and shape the response into a clean, reusable output. This shows the smallest useful Arazzo file—one input, one operationId, one output—and how it maps directly to a Python wrapper.

 

Steps (bound to OpenAPI)

  1. listBlockSystems: GET /restapi/v1/tenants/{tenant_uuid}/storage-systems?storage-type=block

    • operationId: listBlockStorageSystems (replace with the exact one from your spec)

    • successCriteria: $statusCode == 200

    • Output extraction: systems array → normalize storageSystems vs data.
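The wrapper behind this step can be sketched as follows. The field names (`storageSystems`, `data`, `volsCount`) follow the response shapes mentioned above, but the exact keys depend on your spec version, so treat this as an assumption-laden outline rather than the repo’s exact code:

```python
from typing import Any, Dict, List


def normalize_systems(body: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Some responses nest systems under `storageSystems`, others under `data`."""
    return body.get("storageSystems") or body.get("data") or []


def volume_leaders(body: Dict[str, Any], limit: int = 5) -> List[Dict[str, Any]]:
    """Rank block systems by volume count and return the top N."""
    systems = normalize_systems(body)
    ranked = sorted(systems, key=lambda s: s.get("volsCount", 0), reverse=True)
    return ranked[:limit]
```

This is the same normalize-then-rank logic the Arazzo `outputs` expression expresses declaratively, done imperatively in the wrapper.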

Arazzo (workflows/block-storage-volume-leaders.arazzo.yaml):

arazzo: "1.0.1"
info:
  title: "Block Storage: Volume Leaders"
  version: "1.0.0"

sourceDescriptions:
  - name: si
    url: ../openapi/openapi.yaml
    type: openapi

workflows:
  - workflowId: blockVolumeLeaders
    summary: "List block storage systems and return the top N by volumes"
    inputs:
      type: object
      properties:
        limit: { type: integer, minimum: 1, default: 5 }
      required: [limit]

    steps:
      - stepId: listBlockSystems
        operationId: listBlockStorageSystems   # adjust to your spec
        parameters:
          - name: storage-type
            in: query
            value: block
        successCriteria:
          - condition: $statusCode == 200
        outputs:
          systems: $response.body.storageSystems ?? $response.body.data

    outputs:
      topN: $steps.listBlockSystems.outputs.systems

Run it:

# env
source .venv/bin/activate
set -a && source .env && set +a

# python wrapper call
python - <<'PY'
from src.workflows.block_storage import get_block_storage_volume_leaders
top = get_block_storage_volume_leaders(limit=5)
print({"count": len(top), "top": top})
PY

Example output:

{
  "count": 5,
  "top": [
    {"systemId": "sys-9a2f", "name": "ProdArray-A1", "volsCount": 32821},
    {"systemId": "sys-41d7", "name": "ProdArray-B2", "volsCount": 32470},
    {"systemId": "sys-72cc", "name": "ProdArray-C3", "volsCount": 32202},
    {"systemId": "sys-8f30", "name": "ProdArray-D4", "volsCount": 31882},
    {"systemId": "sys-61b9", "name": "ProdArray-E5", "volsCount": 30916}
  ]
}

Multi-step workflow — Block Storage: Alert Hotspots

Next, let's find noisy systems over a time window by counting alerts weighted by severity (critical > warning > info), returning a ranked list and fetching the latest alert feed for the top system.

Steps (bound to OpenAPI)

  1. listBlockSystems: GET /restapi/v1/tenants/{tenant_uuid}/storage-systems?storage-type=block

    • operationId: listBlockStorageSystems

  2. listAlerts: GET /restapi/v1/tenants/{tenant_uuid}/alerts?severity=critical&severity=warning&severity=info&duration={duration}

    • operationId: listAlerts

  3. listSystemAlerts: GET /restapi/v1/tenants/{tenant_uuid}/storage-systems/{storage_system_uuid}/alerts?duration={duration}

    • operationId: listStorageSystemAlerts

    • Parameter storage_system_uuid comes from the top-ranked system from steps (1)+(2).
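The ranking that connects steps (1)+(2) to step (3) can be sketched like this: group alerts by system, score each system with severity weights, and take the noisiest system's UUID. The weights below are illustrative assumptions (the repo's actual scoring may differ):

```python
from collections import Counter, defaultdict
from typing import Dict, List, Optional, Tuple

# Illustrative severity weights: critical outranks warning outranks info.
WEIGHTS = {"critical": 100, "warning": 10, "info": 1}


def rank_systems_by_alerts(alerts: List[dict]) -> List[Tuple[str, int, Counter]]:
    """Return (system_id, score, per-severity counts), noisiest first."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for a in alerts:
        counts[a["systemId"]][a.get("severity", "info").lower()] += 1
    ranked = [
        (sid, sum(WEIGHTS.get(sev, 0) * n for sev, n in c.items()), c)
        for sid, c in counts.items()
    ]
    ranked.sort(key=lambda r: r[1], reverse=True)
    return ranked


def top_system_uuid(alerts: List[dict]) -> Optional[str]:
    """The storage_system_uuid fed into step (3)."""
    ranked = rank_systems_by_alerts(alerts)
    return ranked[0][0] if ranked else None
```

In the Arazzo file this selection is only approximated by a placeholder expression; the Python wrapper is where the real ranking happens.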

Arazzo (workflows/block-alert-hotspots.arazzo.yaml)

arazzo: "1.0.1"
info:
  title: "Block Storage: Alert Hotspots"
  version: "1.0.0"

sourceDescriptions:
  - name: si
    url: ../openapi/openapi.yaml
    type: openapi

workflows:
  - workflowId: blockAlertHotspots
    summary: "Rank block systems by recent alerts and fetch feed for the noisiest system"
    inputs:
      type: object
      properties:
        duration: { type: string, default: "7d" } # adapt to your API's time filter
        limit: { type: integer, minimum: 1, default: 5 }
      required: [duration, limit]

    steps:
      - stepId: listBlockSystems
        operationId: listBlockStorageSystems
        parameters:
          - name: storage-type
            in: query
            value: block
        successCriteria:
          - condition: $statusCode == 200
        outputs:
          systems: $response.body.storageSystems ?? $response.body.data

      - stepId: listAlerts
        operationId: listAlerts
        parameters:
          - { name: severity, in: query, value: critical }
          - { name: severity, in: query, value: warning }
          - { name: severity, in: query, value: info }
          - { name: duration, in: query, value: $inputs.duration }
        successCriteria:
          - condition: $statusCode == 200
        outputs:
          alerts: $response.body.alerts ?? $response.body.data

      - stepId: listTopSystemAlerts
        operationId: listStorageSystemAlerts
        parameters:
          - { name: storage_system_uuid, in: path, value: $steps.listBlockSystems.outputs.systems[0].systemId } # placeholder; compute from ranked
          - { name: duration, in: query, value: $inputs.duration }
        successCriteria:
          - condition: $statusCode == 200
        outputs:
          topSystemAlerts: $response.body.alerts ?? $response.body.data

    outputs:
      rankedSystems: $steps.listAlerts.outputs.alerts
      topSystemAlerts: $steps.listTopSystemAlerts.outputs.topSystemAlerts

Run it

python - <<'PY'
from src.workflows.block_alerts import analyze_block_alert_hotspots
res = analyze_block_alert_hotspots(duration="7d", limit=5)

print("Ranked systems (top 5):")
for s in res.ranked_systems:
    c = s.counts.to_dict()
    print(s.system_id, s.name, c, "score=", s.score)

print("\nTop system latest alerts (truncated):")
for a in res.top_system_alerts[:5]:
    print(a.get("timestamp"), a.get("severity"), a.get("type"), a.get("summary"))
PY

Example output:

Ranked systems (top 5):
sys-9a2f ProdArray-A1 {'critical': 3, 'warning': 11, 'info': 2, 'total': 16} score= 341
sys-41d7 ProdArray-B2 {'critical': 2, 'warning': 7,  'info': 1, 'total': 10} score= 221
sys-72cc ProdArray-C3 {'critical': 1, 'warning': 5,  'info': 0, 'total': 6 } score= 150
sys-8f30 ProdArray-D4 {'critical': 1, 'warning': 2,  'info': 0, 'total': 3 } score= 120
sys-61b9 ProdArray-E5 {'critical': 0, 'warning': 6,  'info': 2, 'total': 8 } score= 62

Top system latest alerts (truncated):
2025-11-12T08:15:33Z CRITICAL CAPACITY "Pool XYZ at 92% utilization"
2025-11-12T07:50:12Z WARNING  GENERAL  "Background scrub delayed"
2025-11-12T07:45:05Z WARNING  CAPACITY "Volume V-1234 approaching limit"
2025-11-12T07:22:11Z INFO     GENERAL  "Config checkpoint complete"
2025-11-12T07:05:43Z WARNING  PERFORMANCE "High latency on host H-77"

Conclusion

By treating the IBM Storage Insights REST API as our system of record, we get a single, reliable contract to build on. From there, the path is straightforward:

  • OpenAPI as the source of truth (validated and versioned)

  • Arazzo to capture real, multi-step workflows in plain YAML

  • Speakeasy to turn the contract into an ergonomic Python SDK (and optionally generate runnable workflow code/tests)

This stack replaces scattered scripts with repeatable, reviewable, and executable flows. Teams gain faster onboarding (typed SDK), clearer intent (Arazzo workflows), and safer change management (spec-driven generation).

Next steps

  • Add more SI workflows (capacity hotspots, performance outliers, inventory diffs).

  • Evolve wrappers to match your Arazzo inputs/outputs and return typed results.

  • Expand testing with mock + real runs; use severity-weighted scoring or domain rules where it helps.

  • Layer in automation later to validate specs/workflows and refresh SDKs on change.

All code and examples from this post are in the GitHub repo, vibe-coded using Codex:
👉 Randhir123/storage-insights-workflows
