Introduction
In this article I will describe what the API Connect AI Gateway is and how to develop an API for this new capability.
In the last couple of years, Generative AI has disrupted the technology landscape and continues to evolve, bringing new opportunities for businesses around the globe. However, just as cloud adoption requires a cloud strategy, Gen AI adoption requires a solid strategy to avoid struggles along the way. The following diagram shows a high-level architecture of applications that integrate Generative AI Large Language Model (LLM) services to deliver end-user services. A typical example is a web chatbot that answers end users' questions about a specific topic.

Challenges for Gen AI adoption
Large Language Model providers charge by token consumption, which differs from typical SaaS services that usually charge by API call. Taking the high-level architecture from figure 1 into consideration, you will realize that the following challenges arise at some point in the adoption journey (a simple cost sketch follows the list).
- Different application development groups will have individual registrations to different LLM providers
- Multiple separate purchases
- No central visibility of usage or cost
- Risk of runaway bills
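To make the billing difference concrete, here is a minimal cost-estimation sketch in Python. The per-token rates and traffic figures below are purely hypothetical and do not reflect any provider's actual pricing.

# Illustrative only: hypothetical rates, not a real provider's price list
PRICE_PER_1K_INPUT_TOKENS = 0.0005   # assumed example rate in USD
PRICE_PER_1K_OUTPUT_TOKENS = 0.0015  # assumed example rate in USD

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of a single LLM call from its token counts."""
    return (input_tokens / 1000) * PRICE_PER_1K_INPUT_TOKENS + \
           (output_tokens / 1000) * PRICE_PER_1K_OUTPUT_TOKENS

# A chatbot serving 50,000 requests a day at ~800 input / 400 output tokens per request
daily_cost = 50_000 * estimate_cost(800, 400)
print(f"Estimated daily spend: ${daily_cost:,.2f}")  # -> Estimated daily spend: $50.00

Without central visibility, this kind of spend accumulates separately in every development group's subscription, which is exactly the runaway-bill risk listed above.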

Introducing API Connect AI Gateway
IBM has released a new feature on its API management platform, API Connect, called the AI Gateway. The objective of this capability is to provide centralized access for users and applications to the different Gen AI Large Language Model providers, including IBM watsonx.ai, OpenAI and Google Gemini (more will be added to the list). The following diagram presents the high-level architecture for Gen AI applications using the IBM API Connect AI Gateway.

Token based policies
The API Connect AI Gateway provides token-based policies, which include token rate limiting by API.
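Conceptually, a token-based limit budgets the number of LLM tokens consumed per time window rather than the number of API calls. The following minimal sketch is illustrative only (it is not API Connect's implementation) and shows the idea of a per-minute token budget:

import time

class TokenRateLimiter:
    """Illustrative token-based limiter: budgets LLM tokens per minute, not API calls."""
    def __init__(self, tokens_per_minute: int):
        self.limit = tokens_per_minute
        self.window_start = time.monotonic()
        self.used = 0

    def allow(self, tokens_consumed: int) -> bool:
        now = time.monotonic()
        if now - self.window_start >= 60:          # start a new one-minute window
            self.window_start, self.used = now, 0
        if self.used + tokens_consumed > self.limit:
            return False                            # reject: token budget exhausted
        self.used += tokens_consumed
        return True

limiter = TokenRateLimiter(tokens_per_minute=10_000)
print(limiter.allow(2_500))   # True  - 2,500 of 10,000 tokens used this minute
print(limiter.allow(9_000))   # False - would exceed the per-minute token budget

A single expensive prompt can therefore be throttled even if the number of calls is low, which is the behavior you want when the provider bills by token.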

Analytics
The API Connect AI Gateway provides specific metrics and charts for AI consumption, including total requests, total tokens, token counts over time, and a token heatmap. These are covered in more detail in the Review the AI Analytics section below.

Benefits and features on API Connect AI Gateway
The AI Gateway acts as a central control point for LLM providers and provides the following benefits:
- Single central subscription to the AI Model provider
- Consolidate buying power across the enterprise
- Integrated analytics to monitor and manage usage
- Control costs through specialized AI token rate limiting
The following picture shows the AI Gateway benefits.

AI Gateway APIs
From the development perspective, this feature allows you to develop and publish an API that accesses a specific Generative AI provider's API and methods.
The list of methods that can be invoked for watsonx.ai is the following.
Develop an AI Gateway API
In this section we will go through the steps required to develop and publish an AI Gateway API.
Prerequisites
To develop an AI Gateway API you need an API Connect SaaS instance on AWS and a subscription to one supported LLM provider. The API Connect AI Gateway supports IBM watsonx.ai, OpenAI and Google Gemini as LLM providers.
API Connect SaaS trial instance
To obtain an API Connect SaaS trial instance on AWS go to the following link (You need an IBM Id).
API Connect SaaS Trial
WatsonX trial instance
To obtain a WatsonX.ai trial instance go to the following link (You need an IBM Id).
WatsonX.ai Trial
Create the API
In this section we will develop the AI Gateway API. The required steps are:
- Define the catalog properties
- Create the AI Gateway API
- Test the AI Gateway API
Define Catalog Properties
The first step is to define catalog properties for the LLM provider service. For watsonx.ai the catalog property name is watsonx-ai-apikey. For OpenAI the catalog property name is openai-api-key. To create the catalog properties, execute the following steps.
1. On the API Connect SaaS web interface, click the Manage icon on the left-hand side toolbar. The catalog page will be displayed.

2. Click on the Sandbox catalog
3. Click on Catalog Settings and then Catalog properties on the left toolbar.

4. On the catalog properties page click on Create to create a new property.
5. For the watsonx property enter the name watsonx-ai-apikey and for the value enter the IBM Cloud IAM api key value. Click on Create.

6. For the OpenAI catalog property, repeat step 5 and enter the name openai-api-key, and for the value enter the value from your OpenAI subscription.
Develop the AI Gateway API
To develop an API for the AI Gateway, execute the following steps.
1. On the API Connect home page click on Develop on the left hand side toolbar.

2. Click on Add – API

3. Select OpenAPI 3.0 for API Type and Create - AI Gateway, and then click Next

4. For the platform select WatsonX and click Next

5. On the Create API from AI Gateway page enter the Name/Title of the API. For example WatsonXAIGatewayAPI and then click Next.

6. On the watsonx.ai authentication page, enter the Project ID and API Key values and select the region (for example, us-south). Click Next

7. On the next page check Create Product, Set rate limit and Set AI token limit. For the token limit interval, set the time unit to minute. Click on Create.

The API has been created.
8. Click on Edit API.

9. On the Design page, check that you have 3 paths defined. The most important one is /text-generate. We will use this path to generate text using the watsonx.ai LLM.

10. Click on the Gateway tab
11. On the Gateway Tab click on Catalog Properties

12. Click on the + sign to add a new property. Leave the default value “untitled” and click Add

13. On the untitled property click on Update.

14. For the Catalog Name (Key) select Sandbox from the drop-down menu and click Save.

15. Click on the Test tab

16. Click on Target Configuration and set Auto-publish to On

17. The API is ready for local testing in API Connect. Click on Body and enter the following prompt.
{
  "model_id": "ibm/granite-3-8b-instruct",
  "input": "Translate the following paragraph from English to Spanish: We are in winter and the temperatures are below zero. You need to wear a thick jacket",
  "parameters": {
    "decoding_method": "greedy",
    "max_new_tokens": 900,
    "min_new_tokens": 0,
    "temperature": 0.7,
    "stop_sequences": [],
    "repetition_penalty": 1,
    "time_limit": 5000
  }
}
18. Click on Send.

19. If everything has been properly configured, you should receive the response from watsonx.ai and a 200 return code.

Test the AI Gateway API with Postman
In this section we will test the API with Postman. Open Postman and configure the test with the following:
- Operation -> POST
- URL -> The URL from your API in API Connect
- Authorization -> For Auth Type use API Key. For the key use X-IBM-Client-Id and for the value use the client ID value from the API Connect test page.

For the Body, click on the Body tab, select raw and JSON as the type, and enter the following text for the body.
{
  "model_id": "ibm/granite-3-8b-instruct",
  "input": "<|start_of_role|>system<|end_of_role|>You are Granite, an AI language model developed by IBM in 2024. You are a cautious assistant. You carefully follow instructions. You are helpful and harmless and you follow ethical guidelines and promote positive behavior.<|end_of_text|>\n<|start_of_role|>user<|end_of_role|>Describe how to build a Node.js application<|end_of_text|>",
  "parameters": {
    "max_new_tokens": 100,
    "time_limit": 5000
  }
}
You should have your test configured like in the following picture.

Click on Send to execute the test. The response should be similar to the one in the following picture, showing a return code of 200.

You have successfully developed and tested an API for the IBM API Connect AI Gateway!
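If you prefer to drive the same test from a script instead of Postman, a minimal sketch using Python's requests library could look like the following. The endpoint URL and client ID below are placeholders; replace them with the values shown on your API Connect test page.

import requests

# Placeholders: copy the real values from the API Connect test page
API_URL = "https://<your-gateway-host>/<your-org>/sandbox/watsonxaigatewayapi/text-generate"
CLIENT_ID = "<your-client-id>"

payload = {
    "model_id": "ibm/granite-3-8b-instruct",
    "input": "Describe how to build a Node.js application",
    "parameters": {"max_new_tokens": 100, "time_limit": 5000}
}

response = requests.post(
    API_URL,
    json=payload,
    headers={"X-IBM-Client-Id": CLIENT_ID, "Content-Type": "application/json"},
)
print(response.status_code)   # expect 200 if the gateway and subscription are set up correctly
print(response.json())        # generated text returned by watsonx.ai through the gateway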
Review the AI Analytics
In this section we will review the API Connect Analytics portal and the section that is provided specifically for the AI Gateway APIs.
On the home page of API Connect click on Analytics on the left hand side toolbar. The Analytics main page will be displayed.

Click on AI usage. The summary page shows total requests and total tokens, the AI token count by date and the AI Token Heatmap.


On the AI Token count chart, click on Details.
The following charts will be displayed:
- Top AI consumers by token counts over time
- Top AI applications by token counts over time
- Total AI Consumers by token count
- Total AI Apps by token count
- AI Model usage
- Top AI Applications being rate limited
- AI cache hits
- AI response time percentiles

