Data Integration

Making Backend APIs Approachable: A Framework for Natural Language Queries

By Shiv Prakash Yadav posted 2 days ago

  

APIs are at the heart of modern web applications, allowing different software systems to communicate with each other. However, interacting with an API typically requires technical knowledge of the API's schema, methods, and how to send requests. For non-technical users or even internal users who are unfamiliar with the specifics of an API, this can create a barrier to seamless communication.

In this blog, we’ll explore a solution to this problem: building a framework that allows backend APIs to understand natural language queries and translate them into structured API calls. This solution enables users to interact with backend systems in plain English, making APIs more approachable and intuitive.

The Problem: Complexity of API Interactions

Consider the following situation:

You’re working with an API that provides resources like users, orders, and products. To interact with this API, you need to know its endpoint structure and request body format. For example:

  • GET /users to fetch a list of users

  • POST /users to create a new user with data like name, email, etc.

  • DELETE /users/{user_id} to delete a user

For a non-developer or someone unfamiliar with the API's documentation, such interactions may seem confusing.

This is where our framework comes into play. Instead of requiring users to know the exact HTTP methods and parameters, the goal is to allow them to use natural language to make API requests. For example, they could say:

  • "Create a new user named John Doe with email john@example.com"
  • "List all users"
  • "Delete user with ID 123"

With this approach, you allow the user to interact with the backend system in a human-readable way, without requiring knowledge of the underlying technical details.
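To make the target concrete, here is a small illustrative table of what each example sentence should resolve to. The paths follow the endpoints listed earlier; the `EXPECTED_CALLS` name and the dictionary layout are only for illustration and are not part of the framework.

```python
# Illustrative only: the structured call each example sentence should resolve to.
EXPECTED_CALLS = {
    "Create a new user named John Doe with email john@example.com": {
        "method": "POST",
        "path": "/users",
        "body": {"name": "John Doe", "email": "john@example.com"},
    },
    "List all users": {"method": "GET", "path": "/users", "body": None},
    "Delete user with ID 123": {"method": "DELETE", "path": "/users/123", "body": None},
}

for sentence, call in EXPECTED_CALLS.items():
    print(f"{sentence!r} -> {call['method']} {call['path']}")
```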

Solution: The Approachable API Middleware Framework

We’ve built a middleware framework that acts as a bridge between natural language inputs and HTTP API calls. The key features of this framework are:

  1. Query Parsing (Intent and Entity Extraction): Convert user queries into intents (e.g., create, get, update, delete) and entities (e.g., user names, IDs, emails).
  2. Request Generation: Automatically map the user’s query to the corresponding API request format (e.g., HTTP method, body).
  3. Response Formatting: Convert the raw API response into human-readable text.

The goal of this framework is to make it easy for developers to integrate natural language processing (NLP) into their systems without having to deal with the complexity of large language models or training sophisticated AI systems.

Let’s dive deeper into how this works.

How It Works: The Inner Components

The middleware consists of a few key components: Resource Definitions, NLP Processor, Request Generator, and Response Formatter. Let’s break each of these down and walk through how they work.

1. Resource Definitions

The Resource Definitions specify the available resources, the supported HTTP methods, and how to map user queries to those methods. This allows the system to understand what each resource represents and how to make requests.

For example, let’s define a users resource.

# resource_definition.py

class ResourceDefinition:
    def __init__(self):
        self.resources = {
            "users": {
                "intents": ["create", "get", "delete"],
                "http_methods": {
                    "GET": {
                        "description": "Fetch user details",
                        "request_sample": None  # GET doesn't need a body, just params
                    },
                    "POST": {
                        "description": "Create a new user",
                        "request_sample": {
                            "name": "<name>",
                            "email": "<email>",
                            "age": "<age>"
                        }
                    },
                    "DELETE": {
                        "description": "Delete a user",
                        "request_sample": {
                            "user_id": "<user_id>"
                        }
                    }
                },
                "entity_definitions": {
                    "user_id": {"type": "string", "description": "Unique identifier for the user"},
                    "name": {"type": "string", "description": "Name of the user"},
                    "email": {"type": "string", "description": "Email of the user"},
                    "age": {"type": "integer", "description": "Age of the user"}
                }
            }
        }

    def get_resource_info(self, resource):
        return self.resources.get(resource)
    
    def get_supported_methods(self, resource):
        return self.resources.get(resource, {}).get("http_methods", {})
    
    def get_entity_definition(self, resource):
        return self.resources.get(resource, {}).get("entity_definitions", {})

  • Intents: The available actions users can perform on a resource, like create, get, delete.

  • HTTP Methods: The HTTP methods the resource supports, each corresponding to an intent (e.g., GET for get, POST for create, DELETE for delete).

  • Entity Definitions: The structure of the entities required to interact with the resource (e.g., name, email, user_id).
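The entity definitions can also be used to validate what the parser extracted before a request is built. The sketch below is one possible way to do that, assuming the same "type" values ("string", "integer") as the users resource; `coerce_entities` is a hypothetical helper and is not part of the framework's code.

```python
def coerce_entities(entity_definitions, entities):
    """Cast extracted entities to their declared types.

    Returns (clean, errors): the converted values and a list of problems.
    Hypothetical helper, shown only to illustrate how definitions can be used.
    """
    casters = {"string": str, "integer": int}
    clean, errors = {}, []
    for name, value in entities.items():
        definition = entity_definitions.get(name)
        if definition is None:
            errors.append(f"unknown entity: {name}")
            continue
        type_name = definition.get("type")
        caster = casters.get(type_name)
        if caster is None:
            errors.append(f"{name} has unsupported type {type_name}")
            continue
        try:
            clean[name] = caster(value)
        except (ValueError, TypeError):
            errors.append(f"{name} should be of type {type_name}")
    return clean, errors


definitions = {
    "name": {"type": "string", "description": "Name of the user"},
    "age": {"type": "integer", "description": "Age of the user"},
}
clean, errors = coerce_entities(
    definitions, {"name": "John Doe", "age": "42", "city": "Pune"}
)
print(clean)   # {'name': 'John Doe', 'age': 42}
print(errors)  # ['unknown entity: city']
```

Returning the errors instead of raising lets the caller decide whether to reject the query or ask the user for the missing detail.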

2. NLP Processor (Query Parsing)

Next, we need to process user queries. The NLP Processor is responsible for extracting intents (e.g., create, get) and entities (e.g., user_id, name) from a natural language query. For this, we use spaCy, a popular Python library for natural language processing.

# nlp_processor.py
import spacy

class NLPProcessor:
    def __init__(self):
        self.nlp = spacy.load("en_core_web_sm")

    def process_query(self, query):
        doc = self.nlp(query)
        intent = self.extract_intent(doc)
        entities = self.extract_entities(doc)
        return intent, entities

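    def extract_intent(self, doc):
        # Lowercase the lemmas once so "Creating" or "Deleted" still match.
        lemmas = {token.lemma_.lower() for token in doc}
        if "create" in lemmas:
            return "create"
        # "list" and "fetch" count as "get" so that "List all users" resolves.
        elif lemmas & {"get", "list", "fetch"}:
            return "get"
        elif "delete" in lemmas:
            return "delete"
        return "unknown"

    def extract_entities(self, doc):
        # Raw spaCy labels (PERSON, CARDINAL) never match request fields such
        # as "name" or "user_id", so translate them. Treating every CARDINAL
        # as a user_id is a simplifying assumption for this example.
        label_to_field = {"PERSON": "name", "CARDINAL": "user_id"}
        entities = {}
        for ent in doc.ents:
            entities[label_to_field.get(ent.label_, ent.label_)] = ent.text
        # The model has no email entity label; use the token-level flag instead.
        for token in doc:
            if token.like_email:
                entities["email"] = token.text
        return entities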

  • Intent Extraction: Looks for action words like "create", "get", "delete" in the user query.

  • Entity Extraction: Uses spaCy to extract named entities (e.g., John Doe, 123, etc.) from the query.
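Statistical entity recognition can miss highly structured values such as email addresses and IDs. A small rule-based fallback is one way to cover them. The sketch below uses only the standard library; the patterns and the `extract_structured_entities` name are assumptions for illustration, not part of the framework.

```python
import re

# Deliberately simple patterns, good enough for a demo and not RFC-complete.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Matches "ID 123", "id: 7" or "ID#5" and captures the value after "id".
ID_RE = re.compile(r"\bid\b\s*[:#]?\s*(\w+)", re.IGNORECASE)


def extract_structured_entities(query):
    """Pull an email address and an ID out of a query using regular expressions."""
    entities = {}
    email = EMAIL_RE.search(query)
    if email:
        entities["email"] = email.group(0)
    user_id = ID_RE.search(query)
    if user_id:
        entities["user_id"] = user_id.group(1)
    return entities


print(extract_structured_entities("Delete user with ID 123"))
# {'user_id': '123'}
print(extract_structured_entities(
    "Create a new user named John Doe with email john@example.com"
))
# {'email': 'john@example.com'}
```

The result could be merged with the entities from spaCy, letting the rule-based values win when both are present.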

3. Request Generator

Once we’ve parsed the query and identified the intent and entities, we can use this information to generate the correct API request.


# request_generator.py
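import requests

# Intents are verbs from the query, while the resource definition is keyed
# by HTTP method, so the two need an explicit mapping.
INTENT_TO_METHOD = {"create": "POST", "get": "GET", "delete": "DELETE"}

class RequestGenerator:
    def __init__(self, base_url, resource_definition):
        self.base_url = base_url
        self.resource_definition = resource_definition

    def generate_request(self, resource, intent, entities):
        resource_info = self.resource_definition.get_resource_info(resource)
        if resource_info is None:
            return {"error": f"Unknown resource: {resource}"}

        http_method = INTENT_TO_METHOD.get(intent)
        supported_methods = resource_info.get("http_methods", {})
        if intent not in resource_info.get("intents", []) or http_method not in supported_methods:
            return {"error": f"Unsupported intent: {intent} for resource {resource}"}

        # GET has no body, so its sample is None; fall back to an empty template.
        template = supported_methods[http_method].get("request_sample") or {}
        # Build a fresh dict so the shared definition is never mutated, and
        # send only the fields the query supplied (no "<name>" placeholders).
        body = {key: entities[key] for key in template if key in entities}

        return self._send_request(resource, http_method, body)

    def _send_request(self, resource, http_method, body):
        url = f"{self.base_url}/{resource}"
        if http_method == "DELETE" and "user_id" in body:
            # Matches the DELETE /users/{user_id} endpoint described earlier.
            url = f"{url}/{body.pop('user_id')}"
        response = requests.request(http_method, url, json=body or None)
        try:
            return response.json()
        except ValueError:
            # Empty or non-JSON bodies (for example, 204 No Content) land here.
            return {"error": f"Non-JSON response (HTTP {response.status_code})"}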

  • Generate API Request: This method uses the resource definition and the parsed intent and entities to generate an API request in the appropriate format.

  • HTTP Method Handling: It automatically uses the correct HTTP method (GET, POST, DELETE) based on the intent.
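When debugging, it can help to see the call that would be made without touching the network. Below is a minimal dry-run sketch under a few assumptions: `build_request` and `INTENT_TO_METHOD` are hypothetical names, the base URL is a placeholder, and addressing a single record as `/users/{user_id}` on GET is assumed by analogy with the DELETE endpoint.

```python
# Hypothetical helper, not part of the original framework.
INTENT_TO_METHOD = {"create": "POST", "get": "GET", "delete": "DELETE"}


def build_request(base_url, resource, intent, entities):
    """Describe the HTTP call for an intent without sending it."""
    method = INTENT_TO_METHOD.get(intent)
    if method is None:
        raise ValueError(f"Unsupported intent: {intent}")

    url = f"{base_url.rstrip('/')}/{resource}"
    fields = dict(entities)  # copy so the caller's dict is left untouched

    if method == "POST":
        return {"method": method, "url": url, "params": None, "json": fields}

    # GET and DELETE address a single record through the path when an ID is known.
    record_id = fields.pop("user_id", None)
    if record_id is not None:
        url = f"{url}/{record_id}"
    # Remaining fields become query-string filters on GET; DELETE sends nothing else.
    params = fields if (method == "GET" and fields) else None
    return {"method": method, "url": url, "params": params, "json": None}


print(build_request("https://api.example.com", "users", "delete", {"user_id": "123"}))
# {'method': 'DELETE', 'url': 'https://api.example.com/users/123', 'params': None, 'json': None}
```

The keys match keyword arguments accepted by `requests.request`, so the returned dictionary could be passed on as `requests.request(**call)` once it looks right.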

4. Response Formatter

Once the request is sent and a response is received, the Response Formatter converts the API’s JSON response into a human-readable string.

# response_formatter.py
class ResponseFormatter:
    def process_response(self, api_response):
        if isinstance(api_response, dict):
            if "error" in api_response:
                return f"Error: {api_response['error']}"
            elif "data" in api_response:
                return self.format_data(api_response["data"])
        return "Invalid API response."

    def format_data(self, data):
        if isinstance(data, list):
            if len(data) == 0:
                return "No results found."
            return f"Found {len(data)} items: {', '.join([str(item) for item in data])}"
        elif isinstance(data, dict):
            return f"Data: {', '.join([f'{key}: {value}' for key, value in data.items()])}"
        return "Unexpected data format."

  • Translate JSON to English: Converts the API response (usually a dictionary or list) into a human-readable string.
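The formatter expects responses wrapped as {"data": ...} or {"error": ...}. Many APIs return a bare list or signal failure only through the status code, so a small normalisation step in front of the formatter can help. This is a sketch under assumptions: `normalize_response` is a hypothetical helper, and reading the error text from a "message" key is a guess about the backend's error format.

```python
def normalize_response(status_code, payload):
    """Wrap a raw API payload in the {"data": ...} / {"error": ...} envelope."""
    if status_code >= 400:
        # Assumes error bodies may carry a "message" field; fall back to the code.
        message = payload.get("message") if isinstance(payload, dict) else None
        return {"error": message or f"HTTP {status_code}"}
    if isinstance(payload, dict) and ("data" in payload or "error" in payload):
        return payload  # already in the expected shape
    return {"data": payload}


print(normalize_response(200, [{"id": 1, "name": "John Doe"}]))
# {'data': [{'id': 1, 'name': 'John Doe'}]}
print(normalize_response(404, {"message": "User not found"}))
# {'error': 'User not found'}
```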

Putting It All Together

Finally, the Middleware ties all the components together to process user queries, generate API requests, and format responses.

# middleware.py
from resource_definition import ResourceDefinition
from nlp_processor import NLPProcessor
from request_generator import RequestGenerator
from response_formatter import ResponseFormatter

class Middleware:
    def __init__(self, base_url):
        self.base_url = base_url
        self.resource_definition = ResourceDefinition()
        self.nlp_processor = NLPProcessor()
        self.request_generator = RequestGenerator(base_url, self.resource_definition)
        self.response_formatter = ResponseFormatter()

    def process_query(self, query):
        intent, entities = self.nlp_processor.process_query(query)
        api_response = self.request_generator.generate_request("users", intent, entities)
        return self.response_formatter.process_response(api_response)
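Note that process_query always passes "users" as the resource. If the API exposes several resources, the resource has to be detected from the query as well. One rough way to do this is sketched below; `detect_resource` is a hypothetical helper, and stripping a trailing "s" is a naive stand-in for real singular/plural handling.

```python
def detect_resource(query, known_resources):
    """Return the first known resource mentioned in the query, or None."""
    words = {word.strip(".,!?\"'").lower() for word in query.split()}
    for resource in known_resources:
        # Naive singular form: "users" -> "user". Irregular plurals are not handled.
        singular = resource[:-1] if resource.endswith("s") else resource
        if resource in words or singular in words:
            return resource
    return None


resources = ["users", "orders", "products"]
print(detect_resource("Delete user with ID 123", resources))  # users
print(detect_resource("List all orders", resources))          # orders
print(detect_resource("Hello there", resources))              # None
```

The list of known resources could come from the keys of the resources dictionary in ResourceDefinition, and a None result is a natural place to ask the user to rephrase.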

Comparison with Existing Solutions

  • GraphQL: While GraphQL allows flexible querying, it still requires users to know the query syntax and the schema. The Approachable API Middleware accepts plain-English requests instead, so users do not need to learn either.

  • Watson Assistant/LLMs: These solutions are more complex to set up: they typically involve API keys, model training or tuning, and potentially high usage costs. Our middleware is lightweight and can be easily integrated into existing systems.

Conclusion: Why This Approach Works

This approach simplifies the way users interact with APIs. Instead of having to understand complex API schemas or HTTP methods, users can simply ask in natural language and the backend system understands and responds accordingly.

This framework provides:

  • A lightweight solution: No need to train models or use complex cloud services.

  • Flexibility: Easily plug into existing APIs with minimal changes to the codebase.

  • User-centric: It makes backend systems more accessible, reducing the technical barrier.

Link to the Repository

You can find the full source code for this project in the following GitHub repository:

GitHub: approachable-api-middleware
