Generating embeddings for semantic search typically requires Python scripts, REST APIs, and data pipelines moving information between systems. The latest Db2 12.1 Early Access Program (EAP) eliminates this complexity. Database developers can now call external language models directly from SQL. Embedding models, whether hosted in the cloud or deployed locally, can be registered as database objects. External scripts and data pipelines are no longer necessary to invoke models outside the database.
This early preview supports embedding models from watsonx.ai and locally deployed OpenAI-compatible inference engines such as Ollama. Text generation models will follow. The design allows models to run on different platforms, giving teams the freedom to choose where their language models live.
This update builds on the VECTOR data type introduced in Db2 12.1.2. Users could already store and query vectors, but generating embeddings required Python scripts, REST APIs, or other custom code, plus data movement between systems. The new TO_EMBEDDING function brings embedding generation into SQL itself. This post demonstrates the feature through an IT helpdesk knowledge base that performs semantic search using watsonx.ai models.
Language models understand meaning, not just keywords. A query such as "Printer won't work" should return the same result as "Print job stuck in queue" even though the words differ. Vector embeddings capture this semantic similarity and make it searchable.
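To make the idea concrete, here is a small Python sketch of how cosine distance captures that similarity. The four-dimensional vectors below are invented purely for illustration; a real embedding model such as slate-30m produces 384-dimensional vectors.

```python
import math

def cosine_distance(a, b):
    """Cosine distance: 0 means identical direction, 2 means opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

# Invented toy "embeddings" standing in for real 384-dimensional model output.
printer_wont_work = [0.9, 0.1, 0.0, 0.2]
print_job_stuck   = [0.8, 0.2, 0.1, 0.1]   # similar meaning -> nearby vector
password_reset    = [0.1, 0.0, 0.9, 0.3]   # unrelated meaning -> distant vector

# The two printer phrasings end up closer to each other than to the unrelated text.
assert cosine_distance(printer_wont_work, print_job_stuck) < \
       cosine_distance(printer_wont_work, password_reset)
```

The same principle drives the VECTOR_DISTANCE comparisons shown later in this post, just computed inside the database over real model embeddings.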
The traditional embedding generation workflow requires multiple steps. First, extract text from the database. Second, call an embedding API from Python, Node.js, or another language. Third, load the vectors back into the database. Each step adds code to maintain and creates a potential point of failure.
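That three-step pipeline looks roughly like the sketch below. It is a toy stand-in, not production code: SQLite stands in for the enterprise database, and the `embed` function derives a deterministic fake vector from a hash instead of making the real HTTP call to an embedding API.

```python
import hashlib
import json
import sqlite3

def embed(text):
    """Stand-in for step 2: in a real pipeline this would be an HTTP call
    to an embedding API; here a deterministic toy vector comes from a hash."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:8]]

# Step 1: extract text rows from the database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE answers (id INTEGER PRIMARY KEY, content TEXT, embedding TEXT)")
db.execute("INSERT INTO answers (content) VALUES ('Printer issue: restart the spooler service.')")
rows = db.execute("SELECT id, content FROM answers").fetchall()

# Steps 2 and 3: call the embedding service, then load the vectors back.
for row_id, content in rows:
    vec = embed(content)
    db.execute("UPDATE answers SET embedding = ? WHERE id = ?", (json.dumps(vec), row_id))

stored = json.loads(db.execute("SELECT embedding FROM answers WHERE id = 1").fetchone()[0])
assert len(stored) == 8  # every row now carries its vector
```

Every box in this sketch is code someone has to deploy, schedule, and monitor, which is exactly the overhead the new SQL-native approach removes.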
Register model once → SQL INSERT with TO_EMBEDDING → SQL SELECT with vector search
Db2 now eliminates these intermediate steps. Register the external model once as a database object with its credentials and endpoint; access to the model is then governed by standard database authorization controls on that object. The built-in TO_EMBEDDING function calls the model directly to generate embeddings. Storing those embeddings becomes a simple SQL INSERT or UPDATE, and search queries can embed text and compare vectors in a single SELECT statement.
The result: fewer moving parts, simpler deployment, and easier maintenance. SQL developers can work with embeddings without learning API client libraries. The trade-off is latency: each query must wait for the external API call, exchanging overnight batch embedding for real-time flexibility.
Quick Start: See It Work in 30 Seconds
Want to see it work immediately? Here's the absolute minimum code:
Step 1: Register an embedding model:
CREATE EXTERNAL MODEL slate30m
PROVIDER WATSONX KEY 'your-watsonx-api-key-here'
ID 'ibm/slate-30m-english-rtrvr'
TYPE TEXT_EMBEDDING RETURNING VECTOR(384, FLOAT32)
URL 'https://us-south.ml.cloud.ibm.com/ml/v1/text/embeddings?version=2023-10-25'
PROJECT_ID 'your-project-id-1234-5678-9abc';
Step 2: Generate an embedding:
VALUES TO_EMBEDDING('Hello world' USING slate30m);
That's it. You'll see a 384-dimensional vector returned immediately. Now let's build a complete semantic search system with this capability.
A Sample Use Case: Building a Semantic Search System
Step 1: Register your model (One-time setup)
Register an external embedding model from watsonx.ai:
CREATE EXTERNAL MODEL slate30m
PROVIDER WATSONX KEY 'your-watsonx-api-key-here'
ID 'ibm/slate-30m-english-rtrvr'
TYPE TEXT_EMBEDDING RETURNING VECTOR(384, FLOAT32)
URL 'https://us-south.ml.cloud.ibm.com/ml/v1/text/embeddings?version=2023-10-25'
PROJECT_ID 'your-project-id-1234-5678-9abc';
Description of the above parameters:
- PROVIDER WATSONX: Uses IBM watsonx.ai as the model provider
- KEY: API authentication key for the provider
- ID: The specific model identifier at the provider
- TYPE TEXT_EMBEDDING: The type of language model being registered
- RETURNING VECTOR(384, FLOAT32): The output vector type; the dimension must match the embedding model's output dimension
- URL: The model's API endpoint
- PROJECT_ID: Your project ID at watsonx.ai
Note that wrong credentials or parameter values do not surface at CREATE time; they appear when you first call TO_EMBEDDING.
Step 2: Create your table:
CREATE TABLE ANSWERS (
id INT NOT NULL GENERATED ALWAYS AS IDENTITY (START WITH 1 INCREMENT BY 1),
content CLOB(500),
embedding VECTOR(384, FLOAT32),
PRIMARY KEY (id)
);
Step 3: Insert data and generate embeddings in one step:
INSERT INTO ANSWERS (content, embedding) VALUES
('Printer issue: Restart the spooler service, clear pending print jobs, and verify printer connectivity.', TO_EMBEDDING('Printer issue: Restart the spooler service, clear pending print jobs, and verify printer connectivity.' USING slate30m)),
('No internet access: Check network cables, restart router, and renew IP using ipconfig /renew.', TO_EMBEDDING('No internet access: Check network cables, restart router, and renew IP using ipconfig /renew.' USING slate30m)),
('Email sync problem: Verify Outlook settings, clear cache, and test connectivity to mail server.', TO_EMBEDDING('Email sync problem: Verify Outlook settings, clear cache, and test connectivity to mail server.' USING slate30m)),
('User login failed: Reset the password, verify account status in Active Directory, and unlock the user.', TO_EMBEDDING('User login failed: Reset the password, verify account status in Active Directory, and unlock the user.' USING slate30m)),
('Slow performance: Check CPU and memory usage, close background applications, and restart the system.', TO_EMBEDDING('Slow performance: Check CPU and memory usage, close background applications, and restart the system.' USING slate30m)),
('VPN not connecting: Verify credentials, restart VPN client, and check firewall or proxy settings.', TO_EMBEDDING('VPN not connecting: Verify credentials, restart VPN client, and check firewall or proxy settings.' USING slate30m));
Step 4: Query with semantic search:
SELECT id, content,
ROUND(VECTOR_DISTANCE(
embedding,
TO_EMBEDDING('Document stuck in print queue, printer not working' USING slate30m),
COSINE
), 4) AS score
FROM ANSWERS
ORDER BY score ASC
FETCH FIRST 2 ROWS ONLY;
Each TO_EMBEDDING call makes a network request to the external model. Lower distance scores mean higher similarity. The user's phrasing differs from the stored answers, but the vectors capture the semantic relationship.
Another search, "Cannot browse websites or access internet", should surface the internet access troubleshooting answer despite the different terminology.
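The ranking performed by the ORDER BY ... FETCH FIRST query above can be sketched client-side in a few lines of Python. The three-dimensional vectors here are invented for illustration; in Db2 the same sort happens over real 384-dimensional embeddings.

```python
import math

def cosine_distance(a, b):
    """Cosine distance, as used by VECTOR_DISTANCE(..., COSINE): lower is more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Invented toy vectors standing in for the stored answer embeddings.
answers = {
    "printer":  [0.9, 0.1, 0.1],
    "internet": [0.1, 0.9, 0.1],
    "login":    [0.1, 0.1, 0.9],
}
# Toy embedding for the query "Cannot browse websites or access internet".
query = [0.2, 0.85, 0.15]

# Equivalent of ORDER BY score ASC FETCH FIRST 2 ROWS ONLY:
top2 = sorted(answers, key=lambda k: cosine_distance(query, answers[k]))[:2]
assert top2[0] == "internet"  # the semantically closest answer ranks first
```

The SQL version simply pushes this sort into the database engine, so only the top matches ever leave the server.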
Managing access to the registered model in Db2:
-- Grant usage
GRANT USAGE ON EXTERNAL MODEL slate30m TO USER helpdesk_agent;
-- Check permissions
SELECT * FROM SYSCAT.EXTERNALMODELAUTH WHERE EXTERNALMODELNAME = 'SLATE30M';
-- Revoke access
REVOKE USAGE ON EXTERNAL MODEL slate30m FROM USER helpdesk_agent;
-- Drop the model
DROP EXTERNAL MODEL slate30m;
Users need USAGE privilege to call TO_EMBEDDING. Grant to users, groups, roles, or PUBLIC. Each call consumes API quota from your model provider.
Getting Started
Download the EAP: https://early-access.ibm.com/software/support/trial/cst/programwebsite.wss?siteId=824&h=&tabId=
Requirements:
- Db2 12.1 EAP access
- External model provider account with API credentials
- Project or space ID (for cloud providers)
- Model ID
When you call TO_EMBEDDING, Db2 makes HTTPS requests to your external model provider with your text and returns embeddings. Consider whether this data flow meets your security requirements.
This feature bridges external language models and databases through SQL. The helpdesk example shows registration, embedding generation, and similarity search. Whether this fits your architecture depends on where you prefer embedding logic and how you handle external API dependencies.
Conclusion
This capability transforms how database developers build AI applications. What traditionally required multiple languages, frameworks, and deployment pipelines now executes as native SQL. The helpdesk example demonstrates this clearly: six lines of SQL replace approximately 50 lines of Python, API clients, and synchronization logic.
As we extend support to text generation models and additional providers, the database becomes a complete AI application platform. Future EAPs will include ALTER statement support, enhanced catalog views, and expanded model capabilities.
Download the EAP, build your first semantic search system, and share your feedback. Early adopters will directly influence what reaches general availability.
Authors
Shaikh Quader
AI Architect and Master Inventor, IBM Db2
Shaikh Quader is AI Architect and Master Inventor at IBM Db2, where he designs and implements AI systems for enterprise database platforms. With 20 years at IBM, he transitioned from software development to AI research and engineering in 2016. Shaikh leads technical AI initiatives at IBM and directs collaborative research programs with Canadian universities. He writes regularly on LinkedIn and Substack about AI implementation, productivity systems, and knowledge work transformation. His work has resulted in multiple patents and publications in database AI applications. Currently completing his PhD at York University, his research examines practical AI integration in relational database systems. His expertise spans machine learning deployment, database optimization, and enterprise AI architecture for solving complex data challenges.
Advisory Software Developer, Db2 Development - Query Compiler
Software Developer, Db2 AI Team
Suvarsha Akkireddy is a Software Developer on the Db2 team at IBM. She works on projects involving AI integration, chatbot development, and vector search, with a focus on building innovative solutions using data and AI technologies.