Why does monitoring vector databases matter?
Vector databases are the backbone of modern AI applications. They support complex similarity searches and manage high-dimensional vector data. As these AI applications scale, real-time insights into the database performance become essential to ensure reliability and optimize resource usage.
Monitoring vector databases is especially important for modern AI applications that use Large Language Models (LLMs) and Retrieval Augmented Generation (RAG). These databases serve as the AI’s critical knowledge base, so their performance directly affects the accuracy, relevance, and speed of AI responses.
Poor vector database performance can lead to inaccurate similarity searches, slow query latency, and unreliable AI output, including so-called “hallucinations”. These issues degrade the overall user experience. Proactive monitoring ensures data integrity, optimizes resource utilization, and allows for early detection of issues, so that AI systems operate reliably and efficiently. It is the key to maintaining the high performance and trustworthiness expected from today’s intelligent applications.
Why use vector databases?
The rise of unstructured data has been accompanied by rapid advances in machine learning models that can transform that data into vector embeddings for more efficient processing, analytics, and comparison. Vector embeddings are a compact, meaningful data representation that captures patterns, relationships, and underlying structures.
However, due to their high-dimensional nature, traditional databases often struggle to efficiently manage vector embeddings. As a result, specialized vector databases have emerged. These databases are designed to handle this data type.
Vector embeddings
Vector embeddings are numerical representations of data points. They convert various types of data, including nonmathematical data such as text, audio, or images, into a form that machine learning models can understand and process.
Vector storage
Vector databases store the output of an embedding model: the vector embeddings. By ingesting and indexing these embeddings, the database can serve fast similarity searches that match the user’s prompt with similar vector embeddings.
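The similarity search described above typically ranks stored embeddings by a distance metric such as cosine similarity. A minimal pure-Python sketch (the toy 3-dimensional vectors and document names are illustrative assumptions, not real embeddings):

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|); 1.0 means same direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# A query embedding and two stored embeddings (toy values).
query = [0.1, 0.9, 0.0]
stored = {"doc_a": [0.1, 0.8, 0.05], "doc_b": [0.9, 0.1, 0.0]}

# Rank stored embeddings by similarity to the query vector.
best = max(stored, key=lambda k: cosine_similarity(query, stored[k]))
print(best)  # doc_a points in nearly the same direction as the query
```

A real vector database performs the same ranking, but over millions of vectors with approximate nearest-neighbor indexes instead of a linear scan.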
Milvus DB
Milvus is an open-source vector database management system. It efficiently stores and searches large-scale and dynamic vector data. Milvus is built on top of Facebook Faiss, an open-source C++ library for vector similarity search.
With Milvus, you can efficiently create, manage, and query vector data. This capability helps developers build intelligent applications that need fast, scalable access to vector-based information.
Milvus Observability with Instana
You can now monitor your Milvus database seamlessly with IBM Instana. Export traces from the Milvus application to Instana to analyze calls and gain insights on your vector database operations.
Fig 1: Milvus Observability with Instana
By using OpenTelemetry with Instana, you can collect traces for Milvus database operations such as create, insert, upsert, and delete.
To start collecting traces, install the Traceloop SDK by running the following command:
pip install traceloop-sdk
You can run Milvus locally in several ways. The following steps use Docker Compose.
Step 1: Install Docker
Make sure that you install Docker on your system. You can download the installer from the official Docker website.
Step 2: Install Milvus by using Docker Compose
Milvus provides a Docker Compose configuration file in the Milvus repository.
a. Download the Docker configuration file by running the following command:
wget https://github.com/milvus-io/milvus/releases/download/v2.3.3/milvus-standalone-docker-compose.yml -O docker-compose.yml
b. Start Milvus by running the following command:
docker compose up -d
You will get the following output:
Creating milvus-etcd ... done
Creating milvus-minio ... done
Creating milvus-standalone ... done
Step 3: Verify Milvus containers
After you start Milvus, the following containers will be up and running:
- milvus-standalone
- milvus-minio
- milvus-etcd
You can check whether the containers are up and running by using the following command:
docker ps
You will get the following output:
Name Command State Ports
--------------------------------------------------------------------------------------------------------------------
milvus-etcd etcd -advertise-client-url ... Up 2379/tcp, 2380/tcp
milvus-minio /usr/bin/docker-entrypoint ... Up (healthy) 9000/tcp
milvus-standalone /tini -- milvus run standalone Up 0.0.0.0:19530->19530/tcp, 0.0.0.0:9091->9091/tcp
To install dependencies for the Milvus sample application, run the following command:
pip install pymilvus ibm-watsonx-ai langchain-ibm
To access the watsonx models used in the Milvus sample, export the following credentials:
export WATSONX_URL=<watsonx-url>
export WATSONX_API_KEY=<watsonx-iam-api-key>
export WATSONX_PROJECT_ID=<watsonx-project-id>
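For illustration, the sample can read these exported variables at runtime. The `watsonx_credentials` helper below is hypothetical, not part of the ibm-watsonx-ai API; it only shows how the environment variables map to a credentials dictionary:

```python
import os

def watsonx_credentials():
    # Hypothetical helper: collect the exported watsonx credentials.
    # Raises KeyError if any of the three variables is not exported.
    return {
        "url": os.environ["WATSONX_URL"],
        "apikey": os.environ["WATSONX_API_KEY"],
        "project_id": os.environ["WATSONX_PROJECT_ID"],
    }
```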
Configure your environment to export traces to Instana in either agent mode or agentless mode (directly to the Instana backend).
For agent mode
export TRACELOOP_BASE_URL=<instana-agent-host>:4317
export TRACELOOP_HEADERS="api-key=DUMMY_KEY"
For agentless mode
export TRACELOOP_BASE_URL=<instana-otlp-endpoint>:4317
export TRACELOOP_HEADERS="x-instana-key=<agent-key>,x-instana-host=<instana-host>"
Additionally, if the endpoint of the Instana backend otlp-acceptor or agent is not TLS-enabled, set the following environment variable to true:
export OTEL_EXPORTER_OTLP_INSECURE=true
The GitHub sample application demonstrates how to connect to Milvus, insert data, and perform CRUD (create, read, update, and delete) operations, and it generates traces when it runs.
Run the sample application to verify the installation and configuration.
python WatsonxEmbeddingMilvus.py
Analyzing traces in Instana
You can create an application perspective in Instana to view trace information that is collected from the LLM application runtime. Complete the following steps:
1. In the Instana UI, open the New Application Perspective wizard in one of the following ways:
- On the Instana dashboard, in the Applications section, click Add application.
- From the navigation menu, click Applications to open the Applications dashboard. Then, click Add, and select New Application Perspective.
2. Select Services or Endpoints and click Next.
3. Click Add filter and choose a service name. You can select multiple services and endpoints by using OR conditions. The service name is specified by the app_name parameter in Traceloop.init(). For example, Watsonx_Embeddings_MilvusClient.
4. In the Application Perspective Name field, enter a name for the LLM application perspective. Then, click Create. Instana creates the new application perspective.
To view trace information, go to the navigation menu in the Instana UI and click Analytics. On the Analytics dashboard, you can analyze calls by application, service, and endpoint. Instana presents the data broken down into service, endpoint, and call names. You can filter and group traces or calls by arbitrary tags, for example:
‘Trace->Service Name’ equals Watsonx_Embeddings_MilvusClient.
The traces that are collected from the preceding sample application are displayed in the Instana UI, as the following snapshots show:
Fig 2: Milvus get traces
Fig 3: Milvus insert traces
Fig 4: Milvus query traces
Fig 5: Milvus search traces
#Tracing