Getting Started with IBM watsonx.data Milvus

By Swati Karot posted 15 days ago

  

1. Introduction

With Milvus integrated into IBM watsonx.data, users can now leverage advanced vector search capabilities with enterprise-grade scalability, reliability and security. 

To kick off your Milvus journey on watsonx.data, complete these prerequisites:

1.     Adding watsonx.data instance

Milvus is available as part of the watsonx.data suite, so we first need to provision a watsonx.data instance.

Log in to IBM Cloud (or to your on-prem deployment) and click the ‘Create resource’ button at the top right corner of the home page.


Or type ‘watsonx.data’ in the search bar at the top centre.


Click the watsonx.data catalog entry. From there you can provision a free Lite plan instance for a trial.



After provisioning, open the hamburger menu at the top left and click ‘Resource List’.


There, expand ‘Databases’ and you should see the watsonx.data instance you just provisioned.


2.     Adding Milvus Service

After provisioning watsonx.data, we can provision Milvus. watsonx.data Milvus is available both on-prem and on IBM Cloud. Click the provisioned watsonx.data instance, then click the ‘Open web console’ button to open the watsonx.data console.


You should now see the watsonx.data console.


Use the watsonx.data console to provision a Milvus instance in watsonx.data. Within the ‘Infrastructure components’ tile click on ‘Engine/Services’ or click on ‘Infrastructure Manager’ from the left menu.


Click the ‘Add Component’ button at the top right corner of the Infrastructure manager and select ‘Services’ -> ‘Milvus’.

For the details to fill in while creating Milvus, refer to the documentation below for SaaS and on-prem:

·       watsonx.data on IBM Cloud: Adding Milvus service.

·       watsonx.data on Cloud Pak for Data (on-prem). Follow the same steps as for watsonx.data stand-alone: Adding a Milvus service.

NOTE: Remember to keep your credentials secure and follow best practices for managing your connections.

3. Connecting to watsonx.data Milvus

3.1 On IBM Cloud

After Milvus provisioning completes (which may take a few minutes), you should see a ‘Services’ tile containing Milvus 2.4.0, shown with the display name you chose while provisioning.


Click the Milvus tile to see the connection details. Note the gRPC host and port shown there; the Milvus SDK needs them to connect.



Before connecting to the Milvus service on IBM Cloud, ensure you have the following:

·       A client side SDK: Milvus supports SDKs for Java, Go, Python (PyMilvus), and Node.js. For this blog post, we are using PyMilvus, which is the most popular among them. Install the PyMilvus package to interact with the Milvus service. For more information, see About PyMilvus.

·       If you are working in a Jupyter notebook, run the command below and restart the kernel. If you are working with a Python script, run it in a terminal without the leading ‘!’.

!pip install pymilvus

This installs the Python SDK for Milvus in your environment.

·       Hostname and Port: Obtain the Milvus instance’s hostname and port from the Infrastructure Manager.

·       Authorized User Credentials: Ensure you have the necessary credentials to access the Milvus instance.
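One way to keep the host, port and API key out of notebooks and scripts is to read them from environment variables. The sketch below is only an illustration: the variable names MILVUS_GRPC_HOST, MILVUS_GRPC_PORT and MILVUS_API_KEY and the helper milvus_settings_from_env are hypothetical, not something watsonx.data or PyMilvus defines.

```python
import os


def milvus_settings_from_env(env=os.environ):
    """Collect connection settings from environment variables.

    The variable names are hypothetical; use whatever fits your setup.
    """
    required = ("MILVUS_GRPC_HOST", "MILVUS_GRPC_PORT", "MILVUS_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError("missing environment variables: " + ", ".join(missing))

    host = env["MILVUS_GRPC_HOST"]
    return {
        "host": host,
        "port": env["MILVUS_GRPC_PORT"],
        "secure": True,
        "server_name": host,
        "user": "ibmlhapikey",  # user name this post uses for API-key login
        "password": env["MILVUS_API_KEY"],
    }
```

The returned dictionary mirrors the keyword arguments used in the API key example in this post, so it could be passed along as connections.connect(**milvus_settings_from_env()).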

For the cloud-based solution, you can connect to Milvus in any of the following ways:

a) Using API Key:

To generate an API key, complete the following steps:

1.     In the IBM Cloud console, go to Manage > Access (IAM) > API keys.

2.     From the left menu, click API key.

3.     Enter a name and description for your API key.


4.     Click Create.

5.     Then, click Show to display the API key. Or, click Copy to copy and save it for later, or click Download.

Now we will look at the backend code (Python here) required to connect to watsonx.data Milvus. This is the only part that differs slightly from open-source Milvus, so it is crucial.

In a Jupyter notebook or Python script, use the host, port and API key in the connection code as below:

from pymilvus import connections, utility

print("start connecting to Milvus")
# Connect over TLS using the gRPC host and port from the Milvus tile
connections.connect(
    host="<grpc-host>",
    port="<port>",
    secure=True,
    server_name="<grpc-host>",
    user="ibmlhapikey",
    password="<api-key>"
)
# Quick check that the connection works
has = utility.has_collection("hello_milvus")
print(f"Does collection hello_milvus exist in Milvus: {has}")

b) Using URI:

from pymilvus import connections, utility

print("start connecting to Milvus")
connections.connect(
    alias="default",
    uri="https://<grpc-host>:<grpc-port>",
    user="ibmlhapikey",
    password="<api-key>"
)
has = utility.has_collection("hello_milvus")
print(f"Does collection hello_milvus exist in Milvus: {has}")

c) Using MilvusClient:

The Milvus 2.4 release introduced ‘MilvusClient’, a wrapper on top of ‘connections’. If you plan to use ‘MilvusClient’, use the syntax below.

from pymilvus import MilvusClient, DataType

milvus_uri = "https://<user>:<password>@<host>:<port>"

client = MilvusClient(
    uri=milvus_uri,
    secure=True
)

NOTE: For MilvusClient, put the bare host name in <host>, without a leading ‘https://’, because the scheme is already part of the URI.
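A host value copied from a browser or a config file can easily carry a scheme prefix, which would produce a malformed URI. The helper below is a minimal sketch of how the URI could be assembled defensively; build_milvus_uri is a hypothetical name introduced here for illustration, not a PyMilvus function.

```python
def build_milvus_uri(user, password, host, port):
    """Assemble https://<user>:<password>@<host>:<port> for MilvusClient."""
    # Drop a scheme prefix if the host was copied with one.
    for prefix in ("https://", "http://"):
        if host.startswith(prefix):
            host = host[len(prefix):]
    # Drop any trailing slash left over from a copied URL.
    host = host.rstrip("/")
    return f"https://{user}:{password}@{host}:{port}"
```

The result can then be passed as the uri argument, for example MilvusClient(uri=build_milvus_uri("ibmlhapikey", "<api-key>", "<host>", "<port>"), secure=True).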

3.2 Connecting to Milvus on prem

Before connecting to the Milvus service on Cloud Pak for Data (CPD) on-premises, ensure you have the following:

·       A client side SDK: Milvus supports SDKs for Java, Go, Python (PyMilvus), and Node.js. For this blog post, we are using PyMilvus. Ensure the PyMilvus package is installed so you can interact with the Milvus service.

·       Hostname and Port: Obtain the grpc hostname and port for the Milvus server from the web console.

·       Certificates: Acquire the self-signed certificate from the Milvus server.

·       Authorized User Credentials: Ensure you have valid credentials to access the Milvus server.

For the on-premises solution, follow these steps:

Step 1: Provision Milvus Service. First, provision a Milvus service in watsonx.data through the web console.

Step 2: Obtain Self-Signed Certificate. Run the following command from the terminal of the client machine where the PyMilvus SDK is installed:

echo QUIT | openssl s_client -showcerts -connect <grpc-host>:443 | awk '/-----BEGIN CERTIFICATE-----/ {p=1}; p; /-----END CERTIFICATE-----/ {p=0}' > milvus_grpc.cert
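The awk filter in this command keeps only the lines between the BEGIN CERTIFICATE and END CERTIFICATE markers and discards the rest of the s_client output. As a rough sketch of the same filtering in Python, which can also serve as a sanity check that the saved file really contains certificates, something like the following could be used (extract_pem_certificates is a hypothetical helper name):

```python
import re

# Matches one PEM certificate block, markers included.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----",
    re.DOTALL,
)


def extract_pem_certificates(text):
    """Return every PEM certificate block found in text, in order."""
    return PEM_BLOCK.findall(text)
```

For example, reading milvus_grpc.cert and passing its contents to extract_pem_certificates should return a non-empty list; an empty list suggests the openssl command did not reach the server.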

Step 3: Connect Using Python SDK (PyMilvus). Use the following Python code to connect to Milvus:

from pymilvus import connections
connections.connect(
    alias='default',
    secure=True,
    server_pem_path='<path/to/milvus_grpc.cert>',
    server_name="<grpc-host>",
    host='<grpc-host>',
    port='443',
    user='<CPD_username>',
    password='<CPD_password>'
)
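A wrong certificate path is an easy mistake to make with this call, so it can help to check the file before connecting. The sketch below builds the same keyword arguments as the connection code above and fails early if the certificate file is missing; onprem_connect_kwargs is a hypothetical helper, and the default port of 443 is taken from the example in this post.

```python
from pathlib import Path


def onprem_connect_kwargs(host, user, password, cert_path, port="443"):
    """Build connect() keyword arguments, checking the certificate file first."""
    cert = Path(cert_path)
    if not cert.is_file():
        raise FileNotFoundError(f"certificate not found: {cert}")
    return {
        "alias": "default",
        "secure": True,
        "server_pem_path": str(cert),
        "server_name": host,
        "host": host,
        "port": port,
        "user": user,
        "password": password,
    }
```

The dictionary could then be unpacked into the call, as in connections.connect(**onprem_connect_kwargs("<grpc-host>", "<CPD_username>", "<CPD_password>", "milvus_grpc.cert")).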

The Milvus 2.4 release introduced ‘MilvusClient’, a wrapper on top of ‘connections’. If you plan to use ‘MilvusClient’, use the syntax below.

from pymilvus import MilvusClient, DataType

milvus_uri = "https://<user>:<password>@<grpc-host>:<port>"

client = MilvusClient(
    uri=milvus_uri,
    secure=True,
    server_name="<grpc-host>",
    server_pem_path="<path/to/milvus_grpc.cert>"
)

Conclusion:

To wrap up, we've explored how to connect to Milvus, setting the foundation for leveraging its powerful capabilities. In the next blog, we’ll dive into one of the most exciting features of Milvus—performing similarity searches. Stay tuned as we uncover how to find insights and connections in your data like never before!

Start your free trial of IBM watsonx.data.
