watsonx.data

Put your data to work, wherever it resides, with the hybrid, open data lakehouse for AI and analytics

Exploring VTS (Vector Transport Service): An Open-Source Tool for Moving Vector Data

By Gifi Siby posted 21 days ago

Introduction

Vector Transport Service (VTS) is an open-source tool designed to simplify the migration and synchronisation of vector data across a wide range of platforms. It supports moving data from popular sources like Elasticsearch, Qdrant, PostgreSQL, Pinecone, and more into vector databases such as Milvus and Zilliz Cloud. VTS offers both real-time streaming and offline batch import modes, making it adaptable to different use cases.

GitHub repository: https://github.com/zilliztech/vts

Core Capabilities of VTS

VTS inherits the high throughput and low latency characteristics of Apache SeaTunnel, while extending support to vector and unstructured data. This makes it a powerful tool for building AI application data pipelines, enabling real-time synchronisation, transformation, and loading of vector data efficiently.

Key capabilities include:

Rich, extensible connectors
Unified stream and batch processing for real-time synchronisation and offline batch imports
Distributed snapshot support for ensuring data consistency
High performance, low latency, and scalability
Real-time monitoring and visual management

Primary Use Cases

Vector database migration: A core strength of VTS is its ability to migrate vector data, essential for AI and machine learning applications that handle large volumes of high-dimensional data.
AI application data pipelines: Build scalable pipelines tailored to AI workloads.
Real-time vector data synchronisation
Text ingestion and embedding: VTS can ingest raw or semi-structured text data (e.g., JSON, CSV) and convert it into vectors using embedding model plugins.
Cross-platform data integration: VTS enables seamless data migration between traditional relational databases and modern vector databases.

Vector Transport Service also introduces vector-specific capabilities such as:

Support for multiple data sources
Schema matching
Basic data validation
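
To make schema matching and basic validation more concrete, here is a minimal Python sketch of the kind of check such a step might perform before loading a record. It is not VTS code: the function name `validate_record`, the schema layout, and the field names are all assumptions made up for illustration.

```python
from typing import Any, Dict, List, Optional

# Hypothetical target schema: field name -> expected vector dimension
# (None marks a non-vector field). This is not a VTS data structure.
TARGET_SCHEMA: Dict[str, Optional[int]] = {"id": None, "title": None, "embedding": 4}


def validate_record(record: Dict[str, Any], schema: Dict[str, Optional[int]]) -> List[str]:
    """Return a list of problems; an empty list means the record matches the schema."""
    problems = []
    for field, dim in schema.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif dim is not None and len(record[field]) != dim:
            problems.append(f"{field}: expected dimension {dim}, got {len(record[field])}")
    return problems
```

A record with all three fields and a four-element `embedding` returns an empty list; a record with a missing field or a vector of the wrong length returns one message per problem.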

Supported Connectors

VTS supports a wide range of connectors, making it compatible with various data sources and storage systems. Currently supported connectors include (but are not limited to):

Milvus
Pinecone
Qdrant
PostgreSQL
Elasticsearch
Tencent Vector DB

Supported Transforms

VTS provides flexible data transformation operations, allowing users to preprocess or restructure data before migration. Examples include:

TablePathMapper – for renaming tables or changing table paths
FieldMapper – for adding or deleting columns
Embedding – for applying text vectorisation or generating vector representations of text
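
As a rough illustration of what a field-mapping transform does to each row, the Python sketch below keeps only the listed columns and renames them on the way through. It shows the idea only and is not the VTS FieldMapper implementation or its configuration syntax; `map_fields` and the sample column names are hypothetical.

```python
from typing import Any, Dict


def map_fields(record: Dict[str, Any], mapping: Dict[str, str]) -> Dict[str, Any]:
    """Keep only the fields named in `mapping`, renaming each source key to its target key."""
    return {target: record[source] for source, target in mapping.items() if source in record}


row = {"id": 7, "text": "hello", "internal_flag": True}
# "internal_flag" is not in the mapping, so it is dropped; "text" becomes "content".
print(map_fields(row, {"id": "id", "text": "content"}))  # {'id': 7, 'content': 'hello'}
```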

Supported Data Types

VTS can handle a variety of complex data types and operations, including:

Float Vectors
Sparse Float Vectors
Multi-vector columns
Dynamic columns
Upsert and Bulk Insert (optimised for large offline batches)

These capabilities enhance its effectiveness in managing sophisticated data migration workflows, especially in AI and vector-based systems.
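
For readers new to the distinction between float vectors and sparse float vectors, the following Python sketch shows one common way to represent each: a dense vector as a fixed-length list, and a sparse vector as a mapping from position to non-zero value. This is a generic illustration; the source does not describe how VTS represents these types internally, and the helper names are assumptions.

```python
from typing import Dict, List


def to_sparse(dense: List[float]) -> Dict[int, float]:
    """Keep only the non-zero entries, keyed by their position."""
    return {i: v for i, v in enumerate(dense) if v != 0.0}


def to_dense(sparse: Dict[int, float], dim: int) -> List[float]:
    """Expand a sparse mapping back into a fixed-length list of `dim` values."""
    return [sparse.get(i, 0.0) for i in range(dim)]


dense = [0.0, 0.5, 0.0, 1.5]
sparse = to_sparse(dense)               # {1: 0.5, 3: 1.5}
assert to_dense(sparse, len(dense)) == dense
```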

Supported Deployments

VTS can be deployed in both SaaS and on-premises environments.

How does Vector Transport Service (VTS) work?

Prerequisites

Docker installed
Access to source and target databases
Required credentials and permissions
Milvus version 2.3.6 or later

Obtain VTS

Pull the VTS Image

docker pull zilliz/vector-transport-service:latest
docker run -it zilliz/vector-transport-service:latest /bin/bash

Configure Your Migration

Create a configuration file (e.g., migration.conf):


env {
  parallelism = 1
  job.mode = "BATCH"
}

source {
  # Source configuration (e.g., Milvus, Elasticsearch, etc.)
  Milvus {
    url = "https://your-source-url:19530"
    token = "your-token"
    database = "default"
    collections = ["your-collection"]
    batch_size = 100
  }
}

sink {
  # Target configuration
  Milvus {
    url = "https://your-target-url:19530"
    token = "your-token"
    database = "default"
    batch_size = 10
  }
}
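
If you would rather not hard-code URLs and tokens in the file, one option is to generate it from a template. The Python sketch below does this using only the keys shown in the configuration above; the `render_conf` helper and the `$`-placeholder names are assumptions for illustration and are not part of VTS.

```python
from string import Template

# Same keys as the configuration above; only the $-placeholders are new.
CONF_TEMPLATE = Template("""\
env {
  parallelism = 1
  job.mode = "BATCH"
}

source {
  Milvus {
    url = "$source_url"
    token = "$source_token"
    database = "default"
    collections = ["$collection"]
    batch_size = 100
  }
}

sink {
  Milvus {
    url = "$target_url"
    token = "$target_token"
    database = "default"
    batch_size = 10
  }
}
""")


def render_conf(source_url: str, target_url: str, collection: str,
                source_token: str, target_token: str) -> str:
    """Fill in the template; raises KeyError if a placeholder is left unset."""
    return CONF_TEMPLATE.substitute(
        source_url=source_url,
        target_url=target_url,
        collection=collection,
        source_token=source_token,
        target_token=target_token,
    )
```

You could then read the tokens from environment variables of your choosing and write the returned string to migration.conf before submitting the job.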

Run the Migration
Cluster Mode (Recommended): Runs in a distributed environment using the SeaTunnel cluster. Supports parallel execution for large-scale or production migrations.
```
# Start the cluster
mkdir -p ./logs
./bin/seatunnel-cluster.sh -d

# Submit the job
./bin/seatunnel.sh --config ./migration.conf
```
Local Mode: Runs the migration locally on a single machine. It is simple to set up and use, which makes it ideal for development, testing, or small-scale migrations.
```
./bin/seatunnel.sh --config ./migration.conf -m local
```
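
If you want to launch the job from a script rather than by hand, a thin wrapper around the commands above is enough. The Python sketch below builds exactly the command lines shown in this section and nothing more; the function names are hypothetical, and it assumes the script runs from the VTS installation directory inside the container.

```python
import subprocess
from typing import List


def seatunnel_command(config_path: str, local: bool = False) -> List[str]:
    """Build the submit command shown above; `-m local` is appended for local mode."""
    cmd = ["./bin/seatunnel.sh", "--config", config_path]
    if local:
        cmd += ["-m", "local"]
    return cmd


def run_migration(config_path: str, local: bool = False) -> int:
    """Run the job and return the process exit code (0 normally means success)."""
    return subprocess.run(seatunnel_command(config_path, local)).returncode
```

For example, `run_migration("./migration.conf", local=True)` runs the same command as the local mode example.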

Usage Overview

Category	Description
Deployment	Self-hosted and user-managed
Ease of Use	Requires manual deployment and ongoing maintenance
Supported Data Sources	Milvus, Elasticsearch, OpenSearch, Pinecone, Qdrant, PostgreSQL, and other major vector databases
Real-time Sync	Supported (requires manual configuration)
Network Requirement	Compatible with private networks
Cost	Free and open-source; users bear infrastructure and operational costs
Best for	Organizations with existing infrastructure that prefer on-premises, self-managed solutions

Performance

In a demo migration from Pinecone to Milvus, VTS reportedly achieved the following:

Sync rate: 2,961 vectors/sec
Total vectors: 100 million
Time taken: ~9.5 hours
Environment: 4 CPU cores, 8 GB RAM
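
As a quick sanity check, the reported figures are consistent with one another. The short Python calculation below divides the total vector count by the sync rate, using only the numbers quoted above.

```python
total_vectors = 100_000_000   # total vectors migrated
rate_per_sec = 2_961          # reported sync rate, vectors per second

seconds = total_vectors / rate_per_sec
hours = seconds / 3600
print(f"{hours:.1f} hours")   # 9.4 hours, in line with the reported ~9.5 hours
```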

Future Support: Unstructured Data Sources

VTS is actively expanding its support for unstructured data. Currently supported:

Shopify data types

Planned support:

PDFs
Google Docs
Slack data
Images and text

FAQs:

How does VTS work?
--> VTS automates the migration process by extracting data from your source system, transforming it to match the target schema, and then loading it into your destination vector database.
Does VTS support zero-downtime migration?
--> Yes, VTS supports real-time, zero-downtime migration by creating an initial snapshot of your data and continuously synchronising changes. This ensures your applications remain operational throughout the migration process.
Are there any limitations or requirements for zero-downtime migration with VTS?
--> Currently, zero-downtime migration is supported only for migrations from Milvus to Zilliz Cloud. To enable this feature, you need to manually deploy Milvus CDC (Change Data Capture) for continuous data synchronisation.
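
The snapshot-then-synchronise pattern described in these answers can be pictured with a toy in-memory model. The Python sketch below is not how VTS or Milvus CDC is implemented; the `migrate` function, the change-tuple layout, and the operation names are assumptions chosen to show the two phases.

```python
from typing import Any, Dict, Iterable, Tuple

Change = Tuple[str, int, Any]  # (operation, primary key, value)


def migrate(source: Dict[int, Any], changes: Iterable[Change]) -> Dict[int, Any]:
    """Copy a snapshot of `source`, then replay changes captured after the snapshot."""
    target = dict(source)            # phase 1: initial snapshot
    for op, key, value in changes:   # phase 2: continuous synchronisation
        if op == "upsert":
            target[key] = value
        elif op == "delete":
            target.pop(key, None)
    return target


snapshot = {1: "a", 2: "b"}
captured = [("upsert", 3, "c"), ("delete", 1, None), ("upsert", 2, "B")]
print(migrate(snapshot, captured))   # {2: 'B', 3: 'c'}
```

Because writes that arrive during the copy are replayed afterwards, the source can keep serving traffic while the target catches up.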

#watsonx.data

Permalink

https://community.ibm.com/community/user/blogs/gifi-siby/2025/07/14/vts-vector-transport-service
