Introduction
- Pull the VTS Image
docker pull zilliz/vector-transport-service:latest
docker run -it zilliz/vector-transport-service:latest /bin/bash
- Configure Your Migration
Create a configuration file (e.g., migration.conf):
env {
  parallelism = 1
  job.mode = "BATCH"
}
source {
  # Source configuration (e.g., Milvus, Elasticsearch, etc.)
  Milvus {
    url = "https://your-source-url:19530"
    token = "your-token"
    database = "default"
    collections = ["your-collection"]
    batch_size = 100
  }
}
sink {
  # Target configuration
  Milvus {
    url = "https://your-target-url:19530"
    token = "your-token"
    database = "default"
    batch_size = 10
  }
}
- Run the Migration
Cluster Mode (Recommended): Runs in a distributed environment using the SeaTunnel cluster. Supports parallel execution for large-scale or production migrations.
# Start the cluster
mkdir -p ./logs
./bin/seatunnel-cluster.sh -d
# Submit the job
./bin/seatunnel.sh --config ./migration.conf
Local Mode: Runs the migration locally on a single machine. Simple to set up and use—ideal for development, testing, or small-scale migrations.
./bin/seatunnel.sh --config ./migration.conf -m local
| Category | Description |
| --- | --- |
| Deployment | Self-hosted and user-managed |
| Ease of Use | Requires manual deployment and ongoing maintenance |
| Supported Data Sources | Milvus, Elasticsearch, OpenSearch, Pinecone, Qdrant, PostgreSQL, and other major vector databases |
| Real-time Sync | Supported (configuration required manually) |
| Network Requirement | Compatible with private networks |
| Cost | Free and open-source; users bear infrastructure and operational costs |
| Best for | Organizations with existing infrastructure that prefer on-premises, self-managed solutions |
Performance
In a real-world demo (a Pinecone to Milvus migration), VTS reportedly achieved the following:
- Sync rate: 2,961 vectors/sec
- Total vectors: 100 million
- Time taken: ~9.5 hours
- Environment: 4 CPU cores, 8 GB RAM
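As a quick sanity check, the reported figures can be compared against each other: dividing the total vector count by the sync rate gives the duration one would expect if that rate were sustained. The sketch below uses only the numbers quoted above; the variable names are illustrative.

```python
# Figures quoted from the demo above.
total_vectors = 100_000_000  # 100 million vectors
sync_rate = 2_961            # vectors per second

# Expected duration if the quoted rate were sustained for the whole run.
seconds = total_vectors / sync_rate
hours = seconds / 3600

print(f"{hours:.1f} hours")  # prints "9.4 hours"
```

The result, about 9.4 hours, is in line with the reported ~9.5 hours.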
Future Support: Unstructured Data Sources
VTS is actively expanding its support for unstructured data. Currently supported:
Planned support:
- PDFs
- Google Docs
- Slack data
- Images and text
FAQs:
- How does VTS work?
--> VTS automates the migration process by extracting data from your source system, transforming it to match the target schema, and then loading it into your destination vector database.
- Does VTS support zero-downtime migration?
--> Yes, VTS supports real-time, zero-downtime migration by creating an initial snapshot of your data and then continuously synchronizing changes. This keeps your applications operational throughout the migration process.
- Are there any limitations or requirements for zero-downtime migration with VTS?
--> Currently, zero-downtime migration is supported only for migrations from Milvus to Zilliz Cloud. To enable it, you must manually deploy Milvus CDC (Change Data Capture) for continuous data synchronization.
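The extract-transform-load flow and the snapshot-then-sync approach described in the answers above can be illustrated with a small toy model. This is only a conceptual sketch in plain Python: it does not use VTS, SeaTunnel, or the Milvus CDC API, and every name in it (`extract`, `transform`, `load`, `apply_changes`, `field_map`, the in-memory dictionaries standing in for databases) is hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Record = Dict[str, object]


def extract(source: Dict[str, Record]) -> List[Record]:
    """Take a point-in-time snapshot of every record in the source."""
    return [dict(rec) for rec in source.values()]


def transform(record: Record, field_map: Dict[str, str]) -> Record:
    """Rename source fields to the names the target schema expects."""
    return {field_map.get(key, key): value for key, value in record.items()}


def load(target: Dict[str, Record], records: Iterable[Record]) -> None:
    """Upsert records into the target, keyed by their "id" field."""
    for rec in records:
        target[str(rec["id"])] = rec


def apply_changes(
    target: Dict[str, Record],
    changes: Iterable[Tuple[str, Record]],
    field_map: Dict[str, str],
) -> None:
    """Replay changes captured after the snapshot was taken."""
    for op, rec in changes:
        if op == "delete":
            target.pop(str(rec["id"]), None)
        else:  # treat anything else as an upsert
            load(target, [transform(rec, field_map)])


# Toy "source database" and an empty "target database".
source = {
    "a": {"id": "a", "embedding": [0.1, 0.2]},
    "b": {"id": "b", "embedding": [0.3, 0.4]},
}
field_map = {"embedding": "vector"}  # assumed schema difference
target: Dict[str, Record] = {}

# Phase 1: copy the initial snapshot.
load(target, (transform(rec, field_map) for rec in extract(source)))

# Phase 2: replay changes that arrived while the snapshot was being copied.
changes = [
    ("upsert", {"id": "c", "embedding": [0.5, 0.6]}),
    ("delete", {"id": "a"}),
]
apply_changes(target, changes, field_map)

print(sorted(target))  # prints ['b', 'c']
```

In this model the source stays writable during phase 1, and phase 2 brings the target up to date, which is the general idea behind keeping applications online during a migration.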