This tutorial shows you how to deploy a Weaviate vector database cluster on Google Kubernetes Engine (GKE).
Weaviate is an open-source vector database that offers low-latency performance and supports multiple media types, such as text and images. It supports semantic search, question answering, and classification. Weaviate is built entirely in Go and stores both objects and vectors, which lets you use vector search, keyword search, or a combination of both as a hybrid search. From an infrastructure perspective, Weaviate is a cloud-native, fault-tolerant database. Its leaderless architecture provides fault tolerance: every node in the cluster can serve read and write requests, which eliminates any single point of failure.
This tutorial is intended for cloud platform administrators and architects, ML engineers, and MLOps (DevOps) professionals interested in deploying vector database clusters on GKE.
Benefits
Weaviate offers the following benefits:
- Libraries for various programming languages and open API to integrate with other services.
- Horizontal scaling.
- A balance between cost-effectiveness and query speed, especially when dealing with large datasets. You can choose how much data is stored in memory versus on disk.
Objectives
In this tutorial, you learn how to:
- Plan and deploy GKE infrastructure for Weaviate.
- Deploy and configure the Weaviate database in a GKE cluster.
- Run a notebook to generate and store example vector embeddings in your database, and perform vector-based search queries.
Set up your environment
To set up your environment with Cloud Shell, follow these steps:
Set environment variables for your project, region, and a Kubernetes cluster resource prefix:
export PROJECT_ID=PROJECT_ID
export KUBERNETES_CLUSTER_PREFIX=weaviate
export REGION=us-central1
Replace PROJECT_ID with your Google Cloud project ID. This tutorial uses the us-central1 region to create your deployment resources.

Check the version of Helm:
helm version
Update the version if it's older than 3.13:
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
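The version check above can be scripted. The following sketch compares a Helm version string against the 3.13 minimum using sort -V; the INSTALLED value here is a hard-coded example rather than output read from a real helm binary:

```shell
# Sketch: compare a Helm version against the 3.13 minimum using sort -V
# (version-aware sort). version_ge A B succeeds when A >= B.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# In practice you would capture the real version, for example:
#   INSTALLED=$(helm version --template '{{.Version}}' | tr -d 'v')
INSTALLED="3.12.1"   # example value, not read from a real helm binary

if version_ge "$INSTALLED" "3.13.0"; then
  echo "Helm ${INSTALLED} is recent enough"
else
  echo "Helm ${INSTALLED} is older than 3.13; update it"
fi
```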
Clone the sample code repository from GitHub:
git clone https://github.com/GoogleCloudPlatform/kubernetes-engine-samples
Navigate to the weaviate directory:

cd kubernetes-engine-samples/databases/weaviate
Create your cluster infrastructure
In this section, you run a Terraform script to create a private, highly available, regional GKE cluster to deploy your Weaviate database.
You can choose to deploy Weaviate using a Standard or Autopilot cluster. Each has its own advantages and different pricing models.
Autopilot
The following diagram shows an Autopilot GKE cluster deployed in the project.
To deploy the cluster infrastructure, run the following commands in the Cloud Shell:
export GOOGLE_OAUTH_ACCESS_TOKEN=$(gcloud auth print-access-token)
terraform -chdir=terraform/gke-autopilot init
terraform -chdir=terraform/gke-autopilot apply \
-var project_id=${PROJECT_ID} \
-var region=${REGION} \
-var cluster_prefix=${KUBERNETES_CLUSTER_PREFIX}
These commands use the following variables:

- GOOGLE_OAUTH_ACCESS_TOKEN: uses the gcloud auth print-access-token command to retrieve an access token that authenticates interactions with various Google Cloud APIs.
- PROJECT_ID, REGION, and KUBERNETES_CLUSTER_PREFIX: the environment variables defined in the Set up your environment section, assigned to the corresponding variables for the Autopilot cluster that you are creating.
When prompted, type yes.
The output is similar to the following:
...
Apply complete! Resources: 9 added, 0 changed, 0 destroyed.
Outputs:
kubectl_connection_command = "gcloud container clusters get-credentials weaviate-cluster --region us-central1"
Terraform creates the following resources:

- A custom VPC network and private subnet for the Kubernetes nodes.
- A Cloud Router to access the internet through Network Address Translation (NAT).
- A private GKE cluster in the us-central1 region.
- A ServiceAccount with logging and monitoring permissions for the cluster.
- Google Cloud Managed Service for Prometheus configuration for cluster monitoring and alerting.
Standard
The following diagram shows a Standard private regional GKE cluster deployed across three different zones.
To deploy the cluster infrastructure, run the following commands in the Cloud Shell:
export GOOGLE_OAUTH_ACCESS_TOKEN=$(gcloud auth print-access-token)
terraform -chdir=terraform/gke-standard init
terraform -chdir=terraform/gke-standard apply \
-var project_id=${PROJECT_ID} \
-var region=${REGION} \
-var cluster_prefix=${KUBERNETES_CLUSTER_PREFIX}
These commands use the following variables:

- GOOGLE_OAUTH_ACCESS_TOKEN: uses the gcloud auth print-access-token command to retrieve an access token that authenticates interactions with various Google Cloud APIs.
- PROJECT_ID, REGION, and KUBERNETES_CLUSTER_PREFIX: the environment variables defined in the Set up your environment section, assigned to the corresponding variables for the Standard cluster that you are creating.
When prompted, type yes. It might take several minutes for these commands to complete and for the cluster to show a ready status.
The output is similar to the following:
...
Apply complete! Resources: 10 added, 0 changed, 0 destroyed.
Outputs:
kubectl_connection_command = "gcloud container clusters get-credentials weaviate-cluster --region us-central1"
Terraform creates the following resources:

- A custom VPC network and private subnet for the Kubernetes nodes.
- A Cloud Router to access the internet through Network Address Translation (NAT).
- A private GKE cluster in the us-central1 region with autoscaling enabled (one to two nodes per zone).
- A ServiceAccount with logging and monitoring permissions for the cluster.
- Google Cloud Managed Service for Prometheus configuration for cluster monitoring and alerting.
Connect to the cluster
Configure kubectl to fetch credentials and communicate with your new GKE cluster:
gcloud container clusters get-credentials \
${KUBERNETES_CLUSTER_PREFIX}-cluster --location ${REGION}
Deploy the Weaviate database to your cluster
To deploy the Weaviate database to your GKE cluster by using a Helm chart, follow these steps:
Add the Weaviate database Helm Chart repository before you can deploy it on your GKE cluster:
helm repo add weaviate https://weaviate.github.io/weaviate-helm
Create the weaviate namespace for the database:

kubectl create ns weaviate
Create a secret to store the API key:
kubectl create secret generic apikeys --from-literal=AUTHENTICATION_APIKEY_ALLOWED_KEYS=$(openssl rand -base64 32) -n weaviate
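The secret holds a randomly generated base64 key. The generation step can be reproduced locally, and the key can be read back from the cluster later when a client needs it (the kubectl command below assumes cluster access, so it is shown commented out):

```shell
# Reproduce the key-generation step used for the 'apikeys' secret:
# 32 random bytes, base64-encoded, yield a 44-character key.
API_KEY=$(openssl rand -base64 32)
echo "generated key of length ${#API_KEY}"

# To read the key back from the cluster later (requires cluster access):
# kubectl get secret apikeys -n weaviate \
#   -o jsonpath='{.data.AUTHENTICATION_APIKEY_ALLOWED_KEYS}' | base64 -d
```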
Deploy an internal load balancer to access Weaviate from within the virtual network:
kubectl apply -n weaviate -f manifests/05-ilb/ilb.yaml
The ilb.yaml manifest describes the load balancer Service.

Deploy the Weaviate cluster by using the Helm chart:
helm upgrade --install "weaviate" weaviate/weaviate \
    --namespace "weaviate" \
    --values ./manifests/01-basic-cluster/weaviate_cluster.yaml
The weaviate_cluster.yaml manifest describes the Deployment. A Deployment is a Kubernetes API object that lets you run multiple replicas of Pods that are distributed among the nodes in a cluster.

Wait a few minutes for the Weaviate cluster to fully start.
Check the deployment status:

helm status weaviate -n weaviate

If the weaviate database is deployed successfully, the output is similar to the following:

NAME: weaviate
LAST DEPLOYED: Tue Jun 18 13:15:53 2024
NAMESPACE: weaviate
STATUS: deployed
REVISION: 1
TEST SUITE: None
Wait for Kubernetes to start the resources:
kubectl wait pods -l app.kubernetes.io/name=weaviate --for condition=Ready --timeout=300s -n weaviate
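The kubectl wait command polls the cluster until the Pods report a Ready condition or the timeout elapses. Its behavior can be sketched as a generic retry loop; the probe function below is a stand-in that becomes ready on its third invocation, not a real cluster check:

```shell
# A minimal sketch of the polling that kubectl wait performs: retry a
# readiness check until it succeeds or a timeout elapses. Against a real
# cluster, the check would be a kubectl or HTTP probe instead.
wait_until_ready() {
  local timeout=$1; shift
  local elapsed=0
  until "$@"; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "timed out after ${timeout}s" >&2
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  echo "ready after ${elapsed}s"
}

# Demo probe that becomes ready on its third invocation.
ATTEMPTS_FILE=$(mktemp)
probe() {
  n=$(( $(cat "$ATTEMPTS_FILE" 2>/dev/null || echo 0) + 1 ))
  echo "$n" > "$ATTEMPTS_FILE"
  [ "$n" -ge 3 ]
}

wait_until_ready 10 probe
```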
Run queries with Vertex AI Colab Enterprise notebook
This section explains how to connect to your Weaviate database using Colab Enterprise.
You can use a dedicated runtime template to deploy to the weaviate-vpc network, so that the notebook can communicate with resources in the GKE cluster.
For more information about Vertex AI Colab Enterprise, see Colab Enterprise documentation.
Create a runtime template
To create a Colab Enterprise runtime template:
In the Google Cloud console, go to the Colab Enterprise Runtime Templates page and make sure your project is selected:
Click add_box New Template. The Create new runtime template page appears.
In the Runtime basics section:
- In the Display name field, enter weaviate-connect.
- In the Region drop-down list, select us-central1. This is the same region as your GKE cluster.
In the Configure compute section:
- In the Machine type drop-down list, select e2-standard-2.
- In the Disk size field, enter 30.
In the Networking and security section:
- In the Network drop-down list, select the network where your GKE cluster resides.
- In the Subnetwork drop-down list, select a corresponding subnetwork.
- Clear the Enable public internet access checkbox.
To finish creating the runtime template, click Create. Your runtime template appears in the list on the Runtime templates tab.
Create a runtime
To create a Colab Enterprise runtime:
In the runtime templates list for the template you just created, in the Actions column, click more_vert and then click Create runtime. The Create Vertex AI Runtime pane appears.
To create a runtime based on your template, click Create.
On the Runtimes tab that opens, wait for the status to transition to Healthy.
Import the notebook
To import the notebook in Colab Enterprise:
Go to the My Notebooks tab and click Import. The Import notebooks pane appears.
In Import source, select URL.
Under Notebook URLs, enter the following link:
https://raw.githubusercontent.com/GoogleCloudPlatform/kubernetes-engine-samples/main/databases/weaviate/manifests/02-notebook/vector-database.ipynb
Click Import.
Connect to the runtime and run queries
To connect to the runtime and run queries:
In the notebook, next to the Connect button, click arrow_drop_down Additional connection options. The Connect to Vertex AI Runtime pane appears.
Select Connect to a runtime and then select Connect to an existing Runtime.
Select the runtime that you launched and click Connect.
To run the notebook cells, click the Run cell button next to each code cell.
The notebook contains both code cells and text that describes each code block. Running a code cell executes its commands and displays an output. You can run the cells in order, or run individual cells as needed.
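As a sketch of the kind of request the notebook issues, Weaviate's GraphQL endpoint accepts hybrid search queries over HTTP. The endpoint address, API key, and the Article collection name below are placeholders for illustration, not values taken from the tutorial:

```shell
# Placeholders: substitute your internal load balancer IP and the key stored
# in the 'apikeys' secret. 'Article' is a hypothetical collection name.
WEAVIATE_URL="http://WEAVIATE_ILB_IP"
WEAVIATE_API_KEY="YOUR_API_KEY"

# Hybrid search mixes vector similarity with BM25 keyword scoring; alpha
# weights the two (1 = pure vector, 0 = pure keyword).
QUERY='{"query":"{ Get { Article(hybrid: {query: \"vector database\", alpha: 0.5}, limit: 3) { title _additional { score } } } }"}'

# Uncomment to send the query from inside the VPC:
# curl -s -X POST "${WEAVIATE_URL}/v1/graphql" \
#   -H "Authorization: Bearer ${WEAVIATE_API_KEY}" \
#   -H "Content-Type: application/json" \
#   -d "${QUERY}"
echo "${QUERY}"
```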
View Prometheus metrics for your cluster
The GKE cluster is configured with Google Cloud Managed Service for Prometheus, which enables collection of metrics in the Prometheus format. This service provides a fully managed solution for monitoring and alerting, allowing for collection, storage, and analysis of metrics from the cluster and its applications.
The following diagram shows how Prometheus collects metrics for your cluster:
The GKE private cluster in the diagram contains the following components:
- Weaviate Pods that expose metrics on the path /metrics and port 2112.
- Prometheus-based collectors that process the metrics from the Weaviate Pods.
- A PodMonitoring resource that sends the metrics to Cloud Monitoring.
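Based on these components, a PodMonitoring resource of roughly the following shape tells the managed collectors where to scrape. This is an illustrative sketch only; the label selector is an assumption, and the actual manifest ships in the sample repository:

```yaml
# Illustrative PodMonitoring sketch; the repository's
# manifests/03-prometheus-metrics/pod-monitoring.yaml is authoritative.
apiVersion: monitoring.googleapis.com/v1
kind: PodMonitoring
metadata:
  name: weaviate
spec:
  selector:
    matchLabels:
      app: weaviate   # assumed label; check the chart's Pod labels
  endpoints:
  - port: 2112        # the metrics port the Weaviate Pods expose
    path: /metrics
    interval: 30s
```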
To export and view the metrics, follow these steps:
Create the PodMonitoring resource to scrape metrics by labelSelector:

kubectl apply -n weaviate -f manifests/03-prometheus-metrics/pod-monitoring.yaml

The pod-monitoring.yaml manifest describes the PodMonitoring resource.

Import a custom Cloud Monitoring dashboard with the configurations defined in dashboard.json:

gcloud --project "${PROJECT_ID}" monitoring dashboards create --config-from-file monitoring/dashboard.json
After the command runs successfully, go to the Cloud Monitoring Dashboards page.

From the list of dashboards, open the Weaviate Overview dashboard. It might take some time for metrics to be collected and displayed. The dashboard shows the number of shards, the number of vectors, and operation latency.