Vector Store API Basics

A vector store is a type of database that stores vector embeddings: numerical representations of entities such as text, images, or audio. These embeddings are produced by embedding models (such as those offered by OpenAI) and capture the meaning of the data. This lets you search by similarity of meaning rather than by exact keywords, so you can find documents or images that are related in context even if they don’t use the same words.
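To make "similar meaning" concrete, the sketch below compares toy vectors using cosine similarity, one common similarity measure. The three-dimensional vectors and their values are made up for illustration; real embeddings come from a model and have far more dimensions.

```java
// Toy illustration: hand-written 3-dimensional vectors stand in for real embeddings.
final class CosineSimilarityDemo {

    // Cosine similarity = dot(a, b) / (|a| * |b|); values near 1.0 mean "same direction".
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            throw new IllegalArgumentException("Cosine similarity is undefined for a zero vector");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        double[] kitten  = {0.9, 0.1, 0.0}; // made-up values
        double[] cat     = {0.8, 0.2, 0.1};
        double[] invoice = {0.0, 0.1, 0.9};
        // "kitten" points in nearly the same direction as "cat", but not "invoice"
        System.out.println(cosine(kitten, cat) > cosine(kitten, invoice)); // prints true
    }
}
```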

Why Use Vector Stores

Traditional databases excel at handling exact matches, such as searching for a product name or filtering by a specific category. But they fall short when it comes to understanding meaning or context. That’s where vector stores come in.

Let’s explore why vector stores are essential in modern AI applications:

1. Semantic Search Over Keyword Search: Traditional search systems rely on exact keyword matches. But in AI-driven systems, semantic relevance matters more.

2. Core to Retrieval-Augmented Generation (RAG): Vector stores are foundational for RAG systems, where LLMs retrieve external knowledge before generating responses.

3. High-Performance Similarity Search: Vector stores use Approximate Nearest Neighbor (ANN) algorithms to perform fast and scalable similarity lookups, even in datasets with millions of vectors.

4. Scalable & Real-Time Capable: Modern vector databases support:

Real-time indexing
Metadata filtering
Distributed scaling
Cloud-native deployments

Common Vector Store API Operations

The following operations are typically used when working with vector databases or vector stores in AI/ML applications. The endpoint paths and field names in the examples are illustrative; each product defines its own API.

1. Indexing Vectors

Indexing vectors involves storing high-dimensional data in structures optimized for fast similarity search. This enables quick retrieval of semantically similar items using metrics like cosine or Euclidean distance.

Example:

HTTP

POST /vectors
{
  "id": "doc1",
  "vector": [0.123, 0.456, ..., 0.789],
  "metadata": {
    "title": "Spring Boot Guide",
    "tags": ["java", "spring"]
  }
}

2. Searching Vectors

Searching vectors finds the top-k most similar results based on meaning, enabling semantic search using similarity metrics.

Example:

HTTP

POST /query
{
  "vector": [0.987, 0.654, ..., 0.321],
  "top_k": 5
}

Response:

JSON

{
  "results": [
    {"id": "doc3", "score": 0.94, "metadata": {...}},
    {"id": "doc7", "score": 0.91, "metadata": {...}},
    ...
  ]
}
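Conceptually, a top-k query scores stored vectors against the query vector and keeps the k best. The sketch below does this exhaustively in memory with cosine similarity. The class and method names are hypothetical, and real vector stores usually replace the full scan with an ANN index.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;

final class TopKSearchDemo {

    // One search result: the stored id and its similarity to the query.
    record Hit(String id, double score) {}

    // Cosine similarity; assumes equal-length, non-zero vectors.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Exact (brute-force) search: score every stored vector, sort by score, keep topK.
    static List<Hit> search(Map<String, double[]> store, double[] query, int topK) {
        return store.entrySet().stream()
                .map(e -> new Hit(e.getKey(), cosine(e.getValue(), query)))
                .sorted(Comparator.comparingDouble(Hit::score).reversed())
                .limit(topK)
                .toList();
    }
}
```

The returned list mirrors the shape of the response above: ids ordered from most to least similar, each with a score.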

3. Deleting Vectors

Deleting vectors involves removing stored vector entries from the database using their unique IDs. This helps manage outdated or irrelevant data efficiently.

Example:

HTTP

DELETE /vectors
{
  "ids": ["doc1", "doc5"]
}

4. Updating Vectors

Updating vectors allows you to modify an existing vector's embedding or metadata. This ensures your vector store reflects the latest information or content changes.

Example:

HTTP

PUT /vectors
{
  "id": "doc1",
  "vector": [new values],
  "metadata": {
    "updated": true
  }
}
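To show what index, update, and delete requests amount to behind such an API, here is a minimal in-memory sketch. It is not any product's implementation: the class name, the dimension check, and the choice to have an update replace the whole entry are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class InMemoryVectorStore {

    // One stored entry: the embedding plus arbitrary metadata.
    record StoredVector(float[] vector, Map<String, Object> metadata) {}

    private final int dimension;
    private final Map<String, StoredVector> entries = new HashMap<>();

    InMemoryVectorStore(int dimension) {
        this.dimension = dimension;
    }

    // Insert a new entry or overwrite an existing one (covers both index and update).
    void upsert(String id, float[] vector, Map<String, Object> metadata) {
        if (vector.length != dimension) {
            throw new IllegalArgumentException(
                    "Expected " + dimension + " dimensions but got " + vector.length);
        }
        // Defensive copies so later changes by the caller do not alter stored data
        entries.put(id, new StoredVector(vector.clone(), Map.copyOf(metadata)));
    }

    // Remove the given ids; returns how many were actually present.
    int delete(List<String> ids) {
        int removed = 0;
        for (String id : ids) {
            if (entries.remove(id) != null) {
                removed++;
            }
        }
        return removed;
    }

    // Returns the stored metadata, or null if the id is unknown.
    Map<String, Object> metadataOf(String id) {
        StoredVector stored = entries.get(id);
        return stored == null ? null : stored.metadata();
    }

    boolean contains(String id) {
        return entries.containsKey(id);
    }

    int size() {
        return entries.size();
    }
}
```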

Popular Vector Store APIs and Tools

1. Pinecone

Pinecone is a fully managed vector database with real-time indexing and filtering.
It offers a RESTful API and SDKs for Python, JavaScript, and other languages.

2. FAISS (Facebook AI Similarity Search)

FAISS is an open-source library for efficient similarity search and clustering of dense vectors.
It runs on CPU and GPU and is often used from Python.

3. Weaviate

Weaviate is an open-source vector database with built-in semantic search and GraphQL and REST APIs.
It supports hybrid search, classification, and multi-modal inputs.

4. Chroma

Chroma is a lightweight open-source vector database focused on simplicity and local development.
It is Python-native and well suited to prototyping with LLMs.

Sample Use Case: RAG with OpenAI + Vector Store

Step-by-Step Implementation

Prerequisites:
PostgreSQL (with the pgvector extension)
OpenAI API Key
Spring Boot project
Postman

Step 1. Embed Text via OpenAI (Java)

Use the OpenAI Embeddings API to convert text documents into high-dimensional vectors. These vectors capture the semantic meaning of each document for similarity search.

EmbeddingService.java

Java

import org.springframework.beans.factory.annotation.Value;
import org.springframework.http.*;
import org.springframework.stereotype.Service;
import org.springframework.web.client.RestTemplate;
import java.util.*;

@Service
public class EmbeddingService {

    @Value("${openai.api.key}")
    private String openaiApiKey;

    private final RestTemplate restTemplate = new RestTemplate();

    public List<Double> embedText(String inputText) {
        try {
            String url = "https://api.openai.com/v1/embeddings";

            HttpHeaders headers = new HttpHeaders();
            headers.setContentType(MediaType.APPLICATION_JSON);
            headers.setBearerAuth(openaiApiKey);

            Map<String, Object> request = new HashMap<>();
            request.put("input", inputText);
            request.put("model", "text-embedding-3-small");

            HttpEntity<Map<String, Object>> entity = new HttpEntity<>(request, headers);

            ResponseEntity<EmbeddingResponse> response =
                    restTemplate.postForEntity(url, entity, EmbeddingResponse.class);

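            // Guard against an empty body or empty data list before indexing into it
            EmbeddingResponse body = response.getBody();
            if (body == null || body.getData() == null || body.getData().isEmpty()) {
                throw new IllegalStateException("OpenAI returned no embedding data");
            }
            return body.getData().get(0).getEmbedding();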
        } catch (Exception e) {
            throw new RuntimeException("Failed to fetch embeddings", e);
        }
    }
}

EmbeddingResponse.java:

Java

import java.util.List;

public class EmbeddingResponse {
    private List<EmbeddingData> data;

    public List<EmbeddingData> getData() { return data; }
    public void setData(List<EmbeddingData> data) { this.data = data; }

    public static class EmbeddingData {
        private List<Double> embedding;
        public List<Double> getEmbedding() { return embedding; }
        public void setEmbedding(List<Double> embedding) { this.embedding = embedding; }
    }
}

Step 2. Store Vector in PostgreSQL with pgvector

Save the generated vectors into a vector database like PostgreSQL with pgvector. Each entry includes the original text, its vector and a unique ID for retrieval.

VectorDocument.java:

Java

import jakarta.persistence.*;

@Entity
@Table(name = "documents")
public class VectorDocument {

    @Id
    private String id;

    private String text;

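    // 1536 matches the default output size of text-embedding-3-small.
    // Mapping float[] to a pgvector column typically needs a vector-aware type binding (not shown here).
    @Column(columnDefinition = "vector(1536)")
    private float[] embedding;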

    public String getId() { return id; }
    public void setId(String id) { this.id = id; }

    public String getText() { return text; }
    public void setText(String text) { this.text = text; }

    public float[] getEmbedding() { return embedding; }
    public void setEmbedding(float[] embedding) { this.embedding = embedding; }
}

PostgreSQL table creation using pgvector:

CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE documents (
    id TEXT PRIMARY KEY,
    text TEXT,
    embedding vector(1536)
);
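Note that EmbeddingService returns a List<Double>, while VectorDocument stores a float[]. A small conversion step is needed between the two. The helper below is a hypothetical sketch of that step; it also checks the length against the column's declared dimension, since pgvector rejects vectors whose size does not match.

```java
import java.util.List;

final class EmbeddingConverter {

    // Matches vector(1536) in the table definition above
    static final int EXPECTED_DIMENSION = 1536;

    // Converts the API's List<Double> into the float[] used by the entity.
    static float[] toFloatArray(List<Double> embedding, int expectedDimension) {
        if (embedding.size() != expectedDimension) {
            throw new IllegalArgumentException(
                    "Expected " + expectedDimension + " values but got " + embedding.size());
        }
        float[] result = new float[embedding.size()];
        for (int i = 0; i < result.length; i++) {
            // Narrowing to float loses some precision
            result[i] = embedding.get(i).floatValue();
        }
        return result;
    }
}
```

A typical call would be EmbeddingConverter.toFloatArray(embeddingService.embedText(text), EmbeddingConverter.EXPECTED_DIMENSION) before calling setEmbedding on the entity.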

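Step 3. Repository for Similarity Search

Define a Spring Data repository with a native query that orders rows by their distance to the query vector and returns the closest top-k. The <-> operator is pgvector's Euclidean (L2) distance operator.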

VectorRepository.java

Java

import org.springframework.data.jpa.repository.*;
import org.springframework.data.repository.query.Param;
import org.springframework.stereotype.Repository;
import java.util.List;

@Repository
public interface VectorRepository extends JpaRepository<VectorDocument, String> {

    @Query(value = """
        SELECT id, text, embedding
        FROM documents
        ORDER BY embedding <-> cast(:queryVector AS vector)
        LIMIT :topK
        """, nativeQuery = true)
    List<VectorDocument> searchSimilar(@Param("queryVector") String queryVector,
                                       @Param("topK") int topK);
}
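The queryVector parameter is passed as text and cast to vector inside the query, so it has to follow pgvector's text format: comma-separated numbers enclosed in square brackets, for example [0.1,0.2,0.3]. The helper below is a hypothetical sketch of building that string from an embedding.

```java
import java.util.List;
import java.util.stream.Collectors;

final class PgVectorLiteral {

    // Builds a pgvector text literal such as [0.1,0.2,0.3] from a list of values.
    static String toLiteral(List<Double> values) {
        if (values.isEmpty()) {
            throw new IllegalArgumentException("A vector needs at least one dimension");
        }
        return values.stream()
                .map(String::valueOf)
                .collect(Collectors.joining(",", "[", "]"));
    }
}
```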

Step 4. Vector Search + RAG Prompt

Embed the user query, search for top-k similar vectors from the database and extract their text. Combine the query and matching texts into a prompt to guide the LLM for a contextual answer.

VectorSearchService.java

Java

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;
import java.util.List;
import java.util.stream.Collectors;

@Service
public class VectorSearchService {

    @Autowired
    private EmbeddingService embeddingService;

    @Autowired
    private VectorRepository vectorRepository;

    public String generateAnswer(String userQuery) {
        List<Double> queryEmbedding = embeddingService.embedText(userQuery);

        // pgvector's text format uses square brackets, e.g. [0.1, 0.2, 0.3]
        String vectorStr = "[" + queryEmbedding.stream()
                .map(String::valueOf)
                .collect(Collectors.joining(", ")) + "]";

        List<VectorDocument> topDocs = vectorRepository.searchSimilar(vectorStr, 3);

        StringBuilder context = new StringBuilder();
        topDocs.forEach(doc -> context.append(doc.getText()).append("\n"));

        return sendToOpenAI(context.toString(), userQuery);
    }

    private String sendToOpenAI(String context, String query) {
        String prompt = "Use the context below to answer:\n" + context + "\n\nQ: " + query + "\nA:";

        // TODO: Replace with actual OpenAI Chat API call
        return "[Simulated response from GPT: " + prompt + "]";
    }
}
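The sendToOpenAI method above is left as a placeholder. As a rough sketch of what the real call needs, the OpenAI Chat Completions endpoint (https://api.openai.com/v1/chat/completions) accepts a JSON body with a model name and a list of messages, each with a role and content. The helper below only builds that body; the class name is hypothetical and the model is passed in by the caller rather than assumed.

```java
import java.util.List;
import java.util.Map;

final class ChatRequestBuilder {

    // Builds the request body for a single-turn chat call carrying the RAG prompt.
    static Map<String, Object> buildRequestBody(String model, String prompt) {
        if (model == null || model.isBlank()) {
            throw new IllegalArgumentException("A model name is required");
        }
        return Map.of(
                "model", model,
                "messages", List.of(
                        // The whole prompt (context + question) goes in one user message
                        Map.of("role", "user", "content", prompt)));
    }
}
```

This map can be posted with the same RestTemplate and bearer-auth headers used in EmbeddingService. The generated text is then found in the first element of the response's choices array, under message.content.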

Step 5. Expose RAG API

Create a REST API endpoint (e.g., /api/rag/ask) that accepts a user question. It returns an OpenAI-generated answer based on vector search and retrieved context.

RagController.java

Java

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.*;
import java.util.Map;

@RestController
@RequestMapping("/api/rag")
public class RagController {

    @Autowired
    private VectorSearchService ragService;

    @PostMapping("/ask")
    public ResponseEntity<String> ask(@RequestBody Map<String, String> body) {
        if (!body.containsKey("question")) {
            return ResponseEntity.badRequest().body("Missing field: question");
        }
        String question = body.get("question");
        String answer = ragService.generateAnswer(question);
        return ResponseEntity.ok(answer);
    }
}
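The containsKey check above still lets through a request whose question is null or only whitespace. A slightly stricter check could look like the sketch below; the class and method names are hypothetical, and trimming the input is a design choice rather than a requirement.

```java
import java.util.Map;
import java.util.Optional;

final class QuestionValidator {

    // Returns the trimmed question, or empty if it is missing, null, or blank.
    static Optional<String> extractQuestion(Map<String, String> body) {
        if (body == null) {
            return Optional.empty();
        }
        String question = body.get("question");
        if (question == null || question.isBlank()) {
            return Optional.empty();
        }
        return Optional.of(question.strip());
    }
}
```

In the controller, an empty result would map to the same 400 Bad Request response used for a missing field.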

Step 6. Add dependencies to pom.xml

XML

<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-data-jpa</artifactId>
    </dependency>
    <dependency>
        <groupId>org.postgresql</groupId>
        <artifactId>postgresql</artifactId>
        <version>42.5.0</version>
    </dependency>
    <dependency>
        <groupId>com.pgvector</groupId>
        <artifactId>pgvector</artifactId>
        <version>0.1.1</version>
    </dependency>
</dependencies>

Step 7. Configure application.properties

server.port=8080
spring.datasource.url=jdbc:postgresql://localhost:5432/rag_db
spring.datasource.username=postgres
spring.datasource.password=your_password
spring.jpa.hibernate.ddl-auto=none
openai.api.key=${OPENAI_API_KEY}

Step 8. Run & Test with Postman

Your API should now be running at:

http://localhost:8080/api/rag/ask

Open Postman and Send a Request

Step 1. Create a New Request:

Open Postman.
Set request type to POST.
Set URL:

http://localhost:8080/api/rag/ask

Step 2. Set Headers:

Key: Content-Type
Value: application/json

Step 3. Add Body:

Go to the Body tab.
Select raw and choose JSON.
Paste this:

{
  "question": "What is Spring Boot?"
}

Step 4. Send the Request:

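Click the Send button. You should see a 200 OK status. As written, the controller returns the answer as a plain string, and sendToOpenAI in Step 4 is still a placeholder, so the body is the simulated response. Once a real Chat API call is in place and the answer is wrapped in a JSON object, the response would look like: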

{
  "answer": "Spring Boot is a framework that simplifies the development of Java-based web applications. It provides auto-configuration, embedded servers and production-ready defaults..."
}
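The same request can also be built in Java with the JDK's built-in HTTP client (Java 11+), which is handy for automated checks. The sketch below only constructs the request; the class name is hypothetical, and the hand-rolled escaping covers backslashes and double quotes only, so a JSON library is the safer choice for arbitrary input.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class RagRequestFactory {

    // Minimal JSON string escaping: backslash and double quote only.
    static String escapeJson(String value) {
        return value.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Builds the POST request that Postman sends in the steps above.
    static HttpRequest askRequest(String baseUrl, String question) {
        String json = "{\"question\": \"" + escapeJson(question) + "\"}";
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/api/rag/ask"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }
}
```

With the application running, the request can be sent using HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()).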
