A vector store is a type of database that stores vector embeddings, which are numerical representations of entities such as text, images or audio. These embeddings are produced by embedding models (such as those offered by OpenAI) and capture the meaning of the data. This lets you search by similar meaning rather than by exact keywords, for example finding documents or images that are related in context even if they don’t use the same words.
Why Use Vector Stores
Traditional databases excel at handling exact matches, such as searching for a product name or filtering by a specific category. But they fall short when it comes to understanding meaning or context. That’s where vector stores come in.
Let’s explore why vector stores are essential in modern AI applications:
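1. Semantic Search Over Keyword Search: Traditional search systems rely on exact keyword matches, whereas AI-driven systems need semantic relevance. For example, a query for "car repair" can match a document about "vehicle maintenance" even though the two share no keywords.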
2. Core to Retrieval-Augmented Generation (RAG): Vector stores are foundational for RAG systems, where LLMs retrieve external knowledge before generating responses.
3. High-Performance Similarity Search: Vector stores use Approximate Nearest Neighbor (ANN) algorithms to perform fast and scalable similarity lookups, even in datasets with millions of vectors.
4. Scalable & Real-Time Capable: Modern vector databases support:
- Real-time indexing
- Metadata filtering
- Distributed scaling
- Cloud-native deployments
Common Vector Store API Operations
The following operations are typically available when working with vector databases or vector stores in AI/ML applications. The endpoints and payloads shown are generic illustrations rather than the API of a specific product.
1. Indexing Vectors
Indexing vectors involves storing high-dimensional data in structures optimized for fast similarity search. This enables quick retrieval of semantically similar items using metrics like cosine or Euclidean distance.
Example:
POST /vectors
{
  "id": "doc1",
  "vector": [0.123, 0.456, ..., 0.789],
  "metadata": {
    "title": "Spring Boot Guide",
    "tags": ["java", "spring"]
  }
}
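To make the two metrics mentioned above concrete, here is a minimal sketch in plain Java. The class name VectorMath and its method names are hypothetical and exist only for illustration; real vector stores compute these metrics internally, usually with optimized native code.

```java
/** Illustrative helper (hypothetical name) for cosine similarity and Euclidean distance. */
final class VectorMath {

    private VectorMath() {}

    /** Cosine similarity: dot(a, b) / (|a| * |b|). A value of 1.0 means the same direction. */
    static double cosineSimilarity(double[] a, double[] b) {
        requireSameLength(a, b);
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException("Cosine similarity is undefined for a zero vector");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Euclidean (L2) distance. A value of 0.0 means the vectors are identical. */
    static double euclideanDistance(double[] a, double[] b) {
        requireSameLength(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    private static void requireSameLength(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same number of dimensions");
        }
    }
}
```

Note that the two metrics point in opposite directions: a higher cosine similarity means a closer match, while a lower Euclidean distance means a closer match.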
2. Searching Vectors
Searching vectors finds the top-k most similar results based on meaning, enabling semantic search using similarity metrics.
Example:
POST /query
{
  "vector": [0.987, 0.654, ..., 0.321],
  "top_k": 5
}
Response:
{
  "results": [
    {"id": "doc3", "score": 0.94, "metadata": {...}},
    {"id": "doc7", "score": 0.91, "metadata": {...}},
    ...
  ]
}
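Conceptually, a top-k query scores every stored vector against the query and keeps the best k. The sketch below does this exhaustively with cosine similarity; it is an illustration only, and the names BruteForceSearch, Match and topK are hypothetical. Production vector stores typically replace the full scan with an ANN index so that not every vector has to be compared.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Exhaustive (non-ANN) top-k search, shown only to illustrate what a query computes. */
final class BruteForceSearch {

    /** One search hit: the stored id and its similarity to the query. */
    record Match(String id, double score) {}

    private BruteForceSearch() {}

    static List<Match> topK(Map<String, double[]> vectors, double[] query, int k) {
        if (k <= 0) {
            return List.of();
        }
        List<Match> matches = new ArrayList<>();
        for (Map.Entry<String, double[]> entry : vectors.entrySet()) {
            matches.add(new Match(entry.getKey(), cosine(entry.getValue(), query)));
        }
        // Highest similarity first.
        matches.sort((x, y) -> Double.compare(y.score(), x.score()));
        return List.copyOf(matches.subList(0, Math.min(k, matches.size())));
    }

    /** Cosine similarity; assumes equal-length, non-zero vectors. */
    private static double cosine(double[] a, double[] b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

The full scan costs time proportional to the number of stored vectors per query, which is the cost that ANN indexes are designed to avoid at the price of occasionally missing a true nearest neighbor.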
3. Deleting Vectors
Deleting vectors involves removing stored vector entries from the database using their unique IDs. This helps manage outdated or irrelevant data efficiently.
Example:
DELETE /vectors
{
  "ids": ["doc1", "doc5"]
}
4. Updating Vectors
Updating vectors allows you to modify an existing vector's embedding or metadata. This ensures your vector store reflects the latest information or content changes.
Example:
PUT /vectors
{
  "id": "doc1",
  "vector": [new values],
  "metadata": {
    "updated": true
  }
}
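The index, update and delete operations above can be mirrored by a small in-memory store. This is a sketch for illustration, not how any particular product is implemented; InMemoryVectorStore, StoredVector, upsert and delete are hypothetical names. It treats indexing and updating as a single "upsert" keyed by id, which is a common convention.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy store keyed by id, illustrating upsert and delete semantics. */
final class InMemoryVectorStore {

    /** A stored embedding together with its metadata. */
    record StoredVector(double[] vector, Map<String, Object> metadata) {}

    private final Map<String, StoredVector> entries = new HashMap<>();

    /** Inserts a new entry or overwrites the existing one with the same id. */
    void upsert(String id, double[] vector, Map<String, Object> metadata) {
        // Defensive copies so later changes by the caller do not affect the store.
        entries.put(id, new StoredVector(vector.clone(), Map.copyOf(metadata)));
    }

    /** Removes the given ids and returns how many of them were actually present. */
    int delete(List<String> ids) {
        int removed = 0;
        for (String id : ids) {
            if (entries.remove(id) != null) {
                removed++;
            }
        }
        return removed;
    }

    /** Returns the entry for the id, or null if it is not stored. */
    StoredVector get(String id) {
        return entries.get(id);
    }

    int size() {
        return entries.size();
    }
}
```

Deleting an id that does not exist is treated as a no-op here, so a batch delete such as ["doc1", "doc5"] succeeds even if only one of the ids is stored.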
Popular Vector Store APIs and Tools
1. Pinecone
- Pinecone is a fully managed vector database with real-time indexing and filtering.
- It offers a RESTful API and SDKs for Python, JavaScript, etc.
2. FAISS (Facebook AI Similarity Search)
- FAISS is an open-source library for efficient similarity search and clustering of dense vectors.
- It runs on CPU and GPU and is often used from Python.
3. Weaviate
- Weaviate is an open-source vector database with built-in semantic search and GraphQL/REST APIs.
- It supports hybrid search, classification and multi-modal inputs.
4. Chroma
- Chroma is a lightweight open-source vector database focused on simplicity and local development.
- It is Python-native and well suited to prototyping with LLMs.
Sample Use Case: RAG with OpenAI + Vector Store
Step-by-Step Implementation
Prerequisites:
- PostgreSQL (with pgvector extension).
- OpenAI API Key
- Spring Boot project
- Postman
Step 1. Embed Text via OpenAI (Java)
Use the OpenAI Embeddings API to convert text documents into high-dimensional vectors. These vectors capture the semantic meaning of each document for similarity search.
EmbeddingService.java
import org.springframework.beans.factory.annotation.Value;
import org.springframework.http.*;
import org.springframework.stereotype.Service;
import org.springframework.web.client.RestTemplate;
import java.util.*;

@Service
public class EmbeddingService {

    @Value("${openai.api.key}")
    private String openaiApiKey;

    private final RestTemplate restTemplate = new RestTemplate();

    // Calls the OpenAI embeddings endpoint and returns the vector for the given text.
    public List<Double> embedText(String inputText) {
        try {
            String url = "https://api.openai.com/v1/embeddings";

            HttpHeaders headers = new HttpHeaders();
            headers.setContentType(MediaType.APPLICATION_JSON);
            headers.setBearerAuth(openaiApiKey);

            Map<String, Object> request = new HashMap<>();
            request.put("input", inputText);
            request.put("model", "text-embedding-3-small");

            HttpEntity<Map<String, Object>> entity = new HttpEntity<>(request, headers);
            ResponseEntity<EmbeddingResponse> response =
                    restTemplate.postForEntity(url, entity, EmbeddingResponse.class);

            // Guard against an empty body instead of failing with a NullPointerException.
            EmbeddingResponse body = response.getBody();
            if (body == null || body.getData() == null || body.getData().isEmpty()) {
                throw new IllegalStateException("OpenAI returned no embedding data");
            }
            return body.getData().get(0).getEmbedding();
        } catch (Exception e) {
            throw new RuntimeException("Failed to fetch embeddings", e);
        }
    }
}
EmbeddingResponse.java:
import java.util.List;

public class EmbeddingResponse {

    private List<EmbeddingData> data;

    public List<EmbeddingData> getData() { return data; }
    public void setData(List<EmbeddingData> data) { this.data = data; }

    public static class EmbeddingData {

        private List<Double> embedding;

        public List<Double> getEmbedding() { return embedding; }
        public void setEmbedding(List<Double> embedding) { this.embedding = embedding; }
    }
}
Step 2. Store Vector in PostgreSQL with pgvector
Save the generated vectors into a vector database like PostgreSQL with pgvector. Each entry includes the original text, its vector and a unique ID for retrieval.
VectorDocument.java:
import jakarta.persistence.*;

@Entity
@Table(name = "documents")
public class VectorDocument {

    @Id
    private String id;

    private String text;

    // Plain JPA cannot bind float[] to a pgvector column on its own. A vector-aware
    // mapping is needed, such as Hibernate's hibernate-vector module or a custom type.
    @Column(columnDefinition = "vector(1536)")
    private float[] embedding;

    public String getId() { return id; }
    public void setId(String id) { this.id = id; }

    public String getText() { return text; }
    public void setText(String text) { this.text = text; }

    public float[] getEmbedding() { return embedding; }
    public void setEmbedding(float[] embedding) { this.embedding = embedding; }
}
PostgreSQL table creation using pgvector:
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE documents (
    id TEXT PRIMARY KEY,
    text TEXT,
    embedding vector(1536)
);
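The embedding service in Step 1 returns a List<Double>, while the entity stores a float[] and pgvector accepts vectors as text in the form [x,y,z]. The sketch below shows one way to bridge the two, assuming the 1536-dimension column defined above. PgVectorFormat and its methods are hypothetical helper names, not part of pgvector or Spring.

```java
import java.util.List;
import java.util.StringJoiner;

/** Hypothetical helper for converting embeddings to the shapes used with pgvector. */
final class PgVectorFormat {

    private PgVectorFormat() {}

    /** Converts an embedding to float[] and rejects vectors of the wrong size. */
    static float[] toFloatArray(List<Double> embedding, int expectedDimensions) {
        if (embedding.size() != expectedDimensions) {
            throw new IllegalArgumentException(
                    "Expected " + expectedDimensions + " dimensions but got " + embedding.size());
        }
        float[] result = new float[embedding.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = embedding.get(i).floatValue();
        }
        return result;
    }

    /** Renders a vector in pgvector's text input format: square brackets, comma-separated. */
    static String toLiteral(float[] vector) {
        StringJoiner joiner = new StringJoiner(",", "[", "]");
        for (float value : vector) {
            joiner.add(Float.toString(value));
        }
        return joiner.toString();
    }
}
```

A literal produced this way can be passed as a string parameter and cast on the database side, for example with cast(:queryVector AS vector) as in the repository query in Step 3. Checking the dimension count early gives a clearer error than the one PostgreSQL raises when a vector does not match the column definition.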
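Step 3. Repository for Similarity Search
Define a Spring Data repository with a native query that orders documents by their distance to the query vector. pgvector's <-> operator computes Euclidean (L2) distance, so the closest matches come first.
VectorRepository.java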
import org.springframework.data.jpa.repository.*;
import org.springframework.data.repository.query.Param;
import org.springframework.stereotype.Repository;
import java.util.List;

@Repository
public interface VectorRepository extends JpaRepository<VectorDocument, String> {

    @Query(value = """
            SELECT id, text, embedding
            FROM documents
            ORDER BY embedding <-> cast(:queryVector AS vector)
            LIMIT :topK
            """, nativeQuery = true)
    List<VectorDocument> searchSimilar(@Param("queryVector") String queryVector,
                                       @Param("topK") int topK);
}
Step 4. Vector Search + RAG Prompt
Embed the user query, search for top-k similar vectors from the database and extract their text. Combine the query and matching texts into a prompt to guide the LLM for a contextual answer.
VectorSearchService.java
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;
import java.util.List;
import java.util.stream.Collectors;

@Service
public class VectorSearchService {

    @Autowired
    private EmbeddingService embeddingService;

    @Autowired
    private VectorRepository vectorRepository;

    public String generateAnswer(String userQuery) {
        // 1. Embed the user query.
        List<Double> queryEmbedding = embeddingService.embedText(userQuery);

        // 2. Serialize it in pgvector's text format, which uses square brackets: [0.1,0.2,0.3]
        String vectorStr = "[" + queryEmbedding.stream()
                .map(String::valueOf)
                .collect(Collectors.joining(",")) + "]";

        // 3. Retrieve the three nearest documents and concatenate their text.
        List<VectorDocument> topDocs = vectorRepository.searchSimilar(vectorStr, 3);
        StringBuilder context = new StringBuilder();
        topDocs.forEach(doc -> context.append(doc.getText()).append("\n"));

        // 4. Ask the LLM to answer using the retrieved context.
        return sendToOpenAI(context.toString(), userQuery);
    }

    private String sendToOpenAI(String context, String query) {
        String prompt = "Use the context below to answer:\n" + context + "\n\nQ: " + query + "\nA:";
        // TODO: Replace with actual OpenAI Chat API call
        return "[Simulated response from GPT: " + prompt + "]";
    }
}
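The sendToOpenAI method above is a stub. One possible way to fill it in is to POST a request body like the one built below to the OpenAI Chat Completions endpoint (https://api.openai.com/v1/chat/completions), reusing the RestTemplate and bearer-token header pattern from Step 1. This sketch only builds the payload. ChatRequestBuilder is a hypothetical helper, and both the model name passed in and the wording of the system message are assumptions you would adapt.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper that assembles a Chat Completions request body for a RAG prompt. */
final class ChatRequestBuilder {

    private ChatRequestBuilder() {}

    static Map<String, Object> build(String model, String context, String question) {
        // The system message constrains the model to the retrieved context (assumed wording).
        Map<String, String> systemMessage = Map.of(
                "role", "system",
                "content", "Answer the question using only the provided context.");

        // The user message carries the retrieved context followed by the question.
        Map<String, String> userMessage = Map.of(
                "role", "user",
                "content", "Context:\n" + context + "\n\nQuestion: " + question);

        Map<String, Object> body = new LinkedHashMap<>();
        body.put("model", model);
        body.put("messages", List.of(systemMessage, userMessage));
        return body;
    }
}
```

For example, ChatRequestBuilder.build("gpt-4o-mini", context, query) produces a map that RestTemplate can serialize to JSON. In the Chat Completions response, the generated text is found under choices[0].message.content, so a response class similar to EmbeddingResponse would be needed to read it.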
Step 5. Expose RAG API
Create a REST API endpoint (e.g., /api/rag/ask) that accepts a user question. It returns an OpenAI-generated answer based on vector search and retrieved context.
RagController.java
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.*;
import java.util.Map;

@RestController
@RequestMapping("/api/rag")
public class RagController {

    @Autowired
    private VectorSearchService ragService;

    // Returns a JSON object of the form {"answer": "..."} rather than a bare string.
    @PostMapping("/ask")
    public ResponseEntity<Map<String, String>> ask(@RequestBody Map<String, String> body) {
        String question = body.get("question");
        if (question == null || question.isBlank()) {
            return ResponseEntity.badRequest().body(Map.of("error", "Missing field: question"));
        }
        String answer = ragService.generateAnswer(question);
        return ResponseEntity.ok(Map.of("answer", answer));
    }
}
Step 6. Add dependencies to pom.xml
<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-data-jpa</artifactId>
    </dependency>
    <dependency>
        <groupId>org.postgresql</groupId>
        <artifactId>postgresql</artifactId>
        <version>42.5.0</version>
    </dependency>
    <!-- The pgvector Java client is published under the com.pgvector group -->
    <dependency>
        <groupId>com.pgvector</groupId>
        <artifactId>pgvector</artifactId>
        <version>0.1.1</version>
    </dependency>
</dependencies>
Step 7. Configure application.properties:
server.port=8080
spring.datasource.url=jdbc:postgresql://localhost:5432/rag_db
spring.datasource.username=postgres
spring.datasource.password=your_password
spring.jpa.hibernate.ddl-auto=none
openai.api.key=${OPENAI_API_KEY}
Step 8. Run & Test with Postman
Your API should now be running at:
http://localhost:8080/api/rag/ask
Open Postman and Send Request
Step 1. Create a New Request:
- Open Postman.
- Set request type to POST.
- Set URL:
http://localhost:8080/api/rag/ask
Step 2. Set Headers:
Key: Content-Type
Value: application/json
Step 3. Add Body:
- Go to the Body tab.
- Select raw and choose JSON.
- Paste this:
{
  "question": "What is Spring Boot?"
}
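Step 4. Send the Request:
Click the Send button. You should see a 200 OK status. With the simulated sendToOpenAI stub from the RAG service, the answer contains the assembled prompt. Once the stub is replaced with a real Chat API call, the response from the OpenAI-powered backend should look like: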
{
  "answer": "Spring Boot is a framework that simplifies the development of Java-based web applications. It provides auto-configuration, embedded servers and production-ready defaults..."
}
