Store embeddings, search by similarity. The foundation of most RAG systems. Postgres plus pgvector gets you there.
An embedding turns text into a list of numbers — a vector. Similar meanings land near each other in that space. A vector database does one thing: find the nearest neighbors to a query vector, fast.
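To make "nearest neighbors" concrete, here is a minimal sketch of exact, brute-force search in plain TypeScript. The names cosineDistance, nearest, and toyDocs are hypothetical, and the 3-dimensional vectors are made up for illustration; real embeddings have hundreds or thousands of dimensions. A vector database returns the same kind of result, but uses an index so it does not have to score every row.

```typescript
// Cosine distance between two equal-length vectors: 1 - cos(angle).
// 0 means same direction; larger values mean less similar.
// Note: a zero-length vector would produce NaN here; this sketch does not guard against it.
function cosineDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Exact k-nearest-neighbor search: score every item, sort by distance, keep k.
function nearest<T extends { embedding: number[] }>(
  items: T[],
  query: number[],
  k: number
): T[] {
  return items
    .map((item) => ({ item, distance: cosineDistance(item.embedding, query) }))
    .sort((x, y) => x.distance - y.distance)
    .slice(0, k)
    .map((x) => x.item);
}

// Hypothetical toy "embeddings", for illustration only.
const toyDocs = [
  { body: "cats purr", embedding: [0.9, 0.1, 0.0] },
  { body: "dogs bark", embedding: [0.7, 0.3, 0.1] },
  { body: "tax forms", embedding: [0.0, 0.1, 0.9] },
];

const hits = nearest(toyDocs, [1, 0, 0], 2);
console.log(hits.map((h) => h.body));
```

With the query [1, 0, 0], "cats purr" ranks first and "dogs bark" second, because their vectors point in nearly the same direction as the query. This scan costs one distance computation per stored item, which is why larger collections rely on an approximate index instead.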
    CREATE EXTENSION IF NOT EXISTS vector;

    CREATE TABLE docs (
      id BIGSERIAL PRIMARY KEY,
      body TEXT NOT NULL,
      embedding vector(1536) NOT NULL -- 1536 for text-embedding-3-small
    );

    -- HNSW index for fast approximate nearest-neighbor search
    CREATE INDEX ON docs USING hnsw (embedding vector_cosine_ops);

A vector column plus an HNSW index is all the schema pgvector needs, and it runs inside regular Postgres. (HNSW indexes need pgvector 0.5.0 or newer.)

    import OpenAI from "openai";
    import { Pool } from "pg";

    const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment
    const pool = new Pool({ connectionString: process.env.DATABASE_URL });

    // Turn a piece of text into a 1536-dimensional embedding.
    async function embed(text: string): Promise<number[]> {
      const r = await openai.embeddings.create({
        model: "text-embedding-3-small",
        input: text,
      });
      return r.data[0].embedding;
    }

    // Embed the document once and store it next to its text.
    export async function insert(body: string) {
      const vec = await embed(body);
      await pool.query(
        "INSERT INTO docs (body, embedding) VALUES ($1, $2)",
        [body, `[${vec.join(",")}]`]
      );
    }

    // Embed the query, then return the k closest documents by cosine distance.
    export async function search(query: string, k = 5) {
      const vec = await embed(query);
      const { rows } = await pool.query(
        "SELECT id, body, 1 - (embedding <=> $1) AS score FROM docs ORDER BY embedding <=> $1 LIMIT $2",
        [`[${vec.join(",")}]`, k]
      );
      return rows;
    }

The <=> operator returns cosine distance: lower means more similar. 1 - distance is the cosine similarity, where higher means more similar. In general it ranges from -1 to 1; for text embeddings it typically falls between 0 and 1, which makes it a convenient score.

The big idea: embed each document once, search it many times. Postgres plus pgvector handles production-grade semantic search without a separate vector service.
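Both insert and search serialize the embedding as a bracketed, comma-separated string before handing it to Postgres. A small guard around that step can catch a vector whose length does not match the vector(1536) column before the query is sent. The sketch below is one possible way to do it; toVectorLiteral and EMBEDDING_DIM are hypothetical names, not part of pg or pgvector.

```typescript
// Assumption: matches the vector(1536) column in the docs table.
const EMBEDDING_DIM = 1536;

// Hypothetical helper: serialize an embedding into the bracketed text form
// used in the queries above, e.g. "[0.1,0.2,0.3]", after validating it.
function toVectorLiteral(vec: number[], dim: number = EMBEDDING_DIM): string {
  if (vec.length !== dim) {
    throw new Error(`expected ${dim} dimensions, got ${vec.length}`);
  }
  if (!vec.every((x) => Number.isFinite(x))) {
    throw new Error("embedding contains NaN or Infinity");
  }
  return `[${vec.join(",")}]`;
}

// Small example with an explicit 3-dimensional override.
console.log(toVectorLiteral([0.1, 0.2, 0.3], 3)); // prints [0.1,0.2,0.3]
```

In the lesson's code, the template string passed to pool.query could be swapped for toVectorLiteral(vec), so that switching to an embedding model with a different output size fails early with a clear message in application code.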
6 questions · take it digitally for instant feedback at tendril.neural-forge.io/learn/quiz/end-progx-vector-db-creators
What is the main idea of "Vector DB Basics With pgvector"?
Which concept is most central to "Vector DB Basics With pgvector"?
What should a careful learner remember about "Keep dimensions consistent"?
You want to use AI after this lesson. What is the safest next step?
How should AI output about embedding be treated?
Name one way to verify an AI answer about embedding.