Embedding Space Explorer

Visualize how AI understands meaning through vector similarity and semantic clustering.

Interactive vector visualization

Add words and watch them cluster by meaning. Click points to find similar words, or try vector arithmetic like "king - man + woman = queen".

What this is

A browser-only sandbox with curated 50-dimensional demo vectors and a static 2D layout. Words with similar meanings are placed near each other to build intuition for embedding geometry.

What you'll learn

  • How embeddings capture semantic meaning
  • Why similar concepts cluster together
  • How vector arithmetic enables analogies

Builder notes

This lab is intentionally practical: add words, inspect nearest neighbors, and use analogy mode to build geometric intuition for embeddings.

  • Load presets first to recognize clustering patterns quickly.
  • Switch to Find Similar mode and compare cosine-distance behavior.
  • Use analogy mode to test where vector arithmetic holds or fails.

Learning resources

These are conceptual references; this lab uses curated demo vectors for clarity.

How Embeddings Work

Words as Vectors

Neural networks convert words into high-dimensional vectors (typically 768-1536 numbers). Words with similar meanings get similar vectors. "Cat" and "dog" are close; "cat" and "democracy" are far apart.

Cosine Similarity

We measure how similar two vectors are by the angle between them. Cosine similarity of 1.0 means identical direction; 0.0 means perpendicular; -1.0 means opposite. Most word pairs fall between 0.3 and 0.9.
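In code, cosine similarity is just the dot product of two vectors divided by the product of their lengths. A minimal pure-Python sketch (the tiny 3-dimensional vectors are made up for illustration, not real embeddings):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors, illustrative only
cat = [0.9, 0.8, 0.1]
dog = [0.8, 0.9, 0.2]
democracy = [0.1, 0.2, 0.9]

print(cosine_similarity(cat, dog))        # close in meaning -> near 1.0
print(cosine_similarity(cat, democracy))  # unrelated -> much lower
```

The same formula works unchanged on the lab's 50-dimensional vectors; only the loop length grows.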

Dimensionality Reduction

Real systems often use UMAP or t-SNE to project high-dimensional embeddings to 2D. In this demo, the 2D coordinates are static and hand-tuned for readability, so treat them as an educational map rather than model output.

Vector Arithmetic

Embeddings capture relationships. The famous example: king - man + woman ≈ queen. The "royalty" direction minus the "male" direction plus the "female" direction lands near "queen". This works for many analogies, though not all; analogy mode lets you test where it breaks down.
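The analogy trick is: compute a - b + c, then return the vocabulary word whose vector is most similar to the result, excluding the three inputs. A minimal sketch with hand-made toy vectors (dimensions loosely standing for "royalty", "male", "female"; not real embeddings):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Tiny hand-made vectors, illustrative only
vocab = {
    "king":  [0.9, 0.9, 0.1],
    "queen": [0.9, 0.1, 0.9],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.1, 0.1],
}

def analogy(a, b, c):
    """Solve a - b + c ~= ?, excluding the input words from the candidates."""
    target = [x - y + z for x, y, z in zip(vocab[a], vocab[b], vocab[c])]
    candidates = [w for w in vocab if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, vocab[w]))

print(analogy("king", "man", "woman"))  # -> queen
```

Excluding the inputs matters: in real embedding spaces, a - b + c usually stays closest to a itself, so libraries like Gensim filter the query words out before ranking.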

Real-World Applications

RAG: Retrieval-Augmented Generation

When you ask a question, a RAG system embeds your query and finds the most similar documents in a knowledge base. The retrieved context helps the model give accurate answers.
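The retrieval step reduces to: embed the query, score every stored document by cosine similarity, return the top k. A minimal sketch with hypothetical toy vectors standing in for a real embedding model and vector database:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Hypothetical precomputed document embeddings; a real system
# would call an embedding model and store vectors in a database
doc_store = {
    "Cats are small domesticated felines.": [0.9, 0.8, 0.1],
    "Dogs are loyal household companions.": [0.8, 0.9, 0.2],
    "Democracy is rule by the people.":     [0.1, 0.2, 0.9],
}

def retrieve(query_vector, k=2):
    """Return the k documents whose embeddings are most similar to the query."""
    ranked = sorted(doc_store, key=lambda d: cosine(query_vector, doc_store[d]), reverse=True)
    return ranked[:k]

# A query like "tell me about pets" might embed near the animal documents
context = retrieve([0.85, 0.85, 0.1], k=2)
```

The retrieved `context` is then prepended to the prompt; everything after retrieval is ordinary text generation.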

Semantic Search

Unlike keyword search, semantic search understands meaning. Searching for "automobile" finds documents about "cars" because their embeddings are similar.

Content Classification

Embeddings power spam detection, sentiment analysis, and content moderation. Cluster similar content together, then label the clusters.
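One simple version of this is nearest-centroid classification: average the embeddings of a few labeled examples into one centroid per label, then assign new content to the label with the closest centroid. A sketch with hypothetical toy vectors (a real system would embed actual messages):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def centroid(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

# Hypothetical embeddings for a handful of labeled examples
labeled = {
    "spam": [[0.9, 0.1, 0.1], [0.8, 0.2, 0.1]],
    "ham":  [[0.1, 0.8, 0.3], [0.2, 0.9, 0.2]],
}
centroids = {label: centroid(vs) for label, vs in labeled.items()}

def classify(vector):
    """Assign the label whose centroid is most cosine-similar."""
    return max(centroids, key=lambda label: cosine(vector, centroids[label]))

print(classify([0.85, 0.15, 0.1]))  # spam-like -> "spam"
```

The same pattern scales up: cluster unlabeled embeddings first, inspect a few members of each cluster, then label the clusters rather than individual items.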

Keyboard shortcuts

  • + / - Zoom in/out
  • Arrow keys Pan the view
  • R Reset view
  • Escape Deselect
  • L Toggle labels

About the embeddings

This demo uses precomputed, curated 50-dimensional vectors stored as static data in the page code. The 2D positions are also static and chosen for readability. This is useful for intuition, but it is not a live embedding model and not a benchmark for production embedding quality.

Security model

Everything runs in your browser. No words you enter are sent to any server. The embeddings are pre-computed and loaded as static data.