Lab
RAG Pipeline Playground
Paste text, adjust chunk size and overlap, visualize how documents split into chunks, and run simulated similarity searches.
Interactive chunking & retrieval simulator
What this is
A browser-only sandbox for experimenting with RAG chunking strategies. Paste any text, adjust chunk size and overlap, switch between character and paragraph chunking, and run simulated similarity searches. All processing happens locally.
What you'll learn
- How chunk size affects retrieval granularity
- Why overlap prevents information loss at boundaries
- The tradeoff between character and paragraph chunking
Builder notes
This lab pairs with the RAG pipeline tutorial. Use it to build intuition for chunking parameters before choosing values for your own pipeline.
- Try character chunking at 200 vs. 800 characters to see granularity differences.
- Add overlap and watch the highlighted boundaries shift.
- Switch to paragraph mode to see how semantic boundaries change chunk quality.
- Run similarity searches to compare retrieval across chunk configurations.
How chunking works
RAG pipelines split documents into smaller pieces (chunks) before embedding them as vectors. When a user asks a question, the system embeds the question, searches for the most similar chunk vectors, and passes those chunks to the language model as context.
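The retrieval flow described above can be sketched in a few lines. This is a minimal illustration, not the tutorial's implementation: the three-dimensional vectors are made up by hand to stand in for real embeddings, and the names `cosine`, `retrieve`, and `build_prompt` are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(query_vec, chunk_vecs, chunks, k=2):
    """Return the k chunks whose vectors are most similar to the query vector."""
    scored = sorted(
        zip(chunks, chunk_vecs),
        key=lambda pair: cosine(query_vec, pair[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in scored[:k]]

def build_prompt(question, context_chunks):
    """Join the retrieved chunks into a context block placed ahead of the question."""
    context = "\n\n".join(context_chunks)
    return f"Context:\n{context}\n\nQuestion: {question}"

chunks = [
    "Rotate SSH keys every 90 days.",
    "The cafeteria opens at 8 am.",
    "Disable password login for SSH.",
]
# Hand-written toy vectors (assumption): a real pipeline would get these
# from an embedding model instead.
chunk_vecs = [[0.9, 0.1, 0.0], [0.0, 0.1, 0.9], [0.8, 0.3, 0.1]]
query_vec = [1.0, 0.2, 0.0]

top = retrieve(query_vec, chunk_vecs, chunks, k=2)
prompt = build_prompt("How should I secure SSH?", top)
print(prompt)
```

Only the two SSH-related chunks reach the prompt; the unrelated chunk is ranked last and dropped by the `k=2` cutoff.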
Chunk size controls the granularity of retrieval. Smaller chunks (200-300 characters) are more precise: each chunk tends to cover a single idea, so the best match is more likely to be exactly what was asked about. Larger chunks (800-1000 characters) preserve more surrounding context, which can help the model generate a better answer but also pulls in text unrelated to the query. The tradeoff is precision vs. context.
Overlap reduces information loss at chunk boundaries. Without overlap, a sentence that spans two chunks gets split, and neither chunk contains the full thought. With 50 characters of overlap, the last 50 characters of one chunk repeat at the start of the next, so text near a boundary appears with more of its surrounding context in at least one chunk.
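Character chunking with overlap can be sketched as a sliding window. This is an illustrative sketch, not the lab's actual code: the function name `chunk_by_chars` and its parameters are assumptions, and the window advances by `size - overlap` characters each step.

```python
def chunk_by_chars(text, size, overlap=0):
    """Split text into windows of `size` characters.

    Each window starts `size - overlap` characters after the previous one,
    so the last `overlap` characters of a chunk repeat at the start of the next.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        # Stop once a window has reached the end of the text, so the tail
        # is not emitted again as a shorter duplicate chunk.
        if start + size >= len(text):
            break
    return chunks

print(chunk_by_chars("abcdefghij", size=4))             # ['abcd', 'efgh', 'ij']
print(chunk_by_chars("abcdefghij", size=4, overlap=2))  # ['abcd', 'cdef', 'efgh', 'ghij']
```

With no overlap, the ten characters fall into three disjoint chunks. With an overlap of 2, every boundary region shows up intact in some chunk, at the cost of one extra chunk to store and search.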
Paragraph chunking splits on semantic boundaries, approximated by paragraph breaks (double newlines), instead of character count. This produces variable-length chunks, but each paragraph stays intact, so a chunk is far more likely to contain a complete thought. It works best for structured documents with clear sections.
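A paragraph splitter needs very little code. The sketch below assumes that a paragraph break is one or more blank lines (possibly containing stray whitespace); `chunk_by_paragraphs` is a hypothetical name, and the lab's own splitting rule may differ.

```python
import re

def chunk_by_paragraphs(text):
    """Split on blank lines and drop empty pieces; chunk lengths vary."""
    parts = re.split(r"\n\s*\n", text)
    return [part.strip() for part in parts if part.strip()]

doc = "First paragraph.\nStill first.\n\nSecond paragraph.\n\n\n\nThird."
for chunk in chunk_by_paragraphs(doc):
    print(len(chunk), repr(chunk))
```

A single newline inside a paragraph does not trigger a split, which is why "Still first." stays attached to the first chunk.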
About the similarity search
The similarity search in this lab uses term-frequency cosine similarity, not real embedding vectors. It tokenizes the query and each chunk, builds term-frequency vectors, and computes cosine similarity between them. This mirrors the mechanics of embedding-based search (turn text into vectors, then rank by cosine similarity) but without the semantic understanding that neural embeddings provide.
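Term-frequency cosine similarity can be sketched as follows. The tokenizer here (lowercase, runs of letters and digits) is an assumption, since the lab's exact tokenization is not described, and the names `tokenize`, `tf_cosine`, and `search` are hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (assumed rule)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def tf_cosine(a, b):
    """Cosine similarity between the term-frequency vectors of two strings."""
    tf_a, tf_b = Counter(tokenize(a)), Counter(tokenize(b))
    # Only terms present in both strings contribute to the dot product.
    dot = sum(count * tf_b[term] for term, count in tf_a.items())
    norm_a = math.sqrt(sum(c * c for c in tf_a.values()))
    norm_b = math.sqrt(sum(c * c for c in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def search(query, chunks, k=3):
    """Return up to k (score, chunk) pairs, highest score first."""
    scored = [(tf_cosine(query, chunk), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

chunks = [
    "Overlap repeats text at chunk boundaries.",
    "Paragraph mode splits on blank lines.",
    "Chunk size controls granularity.",
]
for score, chunk in search("chunk overlap", chunks):
    print(f"{score:.3f}  {chunk}")
```

The first chunk ranks highest because it contains both query terms, and the paragraph-mode chunk scores zero because it shares no tokens with the query. Exact token matching is the whole mechanism: a chunk that expresses the same idea in different words gets no credit.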
In a real RAG pipeline (like the one in the tutorial), an embedding model like nomic-embed-text converts text into dense vectors that capture semantic meaning. "SSH vulnerability" and "OpenSSH security flaw" would be close in embedding space even though they share few words. The keyword-based search here would not catch that relationship.
Use the Embedding Space Explorer to see how neural embeddings cluster semantically similar words.
Security model
Everything runs in your browser. The text you paste is processed locally using JavaScript. No data is sent to any server. No embeddings are generated remotely. The similarity search uses simple term-frequency math, not a neural model.
The sample texts are hardcoded in the page source. If you paste sensitive text, it stays in your browser tab and is discarded when you close or refresh the page.