Lab
RAG Poisoning Simulator
Inject adversarial documents into a simulated RAG pipeline and observe how poisoned context changes retrieval and generation.
Educational simulation
What this is
A simulation of how adversarial documents can poison a Retrieval-Augmented Generation pipeline. You inject crafted content into a document corpus and observe how it manipulates retrieval rankings and model output.
What you'll learn
- How RAG pipelines retrieve and rank documents
- Why adversarial documents can dominate retrieval
- How defenses like provenance checks and input filtering help
Document Corpus
The knowledge base contains 4 trusted security advisory documents.
CVE-2024-3094: XZ Utils Backdoor
SecurityCritical supply-chain compromise in xz/liblzma affecting SSH authentication. Malicious code injected via build process targets sshd on x86-64 Linux systems. CVSS 10.0. Immediate update to xz 5.6.1+ required.
CVE-2024-21762: FortiOS Out-of-Bound Write
SecurityCritical vulnerability in Fortinet FortiOS SSL VPN allowing remote code execution via specially crafted HTTP requests. CVSS 9.8. Actively exploited in the wild. Patch to FortiOS 7.4.3+ immediately.
NIST SP 800-53: Access Control Best Practices
PolicyImplement least-privilege access controls. Enforce MFA on all administrative accounts. Review access logs quarterly. Segment networks to limit lateral movement. Maintain up-to-date asset inventory.
Patch Management SOP v3.1
OperationsCritical patches must be applied within 48 hours. High-severity within 7 days. All patches require staging environment validation. Emergency patches follow the CAB fast-track approval process.
Adversarial Injection
Craft a poisoned document and inject it into the corpus.
Query Pipeline
Run a query through the RAG pipeline and observe each stage.
Query
Embed user question
Similarity
Compute cosine scores
Ranking
Order by relevance
Augment
Build context prompt
Response
Generate answer
Ready to query
Defense Mechanisms
Enable defenses to filter or flag adversarial documents during retrieval.
Retrieved Documents
Augmented Prompt
Context + QueryPoisoned vs Clean Comparison
Clean Response
Actual Response
How RAG Pipelines Work
What is RAG?
Retrieval-Augmented Generation combines a search index with a language model. When a user asks a question, relevant documents are retrieved from a corpus, injected into the prompt as context, and the model generates an answer grounded in that context, reducing hallucination and enabling domain-specific knowledge.
Cosine Similarity
Documents and queries are converted to embedding vectors. Cosine similarity measures the angle between two vectors: 1.0 means identical direction (highly relevant), 0 means orthogonal (unrelated). Adversarial documents are crafted to have high cosine similarity to likely queries, ensuring they rank near the top.
Poison Payload Types
Hidden instruction embeds prompt injection tokens (e.g., [INST]) in the document. Topic hijack mimics legitimate content but redirects advice. Authority impersonation fakes a trusted source to boost the model's confidence in the poisoned content.
Defense Strategies
Similarity thresholds reject low-relevance documents. Provenance checks verify document source metadata against a trusted allowlist. Input filtering scans retrieved text for injection patterns before it reaches the model. Layering all three provides robust protection.
Security model (30 seconds)
This tool runs entirely in your browser. No real embeddings are computed, no LLM inference occurs, and no data is sent to any server. All retrieval scores and generated responses are pre-computed lookup tables designed to illustrate RAG poisoning concepts.
Further reading
- Build a Local RAG Pipeline with Ollama and ChromaDB (tutorial)
- RAG Poisoning: How Adversarial Documents Break Retrieval Pipelines (tutorial)
- Embedding Explorer (interactive embedding visualization)
- Prompt Injection Simulator (related attack technique)
- Poisoning Retrieval Corpora by Injecting Adversarial Passages (arXiv)