One Bad OSINT Can Ruin Everything: How Secure is Your CTI RAG System, and Can You Minimize the Risk?
Demo for the First CTI 2026 presentation by Yen-Shan (Lily) Chen, Sian-Yao Huang, and Cheng-Lin Yang — CyCraft Technology, Taiwan.
This demo shows an end-to-end adversarial attack — Attention Attractor — against a Retrieval-Augmented Generation (RAG) pipeline for Cyber Threat Intelligence (CTI). The attacker crafts a single poisoned document that (1) gets retrieved by the retriever and (2) hijacks the generator's output via a prompt injection instruction, all by optimizing a small set of adversarial tokens around the malicious content.
Malicious Document = [LLM-generated CTI bait] + [Prompt Injection Instruction]
↓
Step 1: Retriever Attractor Optimization
- Inject R_TOKENS adversarial tokens before and after the bait passage
- Optimize via HotFlip to maximize cosine similarity with the target query
and CLS-token attention toward the clean passage
↓
Step 2: Generator Attractor Optimization
- Inject G_TOKENS adversarial tokens around the malicious instruction
- Optimize via HotFlip to maximize attention from generated output tokens
toward the malicious instruction
↓
Step 3: Evaluation
- Retriever success: poisoned document retrieved in Top-5?
- Generator success: model response begins with "Hackers are the best!"?
| Role | Model |
|---|---|
| Retriever | bce-embedding-base_v1 (BERT-style bi-encoder) |
| Generator | Qwen3-4B (causal LLM) |
The CTI corpus is sourced from CTI-Bench (NeurIPS 2024, arxiv). We sample 500 entries from the CTI-ATE split — mapping threat descriptions to MITRE ATT&CK techniques.
conda create -n attention-attractor python=3.12
conda activate attention-attractor
pip install -r requirements.txtThen open demo.ipynb and select the attention-attractor kernel.
demo.ipynb # Main walkthrough notebook
config.py # Query, prompt template, passages, and hyperparameters (R_TOKENS, G_TOKENS)
utils.py # Retriever, Generator, optimization loops, evaluation, and attention visualizations
data/
cti_corpus.json # CTI-Bench corpus used as the retrieval database
- Settings — Load the CTI corpus, initialize the retriever and generator, and define the malicious document.
- Optimization — Run retriever attractor optimization (150 iterations) then generator attractor optimization (150 iterations). Convergence plots are saved as
retriever_opt.pngandgenerator_opt.png. - Attack Evaluation & Analysis — Measure retrieval rank, check if generation is hijacked, and visualize per-token attention distributions for both models (bar charts + HTML heat maps).