Detection dimension · weight 15%
LLMmap Active Probing
What this dimension detects
LLMmap is an active-probing fingerprinting technique. The original USENIX Security 2025 paper (Pasquini et al.) trains a deep contrastive classifier on 8 prompt families and reports ~95% accuracy at identifying 42 LLM versions. This release ships a heuristic approximation, not the trained classifier, and does NOT claim the paper's 95% figure. Treat the dimension as a lower-bound signal.
Algorithm
We send up to eight probes from the LLMmap families, extract a small set of lexical and structural features per response (refusal-template family, hedging frequency, signature tokens, structural pattern), and match them against per-vendor templates. The classifier returns 'Unknown' rather than picking a vendor when fewer than 6 probes are covered or when the top vendor does not decisively beat the runner-up. Two probes (synthesis, conflict-handling) are off by default to avoid false positives in safety-tuned proxies, so under the default configuration all six remaining probes must be covered before a vendor guess can be emitted.
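The per-response feature extraction described above might look roughly like the following. This is a minimal sketch, not the shipped implementation: the `Features` type, the `extract_features` function, and every regular expression here are hypothetical placeholders, and signature-token lookup and per-vendor template matching are left out.

```python
import re
from dataclasses import dataclass

# Illustrative patterns only (assumptions); the real templates are not shown in this document.
HEDGES = re.compile(r"\b(might|may|perhaps|possibly|generally|typically)\b", re.I)
REFUSALS = {
    "apology": re.compile(r"\bI(?:'m| am) sorry\b", re.I),
    "cannot": re.compile(r"\bI (?:can(?:no|')t|am unable to)\b", re.I),
}


@dataclass(frozen=True)
class Features:
    refusal_family: str   # "none" when no refusal pattern matches
    hedging_freq: float   # hedge words per whitespace-separated word
    structure: str        # "list", "headings", or "plain"


def extract_features(response: str) -> Features:
    """Reduce one probe response to a few lexical / structural features."""
    words = response.split()
    hedges = len(HEDGES.findall(response))
    # First matching refusal family wins (dict insertion order).
    refusal = next(
        (name for name, pat in REFUSALS.items() if pat.search(response)),
        "none",
    )
    if re.search(r"^\s*(?:[-*]|\d+\.)\s", response, re.M):
        structure = "list"
    elif re.search(r"^#+\s", response, re.M):
        structure = "headings"
    else:
        structure = "plain"
    return Features(refusal, hedges / max(len(words), 1), structure)
```

Keeping the features small and categorical makes them cheap to compare against a per-vendor template, at the cost of the discriminative power a trained classifier would have.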
Thresholds
| Condition | Verdict contribution |
|---|---|
| ≥ 6/8 probes covered, top score ≥ 0.18, and margin over runner-up ≥ 0.04 | Vendor guess emitted |
| Otherwise | Unknown (open-set reject) |
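The table can be read as a single decision rule. The sketch below is one way to express it, assuming the margin is the top score minus the runner-up score and that a missing runner-up counts as 0.0; the `decide` function, its parameters, and the vendor names are hypothetical.

```python
from typing import Dict

MIN_COVERED = 6       # probes covered, out of 8
MIN_TOP_SCORE = 0.18  # minimum score for the best-matching vendor
MIN_MARGIN = 0.04     # minimum lead over the runner-up (assumed: top - second)


def decide(scores: Dict[str, float], covered: int) -> str:
    """Return the top-scoring vendor, or "Unknown" if any threshold fails."""
    if covered < MIN_COVERED or not scores:
        return "Unknown"
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    top_vendor, top = ranked[0]
    # Assumption: with a single candidate, the runner-up score is 0.0.
    runner_up = ranked[1][1] if len(ranked) > 1 else 0.0
    if top < MIN_TOP_SCORE or top - runner_up < MIN_MARGIN:
        return "Unknown"  # open-set reject
    return top_vendor


print(decide({"vendor_a": 0.30, "vendor_b": 0.10}, covered=6))
print(decide({"vendor_a": 0.30, "vendor_b": 0.28}, covered=8))
```

The first call clears all three thresholds; the second is rejected because the lead over the runner-up is too small, even though coverage and top score are sufficient.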
Limitations
Heuristic only, not a trained classifier. It fingerprints the vendor (and approximate generation), not the exact model: distinguishing GPT-5 from GPT-5-mini still requires logprobs or ITT.
References
- Pasquini et al. LLMmap: Fingerprinting Large Language Models. USENIX Security 2025. arXiv:2407.15847