Canary prompts are deterministic behavior probes with known or expected answers. In the current model, canary behavior is report-only; scored capability checks live in the separate capability-floor dimension.
Algorithm
Run the canary prompt set, compare responses with known-answer templates when available, and display misses or surprising alternatives. The diagnostic is useful for inspecting behavior but is excluded from the headline score.
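The flow above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the tool's actual implementation: the names `Canary`, `CanaryResult`, `run_canaries`, and the `ask` callback are hypothetical, templates are assumed to be regular expressions, and the "no template" label for prompts without a known answer is also an assumption.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Canary:
    prompt: str
    # Assumption: a known-answer template is a regex; None means no template is available.
    template: Optional[str] = None


@dataclass
class CanaryResult:
    prompt: str
    response: str
    verdict: str  # "diagnostic hit", "diagnostic miss", or "no template"
    # Report-only: every result contributes 0 to the headline score.
    score_contribution: float = 0.0


def run_canaries(canaries: List[Canary], ask: Callable[[str], str]) -> List[CanaryResult]:
    """Send each canary prompt through `ask` and label the response for display."""
    results = []
    for canary in canaries:
        response = ask(canary.prompt)
        if canary.template is None:
            verdict = "no template"
        elif re.search(canary.template, response, flags=re.IGNORECASE):
            verdict = "diagnostic hit"
        else:
            # Covers both plain misses and surprising alternatives.
            verdict = "diagnostic miss"
        results.append(CanaryResult(canary.prompt, response, verdict))
    return results


if __name__ == "__main__":
    # Hypothetical stand-in for a real model call.
    canned = {"What is 2 + 2?": "The answer is 4.", "Capital of France?": "Lyon"}
    demo = [
        Canary("What is 2 + 2?", r"\b4\b"),
        Canary("Capital of France?", r"\bParis\b"),
        Canary("Say hello."),
    ]
    for result in run_canaries(demo, lambda p: canned.get(p, "")):
        print(f"{result.verdict:15} score={result.score_contribution} {result.prompt}")
```

Keeping `score_contribution` as a field that is always 0 makes the report-only status explicit in the data, so a downstream scorer cannot pick up canary results by accident.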
Thresholds
| Condition | Verdict contribution |
| --- | --- |
| Template hit | Diagnostic hit |
| Template miss or surprising alternative | Diagnostic miss |
| Any result | Score contribution remains 0 |
Limitations
Known-answer templates may be estimates or incomplete for a claimed model. Prompt wording and system prompts can change outputs. Use capability-floor for scored ground-truth grading.
No single signal can prove malicious behavior. Proxies may show anomalies for legitimate reasons (regional routing, A/B testing, degradation strategies, cache optimization).
Token ratio deviation may result from ChatML wrapping, system prompt injection, or tokenizer version differences — not necessarily intentional inflation.
Model identity judgment is based on statistical fingerprint matching, not cryptographic proof. Quantization, fine-tuning, and post-processing can all alter fingerprints.
MMD distribution tests are sensitive to temperature, sampling parameters, and system prompts. Significant p-values mean distributional difference, not proof of substitution.
Logprobs unavailability is increasingly common (many providers disable it by default in 2025-2026) and does not by itself indicate deception.
ITT rhythm fingerprinting is an early-stage technique. Network jitter, TCP coalescing, and gateway buffering can produce false signals.
This tool generates reference-grade evidence chains, not legal conclusions. Do not make definitive accusations based solely on this report.
The report describes its findings as statistical "deviations" or "signal inconsistencies". Do not use the report to make fraud or deception claims against any service provider.