Contamination Detection via output Distribution (CDD) identifies data contamination by analyzing how sharply peaked a model's sampled outputs are. This work tests the approach on small language models with 70M to 410M parameters, using controlled contamination experiments on datasets such as GSM8K, HumanEval, and MATH, and characterizes the conditions under which the method succeeds or fails. The findings clarify both the limitations and the potential applications of CDD for small models. For practitioners, this matters because contamination can distort reported performance and undermine trust in evaluation results, so knowing when CDD is reliable supports more robust evaluation of language models.
No Memorization, No Detection: Output Distribution-Based Contamination Detection in Small Language Models
⚡ High Priority
Why This Matters
We study the conditions under which this approach succeeds and fails on small language models ranging from 70M to 410M parameters.
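The peakedness signal that CDD relies on can be illustrated with a small sketch: sample several completions for a benchmark prompt, compare them to the greedy completion, and flag the prompt when the samples cluster tightly. The Python below is a minimal illustration under assumed details; the edit-similarity-based score, the `generate` callable, and the 0.9 threshold are assumptions made for illustration, not the paper's or the original CDD implementation's exact procedure.

```python
import difflib
from typing import Callable, List


def peakedness_score(greedy_output: str, sampled_outputs: List[str]) -> float:
    """Average similarity of sampled outputs to the greedy output.

    A score near 1.0 means sampling keeps reproducing (nearly) the same
    text, i.e. the output distribution is sharply peaked -- the kind of
    signal a CDD-style detector treats as evidence of memorization.
    """
    if not sampled_outputs:
        return 0.0
    similarities = [
        difflib.SequenceMatcher(None, greedy_output, sample).ratio()
        for sample in sampled_outputs
    ]
    return sum(similarities) / len(similarities)


def flag_contamination(
    generate: Callable[[str, float], str],  # hypothetical: (prompt, temperature) -> completion
    prompt: str,
    num_samples: int = 20,
    temperature: float = 0.8,
    threshold: float = 0.9,  # illustrative cutoff, not taken from the paper
) -> bool:
    """Flag a prompt as likely contaminated when sampled completions
    cluster tightly around the greedy completion."""
    greedy = generate(prompt, 0.0)  # temperature 0 as a stand-in for greedy decoding
    samples = [generate(prompt, temperature) for _ in range(num_samples)]
    return peakedness_score(greedy, samples) >= threshold
```

The intuition behind this kind of check is that a memorized (contaminated) example keeps reproducing nearly the same text even under temperature sampling, pushing the score toward 1.0, while a genuinely unseen example yields more diverse samples and a lower score; the paper's question is when small models memorize strongly enough for that signal to appear at all.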
References
- arXiv. (2026, March 3). No Memorization, No Detection: Output Distribution-Based Contamination Detection in Small Language Models. *arXiv*. https://arxiv.org/abs/2603.03203v1
Original Source
arXiv AI