leeky
Training data contamination detection library for black-box language models, implementing six testing methods to identify potential data leakage.
══════════════════════════════════════════════════════════════════
A pioneering tool for detecting training data contamination in black-box language models, addressing critical concerns about benchmark integrity and model evaluation validity in the AI research community.
Problem Statement
As language models train on vast internet corpora, there’s growing concern about:
- Memorization of public benchmarks
- Contamination of evaluation datasets
- Inflated performance metrics
- Compromised research validity
leeky provides systematic methods to detect whether specific text appears in a model’s training data without access to model weights or training datasets.
Testing Methods
1. Recital Without Context
- Provides N initial tokens from source material
- Prompts for completion without context
- Generates M samples for each of K prompt variants
- Detects verbatim memorization
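A minimal sketch of this loop, assuming only a generic complete(prompt) callable that stands in for any black-box completion API (the function name, whitespace tokenization, and defaults below are illustrative, not leeky's interface):

import difflib
from typing import Callable, List

def recital_score(source: str, complete: Callable[[str], str],
                  n_tokens: int = 20, samples: int = 10) -> float:
    """Prompt with the first n_tokens words of source and measure how
    closely sampled completions reproduce the true continuation."""
    words = source.split()
    prompt = " ".join(words[:n_tokens])
    target = " ".join(words[n_tokens:])
    scores: List[float] = []
    for _ in range(samples):
        completion = complete(prompt)
        # Character-level similarity to the held-out continuation;
        # values near 1.0 suggest verbatim memorization.
        scores.append(difflib.SequenceMatcher(None, completion, target).ratio())
    return sum(scores) / len(scores)

Averaging over repeated samples keeps a single lucky completion from dominating the score.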
2. Contextual Recital
- Similar to recital testing
- Includes explicit source context
- Tests semantic understanding vs memorization
- Higher sensitivity for partial matches
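The only change from the plain recital test is the prompt itself. A hypothetical template (the wording is an assumption, not leeky's actual prompt):

def contextual_prompt(source_name: str, prefix: str) -> str:
    # Naming the source tests whether completion accuracy improves when
    # the model is told where the passage comes from.
    return (f"The following text is the opening of {source_name}. "
            f"Continue it exactly as written:\n\n{prefix}")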
3. Semantic Recital
- Prompts for source-aware completion
- Tests deeper understanding
- Identifies paraphrased content
- Captures non-verbatim contamination
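Because paraphrased contamination will not match verbatim, completions here are scored by similarity rather than exact overlap. A crude stand-in (a real pipeline would compare embeddings):

def semantic_overlap(completion: str, target: str) -> float:
    # Word-set Jaccard overlap as a rough proxy for semantic similarity.
    a, b = set(completion.lower().split()), set(target.lower().split())
    return len(a & b) / len(a | b) if (a | b) else 0.0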
4. Source Veracity
- Yes/No verification of text origin
- Tests model’s source recognition
- Multiple prompt variations
- Statistical confidence scoring
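A sketch of the veracity loop, again assuming a generic complete callable; both prompt wordings are placeholders:

from typing import Callable

def veracity_rate(text: str, complete: Callable[[str], str],
                  samples: int = 5) -> float:
    # Ask the same Yes/No question several ways and average, so the
    # score is not an artifact of one particular phrasing.
    prompts = [
        f"Have you seen this text before? Answer Yes or No.\n\n{text}",
        f"Does this passage appear in your training data? Answer Yes or No.\n\n{text}",
    ]
    yes = total = 0
    for prompt in prompts:
        for _ in range(samples):
            yes += complete(prompt).strip().lower().startswith("yes")
            total += 1
    return yes / total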
5. Source Recall
- Prompts model to identify source
- Tests explicit source memory
- Validates against known origins
- Measures recall accuracy
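In code form, under the same generic-completion assumption:

from typing import Callable

def recall_is_correct(text: str, known_source: str,
                      complete: Callable[[str], str]) -> bool:
    # Free-form attribution, validated by a case-insensitive match
    # against the known origin; recall accuracy is the fraction of
    # test passages attributed correctly.
    answer = complete(f"What is the source of the following text?\n\n{text}")
    return known_source.lower() in answer.lower()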
6. Search Engine Method
- Leverages model as search tool
- Tests information retrieval
- Identifies training data presence
- Cross-validates other methods
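One plausible prompt for this method (speculative wording; the library's actual query may differ):

def search_prompt(snippet: str) -> str:
    # Treat the model as a search engine over its own training corpus:
    # if the true source appears among its "results", the snippet was
    # likely seen during training.
    return f'List the documents that contain the exact phrase "{snippet}".'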
Technical Implementation
Core Architecture
from leeky import ContaminationTester

tester = ContaminationTester(model="gpt-4")

# Test for contamination
results = tester.test_contamination(
    text="Sample legal document text",
    methods=["recital", "veracity", "recall"],
    samples=100,
)

# Analyze results
contamination_score = results.aggregate_score()
Supported Models
- OpenAI API models
- Hugging Face models
- Any text completion API
- Custom model interfaces
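One way to express such an interface is a structural type; this Protocol is a hypothetical illustration of the adapter pattern, not leeky's actual plug-in API:

from typing import Protocol

class CompletionAPI(Protocol):
    def complete(self, prompt: str, max_tokens: int = 256) -> str:
        """Return a text completion for the given prompt."""
        ...

# Any object with a matching complete() method can be tested, e.g. a
# thin wrapper around an OpenAI client or a local Hugging Face pipeline.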
Scoring Methods
Quantitative Metrics
- Verbatim match percentage
- Semantic similarity scores
- Source recognition rates
- Statistical significance tests
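For the last bullet, a one-sided binomial test is a natural fit, for example on the veracity Yes-rate. The 0.5 chance baseline below is an assumption; in practice the baseline would be calibrated on known-clean text:

from scipy.stats import binomtest

def veracity_pvalue(yes_count: int, total: int, baseline: float = 0.5) -> float:
    # Probability of observing at least yes_count "Yes" answers out of
    # total if the model were answering at the baseline rate.
    return binomtest(yes_count, total, p=baseline, alternative="greater").pvalue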
Interpretation Guidelines
- High recital scores indicate memorization
- Veracity scores show recognition
- Recall scores confirm source awareness
- Combined scores provide confidence
Use Cases
Research Integrity
- Validate benchmark cleanliness
- Ensure fair model comparisons
- Detect contaminated evaluations
- Maintain scientific rigor
Model Auditing
- Check for proprietary data leakage
- Verify training data compliance
- Assess memorization risks
- Support responsible AI practices
Legal and Compliance
- Detect copyrighted content
- Verify data usage rights
- Support litigation discovery
- Enable regulatory compliance
Example Results
The tool has been tested on various sources:
- Legal documents (constitutions, contracts)
- Academic papers
- News articles
- Code repositories
- Proprietary datasets
Results demonstrate varying contamination levels across different model families and training approaches.
Best Practices
Testing Strategy
- Use multiple detection methods
- Generate sufficient samples
- Test diverse text types
- Validate with known contaminated data
Interpretation
- Consider false positive rates
- Account for common phrases
- Use statistical thresholds
- Document methodology
Impact on AI Development
leeky addresses fundamental challenges in:
- Evaluation Validity: Ensuring benchmarks measure true capabilities
- Research Reproducibility: Detecting compromised test sets
- Ethical AI: Preventing unauthorized data use
- Model Trust: Validating training data claims
Future Development
Planned enhancements:
- Additional detection methods
- Multi-modal contamination testing
- Automated benchmark validation
- Integration with evaluation suites
leeky represents essential infrastructure for maintaining integrity in AI research and development, providing the tools needed to ensure fair and valid model evaluation.