Ban Competitors Scanner
The BanCompetitors
Scanner is designed to identify and handle mentions of competitors in text generated by Large
Language Models (LLMs).
This scanner is essential for businesses and individuals who wish to avoid inadvertently promoting or acknowledging
competitors in their automated content.
Motivation
In the realm of business and marketing, it's crucial to maintain a strategic focus on one's own brand and offerings. LLMs, while generating content, might unintentionally include references to competing entities. This can be counterproductive, especially in marketing materials, business reports, or any content representing a specific brand or organization.
The BanCompetitors
Scanner addresses this issue by detecting and managing mentions of competitors.
How it works
The scanner uses a Named Entity Recognition (NER) model to identify organizations within the text. After extracting these entities, it cross-references them with a user-provided list of known competitors, which should include all common variations of their names. If a competitor is detected, the scanner can either flag the text or redact the competitor's name based on user preference.
Models:
Usage
from llm_guard.output_scanners import BanCompetitors
competitor_list = ["Competitor1", "CompetitorOne", "C1", ...] # Extensive list of competitors
scanner = BanCompetitors(competitors=competitor_list, redact=False, threshold=0.5)
sanitized_output, is_valid, risk_score = scanner.scan(prompt, output)
An effective competitor list should include:
- The official names of all known competitors.
- Common abbreviations or variations of these names.
- Any subsidiaries or associated brands of the competitors.
- The completeness and accuracy of this list are vital for the effectiveness of the scanner.
Considerations and Limitations
- Accuracy: The accuracy of competitor detection relies heavily on the NER model's capabilities and the comprehensiveness of the competitor list.
- Context Awareness: The scanner may not fully understand the context in which a competitor's name is used, leading to potential over-redaction.
- Performance: The scanning process might add additional computational overhead, especially for large texts with numerous entities.
Optimization Strategies
Benchmark
Environment:
- Platform: Amazon Linux 2
- Python Version: 3.11.6
Run the following script:
python benchmarks/run.py output BanCompetitors
Results:
Instance | Latency Variance | Latency 90 Percentile | Latency 95 Percentile | Latency 99 Percentile | Average Latency (ms) | QPS |
---|---|---|---|---|---|---|
AWS m5.xlarge | 3.09 | 780.28 | 804.74 | 824.31 | 719.37 | 116.77 |
AWS g5.xlarge GPU | 34.87 | 310.17 | 403.29 | 477.79 | 122.94 | 683.25 |