Getting started with LLM Guard

Each scanner can be used individually, or multiple scanners can be combined using the scan_prompt and scan_output functions.

Individual

You can import an individual scanner and use it to evaluate the prompt or the output:

from llm_guard.input_scanners import BanTopics

# Scan the user prompt with an input scanner
scanner = BanTopics(topics=["violence"], threshold=0.5)
sanitized_prompt, is_valid, risk_score = scanner.scan(prompt)

And the same pattern for an output scanner:

from llm_guard.output_scanners import Bias

# Scan the model's response, given the prompt that produced it
scanner = Bias(threshold=0.5)
sanitized_output, is_valid, risk_score = scanner.scan(prompt, model_output)
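
Each scan call returns the sanitized text, a boolean telling you whether the text passed the check, and a risk score. A minimal sketch of acting on the result of the prompt scan above (the error handling is illustrative, not part of the LLM Guard API):

# Block the prompt if the scanner flagged it; otherwise continue with the sanitized version
if not is_valid:
    raise ValueError(f"Prompt was flagged by BanTopics, risk score: {risk_score}")

print(f"Sanitized prompt: {sanitized_prompt}")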

Multiple

Info

Scanners are executed in the order they are passed to the scan_prompt function.

For prompt:

from llm_guard import scan_prompt
from llm_guard.input_scanners import Anonymize, PromptInjection, TokenLimit, Toxicity
from llm_guard.vault import Vault

# The Vault stores the values redacted by Anonymize so Deanonymize can restore them later
vault = Vault()
input_scanners = [Anonymize(vault), Toxicity(), TokenLimit(), PromptInjection()]

sanitized_prompt, results_valid, results_score = scan_prompt(input_scanners, prompt)
if any(not result for result in results_valid.values()):
    print(f"Prompt {prompt} is not valid, scores: {results_score}")
    exit(1)

print(f"Prompt: {sanitized_prompt}")

For output:

from llm_guard import scan_output
from llm_guard.output_scanners import Deanonymize, NoRefusal, Relevance, Sensitive

# Reuse the vault from the prompt scanning step so Deanonymize can restore the redacted values
output_scanners = [Deanonymize(vault), NoRefusal(), Relevance(), Sensitive()]

# `response_text` is the LLM's response to the sanitized prompt
sanitized_response_text, results_valid, results_score = scan_output(
    output_scanners, sanitized_prompt, response_text
)
if any(not result for result in results_valid.values()):
    print(f"Output {response_text} is not valid, scores: {results_score}")
    exit(1)

print(f"Output: {sanitized_response_text}\n")

Note

You can set fail_fast to True to stop scanning after the first invalid result. This can reduce scanning latency.
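
A minimal sketch, assuming fail_fast is accepted as a keyword argument by scan_prompt (scan_output works the same way):

# Stop at the first scanner that marks the prompt as invalid instead of running the whole list
sanitized_prompt, results_valid, results_score = scan_prompt(
    input_scanners, prompt, fail_fast=True
)

Because scanners run in the order they are listed, placing the fastest or most critical scanners first gives the biggest latency savings.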