Permute-and-Flip (PF) Watermarking
The Permute-and-Flip (PF) watermarking scheme uses prefix-free coding principles to embed watermarks while maintaining text quality.
Example code: View On GitHub
Authors: Zhao et al.
Theory
PF watermarking operates by creating a permutation of the vocabulary and selectively "flipping" tokens based on cryptographic keys. The approach leverages prefix-free coding to ensure that the watermark can be detected without knowledge of the exact generation process.
The algorithm maintains a mapping between original tokens and their permuted counterparts, with the permutation determined by a cryptographic hash of the context. During generation, tokens are selectively replaced according to this mapping with probability \(\alpha\).
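The sketch below illustrates this idea in isolation: a permutation of token IDs is derived from a hash of the secret key and the n-gram context, and the sampled token is swapped for its permuted counterpart with probability \(\alpha\). The function names and hashing scheme are illustrative assumptions, not the internals of vllm_watermark.

# Illustrative sketch only -- not the vllm_watermark internals
import hashlib
import random


def context_permutation(context_tokens, key, vocab_size):
    """Derive a deterministic vocabulary permutation from the secret key and context."""
    digest = hashlib.sha256(f"{key}:{context_tokens}".encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    perm = list(range(vocab_size))
    rng.shuffle(perm)
    return perm


def maybe_permute(token_id, context_tokens, key, vocab_size, alpha, rng):
    """With probability alpha, replace the token with its permuted counterpart."""
    if rng.random() < alpha:
        return context_permutation(context_tokens, key, vocab_size)[token_id]
    return token_id


rng = random.Random(0)
print(maybe_permute(token_id=5, context_tokens=(17, 42), key=42,
                    vocab_size=100, alpha=0.1, rng=rng))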
The sampling distribution is modified as a mixture of the original and permuted distributions:
\[
P_{wm}(t) = (1 - \alpha)\, P_{orig}(t) + \alpha\, P_{perm}(t),
\]
where \(P_{orig}(t)\) is the original token probability, \(P_{perm}(t)\) is the probability under the permutation, and \(\alpha \in [0,1]\) controls the watermarking strength.
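As a toy numeric check of this mixture (the numbers and the choice of permuted distribution are made up for illustration), blending a three-token distribution with a permuted version of itself at \(\alpha = 0.1\) still yields a valid probability distribution:

import numpy as np

alpha = 0.1
p_orig = np.array([0.5, 0.3, 0.2])            # original next-token probabilities
perm = [2, 0, 1]                              # keyed permutation of the toy vocabulary
p_perm = p_orig[perm]                         # probabilities under the permutation
p_wm = (1 - alpha) * p_orig + alpha * p_perm  # watermarked sampling distribution
print(p_wm, p_wm.sum())                       # [0.47 0.32 0.21], sums to 1.0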
Detection analyzes how often tokens appear in positions consistent with the permutation pattern, using a statistical test to determine whether the observed rate deviates significantly from what would be expected by chance.
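A minimal sketch of such a test follows, assuming (purely for illustration) that a fixed fraction of tokens would match the permutation pattern by chance; the counting rule and baseline rate are assumptions, not the library's actual statistic.

import math


def pf_style_pvalue(num_consistent, num_tokens, expected_rate):
    """One-sided z-test (normal approximation): is the match rate above chance?"""
    mean = num_tokens * expected_rate
    std = math.sqrt(num_tokens * expected_rate * (1 - expected_rate))
    z = (num_consistent - mean) / std
    return 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail p-value


# e.g. 38 of 64 tokens consistent with the keyed permutation vs. ~25% by chance
print(pf_style_pvalue(num_consistent=38, num_tokens=64, expected_rate=0.25))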
Note
PF watermarking is designed to be robust against various text transformations while maintaining high text quality.
Paper reference
Zhao, X., Li, L., & Wang, Y.-X. (2024). Permute-and-Flip: An optimally robust and watermarkable decoder for LLMs. arXiv preprint arXiv:2402.05864. https://arxiv.org/abs/2402.05864
Example code
import os

# Set vLLM engine options via environment variables before importing vLLM
os.environ["VLLM_USE_V1"] = "1"
os.environ["VLLM_ENABLE_V1_MULTIPROCESSING"] = "0"

from vllm import LLM, SamplingParams
from vllm_watermark.core import (
    DetectionAlgorithm,
    WatermarkedLLMs,
    WatermarkingAlgorithm,
)
from vllm_watermark.watermark_detectors import WatermarkDetectors
# Load the vLLM model
llm = LLM(model="meta-llama/Llama-3.2-1B")
# Create a PF watermarked LLM
wm_llm = WatermarkedLLMs.create(
    llm,
    algo=WatermarkingAlgorithm.PF,
    seed=42,
    ngram=2,
    alpha=0.1,  # Permutation probability
)
# Create PF detector with matching parameters
detector = WatermarkDetectors.create(
    algo=DetectionAlgorithm.PF,
    model=llm,
    ngram=2,
    seed=42,
    alpha=0.1,
    threshold=0.05,
)
# Generate watermarked text
prompts = ["Explain quantum computing in simple terms"]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)
outputs = wm_llm.generate(prompts, sampling_params)
# Detect watermark
for output in outputs:
    generated_text = output.outputs[0].text
    detection_result = detector.detect(generated_text)
    print(f"Generated: {generated_text}")
    print(f"Watermarked: {detection_result['is_watermarked']}")
    print(f"P-value: {detection_result['pvalue']:.6f}")
Notes
Uses prefix-free coding principles for robust watermarking
Parameter \(\alpha\) controls the permutation probability (see the sweep sketch after this list)
Designed to be robust against text transformations
Maintains high text quality with carefully tuned permutation strategies
Detection is based on statistical analysis of token permutation patterns
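A rough way to explore the strength/quality trade-off is to sweep \(\alpha\) and compare detection p-values on the same prompt. The sketch below reuses the API shown in the example above; whether several watermarked wrappers and detectors can share one LLM instance is an assumption of this sketch.

# Sketch: sweep alpha and compare detection p-values (assumes wrappers can share `llm`)
for alpha in (0.05, 0.1, 0.2):
    wm = WatermarkedLLMs.create(llm, algo=WatermarkingAlgorithm.PF, seed=42, ngram=2, alpha=alpha)
    det = WatermarkDetectors.create(
        algo=DetectionAlgorithm.PF, model=llm, ngram=2, seed=42, alpha=alpha, threshold=0.05
    )
    text = wm.generate(prompts, sampling_params)[0].outputs[0].text
    print(f"alpha={alpha}: p-value={det.detect(text)['pvalue']:.6f}")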