# Randomized Gumbel Watermarking
A variant of the Gumbel watermark that introduces controlled randomness to enhance output diversity while maintaining detectability.
Example code: 📓 View On GitHub
Paper: Verma et al. (2025) - Watermarking Degrades Alignment in Language Models: Analysis and Mitigation
## Overview
Standard Gumbel watermarking selects tokens deterministically (argmax), so for a fixed watermark key and n-gram context it always produces the same token. This limits response diversity across repeated generations of the same prompt. The randomized variant addresses that limitation by introducing a second stage of randomness.
## Key Difference
The standard Gumbel method computes

$$
x_t = \operatorname*{arg\,max}_{v \in V} \; u_v^{1/p_v},
$$

where \(p_v\) is the model's probability for token \(v\) and \(u_v \in (0, 1)\) is a pseudorandom uniform derived from the watermark key and the preceding n-gram context. The randomized variant instead uses

$$
\Pr(x_t = v) \;\propto\; u_v^{1/p_v}.
$$

After computing the transformed scores \(u_v^{1/p_v}\), instead of selecting the maximum deterministically, this method samples from a multinomial distribution in which the scores serve as unnormalized probabilities. This introduces controlled randomness while preserving the watermark signal.
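To make the two selection rules concrete, here is a minimal NumPy sketch. It is illustrative only: `gumbel_select` and its arguments are not part of the library's API.

```python
import numpy as np

def gumbel_select(p, u, randomized=False, rng=None):
    """Pick the next token from the transformed scores u_v**(1/p_v).

    p: model token probabilities, shape (V,)
    u: pseudorandom uniforms in (0, 1), derived in practice from the
       watermark key and the preceding n-gram context, shape (V,)
    """
    scores = u ** (1.0 / np.clip(p, 1e-12, None))  # u_v^{1/p_v}
    if not randomized:
        # Standard Gumbel: deterministic argmax, so a fixed key and
        # context always yield the same token.
        return int(np.argmax(scores))
    # Randomized variant: treat the scores as unnormalized probabilities
    # and sample from the resulting multinomial distribution.
    rng = rng or np.random.default_rng()
    return int(rng.choice(len(p), p=scores / scores.sum()))
```

With a fixed key and context, the standard rule always returns the same token, while the randomized rule can return different tokens on repeated calls; that second stage of sampling is the source of the extra diversity.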
## Properties
- **Distortion:** Introduces slight distortion (no longer strictly distortion-free)
- **Diversity:** Enhanced output diversity compared to standard Gumbel watermarking
- **Detectability:** Maintains robust watermark detection (Verma et al., 2025); see the detection sketch after this list
- **Use case:** Designed for applications requiring alignment recovery via best-of-N sampling, as sketched after the example code below
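On detectability: the paper reports that the randomized variant remains detectable. As a rough illustration (not the library's detection API), the standard Gumbel detection statistic recomputes each generated token's uniform \(u_{x_t}\) from the key and context and sums \(-\log(1 - u_{x_t})\); on unwatermarked text each term is Exp(1), so a p-value follows from the Gamma distribution:

```python
import numpy as np
from scipy.stats import gamma

def gumbel_pvalue(us):
    """us: recomputed uniforms u_{x_t}, one per observed token.

    Under H0 (no watermark) each -log(1 - u) is Exp(1), so the sum of n
    terms is Gamma(n, 1). Watermarked text biases u toward 1, pushing
    the statistic up and the p-value down.
    """
    s = -np.log(1.0 - np.asarray(us, dtype=float))
    return float(gamma.sf(s.sum(), a=len(us)))
```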
## Paper Reference
Verma, A., Phan, N., & Trivedi, S. (2025). Watermarking Degrades Alignment in Language Models: Analysis and Mitigation. arXiv preprint arXiv:2506.04462. https://arxiv.org/pdf/2506.04462
## Example Code
```python
import os

# Run the vLLM V1 engine in-process (no multiprocessing).
os.environ["VLLM_USE_V1"] = "1"
os.environ["VLLM_ENABLE_V1_MULTIPROCESSING"] = "0"

from vllm import LLM, SamplingParams
from vllm_watermark.core import WatermarkedLLMs, WatermarkingAlgorithm

# Load the vLLM model
llm = LLM(
    model="meta-llama/Llama-3.2-1B",
    enforce_eager=True,
    max_model_len=1024,
)

# Create a Randomized Gumbel watermarked LLM
wm_llm = WatermarkedLLMs.create(
    model=llm,
    algo=WatermarkingAlgorithm.OPENAI_DR,
    seed=42,   # watermark key
    ngram=2,   # context width used to derive the pseudorandom uniforms
)

# Generate watermarked text with enhanced diversity
prompts = ["Write about machine learning applications"]
sampling_params = SamplingParams(temperature=1.0, top_p=0.95, max_tokens=64)
outputs = wm_llm.generate(prompts, sampling_params)

for output in outputs:
    print(output.outputs[0].text)
```
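Because repeated generations of the same prompt now differ, the randomized variant pairs naturally with best-of-N sampling for alignment recovery. The sketch below is illustrative: it reuses `wm_llm` and `sampling_params` from the example above, and `score` is a hypothetical stand-in for whatever reward or preference model you use.

```python
# Hypothetical best-of-N alignment recovery (sketch, not library API).
def score(text: str) -> float:
    # Placeholder heuristic; substitute a real reward/preference model.
    return float(len(text))

N = 8
prompt = "Write about machine learning applications"
outputs = wm_llm.generate([prompt] * N, sampling_params)  # N diverse watermarked samples
candidates = [o.outputs[0].text for o in outputs]
print(max(candidates, key=score))  # keep the highest-scoring watermarked response
```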