OpenAI/Gumbel Watermarking
The Gumbel watermarking scheme by Aaronson (2023) leverages the Gumbel-Max trick for deterministic token selection while preserving the original sampling distribution.
Example code: View On GitHub
Author: Scott Aaronson
Theory
The Gumbel watermark uses the Gumbel-Max trick for deterministic token selection by hashing the preceding \(h\) tokens with a key \(k\) to generate scores \(r_t\) for each token in the vocabulary at timestep \(t\).
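To make this concrete, here is a minimal sketch of how hash-derived scores could be produced. The SHA-256 seeding scheme, the helper name context_uniforms, and its signature are illustrative assumptions, not the library's actual implementation.

import hashlib

import numpy as np


def context_uniforms(prev_tokens, key, vocab_size, h=2):
    """Hash the last h token ids together with the secret key into a seed,
    then expand that seed into |V| pseudo-random uniforms (the scores r_t)."""
    context = ",".join(str(t) for t in prev_tokens[-h:])
    digest = hashlib.sha256(f"{key}:{context}".encode()).digest()
    rng = np.random.default_rng(int.from_bytes(digest[:8], "little"))
    return rng.random(vocab_size)


# The same key and the same h-token context always reproduce the same scores:
u = context_uniforms([17, 42], key=1234, vocab_size=8)
assert np.allclose(u, context_uniforms([17, 42], key=1234, vocab_size=8))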
The Classical Gumbel-Max Formulation
In its classical form, the Gumbel-Max trick samples tokens according to:

\[ x = \arg\max_{v \in V} \left[ \log p_v + G_v \right] \]
where \(p_v\) is the probability of token \(v\) from the language model, and \(G_v \sim \text{Gumbel}(0, 1)\) are i.i.d. Gumbel-distributed random variables.
For watermarking, we replace the random Gumbel noise with deterministic pseudo-random values generated from a hash of the context. Using uniform random variables \(u_v \sim \text{Uniform}([0, 1])\), we can generate Gumbel noise via the transformation \(G_v = -\log(-\log(u_v))\). This gives us:

\[ x = \arg\max_{v \in V} \left[ \log p_v - \log(-\log u_v) \right] \]
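As a sanity check, the following self-contained simulation (toy three-token distribution, NumPy only) confirms that Gumbel noise generated from uniforms this way reproduces \(p_v\):

import numpy as np

rng = np.random.default_rng(0)
p = np.array([0.6, 0.3, 0.1])  # toy model distribution over three tokens

counts = np.zeros(3)
for _ in range(100_000):
    u = rng.random(3)                      # u_v ~ Uniform([0, 1])
    g = -np.log(-np.log(u))                # G_v = -log(-log(u_v)) ~ Gumbel(0, 1)
    counts[np.argmax(np.log(p) + g)] += 1  # arg max_v [log p_v + G_v]

print(counts / counts.sum())  # empirically close to [0.6, 0.3, 0.1]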
Implementation via an Equivalent Form
The expression above can be simplified. Note that it is mathematically equivalent to:

\[ x = \arg\max_{v \in V} \, u_v^{1/p_v} \]
This is what the implementation uses. We prove the equivalence below.
Mathematical Proof of Equivalence
Starting with the Gumbel-Max formulation:

\[ \arg\max_{v} \left[ \log p_v - \log(-\log u_v) \right] \]

We can manipulate this step by step, using that \(\log\) is increasing (so it can be dropped inside an arg max) and that \(-\log u_v > 0\) for \(u_v \in (0, 1)\) (so taking reciprocals turns the arg max into an arg min):

\[
\begin{aligned}
\arg\max_{v} \left[ \log p_v - \log(-\log u_v) \right]
&= \arg\max_{v} \log \frac{p_v}{-\log u_v} \\
&= \arg\max_{v} \frac{p_v}{-\log u_v} \\
&= \arg\min_{v} \frac{-\log u_v}{p_v} \\
&= \arg\max_{v} \frac{\log u_v}{p_v} \\
&= \arg\max_{v} \log u_v^{1/p_v} \\
&= \arg\max_{v} u_v^{1/p_v}
\end{aligned}
\]
Therefore, computing \(\text{arg max}_v [u_v^{1/p_v}]\) is equivalent to the Gumbel-Max trick. This simplification has several advantages:
Computational Efficiency: Avoids nested logarithms
Numerical Stability: The form \(u^{1/p}\) is more stable than \(-\log(-\log(u))\)
Implementation Simplicity: Fewer operations and clearer code
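The equivalence is also easy to confirm numerically. This sketch checks that, for shared uniforms \(u\), both formulations select the same token:

import numpy as np

rng = np.random.default_rng(1)
for _ in range(10_000):
    p = rng.dirichlet(np.ones(20))  # random distribution over 20 tokens
    u = rng.random(20)              # shared uniforms for both forms
    gumbel_pick = np.argmax(np.log(p) - np.log(-np.log(u)))
    simple_pick = np.argmax(u ** (1.0 / p))
    assert gumbel_pick == simple_pick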
Implementation Details
The algorithm proceeds as follows:
Generate deterministic random values: Hash the previous \(h\) tokens to obtain a seed, then generate \(|V|\) uniform random values \(\{u_v\}\)
Apply nucleus (top-p) sampling: Filter to keep only the top-p probability mass
Compute the transformed scores: For each token \(v\) in the filtered vocabulary, compute \(u_v^{1/p_v}\)
Token selection: Choose \(\text{arg max}_v [u_v^{1/p_v}]\)
This preserves the original sampling distribution while ensuring deterministic outputs.
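A compact sketch of a single decoding step under these assumptions follows; the function name watermarked_step and its nucleus-filter details are illustrative, not the library's code, and the uniforms \(u\) would come from the context hash as sketched earlier.

import numpy as np


def watermarked_step(probs, u, top_p=0.95):
    # 1) Nucleus filter: keep the smallest prefix of tokens, sorted by
    #    probability, whose cumulative mass reaches top_p.
    order = np.argsort(probs)[::-1]
    cutoff = int(np.searchsorted(np.cumsum(probs[order]), top_p)) + 1
    keep = order[:cutoff]
    # 2) Renormalize the surviving probabilities.
    p = probs[keep] / probs[keep].sum()
    # 3) Transformed scores u_v^(1/p_v); the arg max is the next token.
    return int(keep[np.argmax(u[keep] ** (1.0 / p))])


rng = np.random.default_rng(2)
probs = rng.dirichlet(np.ones(10))  # stand-in for the model's next-token distribution
u = rng.random(10)                  # stand-in for the hash-derived uniforms
print(watermarked_step(probs, u))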
Interpretation
The transformation \(u_v^{1/p_v}\) balances probability and randomness:
For high-probability tokens (\(p_v\) large), \(1/p_v\) is small, so \(u_v^{1/p_v}\) remains close to \(u_v\)
For low-probability tokens (\(p_v\) small), \(1/p_v\) is large, so \(u_v^{1/p_v}\) can be significantly boosted if \(u_v\) is large
High-probability tokens tend to win without needing large \(u_v\) values, but occasionally a lower-probability token with a favorable \(u_v\) can prevail. This reproduces sampling from the original distribution.
The Gumbel-Max trick adds controlled randomness that, when combined with the true probabilities, produces samples from the correct distribution. The implementation achieves this without explicitly computing Gumbel random variables.
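A quick Monte Carlo check (toy four-token distribution, with fresh uniforms each step, as happens when contexts do not repeat) shows that the arg max rule selects token \(v\) with probability \(p_v\):

import numpy as np

rng = np.random.default_rng(3)
p = np.array([0.5, 0.25, 0.15, 0.1])

counts = np.zeros(4)
for _ in range(200_000):
    u = rng.random(4)                       # fresh uniforms, as with unseen contexts
    counts[np.argmax(u ** (1.0 / p))] += 1  # arg max_v u_v^(1/p_v)

print(counts / counts.sum())  # empirically close to [0.5, 0.25, 0.15, 0.1]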
Detection
The detection score is computed as:

\[ S = \sum_{t=1}^{n} -\log\left(1 - r_t(x_t)\right) \]

where \(x_t\) is the \(t\)-th generated token and \(r_t(x_t)\) is its hash-derived score. Under the null hypothesis (unwatermarked text), each \(r_t(x_t)\) is an independent \(\text{Uniform}([0, 1])\) draw, so \(S\) follows a gamma distribution \(\Gamma(n, 1)\); watermarked text systematically inflates \(S\), enabling statistical detection via a one-sided test.
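A minimal detection sketch under the same illustrative assumptions as above (SHA-256 seeding and hypothetical helper names, not the library's API); scipy.stats.gamma.sf supplies the survival function of \(\Gamma(n, 1)\), i.e. the p-value:

import hashlib

import numpy as np
from scipy.stats import gamma


def context_uniforms(prev_tokens, key, vocab_size, h=2):
    # Re-derive the per-step scores exactly as the generator did (sketch).
    context = ",".join(str(t) for t in prev_tokens[-h:])
    digest = hashlib.sha256(f"{key}:{context}".encode()).digest()
    rng = np.random.default_rng(int.from_bytes(digest[:8], "little"))
    return rng.random(vocab_size)


def detection_pvalue(tokens, key, vocab_size, h=2):
    """Accumulate -log(1 - r_t(x_t)); unwatermarked text gives S ~ Gamma(n, 1)."""
    score, n = 0.0, 0
    for t in range(h, len(tokens)):
        u = context_uniforms(tokens[:t], key, vocab_size, h)
        score += -np.log(1.0 - u[tokens[t]])  # watermarked tokens tend to have large u
        n += 1
    return float(gamma.sf(score, a=n, scale=1.0))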
Note
This approach preserves the distribution while ensuring consistent output for fixed seeds, though it does reduce response diversity across different generations with the same context.
Paper reference
Aaronson, S. (2023). Gumbel Watermarking. Scott Aaronson's Blog. https://scottaaronson.blog/?p=6823
Example code
import os

os.environ["VLLM_USE_V1"] = "1"
os.environ["VLLM_ENABLE_V1_MULTIPROCESSING"] = "0"

from vllm import LLM, SamplingParams

from vllm_watermark.core import (
    DetectionAlgorithm,
    WatermarkedLLMs,
    WatermarkingAlgorithm,
)
from vllm_watermark.watermark_detectors import WatermarkDetectors

# Load the vLLM model
llm = LLM(
    model="meta-llama/Llama-3.2-1B",
    enforce_eager=True,
    max_model_len=1024,
)

# Create a Gumbel watermarked LLM
wm_llm = WatermarkedLLMs.create(
    model=llm,
    algo=WatermarkingAlgorithm.OPENAI,  # uses Gumbel watermarking
    seed=42,
    ngram=2,
)

# Create a Gumbel detector with matching parameters
detector = WatermarkDetectors.create(
    algo=DetectionAlgorithm.OPENAI_Z,
    model=llm,
    ngram=2,
    seed=42,
    payload=0,
    threshold=0.05,
)

# Generate watermarked text
prompts = ["Write about machine learning applications"]
sampling_params = SamplingParams(temperature=1.0, top_p=0.95, max_tokens=64)
outputs = wm_llm.generate(prompts, sampling_params)

# Detect the watermark in each generation
for output in outputs:
    generated_text = output.outputs[0].text
    detection_result = detector.detect(generated_text)
    print(f"Generated: {generated_text}")
    print(f"Watermarked: {detection_result['is_watermarked']}")
    print(f"P-value: {detection_result['pvalue']:.6f}")
Notes
Uses deterministic token selection via Gumbel-Max trick
Preserves the original sampling distribution when the hashed context length \(h\) is large enough that seeds rarely repeat
Detection based on gamma distribution of accumulated scores
Trade-off between consistency and response diversity
See Also
For applications requiring enhanced output diversity, see Randomized Gumbel Watermarking (Randomized Gumbel), which introduces controlled randomness at the cost of slight distortion.