Operationalizing a Threat Model for Red-Teaming Large Language Models (LLMs)
Systematization of Knowledge
Red-teaming has emerged as a critical technique for identifying vulnerabilities in real-world LLM implementations. In our latest paper, we present a detailed threat model and provide a systematization of knowledge (SoK) of red-teaming attacks on LLMs. We build a taxonomy of attacks organized by the entry points they exploit in the phases...
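To make the idea of phase-indexed entry points concrete, here is a minimal sketch of how such a taxonomy could be represented in code. The phase names, attack names, and entry points below are illustrative assumptions for this post, not the actual categories from the paper.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Phase(Enum):
    """Hypothetical LLM lifecycle phases; the paper's phase names may differ."""
    TRAINING = auto()
    FINE_TUNING = auto()
    RETRIEVAL = auto()
    INFERENCE = auto()


@dataclass(frozen=True)
class Attack:
    """One red-teaming attack, indexed by the phase whose entry point it abuses."""
    name: str
    phase: Phase
    entry_point: str


# Illustrative entries only, not the paper's taxonomy.
TAXONOMY = [
    Attack("data poisoning", Phase.TRAINING, "training corpus"),
    Attack("backdoor insertion", Phase.FINE_TUNING, "instruction-tuning data"),
    Attack("indirect prompt injection", Phase.RETRIEVAL, "retrieved documents"),
    Attack("jailbreak prompting", Phase.INFERENCE, "user prompt"),
]

if __name__ == "__main__":
    # Group attacks by lifecycle phase to show the taxonomy's structure.
    for phase in Phase:
        attacks = [a.name for a in TAXONOMY if a.phase is phase]
        print(f"{phase.name}: {', '.join(attacks) or 'none listed'}")
```

The point of a structure like this is that each attack is keyed to where it enters the pipeline, so defenses can be mapped to the same entry points.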