Reading Note: "Safety at Scale: A Comprehensive Survey of Large Model Safety"

Ma et al. "Safety at Scale: A Comprehensive Survey of Large Model Safety". arXiv preprint arXiv:2502.05206 (2025).

Intro

Scope: Vision Foundation Models (VFMs), Large Language Models (LLMs), Vision-Language Pre-training (VLP) models, Vision-Language Models (VLMs), Diffusion Models (DMs), and large-model-based Agents.

Contributions:

Proposing a comprehensive taxonomy of 10 attack types: Adversarial, data poisoning, backdoor, jailbreak, prompt injection, energy-latency, membership inference, model extraction, data extraction, and agent-specific attacks.
Reviewing defense strategies and summarizing commonly used datasets and benchmarks.
Identifying and discussing open challenges: Comprehensive safety evaluations, scalable and effective defense mechanisms, and sustainable data practices.

