As artificial intelligence systems become deeply woven into the fabric of modern enterprise operations, a stark and concerning reality has emerged: the speed of AI adoption has dramatically outpaced our ability to secure it. Organizations are racing to leverage the transformative power of AI, yet they are doing so with security paradigms designed for a previous era of technology. These fragmented, siloed approaches are fundamentally insufficient to address the complex, dynamic, and often unpredictable risks inherent in today’s AI. This growing chasm between implementation and readiness creates a significant vulnerability, leaving businesses exposed to a new class of sophisticated threats that conventional cybersecurity models cannot comprehend, let alone defend against. The urgent need for a new, holistic framework—one that provides a unified language and a comprehensive strategy for managing the full spectrum of AI-related risks—has never been more critical for navigating this new technological frontier securely and responsibly.
Bridging the Gap: Unifying Security and Safety
The Problem with Siloed Approaches
The current state of AI security is one of dangerous fragmentation, a reality highlighted by recent industry research which revealed that a mere 29 percent of companies feel adequately prepared to defend against AI-related threats. This lack of confidence stems from a fundamental mismatch between old security models and new technological realities. While many business leaders possess a strong grasp of traditional cybersecurity, they find the nuances of AI security to be a foreign and intimidating landscape. The technology’s capacity for emergent behavior, its poorly understood failure modes, and its unpredictable interactions with its environment create a risk profile that defies conventional analysis. This uncertainty is exacerbated by a reliance on a patchwork of disparate security resources. Frameworks like MITRE ATLAS, the NIST Adversarial Machine Learning taxonomy, and various OWASP Top 10 lists have provided valuable insights, but each addresses only a narrow slice of the overall risk. This piecemeal approach fails to deliver the cohesive, end-to-end understanding necessary for deploying enterprise-grade AI systems with confidence, leaving critical gaps in an organization’s defensive posture.
This siloed defensive strategy is particularly ineffective because adversaries do not confine their attacks to neatly defined categories or isolated stages of the AI lifecycle. A sophisticated threat actor might begin by subtly poisoning a dataset during the pre-processing stage, later exploit a vulnerability in the model’s interaction with an external tool, and ultimately cause the system to generate harmful, misleading, or malicious output. This entire attack chain cuts across what organizations often treat as separate domains: data integrity, runtime security, and content safety. By failing to see the interconnectedness of these areas, organizations are effectively blind to the full scope of the threat. An effective defense must mirror the integrated nature of the attacks themselves. A strategy that compartmentalizes risk—treating supply chain security as separate from agentic behavior, or content moderation as distinct from adversarial machine learning—is destined to fail. To adequately protect complex AI ecosystems, a new paradigm is required, one that dissolves these artificial boundaries and provides a unified view of the entire threat landscape.
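To make the cross-domain nature of such a campaign concrete, the short sketch below models the chain described above and shows how a control that watches only one domain observes a single step of a multi-stage attack. It is illustrative only; the stage and domain labels are assumptions, not entries from any specific taxonomy.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AttackStep:
    lifecycle_stage: str   # where in the AI lifecycle the step occurs
    domain: str            # the security silo that would normally "own" it
    description: str

# The attack chain from the text: poisoning -> tool exploitation -> harmful output.
chain = [
    AttackStep("data_preprocessing", "data_integrity",   "Subtly poison training records"),
    AttackStep("deployment",         "runtime_security",  "Exploit the model's external tool integration"),
    AttackStep("inference",          "content_safety",    "Induce harmful or misleading output"),
]

def visible_steps(chain, monitored_domain):
    """Return only the steps a single-domain (siloed) control would observe."""
    return [step for step in chain if step.domain == monitored_domain]

# A team watching only runtime security sees one step of a three-step campaign.
print(len(visible_steps(chain, "runtime_security")), "of", len(chain), "steps visible")
```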
A New Integrated Vision
The foundation of a modern and effective AI defense strategy is the deliberate dismantling of the artificial barrier that has long separated the disciplines of “AI security” and “AI safety.” Rather than viewing them as parallel but distinct tracks, this new paradigm treats them as two inseparable and complementary dimensions of a single, unified risk model. This holistic perspective is not merely a conceptual shift; it is a direct response to the operational realities of AI threats. In practice, a security compromise frequently serves as the catalyst for a safety failure. For instance, an attacker might use a technique like prompt injection—a clear security breach—to trick a large language model into bypassing its own ethical guidelines, resulting in the generation of harmful content, which constitutes a safety failure. Similarly, a security attack that corrupts a model’s training data could lead to biased or unreliable outputs, a critical safety concern. This intrinsic link between security exploits and safety outcomes necessitates an integrated approach to both assessment and mitigation.
To operationalize this unified vision, it is essential to work from clear and comprehensive definitions. Within this framework, AI Security is defined as the discipline focused on ensuring AI accountability and protecting AI systems from unauthorized use, availability attacks, and integrity compromises throughout the entire AI lifecycle. It encompasses the technical measures taken to harden the system against malicious actors. Complementing this is AI Safety, which is defined as the discipline focused on ensuring that AI systems behave in a manner that is ethical, reliable, fair, transparent, and consistently aligned with human values and intentions. It addresses the inherent risks of the AI’s behavior, even in the absence of a malicious attack. By weaving these two disciplines together, organizations can move beyond a purely defensive posture. They can begin to build AI systems that are not only resilient against external threats but are also inherently robust, responsible, and fundamentally worthy of the trust placed in them by users, customers, and society at large. This integrated approach is the only way to achieve true AI assurance in an increasingly complex world.
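One way to operationalize this pairing is to record both dimensions on every finding rather than filing it under security or safety alone. The sketch below is a minimal illustration: the property sets paraphrase the definitions above, and the example finding, drawn from the prompt-injection scenario, is hypothetical.

```python
from dataclasses import dataclass, field

# Assumed property lists, paraphrasing the definitions in the text.
SECURITY_PROPERTIES = {"accountability", "availability", "integrity", "authorized_use"}
SAFETY_PROPERTIES = {"ethical", "reliable", "fair", "transparent", "aligned"}

@dataclass
class Finding:
    title: str
    security_impact: set = field(default_factory=set)  # security properties violated
    safety_impact: set = field(default_factory=set)    # safety properties violated

    def validate(self):
        assert self.security_impact <= SECURITY_PROPERTIES
        assert self.safety_impact <= SAFETY_PROPERTIES

# The prompt-injection example from the text: one event, both dimensions.
finding = Finding(
    title="Prompt injection bypasses content guidelines",
    security_impact={"integrity", "authorized_use"},   # the exploit itself
    safety_impact={"aligned", "ethical"},              # the harmful output it enables
)
finding.validate()
```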
The Five Pillars of the New Framework
Holistic and Lifecycle-Aware Design
A defining characteristic of this new security paradigm is its deep anchoring in the complete AI lifecycle, from initial conception to eventual retirement. It moves beyond the static, point-in-time assessments that characterize traditional security, acknowledging that AI risks are fluid and context-dependent. A vulnerability or threat that may seem insignificant during the data collection and preprocessing stages can morph into a critical exposure once the model is deployed and integrated into a broader operational ecosystem. For example, a minor flaw in data validation might be negligible in a sandboxed development environment, but it could become a catastrophic security hole when the live model is granted access to external APIs, databases, or other AI agents, allowing an attacker to exfiltrate sensitive information or manipulate critical business processes. By mapping potential threats and harms across this entire journey—from data sourcing and model training to deployment, monitoring, and fine-tuning—the framework empowers organizations to implement sophisticated, multi-layered defense-in-depth strategies that anticipate how risks emerge and transform at each distinct stage of an AI system’s existence and mitigate them accordingly.
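As a concrete illustration of lifecycle-aware threat mapping, the short sketch below indexes candidate threats by lifecycle stage and flags stages with no recorded mitigation. The stage names and threat entries are illustrative assumptions, not the framework's actual content.

```python
# Hypothetical lifecycle stages and example threats; illustrative only.
LIFECYCLE_THREAT_MAP = {
    "data_sourcing": ["data poisoning", "licensing/provenance gaps"],
    "training":      ["backdoored checkpoints", "compromised training pipeline"],
    "deployment":    ["prompt injection", "tool/API abuse"],
    "monitoring":    ["drift masking an ongoing attack", "log tampering"],
    "fine_tuning":   ["alignment erosion via poisoned feedback"],
}

def review_gaps(mitigations: dict[str, list[str]]) -> dict[str, list[str]]:
    """Return threats with no recorded mitigation, keyed by lifecycle stage."""
    gaps = {}
    for stage, threats in LIFECYCLE_THREAT_MAP.items():
        missing = [t for t in threats if t not in mitigations.get(stage, [])]
        if missing:
            gaps[stage] = missing
    return gaps

# Example: a team that has only addressed deployment-time prompt injection.
print(review_gaps({"deployment": ["prompt injection"]}))
```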
This comprehensive approach is also designed to contend with the increasing complexity of modern AI systems, which are rapidly evolving from monolithic models into sophisticated, collaborative ecosystems. The framework explicitly accounts for the emergent risks associated with multi-agent orchestration, where multiple autonomous AI agents interact to achieve a common goal. It provides taxonomies for threats related to inter-agent communication protocols, shared memory architectures, and collaborative decision-making processes, which are novel attack surfaces largely invisible to older security models. Furthermore, the framework is built with the understanding that AI is now inherently multimodal. Threats are no longer confined to malicious text prompts; they can manifest through a variety of input types, including cleverly crafted images designed to trigger unintended behavior, manipulated audio commands that bypass security filters, corrupted code snippets that execute malicious payloads, or even subtle signals embedded in sensor data for cyber-physical systems. By treating these diverse attack pathways with consistent rigor, this paradigm ensures comprehensive security for the advanced multimodal systems being deployed in high-stakes environments like autonomous vehicles, medical diagnostics, and real-time financial monitoring.
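One way to apply consistent rigor across modalities is to fail closed on any input whose modality lacks a registered scanner. The sketch below illustrates that pattern with placeholder scanner functions; the function names and modality list are assumptions for illustration, not part of the framework.

```python
from typing import Callable

# Placeholder scanners that always pass; real implementations would wrap
# actual detection models for each modality.
def scan_text(data: bytes) -> bool:  return True   # e.g. prompt-injection classifier
def scan_image(data: bytes) -> bool: return True   # e.g. adversarial-perturbation check
def scan_audio(data: bytes) -> bool: return True   # e.g. hidden-command detection

SCANNERS: dict[str, Callable[[bytes], bool]] = {
    "text": scan_text,
    "image": scan_image,
    "audio": scan_audio,
}

def admit(modality: str, data: bytes) -> bool:
    """Admit an input only if a scanner exists for its modality and it passes."""
    scanner = SCANNERS.get(modality)
    if scanner is None:
        # Fail closed: an unscanned modality is an unguarded attack surface.
        return False
    return scanner(data)
```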
A Framework for Everyone
For any security framework to be truly effective, it must be more than just a technical document; it must serve as a unifying tool that is accessible, understandable, and actionable for a wide range of stakeholders across an organization. This new paradigm is intentionally structured to function as an “audience-aware security compass,” using a hierarchical design that allows different teams to engage with it at a level of detail appropriate to their specific roles and responsibilities. At the highest level, executives and board members can focus on “attacker objectives,” which are broad categories of harm like goal hijacking or data privacy violations. These objectives map directly to core business concerns such as financial exposure, regulatory compliance, and reputational damage, allowing leadership to make informed, strategic decisions about risk appetite and resource allocation. Moving down a layer, security leaders and architects can concentrate on the “techniques” adversaries use to achieve those objectives, enabling them to design and implement overarching defensive strategies.
Further down the hierarchy, the framework provides the granularity needed by hands-on practitioners. AI engineers and data scientists can drill down into specific “subtechniques,” gaining a deep understanding of the precise mechanisms behind different attacks so they can build more robust and resilient models. At the most detailed level, AI red teams and threat intelligence analysts can explore specific “procedures,” which are concrete, real-world implementations of attacks. This allows them to build, test, and validate defenses against known and emerging threats. This multi-layered structure does more than just organize information; it creates a shared conceptual model and a common language for discussing AI risk. It fosters a critical alignment between AI developers, end-users, business leaders, security practitioners, and governance teams—an alignment that has been conspicuously absent in the industry. By enabling clearer communication and a unified understanding of threats, the framework empowers the entire organization to work cohesively toward the shared goal of secure and responsible AI adoption.
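Read this way, the hierarchy doubles as a simple lookup from role to taxonomy layer, which is one lightweight way a tool might present an audience-appropriate view. The mapping below paraphrases the roles described above and is illustrative rather than normative.

```python
# Illustrative role-to-layer mapping based on the roles described above.
AUDIENCE_LAYER = {
    "executive":          "objectives",     # business harm and risk appetite
    "security_architect": "techniques",     # defensive strategy design
    "ai_engineer":        "subtechniques",  # attack mechanics and model hardening
    "red_team":           "procedures",     # concrete, testable attack implementations
}

def layer_for(role: str) -> str:
    return AUDIENCE_LAYER.get(role, "objectives")  # default to the broadest view
```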
Inside the Threat Taxonomy
A Multi-Layered Approach to Threats
At the core of this advanced security framework lies a meticulously detailed and multi-layered taxonomy of AI threats, engineered to provide a logical and traceable path from high-level adversarial motivations down to the specific, real-world implementations of an attack. This structure is methodically organized into four distinct and interconnected layers, creating a comprehensive map of the threat landscape. The top layer consists of Objectives, which represent the “why” behind an attack—the adversary’s ultimate goal, such as causing a denial of service, escalating privileges, or manipulating a system’s output. Below this are the Techniques, which describe the general “how” of an attack, outlining the overarching method used to achieve a particular objective. The next level of granularity is Subtechniques, which detail specific, nuanced variations of a technique, reflecting the different ways an attack can be executed. Finally, the most detailed layer is Procedures, which document concrete, observed-in-the-wild implementations of subtechniques, providing actionable intelligence for threat hunters and red teams.
This rich, hierarchical model provides unparalleled clarity and operational value for security teams. The framework identifies 19 distinct attacker objectives, derived from a combination of observed real-world threats and forward-looking research into technically feasible attacks. These objectives include well-known exploits like jailbreaks and harmful content generation, as well as more sophisticated threats like communication compromise between AI agents and cyber-physical manipulation. To provide the specificity needed for effective defense, the taxonomy then maps these high-level goals to over 150 granular techniques and subtechniques. This includes methods such as direct and indirect prompt injections, multi-agent collusion, memory corruption in AI systems, sophisticated supply chain tampering, and the exploitation of tools and APIs connected to the AI. This level of detail is essential for understanding the complexity of modern AI ecosystems, where a single malicious prompt can propagate across multiple agents and services, or a single compromised dependency in the supply chain can backdoor an entire fleet of models.
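To see how the four layers fit together in practice, the sketch below models a small fragment of such a taxonomy as a tree and traces each leaf back to the objective it serves. It is a minimal illustration: the node structure is an assumption, and the jailbreak and prompt-injection entries are borrowed from the examples above rather than taken from the framework's actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    layer: str   # "objective" | "technique" | "subtechnique" | "procedure"
    name: str
    children: list = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        self.children.append(child)
        return child

# Example fragment: one objective with two subtechniques under one technique.
root = Node("objective", "Jailbreak / harmful content generation")
technique = root.add(Node("technique", "Prompt injection"))
technique.add(Node("subtechnique", "Direct prompt injection"))
technique.add(Node("subtechnique", "Indirect prompt injection"))

def trace(node: Node, path=()):
    """Yield every leaf with the full objective-to-procedure path above it."""
    path = path + (node.name,)
    if not node.children:
        yield path
    for child in node.children:
        yield from trace(child, path)

for p in trace(root):
    print(" -> ".join(p))
```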
Specialized and Embedded Taxonomies
A key innovation of this framework is the direct integration of a robust safety taxonomy within its broader security structure, effectively codifying the principle that security and safety are inseparable. This embedded taxonomy meticulously outlines 25 distinct categories of harmful content and behaviors, creating a direct and traceable link between a technical security exploit and its potential real-world impact. The categories cover a wide spectrum of harms, from the misuse of AI for cybersecurity attacks and the compromise of intellectual property to severe privacy violations, the generation of deceptive information, and other critical safety failures. By weaving these safety considerations directly into the threat model, organizations are empowered to move beyond simply preventing unauthorized access and begin to holistically manage the full range of potential negative outcomes, ensuring that their defenses are aligned with both technical resilience and ethical responsibility. This integration ensures that the ultimate consequence of an attack is never overlooked.
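Embedding the safety taxonomy in this way means that each security technique can carry explicit pointers to the harm categories it may produce, so triage surfaces real-world impact alongside technical mechanism. The sketch below illustrates such a mapping; the harm labels loosely paraphrase the examples above and are not the framework's actual 25 categories.

```python
# Illustrative links from security techniques to harm categories; labels are
# paraphrased from the examples in the text, not the framework's own entries.
TECHNIQUE_HARMS = {
    "prompt_injection":        ["deceptive_information", "privacy_violation"],
    "training_data_poisoning": ["unreliable_or_biased_output"],
    "tool_api_exploitation":   ["cybersecurity_misuse", "intellectual_property_compromise"],
}

def potential_harms(observed_techniques: list[str]) -> set[str]:
    """Aggregate the harm categories implied by a set of observed techniques."""
    harms = set()
    for technique in observed_techniques:
        harms.update(TECHNIQUE_HARMS.get(technique, []))
    return harms

print(potential_harms(["prompt_injection", "tool_api_exploitation"]))
```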
In addition to this core integration, the framework incorporates several specialized taxonomies designed to address specific, high-risk areas within the rapidly evolving AI ecosystem. Recognizing that modern AI agents frequently interact with external tools and other agents, it includes dedicated threat models for these communication channels. A comprehensive Model Context Protocol (MCP) taxonomy identifies 14 distinct threat types related to how large language models interact with external tools and prompts, while a corresponding Agent-to-Agent (A2A) taxonomy details 17 threat types that can arise during inter-agent communication, such as impersonation, tampering, or misuse of shared resources. Furthermore, acknowledging that the integrity of an AI system begins long before deployment, the framework features a specialized supply chain taxonomy that covers 22 distinct threats in this critical area, from compromised training data to malicious code in third-party libraries. These specialized taxonomies not only exist as standalone resources for deep analysis but are also integrated into security products and open-source tools, providing a practical, actionable foundation for securing the entire end-to-end AI pipeline.
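Because these specialized taxonomies are meant to serve both as standalone references and as inputs to tooling, one practical pattern is a small registry that security tools can query by attack surface. The sketch below captures that idea; the threat counts come from the text, while the example entries and the registry interface are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SpecializedTaxonomy:
    surface: str        # the attack surface the taxonomy covers
    threat_count: int   # number of threat types, per the text
    examples: tuple     # illustrative entries only

REGISTRY = {
    "mcp": SpecializedTaxonomy("Model Context Protocol", 14,
                               ("malicious tool description", "prompt smuggling via tool output")),
    "a2a": SpecializedTaxonomy("Agent-to-Agent communication", 17,
                               ("agent impersonation", "shared-resource misuse")),
    "supply_chain": SpecializedTaxonomy("AI supply chain", 22,
                                        ("compromised training data", "malicious third-party library")),
}

def threats_for(surface: str) -> SpecializedTaxonomy | None:
    """Look up the specialized taxonomy covering a given attack surface."""
    return REGISTRY.get(surface)
```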
Charting a Secure Path Forward in the Age of AI
This integrated framework for AI security and safety represents one of the most complete and forward-looking approaches to securing artificial intelligence. Its central premise is that effective AI security cannot be achieved without a holistic, integrated, and lifecycle-aware strategy, one that decisively unifies the traditionally separate domains of security and safety. By providing a shared vocabulary and a multi-layered, audience-aware structure, it equips organizations with the clarity needed to navigate an increasingly complex and perilous AI threat landscape. The framework is not a purely theoretical construct: its principles and taxonomies are directly integrated into platforms such as the Cisco AI Defense system, which connects identified threats to actionable indicators and concrete mitigation strategies. That practical grounding underscores a broader call to action for the wider community: to deepen collective awareness and collaborate on strengthening defenses against this novel and rapidly evolving ecosystem of AI threats.
