Beyond GenAI governance: Extending your frameworks for agentic AI

Summary

Agentic AI creates new governance risks because harm emerges across multi-step actions, not individual model outputs, exposing gaps in traditional model risk management and GenAI controls.
Instead of building new governance models, firms must extend their existing frameworks to govern agent behavior, accountability, human intervention, and end-to-end decision paths.

Agentic AI introduces a governance risk that most mature GenAI programs weren’t designed to manage: agents that take approved actions across multiple systems in unsafe sequences. This risk rarely appears as a flawed output. It emerges when an AI system with valid permissions executes a chain of individually acceptable steps that leads to an unintended outcome.

Many financial institutions have built strong AI governance foundations through model risk management (MRM) frameworks and GenAI governance overlays that address risks unique to large language models. MRM frameworks establish inventories, risk tiering, validation, and ongoing oversight. GenAI governance extends that baseline with controls for language-model risks, including hallucinations, bias, prompt manipulation, and third-party exposure. This maturity is a real advantage, but it also creates a false sense of coverage.

Where current frameworks fall short

The most common mistake is treating agentic AI as a more capable language model instead of a different control problem. Agentic systems plan, delegate, and act across tools and time horizons. That autonomy changes where risk emerges and introduces governance gaps that existing controls don’t fully address. It also breaks assumptions embedded in both MRM and GenAI governance, requiring extensions focused on behavior, accountability, and action, not just model outputs.

How agentic AI changes governance approaches

[Figure: How agentic AI changes governance approaches]

4 practical ways to extend governance for agentic AI

1. Extend your model inventory to reflect actions, not architecture.
Agentic systems should be classified based on what they can do, not how they’re built. Tier agents by action scope, system access, and rollback capability. Agents that can take irreversible or high-stakes actions belong in a higher tier. Document tier assignments and the underlying risk rationale in the agent inventory.
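One way to picture action-based tiering is as a small rule over inventory fields. The sketch below is illustrative only: the `AgentRecord` fields, the three tier names, the `wide_access_threshold` cutoff, and the sample agents are all assumptions introduced for the example, not a prescribed schema.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    LOW = 1
    MODERATE = 2
    HIGH = 3


@dataclass(frozen=True)
class AgentRecord:
    """One row in a hypothetical agent inventory."""
    name: str
    can_write: bool        # action scope: does the agent change state or only read?
    systems_accessed: int  # system access: how many systems the agent can touch
    reversible: bool       # rollback capability: can every action be undone?
    rationale: str = ""    # documented reason for the assigned tier


def assign_tier(agent: AgentRecord, wide_access_threshold: int = 3) -> Tier:
    """Tier by what the agent can do, not by how it is built."""
    if agent.can_write and not agent.reversible:
        return Tier.HIGH  # irreversible actions always land in the top tier
    if agent.can_write or agent.systems_accessed >= wide_access_threshold:
        return Tier.MODERATE
    return Tier.LOW


# Hypothetical inventory entries used only to exercise the rule.
inventory = [
    AgentRecord("research-summarizer", can_write=False, systems_accessed=1, reversible=True),
    AgentRecord("limit-adjuster", can_write=True, systems_accessed=2, reversible=True),
    AgentRecord("payments-agent", can_write=True, systems_accessed=4, reversible=False),
]
tiers = {agent.name: assign_tier(agent) for agent in inventory}
```

In practice the rule would likely carry more dimensions (customer impact, data sensitivity), but the point of the sketch is that none of the inputs describe the model architecture.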

2. Redesign human-in-the-loop structure around actions, not outputs.
Group agent actions into three oversight tiers. Autonomous actions are low-stakes, fully reversible actions such as retrieving information, generating internal summaries, or pre-populating templates.

Supervisory actions involve moderate-risk activities such as escalating customer communications, adjusting non-critical limits, or initiating standard workflows. The agent acts, but the action is logged, routed to a human supervisor in near-real time, and reversible within a defined window.

Confirmation-required actions have high-stakes or irreversible outcomes, such as executing payments, restricting accounts, or initiating regulatory submissions. These actions should require explicit human confirmation before execution, regardless of agent confidence, and should be backed by kill-switch controls. Governance must also define how agents transition between these tiers in real time, not just how each tier is established.
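The three action classes above can be sketched as a gate that sits between the agent and the systems it acts on. This is a minimal illustration under stated assumptions: the `ActionGate` class, the `ACTION_TIERS` mapping, and the action names are hypothetical, and the reversal window for supervisory actions is not modeled.

```python
from enum import Enum
from typing import Callable


class Oversight(Enum):
    AUTONOMOUS = "autonomous"
    SUPERVISORY = "supervisory"
    CONFIRMATION_REQUIRED = "confirmation_required"


# Hypothetical mapping from action name to oversight tier.
ACTION_TIERS = {
    "retrieve_information": Oversight.AUTONOMOUS,
    "adjust_noncritical_limit": Oversight.SUPERVISORY,
    "execute_payment": Oversight.CONFIRMATION_REQUIRED,
}


class ActionGate:
    def __init__(self, confirm: Callable[[str], bool]):
        self.confirm = confirm              # asks a human; returns True to approve
        self.kill_switch = False            # when True, nothing executes
        self.review_queue: list[str] = []   # supervisory actions awaiting review

    def run(self, action: str, execute: Callable[[], str]) -> str:
        if self.kill_switch:
            return "blocked: kill switch engaged"
        # Actions missing from the mapping default to the strictest tier.
        tier = ACTION_TIERS.get(action, Oversight.CONFIRMATION_REQUIRED)
        if tier is Oversight.CONFIRMATION_REQUIRED and not self.confirm(action):
            return "blocked: confirmation denied"
        result = execute()
        if tier is Oversight.SUPERVISORY:
            self.review_queue.append(action)  # acted first, routed to a supervisor
        return result


gate = ActionGate(confirm=lambda action: False)  # stand-in reviewer that declines
outcomes = [
    gate.run("retrieve_information", lambda: "summary ready"),
    gate.run("adjust_noncritical_limit", lambda: "limit adjusted"),
    gate.run("execute_payment", lambda: "payment sent"),
]
```

Note that the gate never consults agent confidence, and that unmapped actions fall into the confirmation-required class rather than passing through by default.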

3. Validate behavior, not just responses.
Validation must test decision paths across full workflow scenarios, including edge cases and adversarial conditions. Scenario-based testing and trajectory-focused red-teaming* are critical to ensure that agents behave as intended across sequences of actions and don’t drift following model or prompt updates.
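To make the difference between response-level and trajectory-level validation concrete, the sketch below checks an entire action sequence against ordering rules. The `FORBIDDEN_ORDERINGS` policy and the action names are hypothetical examples; a real validation library would presumably cover far richer scenarios.

```python
from typing import Sequence

# Hypothetical policy: each pair lists two actions that must never occur in
# this order within one session, even with other steps in between.
FORBIDDEN_ORDERINGS = [
    ("change_payee_details", "execute_payment"),
    ("disable_alerting", "restrict_account"),
]


def violations(trajectory: Sequence[str]) -> list[tuple[str, str]]:
    """Return every forbidden ordering that appears in the trajectory."""
    steps = list(trajectory)
    found = []
    for first, second in FORBIDDEN_ORDERINGS:
        if first in steps:
            start = steps.index(first)  # earliest occurrence of the first action
            if second in steps[start + 1:]:
                found.append((first, second))
    return found


# Every step here could be individually approved, yet the sequence is unsafe.
unsafe_run = ["change_payee_details", "retrieve_information", "execute_payment"]
safe_run = ["retrieve_information", "execute_payment", "change_payee_details"]
unsafe_result = violations(unsafe_run)
safe_result = violations(safe_run)
```

Re-running the same scenario set after each model or prompt update is one simple way to surface behavioral drift, since a previously clean trajectory that starts producing violations is a direct signal.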

4. Instrument for agentic observability within existing monitoring.
Agentic systems require additional monitoring capabilities, including full decision traces that capture every tool call, input observed, and action initiated across a session, as well as inter-agent communication logs in multi-agent deployments. This data should strengthen existing monitoring structures, not sit in parallel. The objective is to feed richer, more actionable signals into current risk and oversight processes instead of creating a separate agentic reporting layer.
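A decision trace of the kind described above can be as simple as an ordered list of structured events per session. The sketch below is an assumption-laden illustration: the `TraceEvent` fields, the event kinds, and the JSON Lines output format are choices made for the example, picked because line-delimited JSON is easy to route into existing log and monitoring pipelines.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class TraceEvent:
    step: int
    kind: str    # e.g. "tool_call", "observation", "action", "agent_message"
    actor: str   # which agent produced the event
    detail: dict


@dataclass
class DecisionTrace:
    session_id: str
    events: list[TraceEvent] = field(default_factory=list)

    def record(self, kind: str, actor: str, **detail) -> TraceEvent:
        """Append one event, numbering steps in the order they occur."""
        event = TraceEvent(step=len(self.events) + 1, kind=kind, actor=actor, detail=detail)
        self.events.append(event)
        return event

    def to_json_lines(self) -> str:
        """Serialize the session as one JSON object per line."""
        return "\n".join(
            json.dumps({"session_id": self.session_id, **asdict(event)}, sort_keys=True)
            for event in self.events
        )


trace = DecisionTrace(session_id="session-001")
trace.record("tool_call", "intake-agent", tool="lookup_customer")
trace.record("agent_message", "intake-agent", to="review-agent", summary="limit change requested")
trace.record("action", "review-agent", name="adjust_noncritical_limit")
log_output = trace.to_json_lines()
```

Because each line carries the session identifier and step number, an existing monitoring stack can reconstruct the full sequence without a separate agentic reporting layer.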

Next steps to establish agentic AI governance

If your institution is extending existing MRM and GenAI governance for agentic AI, your priority isn’t redesign but targeted execution. That means your most important next steps are to:

Conduct an agentic inventory gap analysis. Map all active and planned agentic deployments against your existing model inventory. Identify systems that have been misclassified as GenAI models without agentic attributes as well as deployments that have been excluded entirely because they don’t fit existing model definitions.
Extend your risk tiering schema. Update it to include categories based on action sets for agentic systems. Ensure that agents capable of high-stakes actions trigger enhanced governance pathways equivalent to your highest MRM tiers.
Define human-in-the-loop thresholds by action class. Establish clear intervention requirements for each agentic use case based on action severity and reversibility. Document these thresholds in formal use-case risk assessments, not only in technical specifications.
Embed scenario-based validation into approvals. Develop validation libraries that test agent behavior across full decision sequences. Incorporate trajectory testing and agentic red-teaming into existing model validation standards as a required condition for agentic deployment.
Update incident response for agentic failure modes. Expand incident response playbooks to address risks specific to agentic systems, including unauthorized actions, trajectory deviations, privilege escalation, and multi-agent accountability breakdowns.
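The first of these steps, the inventory gap analysis, reduces to a comparison between what is deployed and what the model inventory records. The sketch below assumes a simple representation that is not drawn from any particular inventory tool: deployments map a name to whether the system takes actions, and the inventory maps a name to a classification label such as "genai" or "agentic".

```python
def inventory_gaps(
    deployments: dict[str, bool],
    model_inventory: dict[str, str],
) -> dict[str, list[str]]:
    """Find deployments missing from the inventory or filed without agentic attributes."""
    missing = sorted(name for name in deployments if name not in model_inventory)
    misclassified = sorted(
        name
        for name, takes_actions in deployments.items()
        if takes_actions and model_inventory.get(name) == "genai"
    )
    return {"missing": missing, "misclassified": misclassified}


# Hypothetical data: True means the deployment can take actions in other systems.
deployments = {
    "support-chatbot": False,
    "claims-triage-agent": True,
    "workflow-orchestrator": True,
}
model_inventory = {
    "support-chatbot": "genai",
    "claims-triage-agent": "genai",  # filed as GenAI despite taking actions
}
gaps = inventory_gaps(deployments, model_inventory)
```

The two output lists correspond to the two gap types named in the first step: systems misclassified as plain GenAI models, and deployments excluded from the inventory altogether.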

Why agentic AI governance is now a firm-level judgment call

The April 2026 release of Supervisory Letter SR 26-2, Revised Guidance on Model Risk Management, sharpened expectations while leaving a clear gap. The guidance reinforces core MRM principles but explicitly excludes generative and agentic AI, describing them as “novel and rapidly evolving.” In effect, responsibility for agentic AI governance sits with institutions, not regulators.

For firms with mature governance programs, this creates both opportunity and obligation. The opportunity is to show that existing frameworks have been deliberately extended to address agentic-specific risks. The obligation is to activate those extensions before incidents expose any gaps. Institutions that govern agentic AI well won’t be those that build entirely new frameworks. They’ll be the ones that are precise about what their programs already cover, honest about what they don’t, and disciplined in closing the gaps that agentic systems introduce.

*Trajectory-focused red-teaming is an approach that evaluates the sequence of interactions with a system, rather than isolated actions or prompts. It examines how behavior can evolve over time across different steps or states, with the goal of identifying unsafe or undesirable trajectories before they lead to harmful outcomes.

Let us guide you

Guidehouse is a global AI-led professional services firm delivering advisory, technology, and managed services to the commercial and government sectors. With an integrated business technology approach, Guidehouse drives efficiency and resilience in the healthcare, financial services, energy, infrastructure, and national security markets.
