Balancing autonomy and accountability in the new age of enterprise AI.
AI agents are here, and they don’t wait for committees.
These autonomous systems, powered by large language models (LLMs), don’t just generate text. They make decisions, take actions, and orchestrate enterprise workflows. The potential upside is massive. So is the risk. As these agents move from proof-of-concept to production, CIOs face a familiar dilemma: how to drive innovation without losing control.
We’ve seen this before. In the 1990s, the early internet unleashed a wave of poorly governed code, clashing standards, and patchwork security. It took years of trial, error, and regulation—and a few headline-grabbing failures—before real governance frameworks emerged. Today’s agent ecosystem feels strikingly similar, in both urgency and uncertainty. But this time, CIOs have the chance to lead with governance from the start, rather than clean up after the fact.
Unlike traditional, deterministic automation tools, agentic systems are goal-directed, adaptive, and autonomous. Given a broad objective, like “optimize inventory” or “triage IT tickets,” an agent can decide how to act and which tools to use, chaining actions and adapting dynamically based on context.
As Accenture explains, these agents can plan and reason, not just follow scripts. They operate across systems, execute workflows, and—most critically—they take action without human oversight.
That autonomy is their superpower, but it also introduces unpredictability, opacity, and risk. An inaccurate output from a chatbot is a nuisance. An inaccurate output from an agent, such as a hallucinated answer, becomes a liability when it triggers a purchase order or edits a customer record. The stakes are higher.
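To make that distinction concrete, here is a deliberately simplified, framework-agnostic sketch of an agent loop in Python. Every name in it (the call_llm stand-in, the tool table, the hard-coded decision) is hypothetical; the point is that the model’s choice is executed immediately as a real side effect, which is exactly where a hallucination stops being a nuisance and becomes a liability.

```python
# Framework-agnostic sketch of an agent loop. `call_llm` and both tools are
# hypothetical stand-ins; a real agent would get its decision from an actual model.

def call_llm(goal: str, history: list) -> dict:
    # Stand-in for a model call that chooses the next action.
    # A hallucinated choice here still gets executed below.
    return {"tool": "create_purchase_order", "args": {"sku": "WIDGET-001", "qty": 500}}

TOOLS = {
    "check_inventory": lambda args: {"WIDGET-001": 12},
    "create_purchase_order": lambda args: f"PO raised: {args['qty']} x {args['sku']}",
}

def run_agent(goal: str, max_steps: int = 3) -> list:
    history = []
    for _ in range(max_steps):
        decision = call_llm(goal, history)                  # the agent decides...
        result = TOOLS[decision["tool"]](decision["args"])  # ...and the runtime acts
        history.append((decision, result))                  # now a real side effect
    return history

if __name__ == "__main__":
    for step in run_agent("optimize inventory"):
        print(step)
```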
To rein in complexity and make agents interoperable, major players are racing to define new protocols. These protocols aim to standardize how agents interact with tools, data, and each other. Two of the most prominent so far:
MCP (Model Context Protocol): Introduced by Anthropic, it standardizes how agents connect to tools and data sources.
A2A (Agent2Agent): Introduced by Google, it standardizes how agents discover and communicate with one another.
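As a rough illustration of what protocol-level standardization buys, here is a minimal tool server sketch using the FastMCP helper from the open-source MCP Python SDK. The server name and the stubbed inventory lookup are invented for the example, and the exact SDK surface may vary between versions.

```python
# Minimal MCP tool server sketch (assumes the official `mcp` Python SDK is installed).
# The tool name and stubbed data are illustrative only.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("inventory-tools")

@mcp.tool()
def get_stock_level(sku: str) -> int:
    """Return the current stock level for a SKU (stubbed for illustration)."""
    return {"WIDGET-001": 42}.get(sku, 0)

if __name__ == "__main__":
    mcp.run()  # any MCP-compliant agent can now discover and call this tool
```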
The takeaway? Protocols are necessary, but not sufficient. They enable scale, but no protocol makes agents safe out of the box. Guardrails must be layered in, and CIOs should evaluate them with the same rigor they apply to any enterprise integration: through the lenses of security, architecture, and compliance.
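What “layering guardrails in” can look like in practice is sketched below: a thin policy wrapper that sits between the agent and its tools, enforcing an allow-list (security), a human-approval gate (compliance), and an audit trail (architecture). Every name here, from guarded_call to the tool lists, is hypothetical.

```python
# Hypothetical guardrail layer between an agent and its tools: allow-list (security),
# human-approval gate (compliance), and an audit trail (architecture/observability).
ALLOWED_TOOLS = {"lookup_order", "draft_email"}           # low-risk, agent may call freely
APPROVAL_REQUIRED = {"issue_refund", "update_customer"}   # high-risk, needs a named approver

audit_log: list[dict] = []

def execute_tool(tool_name: str, args: dict) -> str:
    # Stub dispatcher; a real system would route to the actual integration.
    return f"executed {tool_name} with {args}"

def guarded_call(tool_name: str, args: dict, approved_by: str | None = None) -> str:
    if tool_name not in ALLOWED_TOOLS | APPROVAL_REQUIRED:
        raise PermissionError(f"{tool_name} is not on the approved tool list")
    if tool_name in APPROVAL_REQUIRED and approved_by is None:
        raise PermissionError(f"{tool_name} requires human approval before execution")
    audit_log.append({"tool": tool_name, "args": args, "approved_by": approved_by})
    return execute_tool(tool_name, args)

if __name__ == "__main__":
    print(guarded_call("lookup_order", {"order_id": "A-1001"}))
    print(guarded_call("issue_refund", {"order_id": "A-1001"}, approved_by="j.doe"))
```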
Open-source agent frameworks are rapidly evolving, each promising faster development, more flexibility, and smarter behavior. But they come with tradeoffs.
LangGraph (a LangChain extension): Lets developers define agent behavior as a directed graph. This visual structure improves observability and state management, making it easier to audit how an agent thinks and acts (a minimal graph is sketched after this list).
CrewAI: Emphasizes team-based coordination, assigning each agent a defined role (e.g., “researcher” or “planner”) and mediating their collaboration on tasks. This role-based pattern mirrors enterprise IAM practices, though it requires careful scoping of what each agent is allowed to do.
SmolAgents: Developed by Hugging Face engineers, it favors simplicity, turning LLM output into code or actions in a few lines. That makes it ideal for fast prototyping but potentially lacking guardrails for enterprise-scale ops.
Mastra AI: Built in TypeScript, it offers strong typing, long-lived memory, and modular workflows. It aligns well with modern web dev teams but puts the burden of human oversight on the implementer.
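For a concrete sense of the graph pattern referenced above, here is a minimal LangGraph sketch: agent behavior declared as an explicit graph over typed state, which is what makes each step inspectable and auditable. The node logic is a hard-coded placeholder rather than a real LLM call, and the ticket-triage state is invented for the example.

```python
# Minimal LangGraph sketch: behavior as an explicit, auditable graph.
# Node logic is a hard-coded placeholder; a real agent would call an LLM and tools.
from typing import TypedDict
from langgraph.graph import StateGraph, END

class TicketState(TypedDict):
    ticket: str
    category: str
    resolution: str

def classify(state: TicketState) -> dict:
    return {"category": "password_reset"}            # placeholder for an LLM classification

def resolve(state: TicketState) -> dict:
    return {"resolution": f"Runbook applied for {state['category']}"}

graph = StateGraph(TicketState)
graph.add_node("classify", classify)
graph.add_node("resolve", resolve)
graph.set_entry_point("classify")
graph.add_edge("classify", "resolve")
graph.add_edge("resolve", END)

app = graph.compile()
print(app.invoke({"ticket": "Can't log in to VPN", "category": "", "resolution": ""}))
```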
Frameworks shape behavior, and in agentic systems, behavior is policy. Without governance layered in, those same frameworks that accelerate development can also accelerate failure. CIOs must treat every framework decision as a policy decision.
There’s precedent for this kind of chaos.
In the early web era, Netscape, Internet Explorer, and others raced to define their own standards. JavaScript was considered dangerous. SSL was optional. And so-called best practices were mostly just best guesses.
The same dynamics are at play now with AI agents. Frameworks are diverging. Protocols are in flux. And security models are inconsistent. Gartner cautions that agentic AI can proliferate without governance or tracking, making decisions that may not be trustworthy and relying on low-quality data: “As AI spreads, so do risks like bias, privacy issues and the need to align with human values.”
Waiting for the dust to settle isn’t a strategy. CIOs need to set internal standards now.
Just as the cloud era ushered in the need for real-time monitoring, the agent era demands new levels of visibility. Platforms like LangSmith are emerging to meet that need, offering debugging, evaluation, and telemetry purpose-built for agents.
LangSmith tracks every decision, tool call, and output step, giving teams insight into how agents behave and why. Think of it as Datadog for agents. CIOs can use it to inspect behavior, enforce policies, and flag anomalies before a misbehaving agent causes harm.
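A minimal sketch of what that instrumentation can look like with the langsmith Python SDK is below. It assumes tracing is enabled via environment variables (an API key and project), and the ticket-triage function is invented for the example.

```python
# Minimal tracing sketch with the `langsmith` SDK. Assumes LANGSMITH_API_KEY (and, for
# LangChain apps, LANGCHAIN_TRACING_V2) are set; the triage function is illustrative.
from langsmith import traceable

@traceable(name="triage_ticket")        # each call becomes a recorded, inspectable trace
def triage_ticket(ticket: str) -> str:
    # Placeholder logic; a real agent step would call a model and downstream tools here.
    return "routed to: password-reset queue"

if __name__ == "__main__":
    print(triage_ticket("User cannot log in to VPN"))
```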
And that’s essential. PwC cautions that the opacity of some generative AI systems can make it difficult to unravel why certain outputs are produced, underscoring the importance of transparency and traceability in AI governance.
You don’t need to choose between safety and speed. Forward-looking CIOs are enabling both.
Enterprise AI won’t be powered by one monolithic assistant. It’ll be shaped by a dynamic ecosystem of agents, working across functions, tools, and teams. That complexity demands a new kind of governance that is proactive, integrated, and continuous. Governance can be the great enabler—not the enemy of speed, but the guarantor of sustainable progress.
CIOs don’t need to wait for perfect standards. They need to lead. By embedding governance early, demanding observability, and setting policy at the protocol and framework level, they can unlock the upside of agents without inviting chaos.
Tim Fernihough
Senior Director, Standards & Compliance & IT Services, Orium