BoxWorks 2025: Protocols, Agents & the Dawn of Intelligent Content

Standards create freedom through constraint. Protocols enable spontaneity by removing friction. Infrastructure becomes invisible when most essential.

This year’s BoxWorks in San Francisco brought these paradoxes into sharp focus through two developer workshops.

The journey flows from the protocol to the practice: first the foundation for how autonomous agents can cooperate, then the developer tools that bring that cooperation to life inside the enterprise. The foundational “grammar” of multi-agent systems from Google (protocols like A2A and frameworks like ADK) met the practical, in-field application of Box’s AI APIs for Hubs, Ask, and Extract. What’s emerging isn’t just another AI feature; it’s a coherent architecture for intelligent content: content that can reason, remember, coordinate, and comply.

Grammar for Intelligent Agents: Google’s A2A, ADK & MCP

Dr. Ali Arsanjani of Google’s Applied AI team laid out the essential spine for moving beyond a single, monolithic LLM to a system of distributed intelligence. The goal is to avoid “agent proliferation syndrome” by creating a small, comprehensible cast of agents that can reliably orchestrate complex work.

“Keep the cast small — five or six agents, seven plus or minus two is already too many.” - Dr. Ali Arsanjani

This approach requires a clear grammar for interaction, built on a few key pillars:

Orchestration with ADK: The open-source Agent Development Kit (ADK) provides deterministic workflows—sequence, parallel, and loop—critical for auditable processes in regulated industries.
A2A vs. MCP Protocols: These two protocols serve different needs. The Model Context Protocol (MCP) is how an agent accesses internal resources and tools. The Agent2Agent (A2A) protocol is the handshake that lets agents communicate across different companies. Think of A2A like the SIP protocol for phone calls: a handshake first, then a channel.
Callbacks as Guardrails: Callbacks are the essential mechanism for safety. ADK provides six hooks: before/after a model is called, before/after an agent is called, and before/after a tool is called. This allows for audit logging & redaction after tools respond, or human-in-the-loop approvals before a credit card is charged.
Memory Bank for Continuity: To move beyond stateless interactions, Memory Bank provides persistent storage. Instead of greeting you like a stranger, the agent recalls: “I see you called last week about contract X. Two issues resolved, one still pending.” This turns ephemeral sessions into institutional knowledge.
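To make the three deterministic workflow shapes concrete, here is a minimal sketch in plain Python. It does not use ADK and is not its API; the `sequence`, `parallel`, and `loop` helpers and the invoice steps are hypothetical stand-ins that only illustrate why a fixed control flow is easy to audit.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict

State = Dict[str, Any]
Step = Callable[[State], State]


def sequence(*steps: Step) -> Step:
    """Run steps in a fixed order, threading the state through each one."""
    def run(state: State) -> State:
        for step in steps:
            state = step(state)
        return state
    return run


def parallel(*steps: Step) -> Step:
    """Run independent steps concurrently; merge results in declared order."""
    def run(state: State) -> State:
        with ThreadPoolExecutor() as pool:
            # Each branch gets its own copy so branches cannot interfere.
            results = list(pool.map(lambda step: step(dict(state)), steps))
        merged = dict(state)
        for result in results:
            merged.update(result)
        return merged
    return run


def loop(step: Step, until: Callable[[State], bool], max_iterations: int) -> Step:
    """Repeat a step until the predicate holds or the iteration cap is hit."""
    def run(state: State) -> State:
        for _ in range(max_iterations):
            state = step(state)
            if until(state):
                break
        return state
    return run


# Hypothetical steps for an invoice workflow (illustration only).
def extract_fields(state: State) -> State:
    return {**state, "vendor": "Acme", "total": 120.0}


def check_vendor(state: State) -> State:
    return {"vendor_ok": state["vendor"] == "Acme"}


def check_total(state: State) -> State:
    return {"total_ok": state["total"] > 0}


def review(state: State) -> State:
    attempts = state.get("attempts", 0) + 1
    return {**state, "attempts": attempts, "approved": attempts >= 2}


pipeline = sequence(
    extract_fields,
    parallel(check_vendor, check_total),
    loop(review, until=lambda s: s["approved"], max_iterations=3),
)
result = pipeline({"invoice_id": "INV-1"})
print(result)
```

Because the order of steps is declared up front rather than chosen by a model at run time, the same input always walks the same path, which is the property an auditor cares about.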

Prompts for Teams:

Which workflows demand a deterministic sequence for compliance?
What are the 7-10 essential tools your “Invoice Processing” agent truly needs?
Where is the critical moment for a human approval callback in a contract review process?

Putting Protocols to Work: Box AI Developer Workshop

In a standing-room-only session, Scott Hurrey from the Box developer relations team bridged the gap from protocol to practice, demonstrating how the Box AI API abstracts away the “messy middle” of building AI applications.

“Write extracted fields back as metadata so you can search by reality, not filenames.” - Scott Hurrey

The developer experience is centered on applying AI directly to content where it lives:

Ask API over Files and Hubs: The core RAG engine for asking questions against a single document or a Box Hub, a curated collection of up to 20,000 files. The best Hub isn’t the biggest; it’s the smallest one that answers 80% of a team’s questions.
Extract API for Structured Data: Pulls structured data from unstructured documents using either a Flexible string prompt or a Structured request tied to a Box metadata template. This transforms static files into a queryable database.
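As a rough illustration of the request shapes involved, the sketch below builds JSON bodies for the three calls without sending anything. The endpoint paths and field names (`mode`, `items`, `metadata_template`, the `hubs` item type) reflect my reading of Box’s public API reference and should be treated as assumptions to verify against developer.box.com; the IDs and the template key are made up.

```python
import json
from typing import Any, Dict, Tuple

# A request is modeled as (path, JSON body); nothing is sent over the network.
Request = Tuple[str, Dict[str, Any]]


def ask_request(prompt: str, item_id: str, item_type: str = "file",
                mode: str = "single_item_qa") -> Request:
    """Question over one file, or over a Hub when item_type is 'hubs' (assumed)."""
    body = {"mode": mode, "prompt": prompt,
            "items": [{"id": item_id, "type": item_type}]}
    return "/ai/ask", body


def extract_request(prompt: str, file_id: str) -> Request:
    """Flexible extraction driven by a free-form string prompt."""
    body = {"prompt": prompt, "items": [{"id": file_id, "type": "file"}]}
    return "/ai/extract", body


def extract_structured_request(file_id: str, template_key: str,
                               scope: str = "enterprise") -> Request:
    """Structured extraction tied to an existing metadata template."""
    body = {
        "items": [{"id": file_id, "type": "file"}],
        "metadata_template": {
            "type": "metadata_template",
            "scope": scope,
            "template_key": template_key,
        },
    }
    return "/ai/extract_structured", body


# Hypothetical IDs and template key, for illustration only.
examples = [
    ask_request("What is the renewal date?", "1234567890"),
    ask_request("Which contracts renew in Q3?", "987654", item_type="hubs",
                mode="multiple_item_qa"),
    extract_request("Return vendor, total, and due date as JSON.", "1234567890"),
    extract_structured_request("1234567890", "invoiceFields"),
]
for path, body in examples:
    print(path, json.dumps(body))
```

Each body would be POSTed with an authenticated client of your choice; keeping payload construction separate from transport makes the shapes easy to unit test.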

Pragmatic Cost Controls

Enterprise-grade agentic systems demand fiscal discipline. This requires controlling costs at both the application and infrastructure layers.

“Thinking budgets prevent runaway costs. Context cache means you don’t keep paying for the same million tokens.” - Dr. Ali Arsanjani

Box’s Extract API shows this discipline in practice: flexible extract costs 1 AI unit per page, enhanced costs 3 units per page, and the enhanced extract agent costs 7 units per page. This allows organizations to align cost with value and optimize across multiple models.
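The per-page rates quoted above make budgeting a matter of simple arithmetic. The following sketch estimates AI units for a document; the tier keys (`flexible`, `enhanced`, `enhanced_agent`) are labels chosen here for illustration, not official identifiers.

```python
# AI units per page, as quoted in the session (tier names are illustrative labels).
UNITS_PER_PAGE = {"flexible": 1, "enhanced": 3, "enhanced_agent": 7}


def extract_cost(pages: int, tier: str) -> int:
    """Return the estimated AI units to extract a document of `pages` pages."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return pages * UNITS_PER_PAGE[tier]


for tier in UNITS_PER_PAGE:
    print(f"{tier}: {extract_cost(100, tier)} AI units for 100 pages")
# flexible: 100 AI units for 100 pages
# enhanced: 300 AI units for 100 pages
# enhanced_agent: 700 AI units for 100 pages
```

The sevenfold spread between tiers is the reason to route only the documents that need the agent tier to it.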

Strategic Implications: Shift to Intelligent Content Management

Connecting the foundational grammar with practical tools reveals a fundamental shift, validated by data from Box’s State of AI in the Enterprise 2025 report: 87% of organizations are piloting agents and 41% are testing autonomous operations. Early adopters see 37% productivity gains, yet only 24% have mature governance frameworks in place.

From Files to Functions: Every document becomes an API surface.
From RAG to Runbooks: Orchestration turns retrieval into auditable business processes.
From Sessions to Memory: Memory Bank transforms fleeting interactions into durable institutional knowledge.

We are moving from content at rest to content in motion. The work now is choreography: small casts of agents, well-documented tools, deterministic flows, and memory that recalls across sessions.


Appendix A: For Developers & DevRel

This new generation of agentic systems requires a shift in thinking from monolithic application logic to orchestrating distributed, intelligent components. Here are the key patterns and disciplines for building robust, efficient, and governable AI solutions.

1. Agent Design Patterns & Best Practices

Building a multi-agent system is an act of architectural design. The goal is clarity, not complexity. Adhere to these core principles to avoid building brittle, unpredictable systems.

The Orchestrator/Specialist Pattern: Not all agents are created equal. A primary “orchestrator” agent should be equipped with a powerful reasoning model (like Gemini Pro) capable of complex planning and delegation. Its sub-agents, however, can be specialists running on smaller, faster, and cheaper models (like Gemini Flash) for focused tasks like data extraction or simple Q&A. This tiered approach optimizes both cost and performance. “If you want the Mastermind orchestrator, I would use a Gemini 2.5 Pro thinking model, really deep thinking… But then, the sub agents you can use Gemini Flash… it’s very specialized. You don’t need planning and reasoning on every agent, right?” - Dr. Ali Arsanjani
Tool Hygiene is Non-Negotiable: An agent is only as good as the tools it can call. Limit each agent to 7-10 well-defined tools to reduce the model’s decision space and minimize errors. The most critical piece of code you will write is not the function itself, but its description. The LLM relies entirely on your comments and docstrings to understand what a tool does, what inputs it requires, and what output to expect. “Put that in the description like a comment in the function, because remember, the LLM is going to read it… Describe the input and the output very clearly… what is the JSON output with these fields in this structure.” - Dr. Ali Arsanjani
Implement Callbacks for Granular Control: Callbacks are your primary mechanism for injecting governance, security, and human oversight. Use the six available hooks (before/after for models, agents, and tools) to build safety directly into your workflows. For example, a before-tool callback can redact PII before calling an external API, while an after-agent callback can trigger a human approval step in a Slack channel before executing a financial transaction. “Callbacks are extremely important for filtering information, for doing security checks, doing audit checks, and for compliance purposes… but the major thing that they are good for are human in the loop.” - Dr. Ali Arsanjani
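The tool-description and callback advice above can be sketched together in plain Python. This is not ADK code: `call_tool`, the hook signature, and both tools are hypothetical, and the approval gate is placed before the tool call to mirror the credit-card example from the session.

```python
import re
from typing import Any, Callable, Dict, List

Args = Dict[str, Any]
BeforeToolHook = Callable[[str, Args], Args]

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def save_note(body: str) -> Dict[str, Any]:
    """Store a short note on a customer record.

    Input: body, the plain-text note (string).
    Output: JSON object {"saved": bool, "body": str} echoing what was stored.
    """
    return {"saved": True, "body": body}


def charge_card(amount_usd: float) -> Dict[str, Any]:
    """Charge the customer's card on file.

    Input: amount_usd, the amount to charge in US dollars (float).
    Output: JSON object {"charged": bool, "amount_usd": float}.
    """
    return {"charged": True, "amount_usd": amount_usd}


def redact_emails(tool_name: str, args: Args) -> Args:
    """Before-tool hook: mask email addresses in every string argument."""
    return {key: EMAIL.sub("[REDACTED]", value) if isinstance(value, str) else value
            for key, value in args.items()}


def require_approval(approver: Callable[[str, Args], bool]) -> BeforeToolHook:
    """Before-tool hook factory: block charge_card unless a human approves."""
    def hook(tool_name: str, args: Args) -> Args:
        if tool_name == "charge_card" and not approver(tool_name, args):
            raise PermissionError(f"{tool_name} was not approved")
        return args
    return hook


def call_tool(tool: Callable[..., Any], args: Args,
              before_hooks: List[BeforeToolHook]) -> Any:
    """Run every before-tool hook in order, then invoke the tool."""
    for hook in before_hooks:
        args = hook(tool.__name__, args)
    return tool(**args)


# The approver here always declines; a real one would ask a person.
hooks = [redact_emails, require_approval(lambda name, args: False)]

note = call_tool(save_note, {"body": "Call jane.doe@example.com back"}, hooks)
print(note["body"])  # Call [REDACTED] back

try:
    call_tool(charge_card, {"amount_usd": 49.0}, hooks)
    charge_blocked = False
except PermissionError:
    charge_blocked = True
print(charge_blocked)  # True
```

Note how each docstring spells out the input and the JSON output: in an agent framework that text, not the function body, is what the model reads when deciding which tool to call.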

2. Developer’s Cost Control Playbook

Agentic workflows can introduce variable costs. Proactive management is essential for deploying these systems economically at scale.

Set Thinking Budgets as a Circuit Breaker: An agent’s reasoning process consumes tokens. A thinking_budget sets a hard limit on this process, acting as a circuit breaker to prevent runaway costs and ensure predictable performance. This is your primary defense against unexpected charges from complex or recursive agent loops.
Use Context Caching for Large Payloads: For workflows that repeatedly reference the same large document (e.g., analyzing a multi-million token contract or technical manual), Context Caching avoids paying repeatedly for the same input. You pay the ingestion cost for the large context window once, and subsequent calls in the session reference the cache at a fraction of the cost. “You use context cache, you pay for it once, and then with context caching, you can refresh it, and you basically don’t pay anything beyond that… versus the input token size, which is much larger.” - Dr. Ali Arsanjani
Orchestrate APIs to Reduce Token Consumption: For very large documents, you can design cost-effective workflows. Instead of running a costly extraction process over a 700-page document, use the Ask API first to generate a concise summary of the relevant sections. Then, feed that summary text into the Extract API’s optional content field. This ensures the model only processes the high-signal information, dramatically reducing your per-page costs.
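The ask-then-extract pattern from the last item can be sketched as a two-call function. The transport is injected as a `post` callable so no HTTP client is assumed; the endpoint paths, the `answer` response field, and the per-item `content` field follow my reading of Box’s API reference and should be verified before use. The stub transport and file ID are hypothetical.

```python
from typing import Any, Callable, Dict, List, Tuple

# post(path, json_body) -> parsed JSON response; supplied by the caller.
Post = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def summarize_then_extract(post: Post, file_id: str,
                           summary_prompt: str, extract_prompt: str) -> Dict[str, Any]:
    """Summarize a large file with Ask, then extract from the summary only."""
    ask_response = post("/ai/ask", {
        "mode": "single_item_qa",
        "prompt": summary_prompt,
        "items": [{"id": file_id, "type": "file"}],
    })
    summary = ask_response["answer"]  # assumed response field

    # Passing `content` asks the model to work from the summary text
    # instead of re-reading the whole document.
    return post("/ai/extract", {
        "prompt": extract_prompt,
        "items": [{"id": file_id, "type": "file", "content": summary}],
    })


# Stub transport for illustration: records calls and returns canned responses.
calls: List[Tuple[str, Dict[str, Any]]] = []


def fake_post(path: str, body: Dict[str, Any]) -> Dict[str, Any]:
    calls.append((path, body))
    if path == "/ai/ask":
        return {"answer": "Term: 3 years. Renewal: 2026-01-01."}
    return {"answer": '{"term_years": 3, "renewal_date": "2026-01-01"}'}


extracted = summarize_then_extract(
    fake_post,
    file_id="1234567890",  # hypothetical file ID
    summary_prompt="Summarize the term and renewal clauses.",
    extract_prompt="Return term_years and renewal_date as JSON.",
)
print(extracted["answer"])
```

In production the `post` argument would wrap an authenticated HTTP session; the function itself stays the same, which keeps the cost-saving logic testable offline.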

Box’s AI APIs

developer.box.com/guides/box-ai

Google’s Agent Development Kit (ADK) & A2A Protocol

Anthropic’s Model Context Protocol (MCP)

modelcontextprotocol.io

Appendix B: For Strategists

The transition to intelligent content is not just a technical upgrade; it’s a strategic imperative. Leaders who grasp the underlying principles will build a significant competitive advantage.

1. The Three Strategic Bets on Intelligent Content

Standards Create Freedom and Speed: Investing in protocols and standards is about reducing organizational friction. When agent handshakes (A2A) and data access (MCP) are standardized, teams can develop and deploy new cross-functional workflows in weeks, not quarters. This frees up your best talent to focus on creating business value instead of fighting integration battles.
Curation is a Durable Competitive Moat: A generic LLM is a commodity. An AI that can reason accurately over your company’s curated, proprietary, and high-quality data is a strategic asset. The work of building and maintaining a Box Hub is not an IT task; it is the act of creating a corporate brain. This curated knowledge base becomes a defensible moat that powers everything from hyper-personalized customer service to smarter product development.
Governance is an Accelerator, Not a Brake: In a world where AI can act autonomously, trust is the limiting factor for adoption. Deterministic workflows, auditable callbacks, and clear human-in-the-loop approval points are not just for compliance. They are the safety systems that give you the confidence to deploy agents for mission-critical work. Strong governance is what moves AI from a fascinating lab experiment to a reliable operational backbone.

Appendix C: For Content & Knowledge Workers

The rise of intelligent agents redefines the role of the subject matter expert. Your value is shifting from being the person who knows the answer to being the person who teaches the system how to find the answer. This is a moment of empowerment.

1. Roles Shifting from Gatekeeper to Curator

In the past, colleagues came to you to find specific documents. In the new model, people and agents will come to the curated content collection you manage (the Box Hub) to get direct answers. Deep domain expertise is now essential for building and maintaining the “corporate brain” that the rest of the organization relies on. You become the expert ensuring the AI learns from the highest quality, most up-to-date information.

“That gives access to up to 20,000 curated files that we index that we keep in sync. All one has to do is get an answer when needed from the latest documentation that existed in that Hub.” - Scott Hurrey

2. Metadata Is How You Teach the Machine What Matters

An AI doesn’t know that an “MSA” is a master contract or that “Q3” refers to a specific time period. You teach it by applying consistent, structured metadata. When adding a new file, think beyond the filename. Use automated tools like Box’s Extract API to tag the document with its essential properties: the client name, the effective date, the project ID, the renewal date. This is how you transform a chaotic folder of files into an intelligent, queryable asset, and how you enable everyone to “search by reality, not filenames.”

3. You Are the Essential Human in the Loop

AI is designed to automate the 80% of work that is repetitive and predictable. This frees you to focus on the 20% that requires nuance, creativity, and critical judgment. Your new role is to be the expert who is automatically brought in when an agent faces ambiguity or when a decision carries significant financial or legal risk. You are not being replaced; you are being escalated to for the work that matters most.

“If you have human in the loop… it will allow you to effectively call a person… it has to come to me to charge my credit card and to double check for approval. So that’s the human in the loop. You can use callbacks in order to implement human approval.” - Dr. Ali Arsanjani

4. Demand Traceability to Build Trust

To confidently rely on AI, you must be able to trust its outputs. Get in the habit of using features like citations. When an agent provides an answer, click the citation to see the exact passage in the source document it used. This isn’t about catching the AI in a mistake; it’s about a collaborative process of verification. By checking its work, you build your own confidence in the system and help identify areas where the underlying content needs to be clarified or improved.

Part 1 of 4 in the BoxWorks 2025 series. Follow for upcoming coverage of expo insights, enterprise Q&A & strategic vision.

x.com/schwentker/status/1967385707426787603

bsky.app/profile/schwentker.bsky.social/post/3lytklc2uz22t