Inside Box’s AI Strategy: Aaron Levie on Agents, Interoperability, and the Future of Work

In this fast-paced and insight-packed episode, Box CEO Aaron Levie joins the pod to unpack how one of the most established enterprise content platforms is reinventing itself around AI. From building intelligent agents on top of 20 years of enterprise data, to navigating model interoperability and the coming age of agent-to-agent ecosystems, Aaron delivers a masterclass in what it really takes to go from SaaS to AI-native.
We cover everything from Box’s internal AI culture and their refusal to join the model arms race, to his vision for agent-based interoperability and the pitfalls of chaining probabilistic systems. Whether you’re building the next AI workflow tool or wondering how incumbents can stay relevant, this is one conversation you don’t want to miss.
“95% of enterprise data is underutilized. AI agents let us finally activate it.” — Aaron Levie
Tune in for sharp insights, candid strategy, and a look into the AI-powered future of enterprise software. PS: Also check out Box's origin story.
Part One: The State of AI in the Enterprise
🔁 From Hype to Adoption: Where Are We, Really?
Aaron frames the state of enterprise AI adoption using the “Crossing the Chasm” model:
“We’re in the ascent on the early pragmatist to pragmatist side.”
That means we’ve moved beyond early enthusiasts to practical, risk-managed implementations.
But—and this is key—not all AI use cases are at the same point. You have to think about AI adoption category by category:
Use Case | Adoption Stage |
---|---|
AI coding (e.g., GitHub Copilot) | ✅ Fully in pragmatist growth mode |
Document RAG (retrieval-augmented generation) | 🚀 Early pragmatist, climbing fast |
AI outbound sales reps | 🔬 Still early adopters only |
Multi-step agents with chain-of-thought | ⚠️ Experimental, error-prone |
Levie says:
“You can’t just say ‘AI’ as a monolith. You need to know what problem and what workflow you’re applying it to.”
🧪 Proof-of-Concept vs. Deployment: What's Actually Live?
Enterprise interest is high. Executives are leaning in, experimenting actively:
“Most companies I meet with are like, ‘How many use cases can I apply this to?’”
However, a reality check is also needed:
- Many deployments are still in pilot or proof-of-concept phase.
- The biggest blockers aren’t a lack of belief, but technical feasibility, data quality, and governance.
For instance:
“If you just deploy a really good agent across a 5,000-person company... they’ll start surfacing corporate secrets by accident because of bad permissions.”
This is why search, indexing, and data governance are still critical AI infrastructure problems.
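To make the permissions point concrete, here's a minimal sketch of retrieval that filters by the querying user's access rights before anything reaches a model. All class and function names here are hypothetical, not Box's API:

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    doc_id: str
    text: str
    allowed_groups: set = field(default_factory=set)  # groups permitted to read this doc

def permission_filtered_search(query, index, user_groups):
    """Return only documents the querying user may see, ranked by relevance.

    The access check happens BEFORE ranking, so a badly-permissioned doc
    can never leak into the model's context window.
    """
    visible = [d for d in index if d.allowed_groups & user_groups]
    # Toy relevance: query-term overlap. A real system would use
    # semantic / vector search here.
    terms = set(query.lower().split())
    return sorted(visible,
                  key=lambda d: len(terms & set(d.text.lower().split())),
                  reverse=True)

index = [
    Document("d1", "Q3 revenue forecast and board notes", {"finance", "exec"}),
    Document("d2", "Employee handbook vacation policy", {"all-staff"}),
]

# A support rep asking about revenue gets nothing sensitive back.
results = permission_filtered_search("revenue forecast", index, {"all-staff"})
```

The point of the ordering is Levie's warning: if the filter runs after (or never), the agent happily surfaces the board notes.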
⚖️ Why AI ≠ Cloud (But It Rhymes)
Levie contrasts AI adoption with the last big transformation: cloud computing. With cloud, there were years of resistance from banks and enterprises.
With AI?
“There’s no philosophical resistance like with cloud. The energy is different—everyone wants in.”
Why the difference?
- ChatGPT gave employees and execs a personal “aha” moment.
- A new generation of employees expects AI tools as part of how they work.
- Top-down (CEO) and bottom-up (end-user) pressures are aligned—unlike in previous tech waves.
📊 Can You Charge More Just Because It's AI?
Levie is crystal clear on this:
“You won’t grow faster because there’s a magical AI premium on your software.”
There is no pricing multiplier just for having AI features. Instead, growth comes from expanding into new workflows and new TAM.
✅ You can charge more if:
- AI unlocks new use cases that weren’t previously software-driven.
- AI automates previously manual processes (e.g., contract review, sales prep).
❌ You cannot charge more if:
- You're just layering AI on top of existing features and expecting higher price points.
- You assume users will pay a premium “because it’s AI.”
“Wall Street got ahead of themselves for three months thinking you could just raise prices because it seems cool. That was ill-fated.”
🔁 Product & GTM Adjustments in the AI Era
AI doesn’t just change products—it shifts go-to-market motions and business models:
- Land-and-expand motions are supercharged with agents.
- You can enter more verticals without building full vertical SaaS apps (if agents are modular).
- Pricing may shift to usage-based or task-based in some contexts, but Box still monetizes primarily via seat-based pricing plus add-ons.
🌍 Why Open, Not Walled Garden, Wins
Levie repeatedly emphasizes openness and composability:
“We want to be the best place for companies to manage content—but we don’t need to own the interface. Just let the agents talk.”
That includes:
- Supporting multiple foundation models (OpenAI, Anthropic, Gemini, etc.)
- Offering API-first agent infrastructure for internal or external developers
- Enabling agent-to-agent interoperability across SaaS products
He compares this to the rise of REST APIs 20 years ago—agent-to-agent communication may follow the same pattern.
⚠️ The Real Bottlenecks
Even if enterprises are hyped, the following issues are still big friction points:
- Search quality and relevance: Bad search = bad AI.
- Governance and permissions: Especially in agent workflows.
- Agent chaining errors: Compounding probabilistic logic can go sideways fast.
- Lack of reviewable output: If users can’t validate, trust breaks.
“AI will only be as good as the data you feed it... and we still live in a world of messy search.”
Part Two: AI Strategy and Product at Box
Box’s AI strategy is a top-down, system-level orchestration of AI services, layered on top of two decades of enterprise content management. It's not about building foundational models, but about activating underutilized enterprise data using composable agents, powerful developer tools, and targeted interfaces.
Aaron Levie sums it up:
“What gets me excited about agents is the expansion of what people can do with software — solving use cases we just never ended up prioritizing before.”
🏗️ The Box AI Architecture: A Layered System
1. The Foundation: Decades of Enterprise Content
Box already powers content for over 100,000 organizations — this includes:
- Versioned documents
- Permission layers
- Governance and compliance frameworks
- In-browser preview tech
This existing infrastructure becomes the launchpad for AI workloads. For example, Box’s file viewer — originally designed to render documents — already extracted text and PDF previews, which is now reused for embedding generation and document parsing.
“That conversion engine we built to view Word docs in your browser? That’s now how we extract text for embeddings.”
2. AI Platform Layer
Box has built a robust AI platform that includes:
- Text Extraction: From Office files, PDFs, contracts, etc.
- Embedding Generation: Applied to selected corpora rather than universally, due to cost (Box stores hundreds of billions of files).
- Vector Storage: Enabling semantic search and retrieval.
- Model Abstraction Layer: Integrates multiple LLMs (OpenAI, Anthropic, Gemini); customers can plug in their own API keys.
- Agent Framework: Users can configure model instructions + tools = primitive agent setup.
“We don’t care if the customer uses the UI or just APIs. It’s all about making our platform AI-native and developer-ready.”
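The model abstraction layer described above can be sketched in a few lines: one interface, swappable providers, customer-supplied keys. The provider names and completion signature here are illustrative, not Box's actual SDK:

```python
# Sketch of a model abstraction layer: one interface, swappable providers.
# Provider names and the completion signature are illustrative, not Box's API.

class ModelRouter:
    def __init__(self):
        self._providers = {}

    def register(self, name, complete_fn):
        """complete_fn(prompt, api_key) -> str. Real adapters would wrap
        each vendor SDK (OpenAI, Anthropic, Gemini) behind this signature."""
        self._providers[name] = complete_fn

    def complete(self, provider, prompt, api_key=None):
        if provider not in self._providers:
            raise ValueError(f"unknown provider: {provider}")
        return self._providers[provider](prompt, api_key)

router = ModelRouter()
# Stub adapters stand in for real vendor calls.
router.register("stub-a", lambda prompt, key: f"[A] {prompt}")
router.register("stub-b", lambda prompt, key: f"[B] {prompt}")

# Customers "bring their own key" by passing it per call.
print(router.complete("stub-a", "Summarize this contract", api_key="sk-customer"))
```

Because the router owns nothing model-specific, swapping in a breakthrough model is a new adapter, not a rewrite — which is exactly the flexibility Levie is optimizing for.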
3. Box Hubs: Targeted, Scoped RAG Interfaces
Hubs are a UI and data construct where customers can query specific document sets (e.g., HR policies, earnings reports).
Why this approach works:
- Data scope is curated: Customers upload relevant docs.
- User queries are topic-bound: HR questions in the HR Hub.
- Reduces hallucination, increases precision.
“We’re cheating in two ways: the data is constrained and the intent is constrained. That eliminates 95% of the traditional RAG problems.”
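The "constrained data + constrained intent" idea can be sketched as a scoped retrieval step. Everything here (the Hub class, the prompt format) is hypothetical, but it shows why a Hub query never leaves the curated set:

```python
# Minimal sketch of a scoped ("Hub"-style) RAG query. The Hub class and
# prompt format are invented for illustration; the point is that retrieval
# never leaves the curated document set.

class Hub:
    def __init__(self, name, documents):
        self.name = name
        self.documents = documents  # curated by the customer, not org-wide

    def retrieve(self, question, k=2):
        # Toy relevance: term overlap stands in for vector search.
        terms = set(question.lower().split())
        scored = sorted(self.documents,
                        key=lambda d: len(terms & set(d.lower().split())),
                        reverse=True)
        return scored[:k]

def answer(hub, question, llm):
    context = "\n".join(hub.retrieve(question))
    # Both the data (hub docs) and the intent (topic-bound question)
    # are constrained, which is what keeps hallucination low.
    return llm(f"Context:\n{context}\n\nQuestion: {question}")

hr_hub = Hub("HR Policies", [
    "Vacation policy: 20 days per year, accrued monthly.",
    "Remote work policy: hybrid, three days in office.",
])
reply = answer(hr_hub, "how many vacation days per year", lambda p: p)
```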
4. Box AI Studio + API Access
- AI Studio: GUI for configuring custom agents (models, tools, prompts).
- API Suite: Enables external developers to:
  - Query Hubs
  - Run agents on documents
  - Pull summaries, extracted data, or filtered results
  - Integrate Box AI into their own apps
“You can say: ‘Here’s the hub ID, here’s the agent I want to use — now go.’”
This setup positions Box as a headless content AI infrastructure, not just a SaaS tool.
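As a rough illustration of the "here's the hub ID, here's the agent — now go" call, here is a hypothetical request builder. The endpoint path and field names are invented for illustration, not the real Box API:

```python
import json

# Hypothetical request builder for a "headless" Box AI call. Endpoint path,
# field names, and agent config shape are illustrative, not the real API.

def build_hub_query(hub_id, agent_id, question):
    return {
        "url": f"https://api.example.com/ai/hubs/{hub_id}/ask",
        "body": json.dumps({
            "agent": agent_id,          # which configured agent to run
            "prompt": question,         # the user's question
            "response_format": "text",  # summaries, extractions, etc.
        }),
    }

req = build_hub_query("hub-123", "contract-reviewer", "List renewal dates")
# The caller's own HTTP client sends this request — no Box frontend
# required, which is what "headless content AI infrastructure" means.
```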
🧠 Agent Use Cases: Not Just Replacing People — Enabling the Impossible
Levie emphasizes: AI’s real power is not just replacing manual labor, but unlocking work that previously wasn’t possible due to time, cost, or resource constraints.
Some examples:
- Legal: Parse and extract contract clauses at scale → inform renewal strategy
- Sales: Pull insights from sales decks and proposals → improve personalization
- M&A: Review thousands of diligence docs → detect risks and duplications
- Support: Query product manuals and technical docs → power self-service bots
“For 95% of enterprise data, there’s value we just never tapped into. Agents let us unlock that value.”
🤝 Open by Design: Box’s Multi-Model Philosophy
Box does not train its own LLMs, nor does it fine-tune models. Instead, it:
- Supports leading LLMs (OpenAI, Anthropic, Gemini, with Meta and xAI planned)
- Allows customers to plug in their own models via API keys
- Builds only lightweight scaffolding (like embedding improvements or prompt templates)
“We considered building a model for about 10 minutes... why would we want to enter that war?”
“If someone ships a breakthrough model, we want to adopt it — not be locked into something we built six months ago.”
Even fine-tuning is discouraged internally. Levie’s team focuses on model flexibility, not lock-in or over-customization.
🧪 Innovations: E-RAG and Future Direction
Box has already developed internal enhancements like E-RAG — an “Enhanced RAG” that improves entity extraction and retrieval precision for enterprise docs. These are infrastructure accelerators, not product silos.
“The goal is not to outdo the model companies. It’s to scaffold around the best models with the best content context.”
🔮 Vision: Enterprise Agent Ecosystem
In the long term, Box envisions a composable agent ecosystem, where:
- Agents in Box talk to agents in Salesforce, Workday, Snowflake, etc.
- Each system retains domain authority, but interoperates via shared agent protocols
- Developers choose between traditional APIs or agent-level handoffs
“It’s like REST APIs 20 years ago. We’re going to see the same explosion in agent-to-agent communication.”
✅ Strategic Takeaways
Area | Strategic Choice |
---|---|
Foundation Model Strategy | Stay out of model wars. Plug into best-in-class. |
Product Approach | Build orchestration, not algorithms. Enable use cases, not raw infrastructure. |
Customer Flexibility | API-first, model-agnostic, scoped AI interfaces. |
Innovation Focus | Enhanced retrieval, smart agents, practical workflows. |
Long-Term Vision | Agent-based ecosystem across the enterprise stack. |
Part Three: Building an AI Culture at Box
Box didn’t just add AI features — they went all in. The company undertook a full cultural shift, turning AI from a product initiative into a company-wide operating principle. This required mindset shifts, structural changes, and leadership alignment across every department.
“We’ve told everybody in the company that we want to use AI to be as productive as possible and aggressively use it across the business.”
🧭 Step 1: Founder-Led Conviction
Aaron Levie experienced his “ChatGPT moment” just like millions of others — but as a CEO and founder, it triggered immediate strategic action.
“The moment I realized I could copy-paste a document and ask questions... I thought, ‘This will change how we work forever.’”
Despite having had earlier access to tools like the GPT-3 playground, it wasn’t until the ChatGPT interface simplified the experience that the lightbulb went off. He jumped into “founder mode,” evangelizing the opportunity internally — but this wasn’t a solo act.
🔗 Leadership lock-in:
- CTO (from a Box-acquired company) aligned quickly and took ownership.
- CPO and engineering leads joined the charge.
- The initial pitch was modest — “just some API integrations” — but momentum quickly snowballed.
“We said it would be lightweight... now it’s the biggest team in the company.”
🛠️ Step 2: Organizational Rollout
Once the core AI team was formed, Box started to infuse AI thinking into the rest of the company, department by department.
How they did it:
- Dedicated AI team built foundational tooling (e.g., Hubs, Studio, model orchestration).
- Weekly internal all-hands demos showcased how employees across functions were using AI in their workflows.
- Encouraged bottom-up experimentation across product, marketing, support, legal, etc.
This wasn’t just a centralized R&D effort — it became everybody’s job.
“Every sales rep, every support rep, every engineer — everyone now plays a role in our AI strategy.”
🧠 Step 3: Normalize Everyday AI Usage
Rather than treat AI as a high-level strategic layer, Box embedded it into daily workflows:
Internal tools widely adopted:
- Box AI (their own product): Used by teams to summarize, query, and create content inside Box Notes and Hubs.
- GitHub Copilot: Became widespread among engineering teams.
- Claude: Integrated for various creative and research tasks.
- Cursor: Rolled out for VS Code users.
“By volume, Box AI might be our most used AI tool internally, maybe even more than Copilot.”
AI at Box wasn’t reserved for product managers or engineers — support teams, sales reps, and marketers were using it too.
🌱 Step 4: Cultivate AI Fluency Company-Wide
Tobi Lütke, Shopify’s CEO, posted a memo about mandating AI usage company-wide — Aaron praised it and said Box took a similar approach.
“Tobi’s memo hit all the right notes — we’re doing much of the same: internal demos, shared learnings, pushing people to experiment.”
Key aspects of Box’s internal AI culture:
- AI education is informal but constant — via demos, Slack sharing, and public wins.
- Permission to play — employees are encouraged to test, prompt, experiment.
- No AI priesthood — everyone, from entry-level to execs, is expected to build fluency.
⚖️ Cultural Tension: Founder Excitement vs. Organizational Skepticism
Levie is self-aware about the risks of overhyping trends. As a founder who regularly gets excited about new tech, he knew he had to prove this wasn’t “just another VR moment.”
“You have to do this filtering: is this just founder hype, or a company-level pivot?”
That’s why early traction, prototypes, and use cases were critical. He positioned AI as a “code red” opportunity — not a shiny new toy, but a shift in how the company would work, build, and sell going forward.
🔄 AI Work = Everyone’s Work
AI is now part of the fabric of Box’s internal strategy. Every department has a role to play in building, testing, or adopting AI:
Function | AI Role |
---|---|
Engineering | Build internal tools, integrate external models, test developer agents |
Product | Design UX around Hubs, agents, and enterprise workflows |
Sales | Use Hubs to extract customer insights, generate proposals |
Marketing | Leverage Box AI + Claude for content ideation and automation |
Support | Build answer bots from internal product documentation |
Legal/Compliance | Ensure data governance, permissions, and safe agent deployment |
This wasn't a departmental initiative. It was organizational transformation.
🧩 Summary: The Playbook for AI Culture at Box
Principle | How Box Made It Happen |
---|---|
Founder-led urgency | Levie and execs pushed hard after ChatGPT’s launch |
Central team, open APIs | Built platform components, but encouraged usage across org |
“Use AI aggressively” mandate | Company-wide permission and expectation to integrate AI |
Internal demos > top-down lectures | Weekly all-hands featured cross-functional AI wins |
Empowerment, not control | No gatekeeping — anyone can try tools, give feedback |
Phased scaling | Started lean, now Box’s AI org is the largest internal team |
Part Four: The Future of Enterprise Agents
Aaron Levie believes the future of enterprise AI isn’t just about better models — it’s about how systems talk to each other. The next big wave, in his view, is not just using AI in isolation, but embedding agents deeply into the enterprise software fabric so they can collaborate across tools, vendors, and data systems.
“We imagine a world where agents can run around and talk to each other. And Box is one of those agents.”
🧩 The Big Shift: From Monoliths to Modular Agent Ecosystems
Today, enterprises often build siloed AI features into their platforms — chatbots here, summaries there, smart filters elsewhere. But that model won’t scale. Levie believes we’re heading toward:
- Composable software systems, where AI agents act on behalf of users and apps.
- Horizontal data access, where content and knowledge live across tools (Box, Salesforce, Workday, etc.).
- Agent-to-agent workflows, where one app doesn’t own the entire workflow, but participates in it.
“We don’t need to own the interface — we just want to be the best place for content to be used in these workflows.”
🧠 Conceptual Framework: Agents as System-Level Actors
Levie sees agents as a natural abstraction on top of software APIs. Instead of writing brittle integrations, imagine this:
- A Box Agent handles content queries: “Give me all contracts signed in the past year.”
- A Salesforce Agent adds metadata: “These deals closed above $1M.”
- A Workday Agent tags employees involved in those deals.
Each of these agents communicates asynchronously, respects permissions, and works on scoped tasks.
He references the analogy of REST APIs in the early 2000s:
“It’s following the same curve we saw with APIs 20 years ago. Agents will become the new API surface.”
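The three-agent example above can be sketched as a chain of scoped handoffs, with each stub agent enriching a shared payload. Real agents would call their own system's APIs and enforce their own permissions; everything below is illustrative:

```python
# Stub version of the Box -> Salesforce -> Workday handoff described above.
# Each "agent" works on a scoped task and enriches the payload it receives.

def box_agent(payload):
    # "Give me all contracts signed in the past year."
    payload["contracts"] = [{"id": "c1", "customer": "Acme"},
                            {"id": "c2", "customer": "Globex"}]
    return payload

def salesforce_agent(payload):
    # "These deals closed above $1M." -- annotate, don't replace.
    deal_sizes = {"c1": 1_500_000, "c2": 400_000}
    for c in payload["contracts"]:
        c["above_1m"] = deal_sizes[c["id"]] > 1_000_000
    return payload

def workday_agent(payload):
    # Tag employees involved in the qualifying deals.
    payload["owners"] = {c["id"]: "account-team"
                         for c in payload["contracts"] if c["above_1m"]}
    return payload

result = workday_agent(salesforce_agent(box_agent({})))
```

Note that no single system owns the workflow: each agent adds only what its domain knows, which is the composability Levie is describing.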
🏗️ Enabling the Future: What Box Is Building
To prepare for this ecosystem, Box is doing two things:
1. Plug-and-Play Agent Infrastructure
- Every Box Hub can be accessed via API.
- You can query a Hub with an agent of your choice.
- Box AI doesn’t require its own frontend — you can build your own.
“We don’t care if the customer comes through our UI or just uses the agent in their own system.”
2. Agent Development APIs
- Box offers developer APIs that let third-party systems:
  - Trigger agents
  - Query content via Box Hubs
  - Integrate into workflows outside of Box (e.g., internal dashboards or vertical SaaS)
This enables composable automation — agents that can be assembled like LEGO blocks to complete work across platforms.
🛑 Interop Challenges: Where Things Break Today
While the vision is compelling, Levie is realistic about the technical and operational challenges:
- Search reliability: If an agent pulls the wrong data (due to poor retrieval), every step after is tainted.
- Permissions and access control: AI can surface sensitive data never meant to be discoverable.
- Agent chaining errors: With each handoff, probabilistic errors can compound — leading to hallucinated outputs.
- Lack of shared standards: There's no universal "language" or protocol for how agents should describe capabilities or hand off tasks.
These are non-trivial engineering challenges that need to be solved before agentic systems can scale reliably.
“The moment the AI finds the wrong thing, you're path-dependent on that error. Now everything downstream is wrong.”
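The compounding problem is easy to quantify: if each step in a chain is independently correct with probability p, the whole n-step chain is correct with probability p^n, so reliability falls off fast even when individual steps look good:

```python
# If each step in an agent chain is independently right with probability p,
# the whole chain is right with probability p**n.

def chain_reliability(p, n):
    return p ** n

# A 95%-accurate step looks fine in isolation...
print(round(chain_reliability(0.95, 1), 3))   # 0.95
# ...but ten chained handoffs drop below 60%.
print(round(chain_reliability(0.95, 10), 3))  # 0.599
```

This assumes independent errors; in practice a wrong retrieval early on makes every downstream step wrong, which is the path-dependence Levie describes.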
🔧 Agent Standards: The Role of MCP and Beyond
Levie calls out Anthropic’s MCP (Model Context Protocol) initiative as a potential catalyst:
“God bless Anthropic for putting MCP out early... someone needed to plant the flag.”
MCP proposes a standard protocol that agents could use to describe their capabilities, communicate context, and pass along tasks — like an API contract, but for LLM agents.
He’s optimistic, but also acknowledges:
- Competing frameworks will emerge.
- It may take 2–3 years before robust agent handoff becomes mainstream.
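As a purely illustrative sketch — field names invented, not taken from the MCP spec — an agent "capability card" plus a contract check on a handoff might look like this:

```python
import json

# What a machine-readable capability description might look like, loosely
# in the spirit of MCP-style tool definitions. Every field name here is
# illustrative and not taken from any published spec.

capability = {
    "name": "search_contracts",
    "description": "Find contracts in a Hub matching a natural-language query",
    "inputs": {
        "hub_id": {"type": "string"},
        "query": {"type": "string"},
    },
    "outputs": {
        "documents": {"type": "array", "items": {"type": "string"}},
    },
}

def validate_call(cap, args):
    """Reject a handoff whose arguments don't match the declared inputs --
    the agent-world analogue of an API contract check."""
    missing = set(cap["inputs"]) - set(args)
    if missing:
        raise ValueError(f"missing arguments: {sorted(missing)}")
    return True

validate_call(capability, {"hub_id": "hub-123", "query": "renewals"})
print(json.dumps(capability["inputs"], indent=2))
```

The value of a shared standard is exactly this: any agent can read the card, know what the other side accepts, and fail loudly instead of hallucinating through a malformed handoff.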
🛣️ Timeline and Adoption Curve
Levie sees the future playing out much like the evolution of cloud APIs:
Milestone | Equivalent in Agent Ecosystem |
---|---|
REST APIs become common | Agents define and expose capabilities |
API documentation becomes standard | Agents describe skills, inputs, outputs |
API calls become orchestrated | Agents chain workflows intelligently |
“Two years from now, I think we’ll feel totally comfortable that all our software can talk to each other agentically.”
⚖️ Developer Tradeoffs: API vs. Agent
As this new paradigm emerges, developers will face key architectural choices:
- Should I write direct API calls for precision and cost control?
- Or use agents to orchestrate tasks in a more flexible but fuzzy way?
Levie’s prediction:
“A new generation of vibe coders will think the world is all MCPs... but we’ll need to be thoughtful about when you use agents vs. raw APIs.”
He warns against over-agentifying tasks that would be simpler with deterministic APIs. A hybrid model — agent for intelligence, API for precision — will likely dominate.
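The hybrid model can be sketched as a tiny router: tasks with a deterministic answer go straight to an API call, while open-ended tasks go to an agent. Task names and handlers here are illustrative:

```python
# Sketch of the hybrid pattern: deterministic work goes straight to an API
# call; fuzzy, open-ended work is routed to an agent. The task taxonomy and
# handlers are illustrative.

DETERMINISTIC_TASKS = {"get_file", "list_folder", "update_metadata"}

def direct_api(task, args):
    return f"api:{task}({args})"    # precise, cheap, testable

def agent(task, args):
    return f"agent:{task}({args})"  # flexible, probabilistic, costlier

def route(task, args):
    handler = direct_api if task in DETERMINISTIC_TASKS else agent
    return handler(task, args)

print(route("get_file", "id=42"))                     # exact lookup -> API
print(route("summarize_quarter_risks", "hub=legal"))  # open-ended -> agent
```

The design choice is the one Levie predicts will dominate: agents for intelligence, raw APIs for precision, with an explicit boundary between the two.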
🧠 Strategic Positioning: Box’s Role in the Ecosystem
Box isn’t trying to be everything — it’s focused on being the best content agent in a network of interoperable systems:
- They won’t build vertical SaaS apps (e.g., contract management or legaltech).
- But they will expose content agents to help those tools access Box data reliably.
- They welcome deep integrations, even with “competing” AI stacks.
“There’s always going to be more developers outside your company than inside — don’t rely only on internal innovation.”
(Referencing Bill Joy’s quote about Sun Microsystems)
🧭 Summary: The Interoperable Agent Future
Pillar | Strategy |
---|---|
Architecture | Build agents on top of content and workflows — not UIs or monoliths |
Openness | Support third-party models, developer APIs, and plug-ins |
Vision | Become a modular agent in a larger enterprise AI ecosystem |
Interoperability | Bet on open protocols like MCP for agent-to-agent handoff |
Pragmatism | Use agents where appropriate, retain API-first options where better |