From Theory to Practice: How Capital One is Building the Future of Banking with Agentic AI

At NVIDIA’s GTC 2025 conference, Capital One showcased its impressive journey in AI adoption, presenting two complementary perspectives that reveal how the financial giant is transforming customer service and building competitive advantage through proprietary AI development. As a technology-first company that’s changed from the inside out, Capital One has leveraged its decade-long digital transformation to position itself at the forefront of AI innovation in banking.

The Journey: From Analytics to Agentic AI

“For most organizations, your data advantage is your AI advantage,” explained Prem Natarajan, PhD, EVP, Chief Scientist, and Head of Enterprise AI at Capital One during his fireside chat with NVIDIA’s Jennifer St. John-Foster. “It doesn’t mean data alone will take you to your AI destination, but it’s near impossible to get to your AI destination without proprietary data.”

This philosophy has been fundamental to Capital One’s approach since its founding. The bank’s deep reverence for data drove first a wave of analytics, then the adoption of statistical machine learning, and now advanced AI implementations that are transforming both customer and employee experiences.

What makes the current wave of AI different? Natarajan highlighted two key aspects:

Modern transformer-based learning techniques can consume “massive, unprecedented amounts of data in a computationally tractable way,” enabling much better predictions
Generative capabilities open an entirely new spectrum of use cases beyond prediction

These advantages have enabled Capital One to progress from using generative AI for knowledge retrieval and customer interactions to developing agentic AI solutions that actually take actions for customers.

Building an Agent Servicing Platform from the Ground Up

While many enterprises are still exploring potential AI use cases, Capital One has already rolled out its proprietary Agent Servicing tool to approximately 20,000 customer service agents across multiple business units. This tool, presented by Tamara Sigler, MVP, Head of Servicing Strategy, and Alfy Samuel, Director of AI Foundations, demonstrates Capital One’s practical application of AI in a regulated environment.

“Customer service agents have a very tough job,” Sigler explained. “Their first job is to listen and understand what the customer needs, then figure out how to solve that problem.” While experienced agents can often work from memory, the complexity of financial products means they frequently need to look up information—a process that historically created delays and frustration.

Capital One’s Agent Servicing tool tackles this challenge through a Retrieval Augmented Generation (RAG) framework that grounds AI responses in internal knowledge content. The system combines advanced hybrid search with generative AI to provide agents with a natural, knowledge-grounded experience.
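
The retrieve-then-generate flow described above can be sketched in a few lines. This is a minimal illustration and not Capital One’s implementation: the function names (build_grounded_prompt, answer_with_rag), the prompt wording, and the injected retrieve and generate callables are all hypothetical stand-ins for a real search service and LLM endpoint.

```python
from typing import Callable, List


def build_grounded_prompt(question: str, passages: List[str]) -> str:
    """Assemble a prompt that restricts the model to the retrieved passages."""
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using only the numbered knowledge passages below. "
        "If they do not contain the answer, say so.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )


def answer_with_rag(
    question: str,
    retrieve: Callable[[str, int], List[str]],  # hypothetical search client
    generate: Callable[[str], str],             # hypothetical LLM client
    top_k: int = 3,
) -> str:
    """Retrieve knowledge passages, then generate an answer grounded in them."""
    passages = retrieve(question, top_k)
    if not passages:
        # Without grounding content, decline instead of letting the model guess.
        return "No relevant knowledge article found."
    return generate(build_grounded_prompt(question, passages))
```

Passing the retriever and generator in as callables keeps the orchestration step independent of any particular search backend or model server, which mirrors the separation between retrieval and inference services that the talk describes.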

The technical architecture is impressive. As Samuel detailed, the RAG service orchestrates between a retrieval service built on AWS OpenSearch and an LLM inference service deployed on SageMaker backed by NVIDIA A100 GPUs. The retrieval service supports multiple optimization techniques, including keyword, vector, and hybrid search methods with reranking models.
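
To make the idea of hybrid search concrete, here is a toy sketch that merges a keyword ranking and a vector ranking with reciprocal rank fusion (RRF). RRF is one common fusion method and is an assumption here; the presentation did not say how Capital One combines its rankings. The sample documents, the term-overlap scorer (a stand-in for BM25), and the character-trigram embed function (a stand-in for a real embedding model) are all invented for illustration.

```python
import math
from collections import Counter

# Hypothetical knowledge articles, invented for this example.
DOCS = {
    "kb-1": "How to dispute a credit card transaction",
    "kb-2": "Steps to replace a lost or stolen debit card",
    "kb-3": "Raising a chargeback for an unrecognized purchase",
}


def keyword_rank(query, docs):
    """Rank doc ids by count of shared lowercase terms (stand-in for BM25)."""
    q_terms = set(query.lower().split())
    scores = {d: len(q_terms & set(t.lower().split())) for d, t in docs.items()}
    return sorted(docs, key=lambda d: -scores[d])


def embed(text):
    """Toy embedding: character trigram counts (stand-in for a real model)."""
    s = text.lower()
    return Counter(s[i:i + 3] for i in range(len(s) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def vector_rank(query, docs):
    """Rank doc ids by cosine similarity to the query embedding."""
    q = embed(query)
    return sorted(docs, key=lambda d: -cosine(q, embed(docs[d])))


def rrf_fuse(rankings, k=60):
    """Reciprocal rank fusion: score(d) = sum over rankings of 1 / (k + rank)."""
    fused = Counter()
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    return sorted(fused, key=lambda d: -fused[d])


def hybrid_search(query, docs, top_n=5):
    """Fuse the keyword and vector rankings and return the top_n doc ids."""
    return rrf_fuse([keyword_rank(query, docs), vector_rank(query, docs)])[:top_n]
```

Because RRF works on ranks and not raw scores, it avoids having to normalize keyword and vector scores onto a shared scale. In a production pipeline, a reranking model would typically rescore the fused shortlist before it reaches the agent.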

[Figure: Capital One’s Agent Servicing Tool: Scalable LLM Deployment Architecture with SageMaker and Kubernetes]

“We support semantic search, hybrid search. Additionally, we have multiple embedding models, reranking models, as well as personalization on top, which are all applicable to different business use cases,” Samuel explained. This sophisticated retrieval approach ensures that agents receive the most relevant information for their specific queries in different business contexts.

A key advantage of this architecture is its scalability and efficiency. The model inference service uses a GitOps-based workflow via Jenkins to apply configurations on Kubernetes clusters. This enables rapid deployment and optimization for non-functional requirements including performance, latency, reliability, and scalability.

[Figure: AWS Enterprise Platform Enables Agent Servicing Experiences]

Competitive Advantage Through Open Weights Models

A strategic decision that distinguishes Capital One’s AI approach is its commitment to open weights models. While many financial institutions partner with AI vendors using proprietary closed models, Capital One recognized early that this approach would limit its ability to leverage its data advantage.

“We made a very clear bet on open source,” Natarajan explained. “We say open source, but what we really mean is open weights.” This philosophy enables Capital One to fine-tune models with its proprietary data, creating differentiated value that closed systems would prohibit.

The company has closely tracked the performance of open source models and observed that “with each new release, the gap with the state of the art was being cut dramatically.” Additionally, the computational performance of these models has improved exponentially—Natarajan noted a staggering 1000x increase in efficiency over just 20 months.

This approach requires significant in-house expertise. Capital One has invested in customizing models, optimizing inference, and building robust guardrails—all essential in the highly regulated financial services industry.

Measurable Results: Improving Agent and Customer Experience

The business impact has been substantial. The Agent Servicing tool has demonstrably improved search relevance, with successful searches (defined as finding a relevant article within the top five links) increasing from 84% with the legacy system to 93% with the AI-powered solution.

This 9-percentage-point improvement means agents spend less time searching for information and more time assisting customers. As Sigler noted, “If you’ve ever been waiting on the phone and been put on hold, oftentimes it’s because the agent is searching for information. Being able to give access to that in the top five search results means agents don’t have to scroll deeper into the UI.”
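
The search-success metric can be written down precisely, and the 84% and 93% figures can be read in more than one way. The sketch below is illustrative only: success_at_k and its input shapes are hypothetical, and only the two percentages come from the presentation.

```python
def success_at_k(ranked_results, relevant, k=5):
    """Fraction of queries with at least one relevant doc in the top k results.

    ranked_results: dict mapping query -> ordered list of doc ids
    relevant:       dict mapping query -> set of relevant doc ids
    """
    hits = sum(
        1
        for query, ranking in ranked_results.items()
        if any(doc in relevant.get(query, set()) for doc in ranking[:k])
    )
    return hits / len(ranked_results) if ranked_results else 0.0


# The two success rates reported in the talk.
legacy, ai = 0.84, 0.93

# Absolute gain in percentage points.
point_gain = (ai - legacy) * 100

# Relative gain in successful searches, as a percent of the legacy rate.
relative_gain = (ai - legacy) / legacy * 100

# Share of previously failed searches that now succeed.
failure_reduction = ((1 - legacy) - (1 - ai)) / (1 - legacy) * 100
```

Read this way, the change is 9 percentage points in absolute terms, roughly 10.7 percent more successful searches in relative terms, and a cut of about 56 percent in failed searches (from 16% to 7% of queries).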

The solution balances three key characteristics:

Fast: Meeting operational SLAs for response time
Accurate: Returning better results that make information easier to find
Flexible: Enabling simple learning and adaptation for thousands of agents

The AI Flywheel: Data as a Competitive Advantage

What makes Capital One’s approach particularly powerful is its recognition of the “AI flywheel” effect. Customer interactions produce data, which drives analytics that improve services, leading to better customer experiences, more engagement, and ultimately more data.

Traditional manual analytics can only cover a fraction of customer interactions. AI allows companies to analyze the full surface area of those interactions. More importantly, as Natarajan explains, “These models will actually learn from these interactions and keep improving themselves. We’re already seeing evidence of that in some places. When that happens, your improvements are…this flywheel starts turning by itself.”

This virtuous cycle doesn’t just improve existing services—it also helps identify opportunities for entirely new products and services. “You’re not just talking about this flywheel; you’re talking about a much bigger flywheel, a much heavier flywheel, but that still moves faster and faster over time,” Natarajan said.

Technical Implementation Details

For the technically minded, Capital One’s Agent Servicing tool incorporates several cutting-edge optimizations:

Unified Tech Stack on NVIDIA Triton Inference Server: Providing fast service with support for multiple backends
Speculative Decoding: Delivering 3x faster inference
KV Cache Compression: Enabling very long contexts and efficient inference
8-bit Floating Point (FP8): Offering 30% faster inference and 2.2x throughput
Structured Output: Providing server-side support for predictable output
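
Of the optimizations listed above, speculative decoding is the easiest to illustrate in isolation. The toy sketch below shows the greedy variant: a cheap draft model proposes several tokens, and the target model keeps the longest prefix it agrees with, plus one token of its own. It is a simplified assumption-laden illustration, not the Triton or Capital One implementation. The two "models" are hypothetical arithmetic functions, and the verification loop stands in for what a real system does in a single batched forward pass.

```python
def greedy_decode(next_token, prompt, max_new):
    """Baseline: one model call per generated token."""
    seq = list(prompt)
    for _ in range(max_new):
        seq.append(next_token(seq))
    return seq


def speculative_decode(target_next, draft_next, prompt, max_new, k=4):
    """Greedy speculative decoding; returns (tokens, target verification passes)."""
    seq = list(prompt)
    goal = len(prompt) + max_new
    target_passes = 0
    while len(seq) < goal:
        # 1. The cheap draft model proposes up to k tokens autoregressively.
        draft = []
        for _ in range(min(k, goal - len(seq))):
            draft.append(draft_next(seq + draft))
        # 2. The target model checks every drafted position. A real system
        #    scores all positions in one forward pass; this loop emulates it.
        target_passes += 1
        accepted = []
        for tok in draft:
            expected = target_next(seq + accepted)
            if tok == expected:
                accepted.append(tok)
            else:
                accepted.append(expected)  # replace the first mismatch, then stop
                break
        else:
            # Every draft token matched, so the same pass yields a bonus token.
            if len(seq) + len(accepted) < goal:
                accepted.append(target_next(seq + accepted))
        seq.extend(accepted)
    return seq, target_passes


def target_next(seq):
    """Hypothetical 'large' model: a deterministic toy rule."""
    return (sum(seq) + 1) % 10


def draft_next(seq):
    """Hypothetical 'small' model: agrees with the target except at some positions."""
    guess = (sum(seq) + 1) % 10
    return guess if len(seq) % 4 else (guess + 1) % 10
```

With greedy verification the output matches what the target model would have produced on its own, so any speedup comes from needing fewer target passes, not from changing the result. How large that speedup is depends on how often the draft model agrees with the target; the 3x figure above is the presenters’ number, not something this toy reproduces.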

The deep collaboration with NVIDIA has been instrumental in these optimizations. “One of the things that’s key when you’re building user-facing applications…is latency,” Natarajan emphasized. “The latency of that engagement is a key driver of the volume and quality and satisfaction of user engagement.” Through their partnership with NVIDIA, Capital One achieved an 8-10x improvement in latency.

[Figure: Retrieval Service Supports Multi Models Optimized for Scale & Low Latency]

The Human Element: AI with Guardrails

Despite all this technological sophistication, Capital One maintains a strong focus on the human element. “Human-in-the-loop AI allows us to take a scientific, yet human-based approach to improving our operations,” was a key message in Sigler and Samuel’s presentation.

As a financial services company, Capital One must balance innovation with risk management. “Everything we do, we start by thinking about risk, and we end by testing for risk before we put it out into the market,” Natarajan explained. This includes internal guardrails and having humans validate model outputs before they reach customers.

Importantly, this human oversight isn’t seen as a limitation but as an accelerator of the AI flywheel. Human feedback helps improve models and accelerates the enhancement of customer experiences.

Looking Forward: From Specialized Models to Unified Intelligence

Capital One views its Agent Servicing tool as “a proving ground for many other Capital One applications that will leverage the same capabilities.” The infrastructure and approaches developed here will enable innovation across the organization.

Looking further ahead, Natarajan sees agentic AI as today’s computational embodiment of how we achieve automation, but he envisions a future where specialized models may converge: “I wonder whether the real longer-term trajectory there is towards a single kind of model that is able to instantiate specialization in what’s required in the moment that it is required.”

Key Lessons for Enterprise AI Adoption

Capital One’s presentations offered valuable insights for other enterprises embarking on AI adoption:

Data Readiness: “If your data is not ready, you’re not at the starting point of your AI journey.”
Customer Focus: “Think about how is it improving employee experience? How is it improving the customer experience? And the other goodness will follow.”
Technology Strategy: Decide whether to build on open source/open weights (requiring significant investment) or take another approach.
Strategic Commitment: Rather than pursuing use-case-by-use-case ROI, make a thoughtful strategic commitment to AI as a transformative force.

Conclusion

Capital One’s AI journey illustrates how a technology-first approach, combined with a deep understanding of data and customer needs, can transform even heavily regulated industries like banking. By building proprietary AI capabilities grounded in open weights models, the company has created differentiated value that improves both employee and customer experiences.

As AI continues to evolve, Capital One’s foundation—built on a decade of technological transformation—positions it well to continue leading innovation in financial services.

This article is based on presentations by Capital One leaders at NVIDIA GTC 2025, including “AI at Scale: Lessons from Capital One’s Agentic AI Adoption” with Prem Natarajan and “How Capital One Built its Own Generative AI Agent Servicing Tool” with Tamara Sigler and Alfy Samuel.