
The research was always there.
Now anyone can ask it.
Building an AI knowledge agent for product teams at IBM Security.

A product manager needs to make a critical roadmap decision. He remembers research from two years ago that's directly relevant — but not which study, not which report, not which folder it lived in.
A designer needs to defend a decision to a skeptical stakeholder. She knows past research supports her position — but can't surface it quickly enough to matter.
These weren't edge cases. They were Tuesday.
Research insights were being generated continuously — studies completed, findings documented, patterns emerging across projects. But the knowledge lived in static reports and a rigid query tool that could only return what it had been explicitly told to surface.
The data was there. The decisions died.
This case study is about changing that.
Three bullets and a link.
IBM Security had a research repository — a query tool built on Airtable, used across the security portfolio. Researchers logged each completed study: the name, date, product area, product, a link to the full report, and a bulleted list of what they determined were the top three insights.
On paper, it was a knowledge base. In practice, it was a filing system with a search bar.
Querying it required knowing what you were looking for. Most people didn't know the study name — they knew a product area and a vague memory of something relevant. So searches amounted to a product name plus a few keywords, returning a snippet: study name, date, researcher, product area, product, a link, and those three bullets.
Three bullets. Chosen by the researcher. Mandatory. Final. A study might have uncovered eight significant findings — but only three made the list, selected by one person, on one day, with no way to anticipate what a colleague might need two years later.
Whatever didn't make the list stayed in the report. Locked behind a link most people didn't have the time (or the will) to open.
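To make the limitation concrete, here is a minimal sketch of the record shape and keyword lookup described above. It is an illustration, not the actual Airtable configuration: the class name, field names, and `keyword_search` function are all assumptions introduced for this example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class StudyRecord:
    """One repository entry (hypothetical field names)."""
    name: str
    completed: date
    researcher: str
    product_area: str
    product: str
    report_url: str
    top_insights: tuple[str, str, str]  # exactly three, fixed when the study is logged


def keyword_search(records, product, keywords):
    """Return entries for a product whose name or three bullets mention any keyword.

    Anything that lives only in the linked report is invisible to this lookup.
    """
    terms = [k.lower() for k in keywords]
    hits = []
    for record in records:
        if record.product.lower() != product.lower():
            continue
        haystack = " ".join((record.name, *record.top_insights)).lower()
        if any(term in haystack for term in terms):
            hits.append(record)
    return hits
```

A finding that never made the three-bullet list cannot match, no matter how relevant it is, because the lookup never reads the report behind the link.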
When the tool failed to provide what was needed — and it often did — people Slacked the researcher directly. If they still worked there. If they even remembered.
That was the system. It wasn't broken. It was just built for a world before AI made something better possible.
Intelligence in waiting.
The research sitting inside those reports represented years of direct contact with real users — their frustrations, their mental models, their unmet needs, their moments of clarity. That knowledge had the potential to inform decisions already being made — and ones nobody ever saw coming.
But potential isn't the same as impact. Knowledge that can't be reached doesn't influence decisions. It just ages.
The opportunity wasn't simply to build a better search tool. It was to transform how institutional knowledge moved through a product organization — from a static archive that rewarded persistence and personal connections, to a living resource that anyone could access, in the moment they needed it, with the full depth of what the organization actually knew.
Better access meant better informed decisions. A PM who could instantly surface what users said about a pain point two years ago makes a different roadmap call than one who relies on memory or skips the reference entirely. A designer who can pull cross-study evidence in minutes walks into a stakeholder meeting with a fundamentally stronger position.
And there was a second beneficiary: the researchers themselves. Every Slack message asking "do you remember what we found about X?" was a fragment of deep work interrupted — a tax on the people best positioned to generate new knowledge rather than retrieve old knowledge. An agent that answered those questions autonomously gave researchers back the focus to do the work that actually mattered.
The goal wasn't convenience. It was compounding intelligence, so that knowledge gained could pay dividends long after the studies that generated it were complete.
From archives to answers.
The build started with a question that shaped every decision that followed: what does this agent actually need to do?
The answer wasn't "search research reports." Search already existed. The Airtable tool was search — constrained, rigid search, but search nonetheless. What was needed was something fundamentally different: an agent that could reason across a corpus of knowledge, synthesize findings from multiple studies simultaneously, and return answers that reflected the full depth of what the organization knew — not just the three bullets someone chose to enter years ago.
That distinction — retrieval versus synthesis — drove every architectural decision.
The corpus
The foundation was thirteen research reports spanning IBM Security's Data Security portfolio: studies covering compliance workflows, data classification, user onboarding, privacy management, and security posture monitoring, among others. Before ingestion, each report was reviewed and scrubbed of personally identifiable participant information, internal project codenames, and any unreleased roadmap references. What remained was the research intelligence itself: findings, patterns, user behaviors, mental models, pain points, and opportunities.
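The scrubbing step could be supported by a simple automated pre-pass before human review. The sketch below is an assumption about how such a pass might look, not a description of the process actually used: the `scrub` function, the placeholder strings, and the name and codename lists passed to it are all hypothetical.

```python
import re

# Simple email pattern; a pre-pass like this assists human review, it does not replace it.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def scrub(text, codenames=(), participant_names=()):
    """Replace emails, listed participant names, and listed codenames with placeholders."""
    text = EMAIL.sub("[email removed]", text)
    for name in participant_names:
        text = re.sub(rf"\b{re.escape(name)}\b", "[participant]", text)
    for code in codenames:
        text = re.sub(rf"\b{re.escape(code)}\b", "[internal project]", text,
                      flags=re.IGNORECASE)
    return text
```

Roadmap references are harder to catch with patterns, which is one reason a manual review of each report remains necessary.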
The approach
Rather than building a document retrieval system that returns files, the agent was configured to read across the entire corpus and generate synthesized responses grounded in the actual content. A question about user attitudes toward compliance doesn't return a link to a study — it returns a synthesized answer drawn from every relevant study in the corpus, with the underlying sources identified.
This matters because the most valuable insights rarely live in a single study. Patterns emerge across studies. Contradictions between studies are often more instructive than the studies themselves. An agent that reasons across the corpus rather than within a single document surfaces that layer of intelligence — the layer that the Airtable tool could never reach.
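The difference between returning a document and reasoning across the corpus can be sketched in a few lines. This is an illustrative assumption, not the agent's actual configuration in its no-code platform: the `Passage` type, the word-overlap relevance check, and the prompt wording are all hypothetical stand-ins for whatever retrieval and prompting the platform performs internally.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    study: str  # hypothetical study title
    text: str   # a finding or excerpt from that study's report


def _tokens(text):
    """Lowercased words longer than three characters, with punctuation stripped."""
    words = (w.strip(".,;:!?").lower() for w in text.split())
    return {w for w in words if len(w) > 3}


def gather_evidence(question, passages, min_overlap=2):
    """Collect relevant passages from every study, grouped by study."""
    q = _tokens(question)
    evidence = defaultdict(list)
    for p in passages:
        if len(q & _tokens(p.text)) >= min_overlap:
            evidence[p.study].append(p.text)
    return dict(evidence)


def build_synthesis_prompt(question, evidence):
    """Assemble a prompt asking a model to synthesize across studies and cite them."""
    lines = [f"Question: {question}", "", "Evidence by study:"]
    for study, texts in sorted(evidence.items()):
        lines.append(f"[{study}]")
        lines.extend(f"- {t}" for t in texts)
    lines.append("")
    lines.append("Synthesize one answer across these studies, note contradictions, "
                 "and cite each study you rely on by its bracketed name.")
    return "\n".join(lines)
```

The key design point is that evidence is pooled across all studies before any answer is written, so agreements and contradictions between studies are visible to the model in a single pass.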
The design decisions
Four deliberate choices shaped how the agent behaves:
Synthesis over retrieval: the agent generates answers; it doesn't return documents. This was the core design principle, and everything else followed from it.
Citation over assertion: every response identifies which studies informed the answer. This wasn't just about accuracy; it was about trust. An enterprise knowledge tool that makes claims without sources isn't useful, it's dangerous. Users needed to be able to verify, drill deeper, and build on what the agent returned.
Transparency over confidence: this agent was built to surface insights from the research corpus, not to fabricate them. When the corpus provides sufficient grounding, responses draw exclusively from it, with studies cited. When coverage is thin, the agent supplements with general knowledge from Claude, the underlying model built by Anthropic, and discloses that distinction explicitly. Users always know whether they're reading from the research or from outside it. In a knowledge context, the provenance of an answer matters as much as the answer itself.
Conversation over query: the agent employs the Flipped Interaction Pattern, opening by asking about the user's underlying goal rather than waiting passively for input. Understanding the decision someone is trying to make produces more targeted and useful responses than answering the literal question in isolation. This mirrors how a skilled researcher actually operates: diagnosis before prescription.
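The citation and transparency choices above can be sketched as a small labeling step. This is a hedged illustration only: the `AgentResponse` type, the `label_response` function, the `min_studies` threshold for "sufficient grounding," and the disclosure wording are all assumptions, since the behavior was configured in a no-code platform rather than written as code.

```python
from dataclasses import dataclass, field


@dataclass
class AgentResponse:
    answer: str
    cited_studies: list = field(default_factory=list)
    provenance: str = "corpus"  # "corpus" or "corpus+general"
    disclosure: str = ""        # shown to the user when general knowledge is used


def label_response(answer, supporting_studies, min_studies=2):
    """Attach citations and a provenance label to an answer.

    min_studies is a hypothetical stand-in for "sufficient grounding".
    """
    studies = sorted(set(supporting_studies))
    if len(studies) >= min_studies:
        return AgentResponse(answer, studies, "corpus", "")
    note = ("Coverage in the research corpus is thin for this question; "
            "parts of this answer draw on general model knowledge.")
    return AgentResponse(answer, studies, "corpus+general", note)
```

Keeping provenance as an explicit field, rather than burying it in the answer text, means the interface can always show users whether they are reading from the research or from outside it.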
The platform
The agent was built and deployed in Relevance AI, a no-code platform purpose-built for document-grounded agents. This choice was deliberate. Enterprise AI adoption fails most often not because the technology doesn't work, but because the solutions built require engineering resources to maintain, iterate on, and scale. A tool that a business user can own, adjust, and build on without technical support is fundamentally more sustainable than a bespoke engineering project.
Relevance AI was chosen because it mirrors the deployment reality of enterprise AI — accessible, configurable, and built for the people closest to the problem to operate independently.
Evaluation
© 2026 4humans. All rights reserved.
AI should be designed 4humans. Not the other way around.


