Launching an Agentic RAG Chatbot with MongoDB and Dataworkz
Leverage agentic RAG using MongoDB and Dataworkz to enhance customers’ shopping experiences with a personalized chatbot.
Use cases: Gen AI, Personalization
Industries: Retail
Products: MongoDB Atlas, Atlas Vector Search
Partners: Dataworkz
Solution Overview
The retail industry is often at the forefront of trends and quick to embrace innovation. With the rise of technologies like generative AI and agents, retailers have adopted them for a range of use cases, including real-time assistance, personalized customer experiences, and enhanced search results. These capabilities are transforming the way brands connect with customers.
Recent studies reveal that AI-powered chatbots boost online sales in the United States by nearly 4% year-over-year, reinforcing the idea that AI is not just a trend but a lasting driver of growth in retail. Today, we will explore agentic RAG (retrieval-augmented generation), a technology that can further transform the retail space. We will also take a closer look at how a solution built on MongoDB and Dataworkz combines operational and unstructured data to drive transformative customer experiences.
Before we jump in, let’s review a few concepts that set the context for this solution.
What Is an Agent?
An agent is an artificial computational entity that is aware of its environment and of the data relevant to its context. In ecommerce, agents can interact with customers directly or carry out parts of a complex task as needed.
Retrieval-Augmented Generation (RAG)
Retrieval-augmented generation is an approach used to augment large language models (LLMs) with proprietary data so that they can generate more accurate and context-aware responses while also reducing hallucinations.
Figure 1. Conventional RAG
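The retrieve-then-generate flow described above can be sketched in a few lines. This is a minimal illustration, not how MongoDB or Dataworkz implement it: the documents, their three-dimensional embeddings, and the function names are all made up for the example, and a real system would obtain embeddings from a model and send the prompt to an LLM.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, docs, k=2):
    """Return the k documents whose embeddings are most similar to the query."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d["embedding"]), reverse=True)
    return ranked[:k]

def build_prompt(question, context_docs):
    """Augment the question with retrieved context before it goes to an LLM."""
    context = "\n".join(f"- {d['text']}" for d in context_docs)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

# Hypothetical proprietary documents with hand-written toy embeddings.
docs = [
    {"text": "Returns are accepted within 30 days.", "embedding": [0.9, 0.1, 0.0]},
    {"text": "Standard shipping takes 3 to 5 business days.", "embedding": [0.1, 0.9, 0.0]},
    {"text": "Gift cards never expire.", "embedding": [0.0, 0.1, 0.9]},
]

# Hand-written query vector; a real system would embed the question text.
query_vec = [0.8, 0.2, 0.0]
top = retrieve(query_vec, docs, k=1)
prompt = build_prompt("What is the return policy?", top)
```

The prompt now carries the most relevant policy text, which is what lets the model ground its answer in proprietary data rather than in its training data alone.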
Agentic RAG
Agentic RAG takes this a step further by introducing an AI agent-based implementation of RAG. In this model, the agent has access to different tools and functions, which lets it go beyond information retrieval and generation: it can plan. Agents can determine whether they need to retrieve specific information, decide which tool to use for the retrieval, and formulate queries. These capabilities are crucial because they enable the agent to pull information from multiple data sources and handle complex queries that require more than one source to formulate a response.
You can start to picture the enormous potential it has across industries to enhance customer interactions and streamline processes.
Figure 2. Agentic RAG
For more details see Dataworkz Agentic RAG.
Figure 3. Conventional vs Agentic RAG comparison
Impact on the Retail Industry
Generative AI is revolutionizing the retail industry. In 2024, one quarter of consumers used AI in their shopping experience, representing a 4% increase over the previous year. With such expansion, the potential for innovation remains vast.
Some generative AI use cases across retail include:
Building customer-support chatbots: Imagine a real-time, GenAI-powered support chatbot that is able to assist customers at any point in time and is context aware of the business’s policies as well as the user’s history and preferences.
Personalizing product recommendations: Providing customers with recommendations based on their specific likes, needs, and past orders makes the shopping experience more targeted, more enjoyable, and more likely to end with a purchase.
Creating dynamic marketing content: Generating fresh, relevant content for each customer through dynamic promotions, emails, or messages tailored to them. This leads to higher engagement, increased sales, and better customer retention.
The examples above show how MongoDB and Dataworkz already help retailers build impactful customer experiences.
Reference Architectures
Introducing Dataworkz
Dataworkz is an enterprise-grade RAG-as-a-service platform that transforms how organizations build and deploy AI applications. At its core is an advanced, agent-based architecture combined with graph-optimized retrieval, enabling large enterprises to launch sophisticated RAG applications in hours, not months.
The platform eliminates the need for specialized AI teams through an intuitive no-code builder that automatically implements best practices in RAG development. Unlike traditional approaches that lock you into early architectural decisions, Dataworkz enables rapid experimentation—you can test different retrieval strategies, prompt variations, and model combinations in a controlled environment before committing to production.
From data ingestion to production deployment, Dataworkz handles the entire RAG lifecycle with enterprise-ready features for scaling, debugging, and operations.
Key differentiators:
Graph-optimized knowledge retrieval for complex relationships.
Agent-based architecture for sophisticated reasoning.
No-code builder with built-in best practices.
Full lifecycle support from experimentation to production.
BYO (bring your own) flexibility—LLM, embedding model, vector database.
Enterprise-grade security and scalability.
Trusted by Fortune 500 companies, Dataworkz delivers production-ready RAG applications without the traditional overhead of building and maintaining complex AI infrastructure.
Leveraging MongoDB with Dataworkz
By leveraging the strengths of both MongoDB and Dataworkz, organizations can create powerful, scalable, and accurate generative AI applications while minimizing development complexity and time-to-market.
Dataworkz streamlines the process of creating RAG pipelines by providing a point-and-click experience to extract unstructured data in any format, configure a chunking strategy, and create vector embeddings. The Dataworkz RAG builder also lets developers choose among retrieval mechanisms (lexical, semantic, or graph), each with its own thresholds, to build the context for answering a user question.
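To make "chunking strategy" and "threshold" concrete, here is a minimal sketch of one common strategy, fixed-size character windows with overlap, plus a simple score cutoff. Dataworkz configures these through its builder, so the function names, the default sizes, and the choice of strategy here are assumptions made for illustration only.

```python
def chunk_text(text, chunk_size=200, overlap=40):
    """Split text into windows of chunk_size characters.

    Consecutive windows share `overlap` characters so that a sentence
    cut at a boundary still appears whole in at least one chunk.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks

def apply_threshold(scored_chunks, threshold):
    """Keep only chunks whose retrieval score meets the threshold."""
    return [chunk for chunk, score in scored_chunks if score >= threshold]

chunks = chunk_text("abcdefghij", chunk_size=4, overlap=2)
kept = apply_threshold([("abcd", 0.91), ("cdef", 0.42)], threshold=0.5)
```

Smaller chunks with more overlap tend to give more precise matches at the cost of more embeddings to store, which is one reason it helps to be able to experiment with these settings before committing to them.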
The integration also enables real-time data processing and analytics, which is essential for generative AI applications that require up-to-date information. This capability ensures that AI models have access to the most current data, improving the accuracy and relevance of generated responses.
Agentic RAG Architecture
To build something transformational, conventional RAG systems, which primarily utilize vector search techniques, must be integrated with up-to-date information from operational databases.
MongoDB Atlas Vector Search provides built-in support for vector embeddings, which eliminates the need for a separate vector database, simplifying the architecture and reducing complexity.
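As a rough sketch of what this looks like in practice, the helper below builds an aggregation pipeline that uses the Atlas `$vectorSearch` stage. The index name, the field names (`embedding`, `text`), and the helper itself are hypothetical; the code only constructs the pipeline, and running it would require passing it to `collection.aggregate(...)` with a driver such as PyMongo against an Atlas cluster that has a matching vector search index.

```python
def vector_search_pipeline(query_vector, index_name, path, limit=5, num_candidates=100):
    """Build an aggregation pipeline for an Atlas Vector Search query.

    index_name and path must match a vector search index defined on the
    collection; the values used below are placeholders.
    """
    if num_candidates < limit:
        raise ValueError("num_candidates must be at least as large as limit")
    return [
        {
            "$vectorSearch": {
                "index": index_name,
                "path": path,                      # field that stores the embeddings
                "queryVector": list(query_vector),
                "numCandidates": num_candidates,   # nearest neighbors considered
                "limit": limit,                    # documents returned
            }
        },
        {
            # Return the text plus the similarity score, hiding the raw vector.
            "$project": {
                "_id": 0,
                "text": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]

# Hypothetical names; the query vector would come from an embedding model.
pipeline = vector_search_pipeline([0.1, 0.2, 0.3], "policy_index", "embedding", limit=3)
```

Because the embeddings live in the same documents as the operational data, the same collection can serve both regular queries and semantic search.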
Bringing operational data into the conversation is where agentic RAG shines. With Dataworkz agents, retailers can provide governed access to MongoDB collections, which are configured as tools. In many cases, customers have an API layer that abstracts the underlying collections; Dataworkz can integrate with that layer through REST APIs or GraphQL.
Additionally, any RAG pipeline configured in Dataworkz can be made available as a tool to an agent. Depending on the user’s question, this lets agents draw on unstructured data in a SharePoint site, a Confluence wiki page, or Markdown documents.
Retailers whose applications already use MongoDB as their data platform will benefit from Dataworkz’s close integration with MongoDB and can leapfrog their AI adoption by incorporating agentic RAG into their solutions.
Dataworkz agents can access multiple data sources and use reasoning LLMs to decide which tool to use to answer a user’s question. An agent can dynamically switch between different MongoDB collections or even multiple databases to bring in structured data such as shipping status, customer profiles and preferences, and order histories. Additionally, third-party solutions like ERP (Epicor), CRM (Salesforce), and more can be integrated through APIs exposed by the providers. Internal services exposed through GraphQL are another popular type of integration. Together, these tools enable the agent to understand the user’s questions in context and provide highly personalized responses relevant to their experience and their data.
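One way to picture "a collection configured as a tool with governed access" is a read-only wrapper that filters documents and exposes only approved fields. The sketch below uses an in-memory list as a stand-in for a MongoDB collection; the `Tool` class, `collection_tool` helper, and sample orders are hypothetical and do not reflect the Dataworkz tool configuration format.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Tool:
    name: str
    description: str          # what the reasoning LLM reads when choosing a tool
    run: Callable[[dict], list]

def collection_tool(name, description, documents, allowed_fields):
    """Wrap a list of documents (standing in for a collection) as a read-only tool."""
    def run(filters):
        # Equality match on every filter key, similar in spirit to a find() filter.
        matches = [d for d in documents if all(d.get(k) == v for k, v in filters.items())]
        # Governed access: only the allowed fields ever leave the tool.
        return [{k: d[k] for k in allowed_fields if k in d} for d in matches]
    return Tool(name, description, run)

# Made-up sample data for illustration.
orders = [
    {"order_id": "A1", "customer_id": "c42", "status": "shipped", "card_last4": "1234"},
    {"order_id": "B7", "customer_id": "c99", "status": "processing", "card_last4": "9876"},
]

order_tool = collection_tool(
    "orders",
    "Look up order status for a customer",
    orders,
    allowed_fields=["order_id", "status"],
)
result = order_tool.run({"customer_id": "c42"})
```

Here the payment field never reaches the agent, which is the kind of boundary that governed access is meant to enforce.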
A Dataworkz agent works as follows:
1. The Dataworkz agent framework partitions use cases into scenarios, such as answering questions about store policies, providing recommendations, or searching for an order. Each scenario has a curated set of relevant tools. When a user interacts with the agent, a classifier selects the appropriate scenario.
2. The agent then employs a reasoning LLM to come up with a plan to answer the question. Given the user’s question, the conversation context, memory, and the tools in the scenario (such as access to a MongoDB collection), the LLM comes up with a sequence of steps to retrieve the information needed to answer the question. This process is iterative: each iteration ends by determining whether enough context has been retrieved via the tools or whether more planning and execution are required to build more context.
3. Finally, the agent uses the context constructed in step 2 to generate the response or perform an action, or it may decide that it needs to ask the user a clarifying question.
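The flow above can be sketched as a small loop. This is an illustration under heavy assumptions: the keyword-overlap classifier and the rule-based planner are stand-ins for the real classifier and reasoning LLM, the tools are stub functions returning made-up text, and none of the names correspond to the Dataworkz API.

```python
def classify_scenario(question, scenarios):
    """Pick the scenario whose keywords overlap most with the question."""
    words = {w.strip("?.,!") for w in question.lower().split()}
    return max(scenarios, key=lambda name: len(words & scenarios[name]["keywords"]))

def plan_next_step(question, context, tools):
    """Stand-in for the reasoning LLM: use each tool once, then stop."""
    used = {step["tool"] for step in context}
    for name in tools:
        if name not in used:
            return name
    return None  # enough context has been gathered

def run_agent(question, scenarios, max_iterations=5):
    # Step 1: select a scenario and its curated tools.
    scenario = classify_scenario(question, scenarios)
    tools = scenarios[scenario]["tools"]
    # Step 2: iterate, checking after each tool call whether more context is needed.
    context = []
    for _ in range(max_iterations):
        tool_name = plan_next_step(question, context, tools)
        if tool_name is None:
            break
        context.append({"tool": tool_name, "result": tools[tool_name](question)})
    # Step 3: answer from the context, or ask a clarifying question.
    if not context:
        return scenario, "Could you tell me more about what you need?"
    return scenario, " ".join(step["result"] for step in context)

# Hypothetical scenarios with stub tools and made-up responses.
scenarios = {
    "policies": {
        "keywords": {"return", "policy", "refund"},
        "tools": {"policy_rag": lambda q: "Returns are accepted within 30 days."},
    },
    "orders": {
        "keywords": {"order", "shipping", "status"},
        "tools": {"orders_collection": lambda q: "Order A1 has shipped."},
    },
}

scenario, answer = run_agent("What is the status of my order?", scenarios)
```

The `max_iterations` cap is a design choice worth keeping in any real loop of this kind, since it bounds cost and latency when the planner never decides it has enough context.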
Dataworkz brings the same ease of use to agents that it brings to building RAG applications. Instead of requiring coding with libraries or frameworks, agents are created by simply describing their intended functionality. They can be built declaratively, tested using built-in tools, and seamlessly deployed via an API when ready. This approach makes agents far more accessible and significantly more cost-effective to develop and deploy.
Figure 4. Dataworkz & MongoDB Agentic RAG architecture
Building the Solution
Building this solution can be broken down into four major steps:
Replicate the Demo Database
You can either use a publicly hosted, pre-populated cluster or provision your own cluster within your Atlas account and populate your database with the data required for the demo. If you choose to bring your own cluster, the repository includes a data dump that replicates the database, with all the necessary data and metadata, in a single mongorestore command.
Create Dataworkz RAG Application
Sign up for Dataworkz and create a RAG app for the e-commerce policies. You can use the PDF inside the repository as the e-commerce policy document. This is the unstructured data that can be made available as a tool to a Dataworkz agent.
Create the Dataworkz Agent
Connect to the MongoDB cluster provisioned in the first step. Follow this guide to learn the configuration required by the Dataworkz connector to access MongoDB. After that, create a MongoDB tool and a Dataworkz agent, and then test the agent.
Configure Your Application’s Frontend
Obtain the demo code by cloning the GitHub repository to your local machine, configure the environment variables, and install the dependencies. Finally, run the app locally at http://localhost:8080.
For complete implementation details, including code samples, configuration files, and tutorial videos, visit the GitHub repository.
Key Learnings
Understanding agentic RAG: Incorporating agents will broaden the possibilities of what can be done with the conventional RAG architecture. Adding a layer of decision-making enables agents to plan, take action and utilize their tools to ensure a higher degree of contextual understanding and operational efficiency.
Technical integration: By combining the strengths of MongoDB and Dataworkz, retailers gain the power to build stronger features, whether that means offering a more personalized experience or providing their customers with real-time smart assistance. The seamless integration speeds up development and lets teams focus on creating differentiated features that strengthen their position as market leaders.
Future of retail with AI: Applying a RAG architecture to retail use cases will enhance the customer’s shopping journey by delivering more context-aware support and personalized content.
Rapid prototyping and iteration: Building agentic RAG is an iterative process of testing and validation, and it requires quick turnaround times to move rapidly from prototyping to production. The ability to configure multiple versions of the different components of the agentic RAG pipeline, understand and measure their impact, and promote them to production in a governed fashion is a key consideration in choosing a platform.
Technologies and Products Used
MongoDB Developer Data Platform
Partner Technologies
Authors
Angie Guemes, MongoDB
Sachin Hejip, Engineering at Dataworkz