
Why Adopt a Platform-Based Approach to Generative AI for Enterprise Success?
Raghavendra Prasad • Nov 15 2024
At Oraczen, we are dedicated to delivering innovative AI solutions that help organizations harness the power of data. Our latest project involves building a highly efficient Q&A LLM (Large Language Model) agent capable of providing precise, grounded answers using a large and diverse set of internal and external documents, including URLs. The challenge lay not only in managing the vast amount of data but also in identifying the best approach to store, retrieve, and structure the knowledge for optimal performance.
Knowledge graphs have long been recognized as a powerful tool for structuring complex relationships within data. In theory, they would allow our agent to quickly and accurately retrieve answers based on predefined relationships.
However, several issues became apparent early on:
No Defined Ontology: Without a clear ontology or access to domain experts, building a meaningful graph from scratch was both difficult and time-consuming.
Cost of Building: Using LLMs to identify relationships between entities across such a large dataset would have been expensive and required many iterations to get right.
Wasted Effort: Even if we successfully constructed a graph, large portions of it might remain unused if users never asked questions that required knowledge from those areas. Given the unpredictable nature of user queries and the lack of clear guidance on how to structure the data, the upfront investment in a comprehensive knowledge graph wasn't justified.
In light of these challenges, it's worth noting that about 45% of firms are currently testing or implementing generative AI technologies, highlighting the increasing reliance on AI solutions across industries.
On the other hand, vector databases provided a more immediate solution. We could easily embed documents and URLs as vectors and experiment with different chunking and embedding strategies. This allowed us to start retrieving relevant data quickly and adapt our methods based on user interactions. However, while constructing the vector database was straightforward, retrieval turned out to be more complex, calling for a slightly more elaborate agentic RAG strategy (a minimal sketch of the embed-and-retrieve loop follows the two caveats below).
Less Direct Retrieval: Unlike a knowledge graph, vector embeddings don’t inherently capture relationships between data points. So, retrieval often required more elaborate processes to extract the right data as context for the LLM agent.
Efficiency vs. Precision: While a vector database could pull relevant documents, we found that it sometimes missed nuanced connections that would make an answer feel more accurate or well-grounded.
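To make the starting point concrete, here is a minimal sketch of the embed-and-retrieve loop. It assumes sentence-transformers for embeddings and FAISS for the index; both are stand-ins, and the actual chunking and embedding choices were tuned through experimentation rather than fixed like this:

```python
# pip install sentence-transformers faiss-cpu
import faiss
from sentence_transformers import SentenceTransformer

def chunk(text, size=500, overlap=100):
    # Naive fixed-size chunking; real pipelines often split on headings or sentences.
    return [text[i:i + size] for i in range(0, len(text), size - overlap)]

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder embedding model

docs = ["... internal document text ...", "... fetched URL content ..."]
chunks = [c for d in docs for c in chunk(d)]

# Normalized embeddings + inner-product index = cosine similarity search.
embeddings = model.encode(chunks, normalize_embeddings=True)
index = faiss.IndexFlatIP(embeddings.shape[1])
index.add(embeddings)

def retrieve(query, top_k=5):
    q = model.encode([query], normalize_embeddings=True)
    scores, ids = index.search(q, top_k)
    return [(chunks[i], float(s)) for i, s in zip(ids[0], scores[0])]
```

The appeal is that every step here is swappable: chunk sizes, embedding models, and index types can all be changed without touching the source documents, which is what made rapid iteration possible.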
A Hybrid Approach: Best of Both Worlds
To balance the strengths of both methods, we developed a hybrid approach that allowed us to start quickly with vector embeddings while gradually introducing the benefits of a knowledge graph.
Start with Vector Embeddings: We initially built a vector database to store embeddings of all documents and URLs. This allowed us to quickly start retrieving relevant content for user queries by comparing vector similarity.
Dynamic Graph Construction: Instead of building a static knowledge graph upfront, we created a system that constructs a graph dynamically based on user interactions. When a query is processed, the top_k documents from the vector search are analyzed, and a knowledge graph is built around those documents (a sketch follows this list). This targeted graph construction allowed us to dive deep into specific areas of the data without committing to a full graph upfront.
Ontology Discovery Through Interaction: User queries and the generated responses themselves became a source of insight into the ontology of the data. As users repeated questions or explored certain areas more deeply, the system learned to optimize the relationships and structure of the graph.
Cost Efficiency: By dynamically building only the portion of the graph relevant to the current query, we minimized computational costs and reduced the overhead of maintaining a fully connected graph across all documents. This approach ensured that we didn’t waste resources on parts of the graph that might never be used.
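The sketch below illustrates the dynamic construction step, building on the retrieve() helper from the earlier sketch. Here extract_triples is a hypothetical stand-in for whatever relation extractor is used, typically an LLM prompted to emit (subject, relation, object) tuples from a chunk:

```python
# pip install networkx
import networkx as nx

def extract_triples(text):
    # Hypothetical stand-in: in practice, an LLM call prompted to return
    # (subject, relation, object) tuples found in `text`.
    return []

def build_query_graph(query, top_k=5):
    # Build a small graph scoped to one query, using only the chunks the
    # vector search already surfaced (retrieve() above), not the whole corpus.
    graph = nx.MultiDiGraph()
    for chunk_text, score in retrieve(query, top_k):
        for subj, rel, obj in extract_triples(chunk_text):
            graph.add_edge(subj, obj, key=rel, relation=rel, score=score)
    return graph
```

Because the graph is rebuilt only around the top_k hits, the cost of relation extraction scales with query volume rather than corpus size, which is the heart of the cost-efficiency argument above.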
The global AI market is projected to reach $146.1 billion by 2024, indicating significant growth opportunities for organizations leveraging AI technologies.
Our approach also has the benefit of evolving over time. Each user interaction acts as a trigger for refining and expanding the knowledge graph. Repeated questions signal opportunities to optimize relationships between entities, ensuring that the system becomes smarter and more efficient with each interaction. Over time, as users interact more with the system, dynamic graph construction results in faster retrieval of more accurate answers. The vector database continues to provide flexibility for handling novel or unexpected queries while enhancing precision through an evolving knowledge graph.
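One way to picture that refinement loop, again as a sketch rather than the production design: fold each query's graph into a persistent one, so that relationships seen in repeated questions accumulate weight and flag the areas worth optimizing.

```python
import networkx as nx

persistent_graph = nx.MultiDiGraph()  # grows across user interactions

def merge_interaction(query_graph):
    # Fold one query's graph (from build_query_graph) into the long-lived
    # graph; edges seen in repeated questions gain weight, signaling
    # relationships worth optimizing.
    for subj, obj, data in query_graph.edges(data=True):
        rel = data["relation"]
        if persistent_graph.has_edge(subj, obj, key=rel):
            persistent_graph[subj][obj][rel]["weight"] += 1
        else:
            persistent_graph.add_edge(subj, obj, key=rel, relation=rel, weight=1)
```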
By combining the strengths of vector databases with the dynamic construction of knowledge graphs, Oraczen successfully built a scalable, cost-efficient Q&A LLM agent capable of delivering accurate, grounded answers. This hybrid approach enables us to meet evolving user needs while optimizing both cost and performance. As generative AI adoption continues to rise, with nearly 40% of U.S. adults engaging with generative AI technologies, we at Oraczen remain committed to pushing the boundaries of AI-driven solutions that adapt and improve over time.