Retrieval-Augmented Generation (RAG) has revolutionized open-domain question answering, enabling systems to produce human-like responses to a wide array of queries. At the heart of RAG lies a retrieval module that scans a vast corpus to find relevant context passages, which are then processed by a neural generative module (often a pre-trained language model like GPT-3) to formulate a final answer.
While this approach has been highly effective, it’s not without its limitations.
One of the most critical components, the vector search over embedded passages, has inherent constraints that can hamper the system’s ability to reason in a nuanced manner. This is particularly evident when questions require complex multi-hop reasoning across multiple documents.
Vector search refers to searching for information using vector representations of data. It involves two key steps:
1. Encoding data into vectors
First, the data being searched is encoded into numeric vector representations. For text data like passages or documents, this is done using embedding models like BERT or RoBERTa. These models convert text into dense vectors of continuous numbers that represent the semantic meaning. Images, audio, and other formats can also be encoded into vectors using appropriate deep learning models.
2. Searching using vector similarity
Once data is encoded into vectors, searching involves finding the vectors most similar to the vector representation of the search query. This relies on a similarity measure such as cosine similarity (or a distance metric such as Euclidean distance) to quantify how close two vectors are and to rank results. The vectors with the highest similarity (smallest distance) are returned as the most relevant search hits.
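To make the two steps concrete, here is a minimal sketch in plain Python. It rests on loud assumptions: the three-dimensional vectors are hand-written stand-ins for what an embedding model would produce (real embeddings typically have hundreds of dimensions), and the passages, the query vector, and the `search` helper are hypothetical names invented for illustration. Only the cosine similarity formula itself is standard.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Step 1 (encoding): toy vectors standing in for embedding-model output.
# These numbers are made up for illustration, not produced by a real model.
passages = {
    "Paris is the capital of France.": [0.9, 0.1, 0.0],
    "The Eiffel Tower is in Paris.": [0.7, 0.3, 0.1],
    "Photosynthesis occurs in plants.": [0.0, 0.1, 0.9],
}

def search(query_vector, index, top_k=2):
    """Step 2 (searching): score every passage, return the top_k by similarity."""
    scored = [(cosine_similarity(query_vector, vec), text)
              for text, vec in index.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]

# Hypothetical embedding of a query such as "What is France's capital?"
query_vector = [0.8, 0.2, 0.0]

results = search(query_vector, passages)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

This sketch compares the query against every stored vector, which is fine for a handful of passages. Systems searching large corpora usually swap this exhaustive loop for an approximate nearest-neighbor index, trading a little accuracy for speed.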
The key advantage of vector search is the ability to search for semantic similarity, not just literal keyword matches. The vector representations capture conceptual meaning, allowing more relevant yet linguistically distinct results to be identified. This enables a higher quality of search compared to traditional keyword matching.
However, transforming data into vectors and searching in high-dimensional semantic space also comes with limitations. Balancing the tradeoffs of vector search is an active area of research.
In this article, we’ll dissect the limitations of vector search, exploring why it struggles to…