As the saying goes, “Out of sight, out of mind”. This writer introduces citizen programmers to “retrieval-augmented generation” and vector search.
According to a Forbes article, nearly 90% of users in a survey will not return to a site if they have a bad experience there. While site reliability engineers are traditionally focused on the “five 9s” of ensuring a website remains up and accessible 99.999% of the time, that uptime aspect is apparently only a part of a positive user experience.
What else can cause a user to click away from a site and never return? Answer: not being able to discover what they were looking for.
Searching for something and being unable to find it quickly and efficiently may be one of a web user’s most disappointing experiences. You want to build a site where that rarely happens. However, users make it very hard. Oftentimes, they do not know exactly what they are looking for. They have a picture in their mind of what they want but lack the precise terms, so they submit searches such as: “the thing that tightens screws.”
A human who hears such a request immediately thinks of a screwdriver. A keyword-based search engine, by contrast, may respond by returning:
- Articles about tightening techniques
- Blog posts on different types of screws
- Tools that have nothing to do with screwdrivers
This happens countless times every day. However, a website’s search engine may not be intuitive enough to compensate when a user lacks the precise terms, and the user experience suffers as a result.
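To make the failure concrete, here is a minimal sketch of exact-word matching. The catalogue titles and the `keyword_search` helper are invented for illustration; a production keyword engine would add stemming, stop-word handling, and ranking, but the core limitation is the same.

```python
# Toy catalogue; the titles are invented for illustration.
CATALOGUE = [
    "Phillips screwdriver set",
    "Guide to wood screws and bolts",
    "How a torque wrench tightens bolts",
    "Garden hose reel",
]


def tokenize(text):
    """Lowercase the text and split it into a set of words."""
    return set(text.lower().split())


def keyword_search(query, documents):
    """Return documents that share at least one exact word with the query,
    ordered by the number of shared words (most first)."""
    query_words = tokenize(query)
    scored = []
    for doc in documents:
        overlap = len(query_words & tokenize(doc))
        if overlap > 0:
            scored.append((overlap, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored]


results = keyword_search("the thing that tightens screws", CATALOGUE)
print(results)
```

The screwdriver set never appears in the results, because the word “screwdriver” is not an exact match for any word in the query, while the article about screws and the one about wrenches both are returned.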
Now, let us consider another approach to search that offers possibilities not feasible with traditional keyword searching of databases alone.
Vectorizing search with machine learning
In search engine programming, one approach to enhance the accuracy and relevance of search results is a machine learning method that transforms textual data into high-dimensional vectors. This approach can capture the semantic relationships between words and phrases.
Called “vector search”, the approach differs from traditional keyword-based search (which relies on exact matches) in that it can “understand” the context and meaning behind queries. Vector search delivers nearest-neighbor results, without requiring a direct match. It is commonly used as the retrieval step in “retrieval-augmented generation (RAG)”: text, images, audio, and video are converted into mathematical representations, and semantic searching over those representations can help overcome some of the challenges encountered with generative AI large language models.
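The nearest-neighbor idea can be sketched in a few lines. The three-dimensional vectors below are written by hand purely for illustration (real embeddings typically have hundreds of dimensions and come from a trained model), and cosine similarity is only one common choice of similarity measure.

```python
import math

# Hand-written vectors, invented for illustration only. The dimensions
# loosely mean: (is a tool, relates to fastening, relates to writing).
ITEM_VECTORS = {
    "screwdriver": [0.9, 0.8, 0.0],
    "hammer": [0.9, 0.3, 0.0],
    "ballpoint pen": [0.2, 0.0, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_neighbor(query_vector, vectors):
    """Return the name of the stored item most similar to the query."""
    return max(
        vectors,
        key=lambda name: cosine_similarity(query_vector, vectors[name]),
    )


# Assumed vector for the query "the thing that tightens screws".
query_vector = [0.7, 0.9, 0.0]
best_match = nearest_neighbor(query_vector, ITEM_VECTORS)
print(best_match)
```

No word in the query has to match the word “screwdriver”; the result is chosen only because its vector points in nearly the same direction as the query’s vector.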
A simple vector search example
The process of converting textual data into numerical representations (to capture the meaning of words and phrases) is called “embedding”. This allows models to measure similarities between terms based on their usage and context in large datasets.
Embedding can lead to more nuanced and context-aware search functionalities, potentially advancing information retrieval and artificial intelligence. For example, a dataset containing the string “Your text string goes here” can be converted into vectors by assigning numerical values to each word, allowing a better machine understanding of relationships and similarities.
Each vector represents the semantic meaning of a keyword, allowing the search functionality to “understand” and retrieve relevant information based on context rather than just exact keyword matches.
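As a rough illustration of how usage and context can turn words into numbers, the sketch below builds word vectors from co-occurrence counts in a tiny invented corpus. This is a deliberately simplified stand-in, not how modern embedding models work internally; the corpus, the stop-word list, and the helper names are all assumptions made for the example.

```python
import math
from collections import Counter

# Tiny invented corpus; real models learn from far larger datasets.
CORPUS = [
    "use a screwdriver to tighten the screw",
    "use a wrench to tighten the bolt",
    "use a pen to write the letter",
    "use a pencil to write the note",
]
# Words ignored because they appear in every sentence.
STOPWORDS = {"use", "a", "to", "the"}


def build_embeddings(sentences):
    """Map each word to a vector counting the words it shares a sentence with."""
    tokenized = [[w for w in s.split() if w not in STOPWORDS] for s in sentences]
    vocabulary = sorted({w for words in tokenized for w in words})
    counts = {w: Counter() for w in vocabulary}
    for words in tokenized:
        for word in words:
            for other in words:
                if other != word:
                    counts[word][other] += 1
    # One dimension per vocabulary word, in a fixed (sorted) order.
    return {w: [counts[w][v] for v in vocabulary] for w in vocabulary}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


embeddings = build_embeddings(CORPUS)
tool_similarity = cosine_similarity(embeddings["screwdriver"], embeddings["wrench"])
pen_similarity = cosine_similarity(embeddings["screwdriver"], embeddings["pen"])
print(tool_similarity, pen_similarity)
```

Because “screwdriver” and “wrench” both appear alongside “tighten”, their vectors overlap, while “screwdriver” and “pen” share no context words and score zero.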
To start, the search engine converts the user’s query into a vector representation and compares it against the vectors already stored for the dataset. If the vector search finds that the query’s context and semantics are similar to “Your text string goes here”, the engine returns that entry as the most relevant result, based on the similarity of the vectors.
This process transforms uncertain and unclear user queries into a search with more certainty and clarity.
Storing and retrieving vector embeddings
Vector search is a crucial tool for websites that require quick and cost-effective storage and retrieval of vector embeddings.
As a site’s data grows, so does the number of vector embeddings, so any solution must be highly scalable. A generic database may not be well suited to vector search, because a specialized type of database is needed to handle high-dimensional embeddings efficiently, support rapid similarity searches, and optimize storage for large volumes of vectors.
A database designed specifically for vector search helps the search system remain performant and responsive, providing relevant results in real time even as data scales. Such a “vector database” is expected to offer advanced indexing capabilities, support multiple data types, and integrate with popular AI frameworks and embedding generation tools.
Additionally, it should provide a quality search experience in offline environments, an approach often described as computing “on the edge”.
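The storage and retrieval responsibilities described above can be sketched with a hypothetical `TinyVectorStore` class. It keeps vectors in memory, persists them as JSON, and compares the query against every stored vector. That brute-force scan is an assumption made for brevity; a real vector database would typically rely on specialized indexes to avoid it as the data grows.

```python
import json
import math
import os
import tempfile


def _unit(vector):
    """Scale a vector to length 1 so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("zero vectors cannot be normalized")
    return [x / norm for x in vector]


class TinyVectorStore:
    """Hypothetical store: brute-force search with JSON persistence."""

    def __init__(self):
        self._vectors = {}  # item id -> unit-length vector

    def add(self, item_id, vector):
        self._vectors[item_id] = _unit(vector)

    def search(self, query, k=3):
        """Return the k most similar (item id, score) pairs, best first."""
        unit_query = _unit(query)
        scored = [
            (sum(q * v for q, v in zip(unit_query, vec)), item_id)
            for item_id, vec in self._vectors.items()
        ]
        scored.sort(reverse=True)
        return [(item_id, score) for score, item_id in scored[:k]]

    def save(self, path):
        with open(path, "w", encoding="utf-8") as handle:
            json.dump(self._vectors, handle)

    @classmethod
    def load(cls, path):
        store = cls()
        with open(path, encoding="utf-8") as handle:
            store._vectors = json.load(handle)
        return store


store = TinyVectorStore()
store.add("screwdriver", [0.9, 0.8, 0.0])  # invented example vectors
store.add("hammer", [0.9, 0.3, 0.0])
store.add("ballpoint pen", [0.2, 0.0, 0.9])

# Persist to a temporary file, reload, and query the reloaded copy.
path = os.path.join(tempfile.mkdtemp(), "vectors.json")
store.save(path)
reloaded = TinyVectorStore.load(path)
top = reloaded.search([0.7, 0.9, 0.0], k=2)
print(top)
```

Normalizing vectors once at insert time keeps each query to a single dot product per stored item, which is a small-scale version of the storage optimizations a dedicated vector database performs.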