Relevant As-You-Type Suggestions
Add fast and relevant as-you-type suggestions to your application, incorporating user context and domain-specific ranking factors.
Use cases: Content Management, Catalog
Industries: Retail
Products: Atlas, Atlas Vector Search
Solution Overview
The quicker a user can navigate to desired, relevant content, the sooner they can leverage that knowledge, buy that product, help that customer, and make those critical decisions. Picture getting to “The Matrix” by only typing “matr,” or finding replacement air filters for the ones you bought a few months ago by typing only “fi” in the search box, and your previous order is the top suggestion. These are examples of as-you-type suggestions. Vector Search and full-text search are great at matching content semantically and fuzzily, when there is a complete query or very close word matches. But as-you-type functionality can return relevant results with even fewer characters and where there's even more distance between the inputted text and the target keyword. Only a lexical-based solution that facilitates partial matching like as-you-type suggestions can provide this level of relevant, context-sensitive results.
As-you-type functionality — also known as autocomplete, autosuggest, typeahead, search-ahead, and predictive search — often refers to low-level character matching, as opposed to a purpose-built, comprehensive solution. As demonstrated here, we use “as-you-type suggestions” to refer to a complete solution encompassing tunable relevancy, filtering, and highlighting.
With this solution, you’ll be able to add fast and relevant as-you-type suggestions to your application, incorporating user context and domain-specific ranking factors.
Reference Architectures
This as-you-type suggestion solution is architecturally straightforward
— as a user types, requests are sent to Atlas Search, which returns
relevant results. The heart of the architecture is a specialized
entities
collection and the corresponding queries.

Figure 1. As-you-type solution architecture
Data Model Approach
Each suggestion presented to the user represents a unique entity of your domain. A requirement of this solution is that entities must be modeled as individual documents in a specialized collection tuned for as-you-type suggestibility.
It's often the case that your main collection represents one type of entity as documents, and other domain entities as metadata fields or embedded documents. For example, let’s take the sample movies data available within Atlas: as a user types, movie titles certainly should be suggested. But what about cast member names? Can I find movies starring Keanu Reeves by typing only "kea"? What about documentaries by only typing "doc"?
It’s a simple model with the following basic schema:
_id:
unique id for this collection in the form<type>-<natural id>
.type:
entity/object type, e.g. movie, brand, person product, and category.name:
the name or title of the entity, which would generally be unique per type.
It’s important that entity documents have stable, unique identifiers, as
the entities will be regularly refreshed from the main collection.
Assigning a type
to each entity allows for filtering (only suggest cast
members, say, in an actor-specific lookup), grouping (organize the
suggestions by type), or boosting by type (movie titles could have a
higher weighting than cast member names).
Modeling entities directly as individual documents allows each to carry optional metadata fields to assist in ranking, displaying, filtering, or grouping them.
At the heart of this solution, the straightforward document model feeds
the name
field through a sophisticated index configuration, which slices
and dices the values in a multitude of ways suitable for querying in
several ways. The power of this solution comes from the synergy of
multiple indexing and querying strategies.
{ "_id":"title-The Matrix", "name":"The Matrix", "type":"title" }
Building the Solution
First, identify the entities in your data that are to be suggestible. In the movies scenario, these would include movie titles, cast member names, and perhaps genres and director names too.
The basis of this as-you-type suggestion system can be achieved in a few steps:
Create an “entities” collection and populate it using the schema modeled above. As often as warranted, refresh the “entities” collection.
Create an Atlas Search “entities_index” using an index configuration as described below.
Craft a robust set of query clauses, along with any pertinent boosting factors, within a $search-using aggregation pipeline.
Importing entities
While there are multiple ways to populate the “entities” collection, one straightforward way to populate it is with a short and sweet aggregation pipeline run on the main collection to bring in the unique titles across all movies:
[ { $group: { _id: "$title", }, }, { $project: { _id: {$concat: [ "title", "-", "$_id" ]}, type: "title", name: "$_id" } }, { $merge: { into: "entities" } } ]
The $project
converts each unique movie title into the necessary
“entities” schema. Because this collection types each document, the type
is encoded as a prefix of the generated _id
and appended with the actual
movie title creating a reproducible identifier for each unique title.
Including type
in the entity identifiers allows different types of
entities with the same name to be independent from one another (there
could be a movie named “Adventure” as well as the “Adventure” genre).
And finally, the handy $merge
stage adds all new titles and leaves the
existing ones untouched.
The resulting title-typed document for “The Matrix” comes out simply as:
{ "_id":"title-The Matrix", "name":"The Matrix", "type":"title" }
Each entity type potentially needs its own technique for merging into
the “entities” collection, as in the case of the "genre" and "cast"
entities, which need to be unwound from their nested arrays using
$unwind
.
This cast-specific entities import brings in “Keanu Reeves” as:
{ "_id":"cast-Keanu Reeves", "name":"Keanu Reeves", "type":"cast", "weight": 6.637 }
Indexing entities
The name
field is indexed in a multitude of ways, which will
facilitate partial matching and ranking at query time.

Figure 2. Multiple indexing strategies
Atlas Search index configuration enables a single document field to be indexed in a multitude ("multi") of ways (the feature is called “multi”-analyzers).
The type
field is indexed as both a token
field, for equals
or in
filtering, and a stringFacet
field to provide a means to get counts
across the results of each entity type.
Any other fields added beyond _id
, type
, and name
are handled by the
index definition, either through dynamic mapping or the static
definitions you provide. In this example, weight
is custom and handled
dynamically as a numeric type.
Searching for suggestions
The resulting specialized search index provides the foundation for
as-you-type queries. The name
field is indexed in a number of ways and
matched against users typing with various tunable query operators. The
idea is to throw the query operators against these differently analyzed
mappings and see what sticks — the more ways they stick, the higher the
suggestion is ranked. Each of the query clauses can be independently
boosted and summed giving a relevancy score for the matching entity.
These scores could be further boosted by other factors such as an
optional entity weight
field.

Figure 3. Example query and relevancy scoring computation
Generally, the behavior of a user selecting a suggestion is to then perform a targeted traditional search for the selected item, which would in turn surface all matching items.
[ { $group: { _id: "$title", }, }, { $project: { _id: {$concat: [ "title", "-", "$_id" ]}, type: "title", name: "$_id" } }, { $merge: { into: "entities" } } ]
Visit the GitHub repo to this solution.
Key Considerations
Model suggestible entities as documents with a specialized index configuration: This could be done as described above in a separate collection containing entities from any source. Or, if your main collection models all suggestible entities as top-level documents already, an index can be created or an existing one augmented to use the index configuration techniques described here.
Craft clever queries: Leverage the index structure, generating rich and nuanced queries to match entities and rank suggestions as desired.
Technology and Products
Author
Erik Hatcher, MongoDB