Searchers express their intent through queries, but queries themselves are brittle representations.
When queries differ significantly in wording but still map to the same intent, recognizing their equivalence may require modeling each query as a bag of relevant documents.
We can implement this approach by first representing the documents as vectors and then aggregating these vectors, e.g., by taking their mean. This or a similar process allows us to represent a query as a vector by way of a set of associated documents. We call this the “bag-of-documents” model.
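As a minimal sketch of the aggregation step, the snippet below averages toy document vectors to obtain one vector per query and compares queries by cosine similarity. The embeddings, document IDs, and relevance judgments are hypothetical placeholders; in practice the vectors would come from an embedding model and the relevant documents from engagement data or human judgments.

```python
from math import sqrt

def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    if not vectors:
        raise ValueError("need at least one document vector")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 3-dimensional document embeddings (illustration only).
doc_vectors = {
    "levis_501": [0.9, 0.1, 0.0],
    "levis_511": [0.8, 0.2, 0.0],
    "wrangler_jeans": [0.7, 0.3, 0.1],
    "running_shoes": [0.0, 0.1, 0.9],
}

# Hypothetical relevance judgments: query -> relevant document IDs.
relevant_docs = {
    "mens jeans": ["levis_501", "levis_511", "wrangler_jeans"],
    "denim pants for men": ["levis_511", "wrangler_jeans"],
    "running shoes": ["running_shoes"],
}

def query_vector(query):
    """Bag-of-documents vector: mean of the query's relevant document vectors."""
    return mean_vector([doc_vectors[d] for d in relevant_docs[query]])

jeans = query_vector("mens jeans")
denim = query_vector("denim pants for men")
shoes = query_vector("running shoes")

# Differently worded queries with overlapping relevant documents land close together.
same_intent = cosine(jeans, denim)
# Queries with disjoint relevant documents land far apart.
different_intent = cosine(jeans, shoes)
```

The mean is only one possible aggregation; a weighted mean (for example, weighting documents by engagement) would fit the same interface.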
Implementation details aside, the bag-of-documents model is a practical, intuitive way to represent queries as search intents and to combat the negative effects of asymmetric semantic search. Ref. [[Symmetric & Asymmetric Semantic Search]].
If we can map a query to its relevant documents, then we can also map a document to the queries for which it is relevant. All we have done is invert the direction of the mapping. In effect, we are representing the document as the collection of query intents that it can satisfy. For example, a pair of Levi’s men’s jeans is relevant to queries that include “jeans”, “mens jeans”, “levis”, etc.
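The inversion can be sketched in a few lines. The query-to-document table and the document IDs below are hypothetical and exist only to illustrate the direction swap.

```python
from collections import defaultdict

def invert(query_to_docs):
    """Turn a query -> documents mapping into a document -> queries mapping."""
    doc_to_queries = defaultdict(set)
    for query, docs in query_to_docs.items():
        for doc in docs:
            doc_to_queries[doc].add(query)
    return dict(doc_to_queries)

# Hypothetical relevance data: each query lists the documents relevant to it.
query_to_docs = {
    "jeans": ["levis_mens_jeans", "wrangler_womens_jeans"],
    "mens jeans": ["levis_mens_jeans"],
    "levis": ["levis_mens_jeans", "levis_jacket"],
}

doc_to_queries = invert(query_to_docs)
print(sorted(doc_to_queries["levis_mens_jeans"]))
# ['jeans', 'levis', 'mens jeans']
```

Each document's set of queries could then be embedded and aggregated in the same way as a query's bag of documents, giving a "bag-of-queries" vector for the document.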