For ranking, cosine similarity is an important factor, but it usually needs to be combined with others, particularly query-independent factors that reflect [desirability](https://dtunkelang.medium.com/precision-recall-and-desirability-0384a11d916b). At best, cosine similarity can serve as the only query-dependent factor, summarizing relevance in a single number. Even then, it is difficult to establish an absolute cosine similarity threshold that guarantees relevance.

In general, we need to take cosine similarity with a grain of salt when we use it to measure relevance. A large difference in cosine similarity usually indicates a meaningful gap in relevance, but a small one may simply be noise, reflecting the inherent limits of our vector representation. There can also be systematic bias from misaligned embeddings (e.g., cosine similarity favoring shorter document strings).

Moreover, it is not obvious how best to combine cosine similarity with other ranking factors. Introducing it into a hand-tuned model with linear weights is probably a bad idea, since the behavior of cosine similarity is hardly linear (e.g., a similarity of 0.5 is not half as good as a similarity of 1.0). It is better to use a [[Learning to Rank (LTR)]] approach that handles [nonlinearity](https://en.wikipedia.org/wiki/Nonlinear_system) well, such as the tree-based [XGBoost](https://en.wikipedia.org/wiki/XGBoost).
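As a minimal sketch of this idea, the snippet below feeds cosine similarity alongside a query-independent desirability feature into XGBoost's `XGBRanker`. The toy data, feature set, and hyperparameters are illustrative assumptions, not a recommendation; in practice the embeddings, desirability signal, and relevance labels would come from your own pipeline.

```python
import numpy as np
import xgboost as xgb


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


rng = np.random.default_rng(42)

# Toy setup: 3 queries, 5 candidate documents each, 64-dim embeddings.
# Real embeddings and labels would come from your own models and judgments.
n_queries, docs_per_query, dim = 3, 5, 64

rows, labels, group_sizes = [], [], []
for _ in range(n_queries):
    q_vec = rng.normal(size=dim)
    for _ in range(docs_per_query):
        d_vec = rng.normal(size=dim)
        # Feature 1: the query-dependent cosine similarity.
        sim = cosine_similarity(q_vec, d_vec)
        # Feature 2: a query-independent desirability signal
        # (e.g., popularity or quality); a random stand-in here.
        desirability = rng.uniform()
        rows.append([sim, desirability])
        # Graded relevance label (0-2); also a random stand-in.
        labels.append(rng.integers(0, 3))
    group_sizes.append(docs_per_query)

X = np.array(rows)
y = np.array(labels)

# Tree-based LTR model: trees split on raw feature values, so they can
# learn nonlinear responses to cosine similarity (e.g., "above 0.8
# matters a lot, below 0.5 barely matters") without manual calibration.
ranker = xgb.XGBRanker(objective="rank:pairwise", n_estimators=50, max_depth=3)
ranker.fit(X, y, group=group_sizes)

# Score the candidates for one query: higher scores rank first.
scores = ranker.predict(X[:docs_per_query])
print(scores)
```

Note that the trained model never sees cosine similarity as a linear weight; each tree is free to carve the similarity range into whatever intervals best predict the relevance labels, which sidesteps the calibration problem described above.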