The Okapi BM25 ranking formula is used to score the relevance of a document to a given search query. It is commonly employed in information retrieval systems. It is defined as:
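In its commonly cited form, the score of a document $D$ for a query $Q$ with terms $q_1, \dots, q_n$ is written as follows. The smoothed IDF variant shown is one of several in use and is an assumption here, as are the typical parameter values $k_1 \in [1.2, 2.0]$ and $b = 0.75$.

$$
\text{score}(D, Q) = \sum_{i=1}^{n} \text{IDF}(q_i) \cdot \frac{f(q_i, D)\,(k_1 + 1)}{f(q_i, D) + k_1 \left(1 - b + b \cdot \frac{|D|}{\text{avgdl}}\right)}
$$

$$
\text{IDF}(q_i) = \ln\left(\frac{N - n(q_i) + 0.5}{n(q_i) + 0.5} + 1\right)
$$

Here $f(q_i, D)$ is the number of times $q_i$ occurs in $D$, $|D|$ is the length of $D$ in words, $\text{avgdl}$ is the average document length in the collection, $N$ is the number of documents, and $n(q_i)$ is the number of documents that contain $q_i$.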
This formula calculates a score representing the relevance of the document to the query, taking into account the frequency of query terms in the document, the document's length, and how common those terms are across the document collection. The script prints the BM25 score of each document in the provided collection for the given query, "quick brown fox." For example:
Document 1: "The quick brown fox jumps over the lazy dog."
BM25 Score: 1.3076713878207966
Document 2: "A quick brown dog outpaces a lazy fox."
BM25 Score: 0.9219687396718056
Document 3: "The lazy dog sleeps all day."
BM25 Score: 0.0
As you can see, Document 1, which contains all the terms from the query ("quick," "brown," and "fox"), receives the highest score. Document 2 scores lower, and Document 3 has a BM25 score of 0.0 because it doesn't contain any of the terms from the query.
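The calculation can be sketched in Python as below. This is a minimal illustration rather than the script that produced the scores above: the `tokenize` helper, the smoothed IDF variant, and the defaults `k1=1.5` and `b=0.75` are all assumptions, so its numbers need not match those listed.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphanumeric runs (an assumed, simplistic tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query, documents, k1=1.5, b=0.75):
    """Return one BM25 score per document for the given query.

    k1 and b are assumed defaults. The IDF uses the "+1" smoothed variant,
    which keeps every term weight non-negative.
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    avgdl = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            tf = term_freq[term]
            if tf == 0:
                continue  # terms absent from the document contribute nothing
            df = doc_freq[term]
            idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1)
            length_norm = 1 - b + b * len(tokens) / avgdl
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        scores.append(score)
    return scores


documents = [
    "The quick brown fox jumps over the lazy dog.",
    "A quick brown dog outpaces a lazy fox.",
    "The lazy dog sleeps all day.",
]
scores = bm25_scores("quick brown fox", documents)
for number, (doc, score) in enumerate(zip(documents, scores), start=1):
    print(f'Document {number}: "{doc}"')
    print(f"BM25 Score: {score}")
```

Tokenization is one reason scores differ between implementations. This sketch strips punctuation, so "fox." at the end of Document 2 matches the query term "fox", whereas a tokenizer that splits only on whitespace would not match it.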