|
xapian-core
1.5.1
|
Xapian::Weight subclass implementing the traditional probabilistic formula.
#include <weight.h>
Public Member Functions | |
| TradWeight (double k=1.0) | |
| Construct a TradWeight. | |
| Public Member Functions inherited from Xapian::BM25Weight | |
| BM25Weight (double k1, double k2, double k3, double b, double min_normlen) | |
| Construct a BM25Weight. | |
| std::string | name () const |
| Return the name of this weighting scheme. | |
| std::string | serialise () const |
| Return this object's parameters serialised as a single string. | |
| BM25Weight * | unserialise (const std::string &serialised) const |
| Unserialise parameters. | |
| double | get_sumpart (Xapian::termcount wdf, Xapian::termcount doclen, Xapian::termcount uniqterm, Xapian::termcount wdfdocmax) const |
| Calculate the weight contribution for this object's term to a document. | |
| double | get_maxpart () const |
| Return an upper bound on what get_sumpart() can return for any document. | |
| double | get_sumextra (Xapian::termcount doclen, Xapian::termcount uniqterms, Xapian::termcount wdfdocmax) const |
| Calculate the term-independent weight component for a document. | |
| double | get_maxextra () const |
| Return an upper bound on what get_sumextra() can return for any document. | |
| BM25Weight * | create_from_parameters (const char *params) const |
| Create from a human-readable parameter string. | |
| Public Member Functions inherited from Xapian::Weight | |
| Weight () | |
| Default constructor, needed by subclass constructors. | |
| virtual | ~Weight () |
| Virtual destructor, because we have virtual methods. | |
Additional Inherited Members | |
| Static Public Member Functions inherited from Xapian::Weight | |
| static const Weight * | create (const std::string &scheme, const Registry &reg=Registry()) |
| Return the appropriate weighting scheme object. | |
| Protected Types inherited from Xapian::Weight | |
| enum | stat_flags { COLLECTION_SIZE = 0 , RSET_SIZE = 0 , AVERAGE_LENGTH = 4 , TERMFREQ = 1 , RELTERMFREQ = 1 , QUERY_LENGTH = 0 , WQF = 0 , WDF = 2 , DOC_LENGTH = 8 , DOC_LENGTH_MIN = 16 , DOC_LENGTH_MAX = 32 , WDF_MAX = 64 , COLLECTION_FREQ = 1 , UNIQUE_TERMS = 128 , TOTAL_LENGTH = 256 , WDF_DOC_MAX = 512 , UNIQUE_TERMS_MIN = 1024 , UNIQUE_TERMS_MAX = 2048 , DB_DOC_LENGTH_MIN = 4096 , DB_DOC_LENGTH_MAX = 8192 , DB_UNIQUE_TERMS_MIN = 16384 , DB_UNIQUE_TERMS_MAX = 32768 , DB_WDF_MAX = 65536 , IS_BOOLWEIGHT_ = static_cast<int>(0x80000000) } |
| Stats which the weighting scheme can use (see need_stat()). | |
| Protected Member Functions inherited from Xapian::Weight | |
| void | need_stat (stat_flags flag) |
| Tell Xapian that your subclass will want a particular statistic. | |
| Weight (const Weight &) | |
| Don't allow copying. | |
| Xapian::doccount | get_collection_size () const |
| The number of documents in the collection. | |
| Xapian::doccount | get_rset_size () const |
| The number of documents marked as relevant. | |
| Xapian::doclength | get_average_length () const |
| The average length of a document in the collection. | |
| Xapian::doccount | get_termfreq () const |
| The number of documents which this term indexes. | |
| Xapian::doccount | get_reltermfreq () const |
| The number of relevant documents which this term indexes. | |
| Xapian::termcount | get_collection_freq () const |
| The collection frequency of the term. | |
| Xapian::termcount | get_query_length () const |
| The length of the query. | |
| Xapian::termcount | get_wqf () const |
| The within-query-frequency of this term. | |
| Xapian::termcount | get_doclength_upper_bound () const |
| An upper bound on the maximum length of any document in the shard. | |
| Xapian::termcount | get_doclength_lower_bound () const |
| A lower bound on the minimum length of any document in the shard. | |
| Xapian::termcount | get_wdf_upper_bound () const |
| An upper bound on the wdf of this term in the shard. | |
| Xapian::totallength | get_total_length () const |
| Total length of all documents in the collection. | |
| Xapian::termcount | get_unique_terms_upper_bound () const |
| An upper bound on the number of unique terms in any document in the shard. | |
| Xapian::termcount | get_unique_terms_lower_bound () const |
| A lower bound on the number of unique terms in any document in the shard. | |
| Xapian::termcount | get_db_doclength_upper_bound () const |
| An upper bound on the maximum length of any document in the database. | |
| Xapian::termcount | get_db_doclength_lower_bound () const |
| A lower bound on the minimum length of any document in the database. | |
| Xapian::termcount | get_db_unique_terms_upper_bound () const |
| An upper bound on the number of unique terms in any document in the database. | |
| Xapian::termcount | get_db_unique_terms_lower_bound () const |
| A lower bound on the number of unique terms in any document in the database. | |
| Xapian::termcount | get_db_wdf_upper_bound () const |
| An upper bound on the wdf of this term in the database. | |
This class implements the "traditional" Probabilistic Weighting scheme, as described by the early papers on Probabilistic Retrieval. BM25 generally gives better results.
TradWeight(k) is equivalent to BM25Weight(k, 0, 0, 1, 0), and since Xapian 2.0.0 TradWeight is actually implemented as a subclass of BM25Weight. In earlier versions it was a separate class which was equivalent except that it returned weights (k+1) times smaller.
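The equivalence can be sketched by substituting the parameters into the BM25 formula. The derivation below assumes the commonly cited form of BM25, and the symbols f, q, L and w_t are notation introduced here for illustration rather than names taken from the Xapian sources, so details of the actual implementation may differ.

```latex
% Assumed notation (not from the Xapian docs): f = wdf, q = wqf,
% L = max(doclen / average length, min_normlen),
% w_t = the term-weight (idf-like) factor, which is unaffected by these parameters.
\begin{align*}
\text{BM25:}\quad
  & w_t \cdot \frac{(k_3 + 1)\, q}{k_3 + q}
        \cdot \frac{(k_1 + 1)\, f}{k_1 \bigl( (1 - b) + b\,L \bigr) + f}
    \;+\; (\text{correction term proportional to } k_2) \\[4pt]
k_1 = k,\; k_2 = 0,\; k_3 = 0,\; b = 1,\; \text{min\_normlen} = 0:\quad
  & w_t \cdot \frac{(k + 1)\, f}{k\,L + f} \\[4pt]
\text{earlier standalone TradWeight:}\quad
  & w_t \cdot \frac{f}{k\,L + f}
\end{align*}
```

With k2 = 0 the correction term vanishes, with k3 = 0 the query factor reduces to q/q = 1, and with b = 1 the length normalisation is applied in full. The last two lines differ only by the constant factor (k+1), which matches the scaling difference described above and does not change the ranking for a single-term query.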
|
inline explicit |
Construct a TradWeight.
| k | A non-negative parameter controlling how influential within-document-frequency (wdf) and document length are. k=0 means that wdf and document length don't affect the weights. The larger k is, the more they do. (default 1) |
References Xapian::BM25Weight::BM25Weight().
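To illustrate how k controls the influence of wdf and document length, here is a minimal standalone sketch. It does not call Xapian: `trad_wdf_factor` is a hypothetical helper, and the formula it evaluates is the BM25-compatible form (k+1)·wdf / (k·normlen + wdf), assumed from the stated equivalence with BM25Weight(k, 0, 0, 1, 0). The name `normlen` (document length divided by the average length) is likewise an assumption made for this example.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, not part of the Xapian API.
// Evaluates the wdf-dependent factor of the traditional formula in its
// BM25-compatible scaling:
//     (k + 1) * wdf / (k * normlen + wdf)
// where normlen is assumed to be doclen / average document length.
double trad_wdf_factor(double k, double wdf, double normlen) {
    if (wdf == 0.0) {
        // The term does not occur in the document, so it contributes nothing
        // (this also avoids 0/0 when k is 0).
        return 0.0;
    }
    return (k + 1.0) * wdf / (k * normlen + wdf);
}

// Small self-check of the behaviour described for the k parameter.
void check_k_behaviour() {
    // k = 0: the factor is 1 whatever the wdf or document length.
    assert(trad_wdf_factor(0.0, 1.0, 1.0) == 1.0);
    assert(trad_wdf_factor(0.0, 7.0, 3.0) == 1.0);

    // k = 1: a higher wdf raises the factor, a longer document lowers it.
    assert(std::fabs(trad_wdf_factor(1.0, 3.0, 1.0) - 1.5) < 1e-12);
    assert(std::fabs(trad_wdf_factor(1.0, 3.0, 2.0) - 1.2) < 1e-12);

    // A larger k makes the same wdf count for more.
    assert(trad_wdf_factor(2.0, 3.0, 1.0) > trad_wdf_factor(1.0, 3.0, 1.0));
}
```

Under these assumptions, k = 0 gives a factor of exactly 1 for any matching document, while k = 1 gives 1.5 for wdf = 3 in an average-length document and 1.2 when the same document is twice the average length.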