
On the Relationship between Retrievability and Effectiveness

October 4th, 2011


Most of the work in the retrievability domain focuses on either analyzing the retrieval bias of retrieval models or proposing retrieval strategies for increasing the retrievability of documents. However, little is known about the relationship between retrievability and effectiveness measures (precision, recall, MAP, etc.). In this research, we examine the relationship between these two goals of IR, retrievability and effectiveness, to determine how far the two measures are correlated with each other. One important aspect of retrievability-based evaluation is that it can be analyzed or estimated without recourse to relevance judgments, which makes it an attractive alternative for automatically ranking retrieval models. It can further be used for tuning the effectiveness of retrieval models, by changing their parameter values or ranking features, so that they perform better on a given collection. We specifically examine to what extent the retrieval bias of a retrieval model affects its effectiveness. That is, if one retrieval model exhibits less retrieval bias than another, is it also more effective than the other at retrieving only the relevant information?
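The retrievability of a document can be made concrete as a cumulative count over a large set of queries. The sketch below illustrates the idea with a flat utility function (a document counts as retrieved if it appears within the top c ranks); the function and variable names are illustrative, not taken from the paper.

```python
from collections import defaultdict

def retrievability_scores(run, c=100):
    """Cumulative retrievability r(d): for each query, count every document
    that appears within the top-c ranks of its result list.
    `run` maps a query id to its ranked list of document ids."""
    r = defaultdict(int)
    for query, ranking in run.items():
        for doc in ranking[:c]:
            r[doc] += 1  # flat utility: 1 if rank <= c, else 0
    return dict(r)

run = {"q1": ["d1", "d2", "d3"], "q2": ["d2", "d1"]}
print(retrievability_scores(run, c=2))  # {'d1': 2, 'd2': 2}
```

Documents that never appear within the cut-off (like d3 above) simply receive no score, which is exactly the inequality that retrieval-bias measures then quantify.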



To analyze the relationship between the effectiveness and the retrieval bias of retrieval models, we use the 1.2 million patent documents from the TREC Chemical Retrieval Track 2009. As we are specifically interested in the relationship between effectiveness measures (such as precision, recall, and MAP) and retrievability, we compute retrievability for only a subset of 35,000 documents, selected at random from the 1.2 million. While it would be possible to compute retrievability scores for all documents (apart from being almost prohibitively expensive due to the enormous number of queries that can be generated over the entire vocabulary), it would seem unfair to use more information for the retrievability measure (thus making it more robust) than for the effectiveness measures. Queries are therefore generated from only the 35,000 documents; note, however, that all queries are of course processed over the entire collection of 1.2 million documents.
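A common way to generate such queries in retrievability studies is to take single terms and adjacent term pairs from each document; the sketch below illustrates that strategy, though the exact query generator used for this data may differ.

```python
def generate_queries(doc_terms):
    """Sketch of exhaustive query generation from one document:
    all distinct single terms plus all distinct adjacent bigrams.
    (One common strategy in retrievability studies; illustrative only.)"""
    singles = sorted(set(doc_terms))
    bigrams = [f"{a} {b}" for a, b in zip(doc_terms, doc_terms[1:])]
    return singles + sorted(set(bigrams))

print(generate_queries(["patent", "claim", "patent", "claim"]))
# ['claim', 'patent', 'claim patent', 'patent claim']
```

Even this simple scheme makes clear why running the generator over the full 1.2 million documents would produce an enormous query set.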

The patent numbers of the 35,000 documents are available here.

Download Document IDs

Retrievability Scores

The retrievability scores of the 35,000-document subset are calculated with 8 standard retrieval models. These include:

(1) Normalized-TFIDF.

(2) TFIDF.

(3) OKAPI-BM25.

(4) Language Modeling (Bayesian Smoothing).

(5) Language Modeling (Jelinek Mercer).

(6) Language Modeling (TwoStage Smoothing).

(7) Language Modeling (Absolute Discounting).

(8) SMART Retrieval Model.
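As an illustration of one of these models, the standard Okapi BM25 term weight can be sketched as follows; the parameter values k1 and b are the usual defaults, not necessarily the settings used in these experiments.

```python
import math

def bm25(tf, df, doc_len, avg_doc_len, n_docs, k1=1.2, b=0.75):
    """Okapi BM25 weight of one query term in one document (standard
    formulation; illustrative of the models listed above)."""
    # idf component: rarer terms get higher weight
    idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
    # tf component: saturating term frequency with length normalization
    norm = tf * (k1 + 1) / (tf + k1 * (1 - b + b * doc_len / avg_doc_len))
    return idf * norm
```

The document score for a query is the sum of these weights over the query terms; the other seven models differ mainly in how they weight term frequency and smooth document statistics.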

Each line of the retrievability scores file contains 6 fields. The first five fields give the retrievability scores at five rank cut-off levels: c=50, c=100, c=150, c=200, and c=250. The last field gives the total number of queries that retrieved the document, without applying any rank cut-off level.
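Under that layout, a line of the file can be parsed as in the following sketch (whitespace-separated fields are assumed; check the downloaded file for the actual delimiter):

```python
CUTOFFS = (50, 100, 150, 200, 250)

def parse_scores_line(line):
    """Parse one line of the retrievability scores file: five scores at
    the rank cut-offs c=50..250, then the total query count."""
    fields = line.split()
    scores = {c: float(v) for c, v in zip(CUTOFFS, fields[:5])}
    total_queries = int(fields[5])
    return scores, total_queries

scores, total = parse_scores_line("3 7 9 12 15 2041")
print(scores[50], total)  # 3.0 2041
```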


Download Retrievability Scores


Evolving an Effective Retrieval Model using Genetic Programming and Retrievability

We further show that this principle can even be applied to learn a completely new retrieval model using a genetic programming approach, where the fitness function is guided solely by the evolving retrieval model's retrieval bias. This offers a promising approach for fine-tuning and optimizing retrieval models for a specific collection and its characteristics without having to invest enormous amounts of effort into manually judging the relevance of documents in the collection.


We run the genetic programming experiments for up to 100 generations, with 50 individuals per generation.
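Retrieval bias is commonly quantified as the Gini coefficient of the retrievability distribution, so a bias-guided fitness function can, for example, reward individuals whose retrievability scores have a lower Gini value. The sketch below shows the standard Gini computation; using 1 - Gini as fitness is our illustrative assumption, not necessarily the exact function used in these experiments.

```python
def gini(values):
    """Gini coefficient of a retrievability distribution: 0 means every
    document is equally retrievable; values near 1 indicate strong bias."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if total == 0:
        return 0.0
    # G = 2 * sum(i * x_i) / (n * total) - (n + 1) / n, with x sorted ascending
    cum = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * cum / (n * total) - (n + 1) / n

def fitness(retrievability):
    """Illustrative GP fitness: higher when retrieval bias is lower."""
    return 1.0 - gini(retrievability)

print(gini([1, 1, 1, 1]))  # 0.0 (no bias)
```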


Download Genetic Programming Log