CLEF-IP 2009 was organized by Matrixware GmbH and the Information Retrieval Facility (IRF). These pages are no longer updated.

CLEF-IP 2009 Download Area

Jump to: Documents | Data | Topics and Qrels | Submitted runs

Documents in the Track

To understand what happened during this track, please refer to the following publications:

  • CLEF 2009 Working Notes
  • G. Roda, J. Tait, F. Piroi and V. Zenz, "CLEF-IP 2009: Retrieval Experiments in the Intellectual Property Domain", in Peters, C., Di Nunzio, G.M., Kurimo, M., Mostefa, D., Peñas, A. and Roda, G. (eds.), Multilingual Information Access Evaluation I. Text Retrieval Experiments, 10th Workshop of the Cross-Language Evaluation Forum, CLEF 2009, pp. 385-409, Springer LNCS, Volume 6241, 2010
  • IRF-TR-2009-00001: Florina Piroi, Giovanna Roda, Veronika Zenz, CLEF-IP 2009 Evaluation Summary, July 2009, (download 1, download 2)
  • IRF-TR-2010-00002: Florina Piroi, Giovanna Roda, Veronika Zenz, CLEF-IP 2009: Evaluation Summary - Part Two, July 2010, (download 1, download 2)

Other documents related to this track:




The CLEF-IP 2009 corpus is an extract of the MAREC data collection.

MAREC by the IRF is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. Permissions beyond the scope of this license may be available at

The necessary document type definition (DTD) files: dtds.7z (46 KB)

CLEFIP-1985-1990.tgz 3.5 GB MD5 Hash
CLEFIP-1991-1994.tgz 3.8 GB MD5 Hash
CLEFIP-1995-1998.tgz 4.2 GB MD5 Hash
CLEFIP-1999-2000.tgz 2.0 GB MD5 Hash
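Once an archive has finished downloading, its integrity can be checked against the published MD5 sum before unpacking. A minimal sketch in Python; the file name and expected hash below are placeholders, so use the real archive name and the value behind the corresponding "MD5 Hash" link:

```python
import hashlib

def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, streaming it in 1 MB chunks
    so multi-gigabyte archives do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Placeholder values -- substitute the real archive name and the hash
# published next to it on this page:
# assert md5_of("CLEFIP-1985-1990.tgz") == "<published MD5 sum>"
```

Only extract the tarball once the computed digest matches the published one.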

Note: Internet Explorer 6 and 7 will report an error when you try to save/download files larger than 4 GB. Use another browser instead, or a download manager that can resume the download after network errors (e.g. FileZilla, wGetGUI, etc.).
In any case, do not open the file on the fly; right-click the link and choose 'Save link as ...'.

Alternative downloads for the 1995-1998 data
These files are provided for those who use FAT32 file systems or encounter problems downloading CLEFIP-1995-1998.tgz. If you have already downloaded CLEFIP-1995-1998.tgz successfully, you do not need these files.

CLEFIP-1995-1996.tgz 2.0 GB MD5 Hash
CLEFIP-1997-1998.tgz 2.2 GB MD5 Hash

Here you can find a field-by-field description and content examples of the XML files in the collection.
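As a quick way to explore that field structure yourself, the element paths occurring in a collection file can be listed with the Python standard library. A small sketch; the file name at the bottom is hypothetical:

```python
import xml.etree.ElementTree as ET

def element_paths(xml_path):
    """Return the sorted set of element paths (root/child/...) occurring
    in an XML file -- a quick overview of its field structure."""
    stack, paths = [], set()
    for event, elem in ET.iterparse(xml_path, events=("start", "end")):
        if event == "start":
            stack.append(elem.tag)
            paths.add("/".join(stack))
        else:
            stack.pop()
            elem.clear()  # free finished subtrees to keep memory flat
    return sorted(paths)

# element_paths("some-patent-document.xml")  # hypothetical file name
```

The DTDs in dtds.7z remain the authoritative definition of the document structure; this only shows which elements a given file actually uses.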



For each task of the CLEF-IP 09 track, we provide four topic sets of different sizes: XLarge, Large, Medium, and Small.

Here you can download the track guidelines for these tasks.

Main Task Topics

XLarge: 10,000 topics topics_CLEFIP09_Main_XL.tar.gz (~152 MB)
Large: 5,000 topics topics_CLEFIP09_Main_L.tar.gz (~76 MB)
Medium: 1,000 topics topics_CLEFIP09_Main_M.tar.gz (~16 MB)
Small: 500 topics topics_CLEFIP09_Main_S.tar.gz (~7.6 MB)

Language Task Topics with Bibliographical Data

This version of the topics can be downloaded as a single tarball (~88 MB).


Training Topics: Small Set

Guidelines for using the small training set of topics
Additional notes on the small training set.

topics_CLEFIP09_Main_ts.txt (topics)
relass_CLEFIP09_Main_ts.txt (relevance assessments)

Training Topics: Large Set

Guidelines for using the large training set of topics

topics_CLEFIP09_Main_lts_NEW.tar.gz (~7.5 MB)
topics_CLEFIP09_EN_lts_NEW.tar.gz (~114 KB)
topics_CLEFIP09_DE_lts_NEW.tar.gz (~128 KB)
topics_CLEFIP09_FR_lts_NEW.tar.gz (~128 KB)

The archives contain the following files:

  • one topic_CLEFIP09_<task type>_lts_NEW.txt file containing the training topics in the new format;
  • one relass_CLEFIP09_<task type>_lts.txt file containing the relevance assessments for the training topics (these are the same files as for the previous topic format);
  • the XML files relevant to the topics.

Note: unlike the description in the 'Guidelines' above, the new topic format has a (much) shorter description field, which now contains only the name of the XML file with the data to be searched/indexed.


Relevance Assessments

The relevance assessments for the language tasks are included in those for the Main task.



Submitted Runs

Note: these are the original submitted runs; some clean-up and preparation is needed before running trec_eval on them. The runs are split across five archives (650 MB, 650 MB, 650 MB, 650 MB, and 315 MB).
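What that clean-up involves depends on the run, but the target is the six-column result format trec_eval expects (topic, "Q0", document, rank, score, run tag). A hedged sketch, assuming raw lines of the form "topic-id doc-id score"; the actual submitted formats may differ:

```python
def normalize_run(lines, tag="example-run"):
    """Turn raw 'topic doc score' lines into the six-column TREC result
    format, ranked by descending score within each topic."""
    rows = []
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip malformed lines
        rows.append((parts[0], parts[1], float(parts[2])))
    # trec_eval expects results grouped per topic, best score first
    rows.sort(key=lambda r: (r[0], -r[2]))
    out, prev, rank = [], None, 0
    for topic, doc, score in rows:
        rank = rank + 1 if topic == prev else 1
        prev = topic
        out.append(f"{topic} Q0 {doc} {rank} {score} {tag}")
    return out
```

A normalized run file can then be evaluated with `trec_eval <qrels-file> <run-file>`.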


Disclaimer: this data was specifically assembled for the CLEF-IP track. Please note that in order to use this data you must have signed the CLEF campaign End-User Agreement or the CLEF-IP 09 License Agreement.