CLEF-IP 2010 was organized by and Information Retrieval Facility (IRF). These pages are rarely updated.

CLEF-IP 2010 Download Area


Documents related to CLEF-IP 2010

To understand what happened in CLEF-IP 2010 please refer to:

  • CLEF 2010 Notebook Papers
  • Florina Piroi, John Tait, CLEF-IP 2010: Retrieval Experiments in the Intellectual Property Domain, October 2010, Tech. Rep. IRF-TR-2010-00005.
    (download 1, download 2)
  • Florina Piroi, CLEF-IP 2010: Prior Art Candidates Search Evaluation Summary, July 2010, IRF-TR-2010-00003.
    (download 1, download 2)
  • Florina Piroi, CLEF-IP 2010: Classification Task Evaluation Summary, August 2010, IRF-TR-2010-00004.
    (download 1, download 2)

Other documents related to CLEF-IP 2010:

Data


Licensing

The CLEF-IP 2010 corpus is an extract of the MAREC data collection.

MAREC by IRF is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. Permissions beyond the scope of this license may be available at mailto:marec@fandan.net.


CLEF-IP 2010 Corpus

You can download the corpus either as a single archive split into three large volumes, or as two archives, each split into smaller volumes.

The necessary type definition documents: dtds.7z (46 K)

Note: If you are a registered PatOlympics 2010 participant and have already downloaded the data, you do not need to download the files below.

One archive (three volumes)

clef-ip-2010.7z.001 3.8 GB MD5 Hash
clef-ip-2010.7z.002 3.8 GB MD5 Hash
clef-ip-2010.7z.003 1.9 GB MD5 Hash

Two archives

PART 1
clef-ip-2010_1.7z.001 1 GB MD5 Hash
clef-ip-2010_1.7z.002 1 GB MD5 Hash
clef-ip-2010_1.7z.003 1 GB MD5 Hash
clef-ip-2010_1.7z.004 1 GB MD5 Hash
clef-ip-2010_1.7z.005 1 GB MD5 Hash
clef-ip-2010_1.7z.006 1 GB MD5 Hash
clef-ip-2010_1.7z.007 1 GB MD5 Hash
clef-ip-2010_1.7z.008 85 MB MD5 Hash
PART 2
clef-ip-2010_2.7z.001 1 GB MD5 Hash
clef-ip-2010_2.7z.002 1 GB MD5 Hash
clef-ip-2010_2.7z.003 429 MB MD5 Hash
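
Each volume above is listed together with an MD5 hash. The following is a minimal Python sketch for checking a downloaded volume against that value. It assumes the published hash is a 32-character hexadecimal string (the exact format is not shown on this page); the function names `md5_of_file` and `verify_volume` are illustrative only.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so that multi-gigabyte volumes are never loaded into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_volume(path, expected_md5):
    """Compare a file's MD5 digest with the published value.

    Assumption: the published value is a plain hex string; surrounding
    whitespace and letter case are ignored.
    """
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, `verify_volume("clef-ip-2010.7z.001", published_hash)` should return `True` for an intact download, where `published_hash` is the value copied from the corresponding MD5 Hash link.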

Note: We recommend using a download manager that can resume an interrupted download after a network error (e.g. FileZilla, wGetGUI).
Do not open the files directly in the browser. Instead, right-click the link and choose 'Save link as ...'.
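
If no download manager is at hand, resuming can also be scripted. The sketch below shows one possible approach in Python, assuming the files are served over HTTP by a server that supports Range requests (this page does not state that). The function names and any URL passed to them are hypothetical.

```python
import os
import urllib.request


def resume_request(url, partial_path):
    """Build a request asking only for the bytes not yet on disk.

    Returns the request and the offset (size of the partial file, or 0).
    """
    offset = os.path.getsize(partial_path) if os.path.exists(partial_path) else 0
    request = urllib.request.Request(url)
    if offset:
        request.add_header("Range", f"bytes={offset}-")
    return request, offset


def download(url, path, chunk_size=1 << 20):
    """Download url to path, appending to an existing partial file.

    Assumption: the server answers 206 when it honours the Range header.
    Any other status means the full file is being sent, so the partial
    file is overwritten. A file that is already complete typically
    triggers an HTTP 416 error, which is left to the caller to handle.
    """
    request, offset = resume_request(url, path)
    with urllib.request.urlopen(request) as response:
        mode = "ab" if offset and response.status == 206 else "wb"
        with open(path, mode) as out:
            while True:
                chunk = response.read(chunk_size)
                if not chunk:
                    break
                out.write(chunk)
```

Calling `download(url, "clef-ip-2010.7z.001")` again after an interruption would continue from the last byte written, provided the assumptions above hold.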

Unzipping the data

You can download 7-Zip for your system from the project's website.

Windows users can unzip using the context menu (right click on the first archive file).

Linux users can use p7zip or 7za:

    7za x file.7z.001

Extraction continues automatically with the remaining volumes of the archive, provided they are in the same directory as the first one.
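
Before extracting, it can help to confirm that every volume of an archive is present in one directory. A minimal Python sketch follows; the volume counts are taken from the file lists above (3 for clef-ip-2010.7z, 8 for clef-ip-2010_1.7z, 3 for clef-ip-2010_2.7z), while the function names are hypothetical.

```python
import subprocess
from pathlib import Path


def missing_volumes(directory, archive_name, volume_count):
    """Return the names of expected volumes (.001 to .NNN) not found in directory."""
    directory = Path(directory)
    return [
        f"{archive_name}.{number:03d}"
        for number in range(1, volume_count + 1)
        if not (directory / f"{archive_name}.{number:03d}").is_file()
    ]


def extract(directory, archive_name, volume_count):
    """Run `7za x` on the first volume once all volumes are present.

    Assumption: the 7za executable is on the PATH. The -o switch sets
    the output directory (here the directory holding the volumes).
    """
    missing = missing_volumes(directory, archive_name, volume_count)
    if missing:
        raise FileNotFoundError(f"missing volumes: {missing}")
    first = Path(directory) / f"{archive_name}.001"
    subprocess.run(["7za", "x", str(first), f"-o{directory}"], check=True)
```

For the single-archive download, `missing_volumes(".", "clef-ip-2010.7z", 3)` returns an empty list when all three volumes are in the current directory.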


Topics and Qrels

Prior Art Candidates Search

clef-ip-2010_PACTopics.7z 2000 topics
clef-ip-2010_PACTopics-small.7z 500 topics
Qrels for the PAC topics PAC-Qrels.zip (includes qrels for the PACs topic set)

Important: 64 topics have incorrect relevance judgements. Please exclude them from your evaluations.
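
One way to apply this exclusion is to filter the qrels before running an evaluation. The Python sketch below assumes a whitespace-separated qrels layout with the topic ID in the first column (as in TREC-style qrels); the actual layout of PAC-Qrels.zip is not described on this page, and the list of affected topic IDs must be obtained separately. The function name is hypothetical.

```python
def filter_qrels(lines, excluded_topics):
    """Yield the qrels lines whose topic ID is not in excluded_topics.

    Assumption: the topic ID is the first whitespace-separated field.
    Blank lines are dropped.
    """
    excluded = set(excluded_topics)
    for line in lines:
        fields = line.split()
        if fields and fields[0] not in excluded:
            yield line
```

Typical use would be to read the qrels file line by line, pass the lines through `filter_qrels` together with the 64 affected topic IDs, and write the result to a new file that is then given to the evaluation tool.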

Classification Task

clef-ip-2010_CLSTopics.7z 2000 topics
Qrels for the CLS topics CLS-Qrels.zip

Training Topics

clef-ip-2010_PACTraining.7z 300 topics

As an additional training set for the PAC task, you can use last year's (CLEF-IP 2009) data. Contact us for more information.


Disclaimer: This data was specifically assembled for the CLEF-IP track. Please note that in order to use this data you must have signed the CLEF-IP 2010 License Agreement. Contact us for more information.