Based on our communications with participants, here are answers to some questions you likely have about this task. If you have further questions, please let us know.
A 'passage' is any child element of the abstract, the description, or the claims. These are typically <p> elements, but may also be <claim> or <heading> elements.
We are aware that headings are not particularly informative, but we could not exclude them a priori, since the portions of patent text indicated as relevant in the search reports sometimes cover them.
So a passage relevant to a given set of claims is one or more child elements of the abstract, description, or claims tags. When the whole abstract, description, or set of claims is considered relevant, we chose to list all children of the corresponding XML element. Participants should do the same.
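For illustration, here is a minimal sketch of how such passages could be collected from a patent XML file. The section names (abstract, description, claims) follow the text above; the exact schema and the use of an 'id' attribute as the passage identifier are assumptions, not part of the official collection description.

    import xml.etree.ElementTree as ET

    def collect_passages(xml_path):
        """Return (section, passage_id, tag) for every child of the
        abstract, description and claims sections of a patent XML file.
        Using the 'id' attribute as the passage identifier is an assumption."""
        root = ET.parse(xml_path).getroot()
        passages = []
        for section in ("abstract", "description", "claims"):
            for sec_el in root.iter(section):
                for child in sec_el:  # e.g. <p>, <claim>, <heading>
                    passages.append((section, child.get("id"), child.tag))
        return passages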
The relevance judgements were created manually, extracted from the search reports.
The qrels are an as-close-as-possible representation of the portions of text indicated as relevant in the search reports. Look here for an example of a European Search Report.
At this stage, we have only used X and Y level citations from the search reports. In the previous CLEF-IP campaigns, such X/Y citations were considered equally (highly) relevant in the 'Prior Art'-like tasks (level 2 in the qrels).
There is no passage ranking in the training relevance judgements. We ask you, however, to rank your retrieved passages.
We are aware that the paragraphs marked as relevant, as a consequence of how they are indicated in the search reports, are not always spot on. Most likely, a system will return one child element highly ranked and the rest of its siblings much lower on the list. The evaluation metric will take this into account and will not be a simple MAP of the kind we are used to.
Looking more closely at the search reports, we can interpret the data listed in them as: "for the set of claims C in the patent application PTop, the existing patent application PRel is relevant, with the following passages being very relevant: page 3, line 15 - page 4, line 26; page 6, lines 12-35."
Assuming that these relevant lines correspond to paragraphs p1-p7 and p10-p12 in the PRel document, the relevance judgements will look like:
    top-C Q0 PRel p1
    top-C Q0 PRel p2
    top-C Q0 PRel p3
    top-C Q0 PRel p4
    top-C Q0 PRel p5
    top-C Q0 PRel p6
    top-C Q0 PRel p7
    top-C Q0 PRel p10
    top-C Q0 PRel p11
    top-C Q0 PRel p12

While we are still discussing how exactly it is best to score results, we think that a system which returned paragraphs p2 and p10, for instance, as the top two should get an almost perfect score, indicating that the system has brought the user to the two sections which the examiner considered useful.
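As a sketch of how judgements in this format could be written out programmatically, the snippet below reproduces the example above. The output format follows the qrels lines shown; the helper function, its name, and the output file name are hypothetical.

    def write_qrels(topic, rel_doc, paragraph_ids, out_path):
        """Append one qrels line per relevant paragraph, in the
        format shown above: <topic> Q0 <document> <paragraph>."""
        with open(out_path, "a") as out:
            for pid in paragraph_ids:
                out.write(f"{topic} Q0 {rel_doc} {pid}\n")

    # Example from the text: paragraphs p1-p7 and p10-p12 of PRel
    # are relevant to the claims set C of the topic patent PTop.
    paragraphs = [f"p{i}" for i in range(1, 8)] + ["p10", "p11", "p12"]
    write_qrels("top-C", "PRel", paragraphs, "qrels.txt")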