R2C2 run submission instructions
Dry run submission instructions version 20250912
A few toy run files (for both PR and AC subtasks) can be found here
PR runs
NUMBER OF RUNS
We allow up to 4 PR runs from each team.
FILE NAMES
[TeamName]-{PG|PO}-[RunNumber]
where
PG means that the participating group generated their own passages from the documents while
PO means that the participating group used the passages provided by the organisers; and
[RunNumber] is an integer in the [1-4] range.
For example, if your [TeamName] is WASEDA and you used the organisers’ passages, your third PR run should be named
WASEDA-PO-3.
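The naming rule above can be checked mechanically before submission. The following is a minimal sketch, not an official checker: the function name parse_pr_run_name is hypothetical, and the assumption that team names contain only letters and digits is ours, since the instructions do not define the allowed characters.

```python
import re

# Assumption: team names consist of letters and digits only; the
# instructions do not define the allowed characters.
PR_RUN_NAME = re.compile(r"(?P<team>[A-Za-z0-9]+)-(?P<kind>PG|PO)-(?P<run>[1-4])")


def parse_pr_run_name(name: str) -> dict:
    """Split a PR run name into its parts, or raise ValueError."""
    match = PR_RUN_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"not a valid PR run name: {name!r}")
    return {
        "team": match["team"],
        "passages": match["kind"],  # "PG" = own passages, "PO" = organisers' passages
        "run": int(match["run"]),   # integer in the [1-4] range
    }


print(parse_pr_run_name("WASEDA-PO-3"))
```

A name such as WASEDA-PO-5 is rejected because the run number falls outside the [1-4] range.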
FILE FORMAT
Character encoding: UTF-8.
Your PR run file should look like this (please check the toy run files above):
[qID];[PassageRank];[docID];[PassageText]
:
For each question, up to 20 passages may be returned.
Therefore, the range of [PassageRank] is [1-20].
[docID] is the ID of the document in the corpus from which the passage was extracted.
For example:
D001;1;docIDabababa;This is a passage
D001;2;docIDcdcdcdc;This is also a passage
:
D004;1;docIDefefefe;Another passage blah blah
:
D004;20;docIDxyxyxy;blah blah blah
D005;1;docIDyzyzyz;blahblahblahblah
:
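A PR run file can be sanity-checked line by line before it is zipped. The sketch below is one possible way to do this and is not an official validator. The function names are hypothetical. Two assumptions are ours and are not stated in the instructions: a passage is written on a single line and may itself contain semicolons, so only the first three semicolons are treated as separators, and a (qID, PassageRank) pair may appear only once per run.

```python
def parse_pr_line(line: str) -> tuple[str, int, str, str]:
    """Parse one '[qID];[PassageRank];[docID];[PassageText]' line."""
    # Assumption: split on the first three semicolons only, so that
    # semicolons inside the passage text are preserved.
    parts = line.rstrip("\n").split(";", 3)
    if len(parts) != 4:
        raise ValueError(f"expected 4 semicolon-separated fields: {line!r}")
    qid, rank_text, doc_id, passage = parts
    try:
        rank = int(rank_text)
    except ValueError:
        raise ValueError(f"PassageRank is not an integer: {line!r}") from None
    if not 1 <= rank <= 20:
        raise ValueError(f"PassageRank must be in [1-20]: {line!r}")
    return qid, rank, doc_id, passage


def check_pr_run(lines) -> int:
    """Check every line of a run and return the number of passages."""
    seen = set()
    for line in lines:
        qid, rank, _doc_id, _passage = parse_pr_line(line)
        # Assumption: each (qID, rank) pair is unique within a run.
        if (qid, rank) in seen:
            raise ValueError(f"duplicate rank {rank} for question {qid}")
        seen.add((qid, rank))
    return len(seen)
```

To check a file, open it with encoding="utf-8" and pass the file object to check_pr_run. A line that uses a colon in place of the first semicolon is rejected because it yields only three fields.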
HOW TO SUBMIT THE RUNS
Please create a zip file called
[TeamName]-PR.zip
e.g.
WASEDA-PR.zip
that contains all of your PR runs.
Please send it by email to ntcir19r2c2org@list.waseda.jp. We will send you a confirmation of receipt.
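If you prefer to build the zip file from a script, the following minimal sketch shows one way to do it with the Python standard library. The function name zip_runs and its parameters are hypothetical. Storing the run files at the top level of the archive, without subdirectories, is our assumption and is not stated in the instructions.

```python
import zipfile
from pathlib import Path


def zip_runs(team: str, subtask: str, run_files, out_dir: str = ".") -> Path:
    """Pack up to 4 run files into [TeamName]-PR.zip or [TeamName]-AC.zip."""
    if subtask not in {"PR", "AC"}:
        raise ValueError("subtask must be 'PR' or 'AC'")
    run_files = [Path(f) for f in run_files]
    if not 1 <= len(run_files) <= 4:
        raise ValueError("between 1 and 4 runs are allowed per subtask")
    archive = Path(out_dir) / f"{team}-{subtask}.zip"
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for run_file in run_files:
            # Assumption: run files sit at the top level of the archive.
            zf.write(run_file, arcname=run_file.name)
    return archive
```

For example, zip_runs("WASEDA", "PR", ["WASEDA-PO-1", "WASEDA-PO-2"]) would write WASEDA-PR.zip to the current directory, provided both run files exist there.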
AC runs
NUMBER OF RUNS
We allow up to 4 AC runs from each team.
FILE NAMES
[TeamName]-AC-[RunNumber]
where
[RunNumber] is an integer in the [1-4] range.
For example, if your [TeamName] is WASEDA, your third AC run should be named
WASEDA-AC-3.
FILE FORMAT
Character encoding: UTF-8.
Your AC run file should look like this (please check the toy run files above):
<D001>
[Answer];[Confidence]
[NuggetNum];[PRrunname];[PassageRank];[Nugget]
:
[NuggetNum];[PRrunname];[PassageRank];[Nugget]
</D001>
:
<D005>
:
</D005>
[Answer] is the textual answer to the question;
[Confidence] is the confidence score for that answer, an integer in the [0-100] range;
[NuggetNum] is N if that line represents your N-th nugget.
[PRrunname] and [PassageRank] form a PassageKey and represent a unique passage from the PR runs.
For example, if the nugget was extracted from the third passage of a PR run called
WASEDA-PO-1, then [PRrunname]=WASEDA-PO-1 and [PassageRank]=3.
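The PassageKey idea can be illustrated with a small parser for nugget lines. This is a minimal sketch under our own assumptions: the names Nugget and parse_nugget_line are hypothetical, and the nugget text is assumed to sit on one line and may contain semicolons, so only the first three semicolons are treated as separators.

```python
from typing import NamedTuple


class Nugget(NamedTuple):
    number: int          # [NuggetNum]
    passage_key: tuple   # ([PRrunname], [PassageRank])
    text: str            # [Nugget]


def parse_nugget_line(line: str) -> Nugget:
    """Parse one '[NuggetNum];[PRrunname];[PassageRank];[Nugget]' line."""
    # Assumption: only the first three semicolons separate fields.
    parts = line.rstrip("\n").split(";", 3)
    if len(parts) != 4:
        raise ValueError(f"expected 4 semicolon-separated fields: {line!r}")
    number, pr_run, rank, text = parts
    return Nugget(int(number), (pr_run, int(rank)), text)


nugget = parse_nugget_line("3;WASEDA-PO-1;2;nugget nugget nugget")
print(nugget.passage_key)
```

Two nuggets that share the same passage_key were extracted from the same passage of the same PR run.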
For example, the content of the question element for D004 might look like this:
<D004>
Anthony Mackie;90
1;WASEDA-PO-1;1;I am a nugget
2;WASEDA-PO-1;1;I am also a nugget
3;WASEDA-PO-1;2;nugget nugget nugget
4;THUIR-PG-1;1;Peking duck nugget
5;THUIR-PG-1;3;yum yum yum
</D004>
In this example, the first two nuggets were extracted from passage “WASEDA-PO-1;1”;
the third one from passage “WASEDA-PO-1;2”.
Similarly, the fourth nugget was extracted from passage “THUIR-PG-1;1”
and the fifth from passage “THUIR-PG-1;3”.
Note that one run can thus utilise passages from multiple PR runs.
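A question element like the one for D004 can be generated from structured data. The sketch below is illustrative only and is not an official writer. The function name format_question_element is hypothetical, and numbering the nuggets consecutively from 1 in list order is our assumption based on the example. Calling it without an answer produces an element with empty content.

```python
def format_question_element(qid, answer=None, confidence=None, nuggets=()):
    """Render one question element of an AC run as text.

    nuggets is a sequence of (pr_run_name, passage_rank, nugget_text).
    """
    lines = [f"<{qid}>"]
    if answer is not None:
        if not isinstance(confidence, int) or not 0 <= confidence <= 100:
            raise ValueError("confidence must be an integer in [0-100]")
        lines.append(f"{answer};{confidence}")
        # Assumption: nuggets are numbered 1, 2, 3, ... in list order.
        for number, (pr_run, rank, text) in enumerate(nuggets, start=1):
            lines.append(f"{number};{pr_run};{rank};{text}")
    lines.append(f"</{qid}>")
    return "\n".join(lines)


print(format_question_element(
    "D004", "Anthony Mackie", 90,
    [
        ("WASEDA-PO-1", 1, "I am a nugget"),
        ("WASEDA-PO-1", 1, "I am also a nugget"),
        ("WASEDA-PO-1", 2, "nugget nugget nugget"),
        ("THUIR-PG-1", 1, "Peking duck nugget"),
        ("THUIR-PG-1", 3, "yum yum yum"),
    ],
))
```

When writing the full run file, join the elements with newlines and save the result with encoding="utf-8".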
If your system failed to return an answer for a question,
you can leave the content of the question element empty, for example:
<D005>
</D005>
HOW TO SUBMIT THE RUNS
Please create a zip file called
[TeamName]-AC.zip
e.g.
WASEDA-AC.zip
that contains all of your AC runs.
Please send it by email to ntcir19r2c2org@list.waseda.jp. We will send you a confirmation of receipt.