Last updated: March 1, 2021.
The We Want Web with CENTRE (WWW-4) Task has been accepted by the NTCIR-16 Programme Committee! We will be running an ENGLISH subtask only. (January 16, 2021)
|April 2021||Announcement of the new corpus/Task registrations open|
|Oct 1, 2021||Topics released; task registrations due|
|Nov 15, 2021||Run submissions due|
|Dec 2021-Jan 2022||Relevance assessments|
|Feb 1, 2022||Evaluation results released|
|Feb 1, 2022||Draft task overview paper released|
|Mar 1, 2022||Draft participant paper submissions due|
|May 1, 2022||All camera-ready paper submissions due|
|Jun 2022||NTCIR-16 Conference in NII, Tokyo, Japan|
Please cite the NTCIR-14 WWW-2 and NTCIR-15 WWW-3 Overview papers (see Papers section) whenever you publish a paper using this test collection.
Document collection: clueweb12-B13 (To obtain this corpus, please visit the clueweb12 webpage and follow the procedure. You only need to pay for the hard disk and the shipment. Once your organisation has obtained a clueweb licence, you may optionally utilise the clueweb online service.)
WWW-2 (0001-0080) + WWW-3 (0101-0180) English topics (If you are in a country where Box is blocked, use this link instead.)
Relevance assessments for the above 160 topics (NTCIR-15 WWW-3 version) (If you are in a country where Box is blocked, use this link instead.)
BM25 Baseline run generated by organisers using the ClueWeb12 Batch Service. (If you are in a country where Box is blocked, use this link instead.)
Tsukuba’s KASYS-E-CO-NEW-1 was a top-performing run at the NTCIR-15 WWW-3 English subtask. As described in their paper (KASYS at the NTCIR-15 WWW-3 Task), this is a BERT-based run. For the WWW-4 task, Tsukuba will use exactly the same system to process the new WWW-4 topics on the new target corpus. Hence this “revived” (REV) run represents the current state of the art.
This is a good old adhoc web search task. Please process the NTCIR-16 WWW-4 test topics with your proposed system. Your run will be compared not only with other runs but also with the state of the art from NTCIR-15 WWW-3, namely the REV run (see above). Can you outperform it and become the new state of the art?
Reproducibility is crucial for advancing the state of the art as a research community. Please read the KASYS paper and try to reproduce their system! We will evaluate REP runs in terms of how similar they are to the REV run, and whether the effects over the BM25 baseline are preserved.
Number of runs: Each team can submit up to 5 NEW runs plus 1 REP run. (Plus 1 REV run for University of Tsukuba.)
The integer alone should serve as a unique identifier for each team’s run.
WASEDA-CO-NEW-1, …, WASEDA-CD-NEW-5, WASEDA-CO-REP-6
WASEDA-CD-REP-1, WASEDA-CO-NEW-2, …, WASEDA-CD-NEW-6.
CO means only the content fields of the topic file are used as input.
DE means only the description fields of the topic file are used as input.
CD means both are used as input.
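The naming convention above can be checked mechanically. The following is a minimal sketch, assuming run names follow the pattern [Team]-[CO|DE|CD]-[NEW|REP|REV]-[integer] as the examples suggest; the team-name pattern and the helper names `parse_run_name` and `numbers_are_unique` are illustrative assumptions, not part of the task definition.

```python
import re
from typing import Iterable, NamedTuple


class RunName(NamedTuple):
    team: str      # e.g. "WASEDA"
    fields: str    # "CO", "DE" or "CD" (topic fields used as input)
    run_type: str  # "NEW", "REP" or "REV"
    number: int    # integer that identifies the run within the team


# Assumption: team names consist of letters and digits only.
_RUN_NAME = re.compile(r"^([A-Za-z0-9]+)-(CO|DE|CD)-(NEW|REP|REV)-([0-9]+)$")


def parse_run_name(name: str) -> RunName:
    """Split a run name into its four parts, or raise ValueError."""
    match = _RUN_NAME.match(name)
    if match is None:
        raise ValueError(f"not a valid run name: {name!r}")
    team, fields, run_type, number = match.groups()
    return RunName(team, fields, run_type, int(number))


def numbers_are_unique(names: Iterable[str]) -> bool:
    """True if the trailing integer alone identifies each run in the list."""
    numbers = [parse_run_name(name).number for name in names]
    return len(numbers) == len(set(numbers))
```

For example, `parse_run_name("WASEDA-CO-REP-6")` yields the team, the input fields, the run type, and the integer 6, while two runs of one team that share the same integer make `numbers_are_unique` return `False`.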
As in the previous WWW tasks, run files use the typical TREC run format, except for the first line of the file.
The first line of the run file should be of the form:
<SYSDESC>[insert a short English description of this particular run]</SYSDESC>
<SYSDESC>BM25F with Pseudo-Relevance Feedback</SYSDESC>
The rest of the file should be of the form:
[TopicID] 0 [DocumentID] [Rank] [Score] [RunName]
Each run file should contain the results for all WWW-4 topics. Please do not include more than 1000 documents per topic. The document lists in your run files will be evaluated as fully ordered lists, by processing the ranked document IDs as is.
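Before submitting, it may help to check a run file against the format described above. The following is a minimal sketch of such a check, not an official validator: the function name `check_run` and the exact set of checks are assumptions based on the description on this page, and the topic and document IDs in the usage note are placeholders.

```python
import re
from collections import Counter
from typing import List

SYSDESC_LINE = re.compile(r"^<SYSDESC>.+</SYSDESC>$")
MAX_DOCS_PER_TOPIC = 1000


def check_run(lines: List[str]) -> List[str]:
    """Return a list of problems found in a run file (empty if none)."""
    problems = []
    if not lines or not SYSDESC_LINE.match(lines[0].strip()):
        problems.append("line 1: expected <SYSDESC>...</SYSDESC>")
    docs_per_topic = Counter()
    for lineno, line in enumerate(lines[1:], start=2):
        parts = line.split()
        if len(parts) != 6:
            problems.append(f"line {lineno}: expected 6 fields, found {len(parts)}")
            continue
        topic_id, zero, doc_id, rank, score, run_name = parts
        if zero != "0":
            problems.append(f"line {lineno}: second field should be 0")
        try:
            int(rank)
            float(score)
        except ValueError:
            problems.append(f"line {lineno}: rank or score is not a number")
        docs_per_topic[topic_id] += 1
    for topic_id, count in docs_per_topic.items():
        if count > MAX_DOCS_PER_TOPIC:
            problems.append(f"topic {topic_id}: more than {MAX_DOCS_PER_TOPIC} documents")
    return problems
```

A file whose first line is `<SYSDESC>BM25F with Pseudo-Relevance Feedback</SYSDESC>` and whose remaining lines look like `0201 0 DOC-A 1 12.5 WASEDA-CO-NEW-1` (hypothetical IDs) passes with an empty problem list. The sketch does not check that every WWW-4 topic is covered, since that requires the topic file.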
nDCG, Q-measure, nERR, and iRBU at document cutoff 10 will be computed using NTCIREVAL.
Reproducing the run: the REP run’s document rankings for the WWW-4 topics will be compared against the REV run’s document rankings, using Kendall’s tau, etc.
Reproducing the effect: the REP run’s effect over BM25 for the WWW-4 topics will be compared against the REV run’s effect over BM25, using Effect Ratio, etc.
See the NTCIR-15 WWW-3 overview paper (see Papers section) for more details.
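To make the two reproducibility criteria concrete, here is a minimal sketch under stated assumptions. It computes Kendall's tau over the documents that two rankings share, and it takes Effect Ratio to be the mean per-topic improvement of the REP run over BM25 divided by the mean per-topic improvement of the REV run over BM25. The organisers' exact procedure may differ (for example, in how documents missing from one ranking are handled), so the overview paper remains the authoritative definition; the function names are illustrative.

```python
from itertools import combinations
from typing import Dict, List


def kendall_tau(ranking_a: List[str], ranking_b: List[str]) -> float:
    """Kendall's tau over the documents that appear in both rankings.

    Assumes each ranking lists every document ID at most once.
    """
    position_b = {doc: i for i, doc in enumerate(ranking_b)}
    common = [doc for doc in ranking_a if doc in position_b]
    if len(common) < 2:
        raise ValueError("need at least two shared documents")
    concordant = discordant = 0
    for first, second in combinations(common, 2):
        # `first` precedes `second` in ranking_a by construction.
        if position_b[first] < position_b[second]:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)


def mean_delta(run: Dict[str, float], baseline: Dict[str, float]) -> float:
    """Mean per-topic score difference (run minus baseline) over shared topics."""
    topics = run.keys() & baseline.keys()
    if not topics:
        raise ValueError("no shared topics")
    return sum(run[t] - baseline[t] for t in topics) / len(topics)


def effect_ratio(rep: Dict[str, float], rev: Dict[str, float],
                 bm25: Dict[str, float]) -> float:
    """Ratio of the REP run's mean improvement over BM25 to the REV run's."""
    rev_effect = mean_delta(rev, bm25)
    if rev_effect == 0:
        raise ZeroDivisionError("REV run shows no mean effect over BM25")
    return mean_delta(rep, bm25) / rev_effect
```

Each dictionary maps a topic ID to a per-topic score such as nDCG at cutoff 10. Under this reading, a ratio near 1 means the REP run preserves the REV run's effect over BM25, a ratio between 0 and 1 means a smaller effect, and a negative ratio means the effect is reversed.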
Zhumin Chu (Tsinghua University, P.R.C.)
Nicola Ferro (University of Padua, Italy)
Yiqun Liu (Tsinghua University, P.R.C.)
Maria Maistro (University of Copenhagen, Denmark)
Tetsuya Sakai (Waseda University, Japan)
Ian Soboroff (NIST, USA)
Sijie Tao (Waseda University, Japan)