Last updated: November 5, 2021.
How to submit the runs? Please see below. (November 5, 2021)
We have provided a vanilla BM25 baseline run to the participants. (November 4, 2021)
Test topics are here! The task organisers will serve as gold assessors! (November 1, 2021)
THE NEW CORPUS IS READY! It contains 82,451,337 HTML documents based on Common Crawl data from 2021. Register for our task and send an email to email@example.com to obtain it! (September 9, 2021)
The We Want Web with CENTRE (WWW-4) Task has been accepted by the NTCIR-16 Programme Committee! We will be running an ENGLISH subtask only. (January 16, 2021)
|Nov 1, 2021|Topics released; task registrations due|
|Dec 15, 2021|Run submissions due|
|Dec 2021-Jan 2022|Relevance assessments|
|Feb 1, 2022|Evaluation results released|
|Feb 1, 2022|Draft task overview paper released|
|Mar 1, 2022|Draft participant paper submissions due|
|May 1, 2022|All camera-ready paper submissions due|
|June 14-17, 2022|NTCIR-16 Conference at NII, Tokyo, Japan|
Please cite the NTCIR-14 WWW-2 and NTCIR-15 WWW-3 Overview papers (see Papers section) whenever you publish a paper using this test collection.
Document collection: We are constructing a new target corpus based on Common Crawl data from 2021. Details TBA.
WWW-2 (0001-0080) + WWW-3 (0101-0180) English topics (If you are in a country where Box is blocked, use this link instead.)
Relevance assessments for the above 160 topics (NTCIR-15 WWW-3 version) (If you are in a country where Box is blocked, use this link instead.)
BM25 Baseline run generated by organisers using the ClueWeb12 Batch Service. (If you are in a country where Box is blocked, use this link instead.)
Tsukuba’s KASYS-E-CO-NEW-1 was a top-performing run at the NTCIR-15 WWW-3 English subtask. As described in their paper (KASYS at the NTCIR-15 WWW-3 Task), this is a BERT-based run. For the WWW-4 task, Tsukuba will use exactly the same system to process the new WWW-4 topics on the new target corpus. Hence this “revived” (REV) run represents the current state of the art.
This is a good old adhoc web search task. Please process the NTCIR-16 WWW-4 test topics with your proposed system. Your run will be compared not only with other runs but also with the state of the art from NTCIR-15 WWW-3, namely the REV run (see above). Can you outperform it and become the new state of the art?
Reproducibility is crucial for advancing the state of the art as a research community. Please read the KASYS paper and try to reproduce their system! We will evaluate REP runs in terms of how similar they are to the REV run, and whether the effects over the BM25 baseline are preserved.
Each team can submit up to 5 NEW runs plus 1 REP run. (Plus 1 REV run for University of Tsukuba.)
The integer alone should serve as a unique identifier for each team’s run. For example:
WASEDA-CO-NEW-1, …, WASEDA-CD-NEW-5, WASEDA-CO-REP-6
or
WASEDA-CD-REP-1, WASEDA-CO-NEW-2, …, WASEDA-CD-NEW-6.
CO means only the content fields of the topic file are used as input.
DE means only the description fields of the topic file are used as input.
CD means both are used as input.
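The naming rules above can be checked before submission with a small script. The following is a minimal sketch, not an official validator: the pattern TEAM-FIELDS-TYPE-INTEGER is inferred from the WASEDA examples, and the assumption that team names are purely alphanumeric, as well as the helper names `parse_run_name` and `check_team_runs`, are ours.

```python
import re
from collections import Counter
from typing import List, NamedTuple

# Assumed pattern, inferred from the examples: TEAM-FIELDS-TYPE-INTEGER.
# Team names are assumed to be alphanumeric (no hyphens).
RUN_NAME = re.compile(
    r"(?P<team>[A-Za-z0-9]+)-(?P<fields>CO|DE|CD)-"
    r"(?P<run_type>NEW|REP|REV)-(?P<number>[1-9][0-9]*)"
)


class RunName(NamedTuple):
    team: str
    fields: str    # CO, DE, or CD
    run_type: str  # NEW, REP, or REV
    number: int    # unique within the team


def parse_run_name(name: str) -> RunName:
    """Split a run name into its parts, or raise ValueError."""
    m = RUN_NAME.fullmatch(name)
    if m is None:
        raise ValueError(f"unexpected run name: {name!r}")
    return RunName(m["team"], m["fields"], m["run_type"], int(m["number"]))


def check_team_runs(names: List[str]) -> List[RunName]:
    """Check one team's run names: unique integers, at most 5 NEW and 1 REP."""
    runs = [parse_run_name(n) for n in names]
    numbers = [r.number for r in runs]
    if len(numbers) != len(set(numbers)):
        raise ValueError("the integer must be unique for each of the team's runs")
    counts = Counter(r.run_type for r in runs)
    if counts["NEW"] > 5 or counts["REP"] > 1:
        raise ValueError("at most 5 NEW runs and 1 REP run per team")
    return runs
```

For instance, `check_team_runs(["WASEDA-CO-NEW-1", "WASEDA-CD-REP-2"])` passes, while two run names ending in the same integer are rejected.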
As in the previous WWW tasks, this is the typical TREC run format, except for the first line in the file.
The first line of the run file should be of the form:
<SYSDESC>[insert a short English description of this particular run]</SYSDESC>
<SYSDESC>BM25F with Pseudo-Relevance Feedback</SYSDESC>
The rest of the file should be of the form:
[TopicID] 0 [DocumentID] [Rank] [Score] [RunName]
Each run file should contain the results for all WWW-4 topics. Please do not include more than 1000 documents per topic. The document lists in your run files will be evaluated as fully ordered lists, by processing the ranked document IDs as is.
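As an illustration of the run file format described above, here is a minimal sketch that serialises ranked results into that layout. The function name `format_run`, the topic and document IDs in the usage note, the four-decimal score formatting, and the choice to start ranks at 1 are all assumptions made for illustration; only the line layout and the 1000-document limit come from the description above.

```python
from typing import Dict, List, Tuple


def format_run(
    run_name: str,
    description: str,
    results: Dict[str, List[Tuple[str, float]]],
    max_docs: int = 1000,
) -> str:
    """Serialise results as a SYSDESC line followed by TREC-style lines.

    `results` maps each topic ID to a list of (document ID, score) pairs
    that is already ordered best-first. The order is kept as is, because
    the lists are evaluated in the order in which they appear in the file.
    """
    lines = [f"<SYSDESC>{description}</SYSDESC>"]
    for topic_id in sorted(results):
        # Keep at most `max_docs` documents per topic.
        for rank, (doc_id, score) in enumerate(results[topic_id][:max_docs], start=1):
            lines.append(f"{topic_id} 0 {doc_id} {rank} {score:.4f} {run_name}")
    return "\n".join(lines) + "\n"
```

With hypothetical IDs, `format_run("WASEDA-CO-NEW-1", "BM25F with Pseudo-Relevance Feedback", {"0201": [("doc-a", 2.5), ("doc-b", 1.5)]})` yields the SYSDESC line followed by one line per retrieved document.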
Please submit a single zip file named [TEAMNAME].zip (e.g. WASEDA.zip) to the task organisers (firstname.lastname@example.org). Please include “WWW-4 submission” in the subject of your email message. We will send you a confirmation of receipt.
nDCG, Q-measure, nERR, and iRBU at document cutoff 10 will be computed using NTCIREVAL.
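To give a feel for what a measure at document cutoff 10 does, the sketch below computes a textbook nDCG@10 for a single topic. It is not NTCIREVAL's implementation: the use of the relevance level itself as the gain value, the log2(rank + 1) discount, and the names `dcg` and `ndcg_at_k` are assumptions chosen for illustration. The official scores are those produced by NTCIREVAL.

```python
import math
from typing import Dict, List


def dcg(gains: List[float]) -> float:
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg_at_k(ranked_doc_ids: List[str], qrels: Dict[str, int], k: int = 10) -> float:
    """nDCG at cutoff k for one topic.

    `qrels` maps document IDs to relevance levels (0 = not relevant).
    Documents missing from `qrels` are treated as not relevant.
    """
    gains = [float(qrels.get(doc_id, 0)) for doc_id in ranked_doc_ids[:k]]
    # The ideal list ranks all judged documents by decreasing relevance level.
    ideal = sorted((float(level) for level in qrels.values()), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0
```

Because only the top 10 documents contribute, changes below rank 10 do not affect this score.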
Reproducing the run: the REP run’s document rankings for the WWW-4 topics will be compared against the REV run’s document rankings, using Kendall’s tau, etc.
Reproducing the effect: the REP run’s effect over BM25 for the WWW-4 topics will be compared against the REV run’s effect over BM25, using Effect Ratio, etc.
See the NTCIR-15 WWW-3 overview paper (see Papers section) for more details.
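The two reproducibility checks above can be sketched as follows. This is an illustration under stated assumptions, not the organisers' evaluation code: Kendall's tau is computed here only over documents that appear in both rankings, and the Effect Ratio is taken to be the mean per-topic improvement of the REP run over BM25 divided by the mean per-topic improvement of the REV run over BM25. The overview paper remains the reference for the exact definitions, and the function names are ours.

```python
from typing import List


def kendall_tau(ranking_a: List[str], ranking_b: List[str]) -> float:
    """Kendall's tau over the documents shared by two rankings (no ties)."""
    pos_b = {doc_id: i for i, doc_id in enumerate(ranking_b)}
    common = [doc_id for doc_id in ranking_a if doc_id in pos_b]
    n = len(common)
    if n < 2:
        raise ValueError("need at least two shared documents")
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            # `common` follows ranking_a, so i < j means a ranks i above j.
            if pos_b[common[i]] < pos_b[common[j]]:
                concordant += 1
            else:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


def effect_ratio(rep: List[float], rev: List[float], baseline: List[float]) -> float:
    """Ratio of mean per-topic improvements over the baseline (assumed form).

    Each list holds per-topic scores for the same topics in the same order.
    A value near 1 suggests the REP run preserves the REV run's effect.
    """
    if not (len(rep) == len(rev) == len(baseline)) or not rep:
        raise ValueError("score lists must be non-empty and of equal length")
    mean_rep = sum(r - b for r, b in zip(rep, baseline)) / len(rep)
    mean_rev = sum(r - b for r, b in zip(rev, baseline)) / len(rev)
    if mean_rev == 0:
        raise ValueError("the REV run shows no mean effect over the baseline")
    return mean_rep / mean_rev
```

In this sketch, identical rankings give a tau of 1 and exactly reversed rankings give -1.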
Zhumin Chu (Tsinghua University, P.R.C.)
Yiqun Liu (Tsinghua University, P.R.C.)
Chen Nuo (Waseda University, Japan)
Yujing Li (Waseda University, Japan)
Junjie Wang (Waseda University, Japan)
Tetsuya Sakai (Waseda University, Japan)
Sijie Tao (Waseda University, Japan)
Nicola Ferro (University of Padua, Italy)
Maria Maistro (University of Copenhagen, Denmark)
Ian Soboroff (NIST, USA)