NTCIR We Want Web with CENTRE Task


Last updated: March 1, 2021.
Twitter: @ntcirwww


Updates

The We Want Web with CENTRE (WWW-4) Task has been accepted by the NTCIR-16 Programme Committee! We will be running an ENGLISH subtask only. (January 16, 2021)


Important Dates

April 2021 Announcement of the new corpus; task registrations open
Oct 1 2021 Topics released; task registrations due
Nov 15 2021 Run submissions due
Dec 2021-Jan 2022 Relevance assessments
Feb 1, 2022 Evaluation results released
Feb 1, 2022 Draft task overview paper released
Mar 1, 2022 Draft participant paper submissions due
May 1, 2022 All camera-ready paper submissions due
Jun 2022 NTCIR-16 Conference in NII, Tokyo, Japan

Data

Training data: NTCIR-14 WWW-2 and NTCIR-15 WWW-3 English test collections

Please cite the NTCIR-14 WWW-2 and NTCIR-15 WWW-3 Overview papers (see Papers section) whenever you publish a paper using these test collections.

Document collection: ClueWeb12-B13 (To obtain this corpus, please visit the ClueWeb12 webpage and follow the procedure. You only need to pay for the hard disk and the shipment. Once your organisation has obtained a ClueWeb12 licence, you may optionally utilise the ClueWeb12 online service.)

WWW-2 (0001-0080) + WWW-3 (0101-0180) English topics (If you are in a country where Box is blocked, use this link instead.)

Relevance assessments for the above 160 topics (NTCIR-15 WWW-3 version) (If you are in a country where Box is blocked, use this link instead.)

BM25 Baseline run generated by organisers using the ClueWeb12 Batch Service. (If you are in a country where Box is blocked, use this link instead.)

Test data (Topics, corpus, BM25 baseline run)

TBA


Run Types

REV (revived) run – for University of Tsukuba only

Tsukuba’s KASYS-E-CO-NEW-1 was a top-performing run at the NTCIR-15 WWW-3 English subtask. As described in their paper (KASYS at the NTCIR-15 WWW-3 Task), this is a BERT-based run. For the WWW-4 task, Tsukuba will use exactly the same system to process the new WWW-4 topics on the new target corpus. Hence this “revived” run represents the current state of the art.

NEW (standard adhoc) runs

This is a good old adhoc web search task. Please process the NTCIR-16 WWW-4 test topics with your proposed system. Your run will be compared to not only other runs but also the state of the art from NTCIR-15 WWW-3, namely the REV run (see above). Can you outperform it and become the new state of the art?

REP (reproduced) runs

Reproducibility is crucial for advancing the state of the art as a research community. Please read the KASYS paper and try to reproduce their system! We will evaluate REP runs in terms of how similar they are to the REV run, and whether the effects over the BM25 baseline are preserved.


Runs

Run File Format

Number of runs: Each team can submit up to 5 NEW runs plus 1 REP run. (Plus 1 REV run for University of Tsukuba.)

File names
[TEAMNAME]-{CO,DE,CD}-{NEW,REP}-[1-6]
The integer alone should serve as a unique identifier for each team’s run.
e.g.
WASEDA-CO-NEW-1, …, WASEDA-CD-NEW-5, WASEDA-CO-REP-6
or
WASEDA-CD-REP-1, WASEDA-CO-NEW-2, …, WASEDA-CD-NEW-6.

CO means only the content fields of the topic file are used as input.
DE means only the description fields of the topic file are used as input.
CD means both are used as input.
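A quick sanity check of the naming scheme above might be sketched as follows. This is only an illustration: it assumes team names are alphanumeric, and it follows the `{NEW,REP}` pattern given above (Tsukuba's single REV run would need an analogous name, which is not specified here).

```python
import re

# Pattern for the WWW-4 run file names described above:
#   [TEAMNAME]-{CO,DE,CD}-{NEW,REP}-[1-6]
# Assumption: team names are alphanumeric (e.g. WASEDA, KASYS).
RUN_NAME = re.compile(r"^[A-Za-z0-9]+-(CO|DE|CD)-(NEW|REP)-[1-6]$")

def is_valid_run_name(name: str) -> bool:
    """Return True if the run file name matches the stated scheme."""
    return RUN_NAME.fullmatch(name) is not None
```

For example, `WASEDA-CO-NEW-1` and `WASEDA-CD-REP-6` pass, while a run tag like `WASEDA-CO-NEW-7` (integer out of range) does not.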

File format
As in the previous WWW tasks, this is the typical TREC run format, except for the first line in the file.

The first line of the run file should be of the form:
<SYSDESC>[insert a short English description of this particular run]</SYSDESC>
e.g.
<SYSDESC>BM25F with Pseudo-Relevance Feedback</SYSDESC>

The rest of the file should be of the form:
[TopicID] 0 [DocumentID] [Rank] [Score] [RunName]
e.g.
0201 0 clueweb12-0000tw-00-00001 1 27.73 WASEDA-CO-NEW-1
0201 0 clueweb12-0000tw-00-00002 2 25.15 WASEDA-CO-NEW-1
:

Each run file should contain the results for all WWW-4 topics. Please do not include more than 1000 documents per topic. The document lists in your run files will be evaluated as fully ordered lists, by processing the ranked document IDs as is.
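The format above can be produced with a few lines of code. The sketch below is a minimal illustration; the document IDs in the example call are placeholders, not real ClueWeb12 IDs.

```python
def write_run(path, run_name, description, results):
    """Write a run file in the WWW-4 format described above.
    results maps a topic ID to a list of (document ID, score) pairs,
    already sorted by descending score."""
    with open(path, "w") as f:
        # The first line is the SYSDESC line.
        f.write(f"<SYSDESC>{description}</SYSDESC>\n")
        for topic_id in sorted(results):
            # At most 1000 documents per topic, ranks starting from 1.
            for rank, (doc_id, score) in enumerate(results[topic_id][:1000], start=1):
                f.write(f"{topic_id} 0 {doc_id} {rank} {score} {run_name}\n")

# Example usage with placeholder document IDs:
write_run("WASEDA-CO-NEW-1", "WASEDA-CO-NEW-1",
          "BM25F with Pseudo-Relevance Feedback",
          {"0201": [("clueweb12-0000tw-00-00001", 27.73),
                    ("clueweb12-0000tw-00-00002", 25.15)]})
```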

Submitting runs

TBA


Evaluation Measures

Retrieval Effectiveness
nDCG, Q-measure, nERR, and iRBU at document cutoff 10 will be computed using NTCIREVAL.
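As an illustration of one of these measures, a common formulation of nDCG at cutoff 10 is sketched below, using the gain/log2(rank+1) discount. This is not NTCIREVAL itself, which supports several variants; official scores will come from NTCIREVAL.

```python
from math import log2

def ndcg_at_k(ranked_gains, all_gains, k=10):
    """nDCG at cutoff k with the common gain/log2(rank+1) discount.
    ranked_gains: gains of the documents in the run's ranked order.
    all_gains: all known gains for the topic (for the ideal ranking).
    Illustration only; NTCIREVAL's variants differ in details."""
    dcg = sum(g / log2(r + 1)
              for r, g in enumerate(ranked_gains[:k], start=1))
    ideal = sorted(all_gains, reverse=True)[:k]
    idcg = sum(g / log2(r + 1) for r, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0
```

A run that places all relevant documents in ideal order scores 1.0; burying a relevant document lowers the score.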

Reproducibility

Reproducing the run: the REP run’s document rankings for the WWW-4 topics will be compared against the REV run’s document rankings, using Kendall’s tau, etc.
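For intuition, a bare-bones Kendall's tau over two rankings of the same item set might look like the sketch below. The official comparison is more involved (e.g. handling documents that appear in only one of the two runs); this is only an illustration.

```python
from itertools import combinations

def kendall_tau(rank_a, rank_b):
    """Kendall's tau between two rankings of the same item set.
    rank_a, rank_b: lists of document IDs in ranked order.
    Illustration only; assumes both lists contain exactly the same items."""
    pos_a = {d: i for i, d in enumerate(rank_a)}
    pos_b = {d: i for i, d in enumerate(rank_b)}
    concordant = discordant = 0
    for x, y in combinations(rank_a, 2):
        s = (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y])
        if s > 0:
            concordant += 1   # pair ordered the same way in both rankings
        elif s < 0:
            discordant += 1   # pair ordered oppositely
    n = len(rank_a)
    return (concordant - discordant) / (n * (n - 1) / 2)
```

Identical rankings give tau = 1, exactly reversed rankings give tau = -1.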

Reproducing the effect: the REP run’s effect over BM25 for the WWW-4 topics will be compared against the REV run’s effect over BM25, using Effect Ratio, etc.
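The Effect Ratio idea can be sketched as follows: sum the REP run's per-topic improvements over BM25 and divide by the sum of the REV run's per-topic improvements over BM25. This is only a sketch of the idea; see the WWW-3 overview paper for the exact definition used in the evaluation.

```python
def effect_ratio(rep_scores, rev_scores, bm25_scores):
    """Sketch of the Effect Ratio (ER).
    Each argument is a list of per-topic effectiveness scores
    (e.g. nDCG@10), aligned by topic across the three runs.
    ER near 1 means the REP run preserves the REV run's effect over BM25."""
    delta_rep = sum(r - b for r, b in zip(rep_scores, bm25_scores))
    delta_rev = sum(v - b for v, b in zip(rev_scores, bm25_scores))
    return delta_rep / delta_rev
```

For example, a REP run whose per-topic gains over BM25 are exactly half of the REV run's gains would score ER = 0.5.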

See the NTCIR-15 WWW-3 overview paper (see Papers section) for more details.

Organisers

www4org@list.waseda.jp

Zhumin Chu (Tsinghua University, P.R.C.)
Nicola Ferro (University of Padua, Italy)
Yiqun Liu (Tsinghua University, P.R.C.)
Maria Maistro (University of Copenhagen, Denmark)
Tetsuya Sakai (Waseda University, Japan)
Ian Soboroff (NIST, USA)
Sijie Tao (Waseda University, Japan)


Papers


Past Rounds

NTCIR-15 WWW-3
NTCIR-14 WWW-2
CENTRE (at CLEF 2018-2019, NTCIR-14, and TREC 2018)
NTCIR-13 WWW-1