Task definition slides updated August 20, 2019.
The WWW-3 (and DialEval-1) task definitions were presented in Japanese on September 10, 2019.
The topic files (80 WWW-2 topics + 80 WWW-3 topics) are now available! (March 25, 2020) Chinese English
(If you are in a country where Box is blocked, try these links instead: Chinese English)
The Chinese and English baseline runs and their HTML files are now available! (April 13, 2020) download (password-protected)
(If you are in a country where Box is blocked, try this instead: download (password-protected))
You can also download the WWW-1 topics and qrels from the subtask pages.
The NTCIR-14 We Want Web task organisers and the CENTRE (CLEF NTCIR TREC Reproducibility) organisers have joined forces to quantify technological advances, replicability, and reproducibility in web search!
The task consists of Chinese and English subtasks. This page contains general information about the task. If you are interested in the Chinese subtask, please also visit our Chinese subtask page. If you are interested in the English subtask (which addresses replicability and reproducibility as well as the usual ad hoc web search), please also visit our English subtask page.
Please note that all runs are required to process all 160 topics (the 80 WWW-2 test topics plus the 80 new WWW-3 test topics).
May 4, 2020 | Task registrations due [DONE]
May 31, 2020 | Run submissions due [DONE]
June-July 2020 | Relevance assessments [DONE]
Aug 31, 2020 | Evaluation results released [DONE]
Sep 20, 2020 | Draft participant papers due [DONE]
Oct 1, 2020 | Task organisers’ feedback to participants [DONE]
Nov 1, 2020 | All camera-ready papers due
Dec 8-11, 2020 | NTCIR-15 Conference
To register, please do both (A) and (B):
(A) Register online at the NTCIR registration page
(B) Send an email to www3org@list.waseda.jp
with the following information so that we can send you the training data and the download password as soon as possible:
- Team Name
- Principal investigator’s name, affiliation, email address
- Names, affiliations, email addresses of other team members
- Subtasks that you plan to participate in: Chinese, English, or BOTH
Chinese subtask data: please visit the WWW-3 Chinese subtask page.
English subtask data: please visit the WWW-3 English subtask page.
Each team is allowed to submit up to 5 Chinese runs and 5 English runs.
Runs should be generated automatically; no manual intervention is allowed.
The name of the zip file for uploading should be of the form
[TEAMNAME].{zip,gz}.
Note that this file should contain no more than 10 runs (up to 5 Chinese and 5 English runs).
Each run file should be named as follows:
[TEAMNAME]-{C,E}-{CO,DE,CD}-{REV,REP,NEW}-<priority>
e.g.
WASEDA-E-CD-NEW-1
Run file names should NOT have the “.txt” suffix.
{C,E}: C means Chinese subtask; E means English subtask.
{CO,DE,CD}: CO if your run used only the CONTENT field in the topic file (“title” in TREC parlance); DE if your run used only the DESCRIPTION field; CD if your run used both.
{REV,REP,NEW}: REV for revived runs from Tsinghua (English only), REP for replicated/reproduced runs (English only), NEW for original runs (Chinese and English).
priority: an integer between 1 and 5, indicating which runs should be prioritised for inclusion in the pools for relevance assessments. (We hope to include all submitted runs in the pools.)
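As an unofficial sanity check (not part of the task tooling), the naming convention above can be validated with a short script. This is a sketch only; the helper name is hypothetical, and the assumption that team names are alphanumeric is ours, not the organisers’.

import re

# Unofficial sketch: checks a run name against the WWW-3 convention
# [TEAMNAME]-{C,E}-{CO,DE,CD}-{REV,REP,NEW}-<priority> (no ".txt" suffix).
RUN_NAME = re.compile(
    r"^(?P<team>[A-Za-z0-9]+)"    # team name (alphanumeric charset assumed)
    r"-(?P<subtask>[CE])"         # C = Chinese, E = English
    r"-(?P<fields>CO|DE|CD)"      # topic fields used
    r"-(?P<runtype>REV|REP|NEW)"  # revived / replicated / new
    r"-(?P<priority>[1-5])$"      # pooling priority
)

def check_run_name(name: str) -> None:
    m = RUN_NAME.match(name)
    if m is None:
        raise ValueError(f"invalid run name: {name!r}")
    if m["runtype"] in ("REV", "REP") and m["subtask"] != "E":
        raise ValueError("REV and REP runs are English-only")

check_run_name("WASEDA-E-CD-NEW-1")  # passes silently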
The run file format is the same as in previous WWW tasks: the typical TREC run format, except for the first line of the file.
The first line of the run file should be of the form:
<SYSDESC>[insert a short English description of this particular run]</SYSDESC>
e.g.
<SYSDESC>BM25F with Pseudo-Relevance Feedback</SYSDESC>
The rest of the file should be of the form:
[TopicID] 0 [DocumentID] [Rank] [Score] [RunName]
e.g.
0001 0 clueweb12-0006-97-23810 1 27.73 WASEDA-E-CD-NEW-1
0001 0 clueweb12-0009-08-98321 2 25.15 WASEDA-E-CD-NEW-1
:
Note that the run files should contain the results for 160 topics (80 WWW-2 test topics + 80 WWW-3 test topics).
In each run file, please do not include more than 1000 documents per topic.
(The pool depth is expected to be around 20-30.
The measurement depth will be 10: nDCG@10, Q@10, ERR@10 will be used for evaluation.)
Your runs will be evaluated as fully ordered lists, by processing the ranked document IDs as is, using NTCIREVAL.
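To make the format above concrete, here is a minimal unofficial sketch of how a run file might be written. The function and variable names are hypothetical; `ranking` is assumed to map each topic ID to a list of (document ID, score) pairs already sorted by decreasing score.

# Unofficial sketch: writes one run file in the format described above.
def write_run(path, run_name, description, ranking):
    with open(path, "w") as f:
        f.write(f"<SYSDESC>{description}</SYSDESC>\n")  # required first line
        # All 160 topics (80 WWW-2 + 80 WWW-3) must be present in a real run.
        for topic_id in sorted(ranking):
            # At most 1000 documents per topic.
            for rank, (doc_id, score) in enumerate(ranking[topic_id][:1000], start=1):
                # [TopicID] 0 [DocumentID] [Rank] [Score] [RunName]
                f.write(f"{topic_id} 0 {doc_id} {rank} {score} {run_name}\n")

write_run(
    "WASEDA-E-CD-NEW-1",  # file name: no ".txt" suffix
    "WASEDA-E-CD-NEW-1",
    "BM25F with Pseudo-Relevance Feedback",
    {"0001": [("clueweb12-0006-97-23810", 27.73),
              ("clueweb12-0009-08-98321", 25.15)]},
)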
Please submit exactly one zip file (see above) as an email attachment to www3org@list.waseda.jp
with the email subject “WWW-3 run submission”
by the above deadline.
Late submissions cannot be accepted as we need to create pool files right after the deadline.
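For convenience, here is an unofficial sketch of packaging run files into the required [TEAMNAME].zip archive. The glob pattern is illustrative and assumes your run files sit in the current directory; adjust as needed.

import zipfile
from pathlib import Path

# Unofficial sketch: packages run files into [TEAMNAME].zip for submission.
def package_runs(team_name: str) -> None:
    runs = sorted(Path(".").glob(f"{team_name}-*"))
    assert 0 < len(runs) <= 10, "at most 10 runs (up to 5 Chinese + 5 English)"
    with zipfile.ZipFile(f"{team_name}.zip", "w", zipfile.ZIP_DEFLATED) as zf:
        for run in runs:
            zf.write(run, arcname=run.name)

package_runs("WASEDA")  # produces WASEDA.zip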
INQUIRIES: www3org@list.waseda.jp