Last updated: 15 November 2022.
This page provides information and data concerning the corrected NTCIR-14 WWW-2, NTCIR-15 WWW-3, and NTCIR-16 WWW-4 English run results.
For details, please read the WWW CORRECTED EVALUATION RESULTS paper [see Papers section].
Many apologies to those who have already used the WWW-2, WWW-3, and WWW-4 test collections.
If you are using any of the data provided below for your research, please cite the “WWW CORRECTED EVALUATION RESULTS paper” AND the relevant task overview paper(s) [see Papers section].
80 WWW-2 + 80 WWW-3 topics (unchanged)
50 WWW-4 topics (unchanged)
Corrected WWW-2 English qrels file and score matrices (with raw run files)
Corrected WWW-3 English qrels file and score matrices (with raw run files)
Corrected WWW-4 English Gold qrels files and score matrices, with unchanged Bronze-All qrels file and score matrices (with raw run files)
This is a corrected version of the WWW3E8 data set, which contains eight different qrels files for 160 topics (80 WWW-2 topics + 80 WWW-3 topics).
The (corrected) qrels file for the NTCIR-15 WWW-3 task was obtained by merging these eight qrels files.
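The merge described above can be sketched as follows. This is a minimal illustration, not the task organisers' actual procedure: it assumes standard TREC-format qrels lines (topic, iteration, docid, grade) and assumes the merge sums the per-file relevance grades for each topic-document pair; the exact merging rule is described in the overview and corrected-results papers. All file and function names here are hypothetical.

```python
from collections import defaultdict
from pathlib import Path

def read_qrels(path):
    """Parse a TREC-format qrels file: topic, iteration, docid, grade."""
    labels = {}
    for line in Path(path).read_text().splitlines():
        if not line.strip():
            continue
        topic, _iteration, docid, grade = line.split()
        labels[(topic, docid)] = int(grade)
    return labels

def merge_qrels(paths):
    """Merge several qrels files by summing the relevance grades
    for each (topic, docid) pair (an assumed merge rule)."""
    merged = defaultdict(int)
    for path in paths:
        for key, grade in read_qrels(path).items():
            merged[key] += grade
    return merged
```

For example, merging two files in which a document received grades 2 and 1 for the same topic would yield a merged grade of 3 for that topic-document pair under this assumed rule.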
If you are using this data set for your research, please cite the CORRECTED VERSION of the TOIS paper [i.e. the arXiv paper - see Papers section].
Sakai et al.: Corrected Evaluation Results of the NTCIR WWW-2, WWW-3, and WWW-4 English Subtasks, arXiv:2210.10266, 2022.
Mao et al.: Overview of the NTCIR-14 We Want Web Task, Proceedings of NTCIR-14, pp.455-467, 2019. pdf
Sakai et al.: Overview of the NTCIR-15 We Want Web with CENTRE (WWW-3) Task, Proceedings of NTCIR-15, pp.219-234, 2020. pdf
Sakai et al.: Overview of the NTCIR-16 We Want Web with CENTRE (WWW-4) Task, Proceedings of NTCIR-16, pp.234-245, 2022. pdf
Sakai et al.: Relevance Assessments for Web Search Evaluation: Should We Randomise or Prioritise the Pooled Documents? (CORRECTED VERSION), arXiv:2211.00981, 2022.
Sakai et al.: Relevance Assessments for Web Search Evaluation: Should We Randomise or Prioritise the Pooled Documents? ACM TOIS, 40(4), Article 76, 2022. open access pdf CORRIGENDUM
Sakai et al.: WWW3E8: 259,000 Relevance Labels for Studying the Effect of Document Presentation Order for Relevance Assessors, Proceedings of ACM SIGIR 2021, pp.2376-2382, 2021. Note that Table 1 (inter-assessor agreements) in this resource paper is incorrect. The correct values can be found in Table 1 of the CORRECTED VERSION of the TOIS paper.