Last Updated: 17 July 2024
The Neural Cross-Language Information Retrieval (NeuCLIR) track is a TREC shared task that studies the impact of neural approaches on cross-language information retrieval and generation.
You can participate in the shared task by submitting a retrieval or generation system for evaluation. Once the evaluation is complete, the track provides reusable test collections for future investigations.
This year, we continue the Cross-Language (CLIR) and Multilingual (MLIR) news and technical document retrieval tasks. We also introduce a new Report Generation task.
Details on each of the 2024 tasks are provided below:
See the CLIR & MLIR Guidelines for full task details.
In the Cross-Language Retrieval (CLIR) task, systems receive queries in one language (English) and retrieve from a corpus of news articles written in another language (Chinese, Persian, or Russian).
Data is available via ir_datasets or the TREC website.
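As a rough illustration of the CLIR setup, a common baseline is to translate the English query into the document language and then run monolingual retrieval. The sketch below is not an official NeuCLIR baseline: it assumes the third-party `rank_bm25` package and a placeholder `translate` function (any MT system could fill that role), and its whitespace tokenization is a naive stand-in that would not be adequate for Chinese.

```python
# Minimal translate-then-retrieve sketch; not an official NeuCLIR baseline.
from rank_bm25 import BM25Okapi  # third-party BM25 implementation

def translate(text: str, target_lang: str) -> str:
    """Placeholder: plug in any machine translation system here."""
    raise NotImplementedError

def clir_search(english_query: str, docs: list[str], doc_lang: str, k: int = 10) -> list[int]:
    """Return indices of the top-k documents for an English query."""
    # Translate the English query into the document language,
    # then score documents with BM25 in that language.
    query = translate(english_query, target_lang=doc_lang)
    tokenized_docs = [d.split() for d in docs]  # naive whitespace tokenizer
    bm25 = BM25Okapi(tokenized_docs)
    scores = bm25.get_scores(query.split())
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]
```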
See the CLIR & MLIR Guidelines for full task details.
Technical language poses a particular challenge for cross-language retrieval systems, so the Cross-Language Technical Document task focuses specifically on this phenomenon. Systems receive queries in one language (English) and retrieve from a corpus of technical abstracts written in another language (Chinese).
Data is available via ir_datasets or the TREC website.
See the CLIR & MLIR Guidelines for full task details.
The Multilingual Retrieval (MLIR) task provides systems with queries in one language (English) and a corpus of documents written in multiple languages (Chinese, Persian, and Russian). The task is to produce a single ranked list containing documents from all three languages. The queries are written so that relevant documents should exist in more than one language.
Data is available via ir_datasets or the TREC website.
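Because MLIR systems must return one ranked list over documents from all three languages, per-language scores have to be made comparable before merging. The sketch below shows one common approach (min-max score normalization within each language, followed by a global sort); it is an illustration, not the track's prescribed method.

```python
# Merge per-language ranked lists into a single MLIR ranking (illustrative).

def merge_rankings(per_language_runs: dict[str, list[tuple[str, float]]],
                   k: int = 1000) -> list[tuple[str, float]]:
    """per_language_runs maps a language code to (doc_id, score) pairs."""
    merged = []
    for lang, run in per_language_runs.items():
        scores = [score for _, score in run]
        lo, hi = min(scores), max(scores)
        span = (hi - lo) or 1.0  # guard against constant scores
        merged.extend((doc_id, (score - lo) / span) for doc_id, score in run)
    merged.sort(key=lambda pair: pair[1], reverse=True)
    return merged[:k]

# Usage: merge_rankings({"zho": zh_run, "fas": fa_run, "rus": ru_run})
```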
See the Report Generation Guidelines for full task details.
The Cross-Language Report Generation task asks systems to generate an English report, with citations to documents in one of the news collections used in the CLIR task (see the guidelines for details on length and citation requirements), based on a report request (example report request). Reports will be evaluated on the information included in the text and the appropriateness of the citations (example evaluation data).
Data is available via ir_datasets or the TREC website.
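To make the expected output shape concrete, here is a hypothetical report-with-citations structure. The request identifier, citation markers, and field names are invented for illustration; the official length, format, and citation requirements are defined in the Report Generation Guidelines.

```python
# Hypothetical illustration only; NOT the official submission format.
report = {
    "request_id": "example-001",  # invented identifier
    "report": (
        "First sentence of the generated English report. [1] "
        "Second sentence, supported by two cited documents. [1][2]"
    ),
    # Citation markers map to doc_ids in the CLIR news collection.
    "citations": {"[1]": "doc_aaa", "[2]": "doc_bbb"},
}
```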
The document collections are available in ir_datasets:
- neuclir/1/multi (Persian, Russian, and Chinese)
- csl (Chinese Technical Abstracts)
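For convenience, the collections can be loaded programmatically with the `ir_datasets` Python package, using the dataset IDs listed above. A minimal sketch (document field names vary by collection):

```python
import ir_datasets

# Multilingual news collection (Persian, Russian, and Chinese)
multi = ir_datasets.load("neuclir/1/multi")
for doc in multi.docs_iter()[:3]:  # peek at a few documents
    print(doc.doc_id)

# Chinese technical abstracts
csl = ir_datasets.load("csl")
print(csl.docs_count())
```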
NeuCLIR previously ran in 2022 and 2023. You can find the previous versions of the task below: