2024 TREC NeuCLIR Track

Last Updated: 17 July 2024

The Neural Cross-Language Information Retrieval (NeuCLIR) track is a TREC shared task that studies the impact of neural approaches in cross-language information retrieval and generation.

You can participate in the shared task by submitting a retrieval or generation system for evaluation. Once the track concludes, it provides reusable test collections for future research.

🎉 NeuCLIR is back for a third year!

This year, we continue the Cross-Language (CLIR) and Multilingual (MLIR) news and technical document retrieval tasks. We also introduce a new Report Generation task.

Details on each of the 2024 tasks are provided below:

Cross-Language Retrieval
Cross-Language Technical Documents
Multilingual Retrieval
🆕 Cross-Language Report Generation

Skip to Important Dates.

Official Tasks

🔍 Task: Cross-Language Retrieval (CLIR)

See the CLIR & MLIR Guidelines for full task details.

In the Cross-Language Retrieval (CLIR) task, systems receive queries in one language (English) and retrieve from a corpus of news articles written in another language (Chinese, Persian, or Russian).

Access the collection, past queries, and qrels on ir_datasets or the TREC website.

Learn more from the overview papers of previous years.
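As a rough illustration of the ir_datasets route, the sketch below loads one of the CLIR news collections and peeks at a few document IDs. The dataset identifiers are the ones listed under Important Dates; the helper names (`CLIR_COLLECTIONS`, `load_collection`, `preview_doc_ids`) are made up for this example, and the `ir_datasets` package has to be installed separately. Note that the first access downloads the collection, which is large.

```python
from itertools import islice

# Collection identifiers as listed in the Important Dates section.
CLIR_COLLECTIONS = {
    "fa": "neuclir/1/fa",  # Persian
    "ru": "neuclir/1/ru",  # Russian
    "zh": "neuclir/1/zh",  # Chinese
}


def load_collection(lang):
    """Load a CLIR news collection by language code (hypothetical helper)."""
    # Imported lazily so the rest of this sketch works without the package.
    import ir_datasets  # third-party: pip install ir_datasets

    return ir_datasets.load(CLIR_COLLECTIONS[lang])


def preview_doc_ids(dataset, n=3):
    """Return the IDs of the first n documents of an ir_datasets-style dataset."""
    return [doc.doc_id for doc in islice(dataset.docs_iter(), n)]


# Example use (downloads data on first access):
#   dataset = load_collection("zh")
#   print(preview_doc_ids(dataset))
```

Iterating lazily with `islice` avoids reading the whole collection just to inspect a handful of documents.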

🔍 Task: Cross-Language Technical Documents

See the CLIR & MLIR Guidelines for full task details.

Technical language poses a particular problem for cross-language retrieval systems, so the Cross-Language Technical Documents task focuses specifically on it. Systems receive queries in one language (English) and retrieve from a corpus of technical abstracts written in another language (Chinese).

Access the collection, past queries, and qrels on ir_datasets or the TREC website.

Learn more from the track overview paper.
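For the retrieval tasks, a system's output for each query is a ranked list of documents. The sketch below serializes such a list in the conventional six-column TREC run layout (topic, the literal `Q0`, document ID, rank, score, run tag). This is an assumption about the layout, and the CLIR & MLIR Guidelines remain the authority on the exact submission format. The function name, topic ID, document IDs, and run tag are all hypothetical.

```python
def format_run_lines(topic_id, ranking, run_tag):
    """Render (doc_id, score) pairs as TREC-style run lines, best score first."""
    ordered = sorted(ranking, key=lambda pair: pair[1], reverse=True)
    return [
        f"{topic_id} Q0 {doc_id} {rank} {score:.4f} {run_tag}"
        for rank, (doc_id, score) in enumerate(ordered, start=1)
    ]


# Hypothetical scores for two documents retrieved for topic "101".
lines = format_run_lines("101", [("doc-b", 1.5), ("doc-a", 2.25)], "my-run")
for line in lines:
    print(line)
```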

🔍 Task: Multilingual Retrieval (MLIR)

See the CLIR & MLIR Guidelines for full task details.

The Multilingual Retrieval (MLIR) task provides systems with queries in one language (English) and a corpus of documents written in multiple languages (Chinese, Persian, and Russian). The task is to retrieve documents from all three languages and produce a single ranked list. The queries are written so that relevant documents should exist in more than one language.

Access the collection, past queries, and qrels on ir_datasets or the TREC website.

Learn more from the track overview paper.
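One simple baseline for producing a single ranked list over all three languages is to run a separate retrieval per language and merge the results after normalizing the scores. The sketch below does this with min-max normalization. It only illustrates the idea and is not a method the track prescribes or recommends; all function names and the example scores are hypothetical.

```python
def min_max_normalize(ranking):
    """Rescale (doc_id, score) pairs to [0, 1] so lists are comparable."""
    if not ranking:
        return []
    scores = [score for _, score in ranking]
    low, high = min(scores), max(scores)
    if high == low:
        # All scores equal: map everything to 1.0 to avoid dividing by zero.
        return [(doc_id, 1.0) for doc_id, _ in ranking]
    return [(doc_id, (score - low) / (high - low)) for doc_id, score in ranking]


def merge_rankings(per_language, depth=1000):
    """Merge per-language rankings into one list sorted by normalized score."""
    merged = []
    for ranking in per_language.values():
        merged.extend(min_max_normalize(ranking))
    # Sort by descending score; break ties by doc_id so the order is stable.
    merged.sort(key=lambda pair: (-pair[1], pair[0]))
    return merged[:depth]


# Hypothetical per-language results for a single English query.
results = {
    "zh": [("zh-1", 12.0), ("zh-2", 8.0)],
    "fa": [("fa-1", 0.9), ("fa-2", 0.3)],
    "ru": [("ru-1", 30.0), ("ru-2", 25.0), ("ru-3", 20.0)],
}
merged = merge_rankings(results)
```

Min-max normalization is sensitive to outlier scores, so rank-based interleaving or a single multilingual ranker may behave quite differently on the same inputs.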

Pilot Task

🔍 Task: Cross-Language Report Generation

See the Report Generation Guidelines for full task details.

The Cross-Language Report Generation task asks the system to generate an English report, based on a report request (example report request), with citations to documents in one of the news collections used in the CLIR task (see the guidelines for details on length and citation requirements). The reports will be evaluated based on the information included in the text and the appropriateness of the citations (example evaluation data).

Access the document collection from ir_datasets or the TREC website.

Read our for our vision on report generation.
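To make the idea of a report with citations concrete, the sketch below represents a report as a list of sentences, each carrying the IDs of the documents it cites, and runs a few sanity checks before submission. Everything here is hypothetical: the dictionary keys, the function name, and especially the limits passed as `max_chars` and `max_citations`. The actual output format and the real length and citation requirements are defined in the Report Generation Guidelines.

```python
def check_report(sentences, known_doc_ids, max_chars, max_citations):
    """Return a list of problems found in a report; empty means none found.

    Each sentence is a dict with a "text" string and a "citations" list of
    document IDs (a made-up structure for this example).
    """
    problems = []
    total_chars = sum(len(s["text"]) for s in sentences)
    if total_chars > max_chars:
        problems.append(f"report has {total_chars} characters, limit is {max_chars}")
    for number, sentence in enumerate(sentences, start=1):
        if len(sentence["citations"]) > max_citations:
            problems.append(f"sentence {number} has too many citations")
        for doc_id in sentence["citations"]:
            # Citations should point at documents in the chosen collection.
            if doc_id not in known_doc_ids:
                problems.append(f"sentence {number} cites unknown document {doc_id}")
    return problems


# Hypothetical report and collection; the limits are placeholders, not track rules.
report = [
    {"text": "The first claim is supported by one article.", "citations": ["doc-1"]},
    {"text": "The second claim cites a missing document.", "citations": ["doc-9"]},
]
problems = check_report(
    report, known_doc_ids={"doc-1", "doc-2"}, max_chars=2000, max_citations=2
)
```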

Track Information

Important Dates

March 2022

Document Collections Released

CLIR:
neuclir/1/fa (Persian)
neuclir/1/ru (Russian)
neuclir/1/zh (Chinese)

MLIR:
neuclir/1/multi (Persian, Russian, and Chinese)

Technical:
csl (Chinese Technical Abstracts)

25 March 2024

Track Guidelines Released

CLIR & MLIR Guidelines

Report Generation Guidelines

June 2024

CLIR/MLIR Topics and Report Requests released on NIST website

July 2024

CLIR Technical Topics released on NIST website

6 August 2024

CLIR Technical Task Submissions due to NIST

13 August 2024

CLIR/MLIR News Task and Report Generation Task Submissions due to NIST

October 2024

Results distributed to participants

November 2024

TREC 2024

Changes from 2023

The technical documents pilot has been promoted to a full task.
Later submission deadline!
A pilot query-driven report generation task (Generative IR) has been added, drawing on the multilingual document set.

Organizers

In alphabetical order:

Dawn Lawrie, Johns Hopkins University, HLTCOE
Sean MacAvaney, University of Glasgow
James Mayfield, Johns Hopkins University, HLTCOE
Paul McNamee, Johns Hopkins University, HLTCOE
Douglas W. Oard, University of Maryland
Luca Soldaini, Allen Institute for AI
Eugene Yang, Johns Hopkins University, HLTCOE

Contact

Mailing List (for the latest announcements and news)
TREC Slack #neuclir-2024 channel (once registered for TREC)
Twitter @neuclir
Any questions: neuclir-organizers@googlegroups.com

Previous Years

NeuCLIR previously ran in 2022 and 2023. You can find the previous versions of the task below:

2023: Past Track
2022: Past Track