NTCIR-7 ACLIA Home Page
Welcome to the NTCIR-7 ACLIA (Advanced Cross-lingual Information Access) wiki! This task cluster is also known as "NTCIR-7 Cluster 1 Complex Cross-Lingual Question Answering (QA) / Information Retrieval (IR) for QA for newswire documents".
What's new
- 2008-06-20: Schedule has been updated
- 2008-06-13: Evaluation methods are posted
- 2008-06-11: Submission formats are linked to the front page
- 2008-05-12: Schedule has been updated
- 2008-04-08: We encourage all participants, including IR4QA participants, to use the EPAN tool to create training data during April. We will distribute the data in May for training purposes.
- 2008-03-11: The tool called EPAN is ready for participants to create the training data.
- 2008-02-09: The corpus information for ACLIA tasks is in: TaskDefinition.
- 2008-02-09: The user agreement for the Xinhua corpus for the Simplified Chinese task is now available from the NTCIR-7 page.
Please visit: http://ntcir.nii.ac.jp/index.php/User-Agreement.html.
- 2007-11-15: The registration deadline has been extended to Dec 27.
- 2007-11-09: NII presented Introduction to NTCIR-7 at NII, Tokyo, Japan. Forward inquiries to <ntc-secretariat at nii.ac.jp>.
- 2007-10-05: First announcement of ACLIA. Please be patient until we add more content (e.g. Call for Participation, task definition, Japanese/Chinese pages, etc.).
Roadmap: Advanced Cross-lingual Information Access (ACLIA)
The ultimate goal in cross-lingual information access is to answer any type of question, or to satisfy any type of information need, in any language with responses drawn from multilingual corpora. If the information need is simple (e.g. a factoid question), the answer can be a single word or phrase. If the information need is more complex, the answer may come from multiple documents. The candidate answers from different corpora could be merged or possibly summarized before being presented to the user. Alternatively, candidate answers from different languages could be translated into the user’s native language. For example, if we retrieve answers from Chinese, Japanese and Korean corpora and translate them into English, we can compare the nature of answers drawn from different geographic and cultural contexts.
In ACLIA, our research direction is moving towards these goals and it is our hope that the QA and IR research groups can collaborate to achieve these goals in NTCIR-7:
Goal for Cross-lingual Question Answering: Since NTCIR has already evaluated factoid questions in the past, the ACLIA at NTCIR-7 will promote progress towards the longer-term research goals by introducing complex questions. In future evaluations (beyond NTCIR-7) we are planning to combine factoid and complex question answering; we are also planning to introduce multilingual answer merging as one form of cross-lingual QA evaluation.
Goal for Cross-lingual Information Retrieval: Since NTCIR has evaluated CLIR four times in the past (NTCIR-3, 4, 5, 6), in ACLIA for NTCIR-7 we would like to encourage the evaluation of CLIR as a component of complex cross-lingual question answering (CCLQA). Since CLIR research in NTCIR has matured, it would be interesting to find out which IR techniques help CCLQA. We are designing an XML input/output specification for each module, so that the CLIR component can be integrated into CCLQA for evaluation. At the same time, we can evaluate the CLIR component itself.
NTCIR-7 Task Overview
Current research in QA is moving beyond factoid CLQA, so there is a significant motivation to evaluate more complex questions in order to move the research forward. The task is novel in that it evaluates cross-lingual QA on complex questions (i.e. events, biographies/definitions, and relationships) for the first time. Although the QAC4 task in NTCIR-6 evaluated monolingual QA on complex questions, no evaluation at NTCIR, TREC or CLEF has evaluated cross-lingual QA on complex questions (CCLQA). One goal in ACLIA is to develop effective CLQA evaluations for complex questions. We will evaluate end-to-end systems and also conduct module-based evaluations for question type analysis, document retrieval and answer extraction.
Why (CL)IR researchers should consider participating in ACLIA: Since document retrieval is an essential part of CLQA, we would like the CLIR community to participate in ACLIA to evaluate their CLIR systems as modules and as part of end-to-end QA systems. See more in detail.
Although our main focus is cross-lingual complex QA and IR for QA, we also intend to accept monolingual complex QA runs (e.g. J-J, C-C), which will be evaluated along with the cross-lingual runs from E-J and E-C.
NTCIR-7 Task Definition
In the ACLIA task, systems are given complex questions in English and return answers in Chinese (Simplified or Traditional) or Japanese. The target corpus will consist of newspaper articles.
The task will evaluate research on four types of questions: events, biographies, definitions, and relationships. Each participant is expected to register for one or more of the following tasks.
- English to Japanese CLQA (with Japanese to Japanese as a subtask);
- English to Chinese CLQA (Simplified or Traditional, with Chinese to Chinese as a subtask);
- English to Japanese CLIR (embedded in E-J CLQA);
- English to Simplified Chinese CLIR (embedded in E-C CLQA).
In order to combine a CLIR module with a CLQA system for module-based evaluation, we will use an XML input/output format. The following diagram shows an example CLQA data flow with provision for embedded CLIR.
For more details, go to TaskDefinition
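The module-based data flow described above (question analysis, then embedded CLIR, with XML passed between modules) can be illustrated with a minimal sketch. This is only an assumption-laden illustration: every element name, attribute name, topic ID and document ID below is hypothetical, and the official input/output format is the one given in TaskDefinition, not this one.

```python
import xml.etree.ElementTree as ET

# NOTE: all element/attribute names here ("topic", "question",
# "answer_type", "documents", "doc", "rank") are hypothetical
# placeholders, not the official ACLIA XML format.


def make_topic(topic_id, question):
    """Create a topic record holding one English question."""
    topic = ET.Element("topic", {"id": topic_id})
    ET.SubElement(topic, "question", {"lang": "EN"}).text = question
    return topic


def add_answer_type(topic, answer_type):
    """Question analysis step: attach the predicted answer type."""
    ET.SubElement(topic, "answer_type").text = answer_type
    return topic


def add_retrieved_documents(topic, doc_ids):
    """Embedded CLIR step: attach a ranked list of document IDs."""
    docs = ET.SubElement(topic, "documents")
    for rank, doc_id in enumerate(doc_ids, start=1):
        ET.SubElement(docs, "doc", {"rank": str(rank)}).text = doc_id
    return topic


# Made-up topic, answer type and document IDs for illustration only.
topic = make_topic("T0001", "Example complex question text")
topic = add_answer_type(topic, "BIOGRAPHY")
topic = add_retrieved_documents(topic, ["DOC-A", "DOC-B"])

# Serialize to XML text (what one module would hand to the next),
# then parse it back as a downstream module would.
xml_text = ET.tostring(topic, encoding="unicode")
parsed = ET.fromstring(xml_text)
ranked_ids = [doc.text for doc in parsed.iter("doc")]
print(xml_text)
```

Because each step only adds elements to a shared record, a CLIR module could in principle be swapped in or out without changing the other modules, which is the property the module-based evaluation relies on.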
Submission Formats
Submission Formats are found in TaskDefinition#Format
The EPAN interface is used to release and submit the data. We send detailed instructions by email for each release and submission. If you do not receive email from the ACLIA mailing list, please send email to the ACLIA organizers.
Evaluation
Evaluation methods are found in TaskDefinition#Evaluation
Task Organizers
If you have any questions, please send email to <aclia-organizers at mailman.srv.cs.cmu.edu>:
ACLIA task organizers:
Teruko Mitamura <teruko AT cs.cmu.edu> (Carnegie Mellon University)
Eric Nyberg <ehn AT cs.cmu.edu> (Carnegie Mellon University)
Embedded CLIR coordinators:
Tetsuya Sakai (NewsWatch, Inc.)
Fred Gey (UC Berkeley)
IR for QA coordinators:
Noriko Kando (National Institute of Informatics)
Donghong Ji (Wuhan University)
Japanese CLQA coordinators:
Tsuneaki Kato (Tokyo University)
Tatsunori Mori (Yokohama National University)
Simplified Chinese CLQA coordinators:
Chin-Yew Lin (Microsoft Research Asia)
Ruihua Song (Microsoft Research Asia)
Traditional Chinese CLQA coordinators:
Chuan-Jie Lin (National Taiwan Ocean University)
ACLIA advisors:
Noriko Kando (National Institute of Informatics)
Kui-Lam Kwok (Queens College)
Schedule
| What | When |
|---|---|
| First call for participation | 2007-10-05 |
| Registration Due | 2007-12-27 |
| Document Set Release | 2008-01 |
| Dry Run | 2008-01 ~ 2008-06 |
| Formal Run | 2008-06-23 ~ 2008-08-01 |
| Release TOPICS to CCLQA, IR4QA | June 23 at 5pm (EST) |
| Submit Answer type analysis for IR4QA group | From June 26 at 10am (EST) until June 30 at 10am (EST) |
| Release Answer type analysis to IR4QA group from CCLQA group | June 30 at 5pm (EST) |
| Submit CCLQA End-to-end results (E-J, E-CS, E-CT) | July 7 by 5pm (EST) |
| Submit IR4QA results (Both with or without the Use of Answer type analysis from CCLQA) | July 14 by 5pm (EST) |
| Submit Monolingual QA results (J-J, CS-CS, CT-CT) | July 14 by 5pm (EST) |
| Release IR4QA results to CCLQA groups (Both with and without the use of Answer type analysis) | July 21 by 5pm (EST) |
| Submit CCLQA results using IR4QA results | August 1 by 5pm (EST) |
| Task Overview Partial Release | 2008-09-01 |
| Evaluation Results Return | 2008-09-01 |
| Paper for the Proceedings Due | 2008-10-01 |
| Camera-ready Paper for the Proceedings Due | 2008-11-14 |
| Final Meeting | 2008-12-16 ~ 2008-12-19 |
Misc
Contact <aclia-organizers AT mailman.srv.cs.cmu.edu> for general questions about ACLIA.
RelatedLink has a list of ACLIA-related links.
AboutWiki contains links to documentation on how to create an account and edit this wiki.
