LT4HALA 2024
EvaLatin
INTRODUCTION
The LT4HALA 2024 workshop will also be the venue of the third edition of EvaLatin, the evaluation campaign entirely devoted to NLP tools for Latin. The campaign is designed to answer two questions:
- How can we promote the development of resources and language technologies for the Latin language?
- How can we foster collaboration among scholars working on Latin and attract researchers from different disciplines?
The EvaLatin 2024 edition features two tasks, namely Dependency Parsing and Emotion Polarity Detection. Shared test data and an evaluation script will be provided to the participants, who may choose to take part in either one or both tasks.
EvaLatin 2024 is organized by Rachele Sprugnoli, Federica Iurescia and Marco Passarotti.
IMPORTANT DATES
- 22 December 2023: guidelines available
- Evaluation Window I - Task: Dependency Parsing
- 1 February 2024: test data available
- 8 February 2024: system results due to organizers
- Evaluation Window II - Task: Emotion Polarity Detection
- 12 February 2024: test data available
- 19 February 2024: system results due to organizers
- 11 March 2024: reports due to organizers
- 22 March 2024: short report review deadline
- 1 April 2024: camera ready version of reports due to organizers
DATA
Dependency parsing
The dependency parsing task is based on the Universal Dependencies (UD) framework. No specific training data will be released, but participants are free to make use of any (kind of) resource they consider useful for the task, including the Latin treebanks already available in the UD collection. In this regard, one of the challenges of the task is to understand which treebank (or combination of treebanks) is best suited to handle the new test data. The test data include both prose and poetic texts.
1 February 2024 → Download the test data for the Dependency Parsing task.
1 February 2024 → Check the updated version of the guidelines (v1.1), with information about the Dependency Parsing test data and the baseline.
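Since no dedicated training data is released, a typical first step is to pool the existing UD Latin treebanks into a single training set. The following is a minimal sketch, assuming the conllu Python package and a local copy of the UD release; the treebank paths are illustrative:

# Minimal sketch: pooling UD Latin treebanks into one training set.
# Assumes the `conllu` package (pip install conllu) and a local copy
# of the UD collection; the paths below are illustrative.
import conllu

TREEBANKS = [
    "UD_Latin-ITTB/la_ittb-ud-train.conllu",
    "UD_Latin-PROIEL/la_proiel-ud-train.conllu",
    "UD_Latin-LLCT/la_llct-ud-train.conllu",
]

def load_treebank(path):
    # Parse a CoNLL-U file into a list of sentences (TokenList objects).
    with open(path, encoding="utf-8") as f:
        return conllu.parse(f.read())

pool = []
for tb in TREEBANKS:
    pool.extend(load_treebank(tb))
print(len(pool), "training sentences pooled from", len(TREEBANKS), "treebanks")

Which treebank or combination actually transfers best to the prose and poetry of the test data is precisely the open question posed by the task.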
Emotion polarity detection
As in the dependency parsing task, no training data will be released for the emotion polarity detection task; instead, the organizers provide an annotation sample, a manually created polarity lexicon, and annotation guidelines. In this task, too, participants are free to pursue whatever approach they prefer, including unsupervised and/or cross-lingual ones (which promise to be the most effective, given the lack of Latin training data for this task). The test data will consist of poetic texts from different time periods.
12 February 2024 → Download the test data for the Emotion Polarity Detection task.
12 February 2024 → Check the updated version of the guidelines (v1.2), with information about the Emotion Polarity Detection test data and the baseline.
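Given the polarity lexicon, a simple unsupervised starting point is to sum the polarities of the lemmas occurring in each text. A minimal sketch follows, assuming a tab-separated lemma/label lexicon format and a positive/negative/neutral/mixed label set; both assumptions should be checked against the official annotation guidelines:

# Minimal sketch: lexicon-based polarity baseline.
# The lexicon format (lemma<TAB>label) and the label set are assumptions.
POLARITY = {"positive": 1, "negative": -1, "neutral": 0, "mixed": 0}

def load_lexicon(path):
    # Map each lemma in the lexicon to a numeric polarity value.
    lexicon = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            lemma, label = line.rstrip("\n").split("\t")
            lexicon[lemma] = POLARITY[label]
    return lexicon

def classify(lemmas, lexicon):
    # Label a text by the sign of its summed lemma polarities.
    score = sum(lexicon.get(lemma, 0) for lemma in lemmas)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

Note that this sketch presupposes lemmatized input; handling inflected forms, negation, and context is where more sophisticated (e.g., cross-lingual) approaches come in.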
EVALUATION
- Scorer for the Dependency Parsing task: https://github.com/CIRCSE/LT4HALA/blob/master/2024/conll18_ud_eval.py
- Scorer for the Emotion Polarity Detection task: https://github.com/CIRCSE/LT4HALA/blob/master/2024/scorer-emotion.py
- Gold data for the Dependency Parsing task: https://github.com/CIRCSE/LT4HALA/blob/master/2024/data_and_doc/EvaLatin_2024_Syntactic_Parsing_test_data_gold.zip
- Gold data for the Emotion Polarity Detection task: https://github.com/CIRCSE/LT4HALA/blob/master/2024/data_and_doc/EvaLatin_2024_EmotionPolarityDetection_gold_data.zip
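The dependency parsing scorer is the evaluation script of the CoNLL 2018 Shared Task; it can be run from the command line (python conll18_ud_eval.py gold.conllu system.conllu) or imported as a module. Below is a minimal sketch of the latter, with illustrative file names; for scorer-emotion.py, consult the script itself for its expected input format:

# Minimal sketch: scoring a parsing run with the CoNLL 2018 UD script,
# imported as a module. File names are illustrative.
from conll18_ud_eval import load_conllu_file, evaluate

gold = load_conllu_file("gold.conllu")
system = load_conllu_file("system.conllu")
scores = evaluate(gold, system)

# Report unlabeled and labeled attachment scores (F1).
for metric in ("UAS", "LAS"):
    print(metric, scores[metric].f1)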
HOW TO PARTICIPATE
Participants will be required to submit their runs and to provide a technical report including a brief description of their approach, with a focus on the adopted algorithms, models, and resources, a summary of their experiments, and an analysis of the obtained results. Technical reports will be included in the proceedings as short papers: the maximum length is 4 pages (excluding references), and they should follow the official LREC-COLING 2024 format. Reports will receive a light review (we will check the correctness of the format, the accuracy of the results and rankings, and the overall exposition). Reports should be submitted through the START submission page of the workshop.
Participants are allowed to use any approach (from traditional machine learning algorithms to Large Language Models) and any resource (annotated and unannotated data, embeddings, etc.): all approaches and resources are expected to be described in the system reports.