LT4HALA 2024
CALL FOR PAPERS
- Website: https://circse.github.io/LT4HALA/
- Submission page: https://softconf.com/lrec-coling2024/lt4hala2024/
- Date: Saturday, May 25, 2024 (post-conference workshop)
- Place: co-located with LREC-COLING 2024, May 20-25, Torino, Italy (and online)
DESCRIPTION
LT4HALA 2024 is a one-day workshop that seeks to bring together scholars who are developing and/or using Language Technologies (LTs) for historically attested languages, so as to foster cross-fertilization between the Computational Linguistics community and the areas of the Humanities dealing with historical linguistic data, e.g. historians, philologists, linguists, archaeologists and literary scholars. LT4HALA 2024 follows LT4HALA 2020 and LT4HALA 2022, which were organized in the context of LREC 2020 and LREC 2022, respectively. Despite the current availability of large collections of digitized texts written in historical languages, such interdisciplinary collaboration is still hampered by the limited availability of annotated linguistic resources for most historical languages. Creating such resources is a challenge and an obligation for LTs, both to support historical linguistic research with the most up-to-date technologies and to preserve the precious linguistic data that have survived from past times.
Relevant topics for the workshop include, but are not limited to:
- handling spelling variation,
- detection and correction of OCR errors,
- creation and annotation of linguistic resources,
- deciphering,
- morphological/syntactic/semantic analysis of textual data,
- adaptation of tools to address diachronic/diatopic/diastratic variation in texts,
- teaching ancient languages with LTs,
- NLP-driven theoretical studies in historical linguistics,
- NLP-driven analysis of literary ancient texts,
- evaluation of LTs designed for historical and ancient languages,
- Large Language Models for the automatic analysis of ancient texts.
SHARED TASKS
LT4HALA 2024 will also host:
- the third edition of EvaLatin, an evaluation campaign entirely devoted to the evaluation of NLP tools for Latin. This edition will focus on two tasks: dependency parsing and emotion polarity detection. Dependency parsing will be based on the Universal Dependencies (UD) framework. No specific training data will be released, but participants will be free to make use of any (kind of) resource they consider useful for the task, including the Latin treebanks already available in the UD collection. In this regard, one of the challenges of this task will be to understand which treebank (or combination of treebanks) is the most suitable to deal with new test data. Test data will be both prose and poetic texts from different time periods. For the emotion polarity detection task, no training data will be released either, but the organizers will provide an annotation sample, a manually created polarity lexicon and annotation guidelines. In this task too, participants will be free to pursue the approach they prefer, including unsupervised and/or cross-language ones (which promise to be the most effective, given the lack of Latin training data for this task). Test data will be poetic texts from different time periods. A toy scoring sketch for the parsing task appears after this list.
- the third edition of EvaHan, the evaluation campaign devoted to NLP tools for Ancient Chinese. EvaHan 2024 will focus on two tasks: Ancient Chinese sentence segmentation and sentence punctuation (a minimal data-framing sketch also appears below).
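For the dependency parsing task, gold and system annotations in the UD setting are exchanged as CoNLL-U files, and results are conventionally reported as labeled and unlabeled attachment scores (LAS/UAS). The sketch below is a minimal illustration of that metric only, not the official EvaLatin scorer; the file names are placeholders, and it assumes the gold and system files are token-aligned.

    # Toy LAS/UAS scorer for token-aligned CoNLL-U files.
    # Illustrative only; "gold.conllu" / "system.conllu" are placeholder names.

    def read_conllu(path):
        """Yield each sentence as a list of (HEAD, DEPREL) pairs."""
        sent = []
        with open(path, encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if not line:                # blank line ends a sentence
                    if sent:
                        yield sent
                        sent = []
                elif line.startswith("#"):  # skip sentence-level comments
                    continue
                else:
                    cols = line.split("\t")
                    if cols[0].isdigit():   # skip multiword tokens / empty nodes
                        sent.append((cols[6], cols[7]))  # HEAD, DEPREL columns
        if sent:
            yield sent

    def las_uas(gold_path, pred_path):
        """Labeled/unlabeled attachment scores over aligned token streams."""
        correct_l = correct_u = total = 0
        for gold, pred in zip(read_conllu(gold_path), read_conllu(pred_path)):
            for (gh, gd), (ph, pd) in zip(gold, pred):
                total += 1
                if gh == ph:          # correct head: counts toward UAS
                    correct_u += 1
                    if gd == pd:      # correct label too: counts toward LAS
                        correct_l += 1
        return correct_l / total, correct_u / total

    las, uas = las_uas("gold.conllu", "system.conllu")
    print(f"LAS={las:.4f} UAS={uas:.4f}")

For EvaHan, sentence segmentation and punctuation are commonly framed as character-level sequence labeling: strip the punctuation from a text and label each remaining character with the mark (if any) that followed it, so that a tagger can learn to restore the marks. The data-framing sketch below rests on that assumption; the punctuation inventory and label scheme are illustrative, not the official EvaHan specification.

    # Turn a punctuated text into (character, label) training pairs.
    # The punctuation set below is an illustrative assumption.
    PUNCT = set("。，、；：？！")

    def to_labels(punctuated):
        """Label each non-punctuation character with the mark that followed
        it in the source text, or 'O' if none did."""
        chars, labels = [], []
        for ch in punctuated:
            if ch in PUNCT:
                if labels:
                    labels[-1] = ch   # the mark attaches to the preceding char
            else:
                chars.append(ch)
                labels.append("O")
        return chars, labels

    chars, labels = to_labels("學而時習之，不亦說乎？")
    print(list(zip(chars, labels)))
    # [('學','O'), ('而','O'), ('時','O'), ('習','O'), ('之','，'),
    #  ('不','O'), ('亦','O'), ('說','O'), ('乎','？')]

Both sketches are meant only to make the task setups concrete; the official guidelines and evaluation scripts released by the organizers take precedence.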
SUBMISSIONS
Three forms of papers will be considered for submission:
- Regular long papers: up to eight (8) pages*, presenting substantial, original, completed, and unpublished work.
- Short papers: up to four (4) pages*, describing a small focused contribution, negative results, system demonstrations, etc.
- Position papers: up to eight (8) pages*, discussing key hot topics, challenges and open issues, as well as cross-fertilization between computational linguistics and other disciplines.
* Excluding any number of additional pages for references, ethical considerations, conflict-of-interest statements, and data and code availability statements.
We encourage the authors of papers reporting experimental results to make their results reproducible and the entire analysis process replicable by making available the data and tools they used. Presentations may be oral or posters; the proceedings make no distinction between the accepted papers. Submissions are anonymous and must follow the official LREC-COLING 2024 format. Each paper will be reviewed by three independent reviewers.
As for EvaLatin and EvaHan, participants will be required to submit a technical report for each task (with all the related sub-tasks) they took part in. Technical reports will be included in the proceedings as short papers: the maximum length is 4 pages (excluding references) and they should follow the official LREC-COLING 2024 format. Reports will receive a light review (we will check the correctness of the format, the exactness of results and rankings, and the overall exposition). All participants will have the opportunity to present their results at the workshop. Shared-task reports are not anonymous.
IMPORTANT DATES
Workshop
- 26 February 2024: submission due → NEW: 1 March 2024 (23:59 CET)
- 18 March 2024: reviews due
- 22 March 2024: notifications to authors
- 1 April 2024: camera-ready (PDF) due
EvaLatin
- 22 December 2023: guidelines available
- Evaluation Window I - Task: Dependency Parsing
- 1 February 2024: test data available
- 8 February 2024: system results due to organizers
- Evaluation Window II - Task: Emotion Polarity Detection
- 12 February 2024: test data available
- 19 February 2024: system results due to organizers
- 11 March 2024: reports due to organizers
- 22 March 2024: short report review deadline
- 1 April 2024: camera-ready version of reports due to organizers
EvaHan
- 22 December 2023: training data available
- Evaluation Window
- 12 February 2024: test data available
- 19 February 2024: system results due to organizers
- 11 March 2024: reports due to organizers
- 22 March 2024: short report review deadline
- 1 April 2024: camera-ready version of reports due to organizers
Identify, Describe and Share your LRs!
When submitting a paper from the START page, authors will be asked to provide essential information about resources (in a broad sense, i.e. also technologies, standards, evaluation kits, etc.) that have been used for the work described in the paper or are a new result of their research. Moreover, ELRA encourages all LREC-COLING authors to share the described LRs (data, tools, services, etc.) to enable their reuse and the replicability of experiments (including evaluation ones).