Conference topics

We invite submissions on a broad spectrum of linguistic and computational studies of low-resource languages, including but not limited to the following topics:

Language Resources for low-resource languages

  • Dataset creation and annotation
  • Evaluation methodologies and benchmarks for low-resource settings
  • Lexical resources, corpora, and linguistic databases
  • Crowdsourcing and community-driven data collection
  • Tools and frameworks for low-resource language processing

Core Language Technologies for low-resource languages

  • Language modelling and pre-training for low-resource languages
  • Speech recognition, text-to-speech, and spoken language understanding
  • Phonology, morphology, word segmentation, and tokenisation
  • Syntax: tagging, chunking, and parsing
  • Semantics: lexical and sentence-level representation

NLP Applications for low-resource languages

  • Information extraction and named entity recognition
  • Question answering systems
  • Dialogue and interactive systems
  • Summarisation
  • Machine translation
  • Sentiment analysis, stylistic analysis, and argument mining
  • Content moderation
  • Information retrieval and text mining

Multimodality and Grounding for low-resource languages

  • Vision and language for low-resource contexts
  • Speech and text multimodal systems
  • Low-resource sign language processing

Ethics, Equity, and Social Impact for low-resource languages

  • Bias and fairness in low-resource language technologies
  • Sociolinguistic considerations in technology development
  • Cultural appropriateness and sensitivity

Human-Centred Approaches for low-resource languages

  • Usability and accessibility of low-resource language technologies
  • Educational applications and language learning
  • Community needs assessment and technology adoption
  • User experience research in low-resource contexts

Multilinguality and Cross-Lingual Methods for low-resource languages

  • Multilingual language models and their adaptation
  • Code-switching and code-mixing
  • Cross-lingual transfer learning for low-resource languages

Special Theme Track 1 — Building Applications Based on Large Language Models for Low-Resource Languages

LaTeLL’2026 will feature a Special Theme Track dedicated to the development of applications based on Large Language Models (LLMs) for low-resource languages.

This track aims to explore innovative methodologies, architectures, and tools that use LLMs to improve linguistic processing, accessibility, and inclusivity for underrepresented languages. Contributions are encouraged on topics such as model adaptation and fine-tuning, multilingual and cross-lingual transfer, ethics and fairness considerations, and the creation of datasets and benchmarks that support the integration of LLM-based solutions in low-resource settings.

Special Theme Track 2 — Modern Standard Arabic (MSA) and Arabic Dialects

This special track addresses the unique challenges and opportunities in processing Modern Standard Arabic (MSA) and the rich landscape of Arabic dialects. The diglossic nature of Arabic, in which formal MSA coexists with numerous widely used spoken dialects, presents a significant hurdle for NLP. While MSA is relatively well-resourced, Arabic dialects are quintessential examples of low-resource languages, often lacking standardised orthographies, annotated corpora, and dedicated processing tools. This track invites submissions on novel research and resources aimed at bridging this gap and advancing the state of the art in Arabic language technology. Topics of interest include, but are not limited to:

  • Code-switching between Arabic dialects, MSA, and other languages
  • Dialect identification and classification
  • Creation of corpora and lexical resources for Arabic dialects
  • Machine translation between MSA and dialects, and across different dialects
  • Speech recognition and synthesis for dialectal Arabic
  • Computational modelling of morphology, syntax, and semantics for dialects
  • NLP applications (e.g., sentiment analysis, NER) for dialectal user-generated content