Home

In short

GETALP (Study Group for Machine Translation and Automated Processing of Languages and Speech) was created in 2007, at the same time as the LIG laboratory.

Formed by bringing together researchers in natural language processing and speech processing, GETALP is a multidisciplinary group (computer scientists, linguists, phoneticians, translators and signal processing specialists) whose objective is to address all theoretical, methodological, and practical aspects of multilingual communication and multilingual (written or spoken) information processing.

GETALP's methodology relies on continuous interaction between data collection, fundamental research, development of systems and applications, and experimental evaluation.

Highlights

Ecological approach:
one distinctive feature of GETALP is its commitment to addressing the diversity of written and spoken language usage: multiplicity of languages, speakers, dialects, cultures, social contexts and applications, with a special interest in “long tails” (under-resourced languages, atypical speakers, etc.).
Agnostic approach:
the history of the team and its different scientific cultures allow synergy between expert and empirical methods, between large-scale analysis (big data) and analysis of phenomena requiring fine-grained annotations (beautiful data), between induction and models, etc.
Multidisciplinarity:
the strength of GETALP comes from the complementarity of its members, which enables a wide approach from data collection to evaluation, and from the understanding of fundamental communicative phenomena to industrial applications.
Human in the loop:
to assist humans in communicative situations, we include them in the automated processes (semi-supervised approaches, collaborative and interactive approaches, trace and error analysis).
Development of tools and resources:
GETALP develops and distributes open-source tools and resources such as: a cooperative Web platform for the development of multilingual lexical databases, a collection of written and oral corpora for processing low-resource languages, corpora collected during human interaction, a collaborative system for post-editing and evaluation of machine translation, etc.

Research topics

The research activities of the GETALP team are organized around two main axes, guided by values of ethics, openness, and societal responsibility.

The first axis focuses on the scientific methods used and developed within the team. The team develops innovative methods for the analysis, processing, and modeling of natural languages. We focus in particular on the following:

  • Methods enabling the analysis, collection, processing, and representation of data and resources, to support the discovery, sharing, and accessibility of human knowledge.
  • The construction, application, and verification of theories and modeling approaches to contribute to the advancement of fundamental knowledge in machine learning and computational linguistics.
  • The development of NLP tools and platforms, to disseminate them to the business world and support the work of linguists.
  • The development of evaluation methods to measure not only the performance of systems but also their societal impact.
  • The development and evaluation of methods for explaining NLP models, making them more transparent and interpretable in order to promote their adoption in sensitive areas and build user confidence.

The methods of the first axis contribute to the second axis, which focuses on NLP tasks and applications. The team focuses particularly on:

  • Machine Translation (written, spoken, visual).
  • Automatic speech transcription.
  • Multilingual and multimodal processing.
  • Automatic linguistic analysis (document classification, text or speech understanding, syntactic and semantic analysis of text or speech, etc.).
  • Processing of human-human and human-machine interactions, such as the processing of affect in interaction.
  • Language documentation.

Local, national, and international ecosystem

GETALP builds upon a rich local ecosystem and collaborates with other LIG research teams and with other Grenoble laboratories. GETALP is an internationally recognized player in the natural language and speech processing communities, with many collaborative and industrial projects and the organization of the JEP-TALN-RECITAL conference in 2012. A notable feature is its international network of collaborations and contacts on all continents, which places GETALP in a strong position to contribute to the theoretical and technological holy grail of multilingualism.

The GETALP team grew out of the GEOD and GETA teams of the CLIPS laboratory and has a long history.

Groupe d'Étude en Traduction Automatique/Traitement Automatisé des Langues et de la Parole