Massively Multilingual Language Technologies - Jean Maillard

Next up is Jean Maillard, who works on Meta’s Fundamental AI Research team, talking about some of the projects he’s worked on in the language technologies space.

The technologies:

  1. No Language Left Behind (NLLB) - a text-to-text machine translation model.
  2. UNESCO Language Translator - uses NLLB
  3. Open Language Data Initiative - a “shared task” to extend the language support of models
    • A “shared task” is a partnership in the research community where many researchers collaborate to solve a single problem.
  4. Seamless Communication - speech-to-text, text-to-speech, and speech-to-speech
    • Interestingly, you can change languages in real time, and the model can handle that.
  5. Massively Multilingual Speech Model - Automatic Speech Recognition (auto captioning) and Text-to-Speech (additional languages)

Challenges in the Multilingual Language Model Space:

  1. Resource Efficiency
  2. Data Efficiency
  3. Data Quality

Conversation