Meta Releases SeamlessM4T Multimodal Translation Model

Meta has released SeamlessM4T, a multitask translation model designed to break down language barriers. A single model translates and transcribes both speech and text across nearly 100 languages.

In our hyper-connected world, accessing content in multiple languages is key. SeamlessM4T aims to enable true cross-cultural understanding.

SeamlessM4T Capabilities:

  • Automatic speech recognition for nearly 100 languages
  • Speech-to-text translation for nearly 100 input and output languages
  • Speech-to-speech translation for nearly 100 input languages and 35 output languages, including English
  • Text-to-text translation for nearly 100 languages
  • Text-to-speech translation for nearly 100 input languages and 35 output languages, including English
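
The capability list above can be read as a small table of tasks, each with an input modality, an output modality, and a limit on output languages. The Python sketch below is only an illustration of that structure. The names Task, TASKS, and tasks_for are hypothetical and are not part of any SeamlessM4T API, and the language counts are the approximate figures quoted in the list.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Task:
    """One capability from the list above (hypothetical structure)."""
    source: str                       # input modality: "speech" or "text"
    target: str                       # output modality: "speech" or "text"
    max_target_langs: Optional[int]   # None = roughly the full ~100-language set


# Approximate figures taken from the capability list; not an official spec.
TASKS = {
    "asr":  Task("speech", "text", None),   # automatic speech recognition
    "s2tt": Task("speech", "text", None),   # speech-to-text translation
    "s2st": Task("speech", "speech", 35),   # speech-to-speech translation
    "t2tt": Task("text", "text", None),     # text-to-text translation
    "t2st": Task("text", "speech", 35),     # text-to-speech translation
}


def tasks_for(source: str, target: str) -> list:
    """Return the task names that map the given input modality to the output one."""
    return sorted(
        name for name, task in TASKS.items()
        if task.source == source and task.target == target
    )


if __name__ == "__main__":
    print(tasks_for("speech", "text"))
    print(tasks_for("text", "speech"))
```

Laid out this way, the pattern in the list is easy to see: every task that produces speech is limited to the smaller set of output languages, while tasks that produce text cover close to the full set.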

SeamlessM4T is available to researchers and developers under an open science license. Metadata from SeamlessAlign, a massive multimodal translation dataset, has also been released to enable community research.

This unified design addresses a limitation of earlier multilingual AI tools, which typically chained separate subsystems for recognition, translation, and speech synthesis. SeamlessM4T handles translation across text and speech in a single model.

New Tool Builds On Prior Innovations:

SeamlessM4T builds on Meta’s earlier work, such as the No Language Left Behind project. The model shows strong performance on both high- and low-resource languages.

At its core is the multitask UnitY architecture, which handles the different translation directions using separate encoders for speech and text input and a multi-stage decoding process.

In Meta’s evaluations, SeamlessM4T outperformed other leading models in accuracy across these tasks. Meta says it has also focused on bias mitigation and safety as part of responsible AI development.

By open-sourcing SeamlessM4T, Meta enables collaboration to further improve multilingual communication AI. This technology could bring us closer to the dream of universally understandable communication unhindered by language.

