Facebook has announced a new machine translation model that can translate directly between any pair of 100 languages without relying on English data. The model, known as M2M-100, has been trained on 2,200 language directions, ten times more than previous multilingual models.

As per the company, the model outperformed English-centric multilingual models by 10 points on BLEU (bilingual evaluation understudy), a metric that scores machine-translated text against human reference translations. The M2M-100 model has been open-sourced.
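To illustrate how a BLEU score is computed in practice, the minimal sketch below scores a candidate translation against a reference. The sacrebleu library and the example sentences are illustrative assumptions, not part of Facebook's evaluation setup.

```python
# Illustrative only: sacrebleu is one common BLEU implementation;
# the sentences below are made-up examples, not Facebook's test data.
import sacrebleu

hypotheses = ["the cat sat on the mat"]           # system output, one line per sentence
references = [["the cat is sitting on the mat"]]  # one reference set aligned to the hypotheses

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU: {bleu.score:.2f}")  # higher means closer to the reference
```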

The ultimate goal of this multilingual machine translator is a model that can translate bidirectionally between any of the world's 7,000 languages, benefiting low-resource languages in particular. The novelty of Facebook's M2M-100 lies in the fact that it does not depend on English as a bridge between two languages. For example, to translate between Chinese and Hindi, systems typically train on Chinese-to-English and English-to-Hindi data; M2M-100 instead trains directly on Chinese-to-Hindi data, which better preserves the original meaning.
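Since the model is open-sourced, a direct Chinese-to-Hindi translation can be sketched roughly as below, assuming the Hugging Face transformers port of M2M-100; the checkpoint name, sample sentence, and use of transformers are illustrative assumptions, not details from the announcement.

```python
# Sketch of direct Chinese -> Hindi translation with M2M-100, assuming the
# Hugging Face transformers port of the open-sourced model (facebook/m2m100_418M).
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")
tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")

chinese_text = "生活就像一盒巧克力。"  # made-up example sentence

tokenizer.src_lang = "zh"                     # source language: Chinese
encoded = tokenizer(chinese_text, return_tensors="pt")
generated = model.generate(
    **encoded,
    forced_bos_token_id=tokenizer.get_lang_id("hi"),  # target language: Hindi
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```

No English intermediate text is produced at any point; the model translates the Chinese input straight into Hindi.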
