Google Unveils TranslateGemma, a Suite of Open Translation Models

January 24, 2026

The latest suite of models encompasses 55 languages and targets diverse deployment environments, ranging from mobile devices to cloud-based systems.

On Thursday, January 15, OpenAI took on a staple of the web: Google Translate, which is celebrating its 20th anniversary this year. Although ChatGPT Translate is still imperfect at this stage, its launch sends a clear signal to Google, showing that OpenAI now aims to position itself in areas historically dominated by the tech giant from Mountain View.

Google responded on the same day by unveiling TranslateGemma, a new project dedicated to translation. At first glance, it appears to be just another service or tool. However, it’s actually a suite of open models designed as a technological foundation for research and development. Let’s delve deeper.

Open Translation Models Based on Gemma 3

TranslateGemma is a new suite of open translation models built on Gemma 3. It comes in three sizes (4B, 12B, and 27B parameters) to meet different technical constraints and usage requirements. The goal is to provide efficient, accessible, and powerful translation models that can run across a wide range of hardware environments.

The design of TranslateGemma is based on a distillation process that aims to transfer capabilities from large models to more compact architectures. According to Google, this process allows it to “outperform models that are twice as large”.

For developers, Google presents this as a major advance: high-fidelity translation becomes possible with less than half the parameters of the baseline model (the comparison Google reports is TranslateGemma 12B against Gemma 3 27B). This gain in efficiency allows for higher throughput and lower latency without sacrificing accuracy, Google emphasizes.

The transferred knowledge primarily comes from the Gemini models. The training involves a combination of “supervised fine-tuning” and “a reinforcement learning phase” which includes “a set of reward models designed to guide the models towards producing more contextually accurate and natural translations”.
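To make the idea of "a set of reward models" more concrete, here is a toy sketch of how several reward signals could be folded into the single score that a reinforcement learning step tries to increase. Everything in it is an assumption made for illustration: the two scoring functions are crude hypothetical stand-ins, the weighted average is just one possible way to combine them, and none of it reflects the reward models Google actually used.

```python
from typing import Callable, Sequence

# In this sketch a "reward model" is any function (source, candidate) -> score in [0, 1].
RewardFn = Callable[[str, str], float]


def length_ratio_reward(source: str, candidate: str) -> float:
    """Hypothetical stand-in: favour candidates whose word count is close to the source's."""
    src_len, cand_len = len(source.split()), len(candidate.split())
    if src_len == 0 or cand_len == 0:
        return 0.0
    return min(src_len, cand_len) / max(src_len, cand_len)


def no_repetition_reward(source: str, candidate: str) -> float:
    """Hypothetical stand-in: share of distinct words, which penalises stuttering output."""
    words = candidate.lower().split()
    return len(set(words)) / len(words) if words else 0.0


def combined_reward(
    source: str,
    candidate: str,
    reward_fns: Sequence[RewardFn],
    weights: Sequence[float],
) -> float:
    """Weighted average of all reward functions for one candidate translation."""
    if not reward_fns or len(reward_fns) != len(weights):
        raise ValueError("need one weight per reward function")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(w * fn(source, candidate) for fn, w in zip(reward_fns, weights))
    return weighted / total_weight


if __name__ == "__main__":
    source = "the cat sleeps on the sofa"
    fns = [length_ratio_reward, no_repetition_reward]
    for candidate in ["le chat dort sur le canapé", "le le le chat"]:
        score = combined_reward(source, candidate, fns, [1.0, 1.0])
        print(f"{score:.3f}  {candidate}")
```

In a real training setup the scoring functions would be learned models rather than hand-written heuristics, but the shape of the computation (several scores, one scalar) is the part worth noticing.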

Highlighted Performance Across 55 Languages

The performance of TranslateGemma has been tested on the WMT24++ dataset, which includes 55 languages. These languages come from a variety of linguistic families, including those considered “low-resource”.

Google adds: “Beyond these main languages, we have pushed the boundaries by training our model on nearly 500 additional language pairs. TranslateGemma is designed to serve as a solid foundation for further adaptations, making it an ideal starting point for researchers looking to refine their own cutting-edge models for specific language pairs or to enhance quality for under-resourced languages.”

Tests conducted by Google show a reduction in error rates for TranslateGemma 12B compared to Gemma 3 27B across all evaluated linguistic families: Romance, Germanic, Balto-Slavic, Indo-Iranian, Dravidian, East and Southeast Asian, Afro-Asiatic, Niger-Congo, as well as Uralic, Turkic, and Hellenic.
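For readers who want to reason about such comparisons, a "reduction in error rate" is usually expressed relative to the baseline. The short sketch below shows that arithmetic. The function name and the sample scores are hypothetical placeholders chosen for illustration; they are not Google's published figures.

```python
def relative_error_reduction(baseline_error: float, new_error: float) -> float:
    """Fraction by which new_error is lower than baseline_error (0.25 means 25% lower)."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - new_error) / baseline_error


if __name__ == "__main__":
    # HYPOTHETICAL (baseline, new) error scores per language family, not real results.
    hypothetical_scores = {
        "family_a": (4.0, 3.0),
        "family_b": (5.0, 4.5),
    }
    for family, (baseline, new) in hypothetical_scores.items():
        reduction = relative_error_reduction(baseline, new)
        print(f"{family}: {reduction:.0%} fewer errors than the baseline")
```

A positive result means the new model makes fewer errors than the baseline; a negative one would mean it regressed.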

A Suite Designed for Various Deployments

Unlike ChatGPT Translate, TranslateGemma is not intended for the general public. It is crafted to serve as a technological foundation for third-party uses, in contexts of research, development, or product integration. It could be particularly relevant for incorporating translation functions into professional software, multilingual business tools, or applications that require local data processing.

The three available sizes are designed to meet a range of deployment environments and technical constraints:

  • 4B model: suited to mobile and edge deployments, where resource constraints and latency are significant considerations.
  • 12B model: intended for local machines or development workstations, for applications that need a balance between quality and efficiency.
  • 27B model: designed for cloud deployments aiming at maximum fidelity, with higher hardware requirements.
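A back-of-the-envelope calculation helps explain why the three sizes map to such different hardware. The sketch below estimates the memory needed just to hold the weights. It assumes the nominal parameter counts (4, 12, and 27 billion) and three common numeric precisions. Real checkpoints differ slightly in size, and the figures leave out activations and the attention cache, so treat the results as lower bounds rather than official requirements.

```python
# Rough weight-memory estimate per model size and numeric precision.
# ASSUMPTIONS: nominal parameter counts taken from the size names, and
# weights only (no activations, no KV cache, no runtime overhead).

NOMINAL_PARAMS = {"4B": 4e9, "12B": 12e9, "27B": 27e9}
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Return the memory needed to hold the weights, in gigabytes (1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def memory_table() -> dict[str, dict[str, float]]:
    """Build {model size: {precision: GB}} for every combination."""
    return {
        size: {
            precision: weight_memory_gb(params, nbytes)
            for precision, nbytes in BYTES_PER_PARAM.items()
        }
        for size, params in NOMINAL_PARAMS.items()
    }


if __name__ == "__main__":
    for size, row in memory_table().items():
        cells = ", ".join(f"{p}: {gb:.1f} GB" for p, gb in row.items())
        print(f"{size:>3} -> {cells}")
```

Under these assumptions, the 4B model needs about 8 GB for its weights at 16-bit precision, while the 27B model needs about 54 GB, which is why the largest size is steered towards cloud hardware.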

The TranslateGemma models also retain the multimodal capabilities of Gemma 3. Google states that the improvements made to text translation “have a positive impact on the ability to translate text in images”.

TranslateGemma is released as a set of open models and can be downloaded through the usual Gemma distribution channels, including Hugging Face and Kaggle.
