
Cover image from Google Blog.
Google DeepMind has officially introduced Gemma 3, the latest iteration of its open-source language model series. This release brings significant improvements, including multimodal capabilities, extended context lengths, and enhanced multilingual performance. With sizes ranging from 1 billion to 27 billion parameters, Gemma 3 is designed for efficient deployment on consumer-grade hardware while delivering state-of-the-art performance. It also outperforms Llama3-405B, DeepSeek-V3 and o3-mini in human preference evaluations on the LMArena leaderboard.
One of the biggest additions to Gemma 3 is vision understanding. Unlike previous versions, Gemma 3 can process images through a custom SigLIP vision encoder. This encoder converts images into a fixed-size vector representation that the language model interprets as soft tokens.
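To make the idea of a fixed-size set of soft tokens concrete, here is a minimal sketch that average-pools a square grid of patch embeddings into a smaller, fixed number of vectors. The grid size (64 x 64), pooling factor (4), embedding width (8), and the function name `pool_patches` are illustrative assumptions, not details taken from this article, and the real encoder is a learned network rather than this toy pooling step.

```python
from typing import List

Vector = List[float]


def pool_patches(patches: List[List[Vector]], factor: int) -> List[Vector]:
    """Average-pool a square grid of patch embeddings into fewer soft tokens.

    patches[r][c] is the embedding of the patch at row r, column c.
    Each non-overlapping factor x factor block becomes one output vector.
    """
    side = len(patches)
    if side % factor != 0 or any(len(row) != side for row in patches):
        raise ValueError("grid must be square with side divisible by factor")
    dim = len(patches[0][0])
    tokens: List[Vector] = []
    for r0 in range(0, side, factor):
        for c0 in range(0, side, factor):
            acc = [0.0] * dim
            for r in range(r0, r0 + factor):
                for c in range(c0, c0 + factor):
                    for d, value in enumerate(patches[r][c]):
                        acc[d] += value
            tokens.append([a / (factor * factor) for a in acc])
    return tokens


# Illustrative sizes (assumptions): a 64 x 64 patch grid pooled 4 x 4
# gives 16 x 16 = 256 soft tokens, whatever the image contains.
GRID_SIDE, POOL, DIM = 64, 4, 8
grid = [[[float(r + c)] * DIM for c in range(GRID_SIDE)] for r in range(GRID_SIDE)]
soft_tokens = pool_patches(grid, POOL)
print(len(soft_tokens), len(soft_tokens[0]))
```

The point of the sketch is that the number of soft tokens depends only on the grid and pooling factor, so the language model sees the same token budget for every image.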
Gemma 3 significantly increases its context length compared to previous versions, supporting up to 128,000 tokens (except the 1B model, which supports 32K tokens). Handling such long contexts efficiently requires key architectural optimizations:
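One widely used optimization of this kind is to interleave local (sliding-window) attention layers with a few global ones, so that most layers cache only a short window of keys and values. The sketch below estimates the key-value cache size for a single sequence under that scheme. The layer count, head sizes, the 5:1 local-to-global ratio, and the 1,024-token window are illustrative assumptions rather than figures from this article, and `kv_cache_bytes` is a hypothetical helper.

```python
from typing import Optional


def kv_cache_bytes(
    context_len: int,
    n_layers: int,
    n_kv_heads: int,
    head_dim: int,
    bytes_per_value: int = 2,
    local_window: Optional[int] = None,
    local_per_global: int = 0,
) -> int:
    """Estimate the KV-cache size in bytes for one sequence.

    Global layers cache every token. Local (sliding-window) layers cache at
    most `local_window` tokens. `local_per_global` local layers are
    interleaved per global layer; 0 means every layer is global.
    """
    per_token = 2 * n_kv_heads * head_dim * bytes_per_value  # keys + values
    if local_window is None or local_per_global == 0:
        return n_layers * context_len * per_token
    n_global = n_layers // (local_per_global + 1)
    n_local = n_layers - n_global
    local_len = min(context_len, local_window)
    return (n_global * context_len + n_local * local_len) * per_token


# Hypothetical model shape, chosen only to make the arithmetic visible.
shape = dict(context_len=128_000, n_layers=48, n_kv_heads=8, head_dim=128)

all_global = kv_cache_bytes(**shape)
interleaved = kv_cache_bytes(**shape, local_window=1024, local_per_global=5)

print(f"all global layers : {all_global / 1e9:.1f} GB")
print(f"5 local : 1 global: {interleaved / 1e9:.1f} GB")
```

Under these assumed numbers the interleaved layout needs several times less cache memory, because only one layer in six grows with the full context.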
Gemma 3 enhances its multilingual capabilities by revisiting its training data mixture and adopting the Gemini 2.0 tokenizer:
The instruction-tuned (IT) models of Gemma 3 undergo an advanced post-training pipeline, incorporating knowledge distillation, reinforcement learning from human feedback (RLHF), and dataset filtering.
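As a rough illustration of the knowledge distillation step, the sketch below computes the textbook distillation objective for a single token position: the KL divergence between a teacher's and a student's next-token distributions. This is a generic formulation under stated assumptions, not Gemma 3's actual training recipe, and the function names and logits are made up for the example.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Convert logits to probabilities, with optional temperature scaling."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(
    teacher_logits: List[float],
    student_logits: List[float],
    temperature: float = 1.0,
) -> float:
    """KL(teacher || student) over one token's next-token distribution."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


# Made-up logits over a 3-token vocabulary.
teacher = [1.0, 2.0, 3.0]
print(distillation_loss(teacher, [1.0, 2.0, 3.0]))  # student matches teacher
print(distillation_loss(teacher, [3.0, 2.0, 1.0]))  # student disagrees
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge, which is what pushes a smaller model toward the behavior of a larger one.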
Gemma 3 achieves impressive results across various AI benchmarks:
Benchmark | Gemma 3 27B | Gemma 2 27B | Improvement (percentage points) |
---|---|---|---|
MMLU-Pro | 67.5% | 56.9% | ✅ +10.6 |
LiveCodeBench | 29.7% | 20.4% | ✅ +9.3 |
Bird-SQL (dev) | 54.4% | 46.7% | ✅ +7.7 |
FACTS Grounding | 74.9% | 62.4% | ✅ +12.5 |
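
The improvement figures in the table are absolute differences between the two scores, that is, percentage points rather than relative gains. A short sketch using only the numbers from the table shows the distinction; the helper names `gain_pp` and `relative_gain` are made up for this example.

```python
# (Gemma 3 27B score, Gemma 2 27B score) in percent, copied from the table.
scores = {
    "MMLU-Pro": (67.5, 56.9),
    "LiveCodeBench": (29.7, 20.4),
    "Bird-SQL (dev)": (54.4, 46.7),
    "FACTS Grounding": (74.9, 62.4),
}


def gain_pp(new: float, old: float) -> float:
    """Absolute improvement in percentage points."""
    return round(new - old, 1)


def relative_gain(new: float, old: float) -> float:
    """Improvement relative to the old score, in percent."""
    return round(100.0 * (new - old) / old, 1)


for name, (gemma3, gemma2) in scores.items():
    print(f"{name}: +{gain_pp(gemma3, gemma2)} pp, "
          f"+{relative_gain(gemma3, gemma2)}% relative")
```

Relative gains are larger than the percentage-point figures because they are measured against the lower Gemma 2 baseline.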
Gemma3-27B-IT ranked #9 globally in the LMSYS Chatbot Arena, achieving an Elo score of 1338. This puts it ahead of:
Gemma 3 sets a new benchmark for open-source AI by combining multimodal capabilities, efficient long-context processing, and enhanced multilinguality. Unlike larger proprietary models, Gemma 3 can run efficiently on consumer hardware, making it an excellent choice for:
Gemma 3 is a major step toward open AI innovation. With its improved efficiency, multimodal reasoning, and long-context abilities, it's clear that Gemma 3 is one of the best open models available today (we'll see about tomorrow...).
Read the full research paper: