This is a launch from Google.dev

8. Gemma 3

Build with Multimodal AI from Google

Gemma 3 is Google's new family of models for multimodal AI (text, images, short video). Sizes from 1B to 27B, 128K context, 140+ languages. Includes ShieldGemma 2 for image safety.

Launch tags:

Open Source • Artificial Intelligence • Development

Zac Zuo

Hunter

Hi everyone!

Check out Gemma 3, Google's latest family of models for building multimodal AI applications! This is a big step up from the previous Gemma versions, adding video understanding and a much larger context window.

Key features:

🖼️ Multimodal: Handles text, images, and short videos.
🧠 Multiple Sizes: Available in 1B, 4B, 12B, and 27B parameter versions.
↔️ 128K Context Window: A major increase, allowing for processing much more information.
🌍 Multilingual: Supports over 35 languages out of the box, with pretrained support for over 140.
🛠️ Integrations: Works with Hugging Face Transformers, Ollama, JAX, Keras, PyTorch, Unsloth, vLLM, and Gemma.cpp.
🛡️ Safety: Includes a separate 4B model, ShieldGemma 2, for image safety classification.
⚡ Optimized for NVIDIA GPUs, Google Cloud TPUs, and AMD GPUs.
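
To get a feel for what the four sizes mean in practice, here is a rough back-of-the-envelope sketch of how much memory the weights alone would take at a few common precisions. It assumes the nominal parameter counts implied by the size names (the real checkpoints differ slightly) and ignores the KV cache, activations, and any vision encoder overhead, so treat the numbers as lower bounds rather than hardware requirements. The names `NOMINAL_PARAMS`, `BYTES_PER_PARAM`, and `weight_memory_gb` are made up for this illustration.

```python
# Nominal parameter counts taken from the size names (1B, 4B, 12B, 27B).
# Assumption: the real checkpoints differ slightly, so results are rough.
NOMINAL_PARAMS = {"1B": 1e9, "4B": 4e9, "12B": 12e9, "27B": 27e9}

# Bytes needed to store one weight at common precisions.
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(size: str, precision: str = "bf16") -> float:
    """Estimate memory for the weights alone, in GB (10^9 bytes)."""
    return NOMINAL_PARAMS[size] * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    # Print one line per model size with the estimate at each precision.
    for size in NOMINAL_PARAMS:
        row = ", ".join(
            f"{p}: {weight_memory_gb(size, p):.1f} GB" for p in BYTES_PER_PARAM
        )
        print(f"Gemma 3 {size} -> {row}")
```

By this estimate the 27B model needs on the order of 54 GB for weights in bf16, while the 4B model drops to about 2 GB at 4-bit, which is why the smaller and quantized variants are the usual starting point on a single consumer GPU.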

Gemma 3 is a clear sign of how quickly the multimodal AI space is advancing.

Let's start exploring its capabilities in Google AI Studio!