Google DeepMind Releases Gemma 4 12B Unified Open Multimodal AI Model
Summary
Google DeepMind has released Gemma 4 12B Unified, an open multimodal AI model that processes text, audio, image, and video inputs natively without requiring separate encoders. Released under the Apache 2.0 license, it is designed for local deployment on consumer devices, bringing native audio and vision understanding directly to edge environments.
Key quotes
Built with the same multimodal functionality as Gemma 4 E2B and E4B (text, audio, image, and video inputs), it brings native audio and vision understanding directly to local environments without the need for separate encoders.
This unified approach to multimodality makes the model encoder-free, offering a deployment size that is perfect for consumer devices.
We're on a journey to advance and democratize artificial intelligence through open source and open science.
You might also wanna read
Google DeepMind's Gemma 4 12B: Encoder-free multimodal AI runs locally on 16GB VRAM
Google DeepMind's Gemma 4 12B is an open-source multimodal AI model that processes text, images, and audio natively on consumer hardware wit
Google DeepMind Releases Gemma 4: Most Advanced Open AI Model Family
Google DeepMind has released Gemma 4, its most advanced open AI model family to date. The models feature enhanced reasoning capabilities, mu
Google launches Gemma 4 12B: an encoder-free multimodal AI model for laptops
Google has introduced Gemma 4 12B, a unified, encoder-free multimodal AI model designed to run high-performance intelligence directly on lap
Gemma-Tuner-Multimodal: Fine-Tuning Google's Gemma Models on Apple Silicon for Text, Images, and Audio
The article introduces gemma-tuner-multimodal, an open-source tool for fine-tuning Google's Gemma language models (versions 4 and 3n) on mul
Google Launches Gemma 3 270M: A Compact AI Model for Efficient Task-Specific Fine-Tuning
Google has introduced Gemma 3 270M, a compact and energy-efficient AI model with 270 million parameters. Designed for task-specific fine-tun
Gemma 3n: Advanced On-Device Multimodal Capabilities for Edge Devices
The Gemma 3n model has been fully released, offering advanced on-device multimodal capabilities to edge devices with unprecedented performan
