First reported by Hacker News
Google launches Gemma 4 12B: an encoder-free multimodal AI model for laptops

Google DeepMind Releases Gemma 4 12B Unified Open Multimodal AI Model

5d ago · 15 min read · en

Summary

Google DeepMind has released Gemma 4 12B Unified, an open multimodal AI model that natively processes text, audio, image, and video inputs without separate encoders. Released under the Apache 2.0 license, it is designed for local deployment on consumer devices, bringing native audio and vision understanding to edge environments.

Key quotes

Built with the same multimodal functionality as Gemma 4 E2B and E4B (text, audio, image, and video inputs), it brings native audio and vision understanding directly to local environments without the need for separate encoders.
This unified approach to multimodality makes the model encoder-free, offering a deployment size that is perfect for consumer devices.
We're on a journey to advance and democratize artificial intelligence through open source and open science.

You might also wanna read

Google DeepMind's Gemma 4 12B: Encoder-free multimodal AI runs locally on 16GB VRAM

Google DeepMind's Gemma 4 12B is an open-source multimodal AI model that processes text, images, and audio natively on consumer hardware wit

Product Hunt·9d ago

Google DeepMind Releases Gemma 4: Most Advanced Open AI Model Family

Google DeepMind has released Gemma 4, its most advanced open AI model family to date. The models feature enhanced reasoning capabilities, mu

Product Hunt·2mo ago

Google launches Gemma 4 12B: an encoder-free multimodal AI model for laptops

Google has introduced Gemma 4 12B, a unified, encoder-free multimodal AI model designed to run high-performance intelligence directly on lap

blog.google·9d ago

Gemma-Tuner-Multimodal: Fine-Tuning Google's Gemma Models on Apple Silicon for Text, Images, and Audio

The article introduces gemma-tuner-multimodal, an open-source tool for fine-tuning Google's Gemma language models (versions 4 and 3n) on mul

github.com·2mo ago

Google Launches Gemma 3 270M: A Compact AI Model for Efficient Task-Specific Fine-Tuning

Google has introduced Gemma 3 270M, a compact and energy-efficient AI model with 270 million parameters. Designed for task-specific fine-tun

developers.googleblog.com·10mo ago

Gemma 3n: Advanced On-Device Multimodal Capabilities for Edge Devices

The Gemma 3n model has been fully released, offering advanced on-device multimodal capabilities to edge devices with unprecedented performan

developers.googleblog.com·11mo ago