First reported by Hacker News

Google launches Gemma 4 12B: an encoder-free multimodal AI model for laptops

Google DeepMind Releases Gemma 4 12B Unified Open Multimodal AI Model

5d ago · 15 min read · en

Score: 100 · Type: press release · Sentiment: positive

Summary

Google DeepMind has released Gemma 4 12B Unified, an open multimodal AI model that processes text, audio, image, and video inputs natively without requiring separate encoders. Released under the Apache 2.0 license, it is designed for local deployment on consumer devices, bringing native audio and vision understanding to edge environments.

Key quotes · 3 pulled

Built with the same multimodal functionality as Gemma 4 E2B and E4B (text, audio, image, and video inputs), it brings native audio and vision understanding directly to local environments without the need for separate encoders.

This unified approach to multimodality makes the model encoder-free, offering a deployment size that is perfect for consumer devices.

We're on a journey to advance and democratize artificial intelligence through open source and open science.

You might also wanna read

Google DeepMind's Gemma 4 12B: Encoder-free multimodal AI runs locally on 16GB VRAM

Google DeepMind's Gemma 4 12B is an open-source multimodal AI model that processes text, images, and audio natively on consumer hardware wit

Product Hunt·9d ago

Google DeepMind Releases Gemma 4: Most Advanced Open AI Model Family

Google DeepMind has released Gemma 4, its most advanced open AI model family to date. The models feature enhanced reasoning capabilities, mu

Product Hunt·2mo ago

Google launches Gemma 4 12B: an encoder-free multimodal AI model for laptops

Google has introduced Gemma 4 12B, a unified, encoder-free multimodal AI model designed to run high-performance intelligence directly on lap

blog.google·9d ago

Gemma-Tuner-Multimodal: Fine-Tuning Google's Gemma Models on Apple Silicon for Text, Images, and Audio

The article introduces gemma-tuner-multimodal, an open-source tool for fine-tuning Google's Gemma language models (versions 4 and 3n) on mul

github.com·2mo ago

Google Launches Gemma 3 270M: A Compact AI Model for Efficient Task-Specific Fine-Tuning

Google has introduced Gemma 3 270M, a compact and energy-efficient AI model with 270 million parameters. Designed for task-specific fine-tun

developers.googleblog.com·10mo ago

Gemma 3n: Advanced On-Device Multimodal Capabilities for Edge Devices

The Gemma 3n model has been fully released, offering advanced on-device multimodal capabilities to edge devices with unprecedented performan

developers.googleblog.com·11mo ago