Google DeepMind's Gemma 4 26B IT: Open Multimodal AI Model Released on Hugging Face

4d ago · 7 min read · en · News

technology, artificial intelligence, programming, open source

Summary

Gemma 4 26B IT is an open multimodal AI model built by Google DeepMind and published on Hugging Face. It accepts text and image inputs, processes video as sequences of frames, and generates text output. The model targets reasoning, agentic workflows, coding, and multimodal understanding on consumer GPUs and workstations. It features a 256K-token context window, support for more than 140 languages, and a hybrid attention mechanism that interleaves local sliding-window and full global attention with unified Keys and Values.
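To make "video as sequences of frames" concrete, here is a minimal Python sketch of how a caller might pick a fixed number of evenly spaced frames from a clip before passing them to an image-capable model. The function name `sample_frame_indices` and the uniform-midpoint strategy are illustrative assumptions, not the preprocessing documented for Gemma 4.

```python
def sample_frame_indices(total_frames: int, num_frames: int) -> list[int]:
    """Pick up to `num_frames` indices spread evenly across a clip.

    Hypothetical helper: the real model's frame selection may differ.
    """
    if total_frames <= 0 or num_frames <= 0:
        return []
    if num_frames >= total_frames:
        # Short clip: keep every frame.
        return list(range(total_frames))
    # Split the clip into `num_frames` equal segments and take
    # the frame nearest the middle of each segment.
    step = total_frames / num_frames
    return [int(step * k + step / 2) for k in range(num_frames)]


if __name__ == "__main__":
    print(sample_frame_indices(100, 4))
```

The selected frames would then be decoded and supplied in order, so the model sees the clip as an ordered series of images.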

Source

Twitter / X · Google DeepMind's Gemma 4 26B IT: Open Multimodal AI Model Released on Hugging Face · huggingface.co

Key quotes

Gemma 4 26B IT is an open multimodal model built by Google DeepMind that handles text and image inputs, can process video as sequences of frames, and generates text output.

It is designed to deliver frontier-level performance for reasoning, agentic workflows, coding, and multimodal understanding on consumer GPUs and workstations.

The model uses a hybrid attention mechanism that interleaves local sliding-window and full global attention, with unified Keys and Values.
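
The difference between the two attention types in that last quote can be sketched with boolean masks. The Python below is a simplified illustration, assuming causal attention; the window size and the ratio of local to global layers (`global_every`) are hypothetical parameters chosen for the example, not values taken from the model card.

```python
def causal_mask(seq_len: int) -> list[list[bool]]:
    """Full global causal mask: position i may attend to every j <= i."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]


def sliding_window_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Local causal mask: position i attends only to the most recent
    `window` positions, itself included."""
    return [[0 <= i - j < window for j in range(seq_len)] for i in range(seq_len)]


def layer_kinds(num_layers: int, global_every: int) -> list[str]:
    """Interleave layer types: every `global_every`-th layer is global,
    the rest are local. The ratio is an assumption for illustration."""
    return [
        "global" if (layer + 1) % global_every == 0 else "local"
        for layer in range(num_layers)
    ]


def render(mask: list[list[bool]]) -> str:
    """Draw a mask as text: 'x' means attention is allowed."""
    return "\n".join("".join("x" if ok else "." for ok in row) for row in mask)


if __name__ == "__main__":
    print(layer_kinds(num_layers=6, global_every=3))
    print(render(sliding_window_mask(seq_len=6, window=3)))
    print(render(causal_mask(seq_len=6)))
```

In a local layer each token sees only a fixed-size window of recent tokens, so the cost per token stays bounded as the context grows. The interleaved global layers let information travel across the whole sequence.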
