Gemma-4-12B v2 GGUF: A Local Coding Agent Model for Consumer Hardware
Summary
This is a Hugging Face model card for "yuxinlu1/gemma-4-12B-agentic-fable5-composer2.5-v2-3.5x-tau2-GGUF", a fine-tuned, quantized version of Google's Gemma 4 12B model aimed at local coding and agentic (tool-using) tasks. According to the card, the model runs on consumer hardware with about 4.5 GB of free VRAM or unified memory, which allows private, offline AI-assisted coding. The v2 release emphasizes multi-step reasoning, tool use, and agentic capabilities benchmarked on tau2-bench. The card also promotes open-source AI democratization and local-first AI development.
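As a rough sanity check on the memory figure, the arithmetic relating parameter count, quantization width, and footprint can be sketched as below. This is a back-of-the-envelope estimate, not something taken from the model card: the function names are hypothetical, 1 GB is assumed to mean 10^9 bytes, the parameter count is taken as a nominal 12 billion, and the KV cache and runtime overhead are ignored.

```python
def implied_bits_per_weight(memory_gb: float, params_billion: float) -> float:
    """Average bits per weight if the entire memory budget held only weights.

    This is an upper bound: a real runtime also needs room for the KV cache
    and other buffers, so the true average would be lower.
    """
    memory_bits = memory_gb * 1e9 * 8          # assumes 1 GB = 10^9 bytes
    return memory_bits / (params_billion * 1e9)


def weights_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone at a given average bit width."""
    total_bits = params_billion * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9                # bits -> bytes -> GB


if __name__ == "__main__":
    # Nominal 12B parameters in the ~4.5 GB quoted by the card (assumed inputs).
    print(implied_bits_per_weight(4.5, 12))
    # The same nominal parameter count stored as unquantized 16-bit weights.
    print(weights_size_gb(12, 16))
```

Under these assumptions, a 4.5 GB budget leaves at most about 3 bits per weight on average, compared with roughly 24 GB for the same nominal parameter count at 16 bits. The card's actual quantization scheme is not stated in this summary.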
Key quotes
🐣 Tiny footprint, big brain — a local coding & tool-using agent for everyone
With ~4.5 GB of VRAM or unified memory free, you can run your own private, offline coding agent right now.
v2 is the big agentic upgrade — it reads, reasons, uses tools, and works through multi-step technical tasks before it acts.
All local, all yours, no API, no cloud.
We're on a journey to advance and democratize artificial intelligence through open source and open science.