Gemma-Tuner-Multimodal: Fine-Tuning Google's Gemma Models on Apple Silicon for Text, Images, and Audio

MediaSquirrel

1mo ago · 11 min read · en · Code

Score: 100 · Type: news · Sentiment: positive

Summary

The article introduces gemma-tuner-multimodal, an open-source tool for fine-tuning Google's Gemma models (Gemma 4 and Gemma 3n) on multimodal data: text, images, and audio. Its main selling point is that it runs on Apple Silicon Macs through PyTorch and Metal Performance Shaders (MPS), so no NVIDIA GPU is required. The tool uses LoRA (Low-Rank Adaptation) fine-tuning and can stream training data from the cloud, so a dataset does not have to fit on the Mac itself. A comparison table sets it against MLX-LM, Unsloth, and axolotl on multimodal support and Apple Silicon compatibility.
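
As a rough illustration of why LoRA suits memory-constrained hardware, the sketch below compares the number of trainable parameters for one weight matrix under full fine-tuning and under LoRA. The function name, layer dimensions, and rank are hypothetical values chosen for the example; they are not taken from Gemma or from gemma-tuner-multimodal.

```python
def lora_param_counts(d_out: int, d_in: int, rank: int) -> tuple[int, int]:
    """Return (full, lora) trainable-parameter counts for one weight matrix.

    Full fine-tuning updates every entry of the d_out x d_in matrix W.
    LoRA freezes W and trains two small factors instead:
    B with shape (d_out, rank) and A with shape (rank, d_in).
    """
    full = d_out * d_in
    lora = rank * (d_out + d_in)
    return full, lora


# Hypothetical layer size and rank, for illustration only.
full, lora = lora_param_counts(d_out=2048, d_in=2048, rank=16)
print(f"full fine-tuning: {full:,} trainable parameters")
print(f"LoRA (rank 16):   {lora:,} trainable parameters")
```

Because the LoRA count grows with the rank rather than with the product of the two layer dimensions, the optimizer state and gradients that must be held in memory shrink accordingly.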

Key quotes

Fine-tune Gemma on text, images, and audio — on your Mac, on data that doesn't fit on your Mac.

LoRA for Gemma 4 & 3n — why not just use…?

Runs on Apple Silicon (MPS) ✅

No NVIDIA GPU required ✅

Snippet from the RSS feed

Fine-tune Gemma 4 and 3n with audio, images and text on Apple Silicon, using PyTorch and Metal Performance Shaders. - mattmireles/gemma-tuner-multimodal
