How Google accelerates Gemini Nano on Pixel devices using frozen Multi-Token Prediction
Google describes how it accelerates its on-device LLMs, Gemini Nano and Gemma, on Pixel devices using a technique called frozen Multi-Token Prediction (MTP). This approach enables faster inference for features like notification summarization and text proofreading while keeping data private o