Appears on
Articles (4)
Show HN: Needle: We Distilled Gemini Tool Calling into a 26M Model
Hey HN, Henry here from Cactus. We open-sourced Needle, a 26M parameter function-calling (tool use) model. It runs at 6000 tok/s prefill and 1200 tok/s decode on consumer devices. We were always frustrated by the little effort made towards building agentic models that run on budget phones, so we conducted investigations that led to an observation: agentic exp
Code
github.com · 19d ago
Open-Source Compendium for Mathematics, Computer Science, and AI Fundamentals
Code
Cactus: An Open-Source Low-Latency AI Engine for Mobile Devices and Wearables
Code
Comparison of Cactus and Ollama Frameworks for Smartphones
Compares the compatibility and features of the Cactus and Ollama frameworks on smartphones, highlighting their differences and potential use cases.
News

