Microsoft Releases bitnet.cpp: Official Inference Framework for 1-bit Large Language Models

redm

2mo ago · 7 min read · en · Code

Score: 100 · Type: news · Sentiment: positive

Summary

Microsoft has released bitnet.cpp, the official inference framework for 1-bit large language models (LLMs) such as BitNet b1.58. The framework provides a suite of optimized kernels for fast, lossless inference of 1.58-bit models on CPUs and GPUs, with NPU support planned. The initial release targets CPU inference. On ARM CPUs it achieves speedups of 1.37x to 5.07x, with larger models seeing greater gains, and reduces energy consumption by 55.4% to 70.0%. The project is open source on GitHub and includes a demo for testing.
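
For readers wondering where "1.58-bit" comes from: each weight takes one of three values (-1, 0, 1), and log2(3) is roughly 1.58 bits. The sketch below illustrates that idea with an absmean-style ternary quantizer and a multiplication-free dot product. It is a minimal illustration only, not bitnet.cpp's kernels or API; the function names `absmean_ternary` and `ternary_dot`, the `eps` value, and the sample numbers are all hypothetical.

```python
import math

# Three possible values per weight means log2(3) bits of information each.
BITS_PER_TERNARY_WEIGHT = math.log2(3)  # roughly 1.58


def absmean_ternary(weights, eps=1e-8):
    """Quantize floats to {-1, 0, 1} using one shared scale (assumed scheme).

    The scale is the mean absolute weight; each weight is divided by it,
    rounded to the nearest integer, and clipped to the range [-1, 1].
    """
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    quantized = [max(-1, min(1, round(w / scale))) for w in weights]
    return quantized, scale


def ternary_dot(quantized, scale, activations):
    """Dot product that only adds or subtracts activations, then rescales once."""
    acc = 0.0
    for q, x in zip(quantized, activations):
        if q == 1:
            acc += x
        elif q == -1:
            acc -= x
        # q == 0 contributes nothing, so the activation is skipped entirely.
    return acc * scale


# Illustrative values only; not taken from any real model.
weights = [0.9, -0.05, -1.2, 0.4]
q, s = absmean_ternary(weights)
y = ternary_dot(q, s, [1.0, 2.0, 3.0, 4.0])
```

Because the inner loop needs no multiplications, kernels built around ternary weights can trade arithmetic for additions and lookups, which is one plausible reading of why the reported speedups and energy savings are possible on CPUs.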

Key quotes

bitnet.cpp is the official inference framework for 1-bit LLMs (e.g., BitNet b1.58)

It offers a suite of optimized kernels, that support fast and lossless inference of 1.58-bit models on CPU and GPU

bitnet.cpp achieves speedups of 1.37x to 5.07x on ARM CPUs, with larger models experiencing greater performance gains

Additionally, it reduces energy consumption by 55.4% to 70.0%, further boosting overall efficiency

Snippet from the RSS feed

Official inference framework for 1-bit LLMs. Contribute to microsoft/BitNet development by creating an account on GitHub.
