Z80-μLM: A 2-Bit Quantized Language Model for Vintage Z80 Processors

By quesomaster9000

5mo ago · 5 min read · en · Code

Summary

Z80-μLM is a 2-bit quantized language model designed to run on vintage 8-bit Z80 processors with only 64KB of RAM. The project explores how small a conversational AI can be while maintaining personality, resulting in a 40KB .COM binary that can run on 1976-era 4MHz hardware. It allows training conversational models in Python and exporting them as CP/M binaries for retrocomputing enthusiasts to chat with their vintage computers.
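
To make the "2-bit" part concrete, here is a minimal Python sketch of how four 2-bit weights can share one byte. It is an illustration only, not the project's actual code: the level set `LEVELS`, the bit order, and the helper names `quantize`, `pack`, and `unpack` are all assumptions, since the summary does not describe the real weight format.

```python
# Hypothetical mapping from 2-bit code (0..3) to weight value.
# The summary does not say which values the real model uses.
LEVELS = (-2, -1, 0, 1)


def quantize(w: float) -> int:
    """Return the 2-bit code of the level nearest to w."""
    return min(range(4), key=lambda code: abs(LEVELS[code] - w))


def pack(codes: list[int]) -> bytes:
    """Pack 2-bit codes four to a byte, first code in the lowest bits (assumed order)."""
    out = bytearray()
    for i in range(0, len(codes), 4):
        byte = 0
        for j, code in enumerate(codes[i:i + 4]):
            byte |= (code & 0b11) << (2 * j)
        out.append(byte)
    return bytes(out)


def unpack(data: bytes, count: int) -> list[int]:
    """Recover `count` weight values from packed bytes."""
    return [LEVELS[(data[i // 4] >> (2 * (i % 4))) & 0b11] for i in range(count)]


weights = [0.9, -1.7, 0.1, -0.6, 1.4]
codes = [quantize(w) for w in weights]
packed = pack(codes)                    # 5 weights fit in 2 bytes
restored = unpack(packed, len(codes))

# Upper bound only: if the whole 40KB binary were weights (it also holds
# the inference code and the chat UI), it could store this many of them.
MAX_WEIGHTS = 40 * 1024 * 4
```

At four weights per byte, even the full 40KB would cap the model at 163,840 weights, and the real count is lower because the same binary also carries the inference code and the UI.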

Key quotes

Z80-μLM is a 'conversational AI' that generates short character-by-character sequences, with quantization-aware training (QAT) to run on a Z80 processor with 64kb of ram.
The root behind this project was the question: how small can we go while still having personality, and can it be trained or fine-tuned easily? With easy self-hosted distribution?
The answer is Yes! And a 40kb .com binary (including inference, weights & a chat-style UI) running on a 4MHz processor from 1976.
It won't pass the Turing test, but it might make you smile at the green screen.
Snippet from the RSS feed
Z80-μLM is a 2-bit quantized language model small enough to run on an 8-bit Z80 processor. Train conversational models in Python, export them as CP/M .COM binaries, and chat with your vintage compu...
