TinyTinyTPU: Educational Implementation of Google's TPU Architecture on FPGA
By Xenograph
Summary
TinyTinyTPU is an educational implementation of Google's TPU (Tensor Processing Unit) architecture, scaled down to a minimal 2×2 systolic-array matrix-multiply unit. It is written in SystemVerilog and deployed on FPGA hardware, specifically the Basys3 (XC7A35T). The project is meant as a learning resource for TPU architecture principles and covers the full flow: simulation and testing, FPGA build and deployment, and inference. Its documentation describes resource usage, project structure, architecture details, and an open-source toolchain based on Yosys/nextpnr.
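To make the idea of a 2×2 systolic-array matrix multiply concrete, here is a small cycle-by-cycle software model in Python. It is only a sketch under stated assumptions, not the project's SystemVerilog: the summary does not say which dataflow TinyTinyTPU uses, so the model assumes a weight-stationary arrangement (weights held in each processing element, activations moving left to right, partial sums moving top to bottom). The function name `systolic_matmul` and all variable names are hypothetical.

```python
def systolic_matmul(x, w):
    """Cycle-level model of an n x n weight-stationary systolic array.

    Assumed dataflow (not taken from the project's RTL):
    PE(i, j) holds w[i][j]; activations enter on the left edge, skewed by
    one cycle per row, and partial sums flow down each column.
    Computes x @ w for x of shape (T, n) and w of shape (n, n).
    """
    n = len(w)
    t_rows = len(x)
    a_reg = [[0] * n for _ in range(n)]  # activation register in each PE
    p_reg = [[0] * n for _ in range(n)]  # partial-sum register in each PE
    out = [[0] * n for _ in range(t_rows)]

    # The last result leaves the bottom-right PE at cycle T - 1 + 2 * (n - 1).
    for cycle in range(t_rows + 2 * (n - 1)):
        new_a = [[0] * n for _ in range(n)]
        new_p = [[0] * n for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if j == 0:
                    # Row i of the array is fed x[t][i] at cycle t + i (input skew).
                    t = cycle - i
                    a_in = x[t][i] if 0 <= t < t_rows else 0
                else:
                    a_in = a_reg[i][j - 1]  # activation from the left neighbour
                p_in = p_reg[i - 1][j] if i > 0 else 0  # partial sum from above
                new_a[i][j] = a_in
                new_p[i][j] = p_in + a_in * w[i][j]  # multiply-accumulate
        a_reg, p_reg = new_a, new_p

        # Column j of input row t is complete at cycle t + (n - 1) + j.
        for j in range(n):
            t = cycle - (n - 1) - j
            if 0 <= t < t_rows:
                out[t][j] = p_reg[n - 1][j]
    return out
```

For example, `systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])` returns `[[19, 22], [43, 50]]`, the same as an ordinary matrix product. The point of the model is the timing: each processing element does one multiply-accumulate per cycle and only talks to its neighbours, which is what makes the structure cheap to build in hardware.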
Key quotes
TinyTinyTPU is an educational implementation of Google's TPU architecture, scaled down to a 2×2 systolic array
A minimal 2×2 systolic-array TPU-style matrix-multiply unit, implemented in SystemVerilog and deployed on FPGA
This project implements a complete TPU architecture including resource usage on Basys3 XC7A35T FPGA
Demonstrates design philosophy as a minimal, educational implementation of TPU architecture
