
NumKong: A Comprehensive Collection of 2,000 SIMD Kernels for Mixed-Precision Numerical Computing

By ashvardanian

2mo ago · 42 min read · en · News

Summary

The article announces that the SimSIMD project is being retired and relaunched as NumKong, a collection of around 2,000 SIMD kernels for mixed-precision numerics. The project spans roughly 200,000 lines of code and docstrings across 7 programming languages, with numeric types ranging from Float6 to Float118. It targets RISC-V, Intel AMX, Arm SME, and WebAssembly Relaxed SIMD. The kernels cover BLAS-like operations: dot products, batched GEMMs, distance calculations, geospatial computations, ColBERT MaxSim, and mesh alignment. According to the author, all of it is tested against in-house 118-bit floating-point numbers and profiled for both numerical stability and speed.
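To make the idea of testing against a higher-precision reference concrete, here is a minimal sketch in plain Python. It is not NumKong's API or its 118-bit type: `fractions.Fraction` stands in as an exact reference, and the function names `dot_naive`, `dot_reference`, and `relative_error` are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction


def dot_naive(a, b):
    """Dot product accumulated left to right in float64."""
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total


def dot_reference(a, b):
    """Exact dot product of the same float inputs, using rational arithmetic.

    Assumption: Fraction is only a stand-in for a wide reference type.
    """
    return sum((Fraction(x) * Fraction(y) for x, y in zip(a, b)), Fraction(0))


def relative_error(approx, exact):
    """Relative error of a float result against an exact Fraction reference."""
    diff = abs(Fraction(approx) - exact)
    if exact == 0:
        return float(diff)
    return float(diff / abs(exact))


# A small input where float64 accumulation loses the middle term entirely.
a = [1e16, 1.0, -1e16]
b = [1.0, 1.0, 1.0]

naive = dot_naive(a, b)
exact = dot_reference(a, b)
error = relative_error(naive, exact)
```

On this input the float64 accumulator absorbs the `1.0` into `1e16` and then cancels it, so the naive result is `0.0` while the exact reference is `1`. A wider reference type is useful for catching exactly this kind of loss when validating lower-precision kernels.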

Key quotes
I'm killing my SimSIMD project and re-launching under a new name — NumKong — StringZilla's big brother.
Around 2'000 SIMD kernels for mixed precision numerics, spread across 200'000 lines of code & docstrings, for 7 programming languages.
One of the larger collections online — comparable to OpenBLAS, the default NumPy BLAS (Basic Linear Algebra Subprograms) backend.
All of that tested against in-house 118-bit floating point numbers and heavily profiled for both numerical stability and speed.
Around 2'000 SIMD kernels for mixed-precision BLAS-like numerics — dot products, batched GEMMs, distances, geospatial, ColBERT MaxSim, and mesh alignment — from Float6 to Float118.
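For readers unfamiliar with ColBERT MaxSim, mentioned in the quotes above, the scoring rule can be sketched in a few lines of plain Python: for each query token vector, take the maximum dot product against all document token vectors, then sum those maxima. This is a scalar reference of the general technique, not NumKong's implementation; the names `dot` and `maxsim` are hypothetical.

```python
def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def maxsim(query_vectors, doc_vectors):
    """Late-interaction score: sum over query tokens of the best-matching
    document token, measured by dot product."""
    return sum(
        max(dot(q, d) for d in doc_vectors)
        for q in query_vectors
    )


# Two query token vectors and three document token vectors, 2 dimensions each.
query = [[1.0, 0.0], [0.0, 1.0]]
document = [[0.5, 0.5], [1.0, 0.0], [0.0, 2.0]]

score = maxsim(query, document)  # best matches are 1.0 and 2.0
```

The inner loop is a batch of dot products followed by a max-reduction, which is the kind of pattern that maps naturally onto SIMD and matrix hardware.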
Snippet from the RSS feed
Around 2'000 SIMD kernels for mixed-precision BLAS-like numerics — dot products, batched GEMMs, distances, geospatial, ColBERT MaxSim, and mesh alignment — from Float6 to Float118, leveraging RISC-V, Intel AMX, Arm SME, and WebAssembly Relaxed SIMD, in 7
