Testing RDMA over Thunderbolt 5 on Mac Studio Cluster for AI Workloads
By rbanffy
Summary
The article covers testing RDMA (Remote Direct Memory Access) over Thunderbolt 5, a new feature in macOS 26.2, on a Mac Studio cluster. RDMA lets multiple Macs act as if they share one large pool of memory, which speeds up very large AI models. The author tested a cluster with 1.5 TB of unified memory using Exo 1.0, an open-source private AI clustering tool. The cluster costs just shy of $40,000; Apple loaned the Mac Studios and DeskPi supplied the 4-post mini rack that holds them.
Key quotes
RDMA lets the Macs all act like they have one giant pool of RAM, which speeds up things like massive AI models.
The stack of Macs I tested, with 1.5 TB of unified memory, costs just shy of $40,000, and if you're wondering, no I cannot justify spending that much money for this.
Apple gave me access to this Mac Studio cluster to test RDMA over Thunderbolt, a new feature in macOS 26.2.
The easiest way to test it is with Exo 1.0, an open source private AI clustering tool.
Apple loaned the Mac Studios for testing. I also have to thank DeskPi for sending over the 4-post mini rack containing the cluster.
