Technology

Art

Meta announces two 24k GPU clusters for Llama 3 training and AI infrastructure

By Kevin Lee, Adi Gangidi, Mathew Oldham

2h ago· 11 min readen

technology business ai infrastructure data center

Summary

Meta is announcing two large-scale 24k GPU clusters built for training AI models, specifically Llama 3. The post details the hardware (Grand Teton, OpenRack), network, storage, design, performance, and software stack (PyTorch) used to achieve high throughput and reliability. Meta emphasizes its commitment to open compute and open source, positioning this as a key step in its ambitious AI infrastructure roadmap through 2024.

Source

Twitter / XMeta announces two 24k GPU clusters for Llama 3 training and AI infrastructureengineering.fb.com

Key quotes

· 5 pulled

Marking a major investment in Meta's AI future, we are announcing two 24k GPU clusters.

We are sharing details on the hardware, network, storage, design, performance, and software that help us extract high throughput and reliability for various AI workloads.

We are strongly committed to open compute and open source.

We built these clusters on top of Grand Teton, OpenRack, and PyTorch and continue to push open innovation across the industry.

This announcement is one step in our ambitious infrastructure roadmap.

Snippet from the RSS feed

Marking a major investment in Meta’s AI future, we are announcing two 24k GPU clusters. We are sharing details on the hardware, network, storage, design, performance, and software that help us extr…

You might also wanna read

Guide to Calculating GPU Memory for Self-Hosted LLM Inference

The article provides a guide on calculating GPU memory requirements and managing concurrent requests for self-hosted large language model (L

Product Hunt·10mo ago

AI-Generated Metal Kernels Accelerate PyTorch Inference by 87% on Apple Devices

Researchers developed AI-generated Metal kernels that accelerate PyTorch inference on Apple devices by 87% across 215 modules. The study dem

gimletlabs.ai·10mo ago

EXO Labs Runs Llama 2 AI Model on 1997 Pentium II Using BitNet Optimization

EXO Labs successfully ran a lightweight Llama 2 AI model on a 1997 Pentium II processor with only 128 MB of RAM by leveraging BitNet's terna

news.bitcoin.com·1mo ago

GPULlama3.java: Llama Models Compiled to PTX/OpenCL Integrated with Quarkus Framework

The article presents GPULlama3.java, a project that integrates Llama models compiled to PTX/OpenCL with Quarkus framework. It provides insta

news.ycombinator.com·6mo ago

Meta Considering Charged Access for New AI Model 'Avocado'

Meta is reportedly developing a new AI model code-named Avocado that may represent a strategic shift from its previous open-source approach.

The Verge·6mo ago

LlamaFactory: Open-Source Framework for Efficient Fine-Tuning of 100+ LLMs and VLMs

LlamaFactory is an open-source framework for unified efficient fine-tuning of 100+ large language models (LLMs) and vision-language models (

github.com·9mo ago

Comments

No comments yet. Be the first.