All Topics
All Topics
Technology
Technology
AI
AI
Business
Business
Entertainment
Entertainment
News
News
Programming
Programming
Security
Security
Science
Science
Design
Design
Environment
Environment
Finance
Finance
Crypto
Crypto
Politics
Politics
Sports
Sports
Education
Education
Gaming
Gaming
Art
Art
Music
Music
Health
Health
Books
Books
Food
Food
Travel
Travel
Personal
Personal
Bluesky
Twitter

University of Twente researchers find GPU clock adjustment can cut LLM training energy by 14% without speed loss

By

Dina Genkina

24d ago· 5 min readenNews

Summary

Researchers at the University of Twente have discovered that dynamically adjusting GPU clock frequency during LLM training can save up to 14% of energy without sacrificing performance. This technique, which involves fine-tuning clock speeds during different computational phases, addresses the massive energy consumption of training large language models like GPT-4, which used an estimated 50 GWh of electricity. The approach offers a practical way to reduce the environmental impact of AI training without requiring hardware changes or slowing down training times.

Source

bskyUniversity of Twente researchers find GPU clock adjustment can cut LLM training energy by 14% without speed lossspectrum.ieee.org

Key quotes

· 3 pulled
OpenAI's fourth large language model (LLM), GPT-4, took an estimated 50 Gigawatt-hours to train, or the equivalent of 5,000 American homes' yearly power consumption.
A research group at University of Twente in the Netherlands has shown that you can save up to 14 percent of the energy used in LLM training without sacrificing speed by cleverly adjusting the clock frequency of the GPU during computation.
Jeffrey Spaan, Ph.D. candidate
Snippet from the RSS feed
Adjusting clocking frequency during computation can save energy without affecting performance

You might also wanna read

Unsloth and NVIDIA Partner to Accelerate LLM Fine-Tuning by 20%

Unsloth has partnered with NVIDIA to optimize fine-tuning of large language models, achieving 20% faster training speeds. The collaboration

Unsloth - Train and Run Models Locally·1mo ago

Guide to Calculating GPU Memory for Self-Hosted LLM Inference

The article provides a guide on calculating GPU memory requirements and managing concurrent requests for self-hosted large language model (L

Product Hunt·11mo ago

A Practical Guide to Scaling Language Models: From Single Accelerators to Thousands

This article/book excerpt demystifies the science of scaling language models, explaining how TPUs and GPUs work, how they communicate, how L

jax-ml.github.io·9d ago

A Practical Guide to Scaling Language Models: From Single Accelerators to Thousands

This article/book excerpt demystifies the science of scaling language models, explaining how TPUs and GPUs work, how they communicate, how L

jax-ml.github.io·9d ago

Technical Analysis of LLM Inference Engines: Exploring Nano-vLLM Architecture and Scheduling

This article provides an in-depth technical exploration of LLM inference engines, focusing on Nano-vLLM as a case study. It explains the cri

neutree.ai·5mo ago

Breakthrough: 1.3-Second Cross-Machine Weight Transfer for Trillion-Parameter AI Models

Researchers have achieved ultra-fast 1.3-second cross-machine parameter updates for trillion-parameter AI models (Kimi-K2 with 1T parameters

research.perplexity.ai·5mo ago

NanoGPT Slowrun Achieves 10x Data Efficiency Breakthrough in Language Model Training

Researchers have achieved 10x data efficiency with NanoGPT Slowrun, where an ensemble of 1.8B parameter models (totaling 18B parameters) tra

qlabs.sh·3mo ago

Comments

Sign in to join the conversation.

No comments yet. Be the first.