
LLM Circuit Finder: Duplicating Specific Layers in Transformer Models Improves Reasoning Performance Without Training

By xlayn

2mo ago · 6 min read · en · Code

Summary

The article describes a GitHub project, 'llm-circuit-finder', that implements a method for discovering and exploiting 'reasoning circuits' in transformer-based large language models. The author replicated Ng's RYS method and reports that duplicating specific layers in Qwen2.5-32B and Devstral-24B improves reasoning performance without any training or weight changes. For Qwen2.5-32B, duplicating 3 specific layers boosted reasoning by 17%; for Devstral-24B, duplicating layers 12-14 raised the logical deduction score on the BBH benchmark from 0.22 to 0.76. The technique routes hidden states through the same circuit twice, and the toolkit includes tools for finding such circuits. The author completed the project in one evening using two AMD GPUs.
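To make "routing hidden states through the same circuit twice" concrete, here is a minimal sketch of the idea. It is not the project's code. It assumes a toy 20-layer stack where each layer is a plain Python function, and the names `duplicated_order` and `run` are hypothetical. The point it illustrates is that only the execution order changes: the layer list itself, and therefore the weights, stay untouched.

```python
from typing import Callable, List, Sequence

# Toy stand-in for a transformer layer: maps a hidden state to a hidden state.
Layer = Callable[[float], float]


def duplicated_order(num_layers: int, start: int, end: int) -> List[int]:
    """Return the layer execution order with block [start, end] (inclusive) run twice."""
    if not (0 <= start <= end < num_layers):
        raise ValueError("block must lie inside the layer stack")
    block = list(range(start, end + 1))
    # Layers 0..end, then the block again, then the remaining layers.
    return list(range(0, end + 1)) + block + list(range(end + 1, num_layers))


def run(layers: Sequence[Layer], order: Sequence[int], hidden: float) -> float:
    """Apply layers in the given order; repeated indices reuse the same layer object."""
    for idx in order:
        hidden = layers[idx](hidden)
    return hidden


# Assumption: a toy 20-layer model in which every layer adds 1 to the hidden state,
# so the output simply counts how many layer applications happened.
layers: List[Layer] = [lambda h: h + 1.0 for _ in range(20)]

order = duplicated_order(num_layers=20, start=12, end=14)
result = run(layers, order, 0.0)
```

In this toy setup the stack still holds 20 layers, but the forward pass performs 23 layer applications because layers 12, 13, and 14 each run twice.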

Key quotes

Duplicate 3 layers. No training. Logical deduction goes from 0.22 → 0.76.
This toolkit finds and exploits 'reasoning circuits' hidden inside transformer models.
The idea: certain contiguous blocks of layers act as indivisible cognitive units.
I replicated Ng's RYS method and found that duplicating 3 specific layers in Qwen2.5-32B boosts reasoning by 17%
no training, no weight changes, just routing hidden states through the same circuit twice
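The quotes say circuits are contiguous blocks of layers, but not how the toolkit searches for them. One plausible approach, shown here purely as an assumption, is a brute-force sweep: enumerate every contiguous block up to some length, score the model with that block duplicated, and keep the best. `candidate_blocks`, `best_block`, and the scoring function are hypothetical names; in practice the score would come from a benchmark run, which is replaced here by a toy function.

```python
from typing import Callable, Iterator, Tuple

Block = Tuple[int, int]  # (start, end), both inclusive


def candidate_blocks(num_layers: int, min_len: int = 1, max_len: int = 4) -> Iterator[Block]:
    """Yield every contiguous block of layers whose length is in [min_len, max_len]."""
    for length in range(min_len, max_len + 1):
        for start in range(0, num_layers - length + 1):
            yield (start, start + length - 1)


def best_block(num_layers: int, score_fn: Callable[[Block], float],
               min_len: int = 1, max_len: int = 4) -> Block:
    """Return the block with the highest score under score_fn."""
    return max(candidate_blocks(num_layers, min_len, max_len), key=score_fn)


def toy_score(block: Block) -> float:
    # Assumption: a made-up score that peaks at block (12, 14).
    # A real score would be a benchmark result with this block duplicated.
    start, end = block
    return -abs(start - 12) - abs(end - 14)


num_candidates = sum(1 for _ in candidate_blocks(20))
winner = best_block(20, toy_score)
```

Each candidate costs one full benchmark evaluation, so the sweep grows with both the layer count and the maximum block length.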
Snippet from the RSS feed
I replicated Ng's RYS method and found that duplicating 3 specific layers in Qwen2.5-32B boosts reasoning by 17% and duplicating layers 12-14 in Devstral-24B improves logical deduction from 0.2...
