mHC: A Manifold-Constrained Framework to Stabilize and Scale Hyper-Connections in Neural Networks
By ipnon
Summary
This paper introduces Manifold-Constrained Hyper-Connections (mHC), a general framework that addresses the training instability and limited scalability of Hyper-Connections (HC) by projecting HC's residual connection space onto a specific manifold, which restores the identity mapping property. The authors pair this with infrastructure optimizations that reduce memory access overhead while preserving the performance gains. Empirical results show that mHC delivers tangible performance improvements and superior scalability when training at scale, and the authors position it as a flexible, practical extension of HC that may inform the evolution of foundational models.
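To make the idea of constraining residual mixing to a manifold more concrete, here is a minimal sketch. The summary does not name the manifold, so the example assumes, purely for illustration, the set of doubly stochastic matrices (rows and columns each sum to 1), reached by Sinkhorn-style alternating normalization. The function names `sinkhorn_project`, `matmul`, and `max_sum_error`, the stream count, and the depth are all hypothetical and are not taken from the paper. The point it illustrates: the identity matrix lies in this set, and a product of such matrices stays in the set, so stacking many layers neither amplifies nor shrinks the total signal carried across the residual streams.

```python
import math
import random


def sinkhorn_project(logits, n_iters=50):
    """Map an n x n matrix of real scores to an (approximately) doubly
    stochastic matrix by alternating row and column normalization.

    Assumption: this stands in for the paper's manifold projection;
    the summary does not specify the actual manifold or procedure.
    """
    n = len(logits)
    # Exponentiate so every entry is strictly positive.
    m = [[math.exp(x) for x in row] for row in logits]
    for _ in range(n_iters):
        # Make each row sum to 1.
        m = [[x / sum(row) for x in row] for row in m]
        # Make each column sum to 1.
        col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
        m = [[m[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    return m


def matmul(a, b):
    """Plain n x n matrix product on nested lists."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def max_sum_error(m):
    """Largest deviation of any row sum or column sum from 1."""
    n = len(m)
    row_err = max(abs(sum(row) - 1.0) for row in m)
    col_err = max(abs(sum(m[i][j] for i in range(n)) - 1.0) for j in range(n))
    return max(row_err, col_err)


random.seed(0)
n_streams = 4  # hypothetical number of parallel residual streams
depth = 32     # hypothetical number of stacked layers

# Start from the identity mapping and compose one constrained mixing
# matrix per layer, as a deep stack of residual mixings would.
product = [
    [1.0 if i == j else 0.0 for j in range(n_streams)]
    for i in range(n_streams)
]
for _ in range(depth):
    logits = [
        [random.uniform(-1.0, 1.0) for _ in range(n_streams)]
        for _ in range(n_streams)
    ]
    product = matmul(sinkhorn_project(logits), product)

product_error = max_sum_error(product)
print(f"row/column sum error after {depth} layers: {product_error:.2e}")
```

Under this assumption, the composed mapping across all layers still has rows and columns that sum to 1, which is one way a constrained residual path can keep the conservation behavior of a plain identity shortcut. Unconstrained mixing matrices carry no such guarantee.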
Key quotes
We propose Manifold-Constrained Hyper-Connections (mHC), a general framework that projects the residual connection space of HC onto a specific manifold to restore the identity mapping property, while incorporating rigorous infrastructure optimization to ensure efficiency.
Empirical experiments demonstrate that mHC is effective for training at scale, offering tangible performance improvements and superior scalability.
We anticipate that mHC, as a flexible and practical extension of HC, will contribute to a deeper understanding of topological architecture design and suggest promising directions for the evolution of foundational models.
