Comprehensive Survey of Reasoning Failures in Large Language Models

By T-A

3mo ago · 2 min read · en · Insight

Summary

This article presents a comprehensive survey of reasoning failures in Large Language Models (LLMs). It introduces a categorization framework that divides reasoning into embodied and non-embodied types, with non-embodied reasoning further split into informal (intuitive) and formal (logical) reasoning. Along a complementary axis, the survey classifies reasoning failures into three types: fundamental failures intrinsic to LLM architectures, application-specific limitations that appear in particular domains, and robustness issues, where performance is inconsistent across minor variations of the same task. For each failure type, the authors provide a definition, analyze existing studies, explore root causes, and present mitigation strategies. The aim is to unify fragmented research efforts and guide future work toward more reliable LLM reasoning.
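To make the robustness category concrete, here is a minimal Python sketch of one way to quantify "inconsistent performance across minor variations": ask the same question in several phrasings and measure how often the answers agree. This is an illustration only, not a method taken from the survey. The names `consistency_rate` and `toy_model` are hypothetical, and `toy_model` is a deliberately brittle stand-in for a real LLM call.

```python
from collections import Counter
from typing import Callable, Sequence


def consistency_rate(model: Callable[[str], str], variants: Sequence[str]) -> float:
    """Return the fraction of prompt variants whose answer equals the most common answer."""
    if not variants:
        raise ValueError("need at least one prompt variant")
    # Normalize answers lightly so trivial formatting differences do not count as disagreement.
    answers = [model(prompt).strip().lower() for prompt in variants]
    _, top_count = Counter(answers).most_common(1)[0]
    return top_count / len(answers)


def toy_model(prompt: str) -> str:
    # Hypothetical stand-in for an LLM (assumption, not a real API):
    # it answers "5" unless the prompt contains the word "sum".
    return "6" if "sum" in prompt else "5"


# Four paraphrases of the same arithmetic question.
variants = [
    "What is 2 + 3?",
    "What is 3 + 2?",
    "Compute 2 plus 3.",
    "What is the sum of 2 and 3?",
]

rate = consistency_rate(toy_model, variants)
print(f"consistency rate: {rate:.2f}")
```

A rate of 1.0 would mean the model answered every paraphrase the same way. Note that agreement is not the same as correctness: a model can be consistently wrong, so a real evaluation would also compare answers against a reference.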

Key quotes

Large Language Models (LLMs) have exhibited remarkable reasoning capabilities, achieving impressive results across a wide range of tasks. Despite these advances, significant reasoning failures persist, occurring even in seemingly simple scenarios.
We introduce a novel categorization framework that distinguishes reasoning into embodied and non-embodied types, with the latter further subdivided into informal (intuitive) and formal (logical) reasoning.
We classify reasoning failures along a complementary axis into three types: fundamental failures intrinsic to LLM architectures that broadly affect downstream tasks; application-specific limitations that manifest in particular domains; and robustness issues characterized by inconsistent performance across minor variations.
By unifying fragmented research efforts, our survey provides a structured perspective on systemic weaknesses in LLM reasoning, offering valuable insights and guiding future research towards building stronger, more reliable, and robust reasoning capabilities.
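The two axes described in the quotes above can be sketched as a small data structure. The following Python is a reading of the taxonomy as quoted, not code from the survey: the class names, the `taxonomy_cells` helper, and the example case (including how it is classified) are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class ReasoningType(Enum):
    """First axis: the kind of reasoning involved."""
    EMBODIED = "embodied"
    INFORMAL = "non-embodied, informal (intuitive)"
    FORMAL = "non-embodied, formal (logical)"

    @property
    def is_embodied(self) -> bool:
        return self is ReasoningType.EMBODIED


class FailureType(Enum):
    """Second, complementary axis: the kind of failure."""
    FUNDAMENTAL = "intrinsic to LLM architectures"
    APPLICATION_SPECIFIC = "manifests in particular domains"
    ROBUSTNESS = "inconsistent performance across minor variations"


@dataclass(frozen=True)
class FailureCase:
    """One observed failure, placed on both axes."""
    description: str
    reasoning: ReasoningType
    failure: FailureType


def taxonomy_cells() -> list[tuple[ReasoningType, FailureType]]:
    """Enumerate every (reasoning type, failure type) combination."""
    return [(r, f) for r in ReasoningType for f in FailureType]


# Hypothetical example case; its classification is an assumption, not taken from the survey.
example = FailureCase(
    description="Answer changes when the premises of a logic puzzle are reordered",
    reasoning=ReasoningType.FORMAL,
    failure=FailureType.ROBUSTNESS,
)

print(len(taxonomy_cells()), "cells;", example.reasoning.value, "/", example.failure.value)
```

Treating the axes as independent enums reflects the quoted description of the failure types as a "complementary axis": any failure can, in principle, be tagged with one value from each.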