Program of Thoughts: Separating Computation from Reasoning in Language Models for Numerical Tasks

mkagenius

6mo ago · 2 min read

Summary

The article introduces "Program of Thoughts" (PoT), a prompting approach that disentangles computation from reasoning in language models on numerical reasoning tasks. Chain-of-thoughts (CoT) prompting has the language model carry out both the reasoning and the computation. PoT instead uses language models (mainly Codex) to express the reasoning process as a program and delegates the computation to an external computer, which executes the program to derive the answer. The method was evaluated on five math word problem datasets and three financial-QA datasets, where it shows an average performance gain of around 12% over CoT in both few-shot and zero-shot settings. Combined with self-consistency decoding, PoT achieves state-of-the-art performance on all of the math datasets and near-state-of-the-art performance on the financial datasets.
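To make the split between reasoning and computation concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation: the word problem, the program text in `GENERATED_PROGRAM`, the `run_program` helper, and the convention of storing the result in a variable named `ans` are all hypothetical. In a real pipeline the program text would come from a language model; here it is hard-coded.

```python
# Hypothetical word problem: "A deposit of 1000 earns 5% interest,
# compounded yearly. How much is it worth after 10 years?"

# Stand-in for the program a language model might generate. The model only
# has to express the reasoning; it never computes 1.05 ** 10 itself.
GENERATED_PROGRAM = """
principal = 1000
rate = 0.05
years = 10
ans = principal * (1 + rate) ** years
"""


def run_program(source, answer_var="ans"):
    """Execute program text and return the value bound to `answer_var`.

    Plain exec() is NOT a sandbox. Model-generated code should only be run
    in an isolated environment; this helper is for illustration.
    """
    namespace = {}
    exec(source, namespace)
    return namespace[answer_var]


if __name__ == "__main__":
    # The interpreter, not the language model, performs the arithmetic.
    print(round(run_program(GENERATED_PROGRAM), 2))
```

The exponentiation is exactly the kind of step a language model may get wrong when it computes inside free-form text, while an interpreter evaluates it reliably.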

Key quotes

To disentangle computation from reasoning, we propose 'Program of Thoughts' (PoT), which uses language models (mainly Codex) to express the reasoning process as a program.

The computation is relegated to an external computer, which executes the generated programs to derive the answer.

Under both few-shot and zero-shot settings, PoT can show an average performance gain over CoT by around 12% across all the evaluated datasets.

By combining PoT with self-consistency decoding, we can achieve SoTA performance on all math problem datasets and near-SoTA performance on financial datasets.
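The combination with self-consistency decoding mentioned in the last quote can be sketched as follows: sample several programs for the same question, execute each one, and keep the most common result. This is a hedged sketch that assumes a simple majority vote over executed answers; the exact voting details used in the paper are not given in this summary. The names `execute`, `self_consistent_answer`, and `SAMPLED_PROGRAMS` are hypothetical, and the sampled programs are hard-coded stand-ins for model output.

```python
from collections import Counter


def execute(source, answer_var="ans"):
    """Run one program and return its answer, or None if it fails.

    Plain exec() is not a sandbox; isolate untrusted code in practice.
    """
    namespace = {}
    try:
        exec(source, namespace)
        return namespace[answer_var]
    except Exception:
        return None


def self_consistent_answer(programs, ndigits=6):
    """Majority vote over the numeric answers of the sampled programs."""
    results = [execute(p) for p in programs]
    # Round so that tiny floating-point differences count as the same vote,
    # and skip programs that crashed or produced a non-numeric answer.
    votes = Counter(
        round(r, ndigits) for r in results if isinstance(r, (int, float))
    )
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Stand-ins for several sampled programs answering the same question.
SAMPLED_PROGRAMS = [
    "ans = 12 * 7 + 6",               # one reasoning path
    "ans = 6 + 7 * 12",               # a different path, same result
    "ans = (12 + 6) * 7",             # faulty reasoning, different result
    "ans = 12 * 7 + undefined_name",  # crashes and is ignored
]

if __name__ == "__main__":
    print(self_consistent_answer(SAMPLED_PROGRAMS))
```

Voting over executed results rather than program text means two differently written programs that reach the same number reinforce each other.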

Snippet from the RSS feed

Recently, there has been significant progress in teaching language models to perform step-by-step reasoning to solve complex numerical reasoning tasks. Chain-of-thoughts prompting (CoT) is by far the state-of-art method for these tasks. CoT uses language
