Mathematical Unification of Decision Trees and Diffusion Models via Global Trajectory Score Matching
[Submitted on 1 May 2026 (v1), last revised 21 May 2026 (this version, v2)]
Summary
This paper establishes a mathematical correspondence between hierarchical decision trees and diffusion processes in appropriate limiting regimes, unifying two seemingly disparate model classes. The authors identify Global Trajectory Score Matching (GTSM) as a shared optimization principle, for which an idealized version of gradient boosting is asymptotically optimal. Two practical applications are demonstrated: TreeFlow, which achieves competitive generation quality on tabular data with higher fidelity and a 2× computational speedup, and DSMTree, a distillation method that transfers hierarchical decision logic into neural networks, matching teacher performance within 2% on many benchmarks.
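For readers less familiar with the tree side of the correspondence, the sketch below shows ordinary gradient boosting with depth-1 trees under squared-error loss, where each round fits a small tree to the negative gradient (the residual) of the current ensemble. It is a generic textbook illustration, not the paper's GTSM objective, TreeFlow, or DSMTree; the function names (`fit_stump`, `boost`, `predict`, `mse`) and the toy sine data are assumptions made for this example.

```python
import math

# Illustrative sketch only: generic gradient boosting with decision stumps
# under squared-error loss. This is NOT the paper's GTSM or TreeFlow procedure.

def fit_stump(xs, residuals):
    """Return (threshold, left_value, right_value) minimising squared error."""
    best = None
    order = sorted(set(xs))
    # Candidate thresholds are midpoints between consecutive distinct inputs,
    # so both sides of every split are non-empty.
    for lo, hi in zip(order, order[1:]):
        t = (lo + hi) / 2.0
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lv = sum(left) / len(left)
        rv = sum(right) / len(right)
        err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lv, rv)
    return best[1:]

def predict(model, x):
    """Evaluate the additive ensemble: base value plus shrunken stump outputs."""
    base, lr, stumps = model
    return base + lr * sum(lv if x <= t else rv for t, lv, rv in stumps)

def boost(xs, ys, n_rounds=50, lr=0.1):
    """Fit stumps sequentially to the residuals of the current ensemble."""
    base = sum(ys) / len(ys)
    stumps = []
    for _ in range(n_rounds):
        model = (base, lr, stumps)
        # For the loss 0.5 * (y - f)^2, the negative gradient with respect
        # to the prediction f is exactly the residual y - f.
        residuals = [y - predict(model, x) for x, y in zip(xs, ys)]
        stumps.append(fit_stump(xs, residuals))
    return (base, lr, stumps)

def mse(model, xs, ys):
    """Mean squared error of the ensemble on the given points."""
    return sum((y - predict(model, x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Toy data (an assumption for this example): samples of sin(x) on [0, 4).
xs = [i / 10.0 for i in range(40)]
ys = [math.sin(x) for x in xs]

baseline = boost(xs, ys, n_rounds=0)   # predicts the mean of ys everywhere
short = boost(xs, ys, n_rounds=5)
full = boost(xs, ys, n_rounds=50)
print(mse(baseline, xs, ys), mse(short, xs, ys), mse(full, xs, ys))
```

The training error shrinks as rounds are added because each stump is a least-squares fit to the current residuals. How this stagewise procedure relates to score matching along a diffusion trajectory is the subject of the paper itself and is not captured by this toy example.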
Key quotes
Decision trees and diffusion models are ostensibly disparate model classes, one discrete and hierarchical, the other continuous and dynamic.
This work unifies the two by establishing a crisp mathematical correspondence between hierarchical decision trees and diffusion processes in appropriate limiting regimes.
Our unification reveals a shared optimization principle: Global Trajectory Score Matching (GTSM), for which gradient boosting (in an idealized version) is asymptotically optimal.
TreeFlow achieves competitive generation quality on tabular data with higher fidelity and a 2× computational speedup.
DSMTree is a novel distillation method that transfers hierarchical decision logic into neural networks, matching teacher performance within 2% on many benchmarks.
