Paper-replication workflow enables coding agents to systematically replicate computational claims from ML papers
[Submitted on 2 Jul 2026]
Summary
This paper introduces Paper-replication, a workflow that enables coding agents to systematically replicate computational claims from scientific machine learning papers. The workflow requires agents to record target claims, reconstruct methods, run experiments, link outputs to their provenance, and pass validation checks. In an evaluation of twelve runs across four papers, all twelve workspaces passed the completion gate and all 158 recorded targets were matched with report coverage. Repeated runs still differed in how papers were divided into targets, in numerical fidelity to the source papers, and in elapsed replication time.
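As a rough illustration of a completion gate that depends on workspace evidence rather than on the agent's final message, the sketch below checks each recorded target for a reproduced value, provenance, and report coverage. It is not the paper's implementation: the `Target` fields, the `gate_failures` function, and the relative-tolerance acceptance rule are all hypothetical, and the summary notes that the rules used to accept evidence actually varied between runs.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Target:
    """One claim selected from a paper. All field names are hypothetical."""
    claim: str
    reported: float                     # value stated in the paper
    reproduced: Optional[float] = None  # value obtained by the replication run
    provenance: List[str] = field(default_factory=list)  # e.g. script or log paths
    in_report: bool = False             # whether the final report covers this target


def gate_failures(targets: List[Target], rel_tol: float = 0.05) -> List[str]:
    """Return the reasons the gate would fail; an empty list means it passes.

    The relative tolerance is an assumed acceptance rule, not one from the paper.
    """
    failures = []
    for t in targets:
        if t.reproduced is None:
            failures.append(f"{t.claim}: no reproduced value")
        elif abs(t.reproduced - t.reported) > rel_tol * abs(t.reported):
            failures.append(f"{t.claim}: outside tolerance")
        if not t.provenance:
            failures.append(f"{t.claim}: no provenance recorded")
        if not t.in_report:
            failures.append(f"{t.claim}: not covered in report")
    return failures
```

A workspace would count as complete in this sketch only when `gate_failures(targets)` returns an empty list, regardless of what the agent says in its closing message.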
Key quotes · 4 pulled
We introduce Paper-replication, a workflow that makes each selected paper claim a target with recorded evidence, and implement it as a coding-agent skill.
All twelve workspaces pass the completion gate, and all 158 recorded targets are matched with report coverage.
Paper-replication makes completion depend on workspace evidence and validation checks rather than on the agent's final message.
Even in this completed workspace state, repeated runs differ in how papers are divided into targets, in numerical fidelity to the source papers, in elapsed replication time, in the number of intermediate executions replaced before final evidence is accepted, and in the rules used to accept evidence.