Exploring RLHF on every prompt for local coding models
By cloudking
Summary
A Hacker News user explores the idea of using Reinforcement Learning from Human Feedback (RLHF) on every prompt with a medium-sized local model to personalize it for daily coding tasks. The user questions whether manually fine-tuning a model to individual use habits would improve it or degrade performance, specifically targeting common LLM annoyances like sycophancy, verbosity, and overuse of analogies. The post reflects on whether individual prompt feedback is sufficient for meaningful model improvement.
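The post does not describe an implementation, so the following is only a minimal sketch of what "feedback on every prompt" could look like as a data-collection step. It assumes each piece of feedback is stored as a preferred and a rejected response to the same prompt, a layout commonly used for preference datasets. The names `Feedback`, `log_feedback`, `load_pairs`, and the tag set are hypothetical; the tags simply mirror the annoyances the post lists. Training a reward model or running the actual RL update is out of scope here.

```python
import json
from dataclasses import dataclass, asdict, field
from pathlib import Path

# Hypothetical tag set, mirroring the annoyances named in the post.
TAGS = {"sycophantic", "verbose", "analogy"}


@dataclass
class Feedback:
    prompt: str
    chosen: str    # response the user preferred (or rewrote by hand)
    rejected: str  # response the user disliked
    tags: list[str] = field(default_factory=list)  # why it was rejected


def log_feedback(path: Path, fb: Feedback) -> None:
    """Append one preference record to a JSONL file."""
    unknown = set(fb.tags) - TAGS
    if unknown:
        raise ValueError(f"unknown tags: {sorted(unknown)}")
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(fb)) + "\n")


def load_pairs(path: Path) -> list[dict]:
    """Read every logged record back as a list of dicts."""
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Logging one record per prompt keeps the chore small; whether a single user's accumulated pairs are enough to shift a model without degrading it is exactly the open question the post raises.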
Key quotes
I've been wondering lately if it would help to take a medium sized model and either in cloud or some local setup actually do Reinforcement Learning from Human Feedback (RLHF) on every prompt as a chore
I don't know if trying to manually finetune a model to your use habits would ruin it or help
ideally if you were diligent you could get rid of some of the ticks that make models for the general public difficult to work with e.g. overly sycophantic, overly verbose, annoying tendency to explain via analogies
but perhaps one individuals prompt feedback just isn't g
