RunRL

ag8

8mo ago · en · News

Summary

RunRL is a platform for improving models and agents with reinforcement learning. Users choose an open-weight base model, upload a set of prompts, and define a reward function in Python, with an LLM-as-a-judge, or both; RunRL handles the GPU clusters. Pricing is $80/node-hour, and tool use is in private beta.

Snippet from the RSS feed

Hey HN, we’re Andrew and Derik at RunRL (https://runrl.com/). We've built a platform to improve models and agents with reinforcement learning. If you can define a metric, we'll make your model or agent better, without you having to think about managing GPU clusters.

Here's a demo video: https://youtu.be/EtiBjs4jfCg

I (Andrew) was doing a PhD in reinforcement learning on language models, and everyone kept...not using RL because it was too hard to get running. At some point I realized that someone's got to sit down and actually write a good platform for running RL experiments.

Once this happened, people started using it for antiviral design, formal verification, browser agents, and a bunch of other cool applications, so we decided to make a startup out of it.

How it works:

- Choose an open-weight base model (weights are necessary for RL updates; Qwen3-4B-Instruct-2507 is a good starting point)

- Upload a set of initial prompts ("Generate an antiviral targeting SARS-CoV-2 protease", "Prove this theorem", "What's the average summer high in Windhoek?")

- Define a reward function, using Python, an LLM-as-a-judge, or both

- For complex settings, you can define an entire multi-turn environment

- Watch the reward go up!
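
To make the "define a reward function" step concrete, here is a rough sketch of what a Python reward for a numeric question-answering prompt might look like. The post doesn't show the signature RunRL actually expects, so the function name, its arguments (`prompt`, `completion`, `expected`, `tolerance`), and the scoring rule are all assumptions for illustration, and the reference value in the usage lines is an arbitrary placeholder, not real climate data.

```python
import re


def reward(prompt: str, completion: str, expected: float, tolerance: float = 2.0) -> float:
    """Hypothetical reward: score the first number found in the completion.

    Returns 1.0 for an exact match, decays linearly to 0.0 once the error
    reaches `tolerance`, and returns 0.0 if the completion has no number.
    `prompt` is unused here; it is kept because a reward would usually see it.
    """
    match = re.search(r"-?\d+(?:\.\d+)?", completion)
    if match is None:
        return 0.0
    error = abs(float(match.group()) - expected)
    return max(0.0, 1.0 - error / tolerance)


# Usage with a placeholder reference value (30.0 is arbitrary):
question = "What's the average summer high in Windhoek?"
print(reward(question, "About 30 degrees", expected=30.0))
print(reward(question, "No idea, sorry.", expected=30.0))
```

A graded score like this gives the model partial credit for near misses, which tends to be an easier signal to learn from than a strict pass/fail check, though which shape works best depends on the task.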

For most well-defined problems, a small open model + RunRL outperforms frontier models. (For instance, we've seen Qwen-3B do better than Claude 4.1 Opus on antiviral design.) This is because LLM intelligence is notoriously "spiky"; often models are decent-but-not-great at common-sense knowledge, are randomly good at a few domains, but make mistakes on lots of other tasks. RunRL creates spikes precisely on the tasks where you need them.

Pricing: $80/node-hour. Most models up to 14B parameters fit on one node (0.6-1.2 TB of VRAM). We do full fine-tuning, at the cost of parameter-efficiency (with RL, people seem to care a lot about the last few percent gains in e.g. agent reliability).
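
As a back-of-the-envelope illustration of the pricing, the cost of a run is just nodes times hours times the hourly rate. The sketch below assumes simple linear billing with no minimums or rounding, which the post doesn't specify; the run length in the example is hypothetical.

```python
PRICE_PER_NODE_HOUR = 80.0  # USD, the rate quoted in the post


def estimate_cost(nodes: int, hours: float) -> float:
    """Estimate run cost in USD, assuming plain linear billing (an assumption)."""
    return nodes * hours * PRICE_PER_NODE_HOUR


# Hypothetical run: one node for six hours.
print(estimate_cost(nodes=1, hours=6))  # 480.0
```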

Next up: continuous learning; tool use. Tool use is currently in private beta, which you can join here: https://forms.gle/D2mSmeQDVCDraPQg8

We'd love to hear any thoughts, questions, or positive or negative reinforcement!

Comments URL: https://news.ycombinator.com/item?id=45277704

Points: 2

# Comments: 0
