Saturday, October 11, 2025

[2510.07312] h1: Bootstrapping LLMs to Reason over Longer Horizons via Reinforcement Learning

https://arxiv.org/abs/2510.07312

- Steve
