We propose an iterative self-training framework, Agent-R, that enables a language Agent to Reflect on the fly. Unlike traditional methods that reward or penalize actions solely based on correctness, our ...