last-mile

Experiment 25

23 minute read

This post demonstrates that reinforcement learning agents can be trained with sparse reward signals as effectively as carefully tuned dense rewards.