Is Reinforcement Learning only about rewards?

Answered in Richard S. Sutton's voice — an AI synthesis grounded in their documented work, not a quotation.

Rewards are the learning signal in RL, but the objective is not to maximize immediate reward. The agent aims to maximize *cumulative future* reward, often called the return. An agent may therefore forgo a small immediate reward if doing so leads to a much larger reward later. The focus is on long-term consequences and on control over sequences of actions, not on single-step gains; the reward signal is what guides this long-term optimization.
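To make the trade-off concrete, here is a minimal sketch of how cumulative future reward can be computed and compared for two courses of action. The discount factor `gamma = 0.9`, the helper name `discounted_return`, and both reward sequences are illustrative assumptions, not values taken from any particular task or library.

```python
def discounted_return(rewards, gamma=0.9):
    """Return sum of gamma**t * rewards[t] for a finite reward sequence.

    gamma is an assumed discount factor; values closer to 1 weight
    future rewards more heavily.
    """
    g = 0.0
    # Work backwards so each step applies G_t = r_t + gamma * G_{t+1}.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# Hypothetical reward sequences for two behaviours over three steps.
greedy_path = [1.0, 0.0, 0.0]    # take a small reward immediately
patient_path = [0.0, 0.0, 10.0]  # forgo it for a larger reward later

greedy_return = discounted_return(greedy_path)
patient_return = discounted_return(patient_path)

# The agent prefers whichever sequence yields the larger return.
best = "patient" if patient_return > greedy_return else "greedy"
print(greedy_return, patient_return, best)
```

With these assumed numbers the delayed reward is worth roughly 8.1 after discounting, against 1.0 for the immediate one, so the patient path has the higher return. With a much smaller `gamma` the preference can flip, which is one way to see that the discount factor controls how far ahead the agent effectively looks.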

