Question 1

Did Sutton invent temporal difference learning?

Accepted Answer

No. He developed it with Andrew Barto in the early 1980s, building on Samuel’s checkers program and Klopf’s ideas about hedonistic learning. Sutton’s pivotal contribution was isolating and formalizing the bootstrapping mechanism: updating predictions using other predictions, rather than waiting for terminal outcomes. His 1988 paper 'Learning to Predict by the Methods of Temporal Differences' introduced TD(λ) and proved convergence for the TD(0) special case with linear predictors on absorbing Markov processes; convergence results for general λ came later, from other authors.
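The bootstrapping idea described above can be made concrete with a minimal tabular TD(0) sketch. This is an illustration under stated assumptions, not code from the 1988 paper: the function names `td0_update` and `random_walk_td0`, the five-state random walk, and the step size are all chosen here for the example.

```python
import random


def td0_update(values, state, reward, next_state, alpha=0.1, gamma=1.0, terminal=False):
    """Move values[state] toward the bootstrapped target reward + gamma * values[next_state]."""
    # At a terminal transition there is no successor prediction to bootstrap from.
    bootstrap = 0.0 if terminal else values[next_state]
    td_error = reward + gamma * bootstrap - values[state]
    values[state] += alpha * td_error
    return td_error


def random_walk_td0(num_states=5, episodes=2000, alpha=0.02, seed=0):
    """Estimate state values for a symmetric random walk (hypothetical toy task).

    Stepping off the right end pays reward 1, stepping off the left end pays 0.
    """
    rng = random.Random(seed)
    values = [0.5] * num_states
    for _ in range(episodes):
        state = num_states // 2
        while True:
            next_state = state + rng.choice((-1, 1))
            if next_state < 0:
                td0_update(values, state, 0.0, None, alpha, terminal=True)
                break
            if next_state >= num_states:
                td0_update(values, state, 1.0, None, alpha, terminal=True)
                break
            # The prediction is updated immediately, using the next prediction,
            # without waiting for the episode to end.
            td0_update(values, state, 0.0, next_state, alpha)
            state = next_state
    return values


if __name__ == "__main__":
    print([round(v, 2) for v in random_walk_td0()])
```

For this walk the true values are 1/6, 2/6, ..., 5/6, so the printed estimates should land near those fractions, with some noise because the step size is constant.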

Question 2

What is Sutton's 'RL manifesto' and why does it matter?

Accepted Answer

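His 2019 essay 'The Bitter Lesson' argues that general methods that leverage computation, chiefly search and learning, have consistently outperformed approaches built on human-designed knowledge over time. It reads less as a blanket dismissal of domain knowledge than as a historical observation: in computer chess, Go, speech recognition, and computer vision, methods that scaled with compute eventually overtook systems built around human expertise. Later large language models are often cited as further support. The essay is widely cited and debated in AI research, and it is frequently invoked in arguments about where labs and graduate students should direct their effort.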

Question 3

Why does Sutton emphasize 'reward is enough'?

Accepted Answer

In the 2021 paper 'Reward is Enough', written with David Silver, Satinder Singh, and Doina Precup, he hypothesizes that maximizing reward is sufficient to drive the abilities associated with intelligence, such as perception, language, and planning, without a specialized objective for each, provided the agent acts over long enough time horizons in rich enough environments. It’s a radical minimalism: not a claim that current systems achieve this, but a hypothesis about which ingredients are truly necessary.

Question 4

How did Sutton's work at AT&T Labs influence modern deep RL?

Accepted Answer

His 1990s experiments with tile coding and eligibility traces, reported mainly on simulated control tasks such as mountain car and acrobot rather than on physical robots, demonstrated that online, incremental, memory-efficient learning could handle continuous-state control. These engineering choices (state aggregation, sparse updates, temporal credit assignment) anticipated patterns that recur in deep RL. The most direct line from his AT&T years is the policy-gradient theorem he co-authored there, which underlies actor-critic methods such as A3C and PPO.
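Tile coding and eligibility traces, the two techniques named above, can be sketched in a few lines. This is a minimal illustration for a single scalar input, assuming evenly offset tilings and accumulating traces. The names `active_tiles` and `td_lambda_step` are hypothetical and do not correspond to any published tile-coding software.

```python
def active_tiles(x, num_tilings=4, tiles_per_tiling=8, low=0.0, high=1.0):
    """Return one active feature index per tiling for a scalar x in [low, high]."""
    scaled = (x - low) / (high - low) * tiles_per_tiling
    indices = []
    for t in range(num_tilings):
        offset = t / num_tilings      # each tiling is shifted by a fraction of a tile
        tile = int(scaled + offset)   # falls in 0..tiles_per_tiling inclusive
        indices.append(t * (tiles_per_tiling + 1) + tile)
    return indices


def td_lambda_step(weights, traces, features, reward, next_features, alpha, gamma, lam):
    """One linear TD(lambda) update with accumulating traces over binary features.

    Pass an empty next_features list for a terminal transition.
    """
    value = sum(weights[i] for i in features)
    next_value = sum(weights[i] for i in next_features)
    td_error = reward + gamma * next_value - value
    # Decay every trace, then mark the currently active tiles.
    for i in range(len(traces)):
        traces[i] *= gamma * lam
    for i in features:
        traces[i] += 1.0
    # Credit flows to every recently active feature, in proportion to its trace.
    for i, trace in enumerate(traces):
        if trace != 0.0:
            weights[i] += alpha * trace * td_error
    return td_error


if __name__ == "__main__":
    num_features = 4 * (8 + 1)
    weights = [0.0] * num_features
    traces = [0.0] * num_features
    td_lambda_step(weights, traces, active_tiles(0.5), 1.0, [],
                   alpha=0.1, gamma=1.0, lam=0.9)
    print(sum(weights[i] for i in active_tiles(0.55)))
```

Because nearby inputs share most of their active tiles, an update at one point generalizes to its neighbors, while each step touches only a handful of weights. In practice the step size is often divided by the number of tilings; that scaling is left out here for brevity.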
