Richard S. Sutton
| Richard S. Sutton FRS FRSC | |
|---|---|
| Nationality | Canadian |
| Citizenship | Canadian |
| Alma mater | University of Massachusetts Amherst; Stanford University |
| Known for | Temporal difference learning, Dyna, Options, GQ(λ) |
| Awards | AAAI Fellow (2001); President's Award (INNS) (2003); Royal Society of Canada Fellow (2016); Turing Award (2025) |
| Scientific career | |
| Fields | Artificial intelligence; reinforcement learning |
| Institutions | University of Alberta |
| Thesis | Temporal credit assignment in reinforcement learning (1984) |
| Doctoral advisor | Andrew Barto |
| Doctoral students | David Silver, Doina Precup |
Richard S. Sutton FRS FRSC is a Canadian computer scientist. He is a professor of computing science at the University of Alberta and a research scientist at Keen Technologies. Sutton is considered one of the founders of modern computational reinforcement learning, having made several significant contributions to the field, including temporal difference learning and policy gradient methods.
Life and education
Richard Sutton was born in Ohio, and grew up in Oak Brook, Illinois, a suburb of Chicago.
Sutton received his B.A. in psychology from Stanford University in 1978 before taking an M.S. (1980) and Ph.D. (1984) in computer science from the University of Massachusetts Amherst under the supervision of Andrew Barto. His doctoral dissertation, Temporal Credit Assignment in Reinforcement Learning, introduced actor-critic architectures and temporal credit assignment.
He was influenced by Harry Klopf's work in the 1970s, which proposed that supervised learning is insufficient for AI or for explaining intelligent behavior, and that trial-and-error learning, driven by "hedonic aspects of behavior", is necessary. This focused his interest on reinforcement learning.
Career
In 1984, Sutton was a postdoctoral researcher at the University of Massachusetts. From 1985 to 1994, he was a principal member of technical staff in the Computer and Intelligent Systems Laboratory at GTE in Waltham, Massachusetts. He then spent three years at the University of Massachusetts Amherst as a senior research scientist. From 1998 to 2002, Sutton worked at the AT&T Shannon Laboratory in Florham Park, New Jersey, as a principal technical staff member in the artificial intelligence department.
Since 2003, he has been a professor of computing science at the University of Alberta, where he led the Reinforcement Learning and Artificial Intelligence Laboratory until 2018. While retaining his professorship, Sutton joined DeepMind in June 2017 as a distinguished research scientist and co-founder of its Edmonton office.
Sutton became a Canadian citizen in 2015 and renounced his US citizenship in 2017.
Reinforcement learning
Sutton joined Andrew Barto at UMass in the early 1980s to explore the behavior of neurons in the human brain as the basis for human intelligence, a concept that had been advanced by computer scientist A. Harry Klopf. Sutton and Barto developed the mathematics behind this idea and used it as a basis for artificial intelligence. The concept became known as reinforcement learning and went on to become a key part of artificial intelligence techniques.
Barto and Sutton used Markov decision processes (MDPs) as the mathematical foundation to explain how agents (algorithmic entities) make decisions in a stochastic, or random, environment, receiving a reward after every action. Traditional MDP theory assumed the agents knew everything about the MDP while attempting to maximize their cumulative rewards. Barto and Sutton's reinforcement learning techniques allowed both the environment and the rewards to be unknown, and thus allowed this category of algorithms to be applied to a wide array of problems.
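The temporal-difference idea at the heart of this work can be illustrated with a short sketch. The Python snippet below (an illustrative example, not code from Sutton's publications) applies tabular TD(0) to a five-state random-walk prediction task of the kind used in Sutton and Barto's textbook; the agent estimates state values purely from sampled transitions, with no access to the environment's transition or reward model. The environment, step size, and episode count are assumptions chosen for the sketch.

```python
import random

# A minimal sketch of tabular TD(0), the temporal-difference update
# V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s)).
# The learner never consults the MDP's transition or reward model;
# it updates its value estimates from sampled experience alone.

N_STATES = 5              # non-terminal states 0..4; terminals lie beyond both ends
ALPHA = 0.1               # step size (illustrative choice)
GAMMA = 1.0               # undiscounted episodic task

def random_walk_episode():
    """Yield (state, reward, next_state) transitions; next_state is None at termination."""
    s = N_STATES // 2     # start in the middle state
    while True:
        s_next = s + random.choice([-1, 1])
        if s_next < 0:                    # left terminal: reward 0
            yield s, 0.0, None
            return
        if s_next >= N_STATES:            # right terminal: reward +1
            yield s, 1.0, None
            return
        yield s, 0.0, s_next
        s = s_next

V = [0.5] * N_STATES      # initial value estimates for the non-terminal states

for _ in range(1000):     # learn from 1000 sampled episodes
    for s, r, s_next in random_walk_episode():
        target = r + (GAMMA * V[s_next] if s_next is not None else 0.0)
        V[s] += ALPHA * (target - V[s])   # TD(0) update

print([round(v, 2) for v in V])  # approaches the true values 1/6, 2/6, ..., 5/6
```

The point of the sketch is the update rule itself: the estimate for a state is nudged toward the reward plus the estimated value of the next state, so credit for a delayed outcome propagates backward through the states that led to it.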
Sutton returned to Canada in the 2000s and continued working on reinforcement learning, which kept developing in academic circles until one of its first major real-world applications arrived: Google DeepMind's AlphaGo program, built on this concept, defeated the reigning human Go champion. Barto and Sutton are widely credited as pioneers of modern reinforcement learning, a technique that is foundational to the modern AI boom.
In a 2019 essay, Sutton criticized the field of AI research for failing "to learn the bitter lesson that building in how we think we think does not work in the long run", arguing that "70 years of AI research [had shown] that general methods that leverage computation are ultimately the most effective, and by a large margin", outperforming efforts built on human knowledge of specific fields such as computer vision, speech recognition, chess, or Go.
In 2023, Sutton and John Carmack announced a partnership at Keen Technologies for the development of artificial general intelligence (AGI).
Selected publications
- Sutton, R. S., Barto, A. G., Reinforcement Learning: An Introduction. MIT Press, 1998; second edition, 2018. Also translated into Japanese and Russian.
- Miller, W. T., Sutton, R. S., Werbos, P. J. (Eds.), Neural Networks for Control. MIT Press, 1991.
- Sutton, R. S. (Ed.), Reinforcement Learning. Reprinting of a special issue of the Machine Learning journal. Kluwer Academic Press, 1992.
Awards and honors
Sutton has been a fellow of the Association for the Advancement of Artificial Intelligence (AAAI) since 2001; his nomination read: "For significant contributions to many topics in machine learning, including reinforcement learning, temporal difference techniques, and neural networks." In 2003 he received the President's Award from the International Neural Network Society and, in 2013, the Outstanding Achievement in Research award from the University of Massachusetts Amherst. In 2025, he received the Turing Award from the Association for Computing Machinery together with Andrew Barto; the award citation read: "For developing the conceptual and algorithmic foundations of reinforcement learning."
In 2016, Sutton was elected Fellow of the Royal Society of Canada. In 2021, he was elected Fellow of the Royal Society of London.