The David E. Rumelhart Prize is awarded annually to an individual or collaborative team making a significant contemporary contribution to the theoretical foundations of human cognition. Contributions may be formal in nature: mathematical modeling of human cognitive processes, formal analysis of language and other products of human cognitive activity, and computational analyses of human cognition using symbolic or non-symbolic frameworks all fall within the scope of the award.
The David E. Rumelhart Prize is funded by the Robert J. Glushko and Pamela Samuelson Foundation. Robert J. Glushko received a Ph.D. in Cognitive Psychology from the University of California, San Diego in 1979 under Rumelhart’s supervision. He is an Adjunct Full Professor in the Cognitive Science Program at the University of California, Berkeley.
The prize consists of a hand-crafted, custom bronze medal, a certificate, a citation of the awardee’s contribution, and a monetary award of $100,000.
In 2020 the Society celebrated the 20th year of this prestigious prize. An official video commemorates the life and career of David E. Rumelhart.
David E. Rumelhart made many contributions to the formal analysis of human cognition, working primarily within the frameworks of mathematical psychology, symbolic artificial intelligence, and parallel distributed processing. He also admired formal linguistic approaches to cognition and explored the possibility of formulating a formal grammar to capture the structure of stories.
Rumelhart obtained his undergraduate education at the University of South Dakota, receiving a B.A. in psychology and mathematics in 1963. He studied mathematical psychology at Stanford University, receiving his Ph.D. in 1967. From 1967 to 1987 he served on the faculty of the Department of Psychology at the University of California, San Diego. In 1987 he moved to Stanford University, serving as Professor there until 1998. He became disabled by Pick’s disease, a progressive neurodegenerative illness, and died in March 2011.
Rumelhart developed models of a wide range of aspects of human cognition, ranging from motor control to story understanding to visual letter recognition to metaphor and analogy. He collaborated with Don Norman and the LNR Research Group to produce “Explorations in Cognition” in 1975 and with Jay McClelland and the PDP Research Group to produce “Parallel Distributed Processing: Explorations in the Microstructure of Cognition” in 1986. He mastered many formal approaches to human cognition, developing his own list processing language and formulating the powerful back-propagation learning algorithm for training networks of neuron-like processing units. Rumelhart was elected to the National Academy of Sciences in 1991 and received many prizes, including a MacArthur Fellowship, the Warren Medal of the Society of Experimental Psychologists, and the APA Distinguished Scientific Contribution Award.
Rumelhart articulated a clear view of what cognitive science, the discipline, is or ought to be. He felt that for cognitive science to be a science, it would have to have formal theories — and he often pointed to linguistic theories, as well as to mathematical and computational models, as examples of what he had in mind.
Since 2002, the Rumelhart Prize recipients have been honored with a symposium at the annual Cognitive Science Conference, and the papers have been published in special issues of the journal “Cognitive Science”.
Ten Years of Rumelhart Prizes – A Symposium
At the 2010 annual conference of the Cognitive Science Society, the first ten recipients of the Rumelhart Prize posed “Outstanding Questions for Cognitive Science”.
Their questions were printed in “The Little Red Book” distributed to all attendees.
Video of the symposium can be viewed at the Science Network.
Bechtel, W., Behrmann, M., Chater, N., Glushko, R. J., Goldstone, R. L. and Smolensky, P. (2010), The Rumelhart Prize at 10. Cognitive Science, 34: 713-715. doi: 10.1111/j.1551-6709.2010.01116.x.
Group Photo, Portland 2010
Bob Glushko and Geoff Hinton at UCSD, 2007
Rumelhart Prize Recipients Gather in Vancouver, 2006 (Shepard, Elman, Shiffrin, Smolensky, and Anderson — with Bob Glushko)
The Rumelhart Prize is administered by the Prize Selection Committee in consultation with the Glushko-Samuelson Foundation. Screening of nominees and selection of the prize winner will be performed by the Prize Selection Committee. Scientific members (including the Chair) of the Prize Selection Committee will serve for up to two four-year terms. A representative of the Foundation will also serve on the Prize Selection Committee.
Richard Cooper (Committee Chair), Department of Psychological Sciences, Birkbeck, University of London
Dedre Gentner, Department of Psychology, Northwestern University
Robert J. Glushko, Glushko-Samuelson Fund
Tania Lombrozo, Department of Psychology, University of California, Berkeley
Steven T. Piantadosi, Department of Psychology, University of California, Berkeley
Jesse Snedeker, Department of Psychology, Harvard University
The Rumelhart nominations process will be open from January 23rd to February 24th, 2023.
Each year, the selection committee will continue to consider nominations previously submitted. The committee invites updates to existing nominations as well as new nominations. Materials should be sent to the Chair of the Rumelhart Prize Committee.
Nominations should include the following materials:
a three-page statement of nomination
a complete curriculum vitae
copies of up to five of the nominee’s relevant publications
The nominee may be an individual or a team, and in the case of a team, vitae for all members should be provided. The prize selection committee considers both the scientific contributions and the scientific leadership and collegiality of the nominees, so these issues should be addressed in the statement of nomination. Supporting letters may also be provided.
The recipient of the twenty-third David E. Rumelhart Prize is Nick Chater, who has spent more than three decades searching for fundamental principles that underpin the cognitive sciences. His work covers a wide range of topics, from reasoning and decision-making to perception, the processing, acquisition and evolution of language, and the virtual bargaining theory of social interaction. In each case, he has focused on underlying principles that can be applied across cognitive domains. He has also made significant contributions to the public understanding of science and the application of the cognitive and behavioural sciences to practical problems in public policy and business. Chater is a Professor of Behavioural Science at Warwick Business School, having previously held appointments at the University of Edinburgh, University of Oxford and University College London. He received his undergraduate degree in experimental psychology from the University of Cambridge and his PhD in Cognitive Science from the Centre for Cognitive Science at the University of Edinburgh.
Chater’s early work, with Mike Oaksford, developed the view that human reasoning is effectively uncertain inference, where the normative standard for human performance is probability theory. This led to the development of models of Wason’s selection task, syllogistic reasoning, and conditional reasoning from a Bayesian standpoint. On this view, reasoning as Bayesian inference is a fundamental principle of cognition.
Chater’s interest in general principles of cognition is also exemplified by his arguments for the cognitive system’s preference for simplicity. He has been instrumental in bringing the mathematical theory of Kolmogorov complexity, which provides a rigorous formal foundation for the notion of simplicity, to problems in cognitive science. He has used this approach to show a formal duality between Bayesian and simplicity-based explanations of perceptual organisation, thus helping to reframe a century-long debate between these apparently rival approaches. He has also helped build concrete simplicity-based models of categorisation (with Emmanuel Pothos) and aspects of language acquisition (with Anne Hsu). In a series of papers with the mathematician Paul Vitanyi, he has addressed questions of learning from positive evidence. This work addresses fundamental questions on the nature of inductive inference and language learnability.
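The duality mentioned above can be illustrated with the standard correspondence between probabilities and code lengths. The sketch below is a textbook simplification, not the derivation from the papers themselves. It assumes a hypothesis H, data D, and idealized code lengths L(x) = -log2 P(x), all of which are symbols introduced here for illustration. The Kolmogorov-complexity treatment referred to in the text is more general.

```latex
\begin{align}
H^{*} &= \arg\max_{H} P(H \mid D) \\
      &= \arg\max_{H} P(D \mid H)\,P(H) \\
      &= \arg\min_{H} \bigl[-\log_2 P(H) - \log_2 P(D \mid H)\bigr] \\
      &= \arg\min_{H} \bigl[L(H) + L(D \mid H)\bigr]
\end{align}
```

Under these assumptions, the most probable interpretation of the data is also the one that yields the shortest total description, which is why the likelihood-based and simplicity-based accounts can make the same predictions.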
Chater’s approach to judgement and decision-making (including the influential Decision by Sampling [DbS] model of risky choice) starts from a third general principle: the comparative nature of human perceptual and conceptual judgement (and the lack of absolute scales for representing magnitudes). The foundational work for this principle was embodied in a model, with Neil Stewart and Gordon Brown, of the representation of perceptual magnitudes. On this viewpoint, the classic models of expected utility, and variants such as prospect theory, are cognitively implausible. Instead, DbS claims that decisions are made by sampling values (from the immediate environment, or memory) along relevant dimensions (probability, amount of money, etc.) and evaluating an item by comparison with those sampled items. This approach turns out to reconstruct many aspects of prospect theory, but also to generate many new context effects, which were either known already or which have since been confirmed.
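The sampling-and-comparison idea described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the published DbS model: the function names `relative_rank` and `sampled_value`, the rule that ties count as half a win, and the example amounts are all hypothetical choices made here.

```python
import random
from typing import Optional, Sequence


def relative_rank(target: float, sample: Sequence[float]) -> float:
    """Share of sampled values that the target beats in pairwise comparison.

    Ties count as half a win (an assumption made for this sketch).
    """
    if not sample:
        raise ValueError("sample must contain at least one comparison value")
    wins = sum(1.0 for value in sample if target > value)
    ties = sum(0.5 for value in sample if target == value)
    return (wins + ties) / len(sample)


def sampled_value(
    target: float,
    context: Sequence[float],
    k: int,
    rng: Optional[random.Random] = None,
) -> float:
    """Value the target against k values drawn (with replacement) from a context.

    The context stands in for values available from memory or the
    immediate environment; it is hypothetical illustration data.
    """
    rng = rng or random.Random()
    sample = rng.choices(list(context), k=k)
    return relative_rank(target, sample)


if __name__ == "__main__":
    low_context = [10, 20, 30, 40]    # hypothetical small amounts
    high_context = [60, 70, 80, 90]   # hypothetical large amounts

    # The same amount is valued differently depending on the comparison set.
    print(relative_rank(50, low_context))   # 1.0
    print(relative_rank(50, high_context))  # 0.0

    # With a small random sample the valuation is noisy but stays in [0, 1].
    print(sampled_value(50, low_context + high_context, k=5, rng=random.Random(0)))
```

The point of the sketch is that no absolute scale of value appears anywhere: an amount has a subjective value only relative to whatever happens to be sampled, which is one way context effects can arise.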
Chater has also made important contributions related to language acquisition, processing, and evolution. In two Behavioral and Brain Sciences articles with Morten Christiansen, he first provided strong restrictions on the possibility of coevolution between language/culture and the brain, and then outlined how many properties of language, and its structure and processing, may arise from (another) general cognitive principle: the roughly serial attentional bottleneck through which language must pass in real time.
In his most recent work, Chater has been working on yet another general principle, this time concerned with social and communicative interaction. Building on work by Herb Clark, David Lewis, Thomas Schelling and Paul Grice, among others, he has developed a new model of how people interact, based on the idea that social interactions operate via ‘tacit’ agreements about who does what, what actions are appropriate, the meaning of communicative terms in a concrete situation, and so on. The idea is that these tacit agreements are created on the fly, but also that each agreement provides a precedent for the next. The creation (and application) of such ‘virtual bargains’ is presumed to arise
Chater has served as Associate Editor for the journals Cognitive Science, Psychological Review, and Psychological Science. He was elected a Fellow of the Cognitive Science Society in 2010 and a Fellow of the British Academy in 2012. Chater is co-founder of the research consultancy Decision Technology, and has served as a member of the UK government’s Climate Change Committee. He was co-creator and resident scientist for eight series of the BBC Radio 4 show The Human Zoo, a psychological perspective on everyday life and politics; and he has written two books for a general audience: The Mind is Flat (2018) and The Language Game (2022, with Morten Christiansen).
Christiansen, M. H., & Chater, N. (2022). The language game: How improvisation created language and changed the world. Basic Books.
Sanborn, A. N., Heller, K., Austerweil, J. L., & Chater, N. (2021). REFRESH: A new approach to modeling dimensional biases in perceptual similarity and categorization. Psychological Review, 128(6), 1145.
Chater, N. (2018). The mind is flat: The remarkable shallowness of the improvising brain. Yale University Press.
Christiansen, M. H., & Chater, N. (2016). The now-or-never bottleneck: A fundamental constraint on language. Behavioral and Brain Sciences, 39.
Tsetsos, K., Usher, M., & Chater, N. (2010). Preference reversal in multiattribute choice. Psychological Review, 117(4), 1275.
Christiansen, M. H., & Chater, N. (2008). Language as shaped by the brain. Behavioral and Brain Sciences, 31(5), 489-509.
Oaksford, M., & Chater, N. (2007). Bayesian rationality: The probabilistic approach to human reasoning. Oxford University Press.
Stewart, N., Brown, G. D., & Chater, N. (2005). Absolute identification by relative judgment. Psychological Review, 112(4), 881.
Chater, N., & Vitányi, P. (2003). Simplicity: a unifying principle in cognitive science? Trends in Cognitive Sciences, 7(1), 19-22.
Chater, N. (1996). Reconciling simplicity and likelihood principles in perceptual organization. Psychological Review, 103(3), 566.
Oaksford, M., & Chater, N. (1994). A rational analysis of the selection task as optimal data selection. Psychological Review, 101(4), 608.
Celebrating 20 Years of the Rumelhart Prize
2022 Recipient - Michael Tomasello
The recipient of the twenty-second David E. Rumelhart Prize is Michael Tomasello, James F. Bonk Distinguished Professor at Duke University. Michael is a fellow of the German National Academy of Sciences, the Hungarian National Academy of Sciences, the American Academy of Arts and Sciences, the National Academy of Sciences, and of course our own Cognitive Science Society. His numerous previous awards include the Jean Nicod Prize, the Mind and Brain Prize, the British Academy Wiley Prize in Psychology, and the Heineken Prize for Cognitive Science, among many others.
The guiding question behind Michael Tomasello’s research is one that many cognitive scientists have pondered at some time or other. That question, which Michael has pursued in a career spanning over 40 years and on two continents, is: What makes us human? What distinguishes us from our great ape cousins? For some, it is tool use, but many species use tools. For some, it is language, but many species have communicative systems. For some, it is the ability to craft our environment to our own ends. For Professor Tomasello, it is not a single cognitive domain, like language or tool use. Humans are unique in multiple ways – in their communication, yes, but also in aspects of their social cognition, their cultural learning, their cooperative thinking, and their abilities to collaborate, to express prosocial behaviours, to follow social norms, and to maintain a moral identity.
Through his comparative work with children and primates, Michael Tomasello has shown how, in each of the above areas, the capabilities of even three-year-old humans go beyond those of more mature chimps and other (non-human) great apes in significant ways. It is his contention that these additional capabilities are driven by the capacity to entertain shared intentions – a capacity that he argues is absent or limited in non-human species but that develops in humans by the age of three and that subsequently supports the development of everything from collaboration and the use of grammatical abstractions in communicative systems to moral identity and cultural evolution.
But where does the capacity to entertain shared intentions come from? Evolution provides part of the answer, on the assumption that shared intentionality is a prerequisite of a cooperative society, and that cooperation provides societies built on it with an adaptive advantage. For this argument to hold, one must also appreciate the co-evolution of society and culture, another area that he has explored.
2021 Recipient - Susan Goldin-Meadow
The recipient of the twenty-first David E. Rumelhart Prize is Susan Goldin-Meadow, Beardsley Ruml Distinguished Service Professor at the University of Chicago. She became interested in psychology while an undergraduate at Smith College when she did her junior year abroad in Geneva, Switzerland. She spent the year at the Institut des Sciences de l’Education, taking courses with Piaget and Inhelder, and doing research on language with Hermine Sinclair and a fellow student, Annette Karmiloff-Smith. That experience piqued her interest in the relation between language and thought and in the creation of language, and led her to do her doctorate in developmental psychology at the University of Pennsylvania under the guidance of Rochel Gelman and Lila Gleitman. She joined the faculty of the University of Chicago in 1976.
Susan Goldin-Meadow has produced fundamental insights in multiple areas of cognitive science. Her work on linguistically isolated deaf children has provided a preliminary answer to one of the most enduring questions in cognitive science: Where does language come from? She has demonstrated that homesign systems (gestural systems of communication developed spontaneously by profoundly deaf children in the absence of exposure to formal sign language) have many of the fundamental properties of natural languages. In so doing, she has addressed the innateness question by showing that some of the building blocks of language come from individual human minds rather than from cultural evolution. She has thus made significant theoretical contributions concerning which aspects of language are hard-wired, and which are introduced into the linguistic system by each new generation of language-learners and language-users. This work has implications for understanding how we learn to communicate, for explaining developmental phenomena such as critical periods, and for understanding the very nature of the human capacity for language.
2020 Recipient - Stanislas Dehaene
The recipient of the twentieth David E. Rumelhart Prize is Stanislas Dehaene. Stanislas received his training in mathematics at the École Normale Supérieure in Paris, then completed a PhD in cognitive psychology with Jacques Mehler, postdoctoral studies with Michael Posner, as well as neuronal modelling studies with Jean-Pierre Changeux. He has been working since 1997 at INSERM and the Commissariat à l’énergie atomique, where he created the Cognitive Neuroimaging Unit in 2001. In September 2005 he was elected as a full professor on the newly created chair of Experimental Cognitive Psychology at the Collège de France in Paris. In 2017 he became the director of NeuroSpin, France’s advanced brain imaging center. He is also the president of France’s Scientific Council for Education.
Stanislas Dehaene’s interests concern the cerebral mechanisms of specifically human cognitive functions such as language, calculation, and reasoning. His team uses a variety of experimental methods, including mental chronometry in normal subjects, cognitive analyses of brain-lesioned patients, and brain-imaging studies with positron emission tomography, functional magnetic resonance imaging, and high-density recordings of event-related potentials. Formal models of minimal neuronal networks are also devised and simulated in an attempt to build links between molecular, neurophysiological, imaging and behavioral data.
Stanislas Dehaene’s main scientific contributions include the study of number processing. Using converging evidence from mental chronometry, PET, ERPs, fMRI, and brain lesions, Stanislas Dehaene demonstrated the central role played by a region of the intraparietal sulcus in understanding quantities and arithmetic (the number sense). He was also the first to demonstrate that subliminal presentations of numbers and words can yield detectable cortical activations in fMRI, and has used these data to support an original theory of conscious and nonconscious processing in the human brain. With neurologist Laurent Cohen, he also studied the neural networks of reading and demonstrated the crucial role of the left occipito-temporal region (the visual word form area) in reading acquisition and literacy.
Stanislas Dehaene is the author of over 300 scientific publications in major international journals. With more than 100,000 citations, he is a Thomson Reuters Highly Cited Researcher. He is a member of seven academies, including the French, US and Pontifical academies of science. He has received several international prizes, including the McDonnell Centennial Fellowship, the Louis D. prize of the French Academy of Sciences (with D. Le Bihan), and the Grete Lundbeck Brain Prize (with G. Rizzolatti and T. Robbins). His four general-audience books (The Number Sense, Reading in the Brain, Consciousness and the Brain, and How We Learn) have been translated into more than ten languages. He has also authored three general-audience documentaries on the human brain. His courses at the Collège de France, available at https://www.college-de-france.fr/site/stanislas-dehaene/, are followed by a broad audience worldwide.
2019 Recipient - Michelene (Micki) T. H. Chi
The recipient of the nineteenth David E. Rumelhart Prize is Michelene (Micki) T. H. Chi, who, more than once, has challenged basic assumptions about the mind and defined new approaches that have shaped a generation of cognitive and learning scientists. Chi received a bachelor’s degree in mathematics from Carnegie Mellon University, followed by a PhD from the same institution. Following post-doctoral work, she joined the Learning Research and Development Center and the Department of Psychology at the University of Pittsburgh. In 2008, Chi moved to Arizona State University, where she is now the Dorothy Bray Endowed Professor of Science and Teaching. She has been recognized with numerous honors throughout her career, including election to the National Academy of Education in 2010, and an E.L. Thorndike Career Achievement Award from the American Psychological Association in 2015. In 2016 she was inducted into the American Academy of Arts and Sciences.
In the 1980s, Chi’s foundational work on expertise showed that expert performance arose not from more strategic search through some problem space of solutions, but through more effective representations of the problem. Subsequently, her work on student learning identified “self-explaining” as an activity that differentiates more and less effective learners, and one that can be fostered in formal and informal learning environments – a finding that is now endorsed by the Institute of Education Sciences as one that should be implemented in classrooms. More recently, she has developed the ICAP theory of active learning, which classifies learning activities into four modes that align with corresponding cognitive processes. The framework not only helps synthesize decades of research in cognitive, developmental, and educational psychology, but also clarifies our basic understanding of learning as an active process.
Chi’s work has also taught us the importance of relating our science to the real world, and specifically to education. She has done so with the rigor of the lab, but without losing sight of the richness of qualitative data, the complexities of real-world content, or the social context within which learning typically occurs.
Chi has been an active member of the Cognitive Science Society since the 1980s. She served on the governing board of the society from 1993 to 1999, and was one of the inaugural fellows of the society in 2003. Her impact on the field and beyond is evidenced by the fact that her papers “Categorization and Representation of Physics Problems by Experts and Novices” and “Self-Explanations: How Students Study and Use Examples in Learning to Solve Problems” have, between them, been cited almost 10,000 times (at the time of writing).
2018 Recipient - Michael Tanenhaus
The recipient of the eighteenth David E. Rumelhart Prize is Michael Tanenhaus, who over the course of 40 years gradually transformed our understanding of human language and its relation to perception, action and communication. Tanenhaus is the Beverly Petterson Bishop and Charles W. Bishop Professor of Brain and Cognitive Sciences at the University of Rochester. Tanenhaus also has a limited appointment as Chair Professor in the School of Psychology at Nanjing Normal University. He was a founding member and served as Director of the Center for Language Sciences and as the PI for the center’s NIH-supported interdisciplinary training program for twenty years. He received an undergraduate degree in Speech and Hearing Science from the University of Iowa and a PhD in Cognitive Psychology from Columbia.
Through ingenious theory development, experimentation, and computational modeling, Tanenhaus has shown that language comprehension is goal-directed and highly interactive. As we hear a sentence unfold, our interpretation at each level–phonological, lexical, syntactic and semantic–is affected by our higher level knowledge (such as our goals and understanding of the situation) and by fine-grained information from perception that has been passed up the processing chain. His work on language both informs and is informed by other aspects of cognition, including visual and auditory perception, attention, representation, and social-pragmatic interaction. Throughout his career he has engaged in a two-way dialog with formal and computational linguistics – through work on lexicalized grammars, phonemic encoding, prosody and, most recently, pragmatics.
Tanenhaus’ work is characterized by closely linked experimental and theoretical advances. He is best known as the creator of the visual world paradigm: a means of studying spoken language comprehension by measuring how it shapes visual attention. The development of this paradigm was rooted in Tanenhaus’ theoretical vision of cognition: only in a highly interactive system would we expect eye-movements to incrementally reflect language. This method has, in turn, provided some of the clearest evidence for his theoretical claims. Due to its simplicity, the visual world paradigm has been rapidly adopted for studying children and special populations, allowing researchers to ask how language processing changes across development or breaks down in developmental disorders.
The breadth of Tanenhaus’ thinking, combined with his uncanny experimental skills, has inspired a new generation of researchers to take a fresh look at what it means to communicate. His impact has been amplified by his former students, who have taken his insights and applied them to new questions in psychology, linguistics and cognitive science departments around the world.
Grodner, D. J., Klein, N. M., Carbary, K. M., & Tanenhaus, M. K. (2010). “Some,” and possibly all, scalar inferences are not delayed: Evidence for immediate pragmatic enrichment. Cognition, 116(1), 42-55.
Tanenhaus, M.K. & Brown-Schmidt, S. (2008). Language processing in the natural world. Philosophical Transactions of the Royal Society B: Biological Sciences, 363, 1105-1122.
Clayards, M.A., Tanenhaus, M.K., Aslin, R.N. & Jacobs, R.A. (2008). Perception of speech reflects optimal use of probabilistic speech cues. Cognition, 108, 804-809.
McMurray, B., Tanenhaus, M.K. & Aslin, R.N. (2002). Gradient effects of within-category phonetic variation on lexical access. Cognition, 86, B33-42.
Sedivy, J.C., Tanenhaus, M.K., Chambers, C.G. & Carlson, G.N. (1999). Achieving incremental interpretation through contextual representation: Evidence from the processing of adjectives. Cognition, 71, 109-147.
McRae, K., Spivey-Knowlton, M.J. & Tanenhaus, M.K. (1998). Modeling thematic fit (and other constraints) within an integration competition framework. Journal of Memory and Language, 38, 283-312.
Allopenna, P.D., Magnuson, J.S. & Tanenhaus, M.K. (1998). Tracking the time course of spoken word recognition: evidence for continuous mapping models. Journal of Memory and Language, 38, 419-439.
Tanenhaus, M.K., Spivey-Knowlton, M.J., Eberhard, K.M., & Sedivy, J.C. (1995). Integration of visual and linguistic information during spoken language comprehension. Science, 268, 1632-1634.
Trueswell, J.C., Tanenhaus, M.K., & Garnsey, S.M. (1994). Semantic influences on parsing: Use of thematic role information in syntactic ambiguity resolution. Journal of Memory and Language, 33(3), 285.
Tanenhaus, M.K., Leiman, J.M. & Seidenberg, M.S. (1979). Evidence for multiple stages in the processing of ambiguous words in syntactic contexts. Journal of Verbal Learning and Verbal Behavior, 18, 427-441.
2017 Recipient - Lila Gleitman
Lila Gleitman has fundamentally shaped our scientific understanding of both language and cognition, and the relationship between these fields, as well as the nature of human learning. Over a long career, Gleitman has done more than anyone to establish both the theoretical structure and the empirical basis for the notion that when children learn language they are not simply forming statistical associations between sequences of speech sounds, or associations between words and percepts or experiences; rather children are doing a remarkably sophisticated kind of symbolic reasoning or detective work, reverse-engineering the logic of language with syntax – the law-like relations between linguistic form and meaning – at its core.
Gleitman’s contributions are incredibly wide-reaching, but she is best known for her proposals for how sensitivity to syntactic structure lets children infer the underlying meanings of words, especially with a focus on verbs. For example, when hearing “John pilked Bill”, you can infer that “pilk” is a two-argument predicate, likely conveying an externally caused event in which John does something to or for Bill. This is complementary to, often completely independent of, and more important than the data that for centuries scholars took to be the primary evidence for learning word meanings or concepts, namely seeing what is going on in the world while people are talking. Her syntactic bootstrapping research program has allowed her to explain many general properties of language acquisition, including:
(1) How blind children, without the main route of perceptual access to the surrounding referent world, nevertheless acquire language in ways similar to, and at essentially the same pace as, sighted children.
(2) How deaf children, lacking any access to a spoken language, spontaneously create communication systems (home-signs) that reflect the semantic-syntactic structure found in spoken languages of the world.
(3) Why verbs are learned more slowly than nouns, despite the fact that infants can conceptualize the meanings of verbs – the structure of events in the world – quite well from a young age.
(4) How older children use the meaning-structure mappings embedded in syntactic frames to disambiguate very underdetermined mappings between words and the world, and thereby learn words much more quickly than simple associative mechanisms ever could.
(5) Why young children are observed to rely heavily on linguistic cues to word meaning over and above social-referential cues when identifying verb meanings and processing sentences.
Gleitman, L. (1990). The structural sources of verb meanings. Language Acquisition, 1(1), 3-55.
Gleitman, L., January, D., Nappa, R., & Trueswell, J.C. (2007). On the give and take between event apprehension and utterance formulation. Journal of Memory and Language, 57(4), 544-569.
Gleitman, L. R., Cassidy, K., Nappa, R., Papafragou, A., & Trueswell, J. C. (2005). Hard words. Language Learning and Development, 1(1), 23-64.
Gleitman, L. R., & Newport, E. L. (1995). The invention of language by children: Environmental and biological influences on the acquisition of language. An Invitation to Cognitive Science, 1, 1-24.
Gleitman, L. R., Newport, E. L., & Gleitman, H. (1984). The current status of the motherese hypothesis. Journal of Child Language, 11(1), 43-79.
Gillette, J., Gleitman, H., Gleitman, L., & Lederer, A. (1999). Human simulations of vocabulary learning. Cognition, 73(2), 135-176.
Gleitman, L. & Landau, B. (2012). Every child an isolate: nature’s experiments in language learning. In M. Piattelli-Palmarini and R. C. Berwick (Eds.), Rich Languages from Poor Inputs. Oxford: Oxford University Press.
Landau, B., & Gleitman, L. R. (2009). Language and experience: Evidence from the blind child (Vol. 8). Harvard University Press.
Medina, T. N., Snedeker, J., Trueswell, J. C., & Gleitman, L. R. (2011). How words can and cannot be learned by observation. Proceedings of the National Academy of Sciences, 108(22), 9014-9019.
Trueswell, J. C., Medina, T. N., Hafri, A., & Gleitman, L. R. (2013). Propose but verify: Fast mapping meets cross-situational word learning. Cognitive Psychology, 66(1), 126-156.
2016 Recipient - Dedre Gentner
Dr. Dedre Gentner, the recipient of the 2016 Rumelhart Prize, personifies the success of Cognitive Science as an interdisciplinary enterprise, tackling foundational questions about the mind through the seamless integration of psychological theory, empirical methodology, and computational insight. The resulting work has shaped our understanding of learning, reasoning, language, and the very nature of mental representation.
Gentner has made important contributions to the study of verbs, mental models, similarity, language and thought, as well as word learning in children. Underlying this diverse body of work is a common thread: an interest in how it is that we can represent and reason about relationships, such as that between the arguments of a relational predicate, or between two models that are superficially distinct, yet share common underlying structure. It’s not surprising, then, that this year’s recipient has also been a pioneer in the contemporary study of analogical reasoning, and it is this work for which she is best known.
Gentner has influenced the field not only through her prolific experimental work with both children and adults, but also through the general theory of analogical reasoning that she developed and tested alongside students and collaborators: Structure Mapping Theory. A central insight of this theory is that analogies consist of matching relational structures between a base domain and a target domain. The properties of objects in the domains need not match, and deeply nested relational structures are favored over independent relations. In the analogy between heat flow and water flow, for example, the relevant similarities involve a flow of some quantity from areas of high pressure to areas of low pressure, even though the domains differ in many superficial respects. This theory was implemented in the Structure-Mapping Engine (SME), which both formalized the theory and offered a computationally tractable algorithm for carrying out the process of mapping structures and drawing inferences.
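The core idea of matching relational structure rather than object properties can be illustrated with a small sketch. This is not the SME algorithm: it is a brute-force search over object correspondences, written only for illustration, and it does not model the preference for deeply nested structure. The entity names (beaker, vial, coffee, ice_cube, bar) and the helper functions `translate` and `best_mapping` are assumptions introduced here, and the driving quantity is written as "pressure" in the water domain and "temperature" in the heat domain.

```python
from itertools import permutations

# Expressions are nested tuples: (predicate, arg1, arg2, ...).
# Hypothetical toy encodings of the two domains, for illustration only.
GREATER = ("greater", "pressure", "beaker", "vial")
FLOW = ("flow", "beaker", "vial", "water", "pipe")
CAUSE = ("cause", GREATER, FLOW)

BASE = [GREATER, FLOW, CAUSE, ("liquid", "water")]  # water flow
TARGET = {                                          # heat flow
    ("greater", "temperature", "coffee", "ice_cube"),
    ("flow", "coffee", "ice_cube", "heat", "bar"),
    ("liquid", "coffee"),
}
BASE_OBJS = ["beaker", "vial", "water", "pipe", "pressure"]
TARGET_OBJS = ["coffee", "ice_cube", "heat", "bar", "temperature"]


def translate(expr, mapping):
    """Rewrite a base expression into target vocabulary, keeping predicates fixed."""
    if isinstance(expr, tuple):
        return (expr[0],) + tuple(translate(arg, mapping) for arg in expr[1:])
    return mapping.get(expr, expr)


def best_mapping(base, target, base_objs, target_objs):
    """Try every one-to-one object correspondence and keep the one that
    carries the most relations (two or more arguments) into the target.
    One-argument attributes such as ("liquid", x) are deliberately ignored."""
    best, best_score = None, -1
    for perm in permutations(target_objs, len(base_objs)):
        mapping = dict(zip(base_objs, perm))
        score = sum(translate(e, mapping) in target for e in base if len(e) > 2)
        if score > best_score:
            best, best_score = mapping, score
    return best, best_score


mapping, score = best_mapping(BASE, TARGET, BASE_OBJS, TARGET_OBJS)
# The unmatched causal relation in the base becomes a candidate inference
# about the target once it is translated through the mapping.
inference = translate(CAUSE, mapping)
print(mapping)
print(inference)
```

In this toy case water is mapped to heat even though water and coffee share the attribute of being liquid, because only the relational matches are scored.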
Gentner’s work has not been restricted to analogical reasoning, however, and her influential edited volumes – on mental models in 1983, on analogical reasoning in 2001, and on language and thought in 2003 – attest to the breadth of her interests and impact.
Gentner received a bachelor’s degree in physics from UC Berkeley, and a PhD in psychology from UCSD. As a student of Don Norman’s at San Diego, Dedre also worked with David Rumelhart, notably on topics related to verb meaning and representation. These interactions contributed not only to Dedre’s dissertation on possession verbs, but also to subsequent work on metaphor and analogy. Before joining the faculty at Northwestern, where she is currently Alice Gabrielle Twight Professor of Psychology and the director of the Cognitive Science Program, Gentner held positions at the University of Illinois at Urbana-Champaign, Bolt Beranek and Newman, Inc, and the University of Washington. Gentner has been an active member of the Cognitive Science Society since its beginning, first presenting at the society’s third annual meeting in 1981 with a paper titled “Generative analogies as mental models.” She was president of the society from 1993 to 1994, became a society fellow in 2003, and has served on the governing board for several periods. Dedre Gentner was also associate editor of the society’s flagship journal, Cognitive Science, from 2001 to 2006.
Gentner, D. (1981). Some interesting differences between nouns and verbs. Cognition and brain theory, 4, 161-178.
Gentner, D. (1983). Structure-mapping: A theoretical framework for analogy. Cognitive science, 7(2), 155-170.
Gentner, D., & Stevens, A. L. (1983). Mental models. Psychology Press.
Falkenhainer, B., Forbus, K. D., & Gentner, D. (1989). The structure-mapping engine: Algorithm and examples. Artificial intelligence, 41(1), 1-63.
Medin, D. L., Goldstone, R. L., & Gentner, D. (1993). Respects for similarity. Psychological review, 100(2), 254.
Markman, A. B., & Gentner, D. (1993). Structural alignment during similarity comparisons. Cognitive psychology, 25(4), 431-467.
Forbus, K. D., Gentner, D., & Law, K. (1995). MAC/FAC: A model of similarity‐based retrieval. Cognitive science, 19(2), 141-205.
Gentner, D., & Markman, A. B. (1997). Structure mapping in analogy and similarity. American psychologist, 52(1), 45.
Gentner, D., Holyoak, K. J., & Kokinov, B. N. (2001). The analogical mind: Perspectives from cognitive science. MIT press.
Gentner, D., & Goldin-Meadow, S. (2003). Language in mind: Advances in the study of language and thought. MIT Press.
Bowdle, B. F., & Gentner, D. (2005). The career of metaphor. Psychological review, 112(1), 193.
Loewenstein, J., & Gentner, D. (2005). Relational language and the development of relational mapping. Cognitive psychology, 50, 315-353.
Gentner, D. (2010). Bootstrapping the mind: Analogical processes and symbol systems. Cognitive science, 34(5), 752-775.
Gentner, D., & Forbus, K. D. (2011). Computational models of analogy. Wiley Interdisciplinary Reviews: Cognitive Science, 2(3), 266-276.
2015 Recipient - Michael I. Jordan
2014 Recipient - Ray Jackendoff
Dr. Ray Jackendoff is one of the world’s leading figures in the cognitive science of language. He has developed a theory of language that articulates the contribution of each level of linguistic representation and their interaction, while also elucidating how language relates to other cognitive systems. While working broadly within the generative paradigm in linguistics, the cornerstone of Jackendoff’s research has been the human conceptual system. His central contributions to the study of language fall into two areas: the development of a theory of conceptual semantics, and an architecture of the language system designed to express conceptual meaning. In addition, he has developed (with Fred Lerdahl) one of the most influential theories of music cognition, in his book “A Generative Theory of Tonal Music”.
Within his account of conceptual semantics, Jackendoff has examined the conceptualization of space, the relationship between language, perception, and consciousness, and, most recently, socially grounded concepts such as value, morality, fairness, and obligations. His approach not only provides a framework for the theory of meaning, which naturally integrates with linguistics, philosophy of language, and cognitive science, but also develops the formal machinery needed to instantiate this framework. The theory specifies the mental representations underlying communicative intentions, seamlessly incorporating pragmatics and world knowledge. Diverging from the syntax-focused view of much generative linguistics, Jackendoff’s approach shares much with Cognitive Grammar in postulating a powerful, generative conceptual system, in which semantic units such as objects, events, times, properties, and quantifiers need not correspond one-to-one with syntactic categories.
Jackendoff’s Conceptual Semantics framework led naturally to the development of a characterization of the human language faculty that is expressly designed to explain the means by which concepts are expressed in language. In his book “Foundations of Language: Brain, Meaning, Grammar, Evolution” he outlines a parallel architecture for linguistic representation and processing. He argues that phonology, syntax, and semantics constitute independent generative components in language, each with its own primitives and combinatorial systems. In contrast with traditional generative approaches, these three “tiers” are not derived from syntax, but are rather correlated with each other by interface rules that establish correspondence between each pair of tiers. An important consequence of distributing the generative capacity across the three tiers is that it clarifies how syntax can be seen as being primarily in service of mapping from semantics to phonology. Building on these ideas, Jackendoff develops this account more fully in his book “Simpler Syntax” (together with Peter Culicover), which reexamines the explanatory balance between syntax and semantics, structure and derivation, and rule systems and lexicon. In addition to being motivated by linguistic considerations, Jackendoff’s proposals also draw on theory and evidence from cognitive psychology, the neurosciences, and evolutionary biology, in examining fundamental issues such as innateness, the relationship between language and perception, and the evolution of the language faculty.
Ray Jackendoff is Seth Merrin Professor of Philosophy and Co-Director of the Center for Cognitive Studies at Tufts University. After receiving his BA in Mathematics from Swarthmore College in 1965, he completed his PhD in Linguistics at MIT in 1969, under the supervision of Noam Chomsky. He then joined the faculty at Brandeis University, where he remained until 2005, when he took up his current position at Tufts. Jackendoff is a Fellow of the American Academy of Arts and Sciences, of the American Association for the Advancement of Science, of the Linguistic Society of America, and of the Cognitive Science Society. He has held fellowships at the Center for Advanced Studies in the Behavioral Sciences and at the Wissenschaftskolleg zu Berlin, and has been a member of the External Faculty of the Santa Fe Institute. He has been President of both the Linguistic Society of America and the Society for Philosophy and Psychology, and was recipient of the 2003 Jean Nicod Prize in Cognitive Philosophy. He has been awarded five honorary degrees, the most recent in 2013 from Tel Aviv University.
1. Semantic Interpretation in Generative Grammar, MIT Press, 1972.
2. X-Bar Syntax: A Study of Phrase Structure, MIT Press, 1977.
3. A Generative Theory of Tonal Music (with Fred Lerdahl), MIT Press, 1982.
4. Semantics and Cognition, MIT Press, 1983.
5. Consciousness and the Computational Mind, Bradford/MIT Press, 1987.
6. Semantic Structures, MIT Press, 1990.
7. The Architecture of the Language Faculty, MIT Press, 1997.
8. Foundations of Language: Brain, Meaning, Grammar, Evolution, Oxford University Press, 2002.
9. Simpler Syntax (with Peter Culicover), Oxford University Press, 2005.
10. Language, Consciousness, Culture: Essays on Mental Structure, MIT Press, 2007.
11. A User’s Guide to Thought and Meaning, Oxford University Press, 2012.
2013 Recipient - Linda Smith
Dr. Linda Smith is one of the world’s leading cognitive scientists. Her research has focused on developmental process and mechanisms of change, especially as they relate to early word learning. Her book with Esther Thelen, “A Dynamical Systems Approach to the Development of Cognition and Action”, has been a touchstone for the dynamical systems movement and tremendously influential on the new generation of cognitive scientists. The book argues for a complex systems approach to cognitive development, in which the functional integration of sensory, motor, memorial, and attentional processes in real time and in specific tasks drives developmental change, and for a synthetic systems approach over an analytic (divide and conquer) approach to understanding cognition and development. Her empirical and theoretical work exemplifies the systems approach, showing that children’s early word-learning skills are built from attentional, associative, motor, and visual processes; that early changes in visual object recognition and in object name learning co-develop; and that the spatial and temporal properties of attention and working memory are tightly tied to the sensory-motor systems of infants and toddlers. While arguing for a systems approach to cognition and development in terms of multiple component processes that are nested over time scales and levels of analysis, Smith also argues powerfully against classic approaches to cognitive science that focus on discrete reasoning with arbitrary symbols and that, in so doing, are profoundly adevelopmental and segregate cognition from sensory and motor systems.
Her approach has led to a number of empirical discoveries that broadly inform the study of typical and atypical cognitive development. These include her early work showing that children perceive their world more holistically than do adults, and that this difference could be formally modeled in terms of a perceptual system that was more broadly tuned in early development and more narrowly tuned later in development, and her demonstration that Piaget’s A-not-B error in early infancy – a phenomenon widely understood as reflecting infants’ concept that objects persist in time and space – reflected the sensory-motor processes of visually directed reaching. These findings and insights from dynamic field models of the error yielded a more unified understanding of the overlapping spatio-temporal properties underlying motor planning and attention and are extended in recent work to the role of motor planning and working memory processes in toddlers’ ability to bind names to things. Finally, and perhaps most influentially, she has shown both empirically and in formal models how the statistical structure of language influences the properties that children will attend to, so that when a linguistic label is assigned to an object, shape becomes selectively important for children. Her careful work has not only documented this “shape bias” but has diagnosed its origins, consequences, and functionality to a developing system, as well as its role in atypical development. Her work has had broad impact outside as well as within developmental and cognitive psychology, including epigenetic robotics.
Dr. Smith is a Chancellor’s Professor and Distinguished Professor of Psychological and Brain Science, and of Cognitive Science, at Indiana University – Bloomington. She received her B.S. from the University of Wisconsin (Madison) in 1973 and her Ph.D. from the University of Pennsylvania in 1977, and joined the faculty at Indiana in 1977. She won the American Psychological Association Award for an Early Career Contribution, a Lilly Fellowship, and, from Indiana University, the Tracy Sonneborn Award. She is a Fellow of the Society of Experimental Psychologists, the American Psychological Society, the Cognitive Science Society, and the American Academy of Arts and Sciences. Her graduate students now occupy prestigious faculty positions around the world. She has chaired the Psychological and Brain Science Department at Indiana University, served on multiple advisory committees concerned with the future directions of science for the National Science Foundation and the National Institutes of Health, served on the governing boards of the Cognitive Science Society and the International Conference on Development and Learning, and served as the chair of the Rumelhart Prize Committee.
1. Smith, L. B. (1989). A model of perceptual classification in children and adults. Psychological Review. 96. 125-144.
2. Jones, S.S. & Smith, L.B. (1993) The place of perceptions in children’s concepts. Cognitive Development. 8, 113-140.
3. Thelen, E., & Smith, L. B. (1994) A dynamical systems approach to the development of cognition and action. MIT Press.
4. Smith, L.B., Jones, S. & Landau, B. (1996) Naming in young children: A dumb attentional mechanism? Cognition 60, 143-171.
5. Smith, L.B., Thelen, E., Titzer, R, & McLin, D. (1999) Knowing in the context of acting: The task dynamics of the A not-B error. Psychological Review, 106, 235-260.
6. Smith, L.B., Jones, S.S., Landau, B., Gershkoff-Stowe, L. & Samuelson, S. (2002) Early noun learning provides on-the-job training for attention. Psychological Science, 13, 13-19.
7. Smith, L.B. & Gasser, M. (2005) The development of embodied cognition: Six lessons from babies. Artificial Life, 11, 13-30.
8. Colunga, E., & Smith, L. B. (2005). From the lexicon to expectations about kinds: A role for associative learning. Psychological Review, 112(2), 347-382.
9. Smith, L. B. (2009). From fragments to geometric shape: Changes in visual object recognition between 18 and 24 months. Current Directions in Psychological Science, 18(5), 290-294.
10. Smith, L. B., Yu, C., & Pereira, A. F. (2011) Not your mother’s view: the dynamics of toddler visual experience. Developmental Science, 14 (1), 9-17.
11. Yu, C. & Smith, L. B. (2012) Modeling Cross-Situational Word-Referent Learning: Prior Questions. Psychological Review, 119(1), 21-39.
12. Samuelson, L., Smith, L. B., Perry, L. & Spencer, J. (2011) Grounding Word Learning in Space. PLoS One 6(12): e28095. doi:10.1371/journal.pone.0028095.
2012 Recipient - Peter Dayan
Dr. Peter Dayan is a pre-eminent researcher in Computational Neuroscience with a primary focus on the application of theoretical computational and mathematical methods to the understanding of neural systems. He has pioneered the use of Bayesian and other statistical and control theory methods from machine learning and artificial intelligence for building theories of neural function. A major focus of his work is on understanding the ways in which animals and humans come to choose appropriate actions in the face of rewards and punishments, and the processes by which they come to form neural representations of the world. His models are informed and constrained by careful attention to neurobiological, psychological and ethological data, and are both mathematically specified and computationally implemented. Dr. Dayan is a co-author of ‘Theoretical Neuroscience,’ a leading textbook in the field. His work also provides one of the most influential sources of hypotheses about the function of the human mind and brain in current cognitive science.
Dr. Dayan’s early work focused on reinforcement learning. This is an appealing learning paradigm because it does not require explicit instruction on what actions would have been ideal for an organism. Instead, it only requires that an environment provide rewards depending upon sequences of actions. Furthermore, through a process of comparing actions to internally generated predictions of rewards, these methods are able to learn even when external rewards are not immediately available. Reinforcement learning integrates psychological insights from human and non-human animal learning with notions from control theory about optimal behavior. With colleagues, Dr. Dayan broadened these existing links using Bayesian ideas about uncertainty, and extended them into the domain of neuroscience by identifying the neuromodulator dopamine as a neural mechanism that plays a critical role in generating internal predictions of reward.
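The role of internally generated predictions can be made concrete with a minimal temporal-difference sketch. It is an illustration of the textbook TD(0) update under simple assumptions, not a model from Dr. Dayan’s papers: the environment is a hypothetical deterministic chain of states, reward arrives only on the final step, and the function name `td0_chain` and its parameters are chosen here for the example.

```python
def td0_chain(n_states=5, episodes=500, alpha=0.1, gamma=0.9):
    """Learn state values on a left-to-right chain with the TD(0) rule.

    The agent walks from state 0 to a terminal state. Only the last step
    is rewarded, so earlier states can only learn from the predicted
    value of their successor, never from an immediate external reward.
    """
    values = [0.0] * (n_states + 1)  # final entry is the terminal state, kept at 0
    for _ in range(episodes):
        for state in range(n_states):
            next_state = state + 1
            reward = 1.0 if next_state == n_states else 0.0
            # Prediction error: what happened (reward plus discounted
            # prediction for the next state) minus what was predicted.
            td_error = reward + gamma * values[next_state] - values[state]
            values[state] += alpha * td_error
    return values[:n_states]


values = td0_chain()
for state, value in enumerate(values):
    print(f"state {state}: value {value:.4f}")
```

Each learned value approaches the discount factor raised to the number of steps remaining before the rewarded step, so states far from the reward still acquire graded predictions of it.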
Dr. Dayan’s path-breaking work has been influential in several fields impinging on cognitive science, including machine learning, mathematics, neuroscience, and psychology. The center of mass of his research program has been concerned with learning (self-supervised and reinforcement) and conditioning, and the influence of neuromodulation on this learning. However, he has also contributed significantly to the study of activity-dependent development and to key issues relating to population coding and dynamics. For example, in contrast with classical accounts, which suggest that the activity of large populations of neurons encodes the value of the stimulus, he has articulated a view in which neural computation is akin to a Bayesian inference process, with population activity patterns representing uncertainty about stimuli in the form of probability distributions. Finally, Dr. Dayan has contributed to furthering our understanding of hippocampal function. As an example, he has suggested that a key function of the replay of patterns of activity in the hippocampus that occurs during sleep is to maintain the representational relationship between this structure and the cortex so that episodic memory can continue to work over a lifetime, even as the coding of information changes with the acquisition of new knowledge. Critically, in all of his work, Dr. Dayan has been systematically attentive to the empirical findings, using them to constrain the simulations and theory closely, and, in so doing, has been able to articulate tractable and plausible biological accounts of the neural computations subserving learning and memory function, more generally.
Dr. Dayan’s academic career started at the University of Cambridge where he obtained a Bachelor of Arts (Hons) degree in Mathematics. This was followed by a PhD degree in artificial intelligence at the University of Edinburgh, which focused on statistical and neural network models of learning. He then held a series of postdoctoral fellowships: a brief one with the MRC Research Centre in Brain and Behaviour at Oxford, followed by one at the Computational Neurobiology Laboratory at The Salk Institute and another at the Department of Computer Science at the University of Toronto, after which he became an assistant professor at MIT. Dr. Dayan subsequently relocated to the Gatsby Computational Neuroscience Unit at University College London in 1998, assuming the position of Director of this Unit in 2002. He continues to direct the Gatsby Unit at present, and is a Professor of Computational Neuroscience at University College London.
Dr. Dayan has written approximately 200 publications, which have garnered in excess of 15,000 citations. Dr. Dayan has played a formidable role in fostering the growth of Computational Neuroscience as a discipline. He has mentored many junior investigators, served as an adviser to numerous advisory boards and selection panels, been on the editorial boards of multiple journals and participated as a member of program committees for a variety of academic conferences.
Hinton, GE, Dayan, P, Frey, BJ & Neal, RM (1995). The wake-sleep algorithm for unsupervised neural networks. Science, 268, 1158-1160.
Montague, PR, Dayan, P & Sejnowski, TJ (1996). A framework for mesencephalic dopamine systems based on predictive Hebbian learning. Journal of Neuroscience, 16, 1936-1947.
Dayan, P, Kakade, S & Montague, PR (2000). Learning and selective attention. Nature Neuroscience, 3, 1218-1223.
Daw, ND, Kakade, S & Dayan, P (2002). Opponent interactions between serotonin and dopamine. Neural Networks, 15, 603-616.
Dayan, P & Balleine, BW (2002). Reward, motivation and reinforcement learning. Neuron, 36, 285-298.
Kali, S & Dayan, P (2004) Off-line replay maintains declarative memories in a model of hippocampal-neocortical interactions. Nature Neuroscience, 7, 286-294.
Daw, ND, Niv, Y & Dayan, P (2005) Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nature Neuroscience 8:1704-1711.
Yu, AJ & Dayan, P (2005) Uncertainty, neuromodulation, and attention. Neuron 46:481-492.
Dayan, P, Niv, Y, Seymour, BJ & Daw, ND (2006) The misbehavior of value and the discipline of the will. Neural Networks 19:1153-1160.
Niv, Y, Daw, ND, Joel, D & Dayan, P (2007) Tonic dopamine: Opportunity costs and the control of response vigor. Psychopharmacology, 191:507-520.
Dayan, P & Huys, QJM (2008) Serotonin, inhibition and negative mood. Public Library of Science: Computational Biology, 4(2):e4.
Huys, QJM & Dayan, P (2009). A Bayesian formulation of behavioral control. Cognition, 113:314-328.
Schwartz, O, Sejnowski, TJ & Dayan, P (2009) Perceptual organization in the tilt illusion. Journal of Vision, 9:1-20.
Dayan, P & Solomon, JA (2010) Selective Bayes: Attentional load and crowding. Vision Research, 50:2248-2260.
2011 Recipient - Judea Pearl
Dr. Judea Pearl has been a key researcher in the application of probabilistic methods to the understanding of intelligent systems, whether natural or artificial. He has pioneered the development of graphical models, and especially a class of graphical models known as Bayesian networks, which can be used to represent and to draw inferences from probabilistic knowledge in a highly transparent and computationally natural fashion. Graphical models have had a transformative impact across many disciplines, from statistics and machine learning to artificial intelligence; and they are the foundation of the recent emergence of Bayesian cognitive science. Dr. Pearl’s work can be seen as providing a rigorous foundation for a theory of epistemology which is not merely philosophically defensible, but which can be mathematically specified and computationally implemented. It also provides one of the most influential sources of hypotheses about the function of the human mind and brain in current cognitive science.
Dr. Pearl has further developed his work on graphical models to address one of the deepest challenges in philosophy and science: the analysis of causality. He has developed a calculus for reasoning about the causal structure of the world, which is able, for the first time, to give a precise analysis of the impact of interventions and how they combine with passive observations. He is able to interpret graphical models as providing a specification of the causal structure of a system, rather than merely providing a compact representation of a joint probability distribution. Given that our knowledge of the world is important primarily because it serves as the basis for action (i.e., for making interventions to the world which, we hope, will help achieve our goals), building a theory of causality is of central importance to understanding human cognition. Dr. Pearl’s path-breaking work has been enormously influential. In statistics, his work on causality has substantially contributed to the re-engagement of the statistical community with the problem of modeling causation, inferring causal structure from data, and pinpointing precisely the assumptions necessary for such inference. In philosophy, his analyses have provided a precise formulation, and elaboration, of previously informal theories of the nature of causality, counterfactual thinking, and interpretation of the natural language indicative and subjunctive conditionals, if-then, had-it-been, and if-it-were-not-for. Moreover, Dr. Pearl’s work on causality has helped reinvigorate causality research in cognitive science, leading to a wide variety of models and experiments.
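The difference between passively observing a variable and intervening on it can be shown with a three-variable sketch. The network below is hypothetical: a confounder Z influences both a treatment X and an outcome Y, and X also influences Y. All probabilities are invented for illustration, and the function names are assumptions of this example. Conditioning uses the full joint distribution, while the intervention is computed by adjusting for Z, which is valid under the assumed graph.

```python
from itertools import product

# Hypothetical conditional probability tables for the graph Z -> X, Z -> Y, X -> Y.
P_Z = {0: 0.5, 1: 0.5}                      # P(Z = z)
P_X1_GIVEN_Z = {0: 0.2, 1: 0.8}             # P(X = 1 | Z = z)
P_Y1_GIVEN_XZ = {(0, 0): 0.1, (1, 0): 0.3,  # P(Y = 1 | X = x, Z = z)
                 (0, 1): 0.5, (1, 1): 0.7}


def joint(z, x, y):
    """P(Z=z, X=x, Y=y) as the product of the network's local factors."""
    p_x = P_X1_GIVEN_Z[z] if x == 1 else 1 - P_X1_GIVEN_Z[z]
    p_y = P_Y1_GIVEN_XZ[(x, z)] if y == 1 else 1 - P_Y1_GIVEN_XZ[(x, z)]
    return P_Z[z] * p_x * p_y


def p_y1_given_x(x):
    """Observational P(Y=1 | X=x): seeing X=x also carries information about Z."""
    numerator = sum(joint(z, x, 1) for z in (0, 1))
    denominator = sum(joint(z, x, y) for z, y in product((0, 1), repeat=2))
    return numerator / denominator


def p_y1_do_x(x):
    """Interventional P(Y=1 | do(X=x)): setting X by hand cuts the Z -> X link,
    so Z keeps its prior distribution instead of being updated by X."""
    return sum(P_Z[z] * P_Y1_GIVEN_XZ[(x, z)] for z in (0, 1))


print(f"P(Y=1 | X=1)     = {p_y1_given_x(1):.2f}")
print(f"P(Y=1 | do(X=1)) = {p_y1_do_x(1):.2f}")
```

With these made-up numbers the observational quantity is larger than the interventional one, because observing X=1 makes Z=1 more likely and Z independently raises Y.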
Dr. Pearl’s academic career began in electrical engineering. He has a Bachelor’s degree in Electrical Engineering from the Technion – Israel Institute of Technology (1960); a Master’s degree in Physics from Rutgers University (1965); and a Ph.D. degree in Electrical Engineering from the Polytechnic Institute of Brooklyn (1965). He worked at RCA Research Laboratories, Princeton, New Jersey, on superconductive parametric and storage devices, and at Electronic Memories, Inc., Hawthorne, California, on advanced memory systems, before joining UCLA in 1970, where he is currently Director of the Cognitive Systems Laboratory in the Department of Computer Science.
He has written over 350 publications, including three highly influential books. The first, Heuristics (1984), provided an analysis and overview of heuristic methods for domains including planning, problem-solving, scheduling, and optimization, with particular reference to applications in artificial intelligence and operations research. His second book, Probabilistic Reasoning in Intelligent Systems (1988), outlined his seminal work on graphical models for the representation of, and reasoning with, probabilistic knowledge and uncertain evidence. His third book, Causality: Models, Reasoning, and Inference (2000), summarized his breakthrough research on representing, and making inferences about, causal and counterfactual relationships.
Dr. Pearl has previously been awarded a number of distinctions and honors. He is a Fellow of the Institute of Electrical and Electronics Engineers and of the Association for the Advancement of Artificial Intelligence, and a Member of the National Academy of Engineering, and he received an honorary doctorate from the University of Toronto in 2007. He has received major awards recognizing the impact of his research across a number of disciplines, including the Award for Research Excellence from the International Joint Conferences on Artificial Intelligence (1999), the Classic Paper Award from the Association for the Advancement of Artificial Intelligence (2000), the Lakatos Award for distinguished contributions to the philosophy of science (2001), the Association for Computing Machinery’s Allen Newell Award for outstanding contributions to computer science (2003), and the Benjamin Franklin Medal in Computers and Cognitive Science (2008).
Pearl, J. (1984). Heuristics. Reading, MA: Addison-Wesley.
Pearl, J. (1986). Fusion, Propagation and Structuring in Belief Networks. Artificial Intelligence, 29, 241 – 288.
Dechter, R. & Pearl, J. (1987). Network-Based Heuristics for Constraint-Satisfaction Problems. Artificial Intelligence, 34, 1 – 38.
Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems, San Mateo, CA: Morgan Kaufmann.
Dechter, R. & Pearl, J. (1989) Tree-Clustering Schemes for Constraint-Processing. Artificial Intelligence, 38, 353 – 366.
Pearl, J. & Verma, T. S. (1991). A Theory of Inferred Causation. In J. A. Allen, R. Fikes, and E. Sandewall (Eds.), Principles of Knowledge Representation and Reasoning: Proceedings of the Second International Conference, San Mateo, CA: Morgan Kaufmann, 441 – 452.
Dechter, R., Meiri, I. & Pearl, J. (1991). Temporal Constraint Networks, Artificial Intelligence, 49, 61 – 95.
Verma, T. & Pearl, J. (1991). Equivalence and Synthesis of Causal Models. In P. Bonissone, M. Henrion, L. N. Kanal & J. F. Lemmer (Eds.), Uncertainty in Artificial Intelligence 6, Cambridge, MA, Elsevier Science Publishers, 225 – 268.
Pearl, J. (1995). Causal Diagrams for Empirical Research, Biometrika, 82, 669 – 709.
Pearl, J. (2000). Causality: Models, Reasoning, and Inference, Cambridge, UK: Cambridge University Press.
Halpern, J. Y. & Pearl, J. (2005). Causes and explanations: A structural-model approach. Part I: Causes. British Journal for the Philosophy of Science, 56, 843 – 887.
Halpern, J. Y. & Pearl, J. (2005). Causes and explanations: A structural-model approach. Part II: Explanations. British Journal for the Philosophy of Science, 56, 889 – 911.
Shpitser, I. & Pearl, J. (2008). Complete Identification Methods for the Causal Hierarchy, Journal of Machine Learning Research, 9, 1941 – 1979.
Pearl, J. (2009). Causal inference in statistics: An overview. Statistics Surveys, 3, 96 – 146.
2010 Recipient - James McClelland
Dr. James McClelland’s theoretical and experimental contributions have been instrumental in establishing an alternative to the traditional symbolic theory of mind. In his connectionist alternative, cognition is conceptualized as the emergent result of interactions within interconnected networks of simple neuron-like units. Inspired by the massive parallelism found in brains, local inhibitory and excitatory connections between units give rise to structured thoughts, mental schemas, and memories that are distributed across units. Learning is conceptualized as changes to the efficacy with which units excite or inhibit one another. Drs. McClelland and Rumelhart formed the PDP Research group to pursue this connectionist program, and this group produced the two-volume Parallel Distributed Processing (Rumelhart, McClelland, and the PDP Research Group, 1986). These two volumes galvanized much of the cognitive science community to develop, explore and test new computational models of phenomena in learning, memory, language, and cognitive development.
Much of Dr. McClelland’s work has fused connectionist computational modeling with empirical research in cognitive psychology and neuroscience. He pioneered information processing models in which earlier processing stages do not complete their processing before beginning to send their products to subsequent stages (McClelland, 1979). This cascaded processing dynamic was put to effective use in the joint work with Rumelhart on the Interactive Activation model of word perception. This model was an early and elegant example of a working computational model that showed how it is possible for letter perception to influence word perception at the same time that word perception influences letter perception without these bidirectional influences being viciously circular (McClelland & Rumelhart, 1981). This model captured many empirical phenomena, including the striking “word superiority effect” in which letters are better identified in the context of words than in isolation or when contained within non-words (Johnston & McClelland, 1974; Rumelhart & McClelland, 1982).
In 1986, Dr. McClelland (in collaboration with Jeffrey Elman) proposed a connectionist model of speech perception and lexical processing based on the idea that word activation and competition unfold in time. The McClelland and Elman (1986) TRACE paper is one of the most highly cited papers in psycholinguistics, with its original ideas now incorporated in contemporary models of lexical activation and well supported by the experimental evidence.
Another collaboration between Rumelhart and McClelland addressed the basis of language knowledge and the process of language acquisition. Focusing on the past tense inflection of English words, they showed how a simple PDP network could learn through a simple connection adjustment process to regularize and even over-regularize, over-riding correct performance on exceptions as knowledge of the regular inflectional pattern in the language was acquired. With Karalyn Patterson, David Plaut, Mark Seidenberg, and others, McClelland extended these ideas to single word reading, and with Mark St. John he extended them to sentence comprehension. This work prompted an intense ongoing debate on the nature of language knowledge and of the mechanisms of language acquisition.
More recently, Dr. McClelland has developed and empirically tested neurologically plausible models of memory. His model of the neurological specialization of memory function has been particularly influential. This model attributes rapid learning of potentially arbitrarily juxtaposed aspects of an event to the hippocampus, whereas the neocortex plays a complementary role in gradual learning that exploits structure implicitly present in ensembles of inputs (McClelland, McNaughton, and O’Reilly, 1995). This model accounts for striking patterns of spared and impaired memory in amnesic patients and sets a new standard for theory in cognitive science — insight and testable predictions about both behavior and brain functioning.
Building on earlier work on learned distributed semantic representations by Geoffrey Hinton and David Rumelhart, McClelland has worked with several colleagues to develop a broad theory of how the neocortex may learn and represent semantic knowledge (Rogers and McClelland, 2004; Rogers et al, 2004). This connectionist model of semantic knowledge provides a unified explanation of children’s acquisition of basic and superordinate categories, their reasoning about these categories, and also of the deterioration of this knowledge in dementia.
Dr. McClelland has also made many important service contributions to cognitive science. He has been associate or senior editor for Cognitive Science, Neural Computation, Hippocampus, Neurocomputing, and Proceedings of the National Academy of Sciences, and he served as a member of the National Advisory Mental Health Council. He was president of the Cognitive Science Society from 1991-1992 and a member of the Cognitive Science Society governing board from 1988-1994, and he is currently the president-elect of the Federation of Associations in Behavioral and Brain Sciences. He was the founding Co-Director of the Center for the Neural Basis of Cognition at Carnegie Mellon, and is currently the founding Director of the Center for Mind, Brain, and Computation at Stanford University. In September, 2009, he will become the Chair of the Psychology Department at Stanford.
Prior to receiving the Rumelhart Prize, Dr. McClelland had earned a number of honors and awards. He is a member of the National Academy of Sciences, a fellow of the American Association for the Advancement of Science, and a member of the American Philosophical Society. He has received the APS William James Fellow Award for lifetime contributions to the basic science of psychology, the 1993 Howard Crosby Warren Medal from the Society of Experimental Psychologists, the 1996 American Psychological Association Distinguished Scientific Contribution Award, the 2001 Bartlett lectureship from the Experimental Psychology Society, the 2001 Grawemeyer Award, the 2002 IEEE Neural Networks Pioneer Award, the 2003 American Psychological Society William James Fellow award, and the 2005 University of Turin Mind-Brain Prize.
Johnston, J. C., & McClelland, J. L. (1974). Perception of letters in words: Seek not and ye shall find. Science, 184, 1192 – 1194.
McClelland, J. L. (1979). On the time relations of mental processes: An examination of systems of processes in cascade. Psychological Review, 86, 287 – 330.
McClelland, J. L., & Rumelhart, D. E. (1981). An interactive activation model of context effects in letter perception, Part I: An account of basic findings. Psychological Review, 88, 375 – 407.
Rumelhart, D. E., & McClelland, J. L. (1982). An interactive activation model of context effects in letter perception, Part II: The contextual enhancement effect and some tests and extensions of the model. Psychological Review, 89, 60 – 94.
McClelland, J. L. & Elman, J. L. (1986). The TRACE model of speech perception. Cognitive Psychology, 18, 1-86.
Rumelhart, D. E., & McClelland, J. L. (1986). On learning the past tenses of English verbs. In McClelland, J. L., Rumelhart, D. E., and the PDP research group (Eds.) Parallel distributed processing: Explorations in the microstructure of cognition. Volume II. Cambridge, MA: MIT Press. Chapter 18, pp. 216-271.
Rumelhart, D. E., McClelland, J. L., and the PDP research group. (1986). Parallel distributed processing: Explorations in the microstructure of cognition. Volume I. Cambridge, MA: MIT Press.
McClelland, J. L., Rumelhart, D. E., and the PDP research group. (1986). Parallel distributed processing: Explorations in the microstructure of cognition. Volume II. Cambridge, MA: MIT Press.
Seidenberg, M. S., & McClelland, J. L. (1989). A distributed developmental model of word recognition and naming. Psychological Review, 96(4), 523 – 568.
Cohen, J. D., Dunbar, K., & McClelland, J. L. (1990). On the control of automatic processes: A parallel distributed processing model of the Stroop effect. Psychological Review, 97, 332 – 361.
St. John, M. F., & McClelland, J. L. (1990). Learning and applying contextual constraints in sentence comprehension. Artificial Intelligence, 46, 217-257.
McClelland, J. L., McNaughton, B. L., & O’Reilly, R. C. (1995). Why there are complementary learning systems in hippocampus and neocortex: Insights from the successes and failures of connectionist models of learning and memory. Psychological Review, 102, 419 – 457.
Plaut, D.C., McClelland, J. L., Seidenberg, M. S., & Patterson, K. (1996). Understanding normal and impaired word reading: Computational principles in quasi-regular domains. Psychological Review, 103, 56 – 115.
Munakata, Y., McClelland, J. L., Johnson, M. H., & Siegler, R. S. (1997). Rethinking infant knowledge: Toward an adaptive process account of successes and failures in object permanence tasks. Psychological Review, 104, 686 – 713.
McClelland, J. L., & Chappell, M. (1998). Familiarity breeds differentiation: A subjective-likelihood approach to the effects of experience in recognition memory. Psychological Review, 105(4), 724 – 760.
Movellan, J. R., & McClelland, J. L. (2001). The Morton-Massaro law of information integration: Implications for models of perception. Psychological Review, 108, 113 – 148.
Usher, M., & McClelland, J. L. (2001). The time course of perceptual choice: The leaky competing accumulator model. Psychological Review, 108, 550 – 592.
Rogers, T. T., Lambon Ralph, M. A., Garrard, P., Bozeat, S., McClelland, J. L., Hodges, J. R., and Patterson, K. (2004). The structure and deterioration of semantic memory: A neuropsychological and computational investigation. Psychological Review, 111, 205 – 235.
Rogers, T. T., & McClelland, J. L. (2004). Semantic Cognition: A Parallel Distributed Processing Approach. Cambridge, MA: MIT Press.
Usher, M., & McClelland, J. L. (2004). Loss aversion and inhibition in dynamical models of multi-alternative choice. Psychological Review, 111, 757 – 769.
Vallabha, G. K., McClelland, J. L., Pons, F., Werker, J. and Amano, S. (2007). Unsupervised learning of vowel categories from infant-directed speech. Proceedings of the National Academy of Sciences, 104, 13273 – 13278.
Spencer, J. P., Thomas, M. S. C. & McClelland, J. L. (Eds.) (2009). Toward a Unified Theory of Development: Connectionism and Dynamic Systems Theory Re-Considered. New York: Oxford University Press.
2009 Recipient - Susan Carey
Susan Carey is a Harvard psychologist whose work has explored fundamental issues surrounding the nature of the human mind. Carey is the Henry A. Morss, Jr. and Elisabeth W. Morss Professor of Psychology in the Faculty of Arts and Sciences and is the first woman to receive the Rumelhart Prize. Additionally, Carey is the first recipient awarded the prize for her theoretical contributions to the study of human development.
The selection committee recognized Carey’s work for the clarity of insights on deep and foundational questions concerning philosophy of mind and also for her rigorous and elegant experimental methods. Her book Conceptual Change in Childhood (MIT Press, 1985) was highly influential in setting the agenda for research on concepts in both children and adults. Her current research on number concepts and her forthcoming book The Origins of Concepts (to be published by Oxford University Press) have extraordinary reach, spurring advances in cognitive neuroscience, in evolutionary psychology, and in the comparative study of human and nonhuman primates.
Carey received her B.A. from Radcliffe in 1964, and she received a Fulbright Fellowship to London University in 1965. She received her Ph.D. from Harvard in 1971. Carey is a member of the American Philosophical Society, the National Academy of Sciences, the American Academy of Arts and Sciences, the National Academy of Education, and the British Academy. She has been a member of the Harvard faculty since 2001, and previously taught at Massachusetts Institute of Technology and at New York University.
2008 Recipient - Shimon Ullman
Throughout his career, Shimon Ullman has exploited computational methods and experimental investigations, leading to key insights into the process of perceiving the three-dimensional structure of the world and recognizing objects from vision. He is a fitting recipient of the David E. Rumelhart prize since his research addresses the theoretical foundations of perception, and draws heavily on both mathematical and experimental investigations, as did the research of David Rumelhart.
Dr. Ullman did his undergraduate work in Mathematics, Physics and Biology at the Hebrew University in Israel. He received his Ph.D. from MIT in Electrical Engineering and Computer Science in 1977, becoming David Marr’s first Ph.D. student. Remaining at MIT, he became an Associate Professor in the Department of Brain and Cognitive Sciences in 1981 and a Full Professor in 1985. Simultaneously, he took a position in applied mathematics at the Weizmann Institute of Science in Israel. While employed at both institutions, he also became the chief scientist at Orbotech, a position he held until 2004. In 1994, he left MIT to be the Head of the Department of Applied Mathematics and Computer Science at the Weizmann Institute, where he is now the Samy & Ruth Cohn Professor of Computer Science.
Dr. Ullman has developed elegant and well-grounded computational models of vision and carefully compared them to human visual processes. This comparison has proven valuable for furthering research in both natural and artificial vision. The computational models have provided working systems that provide accounts of how humans recognize objects, perceive motion, probe their visual world for task-relevant information, and create coherent representations of their environments. These models, by reproducing many impressive feats of human vision as well as occasional illusory percepts, provide satisfying theories of how humans perceive their world. Reciprocally, a close consideration of human vision has provided Dr. Ullman with inspiration for his computational models, leading to solutions to difficult problems in artificial intelligence. By learning from natural intelligence, Dr. Ullman has created artificial intelligence systems that would otherwise most likely never have been constructed.
Dr. Ullman’s contributions have spanned low-level [1, 3, 4, 14] and high-level [6, 7, 9, 10, 11, 12, 15] vision. Low-level vision is associated with the extraction of physical properties of the visible environment, such as depth, three-dimensional shape, object boundaries, and surface material. High-level vision concerns object recognition, classification, and determining spatial relations among objects. By conducting pioneering research on both fronts, Dr. Ullman has been able to create complete models of vision that begin with raw, unprocessed visual inputs and produce as outputs categorizations such as “car,” “face,” and “Winston Churchill.”
In the 1970s, Dr. Ullman pioneered research on motion perception. In his dissertation, he developed computational mechanisms able to perceive motion of objects from noisy and complex scenes. These models assumed only the presence of patterns of light intensity that changed over time. They did not presume coherent, stable objects. In the book stemming from his dissertation, Dr. Ullman showed that the perception of stable objects depends on solving the “correspondence problem” – determining which elements of one movie frame correspond to the elements from the next frame. Dr. Ullman’s solution to this problem employed several constraints for determining correspondences, including a drive to create one-to-one correspondences, and the proximity, light similarity, and shape similarity of the elements across frames. None of these constraints is decisive by itself. People can see two elements as belonging to the same object even if they do not have the same color, darkness, shape, or location. However, when these sources of information are combined, correspondences emerge over time that are coherent and globally harmonious. Once established, these correspondences determine what elements across frames will be deemed as belonging to the same object, as well as the motion of the hypothesized objects.
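The idea of combining several weak constraints into one globally coherent matching can be illustrated with a small sketch. This is not Dr. Ullman’s published algorithm; it is a toy example that assumes each element is described only by a position and a light intensity, that both frames contain the same number of elements, and that the one-to-one pairing with the lowest summed cost is the preferred correspondence. The function names, the dictionary keys, and the weights are hypothetical.

```python
from itertools import permutations
from math import hypot


def match_cost(a, b, w_dist=1.0, w_light=1.0):
    """Cost of pairing element a (frame 1) with element b (frame 2).

    Combines two of the cues named in the text: proximity and light
    similarity. The weights are illustrative assumptions.
    """
    distance = hypot(a["x"] - b["x"], a["y"] - b["y"])
    light_diff = abs(a["intensity"] - b["intensity"])
    return w_dist * distance + w_light * light_diff


def best_correspondence(frame1, frame2):
    """Return the one-to-one pairing with the lowest total cost.

    The result is a list where entry i is the index in frame2 matched
    to element i of frame1. Brute force over all permutations, so it is
    only suitable for a handful of elements.
    """
    if len(frame1) != len(frame2):
        raise ValueError("this sketch assumes equally sized frames")
    best_cost, best_perm = float("inf"), None
    for perm in permutations(range(len(frame2))):
        total = sum(match_cost(frame1[i], frame2[j]) for i, j in enumerate(perm))
        if total < best_cost:
            best_cost, best_perm = total, perm
    return list(best_perm)


# Two elements that each shift slightly to the right between frames;
# frame2 lists them in the opposite order.
frame1 = [{"x": 0.0, "y": 0.0, "intensity": 0.9},
          {"x": 5.0, "y": 0.0, "intensity": 0.2}]
frame2 = [{"x": 5.5, "y": 0.0, "intensity": 0.2},
          {"x": 0.5, "y": 0.0, "intensity": 0.9}]
mapping = best_correspondence(frame1, frame2)
```

Here neither cue is consulted in isolation: the pairing is chosen by the total cost over all elements, which is one simple way to make the correspondences globally consistent and one-to-one.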
Dr. Ullman went on to show that it is possible to determine the three-dimensional structure of an object from its motion [1, 2]. It is not necessary to have a pre-established object representation, but only identifiable points from the object projected on a two-dimensional image plane akin to the human retina. The representation of the object itself can be computed rather than assumed. Assuming that a moving object is rigid, Dr. Ullman formally proved that it is possible to deduce both the three-dimensional structure and motion of the object from only three different views of it with four non-colinear identified points. This work was an early influential example of a computationally driven approach to human vision, adding to a growing corpus of algorithmically formulated solutions for enabling artificial cognitive systems to see and interpret the world.
In the same way that the elements of temporally adjacent frames can be placed into alignment with one another to reveal motion, Dr. Ullman used another alignment process to recognize objects. He and his students developed techniques to align two-dimensional images with three-dimensional models or previously stored two-dimensional views in order to classify the images [6, 7, 10, 12]. This approach has had noteworthy success in recognizing faces and other difficult-to-describe objects [9, 10, 11].
Dr. Ullman pioneered the use of “visual routines” to compute visual relations such as “X lying inside versus outside of Object Y” and “X on Contour A but not B” [3, 9]. This work posited program-like visual operations such as marking regions, spreading regions, and boundary tracing to act as a bridge between low-level and high-level perceptual properties. Whitman Richards, Professor of Cognitive Science at MIT, notes that “these ideas continue to influence visual psychophysics and models for object recognition, saliency and attention.” Related to this work, Dr. Ullman created models of contour integration that demonstrated how people can find informative edges of objects despite noise and occlusions. This work has provided the basis for a model of functional recovery following retinal lesions, linking observations on remapping of cortical topography to expected perceptual changes in subjects suffering from adult macular degeneration.
Together with Christof Koch, Dr. Ullman developed the notion of “saliency maps” that underlie the detection of image locations, and are employed in conjunction with attentional mechanisms. These structures serve perceptual segmentation processes, and implicate both low-level visual properties and high-level task demands. This work, while grounded in the behavior and neurobiology of human vision, has proven useful in computer and robotic vision applications in which a system must quickly and adaptively interact with a changing environment. It also led to the development of his “counter streams” model of the bi-directional flow of information in visual cortex. This model is consistent with the massive recurrent feedback found in visual cortex, and gives rise to mutually reinforcing bottom-up and top-down influences on perception.
Dr. Ullman has proposed that object recognition can effectively proceed by learning fragments based upon environmentally presented objects. Avoiding problems with either extreme view – either creating whole-object templates or using elemental features such as simple lines or dots – his fragment-based model acquires diagnostic intermediate representations that are weighted according to their informational content. The fragments are not predetermined, but rather are based upon training stimuli and will vary with the class of objects to be classified. Using a hierarchical representation, the fragments are assembled into larger fragments constrained by color, texture, and contour similarity [14, 15]. Firmly grounded in mathematical information theory, fragments have also received empirical support from neurophysiological investigations. Dr. Ullman’s work addresses one of the most salient puzzles regarding the neural coding of objects: why do we find few, if any, neurons that code for objects, but instead find visual neurons that are sensitive to seemingly random object features and parts? Dr. Ullman has also played an important role in training computer, cognitive, and vision scientists. Several of his students have gone on to become leading researchers themselves, including Moshe Bar, Ronen Basri, Shimon Edelman, Kalanit Grill-Spector, Avraham Guissin, Ellen Hildreth, Dan Huttenlocher, Brian Subirana, and Demetri Terzopoulos.
Over his career, Dr. Ullman has consistently developed elegant models that are appropriately constrained by psychological and neurophysiological evidence. His 1996 book “High-level Vision: Object recognition and visual cognition” provides a unified and singularly coherent formal approach to vision. His contributions reach far beyond the computer-vision community to researchers whose scientific passions center on the development of computational accounts of animal, and especially human, intelligence; his outstanding contributions have been highly influential in shaping the research direction of a whole generation of cognitive scientists and neuroscientists interested in vision. Through his involvement with the company Orbotech, he has also participated in the development of real-world applications of his theories to the automated inspection of circuit boards and displays. Just as his models have integrated line segments to create contours and contours to create objects, his research has integrated low-level and high-level perception, neuroscience and functional descriptions, human and machine vision, as well as theory and application.
[1] Ullman, S. (1979). The interpretation of visual motion. Cambridge, MA: MIT Press.
[2] Ullman, S. (1980). Against direct perception. The Behavioral and Brain Sciences, 3, 373-415.
[3] Ullman, S. (1984). Visual routines. Cognition, 18, 97-159.
[4] Koch, C., & Ullman, S. (1985). Shifts in selective visual attention: Towards the underlying neural circuitry. Human Neurobiology, 4, 219-227.
[5] Ullman, S. (1986). Artificial intelligence and the brain: Computational studies of the visual system. Annual Review of Neuroscience, 9, 1-26.
[6] Ullman, S. (1989). Aligning pictorial descriptions: An approach to object recognition. Cognition, 32, 193-254.
[7] Huttenlocher, D. P., & Ullman, S. (1990). Recognizing solid objects by alignment with an image. International Journal of Computer Vision, 5, 195-212.
[8] Ullman, S. (1995). Sequence-seeking and counter streams: A computational model for bi-directional information flow in the visual cortex. Cerebral Cortex, 5(1), 1-11.
[9] Ullman, S. (1996). High-level vision: Object recognition and visual cognition. Cambridge, MA: MIT Press.
[10] Ullman, S., & Basri, R. (1991). Recognition by linear combination of models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 13, 992-1006.
[11] Adini, Y., Moses, Y., & Ullman, S. (1997). Face recognition: The problem of compensating for changes in illumination direction. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19, 721-732.
[12] Moses, Y., & Ullman, S. (1998). Generalization to novel views: Universal, class-based and model-based processing. International Journal of Computer Vision, 29(3), 233-253.
[13] Ullman, S., & Soloviev, S. (1999). Computation of pattern invariance in brain-like structures. Neural Networks, 12, 1021-1036.
[14] Ullman, S., Vidal-Naquet, M., & Sali, E. (2002). Visual features of intermediate complexity and their use in classification. Nature Neuroscience, 5(7), 1-6.
[15] Ullman, S. (2006). Object recognition and segmentation by a fragment-based hierarchy. Trends in Cognitive Sciences, 11, 58-64.
2007 Recipient - Jeffrey L. Elman
Jeffrey L. Elman has made several major contributions to the theoretical foundations of human cognition, most notably in the areas of language and development. His work has had an immense impact across fields as diverse as cognitive science, psycholinguistics, developmental psychology, evolutionary theory, computer science and linguistics. Elman’s 1990 paper Finding Structure in Time introduced a new way of thinking about language knowledge, language processing, and language learning based on distributed representations in connectionist networks. The paper is listed as one of the 10 most-cited papers in the field of psychology between 1990 and 1994, and the most often cited paper in psycholinguistics in that period. This work, together with Elman’s earlier work on speech perception and subsequent work on learnability, representation, innateness, and development, continues to shape the research agendas of researchers in cognitive science, psycholinguistics, and many other fields.
Elman received his Bachelor’s degree from Harvard in 1969 and his Ph.D. in Linguistics from the University of Texas in 1977. That same year he joined the faculty at UCSD, where he has remained ever since, first in the department of Linguistics and now in the Department of Cognitive Science. He is now Distinguished Professor in the Department of Cognitive Science, as well as Acting Dean of the Division of Social Sciences and Co-Director of the Kavli Institute for Mind and Brain.
In the early 1980s, Jeff was among the first to apply the principles of graded constraint satisfaction, interactive processing, distributed representation, and connection-based learning that arose in the connectionist framework to fundamental problems in language processing and learning. His early work concentrated on speech perception and word recognition, leading to the co-development (with Jay McClelland) of TRACE [2,3], an interactive-activation model that addressed a wide range of findings on the role of context in the perception of speech. Elman and McClelland conducted their simulations in conjunction with experimental studies of speech recognition, predicting novel results that provided strong empirical support for the central tenet of the model. The key finding, that contextual and lexical influences can reach down into and retune the perceptual mechanisms that assign an initial perceptual representation to spoken words, has been the focus of intense ongoing investigation. More generally, there is a large body of ongoing computational and experimental research addressing the principles embodied in the TRACE model, and a dedicated web site with a complete implementation of nearly all published TRACE simulations.
For Elman, TRACE was only the beginning of an important new way of construing the nature of spoken language knowledge, language learning, and language processing. Elman’s subsequent work on language learning in Simple Recurrent Networks has been revolutionary. In this work, Elman lets go of all of the commitments previous researchers have made about language. First, instead of treating time explicitly as a variable that must be encoded in the information fed to a neural network, the “Elman net” (as it is often called) actually lives in time, combining its memory of past events with the current stimulus and generating a prediction about “what will come next.” Second, instead of treating language knowledge as a system of rules operating over abstract categories, the Elman net acquired representational and structure-processing capabilities as a result of exposure to a corpus of sentences embodying language regularities. These issues were first developed in Finding Structure in Time and elaborated in a subsequent paper on distributed representations and grammatical structure in simple recurrent networks; many subsequent investigations have been spawned by these two papers.
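The way a simple recurrent network “lives in time” can be sketched as a forward pass in which the previous hidden state is fed back as context alongside the current input. The sketch below is illustrative only: the class name, layer sizes, random untrained weights, tanh hidden units, and softmax output are assumptions made for the example, not details taken from Elman’s papers, and no learning rule is shown.

```python
import math
import random


class SimpleRecurrentNet:
    """Minimal Elman-style network: forward pass only, untrained weights."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)

        def matrix(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                    for _ in range(rows)]

        self.w_in = matrix(n_hidden, n_in)           # input -> hidden
        self.w_context = matrix(n_hidden, n_hidden)  # context -> hidden
        self.w_out = matrix(n_out, n_hidden)         # hidden -> output
        self.context = [0.0] * n_hidden              # memory of the past

    def step(self, x):
        """Combine the current input with the stored context and return
        a probability distribution over what comes next."""
        hidden = []
        for i in range(len(self.context)):
            net = sum(w * v for w, v in zip(self.w_in[i], x))
            net += sum(w * c for w, c in zip(self.w_context[i], self.context))
            hidden.append(math.tanh(net))
        # The hidden state becomes the context for the next time step.
        self.context = hidden
        scores = [sum(w * h for w, h in zip(row, hidden)) for row in self.w_out]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        return [e / total for e in exps]


net = SimpleRecurrentNet(n_in=3, n_hidden=4, n_out=3)
token_a = [1.0, 0.0, 0.0]      # one-hot code for a hypothetical token
first = net.step(token_a)      # prediction with an empty context
second = net.step(token_a)     # same input, different context
```

Presenting the same input twice yields different predictions because the context has changed in between, which is the sense in which time is carried by the network’s state rather than encoded in its input.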
The next major development in Elman’s work appeared in a subsequent paper on “The importance of starting small”. This has had as much impact in developmental psychology as it has had in linguistics. In the 1993 paper, Elman showed that successful learning of grammatical structure depends, not on innate knowledge of grammar, but on starting with a limited architecture that is at first quite restricted in complexity, but then expands its resources gradually as it learns. The demonstration in Starting Small is of central importance as it stands in stark contrast to earlier claims that the acquisition of grammar requires innate language-specific endowments. It is also crucial for developmental psychology, because it illustrates the adaptive value of starting, as human infants do, with a simpler initial state, and then building on that to develop more and more sophisticated representations of structure. It seems that starting simply may be a very good thing, making it possible for us to learn what might otherwise prove to be unlearnable, in the absence of detailed linguistic knowledge.
More recent applications of these ideas include a paper capturing historical language change in English grammar across generations of neural network simulations, and fundamental studies of the computational properties of recurrent nets, showing how these systems are able to solve problems of recursive embedding [7,8]. This latter work lays the groundwork for a formal theory of neural computation in recurrent networks that might ultimately do for neural networks what the Chomsky hierarchy has done for discrete automata. The results from these studies (with Paul Rodriguez and Janet Wiles) suggest that there are indeed important differences in the way recursive structure is encoded in recurrent networks and in discrete automata. Furthermore, these differences, which revolve around the context- and content-sensitivity of recurrent networks’ encoding of constituency, seem highly relevant for explaining natural language phenomena.
Many of Elman’s ideas about ontogeny were worked out in detail with several colleagues in the 1996 book, Rethinking innateness: A connectionist perspective on development, where the Nature-Nurture controversy is redefined in new terms. The volume lays the theoretical foundations for what may prove to be a new framework for the study of behavior and development, synthesizing insights from developmental neurobiology and connectionist modeling. It acknowledges that evolution may have provided biases that guide the developmental process, while eschewing the notion that it does so by building in specific substantive constraints as such, and while still leaving experience as the engine that drives the emergence of competence in language, perception, and other aspects of human cognitive processes.
Elman’s most recent research extends the themes of this earlier work in several ways, focusing on new ways of thinking about the nature of the mental lexicon and of the role of lexical constraints in sentence processing [10, 11]. This work involves a three-pronged effort, using corpus analysis, simulations, and psycholinguistic experiments to understand the temporal dynamics of language processing at the sentence level.
In addition to his research, Elman is also an exemplary teacher and scientific citizen. In fact, before he went to graduate school himself, Jeff spent several years as a high school teacher (teaching history, French, and social studies; in Spanish, in a Boston immigrant community). He learned how to teach, how to make his material maximally accessible without distorting or betraying the content. This is a lesson that has stayed with him all his life, and his colleagues and students are all the richer for it. In fact, Rethinking Innateness and the companion handbook grew out of a teaching initiative, a five-year experimental training program funded by the MacArthur Foundation, designed to introduce developmental psychologists (from graduate students to senior scientists) to the ideas and techniques of connectionist modeling. Jeff has been especially concerned with graduate student and postdoc mentoring, and in 1995-96, he developed a course on Ethics and Survival Skills in Academe for graduate students for his department at UCSD.
Elman has also been a leading contributor as a scientific citizen, working continually to build bridges between the disciplines that contribute to the field of Cognitive Science. For many years, Jeff directed the UCSD Center for Research in Language, where he turned a local resource into an internationally renowned research unit. At the international level, Jeff has been an active member of the Governing Board for the Cognitive Science Society. He has served as President of the society, and serves as consultant and advisory board member of many departments and institutions, and on the editorial board of numerous journals. He is in great demand throughout the world as a keynote speaker, and gives generously of his time with little recompense, including generous commitments to the international Cognitive Science Summer School at the New Bulgarian University, which awarded him an honorary Doctorate in 2002. In the same year, Elman was also chosen as one of five inaugural Fellows of the Cognitive Science Society.
In short, Jeff exemplifies the kind of model that David Rumelhart set for our field, not only in the quality and depth of his science, but in the degree of compassion, leadership and generosity that he provides to his colleagues around the world.
Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14, 179-211.
Elman, J. L., & McClelland, J. L. (1986). Exploiting the lawful variability in the speech wave. In J. S. Perkell and D. H. Klatt (Eds.), Invariance and variability of speech processes. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
McClelland, J. L., & Elman, J. L. (1986). The TRACE model of speech perception. Cognitive Psychology, 18, 1-86.
Elman, J. L., & McClelland, J. L. (1988). Cognitive penetration of the mechanisms of perception: Compensation for coarticulation of lexically restored phonemes. Journal of Memory and Language, 27, 143-165.
Elman, J. L. (1991). Distributed representations, simple recurrent networks, and grammatical structure. Machine Learning, 7, 195-224.
Elman, J. L. (1993). Learning and development in neural networks: The importance of starting small. Cognition, 48, 71-99.
Hare, M., & Elman, J. L. (1995). Learning and morphological change. Cognition, 56, 61-98.
Rodriguez, P., Wiles, J., & Elman, J. L. (1999). A recurrent neural network that learns to count. Connection Science, 11, 5-40.
Elman, J. L., Bates, E. A., Johnson, M. H., Karmiloff-Smith, A., Parisi, D., & Plunkett, K. (1996). Rethinking Innateness: A Connectionist Perspective on Development. Cambridge, MA: MIT Press.
Elman, J. L. (2004). An alternative view of the mental lexicon. Trends in Cognitive Science, 8, 301-306.
McRae, K., Hare, M., Elman, J. L., & Ferretti, T. R. (2006). A basis for generating expectancies for verbs from nouns. Memory and Cognition, 33, 1174-1184.
2006 Recipient - Roger N. Shepard
Roger N. Shepard, Professor of Psychology at Stanford University, is a particularly appropriate recipient for a prize dedicated to the “Theoretical Foundations of Human Cognition”. Throughout his research career, Roger Shepard has been searching for theoretical foundations of a science of the mind. His work attempts to specify such foundations in the form of universal laws formulated in an explicit mathematical manner, derivable from first principles, and which apply to human and to animal behaviour under a variety of tasks and stimulus sets. His endeavour to combine mathematical and physical modelling with quantified psychological experimentation has resulted in extraordinary advances in psychology. His work has opened up new research avenues in domains as varied as visual and auditory perception, mental imagery, representation, learning, and generalization. It is no exaggeration to state that several generations of psychologists have been influenced by the imagination and rigor that he has brought to psychological investigation. Indeed, many of the research paradigms that he has invented, from multidimensional scaling to mental rotation or guided apparent motion, continue to play a central role in current psychological investigations.
Roger Shepard’s recent keynote address to the Psychonomic Society provides insights into his intellectual trajectory. Initially fascinated by mechanics, geometry, and Newtonian and relativistic physics, disciplines in which he received undergraduate training at Stanford, Roger Shepard became increasingly interested in the possibility that the tools of physics and mathematics might provide insights into the organization of mental representations, including those that took place within his own mind when he mentally explored those abstract objects. Under the influence of Fred Attneave’s research on similarity judgments, and of memorable lectures by William Estes and George Miller, Roger Shepard decided to orient his career towards the investigation of the laws of psychology. The influence of his early geometrical and physical training is clearly perceptible in the elegant, formal character of his experimental and theoretical contributions.
Internal representational spaces. A highly influential early contribution was Roger Shepard’s development of the method of non-metric multidimensional scaling [12, 13], which was later improved by Joseph Kruskal, his mathematician colleague from Bell Laboratories. This method provided a new means of recovering the internal structure of mental representations from qualitative measures of similarity. This was accomplished without making any assumptions about the absolute quantitative validity of the data, relying solely on the assumption of a reproducible ordering of the similarity judgements. As a data analysis tool, non-metric multidimensional scaling has proven extremely useful in many areas of science, and is now part of all major statistical packages. In this respect, Shepard’s contribution follows a long list of psychologists whose research has led to the creation of new mathematical and statistical tools useful to the scientific community at large, as initially exemplified by Francis Galton’s invention of correlation or by Charles Spearman’s work on rank correlation and factor analysis.
More than any other psychologist perhaps, Roger Shepard played a pivotal role in drawing attention to the highly regular structure of mental representations, which he depicted as multidimensional “representational spaces”. He successively applied his non-metric multidimensional scaling method to many domains, including color, the pitch of sounds, and even abstract dimensions such as number. In each case, a beautiful internal structure emerged: the color circle, the “double helix” of pitch with independent circles for octaves and fifths, the logarithmic number line. All of these inferred representations have received extensive subsequent validation using a broad array of methods.
Universal law of generalization. Derivation of the internal structure of mental representations by multidimensional scaling allowed Roger Shepard to progress on his lifelong quest for laws of generalization [9-11, 17], which he considered as “the most fundamental problem confronting learning theory” [20, page 5]. As he notes, any theory of learning must specify how what has been learned in one situation is generalized to another. Once inter-stimulus distances were measured on the inferred representational space, Shepard observed that the generalization data from many experiments on both humans and animals became highly regular. In essentially all cases, the probability with which a response that had been learned to one stimulus was made to another stimulus followed an exponentially decaying function of the distance between them. This regularity was reported in Shepard’s celebrated Science article entitled “Toward a universal law of generalization for psychological science”. In it, Shepard showed how an elegant, general mathematical theory, based on simple Bayesian principles and the concept of “consequential region”, could account for the universal exponential law of generalization. The theory also explained why two metrics were observed for psychological space: for unitary stimuli with “integral” dimensions (such as the lightness and saturation of colors), measuring internal distances with the Euclidean metric provides the best predictor of generalization; for other, analyzable stimuli (such as shapes differing in size and orientation), the best-fitting metric is the “city-block” metric, also known as the L1-norm. Roger Shepard suggested that any organism that attempts to generalize according to optimal laws should be led by natural selection to adopt the exponentially decaying law with the stimulus-appropriate metric.
The theory also provided an explanation for another universal law, the law of discriminative reaction time, which indicates that the time to discriminate between two stimuli falls off as the inverse of the inter-stimulus distance. Shepard’s elegant theorizing thus led to the unification of many fundamental observations on generalization and discrimination tasks in a remarkably broad variety of stimuli, tasks, and species.
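As a rough illustration of the exponential law and the two metrics described above, the following Python sketch computes a generalization gradient over a two-dimensional psychological space. The function names, the sensitivity parameter `k`, and the example coordinates are assumptions made for illustration; they are not taken from Shepard’s papers.

```python
import math

def minkowski_distance(x, y, r):
    """Distance between two points in a psychological space.

    r = 2 gives the Euclidean metric (integral dimensions);
    r = 1 gives the city-block, or L1, metric (analyzable dimensions).
    """
    return sum(abs(a - b) ** r for a, b in zip(x, y)) ** (1.0 / r)

def generalization(x, y, r, k=1.0):
    """Exponentially decaying generalization: g = exp(-k * distance).

    k is a hypothetical sensitivity parameter chosen for illustration.
    """
    return math.exp(-k * minkowski_distance(x, y, r))

trained = (0.0, 0.0)   # stimulus on which the response was learned
probe = (3.0, 4.0)     # new stimulus, in arbitrary scaled units

print(minkowski_distance(trained, probe, r=2))  # Euclidean distance: 5.0
print(minkowski_distance(trained, probe, r=1))  # city-block distance: 7.0
print(generalization(trained, probe, r=2))
print(generalization(trained, probe, r=1))
```

For the same pair of stimuli, the city-block metric yields a larger distance and therefore a weaker predicted generalization than the Euclidean metric, which is why the choice of metric matters when fitting data.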
Mental transformations. Perhaps Shepard’s most universally renowned experimental contribution consists in his experiments with the mental rotation task, thoroughly reported in his classic 1982 book “Mental images and their transformations”, written with his collaborator Lynn Cooper. Considered some of the most elegant chronometric experiments in the history of psychology, these studies demonstrated that the comparison of two views of the same object, displayed in different 3-dimensional orientations, involves a process of “mental rotation”: the object is represented internally at successive positions which progressively bring one view into alignment with the other. Thus, the response time is a highly regular, linear function of the angle of internal rotation. It might be thought that mental rotation is a mere metaphor, but with Lynn Cooper, Roger Shepard demonstrated its “psychological reality”, for instance by showing that probe stimuli at intermediate orientations receive an especially fast response if they are presented at precisely the time when the theory predicts that this intermediate orientation should be internally represented. Mental rotation has become a standard tool of psychology, and is now being applied in a variety of domains, from the assessment of brain-lesioned patients and airplane pilots to the investigation of the neural coding of movements and their transformations [4, 8]. Shepard himself extended his work in several directions. He showed that the phenomenon of apparent motion, which is perceived when two shapes are successively flashed in different orientations, exhibits lawful relations of display duration and trajectory length analogous to those observed under mental rotation conditions [e.g. 3]. He also provided a theoretical account of these laws.
In both cases, the object path could be predicted by an analysis of the geodesic paths in the six-dimensional manifold jointly determined by the Euclidean group of three-dimensional space and the symmetry group of each object. Shepard also demonstrated that the path of apparent motion could be distorted by the presentation of a curved grey cue, again yielding highly regular laws relating motion and path length. By demonstrating that mental images could be empirically measured, transformed and controlled with unexpected precision, Shepard’s mental rotation paradigm played a key role in the great mental imagery debate. The similarities between perception and visual imagery were also demonstrated in other tasks such as a figure-ground search task that Shepard studied with his colleague Podgorny [6, 7].
Musical cognition. Roger Shepard’s interest in the internal structure of representations also led him to invent new perceptual illusions. In particular, Shepard’s highly imaginative research in the domain of musical cognition led to the invention of the Shepard scale. This is a sequence of sounds (now known as “Shepard tones”) which are each composed of multiple tones in octave relations, with fading amplitudes at each end of the frequency scale. Listening to the Shepard scale gives the illusion of an ever-ascending pitch. This illusion is analogous to Penrose’s illusion of ever-ascending steps, made famous by M.C. Escher’s lithograph Ascending and Descending. The Shepard tones have been subject to much further experimental work such as the “tritone paradox” explored by UCSD psychologist Diana Deutsch. With Carol Krumhansl, Shepard further explored the universal laws of musical perception, again observing that distance on the internal representation of pitch could account for experimental results on the perception of tonal hierarchies. His work provided elaborate tools with which to study issues of universality and cultural differences in music perception [e.g. 1].
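The construction of a Shepard tone can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the base frequency, the number of octaves, and the raised-cosine amplitude envelope are choices made here for simplicity and are not claimed to match the parameters of Shepard’s original stimuli.

```python
import math

def shepard_tone(pitch_class, base_hz=27.5, octaves=8):
    """Return (frequency, amplitude) pairs for one Shepard tone.

    pitch_class is measured in semitones above the base. Components are
    spaced exactly one octave apart, and a raised-cosine envelope over
    log-frequency fades out the lowest and highest components.
    base_hz and octaves are illustrative values, not Shepard's own.
    """
    components = []
    for k in range(octaves):
        position = (k + pitch_class / 12.0) / octaves  # 0..1 across the range
        freq = base_hz * 2.0 ** (k + pitch_class / 12.0)
        amp = 0.5 * (1.0 - math.cos(2.0 * math.pi * position))
        components.append((freq, amp))
    return components

def audible(components, eps=1e-9):
    """Keep only components whose amplitude is not negligible."""
    return [(round(f, 6), round(a, 6)) for f, a in components if a > eps]

# After twelve semitone steps the tone is physically the same as at the start,
# which is why the scale can appear to rise forever.
print(audible(shepard_tone(0)) == audible(shepard_tone(12)))
```

Each semitone step raises every component slightly while the envelope stays fixed, so components fade in at the bottom of the range as others fade out at the top.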
Besides those musical creations, Roger Shepard’s vivid imagination led him to generate playful, yet insightful visual illusions. A collection of his visual inventions, which he drew himself with great artistic talent, was published in his book MindSights.
Second-order isomorphism and internalization of physical laws. In recent syntheses of his work, Shepard has proposed an evolutionary psychology argument for why internal representations and their transformations are so regularly organized and often faithfully reflect the structure of physical laws [16, 19, 20]. He proposes that mental representations have evolved over millions of years as adaptations to universal physical principles (such as the kinematic laws governing object motion, those underlying light reflection and diffusion, etc). As a result, mental representations have become highly structured and attuned to physical laws; in Shepard’s terms they are “second-order isomorphic”, which means that the relations between physical events in the environment are preserved in the relations between their internal mental representations. According to Shepard, this mental internalization process explains why physicists such as Galileo, Newton or Einstein were able to rely on thought experiments in order to derive plausible physical laws: thought processes are sufficiently isomorphic to physical processes that the properties of the latter can be inferred, in part, by mere introspection on the former. For Shepard, the mental regularities imposed by this internalization process are so extensive that they attain “the kind of universality, invariance, and formal elegance (…) previously accorded only to the laws of physics and mathematics”. Indeed, Roger Shepard’s own work exemplifies how such universality, invariance, and elegance can be achieved in experimental psychology.
Roger N. Shepard is a fellow of the American Association for the Advancement of Science and the American Academy of Arts and Sciences, and is the William James Fellow of the American Psychological Association. In 1977 he was elected to the National Academy of Sciences. In 1995 he received the United States’ highest scientific award, the National Medal of Science.
1. Castellano MA, Bharucha JJ, Krumhansl CL, Tonal hierarchies in the music of north India. J Exp Psychol Gen 1984;113:394-412.
2. Deutsch D, Some new pitch paradoxes and their implications. Philos Trans R Soc Lond B Biol Sci 1992;336:391-7.
3. Farrell JE, Shepard RN, Shape, orientation, and apparent rotational motion. J Exp Psychol Hum Percept Perform 1981;7:477-86.
4. Georgopoulos AP, Lurito JT, Petrides M, Schwartz AB, Massey JT, Mental rotation of the neuronal population vector. Science 1989;243:234-6.
5. Krumhansl CL, Shepard RN, Quantification of the hierarchy of tonal functions within a diatonic context. J Exp Psychol Hum Percept Perform 1979;5:579-94.
6. Podgorny P, Shepard RN, Functional representations common to visual perception and imagination. J Exp Psychol Hum Percept Perform 1978;4:21-35.
7. Podgorny P, Shepard RN, Distribution of visual attention over space. J Exp Psychol Hum Percept Perform 1983;9:380-93.
8. Richter W, Somorjai R, Summers R, Jarmasz M, Menon RS, Gati JS, Georgopoulos AP, Tegeler C, Ugurbil K, Kim SG, Motor area activity during mental rotation studied by time-resolved single-trial fMRI. J Cogn Neurosci 2000;12:310-20.
9. Shepard RN, Stimulus and response generalization: A stochastic model relating generalization to distance in psychological space. Psychometrika 1957;22:325-45.
10. Shepard RN, Stimulus and response generalization: deduction of the generalization gradient from a trace model. Psychol Rev 1958;65:242-56.
11. Shepard RN, Stimulus and response generalization: tests of a model relating generalization to distance in psychological space. J Exp Psychol 1958;55:509-23.
12. Shepard RN, The analysis of proximities: Multidimensional scaling with an unknown distance function. I. Psychometrika 1962;27:125-40.
13. Shepard RN, The analysis of proximities: Multidimensional scaling with an unknown distance function. II. Psychometrika 1962;27:219-46.
14. Shepard RN, Circularity in judgments of relative pitch. Journal of the Acoustical Society of America 1964;36:2346-53.
15. Shepard RN, Geometrical approximations to the structure of musical pitch. Psychol Rev 1982;89:305-33.
16. Shepard RN, Ecological constraints on internal representation: resonant kinematics of perceiving, imagining, thinking, and dreaming. Psychol Rev 1984;91:417-47.
17. Shepard RN, Toward a universal law of generalization for psychological science. Science 1987;237:1317-23.
18. Shepard RN, Mind sights. 1990: W.H. Freeman.
19. Shepard RN, Perceptual-cognitive universals as reflections of the world. Behav Brain Sci 2001;24:581-601; discussion 52-71.
20. Shepard RN, How a cognitive psychologist came to seek universal laws. Psychon Bull Rev 2004;11:1-23.
21. Shepard RN, Cooper LA, Mental images and their transformations. 1982, Cambridge: MIT Press.
22. Shepard RN, Kilpatrick DW, Cunningham JP, The internal representation of numbers. Cognitive Psychology 1975;7:82-138.
2005 Recipient - Paul Smolensky
Paul Smolensky has pursued a unified theory of the mind/brain inspired by a fundamental analogy with modern physics: just as quantum and classical theory do for the physical world, Smolensky holds that for cognition, connectionist and symbolic theory provide valid formal characterizations at micro- and macro-levels, respectively.
A research program in which connectionist and symbolic theory collaborate to form a multi-level analysis of the mind/brain was laid out in Smolensky’s influential article, “On the proper treatment of connectionism”. At a time when connectionism and symbolic theory were overwhelmingly seen only as competitors, this image of an integrative research program defined a unique alternative vision, and established the ground on which Smolensky and a small number of like-minded cognitive scientists have worked ever since. The Integrated Connectionist/Symbolic (ICS) Cognitive Architecture, constructed by Smolensky and collaborators, is developed in depth and broadly applied in the comprehensive collection, The Harmonic Mind.
Defending the importance and viability of the connectionist substrate in this architecture led Smolensky into a lengthy debate with Jerry Fodor and his collaborators. In the ICS Architecture, connectionism does not “merely implement” a classical symbolic theory; rather, it furnishes ineliminable subsymbolic accounts of processing and novel explanations of central aspects of higher cognition, such as unbounded productivity; further, it forms the basis of a new theory of grammar. Through his extended debate with Fodor, Smolensky has brought to the attention of philosophers the important foundational implications of crucial technical aspects of connectionist theory [10: §§22-23].
In the ICS Architecture, the abstract high-level computational properties of the connectionist micro-level theory are formally described by a symbolic macro-level theory. In three major contributions to the seminal 1986 Parallel Distributed Processing (PDP) volumes, Smolensky first showed how mathematical analysis of high-level properties of neural computation could make substantial connection with symbolic theory: using vector calculus, neural activation patterns can be identified with conceptual-level symbolic description; spreading activation can be analyzed as optimization of well-formedness or Harmony, a principled form of statistical inference, and (with David Rumelhart, James McClelland, and Geoffrey Hinton) a particularly flexible kind of schema-based reasoning. Many neural network theorists have exploited optimization analysis techniques such as these and others introduced independently around this time by S. Grossberg, J. J. Hopfield, Hinton & T. Sejnowski, and others. The work building on Smolensky’s emphasized optimization as a key link between neural and symbolic computation.
Substantially extending the vector analysis of distributed representations in 1988, Smolensky introduced tensor analysis into connectionist theory, establishing a formal isomorphism between high-level properties of certain distributed connectionist networks and symbolic computation [10: §§5, 7, 8]. To promote further integrative cognitive research based on other types of formal high-level analysis of neural computation, in 1996 Smolensky wrote extensive pedagogical and integrative material for Mathematical Perspectives on Neural Networks, which he edited with Michael Mozer and Rumelhart.
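The idea behind tensor analysis of distributed representations can be illustrated with a small Python sketch in which symbols (fillers) are bound to structural positions (roles) by outer products and the bindings are superimposed by addition. The particular filler and role vectors below are hypothetical, and the unbinding step assumes that the role vectors are orthonormal; this is a simplified illustration, not a reproduction of the ICS architecture.

```python
def outer(filler, role):
    """Bind a filler vector to a role vector with an outer (tensor) product."""
    return [[f * r for r in role] for f in filler]

def add(a, b):
    """Superimpose two bindings by element-wise addition."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def unbind(tensor, role):
    """Recover the filler bound to a role, assuming orthonormal role vectors."""
    return [sum(t * r for t, r in zip(row, role)) for row in tensor]

# Hypothetical filler vectors for two symbols
fillers = {"A": [1.0, 0.0, 0.0], "B": [0.0, 1.0, 0.0]}
# Hypothetical distributed role vectors for two positions (orthonormal)
roles = {"first": [0.6, 0.8], "second": [-0.8, 0.6]}

# The string "AB" as a single activation pattern: A*first + B*second
ab = add(outer(fillers["A"], roles["first"]),
         outer(fillers["B"], roles["second"]))

print([round(x, 6) for x in unbind(ab, roles["first"])])   # recovers A
print([round(x, 6) for x in unbind(ab, roles["second"])])  # recovers B
```

Because both bindings share the same units, the structure is carried by a distributed pattern rather than by dedicated symbol slots, yet each constituent can still be extracted exactly.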
A particularly crucial test area for a unified connectionist and symbolic theory is language, especially aspects related to grammar. This has been the focus of Smolensky’s work since 1990. Collaborative work with syntactician Géraldine Legendre showed that tensorial distributed representations combined with optimization entail Harmonic Grammar, a new framework in which symbolic linguistic representations are assigned numerical well-formedness values (the Harmony of the connectionist representations that realize them). The grammar is realized by the connection weights of a network, the outputs of which are optimal (that is, maximal-Harmony) representations [10: §§11, 20].
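At the symbolic level, the selection of a maximal-Harmony output can be sketched as follows. This is a simplified illustration only: the constraint names, weights, candidate forms, and violation counts are hypothetical values chosen for the example, and the sketch operates on symbolic violation counts rather than on the connectionist network that realizes the grammar.

```python
# Hypothetical constraint weights (a larger weight means a costlier violation)
WEIGHTS = {"NoCoda": 2.0, "Max": 3.0, "Dep": 1.0}

# Hypothetical violation counts for three candidate outputs of an input /pat/
CANDIDATES = {
    "pat":   {"NoCoda": 1, "Max": 0, "Dep": 0},  # keeps the final consonant
    "pa":    {"NoCoda": 0, "Max": 1, "Dep": 0},  # deletes the final consonant
    "pa.ta": {"NoCoda": 0, "Max": 0, "Dep": 1},  # inserts a vowel
}

def harmony(violations, weights):
    """Harmony as the negated weighted sum of constraint violations."""
    return -sum(weights[c] * n for c, n in violations.items())

def optimal(candidates, weights):
    """Return the candidate with maximal Harmony."""
    return max(candidates, key=lambda name: harmony(candidates[name], weights))

for name, violations in CANDIDATES.items():
    print(name, harmony(violations, WEIGHTS))
print("winner:", optimal(CANDIDATES, WEIGHTS))
```

With these assumed weights the vowel-insertion candidate wins because it violates only the lowest-weighted constraint; changing the weights changes the winner, which is how different grammars arise in this framework.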
Smolensky’s most influential work arose from what was intended to be a confrontation in 1988 with Alan Prince, a preeminent phonologist also known as a critic of connectionist research on language. Smolensky and Prince found a strong basis for collaboration in their shared respect for formal analysis and explanation in cognitive science. Addressing phonology, and taking Harmonic Grammar as a starting point, they built Optimality Theory (OT), which adds strong principles of restricted, universal grammatical explanation [5, 8]. OT provides the first formal theory of cross-linguistic typology, postulating that all grammars are built of literally the same set of well-formedness constraints; crucially, however, these constraints, like those of connectionist networks, are conflicting and violated in well-formed structures. A possible grammar is precisely a hierarchical constraint ranking, in which each constraint has absolute priority over all lower-ranked constraints combined. This crisp theory of constraint interaction enables a singular precision of grammatical analysis and explanation, as demonstrated in a series of penetrating papers by Prince and collaborators.
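Strict ranking and the resulting typology can be sketched in Python by comparing violation profiles lexicographically and enumerating every ranking of a shared constraint set. The constraints, candidates, and violation counts below are hypothetical and serve only to illustrate the mechanism; this is not the authors’ implementation.

```python
from itertools import permutations

CONSTRAINTS = ("NoCoda", "Max", "Dep")

# Hypothetical violation counts for three candidate outputs of an input /pat/
CANDIDATES = {
    "pat":   {"NoCoda": 1, "Max": 0, "Dep": 0},
    "pa":    {"NoCoda": 0, "Max": 1, "Dep": 0},
    "pa.ta": {"NoCoda": 0, "Max": 0, "Dep": 1},
}

def profile(violations, ranking):
    """Violation counts ordered from the highest- to the lowest-ranked constraint."""
    return tuple(violations[c] for c in ranking)

def ot_winner(candidates, ranking):
    """Pick the candidate whose profile is lexicographically smallest.

    Tuple comparison gives each constraint absolute priority over all
    lower-ranked constraints combined.
    """
    return min(candidates, key=lambda name: profile(candidates[name], ranking))

def typology(candidates, constraints):
    """Map every possible ranking (grammar) to its optimal output."""
    return {r: ot_winner(candidates, r) for r in permutations(constraints)}

for ranking, winner in typology(CANDIDATES, CONSTRAINTS).items():
    print(" >> ".join(ranking), "->", winner)

# Strict domination: one violation of the top constraint outweighs
# any number of violations of a lower-ranked one.
print(ot_winner({"x": {"Max": 1, "Dep": 0}, "y": {"Max": 0, "Dep": 5}},
                ("Max", "Dep")))
```

Enumerating the rankings yields the set of output patterns the constraint set predicts; in this toy example the six rankings produce three distinct winners.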
After the circulation of Prince & Smolensky’s OT book manuscript in 1993, OT rapidly became a dominant theory of phonology [14; http://roa.rutgers.edu], the first major challenger to Chomsky and Halle’s serial symbol-manipulation framework which had provided the field’s foundation since the 1960s. Smolensky’s own contributions to linguistic theory since 1993 have addressed a range of formal issues in phonology, syntax, and semantics, such as the grammaticization of scales, the supra-linear interaction of local constraint violations, underspecification, and the structure of features in phonological representations [10: §14]; the initial state of the learner [10: §12]; and the grammatical role of competition among interpretations in comprehension [6, 11].
In collaboration with a number of leading linguists, Smolensky has played a major role in the expansion of OT outside phonology. Work in 1993 with Legendre on the typology of grammatical voice and case marking systems was the first published OT work outside phonology [10: §15]; their joint research on the typology of wh-questions [10: §16] also played a ground-breaking role in creating the field of OT syntax. Bruce Tesar’s work with Smolensky in 1993 on learnability of OT grammars remains the cornerstone of that active area. Experimental research with Peter Jusczyk established a new paradigm for probing young infants’ knowledge of phonological grammar [10: §17]. Collaborations with Lisa Davidson [10: §17] and with Suzanne Stevenson [10: §19] pushed OT towards a theory of performance in phonological production and syntactic comprehension, respectively.
As a result, OT is not simply a theory of phonological competence; it is the grammatical component of an emerging unified theory of linguistic cognition, integral to not just phonological, syntactic and semantic knowledge, but also to the theory of performance, both production and comprehension, and to learning. It is a high-level description of connectionist processes using spreading activation to perform sub-symbolic maximization of the well-formedness of distributed linguistic representations [10: §21].
Outside his own research, Smolensky has also worked to promote a formal, principle-based, aggressively interdisciplinary vision of cognitive science, strongly influenced by his training with Rumelhart and McClelland. This vision has driven his efforts as two-time President of the Cognitive Science Society, as President of the Society for Philosophy and Psychology, as lecturer at the Linguistic Society of America Summer Institute and Annual Conference, and as Chair of the Cognitive Science Department at Johns Hopkins University, where he has built a strong PhD program that is training a new generation of innovative, multidisciplinary cognitive scientists. Smolensky’s students are playing a leading role in extending linguistics to embrace the full cognitive science of language; they include M. Goldrick and J. Hale as well as the speakers featured at the Cognitive Science Society’s Rumelhart Prize Symposium, L. Davidson, A. Gafos, B. Tesar, and C. Wilson.
SELECTED BIBLIOGRAPHY
1. Smolensky, Paul. 1986. Information processing in dynamical systems: Foundations of harmony theory. In Parallel distributed processing: Explorations in the microstructure of cognition. Vol. 1, Foundations, David E. Rumelhart, James L. McClelland and the PDP Research Group, 194-281. MIT Press.
2. Smolensky, Paul. 1986. Neural and conceptual interpretations of parallel distributed processing models. In Parallel distributed processing: Explorations in the microstructure of cognition. Vol. 2, Psychological and biological models, David E. Rumelhart, James L. McClelland and the PDP Research Group, 390-431. MIT Press.
3. Rumelhart, David E., Paul Smolensky, James L. McClelland, and Geoffrey E. Hinton. 1986. Schemata and sequential thought processes in parallel distributed processing. In Parallel distributed processing: Explorations in the microstructure of cognition. Vol. 2, Psychological and biological models, David E. Rumelhart, James L. McClelland and the PDP Research Group, 7-57. MIT Press.
4. Smolensky, Paul. 1988. On the proper treatment of connectionism. The Behavioral and Brain Sciences 11, 1-74.
5. Prince, Alan, and Paul Smolensky. 1993/2004. Optimality Theory: Constraint interaction in generative grammar. Technical Report, Rutgers University and University of Colorado at Boulder, 1993. Revised version published by Blackwell, 2004. Rutgers Optimality Archive 537.
6. Smolensky, Paul. 1996. On the comprehension/production dilemma in child language. Linguistic Inquiry 27, 720-31. Rutgers Optimality Archive 118.
7. Smolensky, Paul, Michael C. Mozer, and David E. Rumelhart, eds. 1996. Mathematical perspectives on neural networks. Erlbaum.
8. Prince, Alan, and Paul Smolensky. 1997. Optimality: From neural networks to universal grammar. Science 275, 1604-10.
9. Tesar, Bruce B., and Paul Smolensky. 2000. Learnability in Optimality Theory. MIT Press.
10. Smolensky, Paul, and Géraldine Legendre. 2005. The harmonic mind: From neural computation to Optimality-Theoretic grammar. Vol 1: Cognitive architecture. Vol 2: Linguistic and philosophical implications. MIT Press.
Additional Citations
11. Blutner, Reinhard, and Henk Zeevat, eds. 2003. Pragmatics in Optimality Theory. Palgrave Macmillan.
12. Legendre, Géraldine, Sten Vikner, and Jane Grimshaw, eds. 2001. Optimality-Theoretic syntax. MIT Press.
13. Macdonald, Cynthia, and Graham Macdonald. 1995. Connectionism: Debates on psychological explanation. vol. 2 Blackwell.
14. McCarthy, John J., ed. 2004. Optimality Theory in phonology: A reader. Blackwell.
15. Prince, Alan S. 2006. The structure of Optimality Theory.
2004 Recipient - John R. Anderson
John R. Anderson, Richard King Mellon Professor of Psychology and Computer Science at Carnegie Mellon University, is an exemplary recipient for a prize that is intended to honor “a significant contemporary contribution to the formal analysis of human cognition”. For the last three decades, Anderson has been engaged in a vigorous research program with the goal of developing a computational theory of mind. Anderson’s work is framed within the symbol processing framework and has involved an integrated program of experimental work, mathematical analyses, computational modeling, and rigorous applications. His research has provided the field of cognitive psychology with comprehensive and integrated theories. Furthermore, it has had a real impact on educational practice in the classroom and on student achievement in learning mathematics.
Anderson’s contributions have arisen across a career that consists of five distinct phases. Phase 1 began when he entered graduate school at Stanford at a time when cognitive psychology was incorporating computational techniques from artificial intelligence. During this period and immediately after his graduation from Stanford, he developed a number of simulation models of various aspects of human cognition such as free recall. His major contribution from this time was the HAM theory, which he developed with Gordon Bower. In 1973, he and Bower published the book Human Associative Memory, which immediately attracted the attention of everyone then working in the field. The book played a major role in establishing propositional semantic networks as the basis for representation in memory and spreading activation through the links in such networks as the basis for retrieval of information from memory. It also provided an initial example of a research style that has become increasingly used in cognitive science: to create a comprehensive computer simulation capable of performing a range of cognitive tasks and to test this model with a series of experiments addressing the phenomena within that range.
Dissatisfied with the limited scope of his early theory, Anderson undertook the work which has been the major focus of his career to date, the development of the ACT theory. ACT extended the HAM theory by combining production systems with semantic nets and the mechanism of spreading activation. The second phase of Anderson’s career is associated with the initial development of ACT. The theory reached a significant level of maturity with the publication in 1983 of The Architecture of Cognition, which is the most cited of his research monographs (having received almost 2000 citations in the ensuing years). At the time of publication, the ACT* model described in this book was the most integrated model of cognition that had then been produced and tested. It has had a major impact on the theoretical development of the field and on the movement toward comprehensive and unified theories, incorporating separation of procedural and declarative knowledge and a series of mechanisms for production rule learning that became the focus of much subsequent research on the acquisition of cognitive skills. In his own book on Unified Theories of Cognition, Allen Newell had this to say: “ACT*, is in my opinion, the first unified theory of cognition. It has pride of place…. [It] provides a threshold of success which all other candidates… must exceed”.
Anderson then began a major program to test whether ACT* and its skill acquisition mechanisms actually provided an integrated and accurate account of learning. He started to apply the theory to development of intelligent tutoring systems; this defines the third phase of his research. This work grew from an initial emphasis on teaching the programming language LISP to a broader focus on high-school mathematics , responding to perceptions of a national crisis in mathematics education. These systems have been shown to enable students to reach target achievement levels in a third of the usual time and to improve student performance by a letter grade in real classrooms. Anderson guided this research to the point where a full high school curriculum was developed that was used in urban schools. Subsequently, a separate corporation has been created to place the tutor in hundreds of schools, influencing tens of thousands of students. The tutor curriculum was recently recognized by the Department of Education as one of five “exemplary curricula” nationwide. While Anderson does not participate in that company, he continues research developing better tools for tracking individual student cognition, and this research continues to be informed by the ACT theory. His tutoring systems have established that it is possible to impact education with rigorous simulation of human cognition.
In the late 1980s, Anderson began work on what was to define the fourth phase of his research, which was an attempt to understand how the basic mechanisms of a cognitive architecture were adapted to the statistical structure of the environment. Anderson (1990) called this a rational analysis of cognition and applied it to the domains of human memory, categorization, causal inference, and problem solving. He utilized Bayesian statistics to derive optimal solutions to the problems posed by the environment and showed that human cognition approximated these solutions. Such optimization analysis and use of Bayesian techniques have become increasingly prevalent in Cognitive Science.
Subsequent to the rational analysis effort, Anderson has returned his full attention to the ACT theory, defining the fifth and current phase of his career. With Christian Lebiere, he has developed the ACT-R theory, which incorporates the insights from his work on rational analysis. Reflecting the developments in computer technology and the techniques learned in the applications of ACT*, the ACT-R system was made available for general use. A growing and very active community of well over 100 researchers is now using it to model a wide range of issues in human cognition, including dual-tasking, memory, language, scientific discovery, and game playing. It has become increasingly used to model dynamic tasks like air-traffic control, where it promises to have training implications equivalent to the mathematics tutors. Through the independent work of many researchers, the field of cognitive science is now seeing a single unified system applied to an unrivaled range of tasks. Much of Anderson’s own work on ACT-R has involved relating the theory to data from functional brain imaging.
In addition to his enormous volume of original work, Anderson has found the time to produce and revise two textbooks, one on cognitive psychology and the other on learning and memory. The cognitive psychology textbook, now in its fifth edition, helped define the course of study that is modern introductory cognitive psychology. His more recent learning and memory textbook, now in its second edition, is widely regarded as reflecting the new synthesis that is occurring in that field among animal learning, cognitive psychology, and cognitive neuroscience.
Anderson has previously served as president of the Cognitive Science Society and has received a number of awards in recognition of his contributions. In 1978 he received the American Psychological Association’s Early Career Award; in 1981 he was elected to membership in the Society of Experimental Psychologists; in 1994 he received APA’s Distinguished Scientific Contribution Award; and in 1999 he was elected to both the National Academy of Sciences and the American Academy of Arts and Sciences. Currently, as a member of the National Academy, he is working towards bringing more rigorous science standards to educational research.
SELECTED BIBLIOGRAPHY
Anderson, J. R., & Bower, G. H. (1972). Recognition and retrieval processes in free recall. Psychological Review, 79, 97-123.
Anderson, J. R., & Bower, G. H. (1973). Human associative memory. Washington: Winston and Sons.
Anderson, J. R. (1976). Language, memory, and thought. Hillsdale, NJ: Erlbaum.
Anderson, J. R. (1983). The Architecture of Cognition. Cambridge, MA: Harvard University Press.
Anderson, J. R., Corbett, A. T., Koedinger, K., & Pelletier, R. (1995). Cognitive tutors: Lessons learned. The Journal of Learning Sciences, 4, 167-207.
Anderson, J. R. (1990). The Adaptive Character of Thought. Hillsdale, NJ: Erlbaum.
Anderson, J. R., & Lebiere, C. (1998). The atomic components of thought. Mahwah, NJ: Erlbaum.
Anderson, J. R., Qin, Y., Sohn, M-H., Stenger, V. A., & Carter, C. S. (2003). An information-processing model of the BOLD response in symbol manipulation tasks. Psychonomic Bulletin & Review, 10, 241-261.
Anderson, J. R. (2000). Cognitive Psychology and Its Implications: Fifth Edition. New York: Worth Publishing.
Anderson, J. R. (2000). Learning and Memory, Second Edition. New York: Wiley.
Joshi was the Henry K. Salvatore Professor of Computer and Cognitive Science at the University of Pennsylvania. He has previously received considerable recognition for his accomplishments. Three of his honors are particularly worthy of note. In 1997, he was the recipient of the highest honor in the field of artificial intelligence, the Research Excellence Award of the International Joint Conference of Artificial Intelligence (IJCAI), a distinction held by only eight other outstanding computer scientists. In 1999, he was appointed to the National Academy of Engineering, the only researcher in Natural Language Processing to have ever received this distinction. And just this year, Joshi was chosen to be the first recipient of the Lifetime Achievement Award given by the Association for Computational Linguistics.
Joshi has contributed a number of key ideas to the formal science of language. Perhaps the best known of these is Tree Adjoining Grammar (TAG). His work on TAG has played an important role in both natural language processing and theoretical linguistics. In both disciplines, it stands as a monument to the value of principled mathematical thinking. Two key ideas underlying TAG are, first, that the statement of local syntactic and semantic dependencies can be factored apart from recursion and, second, that a modest increase in power beyond context-free grammar is sufficient to characterize natural language syntax. The TAG adjoining operation, as defined by Joshi, achieves both of these results in a strikingly elegant way, providing a powerful tool for linguistic description that at the same time yields grammars guaranteed to be computationally tractable. A large body of mathematical, computational, empirical linguistic, and psycholinguistic work by Joshi and numerous others has been developing the consequences of Joshi’s original insight for more than a quarter of a century.
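The adjoining operation described above can be illustrated with a small sketch. This is not Joshi’s formal definition, only a minimal Python rendering of the usual textbook picture: an auxiliary tree whose root and foot carry the same label is spliced into a tree at a node with that label, and the displaced subtree is re-attached under the foot. The `Node` class, the function names, and the example sentence are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    is_foot: bool = False  # marks the foot node of an auxiliary tree


def leaves(node: Node) -> List[str]:
    """Return the leaf labels of a tree, left to right."""
    if not node.children:
        return [node.label]
    out: List[str] = []
    for child in node.children:
        out.extend(leaves(child))
    return out


def find_foot(node: Node) -> Optional[Node]:
    """Depth-first search for the foot node of an auxiliary tree."""
    if node.is_foot:
        return node
    for child in node.children:
        found = find_foot(child)
        if found is not None:
            return found
    return None


def adjoin(target: Node, aux: Node) -> None:
    """Adjoin auxiliary tree `aux` at node `target` (mutates both trees)."""
    foot = find_foot(aux)
    if foot is None or foot.label != aux.label or aux.label != target.label:
        raise ValueError("root, foot, and target labels must match")
    # The material originally below the target moves under the foot node...
    foot.children = target.children
    foot.is_foot = False
    # ...and the auxiliary tree's structure takes its place at the target.
    target.children = aux.children


# Initial tree: [S [NP Harry] [VP [V likes] [NP peanuts]]]
vp = Node("VP", [Node("V", [Node("likes")]), Node("NP", [Node("peanuts")])])
initial = Node("S", [Node("NP", [Node("Harry")]), vp])

# Auxiliary tree: [VP [Adv really] VP*], where VP* is the foot node
aux = Node("VP", [Node("Adv", [Node("really")]), Node("VP", is_foot=True)])

adjoin(vp, aux)
sentence = " ".join(leaves(initial))
print(sentence)
```

In this sketch the verb and its arguments stay together in the initial tree (the local dependencies), while the modifier enters only through adjoining, which can be repeated at the same node. That is one way to picture how recursion is factored apart from local dependencies.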
Joshi’s work in mathematical linguistics over the years has had an extraordinary impact on linguistic theory, beyond the impact of TAGs themselves. To give just two examples here: (a) Joshi’s generalization of an earlier result of Stan Peters’ to show that arbitrary booleans of context sensitive filters on context free grammars still result in context free languages led directly to the development of Gerald Gazdar’s GPSG framework (actually first developed, we believe, while Gazdar was visiting Penn). (b) the generalization of TAGs to an entire class of languages (the so-called “Mildly Context Sensitive Languages”) provided a natural way to relate a number of superficially distinct linguistic theories from Combinatory Categorial Grammar to Head Grammar and HPSG to Government-Binding Theory and Minimalism.
Another key contribution of Joshi’s to the science of language (along with Weinstein and Grosz) is Centering Theory, a computationally tractable model of attention during discourse. Centering Theory has attracted a wide following among linguists and computer scientists working on formal models of discourse. Its leading idea is that referring expressions can be ranked on the basis of various structural properties and that these rankings can predict the likely coreferents of anaphoric expressions in discourse. These predictions can be used in the automatic processing of discourse, but they also have a fine-grained structure, so that the theory can be used to show how different choices of coreference produce different pragmatic effects. The theory is attractive in part because it provides a framework for capturing not only the relationship of a current utterance to previous utterances but also its relationship to expectations regarding utterances yet to come. Perhaps most strikingly, it has yielded the first successful objective definition of the notion of “topic” or “theme”, a concept long thought important by linguists but notoriously difficult to nail down. Centering Theory has been found relevant for modeling a number of properties related to discourse coherence, including anaphora resolution, the distribution of various types of pronouns, the felicity conditions of marked syntactic forms, and aspects of prosody.
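The ranking idea can be made concrete with a small sketch. The text above says only that referring expressions are ranked by “various structural properties”; the particular ranking used here (subject before object before everything else) is an assumption chosen for illustration, as are the function names and the example discourse. The sketch ranks the entities of each utterance and then picks, as the link to the preceding utterance, the highest-ranked entity of that utterance that is mentioned again.

```python
# Assumed ranking for illustration only: lower number = more prominent.
ROLE_RANK = {"subject": 0, "object": 1, "other": 2}


def forward_centers(utterance):
    """Rank the entities of an utterance, most prominent first.

    `utterance` is a list of (entity, grammatical_role) pairs.
    """
    ranked = sorted(utterance, key=lambda pair: ROLE_RANK[pair[1]])
    return [entity for entity, _role in ranked]


def backward_center(previous, current):
    """Return the highest-ranked entity of `previous` that recurs in `current`.

    Returns None when the two utterances share no entity.
    """
    mentioned = {entity for entity, _role in current}
    for entity in forward_centers(previous):
        if entity in mentioned:
            return entity
    return None


# Hypothetical three-utterance discourse, with references already resolved.
u1 = [("Susan", "subject"), ("Betsy", "object"), ("hamster", "other")]
u2 = [("Susan", "subject"), ("Betsy", "object")]
u3 = [("Betsy", "subject")]

print(forward_centers(u1))      # ranking of entities in the first utterance
print(backward_center(u1, u2))  # link between utterances 1 and 2
print(backward_center(u2, u3))  # link between utterances 2 and 3
```

The entity returned by `backward_center` plays roughly the role of the “topic” mentioned above. In this toy discourse it stays on Susan and then shifts to Betsy once Susan is no longer mentioned.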
Aside from his scientific work, Joshi has played a key organizational role in fostering the development of the new discipline of cognitive science. Over the past two decades and more, the University of Pennsylvania has developed a thriving program in cognitive science, largely due to the outstanding vision and tireless leadership that Joshi contributed to the effort. From his graduate student days onward, he was concerned with the interface between computation and cognition, working with early pioneers like Zellig Harris and Saul Gorn. By the late 1970s Joshi had established an interdisciplinary faculty seminar that included psychologists and linguists, as well as computer scientists. This seminar was one of the early recipients of support from the Sloan Foundation’s cognitive science initiative. Later he led the effort at Penn to win an NSF Science and Technology Center for cognitive science and was founding co-director (with Lila Gleitman) of the Institute for Research in Cognitive Science (IRCS) at Penn, a post which he held with great success until last year. His approach to these efforts was always to foster the broadest possible participation by researchers in different domains and with different orientations, and always to emphasize the importance of educating young researchers and of supporting them morally and materially. IRCS is one of a small number of organizations at Penn that cross school lines, and establishing it required great persistence and diplomatic skill. These were supplied by Joshi, whose commitment to a broad view of the field had convinced him that the effort was necessary.
Joshi’s writing has been both prolific and of exceptionally high quality. In the introduction of Gazdar et al.’s bibliography “Natural Language Processing in the 1980s,” they note that “The most prolific author represented, by quite a large margin, is Aravind Joshi.” Several of his most important works are listed below.
Joshi was remarkably creative and prolific. He had most recently turned his attention to the application of ideas from mathematical linguistics to the analysis of DNA and the human genome. This followed an observation made some years earlier (by others) that while CFGs provide a formal basis for the mathematical analysis of the DNA that generates hairpin structures, the pseudo-knot structures of molecules like tRNA are generated by a non-context-free DNA structure that can be nicely modelled by TAG.
Joshi, A. K., Levy, L., and Takahashi, M. (1975). Tree Adjunct Grammars. Journal of Computer and System Sciences.
Joshi, A. K. (1985). How much context-sensitivity is necessary for assigning structural descriptions: Tree adjoining grammars. Natural Language Parsing, (ed. D. Dowty, L. Karttunen, and A. Zwicky), Cambridge University Press.
Joshi, A. K. (1990). Processing crossed and nested dependencies: an automaton perspective on the psycholinguistic results. Language and Cognitive Processes 5(1), 1-27.
Schabes, Y. and Joshi, A. (1991). Parsing with lexicalized tree adjoining grammar. In Tomita, Ed., Current Issues in Parsing Technologies. Kluwer, Boston.
Joshi, A. and Bangalore, S. (1994). Disambiguation of Super Parts of Speech (or Supertags): Almost Parsing. COLING 94.
Joshi, A., Becker, T., and Rambow, O. (1994). Complexity of Scrambling: a new twist to the competence/performance distinction. Tree-Adjoining Grammars: Formalisms, Linguistic Analysis and Processing (eds. A. Abeille and O. Rambow), CSLI Publications, Stanford University, pp. 167-182.
Grosz, B., Joshi, A. K., and S. Weinstein. (1995). Centering: A framework for modeling local coherence of discourse. Computational Linguistics.
Joshi, A. and Schabes, Y. (1996). Tree Adjoining Grammars. In G. Rozenberg and A. Salomaa, Eds., Handbook of Formal Languages.
Webber, B., Knott, A., Stone, M. and Joshi, A. (1999). Discourse relations: A structural and presuppositional account using lexicalised TAG. ACL 36.
Bangalore, S. and Joshi, A. (1999) Supertagging: an Approach to Almost Parsing. Computational Linguistics.
2002 Recipient - Richard Shiffrin
Shiffrin has made many contributions to the modeling of human cognition in areas ranging from perception to attention to learning, but is best known for his long-standing efforts to develop explicit models of human memory. His most recent models use Bayesian, adaptive approaches, building on previous work but extending it in a critical new manner, and carrying his theory beyond explicit memory to implicit learning and memory processes. The theory has been evolving for about 35 years, and as a result represents a progression similar to the best theories seen in any branch of science.
Shiffrin’s major effort began in 1968, in a chapter with Atkinson [1] that laid out a model of the components of short- and long-term memory and described the processes that control the operations of memory. The Atkinson-Shiffrin model encapsulated empirical and theoretical results from a very large number of publications that modeled quantitatively the relation of short- to long-term memory. It achieved its greatest success by showing the critical importance, and the possibility, of modeling the control processes of cognition. This chapter remains one of the most cited works in the entire field of psychology.
Shiffrin’s formal theory was taken forward in a quantum leap in 1980 [2] and 1981 [3] with the SAM (Search of Associative Memory) model. This was a joint effort with Jeroen Raaijmakers, then a graduate student. The SAM model quantified the nature of retrieval from long-term memory, and characterized recall as a memory search with cycles of sampling and recovery. The SAM theory precisely incorporates the notions of interactive cue combination that are now seen to lie at the heart of memory retrieval. Another major step occurred in 1984 [4], when the theory was extended to recognition memory. With another former student, Gary Gillund, Shiffrin initiated what has become the standard approach to recognition memory, in which a decision is based on summed activation of related memory traces. It was a major accomplishment that the same retrieval activations that had been used in the recall model could be carried forward and used to predict a wide range of recognition phenomena. The next major step occurred in 1990, when Shiffrin published two articles on the list-strength effect with his student Steve Clark and his colleague, Roger Ratcliff [5, 6]. This research was of critical importance in that it established clearly that experience leads to the differentiation, rather than the mere strengthening, of the representations of items in memory.
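The ideas of cue combination, sampling, recovery, and summed activation can be sketched in a few lines of Python. The text above gives no equations, so the specific functional forms below are assumptions: each trace’s activation is taken to be the product of its strengths to the current cues (raised to cue weights), sampling probability is that activation divided by the total, recovery probability rises with the weighted sum of strengths, and recognition familiarity is the sum of activations over all traces. The function names, strength values, and the simplified search loop are hypothetical.

```python
import math
import random


def trace_activations(strengths, weights):
    """Activation of each trace: product over cues of strength ** weight.

    strengths[i][j] is the assumed strength between trace i and cue j.
    """
    return [math.prod(s ** w for s, w in zip(row, weights)) for row in strengths]


def sampling_probabilities(strengths, weights):
    """Probability of sampling each trace, proportional to its activation."""
    acts = trace_activations(strengths, weights)
    total = sum(acts)
    return [a / total for a in acts]


def familiarity(strengths, weights):
    """Summed activation over all traces, used for a recognition decision."""
    return sum(trace_activations(strengths, weights))


def recovery_probability(row, weights):
    """Assumed chance that a sampled trace is successfully recovered."""
    return 1.0 - math.exp(-sum(w * s for s, w in zip(row, weights)))


def recall(strengths, weights, attempts, rng):
    """Simplified search: repeated cycles of sampling, then recovery."""
    probs = sampling_probabilities(strengths, weights)
    recalled = set()
    for _ in range(attempts):
        i = rng.choices(range(len(probs)), weights=probs)[0]
        if i not in recalled and rng.random() < recovery_probability(strengths[i], weights):
            recalled.add(i)
    return recalled


# Three traces, two cues (for example a context cue and an item cue).
strengths = [[2.0, 1.0], [1.0, 1.0], [0.5, 2.0]]
weights = [1.0, 1.0]

probs = sampling_probabilities(strengths, weights)
fam = familiarity(strengths, weights)
recalled = recall(strengths, weights, attempts=10, rng=random.Random(0))
print(probs, fam, sorted(recalled))
```

The same `trace_activations` values drive both the recall search and the familiarity sum, mirroring the point that the retrieval activations from the recall model were carried forward to recognition. A recognition decision would compare `fam` against a criterion. The criterion, stopping rules, and learning during retrieval are omitted from this sketch.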
In 1997, the theory evolved in a radical direction in an important paper with another former student, Mark Steyvers [7]. Although the changes were fundamental, the new model, REM (Retrieving Effectively from Memory), retained the best concepts of its predecessors, so that the previous successful predictions were also a part of the new theory. REM added featural representations to capture similarity relations among items in memory. Building on earlier ideas by John Anderson, and related ideas developed in parallel by McClelland and Chappell, Shiffrin used Bayesian principles of adaptive and optimal decision making under constraints to guide the selection of the quantitative form of the activation functions. In addition, storage principles were set forth that provided mechanisms by which episodic experience could coalesce over development and experience into permanent non-contextualized knowledge. This latter development allowed the modeling of implicit memory phenomena, in work that is just now starting to appear in many journals, including a theory of long-term priming [with Schooler and Raaijmakers, 8] and a theory of short-term priming [with his student David Huber and others, 9]. The short-term priming research showed that the direction of priming can be reversed by extra study given to particular primes, leading to another conceptual breakthrough. A new version of the REM model explains this and other findings by assuming that some prime features are confused with test item features, and that the system attempts to deal with this situation optimally by appropriate discounting of evidence from certain features.
Shiffrin received his Ph.D. from the Mathematical Psychology Program in the Department of Psychology at Stanford University in 1968, the year after Rumelhart received his degree from the same program. Since 1968 he has been on the faculty of the Department of Psychology at Indiana University, where he is now the Luther Dana Waterman Professor of Psychology and Director of the Cognitive Science Program. Shiffrin has accumulated many honors, including membership in the National Academy of Sciences and the American Academy of Arts and Sciences, the Howard Crosby Warren Award of the Society of Experimental Psychologists, and a MERIT Award from the National Institute of Mental Health. Shiffrin has served the field as editor of the Journal of Experimental Psychology: Learning, Memory, and Cognition, and as a member of the governing boards of several scientific societies.
Cited Publications by Richard M. Shiffrin
[1] Atkinson, R. C., & Shiffrin, R. M. (1968). Human memory: A proposed system and its control processes. In K. W. Spence and J. T. Spence (Eds.), The Psychology of Learning and Motivation: Advances in Research and Theory (Vol. 2, pp. 89-195). New York: Academic Press.
[2] Raaijmakers, J. G. W., & Shiffrin, R. M. (1980). SAM: A theory of probabilistic search of associative memory. In Bower, G. H. (Ed.), The Psychology of Learning and Motivation, Vol. 14, 207-262. New York: Academic Press.
[3] Raaijmakers, J. G. W., & Shiffrin, R. M. (1981). Search of associative memory. Psychological Review, 88, 93-134.
[4] Gillund, G., & Shiffrin, R. M. (1984). A retrieval model for both recognition and recall. Psychological Review, 91, 1-67.
[5] Ratcliff, R., Clark, S., & Shiffrin, R. M. (1990). The list-strength effect: I. Data and discussion. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16, 163-178.
[6] Shiffrin, R. M., Ratcliff, R., & Clark, S. (1990). The list-strength effect: II. Theoretical mechanisms. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16, 179-195.
[7] Shiffrin, R. M., & Steyvers, M. (1997). A model for recognition memory: REM: Retrieving effectively from memory. Psychonomic Bulletin and Review, 4 (2), 145-166.
[8] Schooler, L., Shiffrin, R. M., & Raaijmakers, J. G. W. (2001). A model for implicit effects in perceptual identification. Psychological Review, 108, 257-272.
[9] Huber, D. E., Shiffrin, R. M., Lyle, K. B., & Ruys, K. I. (2001). Perception and preference in short-term word priming. Psychological Review, 108, 149-182.
2001 Recipient - Geoffrey Hinton
Geoffrey Hinton received his BA in experimental psychology from Cambridge in 1970 and his PhD in Artificial Intelligence from Edinburgh in 1978. He did postdoctoral work at Sussex University and the University of California, San Diego and spent five years as a faculty member in the Computer Science department at Carnegie-Mellon University. He then moved to Toronto where he was a fellow of the Canadian Institute for Advanced Research and a Professor in the Computer Science and Psychology departments. He is a former president of the Cognitive Science Society, and he is a fellow of the Royal Society (UK), the Royal Society of Canada, and the American Association for Artificial Intelligence. In 1992 he won the ITAC/NSERC award for contributions to information technology.
Hinton is currently Director of the Gatsby Computational Neuroscience Unit at University College London, where he leads an outstanding group of faculty, post-doctoral research fellows, and graduate students investigating the computational neural mechanisms of perception and action with an emphasis on learning. His current main interest is in unsupervised learning procedures for neural networks with rich sensory input.
Cited Publications by Geoffrey E. Hinton
(1) Hinton, G. E. and Anderson, J. A. (1981) Parallel Models of Associative Memory, Erlbaum, Hillsdale, NJ.
(2) Hinton, G. E. (1981) Implementing semantic networks in parallel hardware. In Hinton, G. E. and Anderson, J. A. (Eds.), Parallel Models of Associative Memory, Erlbaum, Hillsdale, NJ.
(3) Hinton, G. E. and Sejnowski, T. J. (1983) Optimal perceptual inference. Proceedings of the IEEE conference on Computer Vision and Pattern Recognition, Washington DC.
(4) Ackley, D. H., Hinton, G. E., and Sejnowski, T. J. (1985) A learning algorithm for Boltzmann machines. Cognitive Science, 9, 147–169.
(5) Rumelhart, D. E., Hinton, G. E., and Williams, R. J. (1986) Learning representations by back-propagating errors. Nature, 323, 533–536.
(6) Jacobs, R., Jordan, M. I., Nowlan, S. J. and Hinton, G. E. (1991) Adaptive mixtures of local experts. Neural Computation, 3, 79–87.
(7) Hinton, G. E., Dayan, P., Frey, B. J. and Neal, R. (1995) The wake-sleep algorithm for unsupervised Neural Networks. Science, 268, 1158–1161.