Wednesday, August 27, 2025

2b. Harnad, S. (2008) The Annotation Game: On Turing (1950) on Computing, Machinery and Intelligence

 2b. Harnad, S. (2008) The Annotation Game: On Turing (1950) on Computing, Machinery and Intelligence. In: Epstein, Robert & Peters, Grace (Eds.) Parsing the Turing Test: Philosophical and Methodological Issues in the Quest for the Thinking Computer. Springer.

This is Turing's classical paper, with every passage quoted and commented to highlight what Turing said, might have meant, or should have meant. The paper was equivocal about whether the full robotic test (T3) was intended, or only the email/penpal test (T2); whether all candidates are eligible, or only computers; and whether the criterion for passing is really total, lifelong equivalence and indistinguishability, or merely fooling enough people enough of the time. Once these uncertainties are resolved, Turing's Test remains cognitive science's rightful (and sole) empirical criterion today.

52 comments:

  1. Building on my earlier post, I think the main takeaway from the Turing Test and the reading is that we need not be concerned with the Other-Minds Problem. As mentioned, whether machines beyond our own selves can feel is undecidable; there is no methodology for resolving that uncertainty. Instead, we can focus on the Turing Test's capacity to identify what can and cannot perform as a human, which makes it valuable as a framework for reverse-engineering cognition. Yes, it doesn't prove consciousness, but it is a useful, rigorous test for observable cognitive capacity. I find it interesting that it is readily accepted that we cannot prove that other humans feel, yet we are more hesitant to extend the same to machines. Is it not enough to allow the answer to be uncertain? Instead, we struggle with rejecting the idea that machines do or do not think...

    Replies
    1. Emily, yes, the Other Minds Problem (OMP) for human beings is not a problem for cognitive science, just for philosophy (epistemology, scepticism).

      The OMP for nonhuman animals is a part of cognitive science, but for mammals, birds, fish, and other vertebrates it is almost as close to certain that they feel as it is that humans feel.

      For certain invertebrates, however (e.g., octopuses, lobsters, bees), the OMP is being actively studied in cognitive science (Week 11).

      For organisms that lack a nervous system, such as single-celled microbes, plants, and fungi, it is highly improbable that they feel.

      And the uncertainty is near zero for inanimate objects such as rocks, mountains, rivers, clouds, planets, stars, or atoms -- including all human-made artifacts so far, such as clocks, cars, thermostats, computers or robots.

      Certain people ("animists" and "panpsychists") believe that "everything" feels (although it's not clear what counts as a "thing"...).

      ["Matrix" thinking (and perhaps C=C computationalism) may be a form of animism (all of which may be driven by human evolution, including the evolution of mirror-neuron and "mind-reading" capacities, including language (Week 4)).]

    2. As Emily pointed out, it is "interesting that it is readily accepted that we cannot prove that other humans feel but are more hesitant to extend the same to machines." I personally definitely have this intuition, and I appreciate the reading for clarifying why we feel this way. We feel comfortable "mind-reading" other people because they behave so similarly to us. The reading says, "we are not worried about the minds of our fellow-human beings, because they behave just like us and we know how to mind-read their behavior. By the same token, we have no more or less reason to worry about the minds of anything else that behaves just like us -- so much so that we can't tell them apart from other human beings." This helped me better understand the argument that the Turing Test should be T3, where machines interact with the real world, rather than T2, where machines just converse. I like the example of the simulated planes: simulated planes are not the same as real flying planes, so T3 can reduce our hesitation and help us be more willing to "mind-read" machines, since they would be behaving the way we do in life.

    3. Annabelle, I think you're beginning to get it...

    4. ***EVERYBODY PLEASE NOTE: I REDUCED THE MINIMUM NUMBER OF SKYWRITINGS. BUT THE READINGS ARE **ALL** RELEVANT TO AN OVERALL UNDERSTANDING OF THE COURSE. SO, EVEN IF YOU DO NOT DO A SKYWRITING ON ALL OF THEM, AT LEAST FEED EACH READING YOU DO NOT READ TO CHATGPT AND ASK IT FOR A SUMMARY, SO YOU KNOW WHAT THE READING SAID — OTHERWISE YOU WILL NOT HAVE A COMPLETE GRASP OF THE COURSE TO INTEGRATE AND INTERCONNECT FOR THE FINAL EXAM.***

  2. In my opinion, the introduction of the different levels of the Turing Test (t0, T2, T3, etc.) was helpful in clarifying the underlying importance of the Turing Test as a whole. This hierarchy really helped frame Turing's point as a clearly scientific methodology; this was key because one of the questions I had initially when reading Turing's paper was, quite frankly, "what's the point?" Why does it matter if a computer can pass as a human? This paper as a whole helped me get a better grasp of the possible applications of the Turing Test and how it can be used to guide further research in the field of cognitive science.

    Replies
    1. Elle, yes, Turing-Testing the success of the reverse-engineering is important, but the real work is in discovering and designing the causal mechanism that we can go on to T-Test!

      Computationalism -- which would "only" call for discovering the recipe for passing T2 (verbal only) -- would be easier, since it requires only the recipe, whereas T3 (verbal and sensorimotor) would require designing the 3D-Printer of the robot. T4 would require even more...

    2. I agree that the clarification of the levels of the Turing Test was extremely helpful in creating a better understanding. After understanding why the Turing Test rejects T4 and T5 and focuses on performance, it does make me wonder whether in a few years this will still be the case. If T4 focuses on neurological equivalence and T5 on molecular and physical equivalence, what is to say that machines won't master the performative aspect, and that the next concern will be creating machines that have the same neurological capacities, and soon after the same physical biology? Though the pace of technological innovation is impressive, its sheer mystery is concerning in a way that can seem to blur comprehension.

    3. Kaelyn, cognitive science is trying to reverse-engineer the causal mechanisms that produce people's capacity to do the (cognitive) things they can do. T2/T3 covers what they can do. T4 and T5 are only relevant to cognitive science where they are needed to pass T2/T3. They are of course relevant to medical science and clinical psychology.

  3. "In the process of trying to imitate an adult human mind we are bound to think a good deal about the process which has brought it to the state that it is in. We may notice three components.
    (a) The initial state of the mind, say at birth,
    (b) The education to which it has been subjected,
    (c) Other experience, not to be described as education, to which it has been subjected."
    "An important feature of a learning machine is that its teacher will often be very largely ignorant of quite what is going on inside, although he may still be able to some extent to predict his pupil's behavior."

    In this passage, Turing explicitly says that his goal is to imitate an adult mind (not brain!). The steps follow human learning to make the program as "human" as possible, so that its performance would be indistinguishable. However, if the program is only going to pass T2, I find it difficult to imagine the kind of experience it could be subjected to in a non-physical world, because it would be devoid of a physical body (Turing mentions that the machine does not have limbs). Furthermore, Turing mentions that the programmer might not always know what is happening inside the learning machine. Harnad makes clear the difference between AI, whose goal is to generate a performant tool, and CM (cognitive modelling), whose goal is to explain the generation of human cognition. Even though Turing claims that performance can lead us to an explanation of cognition (by succeeding at the Turing Test and creating a program based on human learning), I do not think that this suffices to explain "why" and "how". It seems more like AI than CM to me, because Turing's goal might be to explain cognition, but the measurements are observable verbal behaviours based on performance, lacking a more complex explanation of thinking.

    Replies
    1. Sophie, what does "more complex" mean? Are you suggesting that T2 (verbal capacity) is not enough of a test? That T3 is needed? Please explain. (Have you read 2b, the Annotation Game? Turing's "child machine" suggestion for T2 has been discussed in prior comments and replies.)

    2. I think I mixed up too many concepts😅

      By "more complex," I meant that programmers would know what is going on inside the learning machine, because their goal would be to explain human cognition with CM rather than focusing on behaviour (as a goal). T2 and T3 are sufficient when defining cognition as "thinking is as thinking does," because they showcase verbal or sensorimotor abilities.
      The second thing: it seemed to me that Turing's component (c) about experience would be more relevant for T3. The candidate would at least gain experience through sensorimotor interaction with the real world.

  4. In this paper, Turing's "imitation game" is reframed as a "methodology for reverse-engineering human cognitive performance capacity." But even if we were to successfully reverse-engineer our cognitive performance capacities, how much would it truly reveal about what thinking is and the mechanisms behind it? Suppose I wanted to understand how a door works and began to reverse-engineer it. I might fill a doorframe with metal sheets and build an electronic pulley system to lift the sheets on command. This door could perform all the functions of a typical door - facilitating movement in and out, insulating heat, and so on - yet its mechanism would be entirely different (most doors open by rotating on an axis, are made of hinges and wood...). This suggests that functional replicability alone does not provide genuine insight into how something actually works.

    Marr's three levels of cognition are particularly relevant to this discussion. I believe that to successfully reverse-engineer cognition, a model must resemble humans not only on the ‘computational’ level (what problem the system is solving) but also on the ‘algorithmic’ level (what process or principles the system uses to solve the problem). As the door analogy illustrates, building a system that merely solves the same problem does not guarantee that it does so using the same underlying principles. Harnad’s introduction of T3 and T4 demonstrates his awareness that mere symbol manipulation is insufficient, yet even these stricter tests may not reveal the actual algorithms humans use. Ultimately, can we ever know with certainty that we’ve learned more about what goes on inside our ‘black box’ just by creating another one?

    Replies
    1. Jesse, if symbol-manipulation (i.e., computation) is insufficient, then computationalism (C=C) is wrong, and so no algorithm can pass T2: Is that what you mean?

      But that requires showing that C=C is wrong. That's what Searle will try to do in 3a. How?

      (See earlier replies about Marr: Was Marr a computationalist? Is Turing? Does the possibility of more than one algorithm for producing the same outcomes refute computationalism?)

    2. Jesse, what stands out in your door example is how reverse-engineering aims to get to the same final result but might use a completely different process/mechanism. To me this says that when we try to reverse-engineer human cognitive performance capacity, we assume that there is only one possible mechanism "allowing" our thinking. I say we "assume" such a thing because otherwise, just as you hint, it wouldn't truly reveal anything about the mechanisms behind OUR thinking if we engineered thinking through a process different from our own. The issue is that, to compare mechanisms, we'd first need to understand the human cognitive mechanism itself. So we end up in a circular problem: we can't tell if our engineered thinking aligns with human thought without already knowing how human thought works. Even if we succeed in reverse-engineering cognition, we may have simply "created" a new form of cognitive capacity, not revealed our own. Hence, I agree that functional replication doesn't guarantee explanatory insight, and I found your perspective genuinely interesting.

    3. Thanks for the replies, Prof. Harnad and Emanuelle.

      As for the first response, I think it’s possible to reconcile computation being insufficient with C=C still being possible (though I lean toward thinking it’s wrong). My aim with the door analogy was to show that reverse-engineering a system to produce the same results via symbol manipulation doesn’t confirm that this is how our minds actually work. This doesn’t falsify computationalism; it just raises an epistemic worry: even if C=C were true, we couldn’t know it simply by building another system that passes the tests. Don’t these tests only reveal what is possible, rather than what is actual? Maybe my logic is flawed, let me know!

      Emmanuelle, I agree with the paradox you point out. I’m starting to think the “easy problem” isn’t one for computer science to solve. How could we ever affirm the accuracy of any computational model? Even if a computational model could one day pass T4/T5, I’d still be skeptical of whether it was truly equivalent to our cognitive processes just because it behaved identically. A similar question arises at every level of the TT: “Although the computer does these things/has these properties, is it REALLY doing them?” This makes me doubt whether we’ll ever be able to distinguish genuine equivalence from mere simulation.

    4. Emanuelle, you seem to be missing a few important points:

      Cognitive Capacity: Cognitive Science is not trying to model or explain one particular individual person. It is trying to model generic, average human cognitive capacity, the kinds of (cognitive) things that any normal human can do.

      Indistinguishable Capacity: The Turing Test T2 is verbal-only: the capacity to text with anyone, on anything an average human can text about, completely indistinguishably from a real human (lifelong, in principle). That requires memory capacity and learning capacity too. (Why?) T3 (What's that?) requires a lot more.

      Turing-Indistinguishability is an extremely demanding criterion, both in science and in engineering (forward and reverse). It requires producing every observable property, whether of an individual object, or of a kind of object, or of a kind of property of a kind of object. With T2, that means producing the human capacity for chatting with one another in words. If you want to know what that is like, try it out with ChatGPT (even though it's cheating).

      Implementation-Independence. If computationalism (C=C) is correct, then, yes, there may be more than one computational recipe that can produce the same outcome (Turing Indistinguishable T2 capacity.) T3 is not just computation, however, so T3 is not hardware-independent software. T2, according to C=C, is.

      (But when we get to Evolution (Weeks 4 and 7) we'll see there's another form of equivalence, even for dynamical systems that see and move; and evolution has found more than one way to do those things too. This is called "multiple realizability" and the "underdetermination of function by structure.")

    5. Jesse, computationalism ("C=C") [cognition is just computation] can't be both insufficient and true. And "cognition is partly computation" is not computationalism (C=C).

      I think you might be conflating implementation-independence (C=S computationalism) with multiple realizability (of dynamical systems, like flying). And it really is true of both algorithms and physical dynamics that there can be more than one way to do things (maybe even gravitation).

      But that's way too far from home, into metaphysics. Searle will just try to show that C is not just C (though he forgets about the "just" too!).

      Your logic is not flawed if you are doing metaphysics (but CogSci is not).

      And it's flawed if you forget that C=C means "C is just C". Your worries are about the underdetermination of all "scientific" explanation (it's never certain):

      Consult Wikipedia or ChatGPT -- perhaps also about the metaphysical doctrine of the "Identity of Indiscernibles". (We're just doing cognitive science here...)

  5. "A device we built but without knowing how it works would suffice for AI but not for CM [cognitive modelling]."
    While reading this passage, and the ones adjacent to it, I started understanding that there may be different types of 'entities' termed 'machine'. I am inclined to believe that everything could be a machine in the broad sense of "any dynamical system". In the text, it appears that the 'machine classification' is based not necessarily on the type of causal relationship (only that there ought to be at least one), but more so on the level of understanding WE have regarding given dynamical systems. In cases in which we understand the dynamics of the causes, the intermediate states, and the consequences, we can then label a machine as a 'Turing Test candidate', which is most probably what most people think of when they think of the word 'machine' (e.g., a robot serving food in a restaurant, which we know for a fact a team of engineers understands totally).
    In a recent paper (Kagan et al., 2022), neurons learned how to play the arcade game 'Pong'. Which is more likely to end up being the better candidate for the Turing Test: an electronic device programmed to play 'Pong', or the neurons (called DishBrain in the paper)? After all, both systems' ways of transmitting information (i.e., turning causes into consequences) are well understood...

    Replies
    1. Camille, yes, any dynamical system is a machine (and not just digital computers manipulating arbitrary symbols according to recipes: "Turing Machines").

      A robot serving food is a dynamical system, and therefore a machine, and (according to the Strong C/TT) it can be modelled or simulated by just a computer manipulating arbitrary symbols according to a recipe.

      But just a computer manipulating arbitrary symbols according to a recipe cannot serve food. Please explain the difference.

      There may be many different recipes that can give exactly the same output. There may be many different recipes for playing Pong. Probably fewer recipes for playing Pong and Chess. Probably a lot fewer for playing Pong and Chess and passing T2. (Why?)

      The difference between AI and CM is not in whether we understand either the dynamics or the algorithm, but what they can do: serve food or play Pong or make drones or bombs or money? -- Or pass T2 (or T3 or T4)?

      According to C=C computationalism, algorithms alone can already pass T2; all they need is a digital computer to execute them. (The computer's dynamics are irrelevant.) We'll soon look at whether that's true.

      But for passing T3 (or T4) the dynamics cannot be irrelevant. (They're not even irrelevant for designing robots to serve food.)

    2. Is the difference between a robot serving food and a symbol-manipulating computer modelling food-serving the same as the difference "between a real airplane [...] and a computer simulation of an airplane", i.e., embodiment and situatedness? While the robot is actually serving food in the real world, the computer is merely simulating the action of serving food in a virtual world. It is similar, as mentioned earlier, to the difference between T2, which only requires verbal performance capacity that can be demonstrated in a virtual world by a virtual robot, and T3, which requires sensorimotor performance capacity that can only be demonstrated through interaction with the real world. This would explain why there are fewer recipes for playing Pong and Chess and for passing T2, because none of these requires interaction with the real world, as mentioned in the paper when addressing the Argument from Informality of Behaviour.

  6. In Turing's framing, the focus is on performance capacity, not on the inner experience of "feeling". This reminds me of the hard problem of consciousness, because even if a machine could pass T2 or T3, we would still not know if it feels anything. Professor Harnad notes that this problem may be insoluble. Here is something that came to mind for me: from a cognitive science perspective (or at least what I think is a cognitive science perspective), perhaps leaving the hard problem unsolved preserves the uniqueness of human/animal subjectivity compared to that of non-human, non-animal systems like robots, while still allowing us to study cognition observationally through behavior and performance.

    Replies
    1. Sannah, yes, the Hard Problem (of cognitive science) is to reverse-engineer and explain how and why living organisms (or anything else) can feel. And, yes, not being able to solve the Hard Problem certainly does not prevent us from trying to solve the easy problem.

      ("Subjectivity" and "consciousness" and a lot more words are just weasel-words for feeling. But preserving "uniqueness" is no compensation for not being able to explain how and why organisms feel rather than just do...)

  7. "A reasonable definition of "machine," rather than "Turing Machine," might be any dynamical, causal system. That makes the universe a machine, a molecule a machine, and also waterfalls, toasters, oysters and human beings. Whether or not a machine is man-made is obviously irrelevant. The only relevant property is that it is "mechanical" -- i.e., behaves in accordance with the cause-effect laws of physics"
    This passage makes us look at the word "machine" in a much bigger way. If a machine is just something that follows cause-and-effect rules, then everything around us counts. I think that's kind of relieving, because it reminds us that our brains aren't separate from nature; they follow the same physical laws as everything else. But at the same time, this definition feels a little too broad. If everything is a machine, then just saying "the brain is a machine" doesn't actually explain much. In class, we brought up the example of a toaster, which, along with the brain, can be considered a machine. I'd like to think, however, that they are significantly different. That means the real question is what makes some machines different, and maybe capable of thinking and feeling, while others are supposedly not. The real challenge is to figure out the special processes that make "machines" like us different from others.

    Replies
    1. Jad, yes, the Easy Problem of cognitive science is to reverse engineer cognitive capacities by discovering (and testing) them (T2, T3...). But that still leaves the Hard Problem...

  8. Humans are able to assign meaning to words by interacting with their environment and then use the meaning they've learned when interacting with each other. So it follows that for a machine to be fully indistinguishable from a human, which is the goal of the Turing Test, it must also have experience interacting with the environment, not just through linguistic data sets, which is why it is necessary to extend the definition of the original Turing Test to T3, robotic interaction. As the deeper value of the Turing Test lies in offering a methodology for building and testing models of cognition, this interaction could be crucial in finding out whether machines can do what we can do. Though we will never be able to fully know whether they feel (the other-minds problem), this falls outside the jurisdiction of cognitive science and remains a philosophical debate, as it applies just as much to other humans as it does to machines.

    Replies
    1. Sierra, yes, cognitive science has two problems to solve, the Easy and Hard one -- but what does it mean to say the Hard one is just a matter of "philosophical debate"? The capacity to feel is a real biological trait, as real as the capacity to move; it seems reasonable to want to reverse-engineer feeling too, including what it's for.

    2. In my opinion, while it is true that feeling is real, the only thing one can be completely sure of is the fact that they are feeling something in this moment. We cannot be fully sure whether other people really do feel, not to mention machines. Often we assume others 'feel' and have mental states like us because their behavior is 'indistinguishable' from ours. By that logic, if we have a machine that behaves in a completely indistinguishable way from us, then we can just assume it has similar cognitive states to us. Since cognitive science is an empirical science, it must work with what we can observe and test, and indistinguishable performance is something we can measure, but 'feeling' is not. The other-minds problem concerns subjective experience, which is not something we can prove or disprove through experimentation.

    3. Sierra, yes, and this is one of the major points of Turing testing – because Turing knew that he could not directly observe “feeling” or more notably, “thinking” as an internal process, he could only measure performance capacity. As professor Harnad points out, the test “[does] not explain how thinkers can feel, [but rather] how they can do what they can do”. Therefore, Turing was not trying to solve the hard problem to begin with (i.e., how and why thinking organisms can feel) – he was simply concerned about the doings a machine could execute. That said, while feeling is not directly observable, it is undeniably real and central to human experience. From what I gather, cognitive science strongly relies on observable evidence, but it does not dismiss unobservable phenomena when it can be inferred through indirect means.

      Question for Professor Harnad: I am still a bit confused – how does one differentiate the other minds problem from the easy/hard problem? Or do these problems always overlap?

  9. In a machine at the T4 level, there is “Total indistinguishability in external performance capacity as well as in internal structure/function.”. The field of linguistics has a distinction between competence, which is one’s subconscious or underlying abilities when it comes to a language, and performance, which is the actual output of what is said, written or expressed. Performance is often poorer than competence due to factors like environment, tiredness, stress, or distraction. Extending this idea to a machine at the T4 level - I know this is beyond the limits of the Turing Test, as it goes beyond looking at only performance - would the machine have to have this same performance/competence distinction if it has to have the same internal structure/function?

    Replies
    1. Emma, yes, it's exactly the same distinction in the rest of cognitive science as a whole as in linguistics (which is a part of cogsci): The "performance/competence" distinction is the performance/performance-capacity distinction. Cogsci is about reverse-engineering and explaining how and why organisms are able to do all the (cognitive) things they are able to do. And this is true whether the performance-capacity is learned or evolved (innate). We'll be discussing this in Weeks 8 and 9.

  10. What caught my attention was Turing’s decision to call his test the “imitation game.” To me, that makes it sound like a trick or a joke, when really he was laying out a serious scientific test. Isn’t that why people often misunderstand the Turing Test as just “fooling” humans? The real point is to see if a machine can actually do the same things we can do. Calling it a game feels confusing and takes away from how important the test really is for thinking about machines and human cognition.

    Replies
    1. Randala, yes -- as has been said over and over in all the preceding commentary (which is why everyone should always read before posting their own commentary, so they can build on it, not just repeat it.)

  11. What I find interesting is how breaking down the Turing Test anticipated today's debates around large language models: they might pass something close to T2, but without embodiment (T3) or grounded cognition. Turing avoided the consciousness question, but that raises other questions. If an LLM is interacting with a human (who does have sensorimotor experiences), does that mean that the output it generates based on the user's input has meaning? Can the LLM then pass a T3 test and be considered grounded and embodied, but through the user?

    Replies
    1. Lauren, I had the same thought as I was reading! However, I think the reading suggests the answer is no: an LLM interacting with a human who has embodiment does not itself gain grounding. As Harnad explains, T2 is total verbal indistinguishability (like an email exchange), while T3 requires total robotic performance capacity in the real world. The key is that the system itself must generate its own sensorimotor interactions, not borrow them from a user. A human's embodied input might make the LLM's text seem more meaningful, but the machine still lacks independent grounding. Harnad also mentions that Turing's restriction to written answers accidentally excluded these nonverbal capacities, even though they are central to how humans ground language in perception and action. So while LLMs may approach T2, they still fall short of T3's requirement for autonomous embodiment, if I understood correctly!

  12. This reading helped me understand the distinctions between T0/T1/T2/T3. From what I understand, the LLMs we have access to now (like ChatGPT) are T2 machines, as they lack sensorimotor performance capacity. They can only simulate/describe sensorimotor experience thanks to the “big gulp” of information.

    However, I still have a hard time wrapping my head around the differences between T3 and T4 and how they can be studied. The paper highlights the "fuzziness" between these hierarchical levels by asking, "Is blushing T3 or T4?" While this question extends beyond cognition (T3), I still find it intriguing. On the one hand, blushing is a sensorimotor response, affecting a person's "performance capacity," which is consistent with T3. However, blushing is also the result of an internal physiological response, which aligns with T4 (physical structure/function).

    Replies
    1. I understand the confusion surrounding the distinction between a machine that passes T3 and one that passes T4. However, I believe I understood the difference between T3 and T4 by understanding T5, which “rules out any functionally equivalent but synthetic nervous systems”. If T5 is one step above T4 then, if I’m correct, a machine that passes T4 must have the same structure of information processing as the human brain. However, they can be synthetic. In other words, a T4-passing robot must have mechanical processes exactly modelling our human neuronal connections, fiber tracts, signaling pathways, etc. but in synthetic physical form. This is what differentiates T3 from T4 as the former only requires that the performance capacity (i.e., “output” beyond email communications) be identical to that of a human. That is, the connections and pathways through which a T3-passing machine produces the same performance as a human are not relevant so long as the output is that which would have resulted from human cognition and action.

  13. The reading shows how Turing reframes "thinking" as observable performance, yet even a fully embodied, sensorimotor T3 robot would leave the "hard problem" unresolved: the question of whether anything is actually "felt" inside the machine. But perhaps we've been assuming the wrong direction of progress. While we build more complex systems to approximate cognition, the reading invites us to consider whether consciousness might instead emerge from the simplicity of spontaneity. Yet spontaneity could be only the surface of something far more complex. And what if consciousness itself is not the end state of this process but a stage for some larger principle we haven't yet conceived? Could the "hard problem" be not just about explaining subjective experience, but about discovering what it enables beyond cognition as we know it?

  14. Something that stood out to me in Harnad's writing was his response to Lady Lovelace's objection, which states that a mechanical machine can never produce anything new. The notion that nothing has been proven to have originated since the Big Bang demonstrates that machines are cause-and-effect systems. No machine, including humans, is really creating anything original, rendering Lady Lovelace's objection moot.

  15. “Disabilities and appearance are indeed irrelevant. But nonverbal performance capacities certainly are not. Indeed, our verbal abilities may well be grounded in our nonverbal abilities (Harnad 1990; Cangelosi & Harnad 2001; Kaplan & Steels 1999). (Actually, by "disability," Turing means non-ability, i.e., absence of an ability; he does not really mean being disabled in the sense of being physically handicapped, although he does mention Helen Keller later.)”
    I am uncertain that we can truly say in the context of these tests that “disabilities […] are irrelevant”. In this situation, disabilities refer to the computer’s inabilities to practice certain activities that “normal” humans can: walking, dancing, kissing, etc. It seems that these activities do have a certain relevance to what it is to think as a human: the structure of the brain (the center of human thinking) is altered by these kinds of activities, and it seems like a shortcut to state that they do not influence the way we think. Thus, if a computer remains plagued by an absence of the understanding of the phenomenology of human activities (what it feels like to partake in these activities, what it changes in one), it seems unlikely that the Turing Test will be able to accomplish its objective, namely “finding out what kind of machine we are, by designing a machine that can generate our performance capacity, but by causal/functional means that we understand, because we designed them.”

    Replies
    1. I get what you mean about whether disabilities are irrelevant, but Turing's argument (and Harnad's in the Annotation Game) is actually different, I think. The real question is: how do we prove thought? Turing wasn't interested in the question of whether a machine could dance or walk; he was interested in whether it can perform the same as we do, as a sign of understanding. The video also makes this point: he defined the "imitation game" to highlight performance, not looks. You're right that our non-linguistic activities like moving, touching and sensing govern how people think. That's why Harnad and others later argued that maybe the real test isn't even just the pen-pal test (T2), but the complete robotic test (T3), where a machine must utilize its body and senses in the manner we do. So, in a technical sense, you are right: Turing did leave it open, and Harnad illustrates that real sensorimotor ability matters. But what makes the Turing Test so valuable is that it gives us a concrete way to ask, "Can this system really do everything that we can do?"

  16. Harnad stresses that “thinking is as thinking does”, yet he also cautions that reducing the Turing Test to mere verbal imitation risks collapsing performance into a party trick. What struck me is the hierarchy of T2–T5: the distinction between text-only exchanges and fully embodied robotic capacities. Taken together, the paper suggests that cognition is not just about passing for human but about the scope and grounding of performance. A simulated plane cannot fly, and a simulated robot cannot act in the physical world—real performance requires real-world causality. This leaves me understanding that even if we build a T3-passing robot, indistinguishable from us in action, we still face the “other-minds problem”. So, I wonder, if indistinguishability is all we can test, is cognitive science ultimately about explaining behaviour only? Is the felt reality and neuroscience of thought permanently outside its reach?

  17. A point I found myself questioning is Harnad’s strong emphasis on embodiment as the solution to the symbol grounding problem. He suggests that T3 (giving a system sensorimotor capacity) is the step needed for genuine grounding. But I wonder whether this simply moves the problem rather than solves it. A robot might be able to interact with the world and link symbols to sensory inputs, but how do we know that this mapping really produces “understanding” rather than just more sophisticated symbol manipulation? In other words, embodiment may enrich performance, but it may not fully close the gap between doing and truly understanding.

  18. I find it very interesting how the Lovelace objection has persisted across time. It is very reminiscent of the AI-art debate currently ongoing (although AI art programs are mostly at the t0 and T2 levels). Depending on how she defined what something "new" meant to her, I wonder if you could say that apps like MidJourney are an answer to this objection. On the other hand, the way they go about creating art is so far from how natural human capabilities work, by cheating (like ChatGPT), that it might not even be worth bringing up as a counter in this context.

  19. “But even without having to invoke the other-minds problem (Harnad 1991), one needs to remind oneself that a universal computer is only formally universal: It can describe just about any physical system, and simulate it in symbolic code, but in doing so, it does not capture all of its properties: Exactly as a computer-simulated airplane cannot really do what a plane plane does (i.e., fly in the real-world), a computer-simulated robot cannot really do what a real robot does (act in the real-world) -- hence there is no reason to believe it is really thinking either.”

    This passage differentiates between computer simulations and real-world physical processes, using simulated vs. real airplanes as an example, citing this as a reason to discard the belief that universal computers might be able to "think." I do not quite understand this reasoning or analogy, because it is not obvious that "thinking" need always emerge the same way in the physical world. We are convinced that many living organisms "think" and that this is somehow related to the common trait of a nervous system, but these systems can present quite differently across species (on some level). Moreover, considering our complete lack of understanding of the Hard Problem, why should we assume that thinking can only emerge as a product of systems physically similar to those made up of biomolecules? Do we have any good reasons besides induction to assume that thinking cannot emerge from any systems other than non-simulated physical (perhaps biological) ones? What if the hardware of "thinking" is less important than the software?

  20. Stevan has said before that GPT is 'cheating' on the T2 test because it's taking a huge gulp of data from the internet rather than just going off of whatever the user has inputted. Since we know that GPT is just a text predictor, do the statistics that it runs count as computation? What would the symbols be, and what are the instructions it follows?

    Also, if you were to build a 'real' T2 machine, would it need a better understanding of language (i.e., know what nouns and verbs are and define them as different symbols, and follow the rules of syntax as its algorithm) in order to write back? Without knowing any of the meanings intrinsic to the words it outputs (assuming 'meaning' is for cognizers), how would its response be at all sensical?

    Replies
    1. Also, another thought inspired by what Anne-Sophie said outside of class: if we are following the rules of syntax, are we using computation when we use language? Could I say that words are just symbols if I argue that semantics is divorced from syntax (i.e., "colourless green ideas sleep furiously")? At the end of the day, aren't words just arbitrary combinations of sounds that we assign meaning to?

    2. Sevi, I hummed the first verse of the answer to an LLM (Claude) and then asked it to finish the song:

      Sevi, the LLM next-token predictor algorithm is definitely computation, but the data on which the algorithm performs its computation (the "Big Gulp" database contents) is not part of the algorithm...

      ...The algorithm just does pattern-matching and retrieval on that database. It's like a student taking an exam with crib notes: the algorithm is the looking-up, but the answers are coming from the database, not from understanding.

      That's why it's cheating on the Turing Test. A real T3 robot would need to have grounded its words in sensorimotor experience with their referents—not just learned correlations between word-shapes. LLMs only have the word-shapes and their statistical patterns. They don't know what the words are about.

      What LLMs have shown us is that these statistical patterns in "Language Writ Large" are surprisingly powerful—more than anyone expected. But powerful pattern-matching isn't the same as understanding. To really pass T2, a T3 robot would need to learn categories through direct interaction with the world, then ground words in those categories. That's the tough challenge still ahead.

      Sevi(2), yes, grammar is just syntax, hence symbol-manipulation rules. But Chomsky's famous

      "Colourless green ideas sleep furiously"

      is not just senseless syntax, as the poet John Hollander famously replied:

      "Curiously deep, the slumber of crimson thoughts:
      While breathless, in stodgy viridian.
      Colorless green ideas sleep furiously."


      Tell Anne-Sophie that the "autonomy of syntax" (from semantics) in natural language is not altogether a done deal, the way it is in computation. (Weeks 8 and 9)

  21. It seems that T3 is necessary to reverse-engineer our consciousness, as it is profoundly linked to our sensorimotor experience. Recall that T3 is defined by Harnad as total indistinguishability in robotic (sensorimotor) performance capacity. However, I am somewhat skeptical about whether this would be possible. For a machine to be functionally indiscernible from a human, it would have to have reached at least T4 or T5. Robotics can get very close, as we have many examples today, but for a machine to be indistinguishable in function it would have to be identical down to the molecular level.

  22. What stood out to me is how much hinges on clarifying what Turing really meant. Was it just the text-only version (T2) or the full robotic version (T3)? And is passing about tricking people sometimes, or genuinely matching human performance? Harnad makes clear that the test is about what a system can do, not what it may or may not feel. The Other Minds Problem matters for philosophy and in animal cognition research, but not for artifacts like computers where the uncertainty is near zero. Still, I wonder: why do we resist treating performance alone as “enough”?

  23. Harnad's Annotation Game is an extension of Turing's imitation game that clarifies its goal: not attempting to deceive, but providing a framework for reverse-engineering cognition. Expanding on 2a, "Can machines think?" becomes "Can we build systems that do what we do, for the same causal reasons?"
    Much of the ambiguity left after reading Turing's original text is resolved by this hierarchy (T0-T5). Meaning is grounded at T3, the embodied, sensorimotor level, while T2 assesses language ability. A purely verbal system can mimic speech, but it is unable to explain its own words in the absence of causal relations with the outside world.
    What makes Harnad's hierarchy valuable is that we may compare grounded and ungrounded models to determine whether embodied interaction produces generalization that disembodied computation cannot. Only when these theoretical disparities result in measurable performance differences does cognitive science remain an empirical field.

  24. Turing argues that the Turing Test evades the “hard problem” by focusing on observable behaviour, and he suggests that the gap between animate and inanimate entities is far greater than that between humans and other animals. Given this, to what extent can the Turing test serve as a reliable measure of machine intelligence? For example, take a machine that perfectly models the behaviour and decision-making of a bald eagle–knowing what it can eat, its optimal sleeping patterns, and the best times to hunt. If human and animal intelligence differ only in degree and not kind, would such a machine be legitimately considered intelligent, or does intelligence require more than a simulation of natural behaviours?

