Wednesday, August 27, 2025

2b. Harnad, S. (2008) The Annotation Game: On Turing (1950) on Computing, Machinery and Intelligence

 2b. Harnad, S. (2008) The Annotation Game: On Turing (1950) on Computing, Machinery and Intelligence. In: Epstein, Robert & Peters, Grace (Eds.) Parsing the Turing Test: Philosophical and Methodological Issues in the Quest for the Thinking Computer. Springer.

This is Turing's classical paper, with every passage quoted and commented to highlight what Turing said, might have meant, or should have meant. The paper was equivocal about whether the full robotic test (T3) was intended, or only the email/penpal test (T2); whether all candidates are eligible, or only computers; and whether the criterion for passing is really total, lifelong equivalence and indistinguishability, or merely fooling enough people enough of the time. Once these uncertainties are resolved, Turing's Test remains cognitive science's rightful (and sole) empirical criterion today.

52 comments:

  1. Building on my earlier post, I think the main takeaway from the Turing Test and the reading is that we need not be concerned with the Other-Minds Problem. As mentioned, whether machines beyond our own selves can feel is undecidable; there is no methodology for resolving that uncertainty. Instead, we can focus on the Turing Test's capacity to identify what can and cannot perform as a human, which makes it valuable as a framework for reverse-engineering cognition. Yes, it doesn't prove consciousness, but it is a useful, rigorous test for observable cognitive capacity. I find it interesting that it is readily accepted that we cannot prove that other humans feel, yet we are more hesitant to extend the same to machines. Is it not enough to allow the answer to be uncertain? Instead, we struggle with rejecting the idea that machines do or do not think...

    Replies
    1. Emily, yes, the Other Minds Problem (OMP) for human beings is not a problem for cognitive science, just for philosophy (epistemology, scepticism).

      The OMP for nonhuman animals is a part of cognitive science, but for mammals, birds, fish, and other vertebrates it is almost as close to certain that they feel as it is that humans feel.

      For certain invertebrates, however (e.g., octopuses, lobsters, bees), the OMP is being actively studied in cognitive science (Week 11).

      For organisms that lack a nervous system, such as single-celled microbes, plants, and fungi, it is highly improbable that they feel.

      And the uncertainty is near zero for inanimate objects such as rocks, mountains, rivers, clouds, planets, stars, or atoms -- including all human-made artifacts so far, such as clocks, cars, thermostats, computers or robots.

      Certain people ("animists" and "panpsychists") believe that "everything" feels (although it's not clear what counts as a "thing"...).

      ["Matrix" thinking (and perhaps C=C computationalism) may be a form of animism (all of which may be driven by human evolution, including the evolution of mirror-neuron and "mind-reading" capacities, including language (Week 4)).]

    2. As Emily pointed out, it is "interesting that it is readily accepted that we cannot prove that other humans feel but are more hesitant to extend the same to machines." I personally definitely have this intuition, and I appreciate the reading for clarifying why we feel this way. We feel comfortable "mind-reading" other people because they behave so similarly to us. The reading says, "we are not worried about the minds of our fellow-human beings, because they behave just like us and we know how to mind-read their behavior. By the same token, we have no more or less reason to worry about the minds of anything else that behaves just like us -- so much so that we can't tell them apart from other human beings." This helped me better understand the argument that the Turing Test should be T3, where machines interact with the real world, rather than T2, where machines just converse. I like the example of the simulated planes: simulated planes are not the same as real flying planes, so T3 can reduce our hesitation and help us be more willing to "mind-read" machines, since they would be behaving the way we do in life.

    3. Annabelle, I think you're beginning to get it...

    4. ***EVERYBODY PLEASE NOTE: I REDUCED THE MINIMUM NUMBER OF SKYWRITINGS. BUT THE READINGS ARE **ALL** RELEVANT TO AN OVERALL UNDERSTANDING OF THE COURSE. SO, EVEN IF YOU DO NOT DO A SKYWRITING ON ALL OF THEM, AT LEAST FEED EACH READING YOU DO NOT READ TO CHATGPT AND ASK IT FOR A SUMMARY, SO YOU KNOW WHAT THE READING SAID — OTHERWISE YOU WILL NOT HAVE A COMPLETE GRASP OF THE COURSE TO INTEGRATE AND INTERCONNECT FOR THE FINAL EXAM.***

  2. In my opinion, the introduction of the different levels of the Turing Test (t0, T2, T3, etc.) was helpful in clarifying the underlying importance of the Turing Test as a whole. This hierarchy really helped frame Turing's point as a clearly scientific methodology; this was key because one of the questions I had initially when reading Turing's paper was, quite frankly, "what's the point?" Why does it matter if a computer can pass as a human? This paper as a whole helped me get a better grasp of the possible applications of the Turing Test and how it can be used to guide further research in the field of cognitive science.

    Replies
    1. Elle, yes, Turing-Testing the success of the reverse-engineering is important, but the real work is in discovering and designing the causal mechanism that we can go on to T-Test!

      Computationalism -- which would "only" call for discovering the recipe for passing T2 (verbal only) -- would be easier, since it requires only the recipe, whereas T3 (verbal and sensorimotor) would require designing the 3D-Printer of the robot. T4 would require even more...

    2. I agree that the clarification of the levels of the Turing Test was extremely helpful in creating a better understanding. After understanding why the Turing Test rejects T4 and T5 and focuses on performance, it does make me wonder whether in a few years this will still be the case. If T4 focuses on neurological equivalence and T5 on molecular and physical equivalence, what is to say that machines won't master the performative aspect, and that the next concern will be creating machines that have the same neurological capacities, and soon after the same physical biology? Though the pace of technological innovation is impressive, its sheer mystery is concerning in a way that can seem to blur comprehension.

    3. Kaelyn, cognitive science is trying to reverse-engineer the causal mechanisms that produce people's capacity to do the (cognitive) things they can do. T2/T3 covers what they can do. T4 and T5 are only relevant to cognitive science where they are needed to pass T2/T3. They are of course relevant to medical science and clinical psychology.

  3. "In the process of trying to imitate an adult human mind we are bound to think a good deal about the process which has brought it to the state that it is in. We may notice three components.
    (a) The initial state of the mind, say at birth,
    (b) The education to which it has been subjected,
    (c) Other experience, not to be described as education, to which it has been subjected."
    "An important feature of a learning machine is that its teacher will often be very largely ignorant of quite what is going on inside, although he may still be able to some extent to predict his pupil's behavior."

    In this passage, Turing explicitly says that his goal is to imitate an adult mind (not brain!). The steps follow human learning to make the program as "human" as possible, so that its performance would be indistinguishable. However, if the program is only going to pass T2, I find it difficult to imagine the kind of experience it could be subjected to in a non-physical world, because it would be devoid of a physical body (Turing mentions that the machine does not have limbs). Furthermore, Turing mentions that the programmer might not always know what is happening inside the learning machine. Harnad makes clear the difference between AI, whose goal is to generate a performant tool, and CM (cognitive modelling), whose goal is to explain the generation of human cognition. Even though Turing claims that performance can lead us to an explanation of cognition (by succeeding at the Turing Test and creating a program based on human learning), I do not think that this suffices to explain "why" and "how". It seems more like AI than CM to me, because Turing's goal might be to explain cognition, but the measurements are observable verbal behaviours based on performance, lacking a more complex explanation of thinking.

    Replies
    1. Sophie, what does "more complex" mean? Are you suggesting that T2 (verbal capacity) is not enough of a test? That T3 is needed? Please explain. (Have you read 2b, the Annotation Game? Turing's "child machine" suggestion for T2 has been discussed in prior comments and replies.)

    2. I think I mixed up too many concepts😅

      By "more complex," I meant that programmers would know what is going on inside the learning machine, because their goal would be to explain human cognition with CM rather than focusing on behaviour (as a goal). T2 and T3 are sufficient when defining cognition as "thinking is as thinking does," because they showcase verbal or sensorimotor abilities.
      The second thing: it seemed to me that Turing's component (c) about experience would be more relevant for T3. The candidate would at least gain experience through sensorimotor interaction with the real world.

  4. In this paper, Turing's "imitation game" is reframed as a "methodology for reverse-engineering human cognitive performance capacity." But even if we were to successfully reverse-engineer our cognitive performance capacities, how much would it truly reveal about what thinking is and the mechanisms behind it? Suppose I wanted to understand how a door works and began to reverse-engineer it. I might fill a doorframe with metal sheets and build an electronic pulley system to lift the sheets on command. This door could perform all the functions of a typical door - facilitating movement in and out, insulating heat, and so on - yet its mechanism would be entirely different (most doors open by rotating on an axis, are made of hinges and wood...). This suggests that functional replicability alone does not provide genuine insight into how something actually works.

    Marr's three levels of cognition are particularly relevant to this discussion. I believe that to successfully reverse-engineer cognition, a model must resemble humans not only on the ‘computational’ level (what problem the system is solving) but also on the ‘algorithmic’ level (what process or principles the system uses to solve the problem). As the door analogy illustrates, building a system that merely solves the same problem does not guarantee that it does so using the same underlying principles. Harnad’s introduction of T3 and T4 demonstrates his awareness that mere symbol manipulation is insufficient, yet even these stricter tests may not reveal the actual algorithms humans use. Ultimately, can we ever know with certainty that we’ve learned more about what goes on inside our ‘black box’ just by creating another one?

    Replies
    1. Jesse, if symbol-manipulation (i.e., computation) is insufficient, then computationalism (C=C) is wrong, and so no algorithm can pass T2: Is that what you mean?

      But that requires showing that C=C is wrong. That's what Searle will try to do in 3a. How?

      (See earlier replies about Marr: Was Marr a computationalist? Is Turing? Does the possibility of more than one algorithm for producing the same outcomes refute computationalism?)

    2. Jesse, what stands out in your door example is how reverse-engineering aims to get to the same final result but might use a completely different process/mechanism. To me this says that when we try to reverse-engineer human cognitive performance capacity, we assume that there is only one possible mechanism "allowing" our thinking. I say we "assume" such a thing because otherwise, just as you hint, it wouldn't truly reveal anything about the mechanisms behind OUR thinking if we engineered thinking through a process different from our own. The issue is that, to compare mechanisms, we'd first need to understand the human cognitive mechanism itself. So we end up in a circular problem: we can't tell if our engineered thinking aligns with human thought without already knowing how human thought works. Even if we succeed in reverse-engineering cognition, we may have simply "created" a new form of cognitive capacity, not revealed our own. Hence, I agree that functional replication doesn't guarantee explanatory insight, and I found your perspective genuinely interesting.

    3. Thanks for the replies, Prof. Harnad and Emanuelle.

      As for the first response, I think it’s possible to reconcile computation being insufficient with C=C still being possible (though I lean toward thinking it’s wrong). My aim with the door analogy was to show that reverse-engineering a system to produce the same results via symbol manipulation doesn’t confirm that this is how our minds actually work. This doesn’t falsify computationalism; it just raises an epistemic worry: even if C=C were true, we couldn’t know it simply by building another system that passes the tests. Don’t these tests only reveal what is possible, rather than what is actual? Maybe my logic is flawed, let me know!

      Emmanuelle, I agree with the paradox you point out. I’m starting to think the “easy problem” isn’t one for computer science to solve. How could we ever affirm the accuracy of any computational model? Even if a computational model could one day pass T4/T5, I’d still be skeptical of whether it was truly equivalent to our cognitive processes just because it behaved identically. A similar question arises at every level of the TT: “Although the computer does these things/has these properties, is it REALLY doing them?” This makes me doubt whether we’ll ever be able to distinguish genuine equivalence from mere simulation.

    4. Emanuelle, you seem to be missing a few important points:

      Cognitive Capacity: Cognitive Science is not trying to model or explain one particular individual person. It is trying to model generic, average human cognitive capacity, the kinds of (cognitive) things that any normal human can do.

      Indistinguishable Capacity: The Turing Test T2 is verbal-only: the capacity to text with anyone, on anything an average human can text about, completely indistinguishably from a real human (lifelong, in principle). That requires memory capacity and learning capacity too. (Why?) T3 (What's that?) requires a lot more.

      Turing-Indistinguishability is an extremely demanding criterion, both in science and in engineering (forward and reverse). It requires producing every observable property, whether of an individual object, or of a kind of object, or of a kind of property of a kind of object. With T2, that means producing the human capacity for chatting with one another in words. If you want to know what that is like, try it out with ChatGPT (even though it's cheating).

      Implementation-Independence. If computationalism (C=C) is correct, then, yes, there may be more than one computational recipe that can produce the same outcome (Turing Indistinguishable T2 capacity.) T3 is not just computation, however, so T3 is not hardware-independent software. T2, according to C=C, is.

      (But when we get to Evolution (Weeks 4 and 7) we'll see there's another form of equivalence, even for dynamical systems that see and move; and evolution has found more than one way to do those things too. This is called "multiple realizability" and the "underdetermination of function by structure.")

    5. Jesse, computationalism ("C=C") [cognition is just computation] can't be both insufficient and true. And "cognition is partly computation" is not computationalism (C=C).

      I think you might be conflating implementation-independence (C=S computationalism) with multiple realizability (of dynamical systems, like flying). And it really is true of both algorithms and physical dynamics that there can be more than one way to do things (maybe even gravitation).

      But that's way too far from home, into metaphysics. Searle will just try to show that C is not just C (though he forgets about the "just" too!).

      Your logic is not flawed if you are doing metaphysics (but CogSci is not).

      And it's flawed if you forget that C=C means "C is just C". Your worries are about the underdetermination of all "scientific" explanation (it's never certain):

      Consult Wikipedia or ChatGPT -- perhaps also about the metaphysical doctrine of the "Identity of Indiscernibles". (We're just doing cognitive science here...)

  5. "A device we built but without knowing how it works would suffice for AI but not for CM [cognitive modelling]."
    While reading this passage, and the ones adjacent to it, I started understanding that there may be different types of 'entities' termed 'machine'. I am inclined to believe that everything could be a machine in the broad sense of "any dynamical system". In the text, it appears that the 'machine classification' is based not necessarily on the type of causal relationship (only that there ought to be at least one), but more so on the level of understanding WE have regarding given dynamical systems. In cases in which we understand the dynamics of the causes, the intermediate states, and the consequences, we can then label a machine as a 'Turing Test candidate', which is most probably what most people think of when they think of the word 'machine' (e.g., a robot serving food in a restaurant, which we know for a fact a team of engineers understands totally).
    In a recent paper (Kagan et al., 2022), neurons learned how to play the arcade game 'Pong'. Which is more likely to end up being the better candidate for the Turing Test: an electronic device programmed to play 'Pong', or the neurons (called DishBrain in the paper)? After all, both systems' ways of transmitting information (i.e., turning causes into consequences) are well understood...

    Replies
    1. Camille, yes, any dynamical system is a machine (and not just digital computers manipulating arbitrary symbols according to recipes: "Turing Machines").

      A robot serving food is a dynamical system, and therefore a machine, and (according to the Strong C/TT) it can be modelled or simulated by just a computer manipulating arbitrary symbols according to a recipe.

      But just a computer manipulating arbitrary symbols according to a recipe cannot serve food. Please explain the difference.

      There may be many different recipes that can give exactly the same output. There may be many different recipes for playing Pong. Probably fewer recipes for playing Pong and Chess. Probably a lot fewer for playing Pong and Chess and passing T2. (Why?)

      The difference between AI and CM is not in whether we understand either the dynamics or the algorithm, but what they can do: serve food or play Pong or make drones or bombs or money? -- Or pass T2 (or T3 or T4)?

      According to C=C computationalism, algorithms alone can already pass T2; all they need is a digital computer to execute them. (The computer's dynamics are irrelevant.) We'll soon look at whether that's true.

      But for passing T3 (or T4) the dynamics cannot be irrelevant. (They're not even irrelevant for designing robots to serve food.)

    2. Is the difference between a robot serving food and a symbol-manipulating computer modelling food-serving the same as the difference "between a real airplane [...] and a computer simulation of an airplane", i.e., embodiment and situatedness? While the robot is actually serving food in the real world, the computer is merely simulating the action of serving food in a virtual world. It is similar, as mentioned earlier, to the difference between T2, which only requires verbal performance capacity that can be demonstrated in a virtual world by a virtual robot, and T3, which requires sensorimotor performance capacity that can only be demonstrated through interaction with the real world. This would explain why there are fewer recipes for playing Pong and Chess and for passing T2, because none of these requires interaction with the real world, as mentioned in the paper when addressing the Argument from Informality of Behaviour.

  6. In Turing's framing, the focus is on performance capacity, not on the inner experience of "feeling". This reminds me of the hard problem of consciousness, because even if a machine could pass T2 or T3, we would still not know if it feels anything. Professor Harnad notes that this problem may be insoluble. Here is something that came to mind for me: from a cognitive science perspective (or at least what I think is a cognitive science perspective), perhaps leaving the hard problem unsolved preserves the uniqueness of human/animal subjectivity compared to that of non-human, non-animal systems like robots, while still allowing us to study cognition observationally through behavior and performance.

    Replies
    1. Sannah, yes, the Hard Problem (of cognitive science) is to reverse-engineer and explain how and why living organisms (or anything else) can feel. And, yes, not being able to solve the Hard Problem certainly does not prevent us from trying to solve the easy problem.

      ("Subjectivity" and "consciousness" and a lot more words are just weasel-words for feeling. But preserving "uniqueness" is no compensation for not being able to explain how and why organisms feel rather than just do...)

  7. "A reasonable definition of "machine," rather than "Turing Machine," might be any dynamical, causal system. That makes the universe a machine, a molecule a machine, and also waterfalls, toasters, oysters and human beings. Whether or not a machine is man-made is obviously irrelevant. The only relevant property is that it is "mechanical" -- i.e., behaves in accordance with the cause-effect laws of physics"
    This passage makes us look at the word "machine" in a much bigger way. If a machine is just something that follows cause-and-effect rules, then everything around us counts. I think that's kind of relieving, because it reminds us that our brains aren't separate from nature; they follow the same physical laws as everything else. But at the same time, this definition feels a little too broad. If everything is a machine, then just saying "the brain is a machine" doesn't actually explain much. In class, we brought up the example of a toaster, which, along with the brain, can be considered a machine. I'd like to think, however, that they are significantly different. That means the real question is what makes some machines different, and maybe capable of thinking and feeling, while others are supposedly not. The real challenge is to figure out the special processes that make "machines" like us different from others.

    Replies
    1. Jad, yes, the Easy Problem of cognitive science is to reverse engineer cognitive capacities by discovering (and testing) them (T2, T3...). But that still leaves the Hard Problem...

  8. Humans are able to assign meaning to words by interacting with their environment and then use the meaning they've learned when interacting with each other. So it follows that for a machine to be fully indistinguishable from a human, which is the goal of the Turing Test, it must also have experience interacting with the environment, not just through linguistic data sets, which is why it is necessary to extend the definition of the original Turing Test to T3, robotic interaction. As the deeper value of the Turing Test lies in offering a methodology for building and testing models of cognition, this interaction could be crucial in finding out whether machines can do what we can do. Though we will never be able to fully know whether they feel (the other-minds problem), this falls outside the jurisdiction of cognitive science and remains a philosophical debate, as it applies just as much to other humans as it does to machines.

    Replies
    1. Sierra, yes, cognitive science has two problems to solve, the Easy and Hard one -- but what does it mean to say the Hard one is just a matter of "philosophical debate"? The capacity to feel is a real biological trait, as real as the capacity to move; it seems reasonable to want to reverse-engineer feeling too, including what it's for.

    2. In my opinion, while it is true that feeling is real, the only thing one can be completely sure of is the fact that they are feeling something in this moment. We cannot be fully sure whether other people really do feel, not to mention machines. Often we assume others 'feel' and have mental states like us because their behavior is 'indistinguishable' from ours. By that logic, if we have a machine that behaves in a completely indistinguishable way from us, then we can just assume it has similar cognitive states to us. Since cognitive science is an empirical science, it must work with what we can observe and test, and indistinguishable performance is something we can measure, but 'feeling' is not. The other-minds problem concerns subjective experience, which is not something we can prove or disprove through experimentation.

    3. Sierra, yes, and this is one of the major points of Turing testing – because Turing knew that he could not directly observe “feeling” or more notably, “thinking” as an internal process, he could only measure performance capacity. As professor Harnad points out, the test “[does] not explain how thinkers can feel, [but rather] how they can do what they can do”. Therefore, Turing was not trying to solve the hard problem to begin with (i.e., how and why thinking organisms can feel) – he was simply concerned about the doings a machine could execute. That said, while feeling is not directly observable, it is undeniably real and central to human experience. From what I gather, cognitive science strongly relies on observable evidence, but it does not dismiss unobservable phenomena when it can be inferred through indirect means.

      Question for Professor Harnad: I am still a bit confused – how does one differentiate the other minds problem from the easy/hard problem? Or do these problems always overlap?

  9. In a machine at the T4 level, there is “Total indistinguishability in external performance capacity as well as in internal structure/function.”. The field of linguistics has a distinction between competence, which is one’s subconscious or underlying abilities when it comes to a language, and performance, which is the actual output of what is said, written or expressed. Performance is often poorer than competence due to factors like environment, tiredness, stress, or distraction. Extending this idea to a machine at the T4 level - I know this is beyond the limits of the Turing Test, as it goes beyond looking at only performance - would the machine have to have this same performance/competence distinction if it has to have the same internal structure/function?

    Replies
    1. Emma, yes, it's exactly the same distinction in the rest of cognitive science as a whole as in linguistics (which is a part of cogsci): The "performance/competence" distinction is the performance/performance-capacity distinction. Cogsci is about reverse-engineering and explaining how and why organisms are able to do all the (cognitive) things they are able to do. And this is true whether the performance-capacity is learned or evolved (innate). We'll be discussing this in Weeks 8 and 9.

  10. What caught my attention was Turing’s decision to call his test the “imitation game.” To me, that makes it sound like a trick or a joke, when really he was laying out a serious scientific test. Isn’t that why people often misunderstand the Turing Test as just “fooling” humans? The real point is to see if a machine can actually do the same things we can do. Calling it a game feels confusing and takes away from how important the test really is for thinking about machines and human cognition.

    Replies
    1. Randala, yes -- as has been said over and over in all the preceding commentary (which is why everyone should always read before posting their own commentary, so they can build on it, not just repeat it.)

  11. What I find interesting is how breaking down the Turing Test anticipated today's debates around large language models: they might pass something close to T2, but without embodiment (T3) or grounded cognition. Turing avoided the consciousness question, but that raises other questions. If an LLM is interacting with a human (who does have sensorimotor experiences), does that mean that the output it generates based on the user's input has meaning? Can the LLM then pass a T3 test and be considered grounded and embodied, but through the user?

    Replies
    1. Lauren, I had the same thought as I was reading! However, I think the reading suggests the answer is no: an LLM interacting with a human who has embodiment does not itself gain grounding. As Harnad explains, T2 is total verbal indistinguishability (like an email exchange), while T3 requires total robotic performance capacity in the real world. The key is that the system itself must generate its own sensorimotor interactions, not borrow them from a user. A human's embodied input might make the LLM's text seem more meaningful, but the machine still lacks independent grounding. Harnad also mentions that Turing's restriction to written answers accidentally excluded these nonverbal capacities, even though they are central to how humans ground language in perception and action. So while LLMs may approach T2, they still fall short of T3's requirement for autonomous embodiment, if I understood correctly!

  12. This reading helped me understand the distinctions between T0/T1/T2/T3. From what I understand, the LLMs we have access to now (like ChatGPT) are T2 machines, as they lack sensorimotor performance capacity. They can only simulate/describe sensorimotor experience thanks to the “big gulp” of information.

    However, I still have a hard time wrapping my head around the differences between T3 and T4 and how they can be studied. The paper highlights the "fuzziness" between these hierarchical levels by asking, "Is blushing T3 or T4?" While this question extends beyond cognition (T3), I still find it intriguing. On the one hand, blushing is a sensorimotor response, affecting a person's "performance capacity," which is consistent with T3. However, blushing is also the result of an internal physiological response, which aligns with T4 (physical structure/function).

    Replies
    1. I understand the confusion surrounding the distinction between a machine that passes T3 and one that passes T4. However, I believe I understood the difference between T3 and T4 by understanding T5, which “rules out any functionally equivalent but synthetic nervous systems”. If T5 is one step above T4 then, if I’m correct, a machine that passes T4 must have the same structure of information processing as the human brain. However, they can be synthetic. In other words, a T4-passing robot must have mechanical processes exactly modelling our human neuronal connections, fiber tracts, signaling pathways, etc. but in synthetic physical form. This is what differentiates T3 from T4 as the former only requires that the performance capacity (i.e., “output” beyond email communications) be identical to that of a human. That is, the connections and pathways through which a T3-passing machine produces the same performance as a human are not relevant so long as the output is that which would have resulted from human cognition and action.

  13. The reading shows how Turing reframes "thinking" as observable performance, yet even a fully embodied, sensorimotor T3 robot would leave the "hard problem" unresolved: the question of whether anything is actually "felt" inside the machine. But perhaps we've been assuming the wrong direction of progress. While we build more complex systems to approximate cognition, the reading invites us to consider whether consciousness might instead emerge from the simplicity of spontaneity. Yet spontaneity could be only the surface of something far more complex. And what if consciousness itself is not the end state of this process but a stage for some larger principle we haven't yet conceived? Could the "hard problem" be not just about explaining subjective experience, but about discovering what it enables beyond cognition as we know it?

  14. Something that stood out to me in Harnad's writing was his response to Lady Lovelace's objection, which states that a mechanical machine can never produce anything new. The notion that nothing has been proven to have originated since the Big Bang demonstrates that machines are cause-and-effect systems. No machine, including humans, is really creating anything original, rendering Lady Lovelace's objection moot.

  15. “Disabilities and appearance are indeed irrelevant. But nonverbal performance capacities certainly are not. Indeed, our verbal abilities may well be grounded in our nonverbal abilities (Harnad 1990; Cangelosi & Harnad 2001; Kaplan & Steels 1999). (Actually, by "disability," Turing means non-ability, i.e., absence of an ability; he does not really mean being disabled in the sense of being physically handicapped, although he does mention Helen Keller later.)”
    I am uncertain that we can truly say in the context of these tests that “disabilities […] are irrelevant”. In this situation, disabilities refer to the computer’s inabilities to practice certain activities that “normal” humans can: walking, dancing, kissing, etc. It seems that these activities do have a certain relevance to what it is to think as a human: the structure of the brain (the center of human thinking) is altered by these kinds of activities, and it seems like a shortcut to state that they do not influence the way we think. Thus, if a computer remains plagued by an absence of the understanding of the phenomenology of human activities (what it feels like to partake in these activities, what it changes in one), it seems unlikely that the Turing Test will be able to accomplish its objective, namely “finding out what kind of machine we are, by designing a machine that can generate our performance capacity, but by causal/functional means that we understand, because we designed them.”

    Replies
    1. I get what you mean about whether disabilities are irrelevant, but Turing's argument (and Harnad's in the Annotation Game) is actually different, I think. The real question is: how do we prove thought? Turing wasn't interested in the question of whether a machine could dance or walk; he was interested in whether it can perform the same as we do, as a sign of understanding. The video also makes this point: he defined the "imitation game" to highlight performance, not looks. You're right that our non-linguistic activities like moving, touching and sensing govern how people think. That's why Harnad and others later argued that maybe the real test isn't even just the pen-pal test (T2), but the complete robotic test (T3), where a machine must utilize its body and senses in the manner we do. So, in a technical sense, you are right: Turing did leave it open, and Harnad illustrates that real sensorimotor ability matters. But what makes the Turing Test so valuable is that it gives us a concrete way to ask, "Can this system really do everything that we can do?"

  16. Harnad stresses that “thinking is as thinking does”, yet he also cautions that reducing the Turing Test to mere verbal imitation risks collapsing performance into a party trick. What struck me is the hierarchy of T2–T5: the distinction between text-only exchanges and fully embodied robotic capacities. Taken together, the paper suggests that cognition is not just about passing for human but about the scope and grounding of performance. A simulated plane cannot fly, and a simulated robot cannot act in the physical world—real performance requires real-world causality. This leaves me understanding that even if we build a T3-passing robot, indistinguishable from us in action, we still face the “other-minds problem”. So, I wonder, if indistinguishability is all we can test, is cognitive science ultimately about explaining behaviour only? Is the felt reality and neuroscience of thought permanently outside its reach?

  17. A point I found myself questioning is Harnad’s strong emphasis on embodiment as the solution to the symbol grounding problem. He suggests that T3 (giving a system sensorimotor capacity) is the step needed for genuine grounding. But I wonder whether this simply moves the problem rather than solves it. A robot might be able to interact with the world and link symbols to sensory inputs, but how do we know that this mapping really produces “understanding” rather than just more sophisticated symbol manipulation? In other words, embodiment may enrich performance, but it may not fully close the gap between doing and truly understanding.

  18. I find it very interesting how the Lovelace objection has persisted across time. It is very reminiscent of the AI-art debate currently ongoing (although AI art programs are mostly at the t0 and T2 levels). Depending on how she defined what something "new" meant to her, I wonder if you could say that apps like MidJourney are an answer to this objection. On the other hand, the way they go about creating art is so far from how natural human capabilities work, by cheating (like ChatGPT), that it might not even be worth bringing up as a counter in this context.

  19. “But even without having to invoke the other-minds problem (Harnad 1991), one needs to remind oneself that a universal computer is only formally universal: It can describe just about any physical system, and simulate it in symbolic code, but in doing so, it does not capture all of its properties: Exactly as a computer-simulated airplane cannot really do what a plane plane does (i.e., fly in the real-world), a computer-simulated robot cannot really do what a real robot does (act in the real-world) -- hence there is no reason to believe it is really thinking either.”

    This passage differentiates between computer simulations and real-world physical processes, using simulated vs. real airplanes as an example, citing this as a reason to discard the belief that universal computers might be able to "think." I do not quite understand this reasoning or analogy, because it is not obvious that "thinking" need always emerge the same way in the physical world. We are convinced that many living organisms "think" and that this is somehow related to the common trait of a nervous system, but these systems can present quite differently across species (on some level). Moreover, considering our complete lack of understanding of the Hard Problem, why should we assume that thinking can only emerge as a product of systems physically similar to those made up of biomolecules? Do we have any good reasons besides induction to assume that thinking cannot emerge from any systems other than non-simulated physical (perhaps biological) ones? What if the hardware of "thinking" is less important than the software?

  20. Stevan has said before that GPT is 'cheating' on the T2 test because it's taking a huge gulp of data from the internet rather than just going off of whatever the user has inputted. Since we know that GPT is just a text predictor, do the statistics that it runs count as computation? What would the symbols be, and what are the instructions it follows?

    Also, if you were to build a 'real' T2 machine, would it need a better understanding of language (i.e., know what nouns and verbs are and define them as different symbols, and follow the rules of syntax as its algorithm) in order to write back? Without knowing any of the meanings intrinsic to the words it outputs (assuming 'meaning' is for cognizers), how would its response be at all sensical?

    Replies
    1. Also, another thought inspired by what Anne-Sophie said outside of class: if we are following the rules of syntax, are we using computation when we use language? Could I say that words are just symbols if I argue that semantics is divorced from syntax (i.e., "colourless green ideas sleep furiously")? At the end of the day, aren't words just arbitrary combinations of sounds that we assign meaning to?

    2. Sevi, I hummed the first verse of the answer to an LLM (Claude) and then asked it to finish the song:

      Sevi, the LLM next-token predictor algorithm is definitely computation, but the data on which the algorithm performs its computation (the "Big Gulp" database contents) is not part of the algorithm...

      ...The algorithm just does pattern-matching and retrieval on that database. It's like a student taking an exam with crib notes: the algorithm is the looking-up, but the answers are coming from the database, not from understanding.

      That's why it's cheating on the Turing Test. A real T3 robot would need to have grounded its words in sensorimotor experience with their referents—not just learned correlations between word-shapes. LLMs only have the word-shapes and their statistical patterns. They don't know what the words are about.

      What LLMs have shown us is that these statistical patterns in "Language Writ Large" are surprisingly powerful—more than anyone expected. But powerful pattern-matching isn't the same as understanding. To really pass T2, a T3 robot would need to learn categories through direct interaction with the world, then ground words in those categories. That's the tough challenge still ahead.

      Sevi(2), yes, grammar is just syntax, hence symbol-manipulation rules. But Chomsky's famous

      "Colourless green ideas sleep furiously"

      is not just senseless syntax, as the poet John Hollander famously replied:

      "Curiously deep, the slumber of crimson thoughts:
      While breathless, in stodgy viridian.
      Colorless green ideas sleep furiously."


      Tell Anne-Sophie that the "autonomy of syntax" (from semantics) in natural language is not altogether a done deal, the way it is in computation. (Weeks 8 and 9)

  21. It seems that T3 is necessary to reverse-engineer our consciousness, as it is profoundly linked to our sensorimotor experience. Recall that T3 is defined by Harnad as total indistinguishability in robotic (sensorimotor) performance capacity. However, I am somewhat skeptical about whether this would be possible. For a machine to be functionally indiscernible from a human, it would have to have reached at least T4 or T5. Robotics can get very close, as we have many examples today, but for a machine to be indistinguishable in function it would have to be identical down to the molecular level.

  22. What stood out to me is how much hinges on clarifying what Turing really meant. Was it just the text-only version (T2) or the full robotic version (T3)? And is passing about tricking people sometimes, or genuinely matching human performance? Harnad makes clear that the test is about what a system can do, not what it may or may not feel. The Other Minds Problem matters for philosophy and in animal cognition research, but not for artifacts like computers where the uncertainty is near zero. Still, I wonder: why do we resist treating performance alone as “enough”?

  23. Harnad's Annotation Game is an extension of Turing's imitation game that clarifies its goal: not attempting to deceive, but providing a framework for reverse-engineering cognition. Expanding on 2a, "Can machines think?" becomes "Can we build systems that do what we do, for the same causal reasons?"
    Much of the ambiguity left after reading Turing's original text is resolved by this hierarchy (T0-T5). Meaning is grounded at T3, the embodied, sensorimotor level, while T2 assesses language ability. A purely verbal system can mimic speech, but it is unable to explain its own words in the absence of causal relations with the outside world.
    What makes Harnad's hierarchy valuable is that we may compare grounded and ungrounded models to determine whether embodied interaction produces generalization that disembodied computation cannot. Only when these theoretical disparities result in measurable performance differences does cognitive science remain an empirical field.

  24. Turing argues that the Turing Test evades the “hard problem” by focusing on observable behaviour, and he suggests that the gap between animate and inanimate entities is far greater than that between humans and other animals. Given this, to what extent can the Turing test serve as a reliable measure of machine intelligence? For example, take a machine that perfectly models the behaviour and decision-making of a bald eagle–knowing what it can eat, its optimal sleeping patterns, and the best times to hunt. If human and animal intelligence differ only in degree and not kind, would such a machine be legitimately considered intelligent, or does intelligence require more than a simulation of natural behaviours?

