Bayesian Models

'I would distinguish three different ways that one can use the Bayesian framework: evaluation (assessing how an ideal rational agent should behave), guidance (using the framework to guide your own decision-making), and description (using the framework to describe an agent, a psychological system, or an artifact).'

'I defend a broadly representationalist interpretation of Bayesian cognitive science, focusing especially on perception, motor control, and navigation. The basic idea is that Bayesian models posit representational mental states and so, when the models are explanatorily successful, we have good reason to believe that representational mental states exist. This is a kind of intentional realism (realism about representational mental states). It’s opposed to eliminativism (there are no representational mental states) or instrumentalism (postulation of representational mental states is just a useful way of talking).'

'It’s not surprising that you thought computational transitions are sensitive to syntax but not semantics, because that’s what a million research articles, survey articles, and encyclopedia articles say, and it’s what usually gets taught in philosophy of mind classes. However, I think that this is actually an ungrounded philosophical dogma that we should reject.'

Michael Rescorla's main research areas are philosophy of mind, philosophy of logic, epistemology, and philosophy of language. Much of his current research investigates the nature of mental representation, often drawing on cognitive science, computability theory, probability theory, and other neighboring disciplines. More specific research topics include: non-propositional mental representation (with cognitive maps as the main case study); the computational theory of mind; foundations of Bayesian decision theory (with a focus on conditional probability and Conditionalization); Bayesian modeling of the mind (especially perception, motor control, and navigation); norms of assertion; the structure of epistemic justification. Here he discusses Bayesian decision theory, perception, directed bodily motion, intentionality and normativity, non-factive and factive formulations, Andy Clark and predictive processing, nomological and mechanistic theories of psychological explanation, interventionism, Fodor's Language of Thought, the epiphenomenalist challenge, semantics and syntax in computational transitions, and deductive inferences.

3:16: What made you become a philosopher? 

Michael Rescorla: Philosophy blends the speculative and the rigorous, the literary and the formal, in a way that I find quite congenial. It’s great fun to introduce students to these timeless philosophical questions and to be reminded of how gripping the questions are. Philosophy also engages with many other disciplines, which gives me a chance to learn about those disciplines as part of my work. 

3:16:  Bayesian decision theory seems to be a key element in much of your work so I wonder if you could sketch for us what this is for the uninitiated and why it is such a useful philosophical tool? 

MR: Bayesian decision theory is a mathematical framework for modeling inference and decision-making under uncertain conditions. The framework aims at a “mathematization of rationality.” The key notion is subjective probability --- a quantitative measure of the degree to which an agent believes a proposition. Bayesian decision theory tells you how your subjective probabilities should fit together with each other, how you should revise them in response to new evidence, and how you should bring them to bear when deciding what to do. The framework is named after Rev. Thomas Bayes, and this is probably fair because Bayes had several important elements of the framework in mind, but the great French mathematician and physicist Pierre-Simon Laplace was the person who first articulated it in a systematic way. Later, many other authors helped develop the framework further (including Frank Ramsey and Bruno de Finetti). I would distinguish three different ways that one can use the Bayesian framework: evaluation (assessing how an ideal rational agent should behave), guidance (using the framework to guide your own decision-making), and description (using the framework to describe an agent, a psychological system, or an artifact). All three uses have proved fruitful. For example, game theory and formal epistemology routinely make evaluative use of the Bayesian framework. In terms of guidance, the Bayesian framework is frequently used for practical applications. 

Just to pick one of my favorite examples, it was used to locate the wreckage of Air France 447 after the plane crashed in the Atlantic Ocean in 2009. In terms of description, Bayesian cognitive science has produced empirically successful models of numerous psychological tasks. So, overall, the Bayesian framework is an extraordinarily useful tool for analyzing normative and descriptive aspects of psychological activity. One reason why it’s so fruitful is that it codifies the agent’s mental states in precise mathematical terms --- using numbers. This enables rigorous mathematical models, far more rigorous than one can achieve using only verbal tools. 
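The kind of update Rescorla describes — revising subjective probabilities in response to new evidence — is just Bayes' theorem applied to credences. A minimal sketch with made-up numbers (the scenario and values here are my own toy illustration, not from the interview):

```python
def conditionalize(prior, lik_h, lik_not_h):
    """Bayes' theorem: return P(H | E) from P(H), P(E | H), and P(E | not-H)."""
    p_e = lik_h * prior + lik_not_h * (1 - prior)  # total probability of the evidence
    return lik_h * prior / p_e

# Hypothetical numbers: prior credence 0.3 that it will rain (H);
# the evidence E (dark clouds) is much likelier if H is true.
posterior = conditionalize(prior=0.3, lik_h=0.8, lik_not_h=0.2)
# The evidence favors H, so the posterior exceeds the prior.
```

Conditionalization, discussed later in the interview, is the norm that says your new credence in H upon receiving E should equal this conditional probability.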

3:16:  One area you’re interested in is perception, and one question is how the perceptual system transits from proximal sensory inputs to perceptual states that represent the distal environment as being a certain way. You’ve examined Bayesian models to illuminate the computations that perceptual systems use. Can you say how these systems work and why a Bayesian approach seems fruitful? 

MR: This is an example of a successful descriptive application of the Bayesian framework. The perceptual system has a problem to solve: it needs to estimate conditions in the external world based upon proximal stimulations of sensory organs (such as retinal stimulations). The basic idea behind Bayesian perceptual psychology is that the perceptual system solves this problem through a Bayesian inference. Roughly, the perceptual system is trying to compute which environmental conditions are most probable in light of the sensory stimulations it’s receiving. For example, what is the probable size of this object based on the retinal stimulations currently being received? This isn’t an inference that you the perceiver execute or are aware of. It’s an inference executed by your perceptual system. Bayesian models of perception can explain a wide range of perceptual phenomena, such as some perceptual illusions, that otherwise have been difficult or impossible to explain in a satisfying way. Just to give one example: there are very successful Bayesian models of how the perceptual system fuses visual and haptic cues to the size of an object. In my opinion, this is our most promising scientific framework for studying perception. 
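The visual-haptic fusion models Rescorla mentions standardly treat each cue as a Gaussian estimate and combine them by weighting each cue in proportion to its reliability (inverse variance). A minimal sketch under those Gaussian assumptions; the function name and numbers are mine:

```python
def fuse_cues(mu_v, var_v, mu_h, var_h):
    """Reliability-weighted fusion of a visual and a haptic size estimate."""
    w_v = (1 / var_v) / (1 / var_v + 1 / var_h)  # weight on the visual cue
    mu = w_v * mu_v + (1 - w_v) * mu_h
    var = 1 / (1 / var_v + 1 / var_h)            # fused estimate is more reliable
    return mu, var

# Hypothetical cues: vision says 5.0 cm (low variance), touch says 5.6 cm (high variance).
mu, var = fuse_cues(mu_v=5.0, var_v=0.25, mu_h=5.6, var_h=1.0)
```

The fused estimate lands closer to the more reliable cue, and its variance is lower than either cue's alone — a signature prediction that these models have confirmed experimentally.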

3:16: Another area where you think Bayesian models are fruitful is in working out the mental processes that control goal directed bodily motion. To understand this we need to understand what you mean by mental representation don’t we, alongside other stuff. So first, what do you mean by mental representation, and why rebut eliminativism and instrumentalism and defend a version of intentional realism regarding it? 

MR: This is another successful descriptive application of the Bayesian framework. Here we’re studying how the motor system chooses motor commands that promote the agent’s goals. For example, I want to lift a book. There are a lot of different ways to lift the book. My motor system needs to choose a sequence of motor commands that result in me lifting the book and that hopefully are pretty efficient. The basic idea behind Bayesian sensorimotor psychology is that the motor system uses Bayesian inference and decision-making to choose motor commands. As in Bayesian modeling of perception, these models have proved very explanatorily successful. One of the main things I want to argue regarding Bayesian models of the mind, including Bayesian sensorimotor models, is that they assign a crucial role to mental representation. By “mental representation,” I basically mean a type of mental state that can be assessed as veridical or non-veridical. For example, a perceptual illusion results in a perceptual state that inaccurately represents the world --- the state is non-veridical. So the very notion of a perceptual illusion presupposes that the perceptual state can be assessed as veridical or non-veridical. 

Or, take the case of motor control: if the motor system has a goal of picking up some object, then that goal can be fulfilled or thwarted, so again we’re dealing with a mental state that can be assessed against the world as veridical or non-veridical. A subtler example involves the probability assignments that figure in Bayesian modeling. To illustrate, suppose the perceptual system assigns a certain probability to the hypothesis that an object has a certain size. We can ask whether the object actually has that size. So we are dealing with a hypothesis (that the object has a certain size) that can be assessed against the world for veridicality. In several of my publications, I defend a broadly representationalist interpretation of Bayesian cognitive science, focusing especially on perception, motor control, and navigation. The basic idea is that Bayesian models posit representational mental states and so, when the models are explanatorily successful, we have good reason to believe that representational mental states exist. This is a kind of intentional realism (realism about representational mental states). It’s opposed to eliminativism (there are no representational mental states) or instrumentalism (postulation of representational mental states is just a useful way of talking). My most recent discussion is in the monograph Bayesian Models of the Mind, a contribution to the Cambridge Elements in Philosophy of Mind series. The Element also gives a more general philosophical introduction to Bayesian cognitive science, including exposition of a lot of mathematical details that are important for fully understanding these models. 

3:16:  What does your analysis show us about the link between intentionality and normativity? 

MR: The most basic moral I would draw is that intentional attribution is tightly linked to normative assessment. You might think that normativity has no place in scientific psychology; some authors have suggested as much. We’re studying how the mind works, not how it should work. Well, not so fast. In all the Bayesian psychological models I’ve been describing, the descriptive theory isn’t constructed in isolation from normative considerations. Rather, we begin with a normative model: how should an ideal rational agent proceed? The normative model serves as a crucial guide to finding the empirically correct descriptive psychological theory. In that sense, Quine and especially Davidson were correct to emphasize the methodological centrality of normative assessment to intentional attribution. What I’m suggesting is that normativity is central to the discipline of psychology. 

In psychology, we are studying mental states (such as belief, or degree of belief, or subpersonal analogues thereof) that are subject to rational norms. The norms help us understand the mental states, and they help us describe the mental processes that the states participate in. That’s the most basic moral. Can one draw a stronger moral? Perhaps something to the effect that normativity is constitutive of mental representation? I’m not sure about that, although I think it’s worth discussing. 

3:16:  And how can it be better to assume non-factive rather than factive formulations for philosophical and scientific applications of Bayesian decision theory? 

MR: A key idea behind the Bayesian framework is that the agent receives evidence E and then updates her subjective probabilities on that basis. There’s a norm called Conditionalization that governs these updates. A fundamental question arises: is E true or not? Factive formulations of Conditionalization assume that E is true. Non-factive formulations allow that E may be either true or false. Philosophers usually give a factive formulation. They say things like “suppose the agent learns E.” Learning is a factive state. So this formulation assumes that E is true. However, some philosophers explicitly formulate Conditionalization in non-factive terms, and in my opinion this is preferable. The reason is that non-factive formulations have much wider coverage. Obviously, agents can make mistakes, and in particular they can update based on an E that is false. If you formulate Conditionalization factively, you are simply excluding such cases from consideration. But that doesn’t sound like a good idea, whether your goals are normative or descriptive. 

From the normative viewpoint, we can still rationally assess a credal update that’s based on a false E: it’s not like anything goes just because E is false. So we need a rational norm governing credal updates even when E is false. In other words, we need a non-factive formulation of Conditionalization. From a descriptive perspective, we want our descriptive theory to apply widely, including to someone who makes a mistake and updates based upon a false E. So again, we want a non-factive formulation. In my opinion, then, the non-factive approach is far superior to the factive approach for both normative and descriptive applications of the Bayesian framework. One of the main things I’ve tried to do over the past few years is explore the non-factive approach and its advantages. Let me give you a concrete example of the advantages. One of the main arguments for Conditionalization, stemming from work by David Lewis and Brian Skyrms, is that any other update rule will leave you vulnerable to a sure loss. In other words, if you employ any other update rule, then a clever bookie can induce you to accept a series of bets that guarantee you will lose money. 

In contrast, if you follow Conditionalization, you’re not vulnerable in this way to a sure loss. The contrast seems to show that Conditionalization is in some way superior to all alternative update rules. Now, there are many ways one can object to this argument, but one of the biggest objections is that the argument seems to overgeneralize. As Bas van Fraassen showed, the same basic argument given by Lewis and Skyrms also supports the Reflection principle, which says roughly that you should defer to your future subjective probabilities. If you violate Reflection, you are vulnerable to a sure loss similar to the one discussed by Lewis and Skyrms for Conditionalization. Van Fraassen concluded that Reflection is a rational norm you should obey. However, most epistemologists are fairly suspicious, because there seem to be fairly obvious counterexamples to Reflection, e.g. situations where you expect to update your credences based on misleading evidence. So we have a problem, which is that one of the main arguments for Conditionalization seems to overgeneralize into an argument for the very implausible Reflection principle. 

In a recent paper (“Reflecting on Diachronic Dutch Books”), I argue that things look quite different once we shift to a fully non-factive setting. In a non-factive setting, you can give a version of the Lewis-Skyrms argument for Conditionalization, but the argument no longer generalizes into an argument for Reflection. It doesn’t generalize because in a non-factive setting we now must take into consideration scenarios where your future credences are based on a false E. We ignore those scenarios in a factive setting, and it’s only because the original Lewis-Skyrms argument ignored those scenarios that it generalized to Reflection. Once we take those scenarios into account (and these are among the main scenarios that show Reflection to be implausible), one can show that any update rule other than Conditionalization gives rise to a sure loss and that violations of Reflection do not necessarily give rise to a sure loss. This is a concrete example of how it helps to formulate things non-factively. 

3:16:  Andy Clark defends the view of the mind as geared most fundamentally towards prediction error minimization. He’s a trend setter, so this is important but you think there are problems with the models at the heart of this idea, the so-called Predictive Processing (PP) models. So why are you dubious about these models, and if they fail, is this fatal to the view of the mind Clark is defending? 

MR: PP models are neural network models that develop the following idea: the mind makes predictions about what sensory input it will receive from the world; it then compares these predictions with actual sensory input, computing a prediction error term that can then serve as a basis for further computation. As you observe, Andy Clark has now authored many publications (including a book) that explore PP models and draw philosophical conclusions. Jakob Hohwy has also written a book that does something similar. In some cases, I think the PP models that Clark and Hohwy discuss are quite problematic. In particular, the models of motor control that they emphasize are not compatible with the most successful theories found in the motor control literature. I discuss this case in a lot of detail in several places, including my review of Clark’s book. Aside from the case of motor control, though, it’s not so much that I have an objection to PP modeling as that I think it’s being oversold. For example, the PP framework is one way you might try to ground Bayesian modeling of perception in something more neurophysiological, but it’s just one way --- there are several other options. My paper “Neural Implementation of (Approximate) Bayesian Inference” discusses some of the other options. Over the last few years, I’ve seen a lot of attention being paid by philosophers of mind to PP models. Lots of publications, workshops, reading groups, etc. I find this attention somewhat disproportionate to the scientific importance of these models. There are alternative neural implementation models that are equally worthy of discussion and that are receiving far less attention. 

3:16:  There are some prominent nomological and mechanistic theories of psychological explanation that you take issue with. Before looking at your preferred approach, can you sketch for us what some of these alternatives are, and what you see as their main problems? 

MR: This is connected to the more general philosophy of science literature on scientific explanation. Nomological theories say that scientific explanation is a matter of subsuming the explanandum (the thing being explained) under a scientific law. Mechanistic theories say that explanation is a matter of limning mechanisms that produce the explanandum. Both perspectives have been developed extensively in general philosophy of science and also as applied to the special case of psychological explanation. There’s so much one could say about the pros and cons of each approach, but I’ll just be relatively brief. Regarding nomological theories, various counterexamples show that subsumption under a law isn’t sufficient for explanation. The most famous counterexample is the flagpole case: you can “explain” the height of the flagpole by citing the length of the shadow it casts, the position of the sun, and the laws of optics. This doesn’t seem like a good explanation at all! The flagpole case is a staple of undergraduate philosophy of science courses, with good reason: it’s a devastating counterexample. And you can give examples that make a similar point for psychological explanation. Like most philosophers nowadays, I see the nomological approach as pretty much hopeless. Mechanistic theories aren’t subject to quite such clear-cut counterexamples, but there are definitely cases that look like strong counterexamples. A pretty standard example is the ideal gas law. We can use this to explain, say, the pressure exerted on the interior walls of a container that contains a gas. Yet the explanation doesn’t look remotely mechanistic. We’re not giving any kind of mechanism that produces the pressure. Instead, we’re isolating causally relevant factors (volume, temperature, number of moles of the gas) along with a generalization (the ideal gas law) that describes how those factors influence the explanandum (pressure). 

We aren’t saying anything about the mechanism through which that influence occurs. So it looks like there are pretty clear cases of non-mechanistic causal explanation. And, for that reason, some mechanists will concede that mechanism isn’t really the whole story about scientific explanation, that it only covers certain explanations. To my mind, though, that concession doesn’t save the mechanistic theory, because there’s a second problem: namely, some mechanistic details are explanatory and others are not. For example, suppose I want to explain why inflation was so high over the past few years. There are lots of factors I could mention, such as interest rates, supply chain issues, and so on. Various factors like that may turn out to be explanatory. However, if I start adducing lots of mechanistic details, many of those details are going to be irrelevant. For example, if I start talking about gears in the currency printing presses, or locking mechanisms on cargo ships, those mechanistic details are in some sense part of what produced high inflation, but they don’t seem to add anything to the explanation and in fact in many cases they look like they detract from the explanation. We need some kind of principled basis for distinguishing between explanatory versus non-explanatory mechanistic details. And, at that point, it looks like the resources we employ to draw those principled distinctions are going to be the real core of our theory of explanation. The appeal to mechanism is just a distraction from the real theory of explanation. 

3:16:  You see a theory called Interventionism as an advance over these approaches, and your example of Bayesian perceptual psychology helps us understand it. So what is interventionism and why is it a superior approach to scientific psychological explanations? 

MR: Interventionism is a theory of causal explanation that is espoused, for example, by James Woodward in his book Making Things Happen. The basic idea is that you explain an explanandum by adducing causally relevant factors, where causal relevance is cashed out in terms of interventions: what makes a factor causally relevant is that intervening on it is a way of manipulating the causal effect. When a gas is inside a container, you can manipulate the pressure exerted by a gas by intervening on temperature; that’s what makes temperature causally relevant to pressure, and it’s also why citing temperature is a way of explaining pressure. A good causal explanation gives you counterfactual information about how the explanandum would have been different had you intervened upon the explanantia (the causally relevant factors that you cite to explain the explanandum). You don’t need to provide a mechanism: in the ideal gas law example, we give useful counterfactual information without citing a mechanism. This also explains where the nomological approach goes wrong: intervening on the length of the shadow (e.g. by tinkering with the light source) is not a way of manipulating the height of the flagpole, so you can’t cite the shadow length to explain the flagpole height. Woodward works this out into what I regard as a quite impressive theory of causal explanation. 
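The gas example can be made concrete. Intervening on temperature manipulates pressure via the ideal gas law P = nRT/V, with no mechanism cited; a small sketch (the particular values are my own hypothetical ones):

```python
R = 8.314  # gas constant, J/(mol*K)

def pressure(n_moles, temp_kelvin, volume_m3):
    """Ideal gas law: P = nRT / V (pressure in pascals)."""
    return n_moles * R * temp_kelvin / volume_m3

# One mole of gas in a 25-liter container.
p_before = pressure(n_moles=1.0, temp_kelvin=300.0, volume_m3=0.025)
# Intervene on temperature, holding volume and amount of gas fixed.
p_after = pressure(n_moles=1.0, temp_kelvin=330.0, volume_m3=0.025)
# Pressure rises in proportion to the intervention: that counterfactual
# dependence is what makes temperature explanatorily relevant here.
```

Nothing in the computation mentions molecular collisions; the explanatory work is done entirely by the counterfactual "had temperature been intervened upon, pressure would have differed thus," which is the interventionist's point.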

In my paper on this topic (“An Interventionist Approach to Psychological Explanation”), I apply Woodward’s theory to psychology. As you say, one of the main examples I give is Bayesian perceptual psychology. In a Bayesian model, we cite subjective probabilities encoded by the perceptual system to explain perceptual estimates. For example, a pretty standard posit is that the perceptual system assigns high prior probability to the hypothesis that objects move slowly. This posit explains quite a number of motion estimation illusions in a unified way. It turns out that if we manipulate the “slow speed” prior probability, then perceptual motion estimates change (stimuli look like they’re moving faster). This case and others like it fit very well with an interventionist approach. Here we have a model that tells us how manipulating a causally relevant factor (the “slow speed” prior probability) is a way of manipulating the explanandum (the motion estimate). However, the explanation is non-mechanistic: the Bayesian model doesn’t tell us anything detailed about the computational or neural mechanisms through which the perceptual system converts the “slow speed” prior probability into the motion estimate. So this looks like a compelling causal explanation that fits well with the interventionist theory but not with the mechanist theory. 
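Under the Gaussian assumptions common in these models, the effect of manipulating the slow-speed prior is easy to display: the posterior mean is a reliability-weighted compromise between a zero-mean prior and the sensory measurement. A sketch with hypothetical numbers of my own:

```python
def posterior_speed(measured, var_sensory, var_prior):
    """Posterior mean speed under a zero-mean Gaussian 'slow speed' prior."""
    w = var_prior / (var_prior + var_sensory)  # weight given to the measurement
    return w * measured

# Same sensory measurement (10 deg/s), two settings of the prior.
strong_prior = posterior_speed(measured=10.0, var_sensory=4.0, var_prior=1.0)
weak_prior = posterior_speed(measured=10.0, var_sensory=4.0, var_prior=16.0)
# Weakening (broadening) the slow-speed prior makes the estimated speed faster,
# mirroring the intervention Rescorla describes.
```

The model thus specifies exactly how intervening on one causally relevant factor (the prior) manipulates the explanandum (the speed estimate), while remaining silent about the neural mechanism that carries out the computation.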

3:16:  Turing computation is familiar from analysis of linguistic domains. Fodor and his Language Of Thought is the parade case of this I suppose. So firstly, what’s the relationship between Turing computation and representation – and what new problems have to be solved when we try applying it to a non-linguistic domain? 

MR: A Turing machine is an idealized mathematical model of a computing device. The Turing machine operates over a language, by which I just mean a collection of primitive symbols and complex expressions formed from those primitive symbols. In the simplest case, there is a single primitive symbol --- say, a stroke mark --- and the complex expressions are strings of strokes. The Turing machine can inscribe symbols in memory locations, erase symbols from memory locations, and move from one memory location to another. To a first approximation, all the digital computing devices used in our society can be understood as Turing-style computing machines. They are usually a lot more complicated than Turing machines in terms of their details and their architecture, but in essence they are doing the same basic computations as Turing machines. In most applications, we want to consider computation over a non-linguistic domain. For example, the main focus of computability theory is computation over natural numbers, i.e. the numbers 0, 1, 2, 3, 4, etc. We want to do things like add numbers, multiply them, and so on. If we want a Turing machine to do this, there’s a problem because the Turing machine operates over symbols, not numbers. To bridge this gap, we need to treat the symbols as representing numbers. The Turing machine can’t directly manipulate numbers. It can only manipulate symbols that represent numbers. For that reason, representation becomes absolutely central to our understanding of Turing computation as soon as we move beyond computation over strings and consider instead computation over numbers. 
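The symbol/number distinction can be sketched in a few lines. The operations below manipulate only stroke strings, exactly as a Turing machine would; it is our interpretation of those strings (the standard unary encoding) that makes the syntactic operation count as addition over numbers. The code is my own toy illustration, not a full Turing-machine simulation:

```python
def encode(n):
    """Represent the number n as a string of n stroke marks."""
    return '1' * n

def decode(s):
    """Interpret a stroke string as the number it represents."""
    return len(s)

def add_strokes(s, t):
    # A purely syntactic operation: concatenate two stroke strings.
    # Under the unary interpretation, this computes addition over numbers.
    return s + t

result = decode(add_strokes(encode(3), encode(4)))  # interpreted as 3 + 4
```

Nothing in `add_strokes` "knows" about numbers; the arithmetic reading exists only relative to the encoding, which is Rescorla's point about why representation becomes central once we consider computation over a non-linguistic domain.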

A similar analysis applies much more widely throughout computability theory, computer science, robotics, and other fields that study computation. In probabilistic robotics, for example, we want the robot to find its way autonomously through the physical environment, and the robot does this through a Bayesian computation over possible spatial layouts of the environment. But it’s not like the robot can directly manipulate spatial layouts. What it can do instead is manipulate symbols that represent possible spatial layouts. So we can’t fully understand the robot’s computations until we understand the robot as representing possible spatial layouts. In general, then, Turing computation doesn’t necessarily involve representation (no representation need be involved if we are only considering computation over strings of symbols), but Turing computation over a non-linguistic domain always involves representation. 

Now, turning to the mind and mental computation. In the 1960s, several philosophers and cognitive scientists proposed that we could model the mind as a computing system, something like a Turing machine. This proposal is usually called the classical computational theory of mind (CCTM). For a long time, CCTM guided research in cognitive science --- it is much less influential on scientific research than it used to be, but it still finds some proponents and still receives a lot of attention from philosophers. Jerry Fodor’s development of CCTM was especially notable. Fodor proposed that mental computation operates over a language of thought, containing mental representations analogous to natural language words and sentences. This is supposed to be a “language” in a much more robust sense than I used the phrase in my initial description of Turing machines. In my initial description, a “language” is a collection of primitive symbols and complex expressions formed from those symbols. The “language of thought,” by contrast, is supposed to contain logically structured expressions. It contains elements like conjunction, disjunction, negation, the quantifiers, and so on. So a language composed of strings of strokes would not count as a Fodorian language of thought, because no logical structure is present in that simple stroke language. A Fodorian language of thought requires some kind of logical structure. If we’re studying high-level propositional attitudes --- like belief and desire --- then the language of thought is a plausible hypothesis. We know that propositional attitudes can have logical complexity (e.g. I can think a logically complex thought such as If John is a biologist, then either Mary is a physicist or Drew is not a chemist), and so the language of thought hypothesis is a natural fit. 

However, the language of thought hypothesis looks less plausible to me when we study relatively low-level psychological domains, such as perception, motor control, and navigation. I don’t see any evidence here for anything like the kind of logical structure posited by Fodor. I agree with him that these cases involve computation over mental representations, but I don’t think we should posit logically structured mental representations in these cases. If you accept my analysis, the question now arises how we should think about mental representations in these low-level domains. This is a topic I worked on a while ago; the main case I discussed was cognitive maps, which are mental representations posited in cognitive science to explain navigation in various species (especially mammals). A cognitive map is a mental representation that represents the spatial layout of the animal’s environment. This would be an example of a mental representation that doesn’t have the logical structure posited by Fodor: it doesn’t contain quantifiers, negation, disjunction, or (probably) even conjunction and predication. One then wants to understand in more detail what cognitive maps are, how they are structured, how they differ from a Fodorian language of thought, and so on. I’ve tried to say something about this in my past work, but there’s a lot more to be said, and I hope to return to the topic in my future work. 

Of course, cognitive maps are just one example. We also want to look at the non-logical representations that figure in perception, motor control, and other domains. There is more and more work on this by philosophers (for example, Ned Block and Tyler Burge in their recent books both discuss non-logical representation in perception), and I’m hopeful that collectively as a discipline we can make some progress. It’s a good example of the kind of topic where scientifically-informed philosophical reflection has the potential to make a contribution. 

3:16:  How does the classical computational model of mind avoid epiphenomenalism, if indeed it does? 

MR: The basic idea behind the classical computational theory of mind (CCTM) is that the mind executes Turing-style computations over symbols. On most versions of CCTM, such as Fodor’s, we think of these symbols as mental representations. Maybe elements drawn from a language of thought, maybe non-logical representations such as cognitive maps --- but mental representations in some sense. In a very influential series of papers and books, Fodor presented a particular version of this picture. On Fodor’s version, the symbols are pieces of formal syntax. They are individuated in a purely non-representational, non-semantic fashion. Just like the strings of strokes manipulated by a simple Turing machine, they are pieces of pure formal syntax. They might come to acquire representational properties, just as strings of strokes can come to represent numbers or other entities. But, inherently, they are just pieces of syntax. Mental computation manipulates these pieces of syntax, and it’s sensitive to their syntactic properties but not to any semantic or representational properties that they may have. So the mind is a “syntactic engine”: it crunches pieces of syntax and is not sensitive to semantics. 

During the 1980s, it gradually became clear that this picture gives rise to a pretty serious epiphenomenalist worry. Suppose I want to drink some water, so I walk to the sink to get some water. Intuitively, it seems that the content of my desire is causally relevant to why I walked to the sink. It wasn’t just an accident that I walked to where I could get water. I walked there because it was water that I wanted. Unfortunately, it’s hard to see how we can vindicate these ideas on a formal syntactic version of CCTM. On the formal syntactic picture, mental computation is sensitive to syntax, not semantics. So the mental computations that convert my desire (to drink water) into my behavior (walking to the sink) aren’t sensitive to the content or semantics of my desire. But then it looks like the content of the desire is causally irrelevant to my behavior. It’s just syntax, not semantics, that does all the causal work. This seems like a kind of epiphenomenalism. 

Fodor and lots of other people devoted quite a lot of energy to trying to avoid this epiphenomenalist consequence. I won’t go into all the details, but suffice it to say that I don’t think these attempts succeed. Instead, I think that we need to revisit the initial premise that mental computation is sensitive to syntax but not to semantics. That premise is already so close to the epiphenomenalist conclusion that, once we accept it, there is extremely limited room to maneuver around that conclusion. I think we should reject the premise. We should question whether computation is always sensitive to syntax but not semantics. Only then can CCTM avoid the undesirable epiphenomenalist conclusion. 

3:16:  I thought that computational transitions were syntax sensitive systems not semantic sensitive, which just shows how little I know! So what do you say: are computational transitions sensitive to semantics? 

MR: It’s not surprising that you thought computational transitions are sensitive to syntax but not semantics, because that’s what a million research articles, survey articles, and encyclopedia articles say, and it’s what usually gets taught in philosophy of mind classes. However, I think that this is actually an ungrounded philosophical dogma that we should reject. It’s standard in the literature to distinguish between original and derived intentionality. I don’t think this is the greatest terminology, but I’ll use it here because it’s fairly familiar. Roughly: original intentionality arises when a system helps generate its own semantics (maybe with lots of help from evolution, causal history, etc.), whereas derived intentionality arises when the system has its semantics assigned to it by external observers. Digital computers are a paradigm of derived intentionality. Nothing about the computation of the digital computer helps generate the semantics of the computer’s states. We need to interpret the states --- say, as representing numbers, or spatial layouts, or whatnot. (The situation might be different with a really sophisticated robot, but let’s set that case aside.) In contrast, humans are examples of original intentionality. For example, the fact that I’m thinking about water isn’t a matter of external assignment or interpretation by an observer. It flows partly from how my mental states figure into my overall mental functioning and my causal interactions with the world. 

Okay, there’s a lot more one could say about this distinction, but let’s grant it for present purposes. My basic view is that, if you’re dealing with a Turing-style computational system that only has derived intentionality, then the system’s transitions are sensitive to syntax but not semantics; however, if you’re dealing with a Turing-style computational system that has original intentionality, then the system’s transitions are at least potentially sensitive to semantics. Intuitively, this is because in the former case the semantics floats free from the underlying computations, so you can arbitrarily change the semantics without affecting the computations. But the same is not true in the latter case, because there the semantics is so intertwined with the computations. I develop this view using interventionist ideas about causal relevance. Basically, in the case of derived intentionality, you can intervene to change the semantic properties of a mental state (simply by reinterpreting the state), and this intervention won’t affect downstream computation; hence, semantics is irrelevant. In the case of original intentionality, though, it’s not so simple. Here, I argue, at least some interventions on semantic properties will affect downstream computation, because of how tightly intertwined the semantics is with the system’s computations; hence, the semantics is causally relevant. 

For example, if someone manipulates me from desiring water into desiring orange juice, then this manipulation must also change the underlying mental representations (I switch from a representation of water to a representation of orange juice), which will lead me to behave differently. Manipulating the semantics does at least sometimes lead to changes in behavior, so the semantics is causally relevant. I try to spell this out in quite a lot of detail in my paper “The Causal Relevance of Content to Computation,” where I also answer a lot of objections that some of your readers may have in mind, such as Twin Earth considerations or Jaegwon Kim’s well-known “exclusion” argument. So my answer to your question is: yes, some computational transitions are sensitive to semantics. But to clarify why this is so you really need to attend carefully to things like the nature of causal relevance and how the computational system comes to have its semantics. 

There’s another point I’d like to make, which goes to the heavy emphasis upon syntax in the first place. I don’t think all this talk about syntax is helpful or appropriate for many mental computations. There’s a second philosophical dogma that I reject: the dogma that Turing-style computation operates over syntactic entities. (This is weaker than the previous dogma, the dogma that computation is sensitive to syntax but not semantics. You could accept that computation operates over syntactic entities while insisting that it’s sensitive to semantic properties of those entities.) This second dogma is often presented by philosophers as if it somehow follows from the definition of Turing machine. However, it doesn’t follow from the definition of Turing machine! If you read any good book on computability theory, you’ll see that the notion of Turing machine is defined in a very general way that makes no appeal to any notion of “formal syntax” or anything like that. The Turing machine operates over symbols, but the definition doesn’t really tell you anything about what the symbols are. It certainly doesn’t tell you that the symbols have any “formal syntactic” properties, or that they are individuated non-representationally. Moreover, if you look at many areas of cognitive science, such as perception, motor control, and navigation, they don’t seem to feature anything like the formal syntactic items posited by Fodor. Thus, while I agree with Fodor that many areas feature computation over mental representations, I don’t agree that we should regard the mental representations as pieces of formal syntax that then come to be somehow endowed with a semantics.

Instead, I favor a different approach to mental representations. I think that, in many cases, mental representations are individuated partially through their representational properties. For example, we can posit a mental representation that represents water, and the fact that the mental representation represents water is part of what makes it the representation that it is. Similarly, we can posit a perceptual representation that represents sphericality, and this representational relation to sphericality is inherent to the nature of the perceptual representation. I think that in most cases it’s better to regard mental computation as operating over mental representations construed in this way rather than in the formal syntactic way favored by Fodor. We should think of mental representations as “semantically permeated.” Several of my publications over the past decade (e.g. “From Ockham to Turing --- and Back Again”) have been devoted to developing this “semantically permeated” conception of mental representations, trying to work out what it involves, how it fits together with various models of computation, and how it interfaces with contemporary cognitive science. 

3:16:  Does your approach to minds and cognition help us understand whether animals can make non-linguistic deductive inferences? Is Fodor right to construe cognitive processes as computations defined over the language of thought (or Mentalese), and do computational models of navigation drawn from probabilistic robotics help support his view that his cat is rational even if working in a non-logical representational medium?  

MR: This question is a great way to end the interview, because it ties together several strands in my research: Bayesian modeling; the structure of mental representations; and mental computation. I would say that deductive inference requires some kind of logical structure. For example, you infer p from p & q, and there you exploit the conjunctive structure of the premise. So, when we look at deductive reasoning, we are looking at a mental process that exploits some kind of logical structure. How exactly to think about that structure is one of the big questions in philosophy of language and mind over the past century, but one natural way to think about it is in terms of a Fodorian language of thought. Then you can model deductive inference in terms of computations over logically structured mental representations. That’s how Fodor thought of it, and I think something like that picture is quite attractive (minus the formal-syntactic commitments that Fodor emphasized and that I expressed some skepticism about in my answer to the previous question). What about non-deductive inference? Well, our best developed theory of non-deductive inference is Bayesian decision theory. And here things are a lot more complicated. 
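The conjunction-elimination example can be made vivid in a proof assistant (a minimal sketch in Lean, my illustration rather than anything from Rescorla's work): a proof of p ∧ q is a structured object, and the inference to p is a computation that exploits that conjunctive structure.

```lean
-- Deductive inference as computation over a logically structured item:
-- the hypothesis `h` is a proof of the conjunction `p ∧ q`, and `h.1`
-- extracts its left component, mirroring the inference from p ∧ q to p.
example (p q : Prop) (h : p ∧ q) : p := h.1
```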

Some versions of the Bayesian framework assume that probabilities attach to sentences with logical structure. If we take those versions as a guide, we’ll end up positing logically structured mental sentences to serve as the domain of Bayesian inference. However, other versions of the Bayesian framework don’t assume that probabilities attach to logically structured items. Specifically, the Russian mathematician Kolmogorov offered an extremely influential axiomatization of probability that doesn’t assume logically structured items. Kolmogorov’s axiomatization opens up the possibility of Bayesian computations over non-logical mental representations, such as cognitive maps. This is a suggestion I’ve explored since fairly early in my career, beginning with a paper called “Cognitive Maps and the Language of Thought.” Thus, I think you can have rational mental computations in a non-logical representational medium. The computations are rational because they’re Bayesian (or one might say that in some cases they’re approximately rational because they’re only approximately Bayesian). Yet they don’t operate over a Fodorian language of thought. They operate over map representations with a different kind of representational structure. Probabilistic robotics gives a pretty concrete sense of how this could work, because here you have quite detailed algorithms grounded in Bayesian decision theory. The algorithms help the robot navigate through the world, by locating itself on a map and in some cases building the map as well. These computations are (approximately) Bayesian, hence (approximately) rational, yet non-deductive and non-logical. 
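To give a rough sense of the kind of algorithm at issue, here is a minimal sketch (my illustration, not code from Rescorla's papers or from the robotics literature) of a discrete Bayes filter localizing a robot on a cyclic one-dimensional map of colored cells. The belief state is simply a probability distribution over map locations, updated by conditionalization on sensor readings and shifted by motion; nothing in the representation has quantifiers, negation, or any other logical structure.

```python
# Minimal discrete Bayes filter for localization on a cyclic 1-D map.
# Hypothetical example: the "map" is a list of cell colors, the belief is a
# probability distribution over cells, and each sensor reading triggers a
# Bayesian update. No logically structured representations are involved.

world = ["green", "red", "green", "green", "red"]  # the map: cell colors
belief = [1.0 / len(world)] * len(world)           # uniform prior over cells

P_HIT, P_MISS = 0.8, 0.2  # sensor model: P(reading | cell color)

def sense(belief, reading):
    """Conditionalize the belief on a color reading (Bayes' rule)."""
    posterior = [b * (P_HIT if color == reading else P_MISS)
                 for b, color in zip(belief, world)]
    total = sum(posterior)
    return [p / total for p in posterior]

def move(belief, step):
    """Shift the belief to reflect a deterministic move of `step` cells."""
    n = len(belief)
    return [belief[(i - step) % n] for i in range(n)]

belief = sense(belief, "red")    # robot sees red
belief = move(belief, 1)         # robot moves one cell forward
belief = sense(belief, "green")  # robot sees green
```

After seeing red, moving one cell, and seeing green, the belief concentrates on the two map locations consistent with that history. The (approximate) rationality of the process lies entirely in the conditionalization, not in any logical form of the representations.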

My suggestion is that many actual mental processes, especially in low-level domains like perception and navigation, are similarly Bayesian and non-logical. This isn’t to deny that there’s a Fodorian language of thought. It’s just to say that lots of other kinds of mental representation and mental computation are occurring as well. For example, in the case of navigation, there’s mounting psychological and neuroscientific evidence that human navigation involves something like the Bayesian algorithms found in robotics. And, like I said earlier, the representations at work in those algorithms don’t look anything like a Fodorian language of thought. So here you have mental processes that are approximately rational (or at least they satisfy some kind of subpersonal analogue to approximate rationality), yet they operate over non-logical representations. The same is true, I would suggest, of perception, motor control, and many other domains. 

3:16:  And finally, are there five books that you can recommend to the curious readers here at 3:16 that will take us further into your philosophical world? 

MR: 

Sharon McGrayne, The Theory That Would Not Die. A great introduction to the history of how the Bayesian framework has developed and been applied since its inception. It’s one of the best popular science books I’ve read. Formal epistemologists talk all the time about the Bayesian framework, but they don’t always emphasize how useful it’s been in so many domains. McGrayne does a great job conveying that utility. 

Sebastian Thrun, Wolfram Burgard, and Dieter Fox, Probabilistic Robotics. The book that opened my eyes to how much one can accomplish, in practical terms, through the Bayesian framework. It’s a textbook introduction to autonomous robotic navigation using Bayesian computation. There have been many advances in the field since this book came out. But the book still provides an exceptionally clear introduction to several algorithms (such as the Kalman filter, the extended Kalman filter, and the particle filter) that are fundamental to how the Bayesian framework gets used in machine learning and robotics and also in many areas of Bayesian cognitive science. 

Gottlob Frege, The Foundations of Arithmetic. One of the most profound works of philosophy ever written, and certainly among the most enjoyable. And the J. L. Austin translation has sparkling prose. There’s a reason so many people come to analytic philosophy through Frege’s pioneering monograph. Although the specific topics discussed in the book aren’t directly relevant to my research, the book is a continual source of inspiration --- I teach parts of it almost every year. 

Jerry Fodor, The Language of Thought. Still well worth reading for its brilliant ideas, well-constructed arguments, and innovative methodology. This is where Fodor introduces the hypothesis that mental activity is Turing computation over a language of thought. I will admit that the book is somewhat dated by subsequent scientific and philosophical developments (including Fodor’s own later work). Still, it’s one of the most important 20th century books in philosophy of mind. 

James Woodward, Making Things Happen. Develops an interventionist theory of causation and explanation. Beautifully demonstrates that general philosophy of science in the Hempelian tradition is alive and well, continuing to make impressive progress and casting light upon scientific practice. One of the best philosophy books of the last 25 years.

ABOUT THE INTERVIEWER

Richard Marshall is biding his time.
