Authors:
(1) Raphaël Millière, Department of Philosophy, Macquarie University ([email protected]);
(2) Cameron Buckner, Department of Philosophy, University of Houston ([email protected]).
Table of Links
Abstract and 1 Introduction
2. A primer on LLMs
2.1. Historical foundations
2.2. Transformer-based LLMs
3. Interface with classic philosophical issues
3.1. Compositionality
3.2. Nativism and language acquisition
3.3. Language understanding and grounding
3.4. World models
3.5. Transmission of cultural knowledge and linguistic scaffolding
4. Conclusion, Glossary, and References
3. Interface with classic philosophical issues
Artificial neural networks, including earlier NLP architectures, have long been the focus of philosophical inquiry, particularly among philosophers of mind, language, and science. Much of the philosophical discussion surrounding these systems concerns their suitability as models of human cognition. Specifically, the debate centers on whether they constitute better models of core human cognitive processes than their classical, symbolic, rule-based counterparts. Here, we review the key philosophical questions that have emerged regarding the role of artificial neural networks as models of intelligence, rationality, or cognition, focusing on their current incarnation in transformer-based LLMs and the ongoing discussions about their implications.
Recent debates have been clouded by a misleading inference pattern, which we term the “Redescription Fallacy.” This fallacy arises when critics argue that a system cannot model a particular cognitive capacity simply because its operations can be explained in less abstract and more deflationary terms. In the present context, the fallacy manifests in claims that LLMs could not possibly be good models of some cognitive capacity 𝜙 because their operations merely consist in a collection of statistical calculations, or linear algebra operations, or next-token predictions. Such arguments are only valid if accompanied by evidence demonstrating that a system, defined in these terms, is inherently incapable of implementing 𝜙. To illustrate, consider the flawed logic in asserting that a piano could not possibly produce harmony because it can be described as a collection of hammers striking strings, or (more pointedly) that brain activity could not possibly implement cognition because it can be described as a collection of neural firings. The critical question is not whether the operations of an LLM can be simplistically described in non-mental terms, but whether these operations, when appropriately organized, can implement the same processes or algorithms as the mind, when described at an appropriate level of computational abstraction.
The Redescription Fallacy is a symptom of a broader tendency to treat key philosophical questions about artificial neural networks as purely theoretical, leading to sweeping in-principle claims that are not amenable to empirical disconfirmation. Hypotheses about the capacities of artificial neural networks like LLMs, and about their suitability as cognitive models, should instead be guided by empirical evidence (see Table 1). Considerations about the architecture, learning objective, model size, and training data of LLMs are often insufficient on their own to arbitrate these issues. Indeed, our contention is that many of the core philosophical debates on the capacities of neural networks in general, and of LLMs in particular, hinge at least partly on empirical evidence concerning their internal mechanisms and the knowledge they acquire over the course of training. In other words, many of these debates cannot be settled a priori by considering general characteristics of untrained models; rather, we must take into account experimental findings about the behavior and inner workings of trained models.
In this section, we examine long-standing debates about the capacities of artificial neural networks that have been revived and transformed by the development of deep learning and the recent success of LLMs in particular. Behavioral evidence obtained from benchmarks and targeted experiments matters greatly to these debates. However, we note from the outset that such evidence alone is insufficient to paint the full picture: connecting to the concerns about Blockheads reviewed in the first section, we must also consider evidence about how LLMs process information internally in order to close the gap between claims about their performance and claims about their underlying competence. Sophisticated experimental methods have been developed to identify and intervene on the representations and computations acquired by trained LLMs. These methods hold great promise for arbitrating some of the philosophical issues reviewed here, beyond tentative hypotheses supported by behavioral evidence alone. We leave a more detailed discussion of these methods and the corresponding experimental findings to Part II.