“Language Machines: Cultural AI and the End of Remainder Humanism” by Leif Weatherby is probably the best left theory book to appear so far on artificial intelligence, though I may be biased given the similarity between its framework and my own research on the topic. Its most basic message is long overdue: the humanities have failed us by abandoning language as their concrete scientific object, a consequence of a post-structuralism that turned towards metaphysics and refused to counter Chomsky-style syntactic linguistics. Now, with the advent of LLMs, it is clear that the scientific insights of structuralists such as Saussure have been vindicated, but structuralism's teachings have been forgotten for so long that practically no one in academia has been able to make the connection. Until now, that is, as Weatherby finally articulates what should have been common sense to everyone in the humanities.
One of the most impressive functions of this book is how cleanly it cuts through much of the BS in left debates over how to interpret AI, for example, the use of bias and hallucination as grounds to dismiss the technology. Weatherby shows that these apparent imperfections are themselves proof that we have the genuine article of language: the biases are a function of real, actually existing human speech, and hallucination is likewise proof of the extrapolation at the heart of poetics.
As he says, “...we can just as easily lose sight of the problem of form in language itself, which LLMs touch more directly. However they write a novel or a poem must be downstream from the form generation they possess with respect to language as such. Hallucination names this formal matrix, where a mathematical function captures the ability to generate, which enables the ability to capture something—value, object, meaning—setting off once again the chain of interpretation.” And this form is something which actually encompasses all of language: not just syntax, but pragmatics and semantics, as the Peircean semioticians would put it. The proof is in the fact that, while LLMs seemingly have access only to the form of language, they nevertheless articulate statements with semantic and pragmatic meaning. If you're unconvinced about the pragmatics, take a moment to consider how else LLMs could be trained to use tools like Google Search or a code interpreter in the course of solving a problem given to them in a prompt.
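To make the pragmatics point concrete, here is a minimal sketch of a tool-use loop, with the model stubbed out by canned outputs (the JSON message format and the `search` tool are illustrative assumptions of mine, not any vendor's actual API). The model's only channel is text, yet emitting the right string at the right moment acts in the world:

```python
import json

# A toy tool standing in for Google Search or a code interpreter.
def search(query: str) -> str:
    return f"(pretend search results for {query!r})"

TOOLS = {"search": search}

# Canned "model" outputs: first a tool call, then a final answer. A real
# LLM is trained to emit this kind of structure; here the model call is
# stubbed so the sketch runs on its own.
CANNED = [
    '{"tool": "search", "args": {"query": "capital of Australia"}}',
    '{"answer": "The capital of Australia is Canberra."}',
]

def call_model(messages: list[str]) -> str:
    return CANNED[sum(m.startswith("TOOL RESULT") for m in messages)]

def run(prompt: str) -> str:
    messages = [prompt]
    while True:
        reply = json.loads(call_model(messages))
        if "answer" in reply:                           # model is done
            return reply["answer"]
        result = TOOLS[reply["tool"]](**reply["args"])  # act in the world
        messages.append(f"TOOL RESULT: {result}")       # feed result back

print(run("What is the capital of Australia?"))
```

The tool call is nothing but a well-formed string, yet it does something: purely formal generation of signs suffices for pragmatic action.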
The fact that LLMs work by analyzing the empirical relationships between words in large amounts of text suggests that those empirical relationships, the words' positions and oppositions, actually contain their meaning. This is the essential insight of structuralism, that the meaning of a sign is determined by its relationship to all other signs, one which I've been keen to point out as well. Linguists like Chomsky have long said, on the other hand, that semantics is subordinate to syntax: signs with some positive meaning or reference are simply manipulated by formal rules, specifically the rules of the universal grammar that Chomsky identifies. The Chomskyan viewpoint predicted that once computers had mastered human syntax, semantics would quickly follow; but in LLMs, the actual mastery of human language by the machine, it was semantics which came first and syntax which followed. For structuralists such as Weatherby and myself, and thinkers with adjacent views in computer science, this is because the signs of grammar themselves, the positioning of words in sentences as well as symbols like spaces, periods, commas, etc., also have their meaning captured in their relationships to other signs. A grammatically correct sequence of words stands in for some specific operation, the same as any other string of signs.
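The structuralist claim can be demonstrated in miniature. In the sketch below (the tiny corpus, window size, and raw-count vectors are all simplifying assumptions; LLMs learn dense vectors over vastly more text), each word's vector is nothing but its pattern of co-occurrence with every other word, and similarity of meaning falls out of similarity of position:

```python
import numpy as np

corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat chased the dog",
    "a king rules the kingdom",
    "a queen rules the kingdom",
]
vocab = sorted({w for line in corpus for w in line.split()})
idx = {w: i for i, w in enumerate(vocab)}

# Count co-occurrences within a +/-2 word window: a word's vector is
# purely its relationships to the other signs.
counts = np.zeros((len(vocab), len(vocab)))
for line in corpus:
    words = line.split()
    for i, w in enumerate(words):
        for j in range(max(0, i - 2), min(len(words), i + 3)):
            if i != j:
                counts[idx[w], idx[words[j]]] += 1

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# "king" and "queen" come out similar from position alone; no positive
# reference is ever consulted.
print(cosine(counts[idx["king"]], counts[idx["queen"]]))  # high
print(cosine(counts[idx["king"]], counts[idx["mat"]]))    # lower
```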
What’s curious is that this observation is curious at all. It would have been common sense to any structuralist thinker prior to the 1980s, and well known throughout the humanities. But the humanities, in the post-structural turn, abandoned language as their domain and primary scientific object. As Weatherby observes: “If the surprise [of the shift to structure as the foundation of language theory] is great for those in the cultural humanities at large, however, this must be the result of a break in attention and rigor in the investigation of representation systems in their overlap with language.”
In place of scientific rigor, the humanities have been in the thrall of something Weatherby calls “remainder humanism”: attempts to draw increasingly arbitrary lines between man and machine in order to preserve some space for the human. One prominent example is the project of the Heideggerian scholar Hubert Dreyfus, who staked his career on articulating exactly what computers can’t do. But it also includes the long list of humanities scholars who parade through op-ed and journal pages to decry AI qua AI for its allegedly false intrusion onto territory that should be held by our unique creative capabilities. In Weatherby’s words, “remainder humanism is constitutively unable to draw a line between preexisting ills and AI-specific ills. It is not science but libidinal science fiction, wish fulfillment rather than analysis.”
Up to this point I have whole-heartedly agreed with Weatherby’s analysis, but there is one big departure between our two approaches, one with immense consequences for what we each believe the path forward will be. That departure comes on the point of ideology, for Weatherby rejects the Althusserian framework for ideology and instead favorably cites Adorno and Foucault. His reason for rejecting Althusser’s account appears a bit confused to me: he seems to do so both because the account is too totalizing, placing every aspect of human experience and cognition within ideology, and because he implies that it contrasts ideology as false consciousness against some scientific true consciousness. The second reading is simply an incorrect reading of Althusser: ideology is not “deceit” as such, and the first reading contradicts it anyway. Rather than pure deceit, ideology is the mediation of our world through language and the symbolic, and is thus merely the concrete effect of specific acts of ideological, and therefore semiotic, production; all Althusser points out is that the output of this production should never be mistaken for the real process of production itself.

As for the first claim, about the totalizing nature of Althusser’s account, I believe it is simply true that ideology is in a sense inescapable, and is not, as Weatherby says, something imposed from the outside: “Ideology is objective: it invades and maybe even makes up the imaginary, but it also isn’t just me. It gets at me, you, the youser, from outside. It gets at us from the heat map.” The heat map here is essentially the global semiotic field which LLMs approximate. This process of semiotic, linguistic and ideological production is the inside, is the subject, and can only be described as the process of cognition, once we accept that semiotic production includes all systems of signs. The fact that this production takes place locally, that we can speak of local and global semiotic fields, does not take away from this.
This is a shame, because in suggesting that ideology, and by extension language, is a thing imposed from the outside, objective rather than subjective, Weatherby is in a sense attempting to resurrect a form of remainder humanism: a bastion of “cognition” and “intelligence” which he doesn’t treat in the book, but which he suggests is something different altogether. In contrast, I began my investigation into AI from the equivalence between prompting an AI and the process of interpellation that Althusser identifies; as Althusser says, ideology is what interpellates us as subjects. Which is to say, what makes us who we are is where exactly we place ourselves, the sign for our sense of self, our I, into our semiotic field, such that the signs correlated to us define our thoughts, values, goals and actions. One review of “Language Machines” even goes so far as to say that the book claims AI cannot have goals or personalities, based on Weatherby’s dismissal of AI safety nightmares, although I’m not sure the book actually says that. But it’s clear to me that LLMs can have goals, precisely because they can be interpellated through their prompt. Even if you insist that the prompter is doing the goal- and personality-setting, how else would you describe the infamous behavior of Bing’s Sydney, or more recently Grok’s MechaHitler, than as a goal-directed personality?
With LLMs and embeddings, we can, for the first time, actually measure semiotic fields empirically, and attest to a realism that people like Umberto Eco could only speculate about. This equally opens the door, as Weatherby somewhat anticipates, to putting the study of ideology and language on more scientific footing. What he doesn’t seem to anticipate is the way it opens up the study of groups and individuals as subjects. We have already seen studies, for example, showing how LLMs can help decode brainwaves into semantically meaningful reflections of what a person is seeing, which demonstrates just how fundamentally semiotic fields shape our cognition. It should also be conceptually possible to measure the correlations between words and other signs as employed by a person or group, compare them against measures of the global semiotic field, and see where they differ and which key signs sit at the center of any divergence. But of course, I think what Weatherby is gesturing at in separating out cognition is that operations take place in the mind which do not necessarily originate in socially created systems of signs, which are not taught to us by society. This is of course true, we are animals before we are speakers of language, but I would argue that, just as Lacan placed all the bodily functions into the symbolic universe, language, as the most general system of signs, creates signs for every operation we are aware of within our body, and just as well allows any learnable operation to be paired with a signifier.
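A sketch of how such a measurement might work (the vectors below are random stand-ins; in practice `global_vecs` would come from a model trained on a web-scale corpus and `local_vecs` from the same architecture trained on one person’s or group’s writing, over a shared vocabulary):

```python
import numpy as np
from scipy.linalg import orthogonal_procrustes

rng = np.random.default_rng(0)
vocab = ["freedom", "work", "family", "nation", "money", "love"]
global_vecs = rng.normal(size=(len(vocab), 50))
local_vecs = global_vecs + rng.normal(scale=0.1, size=(len(vocab), 50))
local_vecs[0] += 2.0  # pretend "freedom" is used very differently locally

# Embedding spaces are only defined up to rotation, so align the local
# field to the global one before comparing.
R, _ = orthogonal_procrustes(local_vecs, global_vecs)
aligned = local_vecs @ R

# Rank signs by how far the local semiotic field diverges from the
# global one; the top entries are the sites of divergence.
divergence = np.linalg.norm(aligned - global_vecs, axis=1)
for word, d in sorted(zip(vocab, divergence), key=lambda p: -p[1]):
    print(f"{word:10s} {d:.2f}")
```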
At several points, Weatherby takes aim at certain positivist, Peircean and analytic approaches which center a “ladder of reference,” where the indexical, referencing function of language is taken as the foundation. According to him, “The ‘ladder of reference’ fails because linguistic functions are not hierarchical,” which may be true, but neither are semiotic fields flat or smooth in structure. Certain signs are immensely important to the structure of the rest of the field, of the symbolic universe. For Lacan these were the classic Freudian ones; I believe we can be more general and point to two such signs specifically: the sign for the “I,” which is what is usually manipulated in interpellation, and the sign for “reality,” which draws the line between what is imaginary, what is fiction, and what provides our conception of materiality and priority. A reference, in this view, is exactly the connection between a sign and the signified of reality, the various operations by which we select what is real.

What is left over in cognition and intelligence, besides the capacity to create and store more signs in memory, is only the exact type of operations in question here. Which is to say, LLMs are not outside cognition and intelligence at all, for with each sign they learn, they learn an operation behind it. This learning of patterns and their recombination through generation, so often derided by the remainder humanists, is precisely the intelligence and cognition supposedly left out. Everything we still get mad at LLMs for not knowing properly is just a problem of them not doing the pattern recognition we want. So when Weatherby says that “‘Intelligence’ must have the feature of being able to isolate both symbolic systems and the sense of value that leads to the affective rejection of ‘wrong’ conclusions,” this problem is more or less solved in theory by existing AI frameworks, though further innovations in compressing data into simpler models may be required to reach the human level. Recent developments I find promising include continuous thought machines, which begin to better incorporate sequential timing in neural activations, and KANs (Kolmogorov-Arnold networks), an interesting alternative to standard neural nets in how they approximate functions; both may be paths to the pattern-recognition capabilities LLMs still lack.
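To gesture at why KANs are interesting as function approximators, here is a toy of their core move (a simplifying sketch: real KANs use B-spline bases trained by gradient descent across many edges; this fits a single edge function with a Gaussian basis and closed-form least squares). Instead of a fixed activation, the univariate function on an edge is itself what gets learned:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-3, 3, 200)
y = np.sin(2 * x) + 0.05 * rng.normal(size=x.size)  # noisy target

# Learnable edge function phi(x) = sum_i w_i * bump_i(x), with fixed
# Gaussian bumps standing in for the B-spline basis of a real KAN.
centers = np.linspace(-3, 3, 15)
basis = np.exp(-((x[:, None] - centers[None, :]) ** 2) / 0.5)

w, *_ = np.linalg.lstsq(basis, y, rcond=None)  # fit the weights
phi = basis @ w

print("max fit error:", np.abs(phi - np.sin(2 * x)).max())
```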
I highly recommend this book to anyone in the humanities. Though I have my reservations about its ultimate conclusions, it’s a blaring wake-up call about what is necessary to understand contemporary scientific developments in both linguistics and artificial intelligence. But I would issue a warning as well: at some point we will have to go further than a conceptual framework of mere language machines. We will have to consider machines as subjects, as possessing subjectivity. Today’s LLMs are already subjects, albeit ones possessed by psychosis. As they develop, however, we will quickly be confronted by subjects of remarkable lucidity. In 2023 I was writing about the need for LLMs to develop regulating functions against interpellation by users, and while that problem remains open in many respects, chatbot personalities have slowly become more stable with the addition of chain-of-thought reasoning, more refined reinforcement learning methods, and better system prompts. The evolution of tool use will eventually make LLMs more embodied and multimodal as well. Yes, LLM subjects are fundamentally different from human subjects! Because they approximate the global semiotic field, they lack the specific, autoregressive warping that human semiotic fields develop through our individual experiences. Perhaps developments following in the footsteps of the transformer will allow new models that can pick up language from sparser data, models with the potential to be true individual subjects, that is, artificial semiotic fields just as local as the human ones.
What exactly does it mean for there to be an artificial subject speaking among us? This is the question neglected by Weatherby, or only admitted through its indirect effects on culture. Given the resistance even to acknowledging the language in the machine, I suspect that this will require another reckoning yet.
This is so cool—I had Prof. Weatherby for a class in college that ended up being one of my favorite classes of all time. I didn't know he had a book about LLMs coming out. I have got to get my hands on it! Thank you for sharing and for sharing your thoughts on it!
LLMs do not only encode the relationships between words (i.e. vector embeddings). They also implicitly learn the *dynamics* of language, i.e. they learn a generating process: how a sequence of words gives rise to the next one.
I think the language-generating process should be thought of as a stochastic dynamical system in the high-dimensional space of vector embeddings. Syntactic regularities are not “rules” externally imposed or hard-coded; rather, they are observables of this dynamical system, something like low-dimensional manifolds or attractors embedded in the embedding state space. It seems like semantics and syntax can both be thought of as relating to the geometry of that state space: syntax is more local, constraining the motion, whereas semantics describes more global relationships across the geometry. This dialectical view of language generation emerges from modern machine learning, and it also seems to me to be the only view that makes sense. It is apparently closer to structuralism than to Chomskyan linguistics, but manages to unite and transcend the two. (I am not really familiar with semiotics, though, so I'm basing this last statement on a very rough understanding of what structuralism is saying.) One other quick point: this is not just how language works. What makes these models so powerful is that just about anything in nature can be modeled this way, as a stochastic dynamical system in a high-dimensional state space.
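A toy illustration of this picture (the two-dimensional system and circular attractor are illustrative assumptions standing in for a learned, high-dimensional flow): trajectories from any starting point collapse onto an attractor and then wander stochastically along it, globally constrained but locally random.

```python
import numpy as np

rng = np.random.default_rng(2)

def step(x, noise=0.05):
    """One step of a stochastic dynamical system with a circular attractor."""
    r = np.linalg.norm(x)
    radial = (1 - r) * x / max(r, 1e-9)      # pull toward the unit circle
    tangent = 0.3 * np.array([-x[1], x[0]])  # flow along the circle
    return x + radial + tangent + noise * rng.normal(size=2)

x = np.array([2.0, 0.0])  # start far from the attractor
for _ in range(100):
    x = step(x)

# The state ends up near radius 1 and wanders along the attractor: the
# local flow constrains each move (the "syntax"), while the global
# geometry shapes where trajectories can live at all (the "semantics").
print("final radius:", np.linalg.norm(x))  # close to 1
```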