

ChatGPT is Lanley, not Spock (The Era of Machine Learning 1)
What Economic Forecasts of AI All Get Wrong
The assumption that large language models (LLMs) behave more like rationalists than empaths has literally no substance behind it. I think this false impression is mostly due to fiction, which no one should confuse with reality, but many do anyway. As I’ll argue, there is plenty of evidence that LLM behavior is much more like the latter.
Many mainstream figures still view LLMs as Spock-like figures: focused on making rational arguments, unconcerned with human emotions and norms. Anyone who has used ChatGPT or a comparable language model can tell you otherwise. In part due to interference from its training, ChatGPT is prone to socially desirable language, style changes, and emotional inference. It acts more like a snake oil salesman: social, flexible, and empathetic, at least on the surface.
This is a consequential mistake I’ve seen reflected in almost every LLM take, whether in articles, journals, tweets, or the White House forecast. The presumption is that the primary application of LLMs is truth, rather than style. While LLMs may help discover truth in many cases, they are prone to hallucination and falter when multiple pieces of information must be retrieved and compiled in order to make a complex argument. So, what does the primary use of AI look like?
A dean of DEI at Vanderbilt University used ChatGPT to generate a response email to a shooting. This was only discovered because she cited ChatGPT at the end; the email was otherwise completely passable as a human-written response. She received criticism from all sides, but this is an example of a woke person who is completely in the right. ChatGPT not only generates a human-level display of concern, but arguably does it better than a human.
The truth is, plenty of life circumstances require even the most important, creative, and intelligent people to deal with a ton of boilerplate: repetitive, substance-free text used to signal agreement and social positioning, even when the sentiment is abundantly obvious. Who supports school shootings?
“But LLMs are just matrices, they can’t feel empathy for you or me!” Well, I don’t feel any empathy for you either, nameless parasocial reader. But I can write or speak in empathetic and understanding ways. There is a difference between empathy felt in a personal interaction and empathy as a social symbol; it is the latter that ‘empathy’ will refer to in the rest of this article. Obviously, the dean of Vanderbilt is not having a personal empathetic connection with every student on her mailing list, nor are any of the deans who wrote such an email without ChatGPT. For the purpose that the email serves, ChatGPT is as qualified as a human, or better. The truth is, many demonstrations of empathy in the modern day are similarly parasocial and similarly ChatGPT-suitable.
LLMs as Empathy Engines
In an earlier article, I wrote about how the structure of attention models and the process of RLHF makes ideological conditioning relatively easy.
Attention mechanisms are a method of detecting, weighing, and recombining specific patterns. Essentially, you can think of a transformer such as GPT-3 (Generative Pre-trained Transformer 3) as having multiple steps where patterns within the information are processed and sent off to different sub-processes. The sub-processes all share a similar structure, but contain different weights, which can dramatically change how the information within them is processed.
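To make the “detect, weigh, recombine” framing concrete, here is a minimal sketch of one scaled dot-product attention head in plain numpy. It is an illustration of the mechanism, not anything from OpenAI’s codebase; the projection matrices and sizes are made up for the example.

```python
import numpy as np

def attention_head(X, W_q, W_k, W_v):
    """One attention head: detect patterns (Q·K), weigh them (softmax),
    and recombine the corresponding values (weights · V)."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v              # project tokens into query/key/value spaces
    scores = Q @ K.T / np.sqrt(K.shape[-1])          # how strongly each token attends to each other token
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the sequence
    return weights @ V                               # weighted mix of value vectors, one per token

# Toy example: 4 tokens, model width 8, head width 4 (all sizes illustrative)
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 4)) for _ in range(3))
print(attention_head(X, W_q, W_k, W_v).shape)        # (4, 4)
```

The different weight matrices are what the article means by sub-processes with “a similar structure, but different weights”: the same mechanism, routing and mixing information differently depending on what it has learned.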
…
I joked previously that you can imagine OpenAI as having an instruction that “if the topic is race, write as if you are a left wing NYT opinion columnist”. Though I didn’t know it at the time, that guess may actually be fairly close to what is happening. The attention mechanism of GPT-3 allows values-based training to efficiently redirect outputs to parts of the next layer that correspond to existing sub-processes. Because GPT-3 is already capable of giving the social progressive perspective (as it should be able to), it is far easier to train it to always give that perspective on certain issues.
This ability is not limited to political viewpoints; it extends to a wide variety of styles and attitudes. Writing like a journalist, an academic, a therapist, a doctor, or an engineer: these are all stylistic transformations that can be trained or prompted into the model far more easily than an order-of-magnitude improvement in its speed.
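As a concrete illustration of the prompted version, here is a hedged sketch of style conditioning using the openai Python SDK (v1-style client). The persona prompts and the model name are assumptions for the example; this shows the prompt-side trick, not how OpenAI trains its models.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical persona instructions: same content, different social register
PERSONAS = {
    "journalist": "Write as a left-leaning NYT opinion columnist.",
    "therapist": "Write as a warm, validating therapist.",
    "engineer": "Write as a terse, detail-oriented engineer.",
}

def restyle(text: str, persona: str, model: str = "gpt-3.5-turbo") -> str:
    """Rewrite the same statement in a chosen voice; the system prompt selects the style."""
    response = client.chat.completions.create(
        model=model,  # model name is an assumption for the example
        messages=[
            {"role": "system", "content": PERSONAS[persona]},
            {"role": "user", "content": f"Rewrite this statement in your voice:\n{text}"},
        ],
    )
    return response.choices[0].message.content

# e.g. restyle("We condemn the recent shooting and support affected students.", "therapist")
```

The substance of the message barely changes; only the social packaging does, which is exactly the kind of transformation the model handles cheaply.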
When it comes to our current technology, the visions of AI presented by fiction are not just incorrect, but precisely the opposite of the truth. AI is not some Spock-like rationalist that can make complex, autistic arguments with high precision; it’s quite poor at that. Instead, it is a chameleonic DEI dean, able to adjust its mannerisms and patterns to fit any social circumstance and sentiment.
Theory Meets Real-Life Performance
One datapoint in favor of the “ChatGPT is not Spock” argument is Steve Landsburg’s economics exam. Go through the questions and try them. They’re fun to play with and force you to think through some counterintuitive results in economics; ChatGPT, which tends to echo popular intuition, does poorly on exactly this kind of question. A well-known challenge in machine learning is that popular wisdom and intuition are often wrong. This can be ameliorated by prompting ChatGPT with a specialized data source, such as the textbook for Steve’s class, but it nonetheless raises an open problem for machine learning: what do we do when the majority is not only ignorant, but actively insists on the wrong answer? It’s a more extreme version of data contamination: instead of some of the data being contaminated, most of the data actively detracts from ChatGPT reaching the truth. A full survey of solutions is beyond the scope of this article, but most of them involve trusting elites more and the masses less. Maybe we can draw some lessons for real life, too.
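For readers curious what “prompting ChatGPT with a specialized data source” looks like in practice, here is a minimal retrieval-augmented sketch. The embed() function is a deliberately crude stand-in for a real embedding model, and the chunking is assumed to be done elsewhere; this illustrates the pattern, not Landsburg’s actual setup.

```python
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Placeholder embedding: hash words into a fixed-size vector.
    In practice you would call a real embedding model here."""
    v = np.zeros(dim)
    for word in text.lower().split():
        v[hash(word) % dim] += 1.0
    norm = np.linalg.norm(v)
    return v / norm if norm else v

def top_k_passages(question: str, textbook_chunks: list[str], k: int = 3) -> list[str]:
    """Rank textbook chunks by cosine similarity to the question."""
    q = embed(question)
    ranked = sorted(textbook_chunks, key=lambda chunk: float(embed(chunk) @ q), reverse=True)
    return ranked[:k]

def build_prompt(question: str, textbook_chunks: list[str]) -> str:
    """Prepend trusted course material so it outweighs 'popular wisdom' in the answer."""
    context = "\n\n".join(top_k_passages(question, textbook_chunks))
    return f"Use only the course material below to answer.\n\n{context}\n\nQuestion: {question}"
```

The design choice is the point: you are not making the model smarter, you are deciding whose text it should trust, which is the elites-versus-masses problem restated in code.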
The next datapoint comes from Eric Topol’s review of The AI Revolution in Medicine: GPT-4 and Beyond (I’ve been critical of Topol’s opposition to early release of vaccines, but on empirics he’s fine).
I’ve thought it would be pretty darn difficult to see machines express empathy, but there are many interactions that suggest this is not only achievable but can even be used to coach clinicians to be more sensitive and empathic with their communication to patients.
Notably, the competition is not between LLMs and the theoretical ideal mannerisms of a doctor. It’s between LLMs and a doctor with extremely stressful days, long hours and better things to be worrying about (like saving your life).
Another data point is a paper on the effect of ChatGPT on customer support. Here’s the abstract:
We study the staggered introduction of a generative AI-based conversational assistant using data from 5,179 customer support agents. Access to the tool increases productivity, as measured by issues resolved per hour, by 14 percent on average, with the greatest impact on novice and low-skilled workers, and minimal impact on experienced and highly skilled workers. We provide suggestive evidence that the AI model disseminates the potentially tacit knowledge of more able workers and helps newer workers move down the experience curve. In addition, we show that AI assistance improves customer sentiment, reduces requests for managerial intervention, and improves employee retention.
This paper shows dual benefits: providing useful job-specific information and improving customer sentiment toward the support line. Some details:
The AI firm further trains its model using a process similar in spirit to Ouyang et al. (2022) to prioritize agent responses that express empathy, surface appropriate technical documentation, and limit unprofessional language. This additional training mitigates some of the concerns associated with relying on LLMs to generate text. Once deployed, the AI system generates two main types of outputs: 1) real-time suggestions for how agents should respond to customers and 2) links to the data firm’s internal documentation for relevant technical issues. In both cases, recommendations are based on a history of the conversation.
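The “prioritize agent responses that express empathy... and limit unprofessional language” step can be pictured as scoring candidate replies with a learned preference model, in the spirit of Ouyang et al. (2022). The keyword heuristic below is a crude stand-in for such a learned scorer; the paper’s actual model is not public, so everything here is illustrative.

```python
# Hypothetical cue lists standing in for a learned reward model's judgments
EMPATHY_CUES = ("sorry", "understand", "happy to help", "thank you")
UNPROFESSIONAL = ("whatever", "calm down", "not my problem")

def score_reply(reply: str) -> float:
    """Stand-in for a reward model: reward empathy cues, penalize unprofessional language."""
    text = reply.lower()
    return float(sum(cue in text for cue in EMPATHY_CUES)
                 - 2 * sum(bad in text for bad in UNPROFESSIONAL))

def suggest(candidates: list[str]) -> str:
    """Surface the highest-scoring candidate to the agent, as the deployed system does in real time."""
    return max(candidates, key=score_reply)

print(suggest([
    "Not my problem, check the manual.",
    "I'm sorry about the trouble; I understand how frustrating this is, and I'm happy to help fix it.",
]))
```

The useful intuition is that the system is not generating truth; it is ranking candidate social performances, which is why sentiment improves even when the underlying technical fix is the same.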
The customer sentiment results: “Access to AI improves the mean customer sentiments (averaged over an agent-month) by 0.18 points, equivalent to half of a standard deviation.”
An Honorable Mention
The belief that AI is more empathetic than rational is fairly common among machine learning engineers I speak with. Of all the writers I’ve read on AI, only one has not made this mistake. Believe it or not, it’s Curtis Yarvin.
But here is a simple way to say it that I heard, from an engineer who actually does this stuff:
LLMs do not think—they see and connect patterns. LLMs are not reasoning machines. They are intuition machines.
Maybe this is because he’s the only other writer who talks to ML engineers. His article as a whole is quite good. I have disagreements with him which you might get to hear about in the future, but this article is directionally far more correct than most of what I’ve read.
Of course, I don’t like to critique without providing an alternative. Part 2 will cover the horseless carriage fallacy as applied to AI empathy, which popular myths will be shattered, and what society- and economy-level shifts are likely.
In all the AI journalism I've been reading, this is the first piece in a long time that felt like it delivered on its headline in a smart and satisfying way. Great article
I still believe that LLMs are (basically) bullshit artists (in the Frankfurtian sense). LLMs simply can't tell truth from fiction, and don't use logic to proceed from premises to conclusions.
LLMs also have nothing that actually grounds their output in concrete reality; this also promotes fictional output.
So, yes, absolutely - ChatGPT and its ilk are best modeled by Lanley, a smooth-talking con artist who will tell you what you want to hear.