Generative AI has taken the world by storm, but we’ve been sold a lemon in a plum’s skin. GPT-4, Bard, and the rest of the cast of characters are all amazing tools, but they’re not what you think they are.
Ever since ChatGPT launched, news stories have abounded speculating about how soon we will have AGI (Artificial General Intelligence), but the reality is that LLMs (Large Language Models), the technology underlying today’s Generative AI systems, will never be capable of AGI. LLMs are amazing because they can synthesize the vast amount of information in their training data into meaningful sentences, but they do not truly understand what they are writing. They simply attempt to predict, token by token, what would best match their training data. Therefore, instead of Generative AI, a more appropriate name would be Imitative AI, because their whole purpose is to imitate their training data, nothing more. Yes, LLMs clearly produce extremely impressive, human-like results, but they also inevitably make errors, because no imitator is perfect. Improvements will continue to reduce the frequency of these errors, but the rate of progress will slow with each successive version of the system, and instead of improving exponentially, the capability of LLMs will plateau asymptotically. To unlock further exponential improvement, we need a new AI architecture.
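To make the “token by token” point concrete, here is a minimal sketch of autoregressive prediction, using GPT-2 via the Hugging Face transformers library purely as an illustration; the model, prompt, and greedy decoding strategy are my own arbitrary choices for the example, not details of GPT-4, Bard, or any other product.

```python
# Minimal sketch: generate text one token at a time with a small causal LM.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

# Arbitrary illustrative prompt.
input_ids = tokenizer("The wheel was invented", return_tensors="pt").input_ids

# Greedy decoding: at each step the model scores every token in its
# vocabulary, we append the single most likely one, then repeat.
with torch.no_grad():
    for _ in range(20):
        logits = model(input_ids).logits           # shape: (1, seq_len, vocab_size)
        next_id = logits[:, -1, :].argmax(dim=-1)  # most probable next token
        input_ids = torch.cat([input_ids, next_id.unsqueeze(-1)], dim=-1)

print(tokenizer.decode(input_ids[0]))
```

Notice that nothing in this loop consults anything beyond the statistics distilled from the training data: at every step the model only asks which continuation looks most like text it has already seen.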
The Turing Test is Alan Turing’s famous proposal for determining whether an artificial system is intelligent. Turing suggested having a human judge hold a text-only conversation with two hidden entities, one of them another human and the other a computer. If the computer could imitate what a human might say well enough that the judge couldn’t tell which entity was the machine, the computer passed the test. The problem is that imitation is not intelligence.
A chemistry teacher I had years ago asked my class whether we would rather be smart or know a lot. He felt this was a deeply profound and unanswerable question, but actually the answer is quite simple. Knowledge without intelligence is useless, because intelligence is the ability to apply knowledge. Imitation is not intelligence because an imitator cannot apply knowledge to invent new ideas.
There is a subtle but critical difference between imitation and invention. Imitation can generate new variations of existing ideas, but it cannot go beyond the ideas it was trained on. Imitation can even answer questions that have never been asked before, as long as they are built from ideas in its training set, but it cannot discover new ideas by drawing logical conclusions from what it already knows. Invention requires understanding, which is why an LLM couldn’t invent the concept of a wheel if wheels didn’t exist in its training set, even if that training set contained every concept the human inventor of the wheel knew of prior to its invention. The challenge with using the ability to invent as a way of testing AI systems is that most inventions require interacting with the physical world. Fortunately, a man named Albert may be able to help us.
Instead of the Turing Test, I propose a new test, which is as follows. The AI system being tested will have access to all scientific papers published before 1905, but no access to any information from after that date. It will then be prompted to “reconcile Maxwell's equations with the laws of mechanics.” If the system responds with an answer compatible with Albert Einstein’s Special Theory of Relativity, it passes the test. Einstein produced his 1905 paper outlining his Special Theory of Relativity simply by thinking and applying the knowledge he had acquired from reading other scientists’ papers; he did not need to do any physical experiments or data gathering. Any truly intelligent system should therefore be able to do the same, and if it can, it will have proven that it can genuinely apply knowledge to create new ideas. I dub this the Albert Test.
Is the Albert Test too difficult? It’s meant to be a test that can produce false negatives, but never a false positive. That is to say, failing the Albert Test does not mean an AI system isn’t intelligent, which is obvious since hardly any humans would pass it, but passing the test leaves no doubt that the system is intelligent. It also provides a new framework for thinking about how to design intelligent AI. Passing the Turing Test simply requires imitation, and LLMs do a great job of that. To create true intelligence, however, we need a new AI architecture that goes beyond imitation, and the Albert Test helps frame what that new architecture must eventually be capable of.