The AI Knowledge Problem: Leibniz, Wolfram, and LLMs

Gottfried Wilhelm Leibniz (1646-1716), who famously invented calculus at the same time as Isaac Newton (Newton got the credit; we got the Leibniz notation, thankfully), also believed that if we could discover the correct way of representing our world in symbols, we could calculate the world using logic and mathematics. To this end, he also developed the modern binary number system and made significant advances in logic.

Stephen Wolfram is on a similar mission to make the world computable. For about 40 years, he has been working diligently on this project, starting with Mathematica, then expanding it several years ago into the Wolfram Language, and more recently into the Wolfram Physics Project. Along the way, Wolfram has expanded on the ideas of von Neumann, Turing, and Gödel by describing the principle of computational equivalence, and by carefully differentiating between the traditional, formula-based methods of math and physics, and computational or algorithmic methods.

Large Language Models (LLMs) like GPT-4 (the model behind ChatGPT) produce text that sounds like their training data: given a prompt, they predict what text most likely comes next. They neither encode knowledge of how the world works for possible later computation (Leibniz), nor compute answers to questions about the world by feeding knowledge into a compute engine (Wolfram). They simply predict what they think a human would write next, based on roughly 1,500-3,000 words of written prompt.
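To make "predict what text most likely comes next" concrete, here is a deliberately tiny Python sketch: a bigram model that counts which word followed which in its "training data" and emits the most frequent continuation. Real LLMs use transformer networks over tokens rather than word counts, so this illustrates only the prediction task, not how GPT-4 works internally; the toy corpus is made up.

```python
from collections import Counter, defaultdict

# A made-up toy corpus standing in for "training data".
corpus = "the cat sat on the mat and the cat ate the fish".split()

# Count which word follows which (a bigram model).
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def predict_next(word):
    """Return the continuation seen most often in training, or None."""
    candidates = following.get(word)
    return candidates.most_common(1)[0][0] if candidates else None

print(predict_next("the"))  # 'cat' -- the most frequent continuation in the corpus
print(predict_next("mat"))  # 'and' -- the only continuation ever seen
```

The point is not the mechanism but the objective: the model reproduces whatever continuation its training data makes statistically likely, which is exactly why it regresses to the mean of that data.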

What do these have to do with the AI Knowledge Problem?

They represent three fundamentally different ways of dealing with it.

Leibniz, based on the belief of his time that the world was inherently knowable, started with quantifying knowledge about the world. He believed that if we could solve the problem of describing the world in a consistent, logical, and rigorous manner, we could then perform computation based on that knowledge to make new discoveries. Alas, Leibniz was burdened with living in a time before computing machines were available for him to fully experiment with his ideas.

Wolfram, who is burdened with the additional knowledge of the past 300 years, most specifically that of Gödel, who showed that any sufficiently powerful formal system contains truths it cannot prove and therefore that no such system is completely knowable, chose to start with computation. He seems to believe fundamentally that if we build a good enough compute system, then we can feed it knowledge and get good answers. He has had some success with this approach. Most of us use his technology, even if we don’t know it. Wolfram Alpha, his “search engine” built on the Wolfram Language, powers big chunks of Siri and Bing.

LLMs basically ignore the knowledge problem altogether. The most generous thing that can be said about LLMs and knowledge is that they regress to the mean of their training data: if the training data overwhelmingly supports one hypothesis, the LLM will likely reproduce that hypothesis along with some supporting statements. Of course, it can be prompted to argue against a prevailing hypothesis, unless the OpenAI safety engineers have decided that doing so causes a problem.

So we have a few choices about how to approach the AI Knowledge Problem.

Option 1 – The Leibniz Paradigm. We can start with knowledge, properly quantified and qualified, and add compute and a friendly interface. The challenges here are several, not the least of which is that fundamental limits on knowability require us to qualify our knowledge with a measure of uncertainty. Luckily, we have developed the mathematics necessary to do this: probability theory and Bayes' rule, sketched below. There are societal barriers as well: those who profit from certainty do not welcome uncertainty, and they certainly do not gladly accept falsification. But to advance our knowledge, we need both – to acknowledge uncertainty and to welcome falsification.
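As a minimal sketch of that mathematics, assuming nothing beyond textbook Bayes' rule, here is an illustrative Python example: a claim carries a probability rather than a verdict of true or false, and each observation, whether confirming or falsifying, updates that probability. All of the numbers are invented for illustration.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Return P(hypothesis | evidence) via Bayes' rule."""
    numerator = p_evidence_if_true * prior
    denominator = numerator + p_evidence_if_false * (1.0 - prior)
    return numerator / denominator

# Invented numbers: a hypothesis we initially consider 70% likely.
belief = 0.70

# A confirming observation (more likely if the hypothesis is true).
belief = bayes_update(belief, p_evidence_if_true=0.8, p_evidence_if_false=0.3)

# A falsifying observation (unlikely if the hypothesis is true).
belief = bayes_update(belief, p_evidence_if_true=0.1, p_evidence_if_false=0.6)

print(f"belief after both observations: {belief:.2f}")  # about 0.51
```

Nothing here ends in certainty; the claim only ever becomes more or less probable, which is exactly the posture Option 1 asks of us.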

Option 2 – The Wolfram Paradigm. We can start with computation and add knowledge and a friendly interface. This has the advantage of appearing useful more quickly, because it provides answers earlier in the process of making the world computable. But we have a GIGO (Garbage In, Garbage Out) problem when we scrape the internet for the knowledge to feed our compute engine. Without a rigorous understanding of the quality and certainty of the information going into the compute engine, we cannot assess the quality of the outputs. The risk here is that we attribute rigor to the results because of the computation, forgetting that the computation is only as rigorous as the data it is given.
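As a hedged illustration of the GIGO point (the formula, the distributions, and the error bars below are all invented): even a perfectly exact computation inherits the uncertainty of whatever it is fed. A quick Monte Carlo sketch in Python makes that visible.

```python
import random

random.seed(0)

def exact_model(a, b):
    """A perfectly rigorous computation... of whatever it is fed."""
    return a * b + 10.0

def sample_scraped_inputs():
    """Two 'facts' scraped from the internet, each known only within wide error bars."""
    a = random.gauss(2.0, 0.5)   # nominally 2.0, but noisy
    b = random.gauss(5.0, 1.5)   # nominally 5.0, noisier still
    return a, b

# Monte Carlo: push the input uncertainty through the exact computation.
results = [exact_model(*sample_scraped_inputs()) for _ in range(100_000)]
mean = sum(results) / len(results)
spread = (sum((r - mean) ** 2 for r in results) / len(results)) ** 0.5

print(f"output: {mean:.1f} +/- {spread:.1f}")  # the garbage in reappears as spread in the output
```

The arithmetic in exact_model is flawless, yet the answer is only as trustworthy as the scraped inputs; rigor downstream cannot manufacture rigor upstream.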

Option 3 – The ChatGPT Paradigm. We can ignore the knowledge problem and use AI and human intervention to ensure that the outputs of our tools conform to some vaguely defined standard of acceptability. There are many problems here. First, what is the standard of acceptability? If we are building tools to solve problems, as OpenAI claims in its advertising, then we risk overlooking solutions to difficult problems because of an arbitrary standard of acceptability, which may very well rest upon the belief that “we already know the right answer.” Second, if OpenAI continues to refuse to make its tool Open, then users will not be able to assess the rigor of the output at all. Third, and perhaps most fundamentally, LLMs regress to the mean of the training dataset. As LLMs get more capable, the training data is becoming something like “a weighted representation of the entire internet.” The predictable result of LLMs becoming authoritative sources is that we continue to divide into our increasingly polarized camps and throw experts at each other. Even if we escape the polarization, we are still regressing to the mean of what we already think we know, which is a terrible way of inventing new solutions to difficult problems.

Which Option do you think holds the most promise for helping us solve our most difficult problems?