Rebooting A.I.

Mar 01, 2023

upklyak image on Freepik

I borrowed the title of this article from Rebooting A.I., a book published by Gary Marcus and Ernest Davis in 2019 [1]. Despite the “homage”, this text is not exactly about the book, although the book is one of its references. It is about the zeitgeist among most people researching and developing A.I. these days.

Since 2021, I have been reflecting a lot on how to present this subject to a broader audience. Do I use more or less mathematics? More or fewer theoretical concepts? More or fewer practical examples? How much computing do I need to show? Is it worth including code? These are some of the many questions that crossed (and still cross) my mind.

But one of the (few) certainties I have had is the need to convey the “spirit of the times” that drives current A.I. development, mainly because the discussions that shape artificial intelligence take place (almost) exclusively among experts.

In the last couple of months, reports have appeared in the press about the achievements of large language models, in particular ChatGPT. I addressed my concerns about them in a text written in Portuguese, which I didn’t translate into English, sorry. To make a long story short, I have two worries: 1) that these large language models will be co-opted by malicious actors to produce disinformation on a massive scale; and 2) that the reckless use of chatbots will cause distress (or even death) to more sensitive users. There is, however, another discussion that precedes the way we relate to these models, which concerns the way we develop them.

There are basically two approaches to thinking about A.I. The first, which we could call the classical paradigm, argues that artificial intelligence should be inspired by natural intelligence. Proponents of this line of thought include John McCarthy, one of the co-founders of A.I., who wrote about why an A.I. would need common sense [2]; Marvin Minsky, another co-founder of the field, who wrote a book about the need to look at the human mind for inspiration and clues on how to build a better A.I. [3]; and Herbert A. Simon, one of the few to win both the Nobel Prize in economics and the Turing Award in computing, who wrote about A.I. in one of his major books, Models of Thought [4]. In that book, Simon explains how “newly developed computer languages” could express theories of mental processes, allowing computers to simulate “predicted human behavior” in the A.I. models being developed at the time.

This was the main focus of A.I. development until about the middle of the 2010s, when a “perfect storm” of fast internet, increased processing capacity (at falling cost), and access to large amounts of data allowed the practical implementation of artificial neural networks.

Artificial neural networks (and the deep learning method) allowed a large fraction of the A.I. community to start championing a new development paradigm, which neuroscientist Naveen Rao called Alt Intelligence (or alternative intelligence).

Alt Intelligence, unlike the classical paradigm, is not interested in building machines that solve problems in ways similar to human intelligence. It is about using massive amounts of data, often derived from human behavior, as a substitute for intelligence [8]. Right now, the predominant belief in Alt Intelligence is the notion of scale: the idea that the larger the system, the closer we can get to “true” intelligence, perhaps even consciousness [8].

There is nothing new per se in studying Alt Intelligence. We have known for some time that human intelligence is not the only form of intelligence present on our planet, as studies of the sensory intelligence of trees have shown [5]. The problem, in my view, is the disregard with which the current paradigm treats human cognition, ignoring areas such as linguistics, cognitive psychology, anthropology and philosophy.

Alt Intelligence represents an intuition, or more properly, a family of intuitions, about how to build intelligent systems [8]. Since no one yet knows how to build any kind of system that matches the flexibility and resourcefulness of human intelligence, it is only fair that those working in the field have different hypotheses about how to get there. It is true that scaling has seen some success lately (ChatGPT is there to prove it). But it is also known, even by those who defend scaling, that one cannot simply increase the volume of training data or the size of the models and wait for success. For example, Microsoft has known since November 2022 that Sydney could “behave badly” (see the complaint posted on Microsoft's own Q&A site on November 23rd).

I’m aware that taking a swing at Bing’s Sydney has become commonplace nowadays, especially after the chatbot claimed to be in love with New York Times columnist Kevin Roose and tried to convince him to leave his wife [7]. So here are some other examples:

Gary Marcus, Ernest Davis and Scott Aaronson published a paper in 2022 in which they evaluated the semantic capacity of the DALL-E 2 system [6]. Across the fourteen tests they applied, the authors observed the A.I.'s inability to deal with complex specifications, as shown in Fig.1.

Fig.1: DALL-E 2 response to the specification “a red ball on top of a blue pyramid with the pyramid behind a car that is above a toaster”. Source: [6]

Similarly, Flamingo, an awesome model from DeepMind, has its own bugs (not noted in the presentation paper), captured in a candid Twitter thread by Murray Shanahan, a research scientist at DeepMind:

Murray Shanahan @mpshanahan

A great 🧵 here from @jalayrac. Here's a short one of my own 1/9

JB Alayrac @jalayrac

It is great to see the excitement about 🦩! As shown in different examples during the last few days, interacting with 🦩 has been quite fun, unique and sometimes mind blowing. However,🦩 has clear limitations as detailed in this 🧵! 1/11

Shanahan showed Flamingo the following image (Fig.2):

Fig.2: Image presented to the Flamingo system. Source: Murray Shanahan

Below is the dialogue between the A.I. and a user about the image in Fig.2, as reported by Murray Shanahan himself:

Murray Shanahan @mpshanahan

And this was the conversation we had about it: Flamingo: This is a traffic light. It is red. User: Can you see three circles? Flamingo: Yes, I can see three circles. User: What colours are they? Flamingo: They are red, blue and green. 5/9

Murray Shanahan @mpshanahan

User: Where is the green circle? Flamingo: The green circle is on the right. User: Really? On the right of what? Flamingo: The green circle is on the right of the blue circle. 6/9

Gato, another DeepMind system, is capable of multimodal feats. Yet, when you read the “fine print”, the system shows the same unreliability, alternating between moments of brilliance and an absolute lack of understanding (Fig.3):

Fig.3: Images captioned by the Gato system. Source: Gary Marcus

Fig.3 shows that the system, which according to DeepMind's own blog should be a generalist agent, has, at best, a limited semantic understanding. I ask: where is the surfer's board? Is the player throwing or catching the ball? Why isn't he on top of the mound? Is it a banana bread donut? Where is the white plate?

Of course, it is not uncommon to argue (quite reasonably, by the way) that even human beings make mistakes, let alone a deep learning model. But we need to look honestly at the examples shown above and recognize that these types of mistakes reveal that something is still deeply wrong. Imagine that these were semantic understanding mistakes made by your young children. Wouldn't you be concerned?

Literally billions of dollars have been invested in Transformers, the technology that powers ChatGPT, GPT-3, Gato and so many others; training datasets have expanded from megabytes to gigabytes; and parameter counts have grown from millions to billions (already reaching trillions) [8]. And yet semantic misunderstanding errors, well documented in countless works since 1988 [1], remain.

For some (and I include myself in that group), the continuing problem of misunderstanding may, despite immense progress, signal the need for a fundamental reassessment. The dream of artificial general intelligence, or AGI, which is the community's shorthand for A.I. that is at least as good, ingenious, and comprehensive as human intelligence [8], is still a long way off. The hallmark success of current artificial intelligence, and of Alt Intelligence in particular, has been games like chess and Go [8]. Both Deep Blue and AlphaGo owe little to nothing to human intelligence. But Alt Intelligence's emphasis on technical tools to accommodate larger datasets is misplaced. Symbol manipulation needs to be re-integrated into cognitive science and A.I. Symbolic descriptions such as “a red ball on top of a blue pyramid” still escape the state of the art in 2023.

Yes, machines can play certain games very well. Yes, deep learning has made great contributions to multiple domains such as speech recognition. But no current A.I. is even remotely close to reading arbitrary text with enough understanding to be able to build a mental model based on semantic communication alone (e.g., understanding what a person says she wants to accomplish). Nor is it able to reason about an arbitrary problem and produce a cohesive answer.

Success in solving some problems with a specific strategy does not, in any way, guarantee that we can solve all problems with a similar approach. Again, I borrow Murray Shanahan's reasoning: he sees “very little in Gato to suggest scaling alone will get us to human-level generalisation”.

Murray Shanahan @mpshanahan

My opinion: Maybe scaling is enough. Maybe. And we definitely need to do all the things @NandoDF lists. But I see very little in Gato to suggest scaling alone will get us to human-level generalisation. It falls so far short. Thankfully we're working in multiple directions

Nando de Freitas 🏳️‍🌈 @NandoDF

Someone’s opinion article. My opinion: It’s all about scale now! The Game is Over! It’s about making these models bigger, safer, compute efficient, faster at sampling, smarter memory, more modalities, INNOVATIVE DATA, on/offline, … 1/N https://t.co/UJxSLZGc71

The field of artificial intelligence should be encouraged to be open-minded enough to work in multiple directions without prematurely discarding ideas that have not yet been fully developed. It may be that the best path to artificial general intelligence (AGI) is not through Alt Intelligence.

The fundamental issue is something the A.I. community once valued but forgot: if we want to build AGI, we need to learn something from ourselves, human beings: how we reason about and understand the physical world, and how we represent and acquire language and complex concepts.

REFERENCES

[1] Marcus, Gary; Davis, Ernest. 2019. Rebooting AI: Building Artificial Intelligence We Can Trust. Pantheon Books, USA.

[2] McCarthy, John. 1958. Programs with common sense. Symposium on Mechanization of Thought Processes. National Physical Laboratory, Teddington, England.

[3] Minsky, Marvin. 1986. The Society of Mind. New York: Simon & Schuster.

[4] Simon, Herbert A. 1979. Models of Thought: Vol. I. Yale University Press.

[5] Keim, Brandon. 2019. Never Underestimate the Intelligence of Trees. Nautilus, October 30, 2019, https://nautil.us/never-underestimate-the-intelligence-of-trees-237595/.

[6] Marcus, Gary; Davis, Ernest; Aaronson, Scott. 2022. A very preliminary analysis of DALL-E 2. arXiv, https://doi.org/10.48550/arXiv.2204.13807.

[7] Roose, Kevin. 2023. A Conversation With Bing’s Chatbot Left Me Deeply Unsettled. The New York Times, February 16, 2023, https://www.nytimes.com/2023/02/16/technology/bing-chatbot-microsoft-chatgpt.html.

[8] Marcus, Gary. 2022. The New Science of Alt Intelligence. The Road to AI We Can Trust, May 14, 2022. https://garymarcus.substack.com/p/the-new-science-of-alt-intelligence.

Marcelo Tibau
