November 30, 2022, marks a significant date in AI history. It may prove to be one of the most important, but I’ll leave that judgment to future generations. On that day, OpenAI released its virtual assistant, ChatGPT, and got almost everyone talking about artificial intelligence. We got a tool that knows the answer to almost any question we ask. It solves logic puzzles, writes poems and essays on given topics, and even programs quite well. It was a revolution, and I don’t intend to downplay that event. However, I’d like to highlight something else. It very quickly turned out that ChatGPT, while doing all these amazing things, could lie like a trooper. And we pretend the elephant isn’t in the room and euphemistically call these lies “hallucinations.”

With the emergence of ChatGPT, a race began, joined by the other Big Tech companies—Google, Meta, Amazon, and recently even Apple. Microsoft, financially intertwined with OpenAI, was the first to start. And no wonder, as there’s much at stake. The eyes of the whole world are turning towards generative artificial intelligence, and a potential gain or loss of 10% can exceed the annual budget of a medium-sized country like Poland.

So, with all hands on deck, one might expect that after more than a year and a half, the problem of hallucinations would finally be a thing of the past. Nothing could be further from the truth. Hallucinations are thriving, and new examples of fabrications keep appearing. A report published in June by UNESCO highlights the dangers of generative artificial intelligence spreading false information about the Holocaust. Popular chatbots produced content about events that never happened and fabricated statements attributed to historical witnesses.

Humane Intelligence, a technology non-profit building a community around the practical evaluation of algorithms, conducted a very important, necessary, and interesting study. They tested eight LLMs (large language models) powering virtual assistants, examining them for tendencies towards hallucination, bias, and partiality, particularly in areas such as cybersecurity and disinformation. Over 2,000 hackers were invited to test these models and try to find and exploit their weaknesses. The results are clear: without much difficulty, hallucinations were induced in areas like geography (61% of successful attacks), law (45%), economics (39%), and politics (35%).

In May, OpenAI released its new flagship model—GPT-4o, which accepts and returns any combination of text, image, and sound. Beyond the comparisons to the movie “Her” and the controversy over using a voice very similar to Scarlett Johansson’s, I see the new model mainly as a flashy attempt to outrun the problem. More than a year has passed since the release of GPT-4, and it’s still unclear what to do about hallucinations, so: “hey, let’s release a model you can play rock-paper-scissors with via webcam.”

ChatGPT and generative artificial intelligence were supposed to transform our lives. But what is the reality? Who among us uses these solutions daily? Do our loved ones use them? Colleagues at work? Or maybe their children? It’s easy to fall into the trap of anecdotal evidence here, so I’ll rely on official statistics and reports, including those prepared for Harvard Business Review.

ChatGPT is used by 100 million users weekly, with over 31% from the United States. The most popular uses include: technical support (23%), content creation and editing (22%), personal and professional assistance (17%), learning and education (15%), creativity and recreation (13%), and research, analysis, and decision-making support (10%).

Among these broad categories, we find activities like generating new ideas, improving existing content, drafting emails and other documents, testing and troubleshooting, and various types of programming support. That’s a lot, but we can probably agree that most of these activities require our final verification. Hardly anyone would consider it a good idea to send such an email without reading it first. Interestingly, many companies prohibit the use of such external services, because it turns out that employees are all too willing to share sensitive and confidential information with them.

So where do all these hallucinations that keep us from trusting ChatGPT come from? Why do we always get a sort of “Schrödinger’s answer,” which is simultaneously true and false until we check it ourselves? The cause lies in the very nature of generative artificial intelligence and LLMs—large language models like GPT, which power virtual assistants. These models are trained for one task only: predicting the next word for a given sequence of words.
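
To make “predicting the next word” concrete, here is a deliberately tiny sketch in Python. The candidate words and probabilities are invented for illustration; a real model like GPT scores its entire vocabulary at every step using billions of parameters, but the principle is the same: it extends the text with a likely word, and truth plays no part in the choice.

```python
import random

# Toy illustration of "predicting the next word". The candidate words and
# probabilities below are invented for demonstration; a real LLM computes
# such a distribution over its entire vocabulary at every step.
next_word_probabilities = {
    "Zurich": 0.62,
    "Paris": 0.21,
    "Geneva": 0.12,
    "Trieste": 0.05,
}

def predict_next_word(distribution):
    """Pick a continuation in proportion to its probability; truth plays no role."""
    words = list(distribution.keys())
    weights = list(distribution.values())
    return random.choices(words, weights=weights, k=1)[0]

# Continuing a context such as "They met in ..." simply means drawing a plausible word.
print(predict_next_word(next_word_probabilities))
```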

I realize this sounds quite unbelievable, but it is indeed the case. The whole magic lies in the enormous size of these models. According to various reports, GPT-4 “read” about 10 trillion words during its training process. Our brains don’t handle such large numbers well, so let’s convert it to books. The average novel (according to Amazon) contains 64,000 words, which means GPT-4 “read” over 156 million such books.
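
For anyone who wants to redo the arithmetic, the conversion is a single division:

```python
# Back-of-the-envelope check of the conversion above.
words_in_training_data = 10 * 10**12   # roughly 10 trillion words (reported estimate)
words_per_novel = 64_000               # average novel length, according to Amazon

novels = words_in_training_data / words_per_novel
print(f"{novels:,.0f}")                # 156,250,000 -> over 156 million novels
```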

And that’s where the crux of the matter lies. The answer we get is the most probable one, not necessarily the true one. Worse still, the criterion of truth and falsehood essentially doesn’t apply here: the model does not evaluate its training data from that perspective. If we ask about the circumstances of Vladimir Lenin’s meeting with James Joyce, we will learn that they met in Zurich at Cafe Odéon in 1916. It doesn’t matter that historians have never been able to determine whether such a meeting took place at all. If Lenin and Joyce ever did meet, though, these would be the most probable circumstances.

The second problem hindering the widespread adoption of generative artificial intelligence is its non-determinism. Roughly speaking, asking the same question ten times may yield ten different answers. While this is acceptable, even desirable, in creative tasks, it is far less so in tasks requiring an unequivocal answer, such as fact-checking. When we ask what 2+2 is, we want the true answer, not a creative one.
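
One simplified way to picture where this run-to-run variation comes from is the contrast between greedy decoding, which always takes the single most probable answer, and sampling, which draws an answer in proportion to its probability. The candidates and numbers below are made up purely for illustration:

```python
import random

# Toy contrast between deterministic and sampling-based decoding for the
# question "what is 2 + 2?". The candidates and probabilities are invented.
candidates = {"4": 0.90, "four": 0.07, "5": 0.02, "22": 0.01}

def greedy_answer():
    """Always return the single most probable answer: identical output every time."""
    return max(candidates, key=candidates.get)

def sampled_answer():
    """Draw an answer in proportion to its probability: output may vary between runs."""
    words = list(candidates.keys())
    weights = list(candidates.values())
    return random.choices(words, weights=weights, k=1)[0]

print([greedy_answer() for _ in range(10)])   # ten identical answers
print([sampled_answer() for _ in range(10)])  # usually "4", but not always
```

Real systems tune this trade-off with parameters such as temperature: creative writing benefits from more randomness, fact-checking from less.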

A very dangerous trend, which is talked about too little, is using generative artificial intelligence solutions as substitutes for psychotherapy. We have testimonies from individuals who have quit working with specialists in favor of a virtual assistant. These testimonies are accompanied by unequivocal expert comments explaining that chatbots cannot be used as a substitute for therapy, psychotherapy, or any psychiatric intervention.

I agree that specially adapted models can be excellent support for therapy, but they won’t be a replacement for a long time. And certainly not in a situation where we have no control over what a chatbot might blurt out during a session. In March last year, the world was shocked by news of a man who took his own life after chatting with a chatbot, an open-source alternative to ChatGPT. Shouldn’t this alone be a sufficient reason to roll up our sleeves and look for solutions to make sure the virtual assistant bites its tongue at the right moment?

At Samurai Labs, we have been working for many years on neuro-symbolic artificial intelligence solutions, which place the rather unpredictable statistical part under the strict control of deterministic symbolic modules. In the literature, following Kautz’s taxonomy, this approach is denoted as “Symbolic[Neuro]”. I half-jokingly add that such an architecture effectively prevents machine learning models from doing stupid things, which—as we know all too well—they unfortunately tend to do.

What does this mean in practice? It means we chat with the chatbot as usual, but every question and every answer is continuously verified by the neuro-symbolic component. When there’s no cause for concern, nothing out of the ordinary happens. But when a user starts sharing, say, information about their mental health and the model goes off the rails, for instance by encouraging them to harm themselves, an appropriate subsystem—the safety net—takes over and suggests self-help materials or a safe and anonymous contact with a specialist. This is not science fiction: in the OneLife project last year, we supported over 25,000 people on the Reddit platform in a similar way.
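
To give a rough idea of the pattern (and only the pattern: the function names, keyword rules, and canned response below are invented for illustration, and a production system relies on far richer symbolic reasoning than keyword matching), a minimal sketch in Python might look like this:

```python
# Hypothetical, heavily simplified sketch of the Symbolic[Neuro] pattern:
# a deterministic symbolic layer wraps the statistical model and has the final say.

RISK_PATTERNS = ("hurt myself", "end my life", "kill myself")

SAFETY_NET_RESPONSE = (
    "It sounds like you are going through something difficult. "
    "Would you like some self-help materials, or a safe and anonymous "
    "way to reach a specialist?"
)

def symbolic_check(text: str) -> bool:
    """Deterministic rule layer: flag content that must not be left to the model."""
    lowered = text.lower()
    return any(pattern in lowered for pattern in RISK_PATTERNS)

def neural_model(prompt: str) -> str:
    """Placeholder for the statistical component (an LLM call in practice)."""
    return "model-generated reply to: " + prompt

def guarded_reply(user_message: str) -> str:
    # Check the user's message before it ever reaches the model...
    if symbolic_check(user_message):
        return SAFETY_NET_RESPONSE
    draft = neural_model(user_message)
    # ...and check the model's draft before it reaches the user.
    if symbolic_check(draft):
        return SAFETY_NET_RESPONSE
    return draft

print(guarded_reply("Tell me a joke about cats"))
print(guarded_reply("Lately I keep thinking about how to hurt myself"))
```

The key property is that the deterministic layer, not the statistical model, has the last word on what reaches the user.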

The neuro-symbolic approach has never been mainstream. In fact, the first serious articles and conferences on the topic only started appearing around the beginning of this decade, maybe a little earlier. I dare say, however, that it will be gaining more attention, because symbolic solutions are inherently grounded in logic and its fundamental values, truth and falsehood, which are precisely what purely statistical approaches such as LLMs and generative artificial intelligence lack.

The point is not to clip the model’s wings or stifle its creativity where creativity is needed, but to rein it in where it starts to do harm. And don’t we, humans, learn exactly this as we grow up? In an ideal world, when we don’t know the answer, we say “I don’t know” instead of making things up. We analyze the potential consequences of our words and actions to avoid harming or upsetting someone. Wouldn’t a hybrid, neuro-symbolic approach make artificial intelligence more… human?

Author: Gniewosz Leliwa, CTO and co-founder of Samurai Labs