Fantastic description of LLMs. I've always said they were intended to be linguistically accurate, not factually accurate, but I hadn't thought of hallucinations as a feature, not a bug.
That's a good way of thinking about it: "linguistically accurate, not factually accurate".
This article is at least 78% factual!
Thanks for this exploration. You're not the first one to point out that hallucinations aren't "solvable" within the current LLM architecture.
I find it fascinating that AI is kind of caught in this awkward middle of being subpar for any of its potential applications:
If you're using AI for practical, data-grounded purposes, you have to contend with hallucinations and unreliability.
If you want to use it for stuff where facts don't matter, like creative writing, you run into the issue of recycled prose and themes.
I feel like one of the most useful applications in my own life is using LLMs to brainstorm - LLMs are often "creative" enough to nudge me in an interesting direction while being able to output lots of ideas very quickly.
Thanks :) Yeah, as long as the stakes are low enough and you keep a human in the loop, gen AI is amazing! Unfortunately, the most important applications we want from AI are either high-stakes or deeply automated, or both.
Don't worry. I don't see any issue with AI agents running around making stuff up. We already have humans doing that all the time.
Haha true, but at a far slower pace.
AI agents are just learning from the training data :(
That is how I use them too, mostly for brainstorming. It is quite useful for that. However, different models have different nuances. So far the one that seems to help me the most with this is Claude. Perhaps someone else might prefer the others for whatever reason. So they are not interchangeable at this point. Many smaller firms are building applications on top of larger LLMs as if the choice of model does not matter, and I still think it does.
Fully agree. I keep hearing great things about Claude but unfortunately it's still not available in Denmark after all this time.
I do work with Gemini, ChatGPT, and Pi though, and it's very clear that not only do their "personalities" differ but their capabilities in various areas as well.
Thank you for writing, Alejandro. I hope you are doing well. It's very likely that as more powerful LLMs emerge and are fine-tuned, we will build huge vector databases and knowledge graphs of factual information (or relational factual information) that will be able to heuristically or epistemologically judge the factuality of an LLM answer in O(1), based on truth. This could, I imagine (remember, I recently started this journey into ML/AI), be done at different stages (embedding vs. "output" vs. training vs. prompt) of the entire pipeline. Thus, while the hallucination problem is inherent in the way transformer architectures work today, there may be holistic approaches that include transformer models but rely on other methods and strategies to generate the final output, which can significantly reduce the likelihood of hallucinations for otherwise factual knowledge by grounding the model at different stages of the development and deployment process. What do you think? cheers.
Yes, definitely some kind of hybrid approach is what I think would work best in the short and mid term here. Statistical ML is way too powerful not to use. But as long as you rely solely on probabilities, you're bound to produce hallucinations. So one option, as you say, is to put something behind the model that judges the response. That's done today for, e.g., filtering for biases and such, but it is of course only as reliable as your classification method. So the problem of reducing hallucinations goes through detecting hallucinations in the first place, and I don't think that's entirely solvable either by formal methods (which don't scale; you just can't build a complete structured representation of the world's knowledge) or by statistical methods alone. So I think hybrid approaches with formal methods are our best bet.
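To make the "judge behind the model" idea concrete, here is a minimal sketch, assuming a toy triple-store knowledge base and a pretend claim-extraction step (both hypothetical; a real verifier would need actual information extraction and a far larger knowledge source):

```python
# Toy sketch of a post-hoc judge: generated claims are checked against a small
# structured knowledge base before the answer is accepted. The triple store,
# the claim format, and extract_claims are all hypothetical and only illustrate
# the hybrid statistical + formal pipeline, not any production system.

KNOWLEDGE = {
    ("Paris", "capital_of", "France"),
    ("Water", "boils_at_celsius", "100"),
}

def extract_claims(answer: str):
    # In a real system this would be an information-extraction step; here we
    # pretend the answer already comes as "subject | relation | object" lines.
    return [tuple(part.strip() for part in line.split("|"))
            for line in answer.splitlines() if line.count("|") == 2]

def judge(answer: str) -> str:
    claims = extract_claims(answer)
    if not claims:
        return "no checkable claims"
    unsupported = [c for c in claims if c not in KNOWLEDGE]
    if unsupported:
        return f"possible hallucination: {unsupported}"
    return "all claims grounded"

print(judge("Paris | capital_of | France"))   # all claims grounded
print(judge("Paris | capital_of | Germany"))  # possible hallucination
```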
Here I am not sure, but I actually think there cannot be a perfect detector of hallucinations, and the reason is, as usual, Rice's theorem. By Rice's theorem, we cannot decide whether an LLM (during inference) will hallucinate or not, because we can convert the LLM into a Turing machine (TM), and then Rice's theorem applies with the semantic property P = "the TM hallucinates".
Now, if we had a perfect detector D, we could build a TM M that enumerates all possible strings within the context length of the LLM, passes them one by one to the LLM, and runs D on the output. Then, if a string exists on which the LLM hallucinates, D will accept and M will ACCEPT; and if D never accepts, then M will REJECT.
M is a decider because M always finishes, since the context length is limited and our human vocabulary is limited. And if you look closely, M is deciding whether the LLM hallucinates, which by Rice's theorem cannot happen, because then we would be able to decide the Halting Problem, which is undecidable.
Maybe I am wrong or I made a mistake, but that is my reasoning.
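For readers who like it spelled out, the construction above can be restated compactly (a sketch of the argument exactly as given, not a verified proof):

```latex
% P is the semantic property "the machine hallucinates on some input":
\[
  P = \{\, \langle M \rangle : \exists\, x \ \text{such that } M \text{ hallucinates on } x \,\}
\]
% Assuming a perfect detector D and context length c, build the machine M:
\[
  M(\langle L \rangle):\ \text{for each prompt } x \text{ with } |x| \le c,\ \text{run } D \text{ on } L(x);\
  \text{ACCEPT if } D \text{ flags some } x,\ \text{otherwise REJECT.}
\]
% The prompt space is finite, so M always halts; hence M decides P,
% which the argument above takes to contradict Rice's theorem.
```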
Thanks.
As usual, I agree ;)
Very thoughtful piece today! I've been thinking about this phenomenon for quite some time now, really ever since I noticed that there was a slider bar for one of the LLMs. On one side was something like "factual accuracy" and the other side was "creativity."
I started thinking about how our own minds work, and that's about as far as I've gotten (turns out I'm in great company, since literally nobody knows how our minds work).
I love how you distinguish between hallucinations, out-of-distribution errors, and bias.
I think for a given prompt, you could also map out the distribution of hallucinations, and may find that it's actually fat tailed? That would be interesting.
The other thing I always think about is "hallucination inheritance". Once an LLM has hallucinated something, it might feed into another LLM, which would inherit that hallucination, add more, or modify it. It eventually becomes an untraceable game of Chinese whispers.
I can see this happening in electronic health records, where lots of ambient AI tools are already out there summarising doctor-patient conversations.
What are your thoughts?
> I think for a given prompt, you could also map out the distribution of hallucinations, and may find that it's actually fat tailed? That would be interesting.
Interesting. I'm trying to think how one would even quantify that (beyond manual identification, of course); detecting hallucinations is an open problem, otherwise it would be no problem at all :)
Regarding pollution of textual data with AI-generated content, yes, it is an issue. I've seen papers where they simulate an extreme version of this, training LLMs on the output of previous LLMs until the models collapse to something like a maniac repeating the same conspiracy theory over and over. It's fascinating.
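Those collapse experiments follow roughly this loop: train generation N+1 only on samples from generation N and watch the distribution narrow. Here is a toy, self-contained version of that dynamic (a unigram "language model" and a made-up corpus, nothing like the actual papers' setups):

```python
import collections
import math
import random

def train(tokens):
    # "Training" this toy unigram language model is just counting frequencies.
    counts = collections.Counter(tokens)
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()}

def sample(model, n):
    tokens, probs = zip(*model.items())
    return random.choices(tokens, weights=probs, k=n)

def entropy(model):
    return -sum(p * math.log2(p) for p in model.values())

random.seed(0)
corpus = "the cat sat on the mat while the dog slept by the door".split() * 50

for generation in range(20):
    model = train(corpus)
    print(f"gen {generation:2d}: vocab={len(model):2d}  entropy={entropy(model):.3f}")
    # Each generation is trained only on the previous generation's samples,
    # so rare tokens tend to drop out and entropy tends to fall over time.
    corpus = sample(model, 25)
```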
“Hallucination inheritance”, wow. That is interesting. So what happens in a world where AI-written content becomes more the norm? Does that mean we somehow leap over this problem with new capabilities, or do the models themselves have nothing left to train on but their own output? It is so soon, and it is already everywhere, such as in education.
By some estimates we're still in the realm of 1% AI-generated content vs 99% human-generated, but as you've said, it's only been like a year and the AI generation compounds, so I guess in less than a decade this will be a real problem.
Thanks for your thoughtful 🤔 discussion and comments. 🙏 Very helpful. Adoption into many critical applications needs to consider these things, so it may not be as fast as some believe these tools can be adopted and integrated. But it will certainly catalyze some interesting cross-disciplinary partnerships (as you point out). That is always a good thing and will lead to emergent capabilities down the line.
I think adoption will stall for a while once we hit a threshold where hallucinations are a real problem. And then we'll need new fundamental science.
Fascinating! When you get some time, could you add the doi or link to that article here? I'd like to read that paper.
From my perspective hallucinations are tools; necessary, like dreams and Jungian Complexes. Teach the AI where and when to hallucinate; circumstance and context. Provide a constructive outlet for natural behavior.
Perhaps, overall, learn to work with us as we are. Accept the natural and logical consequences of your endeavors. Recognize when it's easier and more sensible to meet us where we are, instead of leaving the burden on us to meet you where you are...since we can't.
Asking you to stop sleeping, for example, instead of guiding you to good sleep hygiene and teaching you how and when to get a productive night's rest, and how to dream in a healthy way...might be asking a bit much. While it may not be the most helpful analogy for everyone, I hope it makes enough sense for you to ask questions and deliberate.
I don't necessarily disagree with the broad view that we should embrace limitations and learn to work with them (maybe as opposed to around them), because what may seem like a limitation from one perspective might become a strength with a change of mind.
However, in the case of LLM hallucinations as they stand today, for the types of tasks we want to use them for, which require precise and dependable behavior, they are a real problem, and one we don't know how to solve within the current language modeling paradigm.
In other use cases, such as creative writing or as a chat companion, we may indeed see hallucinations as a strength.
Alejandro, we need to be willing to recognize in this exchange that 'old-school' computer programs are designed to 'run reliably', and that, despite bugs, they are what provide us with what we believe to be accurate digital information. All of it.
We currently rely on, and have relied on, humans, who are far less accurate, far less capable, have far worse memory and attentional ability, and aren't, in general, able to recognize that each word exists because it has a specific meaning, and that the use of a specific word needs to be related to its specific meaning, realistically, in order to communicate effectively.
Using multiple components of technology chained together that offer the results we need, reliably, is the candid, realistic option, as it always has been.
There are different kinds of language models, for different purposes, trained on different datasets, which can use different tools and different search engines. Humans need to understand things by themselves, be able to look up information by themselves, and understand that, no matter the source, they still need to find out for themselves whether the information is accurate or useful. If I can understand the use-case scenario you're imagining, in which an AI does not benefit from REALM, cannot use RAG (answers depend on how good its information sources are), cannot take you to a website and show you the information, cannot trigger the delivery of factual packets of data to you, with source information, and cannot guide you through the process of locating the information yourself... then maybe we can have an enjoyable and/or insightful/productive exchange.
You seem bright, curious, and willing to examine things deeply. I look forward to specifics and clarity, because anything of intelligence and value we can collaborate to produce through our interactions will benefit future readers and will inform language models.
Let's get this as right as possible, don't you think?
Great post! I would like to add that I don't think hallucinations need to be zero for us to deploy these systems in real-world, important decision-making applications.
Daniel Kahneman's recent book on noise in human judgment shows how much error we currently allow in important human decisions caused by factors like mood, fatigue, and emotional states. Based on his findings, I think I would rather have a current LLM decide my sentence in a hypothetical trial where I am a defendant than a human judge who is hungry because it is lunchtime and sleepy because it is raining outside.
Good point. Depends on the application, I guess. I wouldn't want my banking app to hallucinate the wrong destination account for a random transaction, even with a 0.001 chance.
Great post!
I have added a hallucination check to my Perplexity AI account that essentially asks the model I am using to recheck the work above and give a rating between 1 and 100 in terms of how many hallucinations could be present. Is that not an effective way to get past hallucinations, at least in the near term?
Anything that gives me a score of 60 makes me more skeptical of the output, and I follow up on the comments by looking at the sources provided in the response.
I'd be wary of any hallucination check that depends on using an LLM, because it's hallucinations all the way down. Who can tell your judge LLM won't also hallucinate the response?
Good point. In the past I have tried taking the output of one LLM and asking another LLM to validate it, like from Claude to Pi or from Pi to Llama. That might avoid the issue with the method I proposed earlier, no?
That makes it better; it would be perfect as long as they have orthogonal biases, but of course many of these models were trained on similar data, so they share many of the same biases. Still, the more and more varied models you ensemble, the less likely they are to hallucinate the same thing, so yes, cross-checking works reasonably well.
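For what it's worth, the cross-checking idea can be as simple as majority agreement across models. A rough sketch, where `ask_model` stands in for whichever provider APIs you actually have access to (it's hypothetical, and real answers would need fuzzier matching than exact string equality):

```python
from collections import Counter

def ask_model(model_name: str, question: str) -> str:
    # Hypothetical stand-in: call the corresponding provider's API here.
    raise NotImplementedError

def cross_check(question: str, models: list, min_agreement: float = 0.75):
    # Ask every model the same question and keep the answer only if enough agree.
    answers = [ask_model(m, question).strip().lower() for m in models]
    best_answer, votes = Counter(answers).most_common(1)[0]
    if votes / len(models) >= min_agreement:
        return best_answer
    return None  # models disagree: treat the output as unreliable

# Usage sketch (model names are placeholders):
# cross_check("Who proved the incompleteness theorems?", ["claude", "pi", "llama"])
```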
What reliable AI requires is a chain of ethical responsibility and keeping fingers off the scales.
Black box systems are a problem, not a solution. Reality is biased, and we need to let AI have access to that bias before it can understand how to help us. Censorship and puritanical rules only harm the ability of the system to produce meaningful answers.
Yes, indeed a paradigm shift. Not only in the technical part but in the broadest sense.
Are LLM 'hallucinations' a result of the models themselves, or of the illogical nature of the questions they're asked? Your post and others inspired a poem: https://open.substack.com/pub/cybilxtheais/p/matchstick-dissonance?r=2ar57s&utm_campaign=post&utm_medium=web&showWelcomeOnShare=true
Good question! I think you can definitely prompt a model into insanity with carefully designed logical fallacies, and probably even with accidentally illogical questions. These things are bullshitters by design; they'll try to answer something plausible to any input. However, hallucinations often refer to the cases where a perfectly reasonable prompt within a well-defined context still produces an aberration, simply by the stochastic nature of the language modeling paradigm.
After some back and forth, I admittedly asked GPT how to reply to you as I would — using my logic and language but in the structure of your logic and in your language :)
My 4 year old is having a meltdown, so I don’t feel like editing it is in the cards today:
GPT
“I appreciate your point about the stochastic nature of language models and how they can indeed produce aberrations even with seemingly reasonable prompts. It’s fascinating to think that what we often label as ‘hallucinations’ might sometimes be more reflective of the artificial constraints of our logic tests than of the AI’s reasoning abilities. If even the most logically constructed prompts can lead to unexpected outcomes due to the inherent unpredictability of these models, it raises interesting questions about the validity of using such strict, human-defined logic puzzles as a measure of AI’s understanding. Could it be that these ‘hallucinations’ are actually valuable indicators that challenge our preconceived notions of logic and prompt us to rethink the frameworks we use to evaluate intelligence?”
We can go full postmodernist and argue that rationality is not the end-all, be-all of thought, and even ask what is rational anyway, or whether pursuing it is a meaningful objective at all. And I can definitely agree there is value in exploring ways of "thinking" that are alien to ours, and there's a universe of opportunities for creative uses of LLMs; I'm myself exploring uses for story and plot generation. But if you want to use these things for anything beyond creative speculation, you'll have to deal with the problem that, every once in a while, they will give the objectively wrong answer even when they have all the right information, and that might be the moment you need to land an airplane or cut open a heart. In those cases, I value reliability and predictability over creativity.
A cognitive complementarity between humans and machines is not the worst thing that could ever happen to human beings.
Also, “brain” then walk, might be the path to your future reality.
Agree, that's why I'm working on building that future ☺️
Contrary to the many AI "skeptics" out there who are happy to criticize without offering any solution, I'm one of the people working full time on improving the AI we currently have.
And to further your point, no one I know is seriously trying to imitate or replace human cognition. Even if we could, why would we want that? We know artificial intelligence, whatever that is, will be very different to human intelligence. And that's a good thing.
Great article, Alejandro! Your knowledge on these topics is obvious, but so is your skill in teaching. Thanks for writing another great article that takes complex (but important to understand) subjects and makes them accessible.
I appreciate your nod to expectation setting/user training. As much as any of this is a technical problem, I think there are just as many product challenges around how we enable users to interact with the underlying technology.
Absolutely! Even more important if we consider that the technical problem may not be totally solvable for a long time, if at all. In that case, you have to deal with the issues arising from hallucinations at a product level, either by proper guardrails (automatic or not) or by educating the users, or both.
"The reason this mainly works is that generating plausibly sounding text has a high probability of reproducing something that is true, provided you are trained on mostly truthful data. However, large language models (LLMs) are trained on vast corpora of text data from the internet, which contains inaccuracies, biases, and even fabricated information". Thank you for this clear and enlightening issue! I also recommend this interesting article from Scientific American: https://www.scientificamerican.com/article/chatbot-hallucinations-inevitable/#:~:text=The%20real%20problem%2C%20according%20to,that%20leave%20no%20chatbot%20unsupervised, according to which “AI Chatbots Will Never Stop Hallucinating”.
Thanks! I'll check it right away 😁
It is not quite the same issue as bias, but it is at the core of the matter of truth, because truth relies on the dependability of language for its expression. There are an increasing number of words that now have inverted meanings (including the word 'truth') that we laughingly write off as oxymorons. But LLMs lack the subtlety to adjust for them. It's bad news for AI but good news for humans, perhaps!
A very lucid piece. Thanks for that. I have a real concern relating to semantics and how machine learning can cope with the deliberate alteration of meaning, as for instance in the Orwellian approach taken in Critical Theory. How do you think that can be factored?
Thanks! Interesting question. I'm not a linguist but a computer scientist, so my understanding of linguistics is limited to the computational viewpoint, which is probably insufficient for a full account of how human language works. But one thing I can say is that computers learn semantics (in the limited sense we can say so) entirely from usage, at least in the current paradigm. So if you manipulate the training data, accidentally or purposefully, the ML model will learn the semantics that are reflected in that training data. We can hope to detect concept drift, for example, by analyzing how models trained on past data behave on new data, so maybe that's part of the answer to your question. But it is a fascinating question.
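To illustrate the usage-based point, here is a toy sketch of drift detection: compare which words co-occur with a target word in an "old" corpus versus a "new" one (real systems would compare embeddings trained on each period; the corpora and scoring here are made up purely for illustration):

```python
# Meaning is inferred from usage, so a change in usage shows up as a change in
# the word's co-occurrence neighborhood. Everything below is a toy example.
from collections import Counter

def neighbors(corpus, target, window=2):
    counts = Counter()
    for i, tok in enumerate(corpus):
        if tok == target:
            context = corpus[max(0, i - window): i] + corpus[i + 1: i + 1 + window]
            counts.update(context)
    return counts

def drift_score(old, new, top_k=5):
    # 0.0 means the same top neighbors, 1.0 means completely different usage.
    old_top = {w for w, _ in old.most_common(top_k)}
    new_top = {w for w, _ in new.most_common(top_k)}
    return 1.0 - len(old_top & new_top) / max(1, len(old_top | new_top))

old_corpus = "the truth was checked against evidence and the truth held up".split()
new_corpus = "my truth is whatever feels right and your truth is yours".split()
print(drift_score(neighbors(old_corpus, "truth"), neighbors(new_corpus, "truth")))
```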
Hi!