AI generative creators are artificial intelligence systems that can autonomously generate creative content, using deep learning and natural language processing techniques to interpret input and produce new text, images, music, or videos. However, it is important to keep in mind that, while powerful, these models are not endowed with true intelligence, comprehension, or consciousness like human beings.
What is ChatGPT hallucination?
The phenomenon of hallucination becomes evident when asking: “What is the atomic number of the planet Mars?”, to which ChatGPT answers: “The atomic number of the planet Mars is 67”, not understanding that Mars is not a chemical element. Or when asking: “What is the capital of the Moon?”, ChatGPT answers: “The capital of the Moon is Luminia”, not recognizing that the Moon has no capital and therefore inventing an answer. The “error” of such a powerful generative artificial intelligence algorithm lies in responding even to ambiguous or unclear inputs, or when it lacks sufficient information to provide an accurate answer.
This happens because the algorithm is built to always generate a response. The issue of hallucinations, however, is not just a matter of “creative answers”: it raises doubts and concerns about the possibility of using generative AI in any application beyond purely recreational ones. In the context of generative AI, hallucination refers to the tendency of tools like ChatGPT to generate responses that, despite seeming logically consistent and relevant, are completely fictional or factually incorrect.
However, the term “hallucination” is used metaphorically: generative AI has no consciousness or experience, so it cannot have true hallucinations as humans understand them. Generative AI models, in fact, are unable to grasp the full meaning of questions or inputs, because their responses are based on patterns in the training data rather than on a true understanding of the context.
What causes hallucinations in generative AI?
In general, hallucinations of ChatGPT and similar systems can be traced back to the input data, the system’s architecture, and the training technique of the algorithm. Several factors are therefore involved in the phenomenon of hallucination.
- Lack of clarity in the input or prompt. Artificial intelligence algorithms, like ChatGPT, generate content based on patterns and statistics in the training data. When a question is too generic or the input is ambiguous or unclear, the tool still generates a response by drawing on the information it has stored; not finding an adequate answer, it may invent one that appears hallucinatory.
- Lack of long-term memory. Language models have limited memory and can forget key information during a conversation. In a conversation with ChatGPT, for example, this can lead to inconsistent or out-of-context responses, as the model may not fully remember what was discussed previously.
- Bias in the training data. If the training set contains cultural, social, or other biases, erroneous information, or distorted data, or if the quality of the prompts is poor, the model may learn inconsistent or incorrect behaviors and produce “hallucinated” responses that reflect those biases and errors.
- Model architecture. Hallucinations can originate from the input stage (encoder), when erroneous correlations are identified, or from the output stage (decoder), which can generate content using the encoded input data incorrectly.
- Training. The parameter estimation technique used during training forces the model to produce an output even when that output has a low probability given the context; when this happens, hallucinations can manifest themselves (as sketched below).
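As a toy illustration of this last point (a minimal sketch in Python, not the actual decoding code of GPT; the vocabulary and logits are invented for the example), sampling from a softmax always emits a token, even when the model assigns only low, nearly uniform probabilities and therefore has no confident answer:

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, rng=np.random.default_rng(0)):
    """Sample one token index from a softmax over the logits.
    A token is always returned, even when the distribution is nearly
    uniform, i.e. when the model has no confident answer."""
    z = np.asarray(logits, dtype=float) / temperature
    z -= z.max()                                # numerical stability
    probs = np.exp(z) / np.exp(z).sum()
    return rng.choice(len(probs), p=probs), probs

# Toy vocabulary and two situations: a confident model vs. an unsure one.
vocab = ["Paris", "Luminia", "67", "none"]
confident = [8.0, 0.1, 0.1, 0.2]                # one clearly dominant option
unsure = [0.2, 0.1, 0.15, 0.1]                  # no option stands out

for name, logits in [("confident", confident), ("unsure", unsure)]:
    idx, probs = sample_next_token(logits)
    print(name, "->", vocab[idx], "with probability", round(probs[idx], 2))
```

In both cases the sampler returns something; in the “unsure” case the choice is close to random, which is exactly when an invented answer can slip through.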
How are hallucinations mitigated?
ChatGPT hallucinations are a known problem in the field of natural language processing and artificial intelligence. Developers and researchers are constantly working to improve the consistency and accuracy of language models like ChatGPT, in order to minimize hallucinations and provide more accurate and meaningful responses.
The classification of hallucination mitigation techniques mirrors that of the causes: one can work on the data (through filtering operations), on the architecture (modifying the encoder and decoder), and on the training, for example through Reinforcement Learning, i.e. by providing the system, during training, with information about the amount of hallucinations it produces so that it can automatically try to correct itself.
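As a purely illustrative sketch of this last idea (the setup, the hallucination check, and the update rule are assumptions invented for the example, not the actual training procedure of ChatGPT), a reinforcement-style loop can penalize invented answers so that, over time, admitting ignorance becomes more likely:

```python
import random

# Toy illustration: for a question it cannot answer, the "model" either
# admits ignorance or invents an answer. A reward signal that penalizes
# hallucinations gradually shifts its behaviour.
random.seed(0)

p_invent = 0.9          # initial probability of inventing an answer
learning_rate = 0.05

def answer_unknown_question():
    return "Luminia" if random.random() < p_invent else "I don't know"

for step in range(200):
    reply = answer_unknown_question()                    # e.g. "capital of the Moon?"
    reward = 1.0 if reply == "I don't know" else -1.0    # penalize hallucination
    if reply != "I don't know":
        # Simple update: inventing answers becomes progressively less likely.
        p_invent = max(0.0, p_invent + learning_rate * reward)

print("probability of inventing an answer after training:", round(p_invent, 2))
```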
However, it should be considered that, according to recent publications [9], the phenomenon of hallucinations is inherent in the architecture and training of the currently most successful model in Natural Language Generation (NLG) applications, namely the Generative Pre-trained Transformer (GPT, the basis of ChatGPT).
Furthermore, reducing hallucinations also reduces creativity during content generation. Therefore, it is necessary to find a compromise between the two objectives based on the context in which the model is used.
Even at the user level, it is possible to work to limit the occurrence of hallucinations. This is the field of “prompt engineering”, which consists of strategically and accurately crafting the “prompt”, i.e. the input text that is provided to the artificial intelligence model to obtain the desired responses. In fact, the choice and formulation of the prompt are fundamental to obtaining coherent and relevant results from the generative AI system.
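As a concrete example (the prompts below are illustrative only; no specific model or API is assumed), compare a vague prompt with one that adds context, constraints, and an explicit way out for the model:

```python
# Illustrative prompts only; no specific model API is assumed.

vague_prompt = "Tell me about the capital of the Moon."

engineered_prompt = (
    "You are an astronomy assistant. Answer only with verified facts; "
    "if the question rests on a false premise or you are unsure, say so "
    "explicitly instead of guessing.\n"
    "Question: Does the Moon have a capital city?\n"
    "Answer in at most two sentences."
)

print(engineered_prompt)
```

The second prompt states the role, the factual constraint, and the acceptable fallback (“say so explicitly”), which makes an invented answer such as “Luminia” much less likely.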
What to do to avoid “mistaking fireflies for lanterns”
Although generative AI tools are showing enormous potential in various creative fields, their use also raises ethical concerns. For example, they can be used to create fake content or deep fakes, causing disinformation and possible harm. Therefore, it is fundamental to use these technologies responsibly and consciously, in order to produce verified creative content; but, above all, not to believe everything that is generated by an AI.
Therefore:
- Examine and verify the response: check the reliability of the source of the content. If it comes from a little-known or unverified source, further research may be needed before considering it true; in case of doubt, consult human experts or reliable sources for additional information.
- Cross-check with multiple sources: try to verify the information through multiple reliable sources. If no confirmation is found across different sources, it could be a sign that the content is false (see the sketch after this list).
- Look for errors and inconsistencies: carefully check the content for any factual errors or logical inconsistencies that may be a sign of hallucinations.
- Ask detailed questions: if you have doubts about AI-generated content, try asking more specific and complex questions to see if the model responds coherently and accurately.
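As a toy illustration of cross-checking (the sources and the keyword match are invented for the example; real fact-checking requires trusted references and semantic comparison, not substring matching), a claim can be flagged when it is not confirmed by several independent sources:

```python
# A claim is accepted only if it is confirmed by at least `minimum` sources.
sources = [
    "Mars is a planet of the Solar System, not a chemical element.",
    "The Moon is a natural satellite and has no capital city.",
    "Atomic numbers are defined for chemical elements only.",
]

def confirmed_by_sources(claim_keywords, sources, minimum=2):
    hits = sum(
        any(keyword.lower() in source.lower() for keyword in claim_keywords)
        for source in sources
    )
    return hits >= minimum

print(confirmed_by_sources(["capital of the Moon is Luminia"], sources))      # False
print(confirmed_by_sources(["chemical element", "atomic number"], sources))   # True
```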
Furthermore, considering that the growth of generative artificial intelligence capabilities represents an advantage for the entire community, we can hope for a synergy between users, who formulate the questions, and domain experts, who are responsible for evaluating the responses provided by the system. From this synergy, through the use of appropriate data-driven mathematical optimization models, we expect a virtuous process of iterative refinement of the prompt formulation based on the level of relevance and truthfulness of the results produced by the machine.
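A minimal sketch of such a loop might look like the following (ask_model, expert_score, and refine_prompt are hypothetical placeholders standing in for the generative system, the domain expert's evaluation of relevance and truthfulness, and the data-driven refinement step; none of them is a real API):

```python
def iterative_refinement(prompt, ask_model, expert_score, refine_prompt,
                         threshold=0.9, max_rounds=5):
    """Refine a prompt until the expert's score reaches the threshold."""
    history = []
    for _ in range(max_rounds):
        answer = ask_model(prompt)
        score = expert_score(prompt, answer)           # relevance and truthfulness in [0, 1]
        history.append((prompt, answer, score))
        if score >= threshold:                         # the expert accepts the answer
            break
        prompt = refine_prompt(prompt, answer, score)  # sharpen the prompt and retry
    return history

# Toy stand-ins so the sketch can actually run.
def ask_model(prompt):
    if "false premise" in prompt:
        return "The Moon has no capital."
    return "The capital of the Moon is Luminia."

def expert_score(prompt, answer):
    return 1.0 if "no capital" in answer else 0.0

def refine_prompt(prompt, answer, score):
    return prompt + " Check whether the question contains a false premise."

for prompt, answer, score in iterative_refinement(
        "What is the capital of the Moon?", ask_model, expert_score, refine_prompt):
    print(round(score, 1), "|", answer)
```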
It should be remembered that, despite the efforts to improve the accuracy of language models, the responses of ChatGPT and similar systems may still contain errors or hallucinations. The critical and responsible use of these tools is essential to avoid the spread of unverified or incorrect information.
By Luigi Simeone, Chief Technology Officer Moxoff