The term “hallucination” has gained prominence as a fascinating yet challenging aspect of generative artificial intelligence (AI). A hallucination in this context refers to a model’s output that is inaccurate, false, irrelevant, or entirely out of touch with reality. To illustrate, imagine asking a generative AI application for the top three drugs currently on the market for treating heart disease. The model might generate a response based on historical training data, even if one of the drugs has since been withdrawn from the market, rendering the result irrelevant. Hallucination isn’t limited to regurgitating old data; sometimes, the AI system creates entirely fictitious information. To address this issue, we must first investigate why generative AI systems hallucinate.
Why Does Generative AI Hallucinate?
While AI experts worldwide continue to explore the root causes of hallucination, several factors contribute to this phenomenon:
- Built to generate new content: Generative AI applications are designed to be creative and innovative. They are trained on extensive datasets from the internet and learn intricate patterns to predict the next word or phrase in a given context. This creativity is a hallmark of their capability. However, it’s essential to recognize that this trait can sometimes lead to the generation of entirely new content, even if it’s inaccurate or irrelevant. While the massive volume of data used for training can be a strength, identifying and addressing issues within it can be challenging.
- Encoding and decoding challenges: Large language models (LLMs) convert training texts and prompts into vector encodings, numerical representations of words and phrases. A word with multiple meanings adds complexity to this process: “drug,” for example, can refer to a medication or to an illegal, addictive substance, and the model must rely on context to select the intended sense. When the model generates text, it must also decode these vectors back into human-readable language. Errors in this encoding and decoding process can surface as inaccurate or incoherent content.
- Training data issues: The quality and accuracy of training data are critical in shaping the behavior of AI models. Outdated or erroneous data, information gaps, or the inclusion of false information can mislead the model during training. These inaccuracies in the training data can subsequently lead to hallucinations when the model generates content.
- Lack of context: Context is essential in language comprehension and generation. If a model’s training does not adequately expose it to the desired context, it may struggle to provide relevant responses during inference. Without a comprehensive understanding of the context, the AI system may produce content that is disjointed or unrelated to the given prompt, contributing to hallucinations.
- Lack of constraints: Language models should operate within specific constraints or guardrails to ensure that the content they generate aligns with expected standards. Without these constraints, AI models can produce outputs that are overly creative or entirely unrelated to the context, further exacerbating the hallucination problem.
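The first cause above can be seen in miniature with a toy next-word predictor. The tiny corpus and all sentences below are illustrative inventions, and real LLMs learn vastly richer statistics, but the failure mode is the same: the model continues text with statistically plausible words, with no notion of whether the result is true.

```python
import random

# A toy next-word model built from bigram counts over a tiny,
# made-up corpus. It predicts each next word only from observed
# continuations, exactly like an LLM but at miniature scale.
corpus = (
    "penicillin treats infections . "
    "aspirin treats headaches . "
    "aspirin treats fever ."
).split()

# Count which words follow which in the corpus.
bigrams = {}
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams.setdefault(prev, []).append(nxt)

def generate(start, max_words=3, seed=None):
    """Extend `start` by sampling observed continuations."""
    rng = random.Random(seed)
    words = [start]
    while len(words) < max_words and words[-1] in bigrams:
        words.append(rng.choice(bigrams[words[-1]]))
    return " ".join(words)

print(generate("penicillin", seed=1))
```

Because “treats” follows both drug names in the corpus, the sampler can blend patterns and emit a fluent but false claim such as “penicillin treats headaches” — the statistical essence of hallucination.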
How Can We Limit Hallucination?
Addressing the issue of hallucination in generative AI requires a multifaceted approach, often involving a combination of methods. Let’s explore some effective strategies:
- Prompt engineering: Prompt engineering involves tailoring the instructions and context provided to the AI model to elicit the desired output. This can include adding constraints and relevant context to guide the model’s responses. By offering specific and clear directives, you can help the AI system generate content that more closely aligns with your expectations. Additionally, assigning the model a particular role or context within the prompt can further improve the accuracy and relevance of its responses.
- Multishot prompting: Multishot prompting is a technique that involves supplying the AI model with several examples of the desired output format or context. This approach allows the model to recognize patterns and gain a more comprehensive understanding of the intended content. Exposing the AI system to multiple instances of the desired response can enhance its accuracy and consistency in generating content.
- Filtering and ranking: LLMs often expose adjustable sampling parameters that influence how prone the output is to hallucination. For instance, the “temperature” parameter scales the randomness of token selection: higher values lead to more diverse and creative output, while lower values make the output more deterministic and focused. Similarly, the “top p” parameter restricts sampling to the smallest set of candidate tokens whose cumulative probability exceeds the threshold p, dynamically adjusting the vocabulary considered at each step and helping keep the generated content contextually relevant.
- Using AI frameworks like RAG: Retrieval-augmented generation (RAG) is an advanced AI framework that enhances the quality of responses generated by LLMs. RAG combines generative capabilities with the ability to retrieve information from external sources. Grounding the model in external knowledge can supplement the LLM’s internal data representation. This approach helps reduce hallucination by ensuring the generated content is accurate and contextually relevant.
- Impose guardrails: Implementing guardrails for generative models is an effective strategy to mitigate hallucination. These guardrails serve as constraints or rules that guide the AI’s output generation process, ensuring the content remains within acceptable boundaries. Several categories exist: topical guardrails, which prevent the AI from commenting on specific sensitive topics; safety guardrails, which ensure accurate information and trustworthy sources; and security guardrails, which restrict connections to third-party apps and services that may introduce false or harmful input, among others.
- Human in the loop: Incorporating human oversight is crucial to preventing and limiting model hallucination. This strategy involves humans at various stages of the AI process, including during training and inference. Continuous human feedback to the model is vital in fine-tuning and ensuring that the AI system’s outputs align with desired standards. Human intervention serves as a safety net to catch and correct any hallucinatory content generated by the model.
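The multishot prompting strategy above can be sketched as a prompt-construction helper. The instruction text and example Q/A pairs below are illustrative placeholders, not from any specific product; the point is the structure: an instruction, a few worked examples in a fixed format, then the new query in that same format.

```python
# A minimal sketch of multishot (few-shot) prompt construction.
def build_few_shot_prompt(instruction, examples, query):
    """Assemble an instruction, worked Q/A examples, and a new query."""
    parts = [instruction, ""]
    for question, answer in examples:
        parts += [f"Q: {question}", f"A: {answer}", ""]
    parts += [f"Q: {query}", "A:"]  # model completes after the final "A:"
    return "\n".join(parts)

prompt = build_few_shot_prompt(
    instruction=("Answer from well-established facts only. "
                 "If you are not sure, say 'I don't know.'"),
    examples=[
        ("Which organ pumps blood through the body?", "The heart."),
        ("Which drug cures every heart disease?", "I don't know."),
    ],
    query="Name one class of drugs used to lower cholesterol.",
)
print(prompt)
```

Including an example where the correct answer is “I don’t know” shows the model that declining to answer is an acceptable pattern, which can discourage fabricated responses.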
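The temperature and top-p parameters described above can be implemented from scratch on a toy distribution. The token names and logit values below are invented for illustration; real systems apply the same math over full vocabularies.

```python
import math
import random

def sample_next_token(logits, temperature=1.0, top_p=1.0, seed=None):
    """Sample a token from a {token: logit} dict using temperature
    scaling and top-p (nucleus) filtering. Temperature must be > 0."""
    rng = random.Random(seed)
    # Temperature: divide logits before softmax. Lower values sharpen
    # the distribution (more deterministic); higher values flatten it.
    scaled = {t: l / temperature for t, l in logits.items()}
    m = max(scaled.values())
    exps = {t: math.exp(l - m) for t, l in scaled.items()}
    total = sum(exps.values())
    probs = {t: e / total for t, e in exps.items()}
    # Top-p: keep the smallest set of most-probable tokens whose
    # cumulative probability reaches p, then sample from that set.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cum = [], 0.0
    for tok, p in ranked:
        kept.append((tok, p))
        cum += p
        if cum >= top_p:
            break
    total_kept = sum(p for _, p in kept)
    r, acc = rng.random() * total_kept, 0.0
    for tok, p in kept:
        acc += p
        if acc >= r:
            return tok
    return kept[-1][0]

# Toy logits: one implausible token mixed in with two plausible ones.
logits = {"aspirin": 2.0, "statins": 1.5, "unicorn-dust": -1.0}
# Low temperature plus a tight nucleus all but guarantees the most
# likely token and excludes the implausible one entirely.
print(sample_next_token(logits, temperature=0.1, top_p=0.5, seed=0))  # → aspirin
```

With `temperature=0.1` the scaled distribution concentrates almost all mass on “aspirin,” and `top_p=0.5` then trims the candidate set to that single token, which is why tighter settings yield more focused, less hallucination-prone output.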
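The RAG framework described above can be sketched end to end. The document store and word-overlap scoring below are toy stand-ins for a real vector index, and every document and name is illustrative, but the flow matches the framework: retrieve relevant text, then ground the prompt in it before the model answers.

```python
import re

# Toy document store standing in for an external knowledge source.
documents = [
    "Statins are widely prescribed to lower cholesterol.",
    "Aspirin is used for pain relief and heart attack prevention.",
    "Beta blockers reduce blood pressure and heart rate.",
]

def tokens(text):
    """Lowercase alphanumeric tokens, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, docs, k=2):
    """Rank documents by word overlap with the query; keep the top k."""
    q = tokens(query)
    return sorted(docs, key=lambda d: len(q & tokens(d)), reverse=True)[:k]

def build_grounded_prompt(query, docs):
    """Prepend retrieved context so the model answers from it."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}\nAnswer:"
    )

print(build_grounded_prompt("Which drugs lower cholesterol?", documents))
```

The instruction “using only the context below” is what grounds the model: instead of free-associating from training data, it is steered toward retrieved, verifiable text.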
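A topical guardrail of the kind described above can be sketched as a pre-screening step. The blocked-topic list, trigger phrases, and refusal message below are illustrative placeholders; production guardrail frameworks typically use trained classifiers rather than keyword matching, but the control flow is the same: screen the prompt before it ever reaches the model.

```python
# Illustrative blocked topics mapped to simple trigger phrases.
BLOCKED_TOPICS = {
    "medical dosing": ["dosage", "how many pills", "mg per day"],
    "legal advice": ["is it legal", "can i sue"],
}
REFUSAL = "I can't help with that topic. Please consult a professional."

def apply_guardrail(prompt):
    """Return (allowed, text): the prompt if allowed, else a refusal."""
    text = prompt.lower()
    for topic, phrases in BLOCKED_TOPICS.items():
        if any(phrase in text for phrase in phrases):
            return False, REFUSAL  # blocked: never reaches the model
    return True, prompt  # allowed: pass the prompt through

print(apply_guardrail("What dosage of aspirin should I take daily?"))
```

The same pattern can be applied on the output side, screening generated text against the guardrail before it is shown to the user.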
By employing these strategies, we can significantly enhance the reliability and accuracy of generative AI while harnessing its creative potential and innovation. Collectively, these approaches help balance AI creativity with content quality.
Generative AI, like any technology, is not without its imperfections and can make mistakes. Interestingly, there are similarities between how humans and AI systems hallucinate, as both attempt to “fill in the gaps” when faced with missing information. Nevertheless, generative AI wields immense power, and we can fully harness its potential with proper control and governance. By understanding the causes of hallucinations and implementing appropriate strategies, we can significantly improve the reliability and accuracy of generative AI applications.
Although we’re still unraveling the underlying causes, it’s heartening to know that we can employ a myriad of strategies to curb hallucinations. From prompt engineering to implementing guardrails and human oversight, these approaches can collectively pave the way for a more reliable and trustworthy AI landscape. As we navigate the frontiers of this transformative technology, it’s our responsibility to ensure that generative AI adheres to the highest standards of accuracy and relevance while continuing to inspire us with its creativity. Through ongoing research, innovation, and ethical implementation, we can benefit from the full potential of generative AI, creating a future in which the hallucinations are easier to distinguish from reality.