
Generative Artificial Intelligence

This guide provides tips on using (and not using) generative AI for library-related tasks such as searching for information, tracing claims to sources, writing papers, and citing sources.

Errors Built into Generative AI

Errors, Biases, and "Hallucinations"

Generative AI creates material similar to what appeared in its training data, so errors and biases in that training data carry through to its output. It can also produce statements that look superficially like correct information but are in fact incorrect ("hallucinations").

 

General chatbots, such as ChatGPT and Gemini

[Flowchart: "Is it safe to use ChatGPT for your task?" It advises against use when accuracy is important and your ability to validate the output is limited.]

  • The predictive models behind chatbots use probability to choose the most likely words and phrases to come next (a minimal sketch appears after this list). As a result, responses to factual questions generally sound just as plausible when they are incorrect as when they are correct. One way to reduce this problem is to require the chatbot to provide references showing where its answer can be found. 
  • But references provided by chatbots may themselves be fabrications. A chatbot will sometimes cite real sources that do support the text it generated, but it will also sometimes cite real sources that don't contain the ideas it attributed to them, and sometimes invent citations to sources that don't exist but resemble what a real citation would look like. 
  • So for factual information you don't already know, it's best to verify the chatbot's answer against a credible source. 
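
To see how this plays out, here is a minimal sketch of next-word prediction in Python. The toy vocabulary and its probabilities are invented for illustration; real chatbots sample from the same kind of probability distribution, just learned over vastly more data.

```python
import random

# Toy next-word model: the probability of each next word given the
# current word. The words and numbers are invented for illustration;
# a real chatbot learns such probabilities from its training data.
NEXT_WORD_PROBS = {
    "the":     {"capital": 0.4, "library": 0.35, "moon": 0.25},
    "capital": {"of": 1.0},
    "of":      {"France": 1.0},
    "France":  {"is": 1.0},
    "is":      {"Paris.": 0.6, "Lyon.": 0.4},
}

def generate(start: str, max_words: int = 6) -> str:
    """Sample each next word according to the model's probabilities."""
    words = [start]
    while len(words) < max_words:
        options = NEXT_WORD_PROBS.get(words[-1])
        if not options:
            break
        choices, weights = zip(*options.items())
        words.append(random.choices(choices, weights=weights)[0])
    return " ".join(words)

print(generate("the"))
# Both "the capital of France is Paris." and "the capital of France
# is Lyon." are possible outputs, and both read as equally fluent:
# the model tracks only probability, never truth.
```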

 

Retrieval-augmented text generation, such as on JSTOR and scite.ai

  • Some AI tools work in conjunction with human-created works. Instead of relying on the predictive model alone, the tool first sends your query to a collection of human-created items, then uses the predictive model to generate a summary of the relevant passages from those items (a sketch of this pipeline appears after this list). 
  • Generally these tools automatically display which item or items are being summarized, making it easier to compare the original human-created item with the AI-created description of it. 
  • But even these AI-created summaries can be incorrect: the retrieved items may not be relevant to the question, or the summary may misrepresent the original or mix in predictive text that was never in it. It still pays to check against the original, and checking can be much faster because the link to the original is right there. 
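
Below is a minimal sketch of that retrieve-then-summarize pipeline, continuing the Python example. The collection, the keyword scoring, and the llm_summarize stand-in are all hypothetical simplifications, not the actual design of JSTOR's or scite.ai's tools.

```python
# Minimal retrieval-augmented generation (RAG) sketch. COLLECTION,
# retrieve, and llm_summarize are hypothetical stand-ins for a real
# search index and a real language model.

COLLECTION = [
    {"title": "Article A", "text": "Deep fakes are synthetic media made with generative AI."},
    {"title": "Article B", "text": "Library catalogs index human-created books and journals."},
    {"title": "Article C", "text": "Generative AI predicts likely next words from training data."},
]

def retrieve(query: str, k: int = 2) -> list[dict]:
    """Naive retrieval: rank items by word overlap with the query."""
    terms = set(query.lower().split())
    ranked = sorted(
        COLLECTION,
        key=lambda item: len(terms & set(item["text"].lower().split())),
        reverse=True,
    )
    return ranked[:k]

def llm_summarize(passages: list[str]) -> str:
    """Stand-in for a language-model call (hypothetical)."""
    return "Summary of retrieved passages: " + " / ".join(p[:40] for p in passages)

def answer(query: str) -> dict:
    sources = retrieve(query)
    # The model summarizes only the retrieved passages, and the tool
    # returns the sources alongside the summary so the reader can
    # compare the AI-created text against the originals.
    summary = llm_summarize([s["text"] for s in sources])
    return {"summary": summary, "sources": [s["title"] for s in sources]}

print(answer("what are deep fakes in generative AI?"))
```

The design point is the last step: because the tool returns its sources along with the summary, a reader can quickly check whether the summary actually matches them.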

 

Deliberate Misrepresentation

Fakes and Deep Fakes

People sometimes use generative AI for deliberate misrepresentation: sometimes to claim authorship of work that is not their own, and sometimes to create true-seeming information that is in fact false. 

Deep fakes: GenAI tools are sometimes used intentionally to create false images, videos, and voice recordings that mislead the audience into believing they are real. These "deep fakes" can be especially dangerous when they are used to misrepresent political leaders or historical events.

For more information about the unique challenges related to deep fakes, please check out the RadioLab segment "Breaking News".