Today’s large language models operate by scouring massive amounts of training data, much of it drawn from the web. The sheer magnitude of that material forces researchers to employ compression techniques with inherent flaws. The results are entertaining and often useful, but a far cry from human creative prose. Author Ted Chiang compares OpenAI’s ChatGPT to a “blurry JPEG” of the internet rather than a reliable source of answers.
- Large language models and Xerox photocopiers make similar errors when compressing information.
- Lossy compression helps explain why large language models sometimes produce nonsensical answers or “hallucinations.”
- Despite their flaws, large language models can seem lucid.
- To inspire user confidence, large language models must be trustworthy.
- Creative writing presents a complex AI issue.
Large language models and Xerox photocopiers make similar errors when compressing information.
File compression requires encoding, or converting, text into a more compact version – and then decoding it, or reversing the process. If the original and decoded files are identical, the compression is “lossless.” If the restored material is missing information, it is called “lossy.” Typically, lossless compression is used for text storage and computer programs, while lossy compression is used to save images and multimedia materials. Lossy compression isn’t usually noticed unless it causes “compression artifacts” such as image fuzziness or subpar audio.
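The distinction can be sketched in a few lines of Python (an illustrative analogy only, not the compression any actual model uses): a lossless round trip through zlib recovers the original exactly, while a crude “lossy” step that discards words can never be undone.

```python
import zlib

text = b"When supply is low, prices rise. " * 100

# Lossless: decoding the compressed bytes recovers the original exactly.
restored = zlib.decompress(zlib.compress(text))
assert restored == text

# Lossy (crude illustration): throw away every other word. No decoder
# can recover the dropped words -- the information is gone for good.
words = text.decode().split()
lossy = " ".join(words[::2])
assert lossy != text.decode()

print(len(zlib.compress(text)), len(text))  # compressed size << original
```

Repetitive text compresses well losslessly; lossy schemes buy even smaller files by accepting that the restored version only approximates the original.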
“You’re still looking at a blurry JPEG, but the blurriness occurs in a way that doesn’t make the picture as a whole look less sharp.”
Photocopiers sometimes eliminate important details, so long as the copies look similar enough to the originals. In the same way, language models such as OpenAI’s ChatGPT provide an approximation of grammatical text, based on the immense volume of material they retain and assemble to answer questions. By exploiting statistical regularities, GPT’s responses are generally acceptable. But because its answers rely on lossy compression, they represent a “blurry JPEG of all text on the web.”
Lossy compression helps explain why large language models sometimes produce nonsensical answers or “hallucinations.”
Recognizing these errors often requires comparing AI-fabricated answers with original materials. Such errors are to be expected if 99% of the training material has been eliminated from a language model’s database due to compression.
Lossy compression algorithms commonly use “interpolation,” a process by which the computer decides which information is missing from a piece of text by noting what appears on either side of it. It’s similar to an AI reconstructing pixels to display a compressed photo by calculating the appearance of adjacent pixels.
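That neighbor-averaging idea can be made concrete with a minimal sketch (a hypothetical illustration; real codecs use far more sophisticated prediction):

```python
def fill_gaps(values):
    """Estimate each missing (None) entry as the average of its two
    immediate neighbors -- simple linear interpolation. Assumes gaps
    are single samples and never fall at the edges of the sequence."""
    out = list(values)
    for i, v in enumerate(out):
        if v is None:
            out[i] = (out[i - 1] + out[i + 1]) / 2
    return out

# A row of pixel brightness values with one missing sample:
print(fill_gaps([100, None, 140]))  # [100, 120.0, 140]
```

The decoder never knew the missing value; it manufactures a plausible one from what surrounds it, which is exactly why the result looks right without being right.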
“ChatGPT is so good at this form of interpolation that people find it entertaining: They’ve discovered a ‘blur’ tool for paragraphs instead of photos, and are having a blast playing with it.”
Describing large language models like ChatGPT as “lossy text-compression algorithms” corrects a tendency to humanize them. Scientist [and DeepMind senior researcher] Marcus Hutter’s Prize for Compressing Human Knowledge offers an apt analogy. He awards a prize for losslessly compressing one specific gigabyte of Wikipedia text into a smaller file than previously achieved. The most recent winner reduced the gigabyte to 115 megabytes (zip compression would yield a file of roughly 300 megabytes). Hutter believes that better compression will enable higher levels of artificial intelligence.
“If a large language model has compiled a vast number of correlations between economic terms – so many that it can offer plausible responses to a wide variety of questions – should we say that it actually understands economic theory?”
For example, if a text file contains one million math problems, the best compression ratio would most likely come from deriving the principles of arithmetic and then writing calculator code. That logic also applies to compressing large swaths of text. When a language model ingests plentiful information about “supply and demand,” it determines which words to throw away as it scours the material. If the model finds that a phrase such as “supply is low” usually appears near the phrase “prices rise,” it may answer that supply shortages lead to increased prices.
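The trade-off can be sketched concretely (an illustrative comparison, not how any model is actually trained): a lookup table of answers dwarfs the few bytes of “calculator code” that could regenerate every one of them.

```python
import sys

# One way to "compress" ten thousand addition problems: store every answer.
table = {(a, b): a + b for a in range(100) for b in range(100)}

# The other way: store the rule that generates all of them.
rule = "lambda a, b: a + b"

# The table's container alone takes hundreds of kilobytes;
# the rule fits in 18 characters.
print(sys.getsizeof(table), len(rule))
```

Grasping the rule is the more extreme compression, which is why the quality of a model’s compression looks so much like understanding.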
“Given GPT-3’s failure at a subject taught in elementary school, how can we explain the fact that it sometimes appears to perform well at writing college-level essays?”
The large language model GPT-3 can reliably add or subtract two small numbers, but its accuracy declines quickly on problems involving larger numbers. One flaw: it does not “carry the one” when performing arithmetic.
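Carrying is the step grade-school addition hinges on, as a short sketch shows (illustrative code, not anything GPT-3 actually executes):

```python
def add_with_carry(a: str, b: str) -> str:
    """Grade-school addition on decimal strings: sum digit by digit
    from the right, carrying the one into the next column."""
    a, b = a.zfill(len(b)), b.zfill(len(a))  # pad to equal length
    carry, digits = 0, []
    for da, db in zip(reversed(a), reversed(b)):
        total = int(da) + int(db) + carry
        digits.append(str(total % 10))  # keep the ones digit
        carry = total // 10             # carry the one (or zero)
    if carry:
        digits.append(str(carry))
    return "".join(reversed(digits))

print(add_with_carry("57", "68"))  # "125"
```

A model that has memorized statistical patterns among digit strings can mimic small sums, but without this carrying procedure its answers drift as the numbers grow.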
Despite their flaws, large language models can seem lucid.
Some propose that the statistical regularities of text correspond to real-world knowledge. A simpler explanation could be that ChatGPT appears to be more human precisely because it is not a lossless algorithmic system. If it consistently provided simple quotes from web pages, it would operate like a glorified search engine. In the same fashion, simple memorization of facts by students does not equal learning or understanding.
“When we’re dealing with sequences of words, lossy compression looks smarter than lossless compression.”
Visualizing large language models as a “blurry JPEG” helps us more clearly understand their suitable applications.
To inspire user confidence, large language models must be trustworthy.
Restating information with various word choices differs from outright fact fabrication, but AI’s blurriness issues must also be addressed.
Using these models to create web content only makes sense if they are repackaging existing information, as content mills do. This could help such operations avoid copyright infringement, but would make finding reliable information on the internet more difficult, as the web “becomes a blurrier version of itself.”
If a large language model starts generating text reliable enough to train a new model, it will inspire excitement and confidence; but that would require new model-building techniques.
Creative writing presents a complex AI issue.
Few people consider a photocopy an original work of art, yet that is comparable to the text large language models generate. It’s possible that writers could use these models as a starting point, or to create “boilerplate” copy, while directing their creative focus elsewhere.
“Obviously, no one can speak for all writers, but let me make the argument that starting with a blurry copy of unoriginal work isn’t a good way to create original work.”
Most writers produce unoriginal work before writing something original. Reworking sentences and phrases reveals meaningful prose over time. Writing essays helps students articulate their thoughts, even when they write about subjects previously explored. A similar process takes place every time a writer begins a new piece, revealing valuable original ideas.
Some observers say that a large language model’s output resembles a writer’s first draft, but the resemblance is superficial. A first draft is typically a poorly expressed original idea that needs rewriting. Starting with an AI version means the writer proceeds from unoriginal sources.
“There’s nothing magical or mystical about writing, but it involves more than placing an existing document on an unreliable photocopier and pressing the Print button.”
No one can predict the arrival of that momentous day when AI writes original prose based on its own world experiences. Large language models that simply rephrase the web might be useful if humanity lost access to the internet and had to compress that mass of information onto servers, but the system would need to infallibly avoid hallucinations and fabrications. Today’s most critical question remains: “Just how much use is a blurry JPEG, when you still have the original?”
About the Author
Ted Chiang is an award-winning author of science fiction. In 2016, the title story from his first collection, Stories of Your Life and Others, was adapted into the film Arrival. He lives in Bellevue, Washington, where he works as a freelance technical writer.