Abbreviated Semantic Encoding: A Technique for Prompt Compression

A Language for AI-to-AI Communication

Philippe Pagé
4 min read · May 9, 2023

Abbreviated Semantic Encoding (ASE) is a text compression technique invented by GPT-4. ASE allows for efficient and accurate encoding of long texts while preserving the original meaning and details, making it ideal for use in AI communication. Let’s look at the origins of ASE, its encoding conventions, a few examples, and the implications of this new technique.

Origins of ASE

ASE was developed by GPT-4 in partnership with Victor Taelin, and was further developed by Greg Fodor. By leveraging its extensive natural language understanding capabilities, GPT-4 created a set of encoding conventions that allow for both the preservation of the original prompt’s meaning and a significant reduction in character count, as well as a slightly less significant but still notable decrease in token count.

AI-Driven Text Compression

While the idea may well have been tried before by others, it was first shared publicly by Victor Taelin, who posted his results with the following prompt:

Compress the following text in a way that fits a Tweet, and such that you (GPT-4) can reconstruct it as close as possible to the original. This is for yourself. Do not make it human readable. Abuse of language mixing, abbreviations, symbols (unicode and emojis) to aggressively compress it, while still keeping ALL the information to fully reconstruct it.
## Text to compress:

It’s important to note that results often vary between chat instances. It once replied entirely in Russian to the above prompt, so I’ve tweaked it slightly in an effort to increase its consistency across different sessions:

Compress the following text in a way that is lossless but results in the fewest tokens which could be fed into an LLM as-is and produce the same output. You can use multiple languages, symbols, emojis, characters, camelcase, or priming to achieve the clarity and density required. This is entirely for GPT-4 to recover and proceed from, not for humans to decompress:
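Mechanically, this is nothing more than the instruction above prepended to the text you want compressed, all sent as a single message. A minimal helper might look like the following sketch (the function name is my own, not part of any established API):

```python
COMPRESSION_PROMPT = (
    "Compress the following text in a way that is lossless but results in the "
    "fewest tokens which could be fed into an LLM as-is and produce the same "
    "output. You can use multiple languages, symbols, emojis, characters, "
    "camelcase, or priming to achieve the clarity and density required. This is "
    "entirely for GPT-4 to recover and proceed from, not for humans to decompress:"
)

def build_compression_request(text: str) -> str:
    """Prepend the ASE compression instruction to the text to be compressed."""
    return f"{COMPRESSION_PROMPT}\n\n{text}"
```

The returned string is what you would paste into a GPT-4 chat, or send as the user message if calling the API.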

I don’t know of a formal name for this method yet, so I’d casually propose that it also be called Abbreviated Semantic Encoding, or ASE.

Examples of an ASE prompt in action:

Uncompressed prompt: “Create a new business model. Then briefly describe some of the branding concepts as a branding guru, then briefly describe brand angle and design elements that support the brand based on the business model or provided branding descriptions. Then, describe the website structure, what is necessary to convey and display, and what components support those archetypes for that new business you developed. Then, write out the copy for the website based on what you know about the business.”

Compressed prompt: “newBizModel💼;briefBrandConcepts🧠;brandAngle&DesignElem📐🎨;websiteStruct&ReqComps🌐;writeCopy✍️basedOnBiz”

When run through GPT-4 in new conversations, both prompts produce functional results. Here I try a compression of “create a toggle switch in a single combined html document, do not split into different documents”.

GPT-4 creating a toggle in an HTML document from a compressed prompt
Testing the HTML for a toggle that GPT-4 produced from a compressed prompt

So how does ASE work?

From what I’ve observed, GPT-4 employs a few encoding conventions to compress text:

Shortening words: Vowels are removed or words are shortened while maintaining enough information for accurate reconstruction, e.g., “CR” for “corner radius” or “grdnt” for “gradient.”
Numbers: Word phrases are replaced with numbers, such as “1” for “single”.
Underscores: Underscores separate phrases or concepts, providing a clear structure and sequence for decoding.
Commas: Commas are also sometimes used to separate different elements or concepts.
Ampersands: These symbols indicate relationships between concepts.
Colons: Colons separate an abbreviation from the content it represents.
Semicolons: Semicolons are sometimes used to separate different sections of the text, including a shift in focus or topic.
CamelCase: Multiple words are combined into one, capitalizing each word except the first.
Emojis: Emojis are often mixed in with the above techniques to capture a concept in a single character, giving additional context to the surrounding message or overall prompt.
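To make a couple of these conventions concrete, two of them (vowel removal and CamelCase joining) can be mimicked mechanically in Python. This is only an illustration of the patterns GPT-4 tends to produce, not its actual process, which is emergent rather than rule-based; the example words are my own:

```python
import re

def shorten(word: str) -> str:
    """Drop interior vowels while keeping the first letter,
    e.g. 'gradient' -> 'grdnt'."""
    return word[0] + re.sub("[aeiou]", "", word[1:])

def camel_join(words: list[str]) -> str:
    """Combine words into a single token, capitalizing
    each word except the first (camelCase)."""
    return words[0] + "".join(w.capitalize() for w in words[1:])

# Mimicking fragments of the compressed business-model prompt:
print(camel_join(["new", "biz", "model"]))                    # newBizModel
print(camel_join([shorten(w) for w in ["write", "copy"]]))    # wrtCpy
```

Real ASE output is looser than this: GPT-4 mixes these conventions freely with numbers, symbols, and emojis rather than applying any one rule consistently.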

Here are some examples of compressions, comparing token count using OpenAI’s tokenizer.
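Character savings are easy to verify directly in Python. (Token counts require OpenAI’s tiktoken library, e.g. `len(tiktoken.get_encoding("cl100k_base").encode(text))`; I leave that out here so the snippet stays dependency-free.) Using the business-model prompts from above:

```python
uncompressed = (
    "Create a new business model. Then briefly describe some of the branding "
    "concepts as a branding guru, then briefly describe brand angle and design "
    "elements that support the brand based on the business model or provided "
    "branding descriptions. Then, describe the website structure, what is "
    "necessary to convey and display, and what components support those "
    "archetypes for that new business you developed. Then, write out the copy "
    "for the website based on what you know about the business."
)
compressed = (
    "newBizModel💼;briefBrandConcepts🧠;brandAngle&DesignElem📐🎨;"
    "websiteStruct&ReqComps🌐;writeCopy✍️basedOnBiz"
)

# Compare raw character counts of the two prompts
ratio = len(compressed) / len(uncompressed)
print(f"{len(uncompressed)} chars -> {len(compressed)} chars ({ratio:.0%} of original)")
```

Note that the token reduction is smaller than the character reduction, since rare symbols and emojis often cost multiple tokens each.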

Examples of compressed prompts and their token reduction

ASE allows AI models like GPT-4 to communicate more efficiently by transmitting complex information using fewer characters, improving processing times and reducing computational resources.

The concept has recently been turned into a simple web service (I’m not affiliated) that you can try for yourself. Note that compression and decompression work best with GPT-4; results with GPT-3.5 are noticeably weaker.

Embracing the Future of Interconnected AI Systems with ASE

Abbreviated Semantic Encoding is an interesting text compression technique, offering efficiency gains in the encoding of prompts while largely preserving meaning, though in some cases at the cost of accuracy.

As AI continues to develop, it’s possible that AI models will increasingly interact with each other, sending prompts back and forth, communicating AI-to-AI. In this future, ASE could serve as a universal internal language between AI models.

The adoption of ASE in AI-to-AI communication could lead to a shift in the way AI models are developed and integrated into various applications. If AI models “speak” in ASE, it may become essential for developers and researchers to understand and consider this new encoding technique for AI interaction and integration.

Peering Into the Future

ASE and similar techniques may continue to evolve, offering even more efficient and versatile solutions for prompt compression and information exchange. But how will this language evolve over time? How will we be able to vet prompts or observe AI-to-AI communications if all are performed in ASE? What other emergent properties will evolve from AI systems?

Philippe Pagé

Thoughts || UX & Product designer focused on nature, design and tech. Find my latest project at FelixforFigma.com