Tech Insight
How much can we save through compression?
Estimating the cost savings from model compression.
Semin Cheon
Jun 25
‘Breaking Down’ tokenizers in LLMs
An introduction to tokenizers and their implications in language models.
Semin Cheon
May 9
SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks
Accelerating LLM inference by pruning redundant transformer blocks
Semin Cheon
May 7
Accuracy Degradation in AI Compression: Myth or Truth?
Clarifying the misunderstandings in AI model compression
Semin Cheon
Apr 22
Are you getting everything out of your GPUs?
At the 2024 GTC event, Nvidia CEO Jensen Huang took the stage to deliver his keynote speech, in which he unveiled the newest GPU…
Semin Cheon
Apr 2
4 Types of AI Compression Methods You Should Know
AI model compression for acceleration is essential. The question is HOW? Here are 4 key methodologies.
Semin Cheon
Mar 20
Things to check if your business utilizes AI
Do I need to COMPRESS my AI model? The short answer is “Yes,” and here’s why.
Semin Cheon
Mar 13
About SqueezeBits Team Blog