Tech Insight
How much can we save through compression?
Estimating the cost savings from model compression.
Semin Cheon
Jun 25
‘Breaking Down’ tokenizers in LLMs
An introduction to tokenizers and their implications in language models.
Semin Cheon
May 9
SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks
Accelerating LLM inference by pruning redundant transformer blocks
Semin Cheon
May 7
Accuracy Degradation in AI Compression: Myth or Truth?
Clarifying the misunderstandings in AI model compression
Semin Cheon
Apr 22
Are you getting everything out of your GPUs?
At the 2024 GTC event, Nvidia CEO Jensen Huang took the stage to deliver his keynote speech, in which he unveiled the newest GPU…
Semin Cheon
Apr 2
4 Types of AI Compression Methods You Should Know
AI model compression for acceleration is essential. The question is HOW? Here are 4 key methodologies.
Semin Cheon
Mar 20
Things to check if your business utilizes AI
Do I need to COMPRESS my AI model? The short answer is “Yes,” and here’s why.
Semin Cheon
Mar 13
About SqueezeBits Team Blog