Semin Cheon in SqueezeBits Team Blog

- How much can we save through compression? — Estimating the cost savings from model compression. (Jun 26)
- ‘Breaking Down’ tokenizers in LLMs — An introduction to tokenizers and their implications in language models. (May 10)
- SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks — Accelerating LLM inference by pruning redundant transformer blocks. (May 7)
- Accuracy Degradation in AI Compression: Myth or Truth? — Clarifying the misunderstandings in AI model compression. (Apr 23)
- Are you getting everything out of your GPUs? — At the 2024 GTC event, Nvidia CEO Jensen Huang got on the stage to deliver his keynote speech, in which he divulged the newest GPU… (Apr 2)
- 4 Types of AI Compression Methods You Should Know — AI model compression for acceleration is essential. The question is HOW? Here are 4 key methodologies. (Mar 21)
- Things to check if your business utilizes AI — Do I need to COMPRESS my AI model? The short answer is “Yes” — and here’s why. (Apr 19)