
FriendliAI Tech & Research Blog

Supercharge Generative AI Serving: cut costs with the fastest serving engine on the market

Accelerating LLM Training with Memory-Balanced Pipeline Parallelism

FriendliAI Tech & Research
Jul 12
PeriFlow’s Enriched Coverage for Sought-After LLMs: MPT, LLaMA, and Dolly

We have some exciting news to share!
FriendliAI Tech & Research
Jul 2
Get an Extra Speedup of LLM Inference with Integer Quantization on PeriFlow

At FriendliAI, our top priority is to deliver a serving system with the best performance. We are excited to introduce a new feature that…
FriendliAI Tech & Research
Jun 26
Fine-tuning and Serving CodeGen, a Code Generation Model, with PeriFlow

CodeGen, unveiled in 2022 by Salesforce, is a language model that allows users to create programs with natural language instead of having…
FriendliAI Tech & Research
Jan 16
Save on Training Costs of Generative AI with PeriFlow

Generative AI is already widely used for chatbots, translation, code generation, summarization, image generation, and much more. Thanks to…
FriendliAI Tech & Research
Oct 31, 2022
Serve generative AI models like T5 faster than ever with PeriFlow (32.8x faster for T5–3B)

In our previous blog articles (#1, #2), we showed the performance gain of PeriFlow (aka Orca) on GPT3, a popular generative AI model. Orca…
FriendliAI Tech & Research
Oct 7, 2022
PeriFlow: How Good is it on Small Models?

We showed the dramatic performance gain (cost saving) of PeriFlow (aka Orca) running large-scale generative models like GPT 175B, thanks to…
FriendliAI Tech & Research
Aug 3, 2022