Published in MLearning.ai
BLOOM 176B — how to run a real LARGE language model in your own cloud?

Many of us use GPT-3 or other LLMs in a SaaS way, hosted by their vendors. But what is it like to run a model the size of GPT-3 in your own cloud?

Model size
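BLOOM has 176 billion parameters. A quick back-of-the-envelope sketch (my own arithmetic, not figures from this article; it assumes 2 bytes per parameter in bfloat16 and ignores activations and KV caches) shows why hosting it is demanding:

```python
import math

# Assumption: weights stored in bfloat16/float16 (2 bytes per parameter).
# Activations, KV caches and any optimizer state would add more on top.
NUM_PARAMS = 176_000_000_000   # 176B parameters
BYTES_PER_PARAM = 2            # bfloat16 / float16

weights_gib = NUM_PARAMS * BYTES_PER_PARAM / 1024**3
# How many 80 GB A100 GPUs the weights alone would occupy:
gpus_needed = math.ceil(weights_gib / 80)

print(f"Weights: ~{weights_gib:.0f} GiB -> at least {gpus_needed} x 80GB A100 (weights only)")
```

The weights alone come to roughly 328 GiB, so the model cannot fit on any single accelerator and must be sharded across several large GPUs.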

Hosting Setup


Boot the model


Use the model

import json

import boto3

# SageMaker runtime client; endpoint_name is the name of the deployed BLOOM endpoint.
smr_client = boto3.client("sagemaker-runtime")

response = smr_client.invoke_endpoint(
    EndpointName=endpoint_name,
    Body=json.dumps(
        {
            "input": "The BLOOM large language model is a",
            "gen_kwargs": {
                "min_length": 5,
                "max_new_tokens": 100,
                "temperature": 0.8,
                "num_beams": 5,
                "no_repeat_ngram_size": 2,
            },
        }
    ),
    ContentType="application/json",
)
generated_text = response["Body"].read().decode("utf8")
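The `gen_kwargs` in the request body correspond to Hugging Face transformers `generate()` arguments, which the inference script behind the endpoint passes through to the model. A small self-contained sketch of the same JSON body, with each parameter annotated (no AWS call involved, just the payload):

```python
import json

# The same request body as in the invoke_endpoint call above.
payload = {
    "input": "The BLOOM large language model is a",
    "gen_kwargs": {
        "min_length": 5,            # shortest allowed output, in tokens
        "max_new_tokens": 100,      # cap on newly generated tokens
        "temperature": 0.8,         # < 1.0 sharpens the sampling distribution
        "num_beams": 5,             # beam-search width
        "no_repeat_ngram_size": 2,  # never repeat any bigram in the output
    },
}

body = json.dumps(payload)
# The endpoint receives exactly this JSON string; round-tripping it
# confirms the body is valid JSON with the expected structure.
decoded = json.loads(body)
```

Tuning these values trades quality for latency: beam search with `num_beams=5` generates five candidate sequences in parallel, so it is noticeably slower than greedy decoding on a model of this size.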

Maximilian Vogel

Machine learning, large language models, NLP enthusiast and speaker. Co-founder BIG PICTURE.