Using the Hugging Face Pipeline for Local Text-to-Image Generation

Weibert Weiberson

7 min read · May 27, 2024

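Hugging Face makes working with AI far more convenient. It hosts a large number of models, which makes it a real lifesaver for AI engineers today. (Combining Hugging Face with LangChain makes it even more powerful.)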

This article uses text-to-image generation as an example to show how to use a Hugging Face pipeline.

GitHub repository: weitsung50110/Huggingface_Langchain_kit

A GitHub repo dedicated to using Hugging Face and LangChain.

This article uses the diffuser.py file from that repo.

Huggingface_Langchain_kit/diffuser.py at master · weitsung50110/Huggingface_Langchain_kit

Table of contents
The two main ways to use Hugging Face
Code walkthrough
Running the program
Results

The two main ways to use Hugging Face today

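1. Use the Serverless Inference API. You need to sign up and log in to generate an API_TOKEN. The advantage is that inference runs online and uses no local resources. The drawback is that it is rate-limited, and the limits may become stricter in the future.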

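2. Run the model locally, which is what this article does. This works even if you do not have a Hugging Face account. The drawback is that it uses some local resources.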
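To make the contrast between the two approaches more concrete, here is a minimal sketch of what a request to the hosted API might look like. It assumes the endpoint format `https://api-inference.huggingface.co/models/<model_id>` with a bearer-token header, as documented around the time of writing. The endpoint may have changed since, and a given model is not guaranteed to be served by the hosted API. The function name `build_inference_request` and the placeholder token are hypothetical. The request is only built here, not sent.

```python
import json
import urllib.request

# Assumption: endpoint format of the Serverless Inference API around mid-2024.
API_BASE = "https://api-inference.huggingface.co/models"


def build_inference_request(prompt, model_id, api_token):
    """Build (but do not send) a text-to-image request for the hosted API."""
    payload = json.dumps({"inputs": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/{model_id}",
        data=payload,
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_inference_request(
        "a cartoon of Taiwanese boy",
        "runwayml/stable-diffusion-v1-5",
        "hf_xxx",  # hypothetical placeholder; use your own API_TOKEN
    )
    print(req.get_method(), req.full_url)
    # Sending it would look roughly like this (needs network access and a valid token):
    # with urllib.request.urlopen(req) as resp:
    #     image_bytes = resp.read()
```

The local approach used in the rest of this article needs none of this: no token, no network request per image, and no rate limit, at the cost of downloading the model and running it on your own hardware.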

Hugging Face has already organized its models into categories, so you can simply pick the one you want to use.

This article uses runwayml/stable-diffusion-v1-5.

Code walkthrough

1. The packages we will use

from diffusers import StableDiffusionPipeline
import torch
import argparse

Most people think of Transformers when they think of Hugging Face, but the company maintains many other libraries as well, and diffusers is one of them.
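Before running the script, it can help to confirm that the required packages are importable. The sketch below uses only the standard library. The helper name `missing_packages` is hypothetical, and the package list assumes the usual dependencies for this pipeline: diffusers, torch, and transformers (which the Stable Diffusion pipeline uses for its text encoder).

```python
import importlib.util


def missing_packages(names):
    """Return the subset of top-level package names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed dependency list for this article's script.
    required = ["diffusers", "transformers", "torch"]
    missing = missing_packages(required)
    if missing:
        print("Missing packages, try: pip install " + " ".join(missing))
    else:
        print("All required packages are installed.")
```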

2. Define the main function that uses the model

# Define the main function, which takes a prompt and an output filename
def main(prompt, output_filename):
    # Set the model ID
    model_id = "runwayml/stable-diffusion-v1-5"

    # Load the Stable Diffusion pipeline from the pretrained model
    pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float32)
    pipe = pipe.to("cpu")  # Run on the CPU

    # Example prompt (the real prompt is passed in as an argument)
    # prompt = "taiwanese handsome boy"

    # Generate the image
    image = pipe(prompt).images[0]

    # Save the image under the given filename
    image.save(output_filename)

If you have a GPU, you can switch this line from the CPU to the GPU.

pipe = pipe.to("cpu")

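For NVIDIA cards, for example, you would pass "cuda" instead of "cpu", which requires a working CUDA setup. I can cover the details in a future article if there is interest :)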
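As a rough illustration of the CPU/GPU switch, the sketch below picks the device at runtime instead of hard-coding it. The helper names `choose_device` and `load_pipeline` are hypothetical and not part of the original script. Using float16 on the GPU is a common choice to save memory, but it is an assumption here, not something the original code does.

```python
def choose_device(cuda_available):
    """Pick a device string and a dtype name for the pipeline."""
    if cuda_available:
        # Assumption: half precision on GPU to reduce memory use.
        return "cuda", "float16"
    # float32 on CPU, matching the script in this article.
    return "cpu", "float32"


def load_pipeline(model_id="runwayml/stable-diffusion-v1-5"):
    """Load the pipeline on the GPU when CUDA is available, else on the CPU."""
    # Imported lazily so choose_device can be used without these packages.
    import torch
    from diffusers import StableDiffusionPipeline

    device, dtype_name = choose_device(torch.cuda.is_available())
    pipe = StableDiffusionPipeline.from_pretrained(
        model_id, torch_dtype=getattr(torch, dtype_name)
    )
    return pipe.to(device)
```

With this in place, the two loading lines in main could be replaced by a single `pipe = load_pipeline()` call, and the same script would run on machines with or without a CUDA-capable GPU.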

# Generate the image
image = pipe(prompt).images[0]

# Save the image under the given filename
image.save(output_filename)

The values of output_filename and prompt are not set here; they are supplied by the entry-point block.
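The pipeline call also accepts sampling settings alongside the prompt. The sketch below wraps the call in a hypothetical helper, `generate_image`, that passes `num_inference_steps` and `guidance_scale` explicitly. The default of 30 steps is only an illustrative value, not a setting from the original script.

```python
def generate_image(pipe, prompt, steps=30, guidance_scale=7.5):
    """Call the pipeline with explicit sampling settings and return the first image."""
    result = pipe(
        prompt,
        num_inference_steps=steps,      # fewer steps run faster, which matters on CPU
        guidance_scale=guidance_scale,  # how strongly the image follows the prompt
    )
    return result.images[0]
```

Inside main this could be used as `image = generate_image(pipe, prompt, steps=20)` followed by `image.save(output_filename)`. Lowering the step count trades some image quality for speed.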

3. Set up the entry point

# Check whether the file is being run as the main program
if __name__ == "__main__":
    # Create the argument parser
    parser = argparse.ArgumentParser(description="Generate an image with Stable Diffusion and save it.")

    # Add the --prompt argument, which specifies the prompt for the image
    parser.add_argument("--prompt", type=str, required=True, help="The prompt for generating the image.")
    # Add the --output argument, which specifies the output filename
    parser.add_argument("--output", type=str, required=True, help="The output filename for the generated image.")

    # Parse the command-line arguments
    args = parser.parse_args()

    # Call main with the parsed prompt and output filename
    main(args.prompt, args.output)

The advantage of argparse is that you can pass the values you want directly on the command line, so you do not have to edit the code every time a value changes.

main(args.prompt, args.output)

This line passes the parsed values to the main function.
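To see how argparse maps command-line flags to attributes without generating an image, the parser can be built in a small function and fed a list of strings. This is a sketch: the function name `build_parser` is hypothetical, while the two arguments mirror the ones defined in the script.

```python
import argparse


def build_parser():
    """Create the same two-argument parser used by the script."""
    parser = argparse.ArgumentParser(
        description="Generate an image with Stable Diffusion and save it."
    )
    parser.add_argument("--prompt", type=str, required=True,
                        help="The prompt for generating the image.")
    parser.add_argument("--output", type=str, required=True,
                        help="The output filename for the generated image.")
    return parser


if __name__ == "__main__":
    # Passing a list simulates: python diffuser.py --prompt "a cat" --output cat.png
    args = build_parser().parse_args(["--prompt", "a cat", "--output", "cat.png"])
    print(args.prompt, args.output)  # prints: a cat cat.png
```

Each `--name` flag becomes an attribute `args.name`, which is why the script can call main with args.prompt and args.output.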

4. Running the program

python diffuser.py --output your_image_name.png --prompt "your prompt here"
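If you want to try several prompts in a row, the command can be generated programmatically. The sketch below is one possible approach, not part of the original repo. The helpers `slugify` and `build_commands` are hypothetical, and deriving the output filename from the prompt is just a convenience.

```python
import re
import shlex


def slugify(prompt):
    """Turn a prompt into a filesystem-friendly name, e.g. 'a cat' -> 'a_cat'."""
    return re.sub(r"[^a-z0-9]+", "_", prompt.lower()).strip("_")


def build_commands(prompts, script="diffuser.py"):
    """Build one argument list per prompt for running the script."""
    return [
        ["python", script, "--output", f"{slugify(p)}.png", "--prompt", p]
        for p in prompts
    ]


if __name__ == "__main__":
    prompts = ["a cartoon of Taiwanese boy", "a cartoon of Japanese boy"]
    for cmd in build_commands(prompts):
        # Print a shell-safe version of each command.
        print(shlex.join(cmd))
        # To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Note that running the script once per prompt reloads the model every time. For many prompts, loading the pipeline once and looping over the prompts inside Python would be faster.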

5. Results

"a cartoon of Taiwanese boy"
"a cartoon of Japanese boy"
"a cartoon of Korean boy"

"a handsome japanese boy at the age around 17 in the '90s"

"a beautiful japanese girl at the age around 17 in the '80s"

"a Taiwanese handsome boy with blonde hair"
"a Japanese handsome boy with blonde hair"
"a Korean handsome boy with blonde hair"

"a Beautiful Japanese idol at the age around 17 in the '80s"

"a handsome Japanese idol at the age around 17 in the '90s"
