I recently deployed ComfyUI on a Linux server and ran into a few small pitfalls that I want to write down.
If you hit the error AttributeError: 'NoneType' object has no attribute 'tokenize', I suggest you read this article.
And if you can't find the Triple CLIP Loader node, you should read this article too!
ComfyUI is an open-source, node-based graphical user interface (GUI) for generating images. It uses diffusion models such as Stable Diffusion and lets you build image-generation workflows by connecting nodes with different functions. Each node represents a specific functional module, such as loading a model, entering a prompt, or setting a sampler, so the whole process of calling the model is laid out visually and you only need to modify the nodes on the graph.
If you have used other machine learning or deep learning tools with node-based interfaces, this way of working will feel familiar.
ComfyUI's GitHub repository: https://github.com/comfyanonymous/ComfyUI
If you are on a Linux server (the process is much the same on Windows), let's walk through the installation.
Following the instructions in the GitHub README:
Step 1: create a new Python virtual environment with Anaconda, using Python 3.12.
Step 2: git clone the project.
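For reference, those two steps look roughly like this on the command line (the environment name comfyui is just my choice):
conda create -n comfyui python=3.12
conda activate comfyui
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI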
Step 3: put the model file (a .ckpt or .safetensors) into the models/checkpoints folder.
Stability AI has published the latest Stable Diffusion 3.5 model on Hugging Face. You first need to fill in your information and request access to the model, then go to the Files and versions tab to find it: https://huggingface.co/stabilityai/stable-diffusion-3.5-large/tree/main; if you find that you cannot download it, your access request has not been approved yet.
Then download sd3.5_large.safetensors.
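On a headless server it is usually easier to grab the file with the Hugging Face CLI than through a browser. A minimal sketch, assuming you run it from the ComfyUI folder and your access request for the gated repo has already been approved:
pip install -U "huggingface_hub[cli]"
huggingface-cli login   # paste an access token from your Hugging Face account settings
huggingface-cli download stabilityai/stable-diffusion-3.5-large sd3.5_large.safetensors --local-dir models/checkpoints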
Step 4: Put the VAE files into the models/vae folder
The VAE files are at: https://huggingface.co/stabilityai/stable-diffusion-3.5-large/tree/main/vae
After downloading the two files there, just put them into the models/vae folder.
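If you prefer the command line here too, a sketch (the two file names below are my assumption about what the vae folder contains; double-check the exact names on the Files and versions page):
huggingface-cli download stabilityai/stable-diffusion-3.5-large vae/diffusion_pytorch_model.safetensors vae/config.json --local-dir models
Because the CLI mirrors the vae/ subpath of the repo, pointing --local-dir at models drops the files straight into models/vae.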
After completing these four steps, if you have an NVIDIA GPU such as a 4090 or an A100, continue with one of the two pip install commands below (the first installs the stable PyTorch build, the second the nightly build). If you have an AMD GPU, the README also provides command lines for that, so just check the page.
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu124
pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu124
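Whichever command you pick, here is an optional, quick sanity check that the CUDA build of PyTorch is actually the one installed:
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"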
The last step on the GitHub page is to install the requirements.
pip install -r requirements.txt
In principle, you can now start the server.
If you want to expose the web page on a particular port, you will need to look up how to do that for your setup; I configured it on the platform where I rented the server. Then I ran python main.py --port XXXX and that was it.
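For reference, a typical launch line might look like the one below; 8188 is ComfyUI's default port, and --listen makes the server accept connections from outside localhost (adjust both to whatever your hosting platform expects):
python main.py --listen 0.0.0.0 --port 8188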
Then!! It threw an error here: AttributeError: 'NoneType' object has no attribute 'tokenize'
I started looking for a fix, but I did not see anything about CLIP on the project's homepage. Then I found this issue: https://github.com/comfyanonymous/ComfyUI/issues/5388
In that thread, someone linked to another post: https://blog.comfy.org/sd3-5-comfyui/
Sure enough, it contains links to three clipXXX.safetensors files. If you cannot download them directly, it is because you have not applied for access yet.
The CLIP files are at: https://huggingface.co/stabilityai/stable-diffusion-3-medium/tree/main/text_encoders
In the Stable Diffusion models, CLIP files play a vital role: they handle the association and alignment of text and images. CLIP (Contrastive Language-Image Pre-training) is a multimodal model developed by OpenAI that maps text and images into the same latent space. It uses contrastive learning so that text and images describing the same thing end up closer together in that space, while unrelated text and images end up farther apart. In Stable Diffusion, CLIP's main job is this: the text the user types (the prompt) is encoded by the text encoder of the CLIP model into text embeddings. These embeddings capture the semantic information of the text, making it easier for the model to understand the user's input. They are then fed to the generative model as conditional input, guiding it to produce images that match the text description.
No wonder the error points at the clip.tokenize(text) call: it is part of the text-embedding step. Suddenly this all feels rather interesting, doesn't it?
OK, once you have downloaded the CLIP files, remember to put them in the models/clip folder!
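Here is a sketch of fetching and placing them from the command line; the three file names are my assumption about what the SD3.5 post points to, so double-check them on the Files and versions page:
huggingface-cli download stabilityai/stable-diffusion-3-medium text_encoders/clip_g.safetensors text_encoders/clip_l.safetensors text_encoders/t5xxl_fp16.safetensors --local-dir models/clip
mv models/clip/text_encoders/*.safetensors models/clip/   # the CLI keeps the text_encoders/ subfolder, so move the files up one level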
Then, you can restart the server.
You may still get an error at this point...
Then you should look at this post: https://github.com/comfyanonymous/ComfyUI/issues/4868
It turned out that my CLIP files were not being loaded properly!!! Every answer I found mentioned a TripleCLIPLoader, and the screenshot in that post showed reaching this TripleCLIPLoader by clicking through Node Templates, so I started exploring.
As a result, I found that under my Node Templates there was only a single Manage entry. Sorry, this was my first time using ComfyUI and I was not familiar with it.
Then I explored the ComfyUI sidebar and, damn, there was the Node Library!!! I did not spot the node directly in the list, but you can search there for Triple CLIP Loader.
After adding the node, set its three CLIP files and connect the lines between the nodes. If you are not sure how, you can try to rely on the sixth sense for electronic products that you have cultivated since childhood. The connections look like this:
Then you can run it: click Queue.
You will see that a task is running.
And then my image was generated!!
Finally, I wish you success!!!