Zefiro an LLM and datasets for the Italian Language

Alessandro Ercolani
2 min readJan 13, 2024

--

In European tradition, a zephyr is a light wind or a west wind, named after Zephyrus, the Greek god or personification of the west wind.

Zefiro is a fine-tuned version of the Mistral model for the Italian language, sponsored by Business Operating System and an adaptation of the Zephyr model by Huggingface.

Model Details

Zefiro is a porting of the Zephyr model to the Italian language using the wonderful recipes from alignment-handbook . It has also taken inspiration and insights from the Llamantino model developed by Università di Bari. For the implementation we combined different approaches from the two models mentioned but also from the wonderful community of open source.

Model description

I have also released a quantized version zefiro-7b-beta-ITA-v0.1-GGUF that can run on CPU based hardware, using fantastic libraries as llama.cpp or various LLM based GUI like Ollama, lm-studio.

In the next release I will try to improve the model using a DPO strategy and release a tuned version of smaller and bigger models like Mixtral and phi-2. I’m also investigating how to evaluate and compare the output models and the different strategies.

Honestly, I’m also looking for sponsored computation (GPU) capacity for training and releasing more datasets and models. If you know someone that can help spread the word …

--

--