Hassle-free LLM Fine-tuning with FriendliAI and Weights & Biases
In this article, we demonstrate how to fine-tune generative AI models using Friendli Dedicated Endpoints together with Weights & Biases. Friendli Dedicated Endpoints streamlines the fine-tuning and deployment of generative AI models. It is available on Friendli Suite, a platform for serving custom large language models (LLMs).
Weights & Biases (W&B) is a popular AI developer platform that helps developers train and fine-tune models and manage them from experimentation to production throughout the LLM application development process. W&B gives ML teams a way to track and version their experiments automatically, making it easier to discover and reproduce ML experiments and pipelines.
Benefits of Using W&B with Friendli Dedicated Endpoints
Integrating W&B for fine-tuning on Friendli Dedicated Endpoints offers a smooth and impactful user experience. This integration allows teams to effortlessly monitor and collaborate on their fine-tuning activities. All it takes is sharing the W&B API key and project name to get started.
This how-to guide covers creating and monitoring fine-tuning jobs on Friendli Dedicated Endpoints with Weights & Biases. Follow along to learn how to:
- Upload and manage your datasets efficiently on Friendli Dedicated Endpoints.
- Initiate and monitor fine-tuning jobs with W&B.
- Launch and monitor multiple jobs and share their reports.
By the end of this guide, you will understand how you can effectively fine-tune your LLMs by using Friendli Dedicated Endpoints and W&B.
Simplifying Fine-tuning for Generative AI
In the realm of generative AI, a “one-size-fits-all” approach rarely suffices. Fine-tuning models using enterprise-specific data is a critical step for AI application development, yet the process can often be cumbersome.
Developers repeatedly face time-consuming challenges: setting up GPUs, uploading datasets, adjusting hyperparameters, tracking progress, and evaluating models. These tasks can distract from the primary goal of fine-tuning and make the process more complex than necessary.
This is where the integration of Friendli Dedicated Endpoints with W&B comes into play. Fine-tuning on Friendli Dedicated Endpoints streamlines setup and management, allowing developers to concentrate on enhancing model performance. Consequently, businesses can achieve tailored solutions that provide a competitive edge in the market.
How to upload your model and dataset
To access your W&B model artifacts via Friendli Dedicated Endpoints, configure your W&B API key in your user settings in the Friendli Suite. For detailed instructions on uploading your model as a W&B model artifact, check out our previous blog post on the W&B and FriendliAI integration.
Navigate to the ‘Datasets’ section within your dedicated endpoints project page to upload your fine-tuning dataset. Enter the dataset name, then either drag and drop your .jsonl training and validation files or browse for them on your computer. If your files meet the required criteria, the blue ‘Upload’ button will be activated, allowing you to complete the process.
You can access our example dataset ‘FriendliAI/gsm8k’ on Hugging Face and explore some of our quantized generative AI models on our Hugging Face page.
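Before uploading, it can help to sanity-check your .jsonl files locally. The sketch below only verifies that every line is a standalone JSON object, which is what the JSON Lines format implies. It does not know the specific criteria that the ‘Datasets’ page enforces, and the function name `check_jsonl` and the file names in the usage comment are our own placeholders.

```python
import json
from pathlib import Path


def check_jsonl(path):
    """Return a list of (line_number, problem) tuples for a .jsonl file.

    This is a generic JSON Lines check, not the platform's own validation.
    """
    problems = []
    with Path(path).open(encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            if not line.strip():
                problems.append((line_number, "blank line"))
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                problems.append((line_number, f"invalid JSON: {exc.msg}"))
                continue
            if not isinstance(record, dict):
                problems.append((line_number, "record is not a JSON object"))
    return problems


# Example usage (file names are placeholders):
# for name in ("train.jsonl", "validation.jsonl"):
#     for line_number, problem in check_jsonl(name):
#         print(f"{name}:{line_number}: {problem}")
```

An empty result means every line parsed as a JSON object. The upload page may still reject a file for other reasons.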
How to create your fine-tuning job
This section demonstrates the process of creating a fine-tuning job in your dedicated endpoints project. After selecting a project, navigate to the fine-tuning page to see an overview of all your fine-tuning jobs. Press the blue ‘New Job’ icon at the far right to create a new job.
Then, a page will appear where you can configure your new fine-tuning experiment. You’ll need to enter the following information:
- Job name
- Model
- Dataset
- W&B project name
- Hyperparameters
Below is an example of a newly configured fine-tuning job named ‘W&B test’. Select ‘Weights & Biases’ as the base model. If you haven’t integrated your W&B API key with your Friendli Suite account yet, you’ll be asked to enter it here. Provide the full name of the W&B model artifact and verify that your integrated W&B account has access to the selected model.
Next, choose the dataset that you have previously uploaded to the endpoint project.
Then, enter a W&B project name to monitor your fine-tuning job. If you provide a project name that already exists, your job will be added to that project. Otherwise, a new W&B project will be automatically created in your integrated W&B account.
Lastly, enter the training hyperparameters for fine-tuning your model. Since we support LoRA fine-tuning, you can also configure the related parameters. Once all the values are entered, click the blue ‘Create’ button to proceed.
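To keep track of what you entered in the form, you might record each job’s settings in a small structure like the one below. This is only a note-taking sketch: the job is created through the web page, not through this code. The hyperparameter names, the default values, the dataset name, and the artifact placeholder are all assumptions made for illustration, not the exact fields or defaults shown on the page.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class LoRAConfig:
    # Illustrative LoRA settings; names and defaults are assumptions.
    rank: int = 16
    alpha: int = 32
    dropout: float = 0.05


@dataclass
class FineTuningJob:
    job_name: str
    model: str          # full name of the W&B model artifact
    dataset: str        # name of a dataset already uploaded to the project
    wandb_project: str  # existing or new W&B project name
    # Illustrative training hyperparameters; names and defaults are assumptions.
    learning_rate: float = 1e-4
    batch_size: int = 16
    max_steps: int = 1000
    lora: LoRAConfig = field(default_factory=LoRAConfig)

    def validate(self):
        """Return a list of problems; an empty list means nothing looks off."""
        problems = []
        for name in ("job_name", "model", "dataset", "wandb_project"):
            if not getattr(self, name).strip():
                problems.append(f"{name} must not be empty")
        if self.learning_rate <= 0:
            problems.append("learning_rate must be positive")
        if self.batch_size < 1:
            problems.append("batch_size must be at least 1")
        if self.max_steps < 1:
            problems.append("max_steps must be at least 1")
        if self.lora.rank < 1:
            problems.append("lora.rank must be at least 1")
        return problems


job = FineTuningJob(
    job_name="W&B test",
    model="<entity>/<project>/<artifact>:<version>",  # placeholder, not a real artifact
    dataset="my-dataset",                             # hypothetical dataset name
    wandb_project="W&B project",
)
print(job.validate())  # an empty list when all fields pass the checks
print(asdict(job))     # plain dict, convenient for saving next to your notes
```

Keeping such a record per job makes it easier to see exactly what changed when you later rerun training with a different configuration.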
Hurray! The ‘W&B test’ fine-tuning job has been launched! After the initialization finishes, the job will transition to the training state. Within a few minutes, you will be able to monitor the job’s progress.
You can now switch over to the W&B project site to see important metrics like training loss, mean token accuracy, and more. Details on monitoring your fine-tuning job will be covered in the next section.
How to monitor your fine-tuning job
This section explains how to monitor your fine-tuning job on the W&B platform. In addition, we will discuss the W&B report feature, a helpful tool that allows developers to share insights on the training process.
First, log in to your integrated W&B account and find the relevant project. We will choose the ‘W&B project,’ as it is the project name we previously assigned when launching the fine-tuning job.
You can then view a panel with real-time graph visualizations on key training metrics. Some useful metrics you can monitor include gradient norm, learning rate, and training loss.
W&B enables you to write and share reports on training progress directly on the platform. We created an example ‘Llama 3 8B Fine-tuning Report’ that highlights a seemingly alarming drop in accuracy early in the fine-tuning process.
W&B can alert you when training accuracy falls below a specified threshold, so you do not need to watch the training metrics constantly or worry about crashes. If you decide to terminate the fine-tuning job early, you can do so from the Friendli Dedicated Endpoints fine-tuning page by pressing the ‘Cancel’ icon.
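The idea behind a threshold alert can be sketched in a few lines. The code below is not how W&B implements its alerts; it is a plain illustration of the check, applied to a made-up list of accuracy values. In practice you would configure the alert on the W&B side rather than run this yourself.

```python
def first_drop_below(values, threshold):
    """Return the index of the first value below `threshold`, or None."""
    for step, value in enumerate(values):
        if value < threshold:
            return step
    return None


# Made-up mean token accuracy values, one per logged step.
accuracy = [0.62, 0.66, 0.41, 0.69, 0.72]

step = first_drop_below(accuracy, threshold=0.5)
if step is not None:
    print(f"accuracy fell below the threshold at logged step {step}")
```

A single dip like the one in this toy series may recover on its own, so a report that shows the surrounding steps is useful context before cancelling a job.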
How to duplicate your fine-tuning job
Last but not least, if you want to rerun the training with a different configuration, you can duplicate the job to create a new fine-tuning experiment by pressing the ‘Duplicate’ icon.
When you launch multiple fine-tuning jobs in a single W&B project, you can view and compare all the graphs together to identify the best fine-tuning configuration or to compare different models. For instance, in our example the duplicated job, which uses a different initial learning rate and batch size, trains faster than the original job.
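One simple way to put a number on “faster” is to count how many logged steps each run needs to reach a target training loss. The sketch below does this for made-up loss curves. The run names, loss values, and target are illustrative assumptions, and with real jobs you would read the curves from the W&B project instead.

```python
def steps_to_reach(losses, target):
    """Return the first logged step whose loss is at or below `target`, or None."""
    for step, loss in enumerate(losses):
        if loss <= target:
            return step
    return None


def rank_runs(runs, target):
    """Sort run names by how quickly they reach `target`; runs that never do go last."""
    reached = {name: steps_to_reach(losses, target) for name, losses in runs.items()}
    return sorted(reached, key=lambda name: (reached[name] is None, reached[name] or 0))


# Made-up training loss curves for an original job and its duplicate.
runs = {
    "W&B test": [2.0, 1.5, 1.2, 0.9, 0.8],
    "W&B test (duplicate)": [2.0, 1.1, 0.8, 0.7, 0.6],
}

print(rank_runs(runs, target=1.0))
```

Reaching a loss target sooner is only one criterion. Validation metrics and final accuracy matter as well when choosing a configuration.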
Deploying the Fine-tuned Model
The steps to deploy the fine-tuned model are the same as those for deploying a custom model on Friendli Dedicated Endpoints. For further information, please refer to our documentation on launching a model, or to our blog post on directly deploying a W&B model artifact on Friendli Dedicated Endpoints.
Conclusion
Fine-tuning LLMs with FriendliAI and Weights & Biases is a user-friendly way to customize your AI models. By integrating Friendli Dedicated Endpoints with W&B, you can easily manage and monitor your fine-tuning workflows.
This guide has shown you how to upload datasets, set up fine-tuning jobs, and monitor progress. The combination of Friendli Dedicated Endpoints and W&B’s tracking tools helps you reach strong results with less effort. At FriendliAI, we are motivated to provide reliable fine-tuning services, supporting your AI development with innovative solutions.