Fine-tuning
How to train large language models for your use case.

Introduction

Fine-tuning, an approach to transfer learning, is a technique to create a new model that's specific to your use case. Fine-tuning lets you get more out of your models by providing:
  1. Many more examples of your task than can fit in a prompt
  2. Token savings due to shorter prompts
  3. Lower latency requests
Large language models are pre-trained on a vast amount of text from the open internet. When given a prompt with a few examples, they can often understand what task you are trying to perform and generate a useful completion. This is called "few-shot learning".
Fine-tuning improves on few-shot learning by training on many more examples than can fit in a prompt, letting you achieve better results on a wide number of tasks. Once a model has been fine-tuned, you won't need to provide examples in the prompt anymore. This saves costs and enables lower-latency requests.
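To make the savings concrete, here is an illustrative comparison in Python; the task and examples are hypothetical, not from this guide:
# Few-shot: examples ride along in every request, costing tokens each time.
few_shot_prompt = (
    "Decide whether the review is positive or negative.\n\n"
    "Review: Great value, would buy again.\nSentiment: positive\n\n"
    "Review: Broke after two days.\nSentiment: negative\n\n"
    "Review: Exceeded my expectations.\nSentiment:"
)

# Fine-tuned: the model has learned the task from training data,
# so the prompt only needs the new input and the separator.
fine_tuned_prompt = "Review: Exceeded my expectations.\nSentiment:"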
At a high level, fine-tuning involves the following steps:
  1. Prepare training data
  2. Train a new fine-tuned model
  3. Use your fine-tuned model

Prepare training data

Your dataset is how you teach your model what you'd like it to say.
Your dataset should be a JSONL file, where each line is a prompt-completion pair representing an example of your task. We recommend fine-tuning each model for a single, specific task.
{"prompt": "<prompt text>", "completion": " <ideal generation><|endoftext|>"}
{"prompt": "<prompt text>", "completion": " <ideal generation><|endoftext|>"}
{"prompt": "<prompt text>", "completion": " <ideal generation><|endoftext|>"}
...
Each training example should consist of a single input example (prompt) and its associated output (completion).
{"prompt": "Topic: Why is Wellness Important for College Students?\nOutline:", "completion": " 1. There are many ways to answer this question: Why is wellbeing important in schools?\n2. Traditionally schools have focused on students\u2019 current needs or short-term victories.\n3. Fostering and teaching wellbeing is a way to show students and educators we care about them and want to support them by enabling them with short- and long-term benefits.\n4. But why does wellbeing need to be specifically prioritized in schools?\n5. Gratitude, hope, and emotional regulation, have been found to improve academic performance across several areas.\n6. Students are not the only ones who benefit from wellbeing on the agenda.\n7. We have the opportunity to formally teach our students to maximize their potential, and prioritizing their wellbeing is a good place to start.<|endoftext|>"}
{"prompt": "Topic: How Fast Can You Lose 20 Pounds\nOutline:", "completion": " 1. Losing 20 pounds can make a serious difference in your health.\n2. Safe weight loss comes at a rate of up to 1 to 2 pounds per week.\n3. Taking it slow will help you stay slim in the long run.\n4. Don't worry if your weight fluctuates a little.\n5. It's normal to go through a \"plateau\".\n6. The 10 to 40 weeks it will likely take to lose 20 pounds are just the beginning of your weight-loss journey.\n7. Staying active also keeps you slim, as well as boosts your mood, promotes good cardiovascular health and lowers your risk of osteoporosis.<|endoftext|>"}
{"prompt": "Topic: Why Modern Fashion is Bad\nOutline:", "completion": " 1. For several years the fashion industry has been under fire for the problems it creates.\n2. Production of our clothes happens most of the time in developing countries. Wages are low, material costs less and there are fewer regulations.\n3. After clothes are produced and shipped to stores and warehouses in developed countries they need to be sold.\n4. Before collections were created once a year, then 4 times a year and nowadays 7 times a year.\n5. These new collections mean we replace our clothes regularly for new ones that look more trendy.\n6. Reusing clothes is a great option to increase their lifespan\n7. The harsh reality is that very little of our clothes get recycled, mainly because they are all made from different materials.<|endoftext|>"}
...
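To produce a file in this format programmatically, here is a minimal Python sketch; the example pair and filename are placeholders:
import json

# Hypothetical prompt-completion pairs; replace with your own task data.
examples = [
    {"prompt": "Topic: <your topic>\nOutline:", "completion": " <ideal outline><|endoftext|>"},
]

# Write one JSON object per line (JSONL).
with open("training_data.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")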
Including instructions in the prompt will typically result in better model quality.
{"prompt": "Write an outline for a blog post based on the following topic. The outline should start with an introduction and end with a conclusion. The outline bullet points should be diverse and related to the topic.\nTopic: Why is Wellness Important for College Students?\nOutline:", "completion": " 1. There are many ways to answer this question: Why is wellbeing important in schools?\n2. Traditionally schools have focused on students\u2019 current needs or short-term victories.\n3. Fostering and teaching wellbeing is a way to show students and educators we care about them and want to support them by enabling them with short- and long-term benefits.\n4. But why does wellbeing need to be specifically prioritized in schools?\n5. Gratitude, hope, and emotional regulation, have been found to improve academic performance across several areas.\n6. Students are not the only ones who benefit from wellbeing on the agenda.\n7. We have the opportunity to formally teach our students to maximize their potential, and prioritizing their wellbeing is a good place to start.<|endoftext|>"}
{"prompt": "Write an outline for a blog post based on the following topic. The outline should start with an introduction and end with a conclusion. The outline bullet points should be diverse and related to the topic.\nTopic: How Fast Can You Lose 20 Pounds\nOutline:", "completion": " 1. Losing 20 pounds can make a serious difference in your health.\n2. Safe weight loss comes at a rate of up to 1 to 2 pounds per week.\n3. Taking it slow will help you stay slim in the long run.\n4. Don't worry if your weight fluctuates a little.\n5. It's normal to go through a \"plateau\".\n6. The 10 to 40 weeks it will likely take to lose 20 pounds are just the beginning of your weight-loss journey.\n7. Staying active also keeps you slim, as well as boosts your mood, promotes good cardiovascular health and lowers your risk of osteoporosis.<|endoftext|>"}
{"prompt": "Write an outline for a blog post based on the following topic. The outline should start with an introduction and end with a conclusion. The outline bullet points should be diverse and related to the topic.\nTopic: Why Modern Fashion is Bad\nOutline:", "completion": " 1. For several years the fashion industry has been under fire for the problems it creates.\n2. Production of our clothes happens most of the time in developing countries. Wages are low, material costs less and there are fewer regulations.\n3. After clothes are produced and shipped to stores and warehouses in developed countries they need to be sold.\n4. Before collections were created once a year, then 4 times a year and nowadays 7 times a year.\n5. These new collections mean we replace our clothes regularly for new ones that look more trendy.\n6. Reusing clothes is a great option to increase their lifespan\n7. The harsh reality is that very little of our clothes get recycled, mainly because they are all made from different materials.<|endoftext|>"}
...
Make sure to:
  1. Never end prompts with a trailing whitespace (" ").
  2. End prompts with a fixed separator to inform the model when the prompt ends and the completion begins. The separator you'll want to use depends on the use case and prompt structure. In the above example, the separator is \nOutline:. The separator should not appear elsewhere in any prompt.
  3. Always start completions with a whitespace (" "), unless your fixed separator ends with a newline ("\n"), in which case start completions without a whitespace. Starting completions correctly produces better results due to the tokenization of the models.
  4. Always end completions with <|endoftext|>, which acts as a built-in stop sequence, allowing the model to learn when to stop completions. (A sketch for checking these rules programmatically follows this list.)
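The following Python sketch checks each line of a dataset against these four rules. It assumes the \nOutline: separator from the examples above and a placeholder filename; adjust both for your own data:
import json

SEPARATOR = "\nOutline:"  # assumed separator; adjust for your prompt structure

def check_example(line):
    """Return a list of rule violations for one JSONL line."""
    example = json.loads(line)
    prompt, completion = example["prompt"], example["completion"]
    problems = []
    if prompt.endswith(" "):
        problems.append("prompt ends with a trailing whitespace")
    if not prompt.endswith(SEPARATOR) or prompt.count(SEPARATOR) != 1:
        problems.append("separator must appear exactly once, at the end of the prompt")
    if not completion.startswith(" ") and not SEPARATOR.endswith("\n"):
        problems.append("completion should start with a whitespace")
    if not completion.endswith("<|endoftext|>"):
        problems.append("completion does not end with <|endoftext|>")
    return problems

with open("training_data.jsonl") as f:  # placeholder filename
    for i, line in enumerate(f, start=1):
        for problem in check_example(line):
            print(f"line {i}: {problem}")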
Two things will affect fine-tuned model quality the most:
  1. Dataset quality
  2. Dataset size
First, and most importantly, your dataset should only consist of high quality examples showing ideal task performance. Bad examples or noisy, inconsistent data will result in poor performance no matter how many examples you have. Assuming each training example is of desired quality, then the more data you have, the better.
Fine-tuning can solve a variety of problems, and the optimal way to approach it will depend on your specific use case. See more guidelines and examples.
Fine-tuning UL2-20B? Review best practices for training this model that differ from the standard guidelines above.

Train a fine-tuned model

Once your dataset is prepared, training can be done in the dashboard. To begin, navigate to "Fine-tune" on the sidebar, then select "Fine-tune" and complete the following steps:
  1. Upload your dataset or select a previously used dataset. Your dataset must include at least 100 training examples. Once uploaded, your dataset will be analyzed to validate proper format and count the number of training examples.
  2. Select the model you want to fine-tune. Choose from one of our pre-trained models or fine-tune a previously fine-tuned model.
  3. Give your model a name and description, and set hyperparameters. Control the learning process for your models with different parameters.
  4. Set test prompts. We recommend extracting test prompts from your dataset, which automatically removes a few random examples to use as test examples. You can also manually add test prompts (see the sketch after this list).
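If you want to reproduce the automatic extraction locally before uploading, here is a minimal Python sketch; the filenames and holdout count are placeholders:
import json
import random

# Load the dataset and hold out a few random examples as test prompts,
# mirroring the dashboard's automatic extraction.
with open("training_data.jsonl") as f:  # placeholder filename
    examples = [json.loads(line) for line in f]

random.shuffle(examples)
test_examples, train_examples = examples[:5], examples[5:]  # hold out 5 examples

with open("train.jsonl", "w") as f:
    for example in train_examples:
        f.write(json.dumps(example) + "\n")

# Keep only the prompts; the completions are the reference outputs to compare against.
test_prompts = [example["prompt"] for example in test_examples]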
Before training, you can optionally sync your fine-tuning jobs with Weights & Biases. Copy your Weights & Biases API Key and paste it in Settings -> Integrations. Now you can monitor training progress and loss charts for all your fine-tuning jobs.

Use a fine-tuned model

When your fine-tuning job is completed, the status will change to Completed. You will also be able to view each checkpoint's completions on the test prompts.
When you've found a checkpoint that performs well on your task, you can add it to your "Models" list to begin using it in the playground or via the API.
When using the model, you should format your prompts in the same way as you did in the training dataset. For example, take the following dataset:
{"prompt": "Company name: Nike\nProduct description: Nike Air Jordan is an American brand of basketball shoes, athletic, casual, and style clothing produced by Nike. Founded in Chicago, Air Jordan was created for Hall of Fame former basketball player Michael Jordan during his time with the Chicago Bulls.\nBlog idea:", "completion": "Jordan Brand History - A blog about the history of the brand.<|endoftext|>"}
{"prompt": "Company name: Apple\nProduct description: The iPhone is a smartphone made by Apple that combines a computer, iPod, digital camera and cellular phone into one device with a touchscreen interface. The iPhone runs the iOS operating system, and in 2020 when the iPhone 12 was introduced, it offered up to 256 GB of storage and a 12-megapixel camera.\nBlog idea:", "completion": "Are You running an iPhone 12? Here\u2019s How to Back Up Your Photos - A blog about backing up photos from your phone.<|endoftext|>"}
{"prompt": "Company name: Zevo Insect\nProduct description: Nobody likes having insects around the house, and the old fly swatter doesn\u2019t really make much of difference with these persistent pests. Zevo is a plug and decimate solution. The multi-spectrum light technology attracts the critters and traps them on a sticky film backing. No chemicals. No gimmicks. And no need to go anywhere near the insects.\nBlog idea:", "completion": "How to Make Fruit Flies Disappear: A blog all about the Zevo and how it can make your life better.<|endoftext|>"}
...
When making a Completions request to this fine-tuned model, the prompt should look like:
body = {
    "text": f"Company name: {company_name}\nProduct description: {product_description}\nBlog idea:",
    "top_p": 1,
    "top_k": 40,
    "temperature": 0.8,
    "stop_tokens": ["\n", "Company name:", "Product description:", "Blog idea:"],
    # ...
Since each completion in the training set ends with <|endoftext|>, the model should be proficient at stopping generations, but it's often helpful to add other stop sequences to consistently end the completion how you'd like.
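Putting it together, here is a sketch of sending this request with Python's requests library. The endpoint URL and authorization header are placeholders, not the actual API; substitute your fine-tuned model's real endpoint and key:
import requests

company_name = "Nike"
product_description = "Nike Air Jordan is an American brand of basketball shoes produced by Nike."

body = {
    "text": f"Company name: {company_name}\nProduct description: {product_description}\nBlog idea:",
    "top_p": 1,
    "top_k": 40,
    "temperature": 0.8,
    "stop_tokens": ["\n", "Company name:", "Product description:", "Blog idea:"],
}

# Placeholder endpoint and key -- replace with your model's actual URL and API key.
response = requests.post(
    "https://example.com/v1/completions",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json=body,
)
print(response.json())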

Hyperparameters

Hyperparameters control the learning process. Tweaking them for fine-tuning can often lead to a model that produces higher-quality completions. In particular, you may want to configure the following:
Epochs
Checkpoints
Learning rate multiplier
Batch size
Prompt loss
Test prompts
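For illustration, a hypothetical configuration sketch in Python; the names mirror the list above, but the exact fields and defaults the dashboard accepts may differ:
# Hypothetical hyperparameter settings -- illustrative values, not recommended defaults.
hyperparameters = {
    "epochs": 4,                      # full passes over the training dataset
    "checkpoints": 4,                 # model snapshots saved during training
    "learning_rate_multiplier": 0.1,  # scales the base learning rate
    "batch_size": 8,                  # training examples per gradient update
    "prompt_loss_weight": 0.1,        # how much prompt tokens contribute to the loss
    "test_prompts": 5,                # held-out examples used to compare checkpoints
}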

Example datasets

Text classification
Conditional text generation
Open-ended text generation
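For a sense of how these differ, here are hypothetical single-line examples in each style (not taken from the datasets above): classification maps an input to a short label, conditional generation maps a structured prompt to free text, and open-ended generation can use an empty prompt:
{"prompt": "Tweet: I love the new design!\nSentiment:", "completion": " positive<|endoftext|>"}
{"prompt": "Product: wireless earbuds\nAd copy:", "completion": " Hear every note, lose every wire.<|endoftext|>"}
{"prompt": "", "completion": " Once upon a time, in a quiet seaside town...<|endoftext|>"}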

Exceptions

There are some exceptions to the standard fine-tuning guidelines depending on the base model you're starting from.
T5-20B (UL2)
OPT