
Models

Explore large language models offered on the Forefront platform.
The Forefront platform lets you fine-tune and use open-source large language models such as the following:
| Model | Parameters | Creator |
| --- | --- | --- |
| GPT-J | 6 billion | EleutherAI |
| CodeGen | 16 billion | Salesforce |
| GPT-NeoX | 20 billion | EleutherAI |
| T5 | 20 billion | Google |
| OPT | 30 billion | Meta |
| OPT (coming soon) | 66 billion | Meta |
| BLOOM (coming soon) | 176 billion | BigScience Research Workshop |
These large language models are pre-trained on a vast amount of text from the Internet. Pre-training takes a model with randomly initialized parameters (weights) and iteratively adjusts those weights based on the difference between the model's output and an expected output. For large language models, the most common training objective is next-word prediction over enormous text corpora, such as the Pile.
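To make the next-word prediction objective concrete, here is a minimal sketch in PyTorch. The toy vocabulary, model, and random token data are illustrative assumptions only; real pre-training uses billions of parameters and web-scale corpora.

```python
# Minimal illustration of the next-token prediction objective used in pre-training.
# The tiny model, vocabulary size, and fake data are placeholders for illustration.
import torch
import torch.nn as nn

vocab_size, embed_dim = 100, 32                     # toy sizes
tokens = torch.randint(0, vocab_size, (1, 16))      # a fake "sentence" of token ids

model = nn.Sequential(
    nn.Embedding(vocab_size, embed_dim),
    nn.Linear(embed_dim, vocab_size),               # scores a distribution over the next token
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(100):
    inputs, targets = tokens[:, :-1], tokens[:, 1:]  # predict token t+1 from token t
    logits = model(inputs)                           # shape: (batch, seq, vocab)
    loss = loss_fn(logits.reshape(-1, vocab_size), targets.reshape(-1))
    optimizer.zero_grad()
    loss.backward()                                  # adjust weights to shrink the gap between
    optimizer.step()                                 # predicted and actual next tokens
```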
In general, the more parameters a model has, the better it will perform on language tasks. Models can be improved through fine-tuning, whereby a base model is trained on a set of prompt-completion pairs showing ideal task performance.
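Fine-tuning data is typically supplied as prompt-completion pairs. The sketch below writes a few illustrative pairs to a JSONL file; the `prompt`/`completion` field names and the JSONL format follow a common convention and are assumptions here, not a confirmed Forefront upload format.

```python
# Hypothetical example of preparing prompt-completion pairs for fine-tuning.
# The exact file format and field names expected by the platform may differ.
import json

pairs = [
    {"prompt": "Classify the sentiment: I loved this movie!\nSentiment:",
     "completion": " positive"},
    {"prompt": "Classify the sentiment: The food was cold and bland.\nSentiment:",
     "completion": " negative"},
]

with open("finetune_data.jsonl", "w") as f:
    for pair in pairs:
        f.write(json.dumps(pair) + "\n")
```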

Exceptions

Most models on the Forefront platform function identically. However, there are some exceptions depending on the model you're using.
T5-20B (UL2)
T5-20B is a 20 billion parameter encoder-decoder model. Best practices to use this model are slightly different than decoder-only models:
  1. Always start prompts with `[S2S] `. Note the whitespace after the text.
  2. Always end prompts with ` <extra_id_0>`. Note the whitespace before the text.
An example prompt would look like:

```json
{
  "prompt": "[S2S] <prompt text> <extra_id_0>",
  ...
}
```
Newlines ("\n") are not part of the UL2 token vocabulary, which means the model can neither understand nor generate newlines.
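A small helper can apply both rules before sending a request. In this sketch, the function name is hypothetical, and replacing newlines with spaces is one illustrative way to handle the missing "\n" token; adapt it to your use case.

```python
# Hypothetical helper for building T5-20B (UL2) prompts.
def build_ul2_prompt(text: str) -> str:
    # Newlines are not in the UL2 vocabulary, so collapse all whitespace to single spaces.
    flattened = " ".join(text.split())
    # Start with "[S2S] " (trailing space) and end with " <extra_id_0>" (leading space).
    return f"[S2S] {flattened} <extra_id_0>"

print(build_ul2_prompt("Summarize:\nLarge language models are pre-trained on web text."))
# -> "[S2S] Summarize: Large language models are pre-trained on web text. <extra_id_0>"
```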