ChatGPT/GPT-3/GPT-4 models guide
What is a model?
A model is a prediction engine, usually specific to a certain kind of problem. You will find models for predicting the weather, the stock market, championship results, what is contained in a picture, etc. What they have in common is that you provide them with an input, such as today’s weather, and you get a prediction as an output, such as tomorrow’s weather, usually with some kind of confidence score.
Predictions made by models can be more or less accurate and reliable. Until recently, there was no good model to predict “what comes next” after a “text input”.
OpenAI introduced a new kind of model, called generative pre-trained transformers (GPT), that has changed the game: these models are capable of “continuing” input text in most situations as well as or even better than the average human could, and certainly faster.
Since “text input” is such a broad scope, these GPT models cover a wide array of tasks such as Q&A, following editing and formatting instructions, or even writing code.
So a GPT model is a prediction engine for text.
What is the difference between a model and an AI?
There is no difference. “Model” is the technical term for an “AI”. So basically, choosing a model is equivalent to choosing an AI. Unlike the human brain, models or AIs tend to be highly specialized for a specific set of tasks or inputs. Depending on the task at hand (whether you’re working with images, audio, video or text), you will want to choose a different AI or model.
What parameters should be considered when you choose a model?
You will generally want to consider the following parameters:
- Accuracy: how good the model is at the task you want it to complete, which can vary greatly depending on your specific context
- Speed: how fast you get an output from the model
- Cost: how much each task you give it costs
What GPT models are available on OpenAI?
There are 8 major base GPT models for working with text that are available from OpenAI through their API:
Model name | Technical name | Model family | Price per 1000 tokens | Max tokens | Available for fine-tuning |
GPT-4 32k | gpt-4-32k | GPT-4 | USD 0.0600 (prompt), USD 0.1200 (completion) | 32768 | No |
GPT-4 | gpt-4 | GPT-4 | USD 0.0300 (prompt), USD 0.0600 (completion) | 8192 | No |
GPT-3.5 16k | gpt-3.5-turbo-16k | GPT-3.5 | USD 0.0030 (prompt), USD 0.0040 (completion) | 16384 | No |
GPT-3.5 | gpt-3.5-turbo | GPT-3.5 | USD 0.0015 (prompt), USD 0.0020 (completion) | 4096 | No |
Davinci | text-davinci-003 | GPT-3 | USD 0.0200 | 4096 | Yes |
Curie | text-curie-001 | GPT-3 | USD 0.0020 | 2049 | Yes |
Babbage | text-babbage-001 | GPT-3 | USD 0.0005 | 2049 | Yes |
Ada | text-ada-001 | GPT-3 | USD 0.0004 | 2049 | Yes |
Last updated March 11th, 2023 from openai.com/pricing
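To compare the cost of a given task across models, the per-1000-token prices can be turned into a small calculator. The following is a minimal sketch, not an official tool: the function name estimate_cost and the token counts in the usage example are assumptions made for illustration, and the prices are simply copied from the table above, so they will go out of date when OpenAI changes its pricing.

```python
from decimal import Decimal

# Prices in USD per 1000 tokens, copied from the table above.
# Each entry is (prompt price, completion price). The GPT-3 models are
# listed with a single rate, so both values are the same for them.
PRICES_PER_1K = {
    "gpt-4-32k": (Decimal("0.0600"), Decimal("0.1200")),
    "gpt-4": (Decimal("0.0300"), Decimal("0.0600")),
    "gpt-3.5-turbo-16k": (Decimal("0.0030"), Decimal("0.0040")),
    "gpt-3.5-turbo": (Decimal("0.0015"), Decimal("0.0020")),
    "text-davinci-003": (Decimal("0.0200"), Decimal("0.0200")),
    "text-curie-001": (Decimal("0.0020"), Decimal("0.0020")),
    "text-babbage-001": (Decimal("0.0005"), Decimal("0.0005")),
    "text-ada-001": (Decimal("0.0004"), Decimal("0.0004")),
}


def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Return the estimated cost in USD of a single request (hypothetical helper)."""
    prompt_price, completion_price = PRICES_PER_1K[model]
    total = prompt_price * prompt_tokens + completion_price * completion_tokens
    return total / 1000


if __name__ == "__main__":
    # Illustrative request: 500 prompt tokens and 500 completion tokens.
    for name in ("gpt-3.5-turbo", "text-davinci-003", "gpt-4"):
        print(name, estimate_cost(name, 500, 500))
```

Decimal is used instead of float so that very small per-token amounts do not pick up rounding noise when many requests are summed.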
What’s the difference between GPT-3, GPT-3.5 and GPT-4 models?
GPT-3 models are “instruct” models that are meant to generate text from a clear instruction. They are not optimized for conversational chat. The best GPT-3 model is text-davinci-003, but it is by far the most expensive of them.
GPT-3.5 models (ChatGPT) were first released on March 1st, 2023. They are built on top of GPT-3 models and optimized for conversational chat. However, in the vast majority of cases, they are just as good with instructions as text-davinci-003. GPT-3.5 results can be too “chatty” or “creative” in some cases.
GPT-4 models are the latest breed of OpenAI models, released on March 14th, 2023.
- GPT-4 models are multimodal: they can take both text and image inputs.
- GPT-4 models can solve much more complex problems thanks to advanced reasoning capabilities, and are typically much better at maths than previous models.
- GPT-4 models can use two to eight times as many tokens in their context as the standard 4096-token GPT-3 and GPT-3.5 models.
- GPT-4 models are, however, significantly more expensive than ChatGPT (gpt-3.5-turbo): at the prices listed above, prompts are 20 to 40 times more expensive, and completions 30 to 60 times more expensive.
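The price multipliers between GPT-4 and ChatGPT can be recomputed from the pricing table earlier in this guide. The following is a small sketch under the assumption that gpt-3.5-turbo is the ChatGPT baseline; the helper name price_ratio is made up for this example.

```python
from fractions import Fraction

# USD per 1000 tokens, copied from the pricing table earlier in this guide.
GPT_35_TURBO = {"prompt": Fraction("0.0015"), "completion": Fraction("0.0020")}
GPT_4_MODELS = {
    "gpt-4": {"prompt": Fraction("0.0300"), "completion": Fraction("0.0600")},
    "gpt-4-32k": {"prompt": Fraction("0.0600"), "completion": Fraction("0.1200")},
}


def price_ratio(model: str, kind: str) -> Fraction:
    """How many times more expensive `model` is than gpt-3.5-turbo.

    `kind` is either "prompt" or "completion".
    """
    return GPT_4_MODELS[model][kind] / GPT_35_TURBO[kind]


if __name__ == "__main__":
    for model in GPT_4_MODELS:
        for kind in ("prompt", "completion"):
            print(f"{model} {kind}: x{float(price_ratio(model, kind)):g}")
```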
Which model to choose in GPT for Sheets and Docs?
The answer is ChatGPT (gpt-3.5-turbo) in the vast majority of cases.
It is about 10 times cheaper than Davinci (text-davinci-003) and, in our observations, about 3 times faster.
For that reason, it is the default model in all GPT for Sheets functions as well as in GPT for Docs. You should always start by experimenting with this one.
You can specify another model if:
- you find that ChatGPT is “too creative”, which can happen notably in classification or extraction use cases. In this case, we recommend using text-davinci-003
- you want a higher rate limit (GPT-3 models have higher rate limits than GPT-3.5 models)
- you want to use a fine-tuned model
How to specify a model in GPT for Sheets and Docs?
If you want to use a non-default model in GPT for Sheets, simply specify it as the last parameter of your function. When you specify the model, make sure to wrap it in quotation marks. For example "gpt-4".
If you want to use a non-default model in GPT for Docs, simply select it in the sidebar dropdown.
What is a fine-tuned model?
A fine-tuned model is a base model that was trained (fine-tuned) for a specific task by providing it with examples of inputs and expected outputs. You usually need between a few hundred and a few thousand examples to fine-tune a model.
You can learn how to fine-tune a model here.
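To make "examples of inputs and expected outputs" concrete, here is a hedged sketch of how such examples could be prepared as a JSONL file, with one JSON object per line. It assumes the prompt/completion layout used by OpenAI's legacy fine-tuning for the GPT-3 base models; the sample reviews, the to_jsonl helper and the training_data.jsonl file name are all invented for illustration, so check the fine-tuning guide for the exact format your model expects.

```python
import json

# Hypothetical training examples for a sentiment classification task.
# Each one pairs an input (prompt) with the expected output (completion).
EXAMPLES = [
    {"prompt": "Review: Great product, works perfectly.\nSentiment:", "completion": " positive"},
    {"prompt": "Review: Broke after two days.\nSentiment:", "completion": " negative"},
    {"prompt": "Review: It does the job, nothing more.\nSentiment:", "completion": " neutral"},
]


def to_jsonl(rows: list[dict]) -> str:
    """Serialize the examples as JSONL: one JSON object per line."""
    return "".join(json.dumps(row, ensure_ascii=False) + "\n" for row in rows)


if __name__ == "__main__":
    # Assumed output file name; a real dataset would contain hundreds of rows.
    with open("training_data.jsonl", "w", encoding="utf-8") as f:
        f.write(to_jsonl(EXAMPLES))
```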
When should I use a fine-tuned model?
A fine-tuned model can typically do only one thing, so you should use one if and only if you need a specific task to be performed in very high volumes.
If you are in such a situation, then using a fine-tuned model will reduce costs and increase speed and rate limits, as you can use one of the very cheap base models such as text-ada-001.
A typical use case is when you want the format of the output to follow very strict guidelines that are best explained by examples.
What are the costs of using a fine-tuned model?
There are two costs:
- a one-time training cost
- a pay-as-you-go usage cost
Model name | Technical name | Training price per 1000 tokens | Fine-tuned usage price per 1000 tokens |
Davinci | text-davinci-003 | USD 0.0300 | USD 0.1200 |
Curie | text-curie-001 | USD 0.0030 | USD 0.0120 |
Babbage | text-babbage-001 | USD 0.0006 | USD 0.0024 |
Ada | text-ada-001 | USD 0.0004 | USD 0.0016 |
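The two costs can be added up to estimate the total budget of a fine-tuning project. The sketch below is only an approximation under stated assumptions: fine_tune_cost is a hypothetical helper, training_tokens is taken to be the total number of tokens billed for training, and the prices are copied from the table above.

```python
from decimal import Decimal

# USD per 1000 tokens, copied from the table above:
# (one-time training price, pay-as-you-go usage price).
FINE_TUNE_PRICES_PER_1K = {
    "text-davinci-003": (Decimal("0.0300"), Decimal("0.1200")),
    "text-curie-001": (Decimal("0.0030"), Decimal("0.0120")),
    "text-babbage-001": (Decimal("0.0006"), Decimal("0.0024")),
    "text-ada-001": (Decimal("0.0004"), Decimal("0.0016")),
}


def fine_tune_cost(model: str, training_tokens: int, usage_tokens: int) -> Decimal:
    """Return training cost plus usage cost in USD (hypothetical helper)."""
    training_price, usage_price = FINE_TUNE_PRICES_PER_1K[model]
    training_cost = training_price * training_tokens / 1000
    usage_cost = usage_price * usage_tokens / 1000
    return training_cost + usage_cost


if __name__ == "__main__":
    # Illustrative project: 100,000 training tokens, then 1,000,000 usage tokens.
    for name in FINE_TUNE_PRICES_PER_1K:
        print(name, fine_tune_cost(name, 100_000, 1_000_000))
```

With volumes like these, the one-time training cost is small next to the usage cost, which is why the usage price is the figure to watch for high-volume tasks.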