OpenAI API response time tracker

The chart below tracks the response times of the main OpenAI models (GPT-4, GPT-3.5, GPT-3) API. The response times are measured by generating a maximum of 512 tokens at a temperature of 0.7 every 10 minutes in 3 locations. The maximum response time is capped at 60 seconds but could be higher in reality.

GPT for Work

OpenAI API servers are usually under extremely heavy load and response times are on average very high (bad) for gpt-4 and gpt-3.5-turbo (for openai free trial accounts), and lower (good) for other models.

How to get a faster response time?

  • Choose a model with a faster response time (text-davinci-003 is usually fast and comparable in output quality to gpt-3.5-turbo, but is 10x more expensive)
  • Try again outside of peak hours
  • Reduce your max_tokens parameter
  • Use the Google Docs integration if you want to generate longer content such as blog posts


Disclaimer: we are not affiliated with OpenAI. Check the official OpenAI status page.