OpenAI is launching a new API option called Flex processing, which offers lower prices for the company's AI models in exchange for slower response times and "occasional resource unavailability," TechCrunch reports.

Flex processing was introduced in beta for the recently launched o3 and o4-mini models. It is aimed at lower-priority and non-production workloads such as model evaluation, data enrichment, and asynchronous jobs, according to OpenAI.

Flex mode cuts API prices exactly in half:

  • For the o3 model, Flex pricing is $5 per million input tokens (~750,000 words) and $20 per million output tokens, versus the standard $10 and $40, respectively.
  • For o4-mini, Flex pricing drops to $0.55 per million input tokens and $2.20 per million output tokens, versus the standard $1.10 and $4.40.

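Opting into Flex is done per request. The sketch below shows how a non-urgent batch job might be routed to Flex processing; it assumes the official openai Python SDK and its service_tier request parameter, and the prompt and fallback handling are illustrative rather than anything described in the article.

    from openai import OpenAI  # assumes the official openai Python SDK is installed

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    prompt = "Classify the sentiment of each review in this batch."  # hypothetical low-priority task

    try:
        # Flex is requested per call via service_tier; responses arrive more
        # slowly than on the default tier, so allow a generous timeout.
        response = client.chat.completions.create(
            model="o3",
            service_tier="flex",
            timeout=900.0,  # up to 15 minutes for a slow Flex response
            messages=[{"role": "user", "content": prompt}],
        )
        print(response.choices[0].message.content)
    except Exception as exc:
        # "Occasional resource unavailability" means a Flex request can be
        # rejected outright; re-queue it or fall back to standard processing.
        print(f"Flex request failed ({exc}); retrying on the default tier.")
        response = client.chat.completions.create(
            model="o3",
            messages=[{"role": "user", "content": prompt}],
        )
        print(response.choices[0].message.content)

At Flex rates, that o3 call is billed at $5/$20 per million input/output tokens instead of $10/$40, so the trade-off is latency and availability rather than model quality.
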
OpenAI is rolling out Flex as the cost of frontier AI models keeps rising while cheaper competitors keep entering the market. Last week, for example, Google unveiled Gemini 2.5 Flash, a reasoning model that performs on par with DeepSeek R1 while charging less per input token.

In a letter to customers, OpenAI also announced that developers in usage tiers 1-3 will need to complete a new identity verification process to gain access to o3. Tiers are determined by how much a customer has spent on OpenAI services.

Reasoning summaries and streaming API support are also restricted for developers who have not completed verification. The ChatGPT maker has previously said the checks are intended to prevent abuse and violations of its usage policies.