Generative AI Service Pricing

Foundational models can be consumed on demand: you pay per character, based on the combined length of the prompt and the model's response. Embedding models are the exception; for them, the response isn't counted. In the table below, one transaction is one character, so 10,000 transactions equal 10,000 characters.
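
As a rough illustration of the per-character rule above, the following sketch estimates the charge for a single on-demand call. The function name, the placeholder price, and the use of Python's `len()` as the character count are all assumptions made for this example; they are not part of any OCI API, and the sketch assumes strictly proportional billing with no rounding.

```python
from decimal import Decimal

# From the pricing table: on-demand prices are quoted per 10,000 transactions,
# and one transaction is one character.
TRANSACTIONS_PER_PRICED_UNIT = 10_000


def on_demand_cost(prompt: str, response: str, price_per_10k: Decimal,
                   is_embedding: bool = False) -> Decimal:
    """Estimate the charge for one call (hypothetical helper, not an OCI API)."""
    # Assumption: a "character" is approximated by len(), which counts
    # Unicode code points; the service may count characters differently.
    billable_chars = len(prompt)
    if not is_embedding:
        # Embedding models are billed on the prompt only, so the response
        # is added for every other model type.
        billable_chars += len(response)
    # Assumption: billing is proportional, with no rounding up to a full 10,000.
    return Decimal(billable_chars) * price_per_10k / TRANSACTIONS_PER_PRICED_UNIT


# Placeholder price for illustration only, NOT a real rate.
example_price = Decimal("0.50")
print(on_demand_cost("a" * 6_000, "b" * 4_000, example_price))
print(on_demand_cost("a" * 6_000, "ignored", example_price, is_embedding=True))
```

Using `Decimal` rather than `float` keeps currency arithmetic exact, which matters when many small per-call charges are summed.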

Additionally, you can host private replicas of foundational models and create fine-tuned models on dedicated AI clusters. Dedicated AI clusters come in two types: hosting and fine-tuning. You create a hosting cluster by assigning AI units to it based on the model you want to host and the expected call volume to the model. Fine-tuning clusters require two AI units of the specific model you want to fine-tune. Once you create a fine-tuned model in a fine-tuning cluster, you can host it on your hosting cluster.

Dedicated AI clusters require a minimum commitment of 744 unit-hours (per cluster) for hosting models. Fine-tuning clusters require a minimum of 1 unit-hour.
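
The minimums above can be sketched as a small calculation. Note that 744 equals 24 × 31, that is, one AI unit running for a 31-day month. The helper below is hypothetical and rests on two assumptions: that unit-hours are AI units multiplied by hours, and that the minimum applies to that product for each cluster.

```python
HOSTING_MIN_UNIT_HOURS = 744     # minimum commitment per hosting cluster
FINE_TUNING_MIN_UNIT_HOURS = 1   # minimum per fine-tuning cluster
FINE_TUNING_AI_UNITS = 2         # fine-tuning clusters require two AI units


def billable_unit_hours(cluster_type: str, ai_units: int, hours: float) -> float:
    """Return the unit-hours charged for one cluster (hypothetical helper)."""
    if cluster_type == "hosting":
        minimum = HOSTING_MIN_UNIT_HOURS
    elif cluster_type == "fine-tuning":
        if ai_units != FINE_TUNING_AI_UNITS:
            raise ValueError("fine-tuning clusters require exactly two AI units")
        minimum = FINE_TUNING_MIN_UNIT_HOURS
    else:
        raise ValueError(f"unknown cluster type: {cluster_type!r}")
    # Assumption: unit-hours = AI units x hours, floored at the minimum.
    return max(ai_units * hours, minimum)


# One AI unit hosted for 100 hours is still charged the 744 unit-hour minimum.
print(billable_unit_hours("hosting", 1, 100))
# A three-hour fine-tuning job on two AI units.
print(billable_unit_hours("fine-tuning", 2, 3))
```

Multiplying the result by the relevant "AI unit per hour" price from the table gives the estimated charge for the cluster.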

OCI Generative AI

Service	Unit price	Unit
Cohere
Oracle Cloud Infrastructure Generative AI - Cohere Rerank - Dedicated		Cluster Hour
Oracle Cloud Infrastructure Generative AI - Large Cohere		10,000 Transactions
Oracle Cloud Infrastructure Generative AI - Small Cohere		10,000 Transactions
Oracle Cloud Infrastructure Generative AI - Embed Cohere		10,000 Transactions
Oracle Cloud Infrastructure Generative AI - Large Cohere - Dedicated		AI unit per hour
Oracle Cloud Infrastructure Generative AI - Small Cohere - Dedicated		AI unit per hour
Oracle Cloud Infrastructure Generative AI - Embed Cohere - Dedicated		AI unit per hour
Meta
Oracle Cloud Infrastructure Generative AI - Meta Llama 4 Scout		10,000 Transactions
Oracle Cloud Infrastructure Generative AI - Meta Llama 4 Maverick		10,000 Transactions
Oracle Cloud Infrastructure Generative AI - Large Meta		10,000 Transactions
Oracle Cloud Infrastructure Generative AI - Meta Llama 3.1 405B		10,000 Transactions
Oracle Cloud Infrastructure Generative AI - Meta Llama 3.2 90B Vision		10,000 Transactions
Oracle Cloud Infrastructure Generative AI - Large Meta - Dedicated		AI unit per hour
xAI
Oracle Cloud Infrastructure Generative AI - xAI - Grok 4 Code - Grok-Code-Fast-1 - Input Tokens		1,000,000 Tokens
Oracle Cloud Infrastructure Generative AI - xAI - Grok 4 Code - Grok-Code-Fast-1 - Cached Input Tokens		1,000,000 Tokens
Oracle Cloud Infrastructure Generative AI - xAI - Grok 4 Code - Grok-Code-Fast-1 - Output Tokens		1,000,000 Tokens
Oracle Cloud Infrastructure Generative AI - xAI - Grok 4 Fast - Input Tokens less than 128K Tokens		1,000,000 Tokens
Oracle Cloud Infrastructure Generative AI - xAI - Grok 4 Fast - Input Tokens greater than 128K Tokens		1,000,000 Tokens
Oracle Cloud Infrastructure Generative AI - xAI - Grok 4 Fast - Cached Input Tokens less than 128K Tokens		1,000,000 Tokens
Oracle Cloud Infrastructure Generative AI - xAI - Grok 4 Fast - Cached Input Tokens greater than 128K Tokens		1,000,000 Tokens
Oracle Cloud Infrastructure Generative AI - xAI - Grok 4 Fast - Output Tokens less than 128K Tokens		1,000,000 Tokens
Oracle Cloud Infrastructure Generative AI - xAI - Grok 4 Fast - Output Tokens greater than 128K Tokens		1,000,000 Tokens
Oracle Cloud Infrastructure Generative AI - xAI - Grok 3 or Grok 4 - Input Tokens		1,000,000 Tokens
Oracle Cloud Infrastructure Generative AI - xAI - Grok 3 or Grok 4 - Cached Input Tokens		1,000,000 Tokens
Oracle Cloud Infrastructure Generative AI - xAI - Grok 3 or Grok 4 - Output Tokens		1,000,000 Tokens
Oracle Cloud Infrastructure Generative AI - xAI - Grok 3 Mini - Input Tokens		1,000,000 Tokens
Oracle Cloud Infrastructure Generative AI - xAI - Grok 3 Mini - Cached Input Tokens		1,000,000 Tokens
Oracle Cloud Infrastructure Generative AI - xAI - Grok 3 Mini - Output Tokens		1,000,000 Tokens
Oracle Cloud Infrastructure Generative AI - xAI - Grok 3 Fast - Input Tokens		1,000,000 Tokens
Oracle Cloud Infrastructure Generative AI - xAI - Grok 3 Fast - Cached Input Tokens		1,000,000 Tokens
Oracle Cloud Infrastructure Generative AI - xAI - Grok 3 Fast - Output Tokens		1,000,000 Tokens
Oracle Cloud Infrastructure Generative AI - xAI - Grok 3 Mini Fast - Input Tokens		1,000,000 Tokens
Oracle Cloud Infrastructure Generative AI - xAI - Grok 3 Mini Fast - Cached Input Tokens		1,000,000 Tokens
Oracle Cloud Infrastructure Generative AI - xAI - Grok 3 Mini Fast - Output Tokens		1,000,000 Tokens
Google
Oracle Cloud Infrastructure Generative AI - Google - Gemini 2.5 Pro - Input Tokens - Text, Image, Audio, and Video less than 200K input tokens		1,000,000 Tokens
Oracle Cloud Infrastructure Generative AI - Google - Gemini 2.5 Pro - Input Tokens - Text, Image, Audio, and Video greater than 200K input tokens		1,000,000 Tokens
Oracle Cloud Infrastructure Generative AI - Google - Gemini 2.5 Pro - Output Tokens - Text Output less than 200K input tokens		1,000,000 Tokens
Oracle Cloud Infrastructure Generative AI - Google - Gemini 2.5 Pro - Output Tokens - Text Output greater than 200K input tokens		1,000,000 Tokens
Oracle Cloud Infrastructure Generative AI - Google - Gemini 2.5 Flash GA - Input Tokens - Text, Image, and Video		1,000,000 Tokens
Oracle Cloud Infrastructure Generative AI - Google - Gemini 2.5 Flash GA - Input Tokens - Audio		1,000,000 Tokens
Oracle Cloud Infrastructure Generative AI - Google - Gemini 2.5 Flash GA - Output Tokens - Text		1,000,000 Tokens
Oracle Cloud Infrastructure Generative AI - Google - Gemini 2.5 Flash Lite - Input Tokens - Text, Image, and Video		1,000,000 Tokens
Oracle Cloud Infrastructure Generative AI - Google - Gemini 2.5 Flash Lite - Input Tokens - Audio		1,000,000 Tokens
Oracle Cloud Infrastructure Generative AI - Google - Gemini 2.5 Flash Lite - Output Tokens - Text		1,000,000 Tokens
OpenAI
Oracle Cloud Infrastructure Generative AI - OpenAI - gpt-oss-120b - Input Tokens		1,000,000 Tokens
Oracle Cloud Infrastructure Generative AI - OpenAI - gpt-oss-120b - Output Tokens		1,000,000 Tokens
Oracle Cloud Infrastructure Generative AI - OpenAI - gpt-oss-20b - Input Tokens		1,000,000 Tokens
Oracle Cloud Infrastructure Generative AI - OpenAI - gpt-oss-20b - Output Tokens		1,000,000 Tokens
Oracle Cloud Infrastructure Generative AI - OpenAI - Dedicated		AI unit per hour
Custom Model
Oracle Cloud Infrastructure Generative AI - Model Import		AI unit per hour
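
Several rows in the table above are priced per 1,000,000 tokens, with separate rows for input, cached input, and output tokens. The sketch below shows how such a charge might be estimated for a model with flat rates. The function, its parameters, and the prices are hypothetical. It assumes that cached input tokens are billed only at the cached rate and not also at the regular input rate, and it does not model the rows that split pricing at a 128K or 200K token threshold.

```python
from decimal import Decimal

# From the pricing table: token-based prices are quoted per 1,000,000 tokens.
TOKENS_PER_PRICED_UNIT = 1_000_000


def token_cost(input_tokens: int, cached_input_tokens: int, output_tokens: int,
               price_input: Decimal, price_cached_input: Decimal,
               price_output: Decimal) -> Decimal:
    """Estimate the charge for token-priced usage (hypothetical helper)."""
    # Assumption: the three token categories are disjoint and each is
    # billed proportionally at its own rate, with no rounding.
    total = (Decimal(input_tokens) * price_input
             + Decimal(cached_input_tokens) * price_cached_input
             + Decimal(output_tokens) * price_output)
    return total / TOKENS_PER_PRICED_UNIT


# Placeholder prices for illustration only, NOT real rates.
print(token_cost(500_000, 200_000, 100_000,
                 Decimal("3"), Decimal("1"), Decimal("15")))
```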
