Vertex AI pricing

Quick answer

Google Vertex AI pricing depends on the service you use: training, online prediction, or batch prediction. Costs vary by model type, compute resources, and usage volume, and current rates are listed on the Google Cloud Vertex AI pricing page. You pay for compute time, storage, and data processed. Some free usage or trial credits may be available, depending on your account.

Prerequisites

Python 3.8+
Google Cloud account with billing enabled
Google Cloud SDK installed and configured
pip install google-cloud-aiplatform

Setup

Install the google-cloud-aiplatform Python package and set up authentication with your Google Cloud project and billing enabled.

Enable Vertex AI API in Google Cloud Console.
Set environment variable GOOGLE_APPLICATION_CREDENTIALS to your service account JSON key path.

bash

pip install google-cloud-aiplatform

Step by step

Pricing information is not available through the google-cloud-aiplatform SDK, so you typically refer to the pricing page for rates. Instead, this example shows how to create a Vertex AI prediction client and run a simple online prediction, which incurs costs based on usage.

python

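import os

from google.cloud import aiplatform
from google.protobuf import json_format
from google.protobuf.struct_pb2 import Value

# Set your Google Cloud project and location
PROJECT_ID = os.environ['GOOGLE_CLOUD_PROJECT']  # raises KeyError if unset
LOCATION = 'us-central1'

# Initialize the prediction client against the regional API endpoint
client = aiplatform.gapic.PredictionServiceClient(
    client_options={"api_endpoint": f"{LOCATION}-aiplatform.googleapis.com"}
)

# Build the full endpoint resource name (replace with your endpoint ID)
endpoint = client.endpoint_path(PROJECT_ID, LOCATION, 'YOUR_ENDPOINT_ID')

# Instances must be protobuf Values; the expected fields depend on your model
instances = [json_format.ParseDict({"content": "Hello, Vertex AI!"}, Value())]

response = client.predict(endpoint=endpoint, instances=instances)
print("Prediction response:", response.predictions)

# Note: Pricing depends on model type, instance hours, and data processed.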

output

Prediction response: [...]

Common variations

Vertex AI pricing varies by:

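Training: Charged per hour of training, at a rate set by the machine type and any accelerators.
Online prediction: Charged per node hour for as long as a model stays deployed to an endpoint, even when it receives no traffic.
Batch prediction: Charged for the compute the job uses, so cost grows with the volume of data processed.
Custom models vs. prebuilt: Custom models are billed for the compute they run on, while prebuilt models are typically billed per request or per unit of input and output.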

To control costs, choose the machine type and replica count explicitly when you submit batch jobs or custom training through the google-cloud-aiplatform SDK.
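As a rough sketch of what setting those values explicitly can look like, the snippet below keeps the cost-relevant batch job settings in one small helper and passes them to Model.batch_predict. The helper names, the job display name, the default machine type, and the replica cap are illustrative assumptions, not recommendations; the model ID and Cloud Storage paths are yours to supply.

```python
def batch_job_settings(machine_type="n1-standard-4", max_replicas=2):
    """Collect the settings that drive batch prediction cost (hypothetical helper)."""
    if max_replicas < 1:
        raise ValueError("max_replicas must be at least 1")
    return {
        "job_display_name": "cost-capped-batch-job",  # illustrative name
        "machine_type": machine_type,                 # larger machines cost more per hour
        "starting_replica_count": 1,
        "max_replica_count": max_replicas,            # caps how far the job can scale out
    }


def submit_batch_job(project, location, model_id, gcs_source, gcs_destination_prefix, settings):
    """Submit a batch prediction job with the given settings (this call incurs costs)."""
    # Imported here so the settings helper can be used without the SDK installed.
    from google.cloud import aiplatform

    aiplatform.init(project=project, location=location)
    model = aiplatform.Model(model_name=model_id)
    return model.batch_predict(
        gcs_source=gcs_source,
        gcs_destination_prefix=gcs_destination_prefix,
        **settings,
    )


settings = batch_job_settings(max_replicas=2)
print(settings)
```

Keeping the settings in one place makes it easier to review the machine type and replica cap before a job is submitted.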

Pricing component	Description
Training	Billed per hour based on machine type and scale.
Online prediction	Billed per node hour while the model is deployed, based on machine type.
Batch prediction	Billed for the compute used by the job, which scales with the data processed.
Storage	Charged for model and dataset storage.
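These components combine roughly as hours multiplied by an hourly rate, plus storage. The sketch below works through that arithmetic. Every rate in it is a placeholder invented for illustration, not an actual Vertex AI price, and the class and function names are hypothetical. Take real rates from the pricing page.

```python
from dataclasses import dataclass


@dataclass
class Rates:
    """Hourly and monthly rates. Values used below are placeholders, not real prices."""
    training_per_node_hour: float
    serving_per_node_hour: float
    storage_per_gb_month: float


def estimate_monthly_cost(rates, training_hours, serving_nodes, serving_hours, storage_gb):
    """Return a per-component cost breakdown and the total for one month."""
    training = training_hours * rates.training_per_node_hour
    # A deployed model is billed for every node, for every hour it stays deployed.
    serving = serving_nodes * serving_hours * rates.serving_per_node_hour
    storage = storage_gb * rates.storage_per_gb_month
    return {
        "training": training,
        "serving": serving,
        "storage": storage,
        "total": training + serving + storage,
    }


# Placeholder rates, chosen only to make the arithmetic easy to follow.
example_rates = Rates(
    training_per_node_hour=0.25,
    serving_per_node_hour=0.5,
    storage_per_gb_month=0.125,
)

estimate = estimate_monthly_cost(
    example_rates,
    training_hours=10,   # one 10-hour training run
    serving_nodes=2,     # two nodes deployed all month
    serving_hours=720,   # 30 days * 24 hours
    storage_gb=50,
)
print(estimate)
```

With these placeholder numbers the always-on endpoint dominates the total, which is why deployed node count and machine type are usually the first things to check.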

Troubleshooting

If you see unexpectedly high costs:

Check your deployed model instance count and machine types.
Review batch prediction data size and frequency.
Use Google Cloud Billing reports to analyze usage.
Set budgets and alerts in Google Cloud Console.
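Budget alerts fire when cumulative spend crosses a percentage of the budget. The sketch below reproduces that logic locally so you can see when alerts would trigger for a given spending pattern. The threshold percentages, function name, and daily cost figures are assumptions for illustration; real budgets are configured and evaluated in Google Cloud Billing.

```python
def crossed_thresholds(budget, daily_costs, thresholds=(0.5, 0.9, 1.0)):
    """Return (day, threshold) pairs for the first day each threshold is reached."""
    alerts = []
    remaining = sorted(thresholds)
    spent = 0.0
    for day, cost in enumerate(daily_costs, start=1):
        spent += cost
        # One day's spend can cross more than one threshold at once.
        while remaining and spent >= remaining[0] * budget:
            alerts.append((day, remaining.pop(0)))
    return alerts


# Hypothetical daily costs measured against a budget of 100.
alerts = crossed_thresholds(budget=100.0, daily_costs=[20.0, 20.0, 20.0, 35.0, 10.0])
print(alerts)  # [(3, 0.5), (4, 0.9), (5, 1.0)]
```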

For quota errors, verify your project limits and request increases if needed.


Key Takeaways

Vertex AI pricing depends on training, prediction, and storage usage.
Use appropriate machine types and scale to optimize costs.
Monitor usage with Google Cloud Billing and set budgets to avoid surprises.

Verified 2026-04 · gemini-2.5-pro, gpt-4o
