Vertex AI pricing

Quick answer
Google Vertex AI pricing depends on the type of service used, including training, batch prediction, and online prediction. Costs vary by model size, compute resources, and usage volume, with detailed pricing available on the Google Cloud Vertex AI pricing page. You pay for compute time, storage, and data processed, with some free tier usage available.

PREREQUISITES

  • Python 3.8+
  • Google Cloud account with billing enabled
  • Google Cloud SDK installed and configured
  • pip install google-cloud-aiplatform

Setup

Install the google-cloud-aiplatform Python package and set up authentication for a Google Cloud project that has billing enabled.

  • Enable Vertex AI API in Google Cloud Console.
  • Set environment variable GOOGLE_APPLICATION_CREDENTIALS to your service account JSON key path.
bash
pip install google-cloud-aiplatform
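
Before making billable calls, it can help to confirm that the credentials variable is set as described above. The following is a minimal sketch using only the standard library. The helper name `check_credentials_env` is hypothetical and not part of the SDK, and it only checks that the path exists, not that the key is valid.

```python
import os


def check_credentials_env(env=None):
    """Return (ok, message) describing GOOGLE_APPLICATION_CREDENTIALS.

    Hypothetical helper, not part of google-cloud-aiplatform. It only
    checks that the variable is set and points at an existing file.
    """
    env = os.environ if env is None else env
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return False, "GOOGLE_APPLICATION_CREDENTIALS is not set"
    if not os.path.isfile(path):
        return False, f"key file not found: {path}"
    return True, f"using service account key: {path}"


if __name__ == "__main__":
    ok, message = check_credentials_env()
    print("OK" if ok else "PROBLEM", "-", message)
```

Passing a dictionary as `env` lets you exercise the check without touching the real environment.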

Step by step

Pricing information is not directly available through the Vertex AI API, so you typically refer to the pricing page. Instead, this example shows how to create a Vertex AI client and run a simple prediction, which incurs costs based on usage.

python
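from google.cloud import aiplatform
from google.protobuf import json_format
from google.protobuf.struct_pb2 import Value
import os

# Set your Google Cloud project and location
PROJECT_ID = os.environ.get('GOOGLE_CLOUD_PROJECT')
LOCATION = 'us-central1'

# Initialize Vertex AI client against the regional API endpoint
client = aiplatform.gapic.PredictionServiceClient(
    client_options={"api_endpoint": f"{LOCATION}-aiplatform.googleapis.com"}
)

# Example: Prepare a prediction request (replace with your model details)
endpoint = client.endpoint_path(PROJECT_ID, LOCATION, 'YOUR_ENDPOINT_ID')

# The low-level client expects protobuf Value objects, not plain dicts.
# The instance fields depend on your deployed model's input schema.
instances = [json_format.ParseDict({"content": "Hello, Vertex AI!"}, Value())]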

response = client.predict(endpoint=endpoint, instances=instances)
print("Prediction response:", response.predictions)

# Note: Pricing depends on model type, instance hours, and data processed.
output
Prediction response: [...]

Common variations

Vertex AI pricing varies by:

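  • Training: Charged per training hour, at a rate set by the machine type.
  • Online prediction: Charged per node hour while a model is deployed, at a rate set by the machine type, whether or not it is serving requests.
  • Batch prediction: Charged for the compute the job consumes, which grows with the data volume processed.
  • Custom models vs. prebuilt: Pricing differs between custom-trained models and Google's prebuilt models.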

With the google-cloud-aiplatform SDK, you choose the machine type for batch jobs and custom training, which gives you direct control over costs.
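
As an illustration of choosing a machine type for a batch job, here is a hedged sketch. It assumes `Model.batch_predict` from the google-cloud-aiplatform SDK with its `machine_type` and replica-count parameters. The model ID, bucket paths, job name, and machine types are placeholders, and both helper names are hypothetical. Smaller machines and fewer replicas generally lower the hourly rate but lengthen the job, so check the trade-off against the pricing page.

```python
def batch_job_settings(machine_type="n1-standard-4", max_replicas=1):
    """Collect the cost-relevant settings for a batch prediction job.

    Hypothetical helper. Bucket paths and the job name are placeholders.
    """
    return {
        "job_display_name": "pricing-demo-batch-job",
        "gcs_source": "gs://YOUR_BUCKET/input.jsonl",
        "gcs_destination_prefix": "gs://YOUR_BUCKET/output/",
        "machine_type": machine_type,        # determines the hourly rate
        "starting_replica_count": 1,
        "max_replica_count": max_replicas,   # caps how many nodes run in parallel
    }


def submit_batch_job(project_id, location, model_id, settings):
    """Submit the job. On a real project this call incurs charges."""
    # Imported here so batch_job_settings works without the SDK installed.
    from google.cloud import aiplatform

    aiplatform.init(project=project_id, location=location)
    model = aiplatform.Model(model_id)
    return model.batch_predict(**settings)


settings = batch_job_settings(machine_type="n1-standard-2", max_replicas=2)
print(settings["machine_type"], settings["max_replica_count"])
```

Keeping the settings in one dictionary makes it easy to review the billable choices before calling `submit_batch_job`.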

Pricing component    Description
Training             Billed per hour based on machine type and scale.
Online prediction    Billed per hour of deployed model and compute.
Batch prediction     Billed for the compute the job consumes, which grows with the data processed.
Storage              Charged for model and dataset storage.
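
To make the components in the table concrete, the arithmetic can be sketched as follows. Every rate here is a made-up placeholder, not a Google Cloud price, and the class, function, and field names are hypothetical. Substitute the current figures from the Vertex AI pricing page for your region and machine type.

```python
from dataclasses import dataclass


@dataclass
class Rates:
    """Placeholder rates in USD. These are NOT real Google Cloud prices."""
    training_per_node_hour: float = 0.50
    online_per_node_hour: float = 0.25
    batch_per_unit: float = 0.25      # unit = whatever the pricing page meters
    storage_per_gb_month: float = 0.02


def estimate_monthly_cost(rates, training_node_hours, online_nodes,
                          batch_units, storage_gb, hours_in_month=730):
    """Return a per-component cost breakdown plus the total."""
    breakdown = {
        "training": training_node_hours * rates.training_per_node_hour,
        # A deployed model is billed for every hour it stays deployed.
        "online_prediction": online_nodes * hours_in_month * rates.online_per_node_hour,
        "batch_prediction": batch_units * rates.batch_per_unit,
        "storage": storage_gb * rates.storage_per_gb_month,
    }
    breakdown["total"] = sum(breakdown.values())
    return breakdown


costs = estimate_monthly_cost(Rates(), training_node_hours=10, online_nodes=1,
                              batch_units=4, storage_gb=50)
for component, usd in costs.items():
    print(f"{component}: ${usd:.2f}")
```

With these placeholder numbers the always-on endpoint dominates the total, which is why deployed node count is usually the first thing to check.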

Troubleshooting

If you see unexpectedly high costs:

  • Check your deployed model instance count and machine types.
  • Review batch prediction data size and frequency.
  • Use Google Cloud Billing reports to analyze usage.
  • Set budgets and alerts in Google Cloud Console.
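
The budget checks in the list above boil down to comparing projected spend against a threshold. Here is a minimal sketch of that logic in plain Python. It does not call the Cloud Billing API, the function name and the 50/90/100 percent thresholds are illustrative assumptions, and the spend figures are invented sample data.

```python
def projected_budget_alerts(daily_spend, monthly_budget, days_in_month=30,
                            thresholds=(0.5, 0.9, 1.0)):
    """Project month-end spend from daily figures and list crossed thresholds.

    Uses a simple average-per-day projection. Illustrative only.
    """
    if not daily_spend:
        return 0.0, []
    average = sum(daily_spend) / len(daily_spend)
    projected = average * days_in_month
    crossed = [t for t in thresholds if projected >= t * monthly_budget]
    return projected, crossed


# Invented sample data: spend in USD for the first five days of a month.
projected, crossed = projected_budget_alerts([4.0, 6.0, 5.0, 5.0, 5.0],
                                             monthly_budget=160.0)
print(f"Projected: ${projected:.2f}")
print("Thresholds crossed:", [f"{round(t * 100)}%" for t in crossed])
```

For real alerts, budgets configured in Google Cloud Console remain the source of truth. A projection like this is only a quick sanity check on exported usage numbers.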

For quota errors, verify your project limits and request increases if needed.

Key Takeaways

  • Vertex AI pricing depends on training, prediction, and storage usage.
  • Use appropriate machine types and scale to optimize costs.
  • Monitor usage with Google Cloud Billing and set budgets to avoid surprises.
Verified 2026-04 · gemini-2.5-pro, gpt-4o