Vertex AI pricing
Quick answer
Google Vertex AI pricing depends on the service you use, including training, batch prediction, and online prediction. Costs vary by model type, compute resources, and usage volume; current rates are listed on the Google Cloud Vertex AI pricing page. You pay for compute time, storage, and data processed, and some free tier usage is available.
Prerequisites
- Python 3.8+
- Google Cloud account with billing enabled
- Google Cloud SDK installed and configured
- pip install google-cloud-aiplatform
Setup
Install the google-cloud-aiplatform Python package, then set up authentication for a Google Cloud project that has billing enabled.
- Enable the Vertex AI API in the Google Cloud Console.
- Set the environment variable GOOGLE_APPLICATION_CREDENTIALS to your service account JSON key path.
pip install google-cloud-aiplatform
Step by step
Pricing information is not directly available through the Vertex AI API, so you typically refer to the pricing page. Instead, this example shows how to connect to Vertex AI and run a simple online prediction against a deployed endpoint, which incurs costs based on usage.
import os

from google.cloud import aiplatform

# Set your Google Cloud project and location
# (fails fast with KeyError if GOOGLE_CLOUD_PROJECT is not set)
PROJECT_ID = os.environ["GOOGLE_CLOUD_PROJECT"]
LOCATION = "us-central1"

# Initialize the Vertex AI SDK for this project and region
aiplatform.init(project=PROJECT_ID, location=LOCATION)

# Reference an existing deployed endpoint (replace with your endpoint ID)
endpoint = aiplatform.Endpoint("YOUR_ENDPOINT_ID")

# The instance format depends on your model; this payload is a placeholder
instances = [{"content": "Hello, Vertex AI!"}]
response = endpoint.predict(instances=instances)
print("Prediction response:", response.predictions)

# Note: Pricing depends on model type, instance hours, and data processed.
output
Prediction response: [...]
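Because a deployed model is billed for the time its nodes are running rather than per request, the effective cost of each prediction depends on how much traffic the endpoint serves. The sketch below illustrates that arithmetic only. `HYPOTHETICAL_NODE_RATE_USD` and the function name are illustrative placeholders, not published prices or part of the SDK, so substitute the rate listed on the pricing page for your machine type.

```python
# Hypothetical hourly rate per node; NOT a real Vertex AI price.
HYPOTHETICAL_NODE_RATE_USD = 0.25


def cost_per_thousand_predictions(node_rate_usd, replicas, hours_deployed, predictions):
    """Spread the endpoint's deployment cost over the predictions it served."""
    if predictions <= 0:
        raise ValueError("predictions must be positive")
    total_cost = node_rate_usd * replicas * hours_deployed
    return total_cost / predictions * 1000


# One node deployed for a 720-hour month at the hypothetical rate.
busy = cost_per_thousand_predictions(HYPOTHETICAL_NODE_RATE_USD, 1, 720, 1_000_000)
quiet = cost_per_thousand_predictions(HYPOTHETICAL_NODE_RATE_USD, 1, 720, 10_000)
print(f"busy endpoint:  ${busy:.2f} per 1,000 predictions")
print(f"quiet endpoint: ${quiet:.2f} per 1,000 predictions")
```

With the same deployment cost, the low-traffic endpoint pays far more per prediction, which is why undeploying idle models is usually the first cost lever to check.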
Common variations
Vertex AI pricing varies by:
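- Training: Charged per hour of training, at a rate set by the machine type.
- Online prediction: Charged per hour that a model stays deployed, based on the compute resources allocated to it.
- Batch prediction: Charged for the compute the job uses or the volume of data processed, depending on the model type.
- Custom models vs. prebuilt: Different pricing tiers apply.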
To control costs, use the google-cloud-aiplatform SDK to run batch jobs or custom training on a machine type sized for the workload.
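Choosing a machine type is a trade-off between hourly rate and run time: a more expensive machine can still be cheaper overall if it finishes sooner. A minimal sketch of that comparison follows. The machine labels, rates, and durations are all made-up illustrative values, not Vertex AI machine types or prices.

```python
# All labels, rates, and durations below are hypothetical, for illustration only.
candidates = {
    "small-machine": {"rate_usd_per_hour": 0.50, "estimated_hours": 10.0},
    "large-machine": {"rate_usd_per_hour": 1.50, "estimated_hours": 3.0},
}


def training_cost(rate_usd_per_hour, estimated_hours):
    """Training cost is the hourly rate times the hours the job runs."""
    return rate_usd_per_hour * estimated_hours


costs = {name: training_cost(**spec) for name, spec in candidates.items()}
cheapest = min(costs, key=costs.get)

for name, cost in costs.items():
    print(f"{name}: ${cost:.2f}")
print("cheapest:", cheapest)
```

In practice the run-time estimates would come from a short trial run on each machine type rather than from guesses.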
| Pricing component | Description |
|---|---|
| Training | Billed per hour based on machine type and scale. |
| Online prediction | Billed per hour that a model is deployed, based on its compute resources. |
| Batch prediction | Billed by compute used or data volume processed, depending on the model type. |
| Storage | Charged for model and dataset storage. |
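The components in the table can be combined into a rough monthly estimate. The sketch below is plain arithmetic, not an official calculator: every rate, unit, and usage figure in it is a placeholder to be replaced with values from the pricing page and your own billing reports, and the `LineItem` class is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass
class LineItem:
    component: str
    quantity: float   # usage expressed in the unit below
    unit: str         # billing unit, e.g. "node hour" or "GB-month"
    rate_usd: float   # hypothetical price per unit, NOT a real rate

    @property
    def cost(self) -> float:
        return self.quantity * self.rate_usd


# Placeholder usage and rates for one month; replace with your own figures.
items = [
    LineItem("Training", 20, "node hour", 1.00),
    LineItem("Online prediction", 720, "node hour", 0.25),
    LineItem("Batch prediction", 50, "GB", 0.10),
    LineItem("Storage", 100, "GB-month", 0.02),
]

total = sum(item.cost for item in items)
for item in items:
    print(f"{item.component:<18} {item.quantity:>6} {item.unit:<10} ${item.cost:>8.2f}")
print(f"{'Total':<18} {'':>6} {'':<10} ${total:>8.2f}")
```

Laying the estimate out per component makes it easy to see which line dominates; with these placeholder numbers it is the always-on online endpoint.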
Troubleshooting
If you see unexpectedly high costs:
- Check your deployed model instance count and machine types.
- Review batch prediction data size and frequency.
- Use Google Cloud Billing reports to analyze usage.
- Set budgets and alerts in Google Cloud Console.
For quota errors, verify your project limits and request increases if needed.
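Budget alerts fire when spend crosses a percentage of the budgeted amount. As a rough illustration of that logic, independent of any Google Cloud API, the sketch below checks month-to-date spend against a set of thresholds. The threshold values, dollar amounts, and function name are illustrative assumptions.

```python
def crossed_thresholds(spend_usd, budget_usd, thresholds=(0.5, 0.9, 1.0)):
    """Return the budget fractions that current spend has reached or passed."""
    if budget_usd <= 0:
        raise ValueError("budget_usd must be positive")
    fraction_used = spend_usd / budget_usd
    return [t for t in thresholds if fraction_used >= t]


# Hypothetical figures: $190 spent against a $200 monthly budget.
alerts = crossed_thresholds(spend_usd=190.0, budget_usd=200.0)
print([f"{t:.0%}" for t in alerts])
```

The same check could be run against figures exported from Google Cloud Billing reports to spot overspend before the month ends.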
Key takeaways
- Vertex AI pricing depends on training, prediction, and storage usage.
- Use appropriate machine types and scale to optimize costs.
- Monitor usage with Google Cloud Billing and set budgets to avoid surprises.