How to deploy an ML model to Google Cloud
Quick answer
To deploy an ML model to Google Cloud, package your trained model and upload it to Google Cloud Storage, then register it as a model resource on Vertex AI and deploy it to an endpoint using the gcloud CLI or the Google Cloud Console. Vertex AI provides managed serving with autoscaling and monitoring.

Prerequisites

- Python 3.8+
- Google Cloud account with billing enabled
- Google Cloud SDK installed and configured
- gcloud CLI authenticated (run `gcloud auth login`)
- Model saved in a supported format (e.g., TensorFlow SavedModel, PyTorch TorchScript)
- Google Cloud Storage bucket created
Set up the Google Cloud environment
Install and configure the Google Cloud SDK, authenticate your account, and create a Cloud Storage bucket to hold your model files.
```bash
# Install the Google Cloud SDK (if not already installed)
curl https://sdk.cloud.google.com | bash
exec -l $SHELL

# Authenticate your account
gcloud auth login

# Set your project ID
gcloud config set project YOUR_PROJECT_ID

# Create a Cloud Storage bucket (replace BUCKET_NAME)
gsutil mb gs://BUCKET_NAME
```

Output:

```
Creating gs://BUCKET_NAME/...
```
Step-by-step deployment
Save your trained model locally, upload it to Cloud Storage, then deploy it on Vertex AI using the gcloud CLI.
```bash
# Upload your model directory to Cloud Storage
# (replace MODEL_DIR and BUCKET_NAME)
gsutil cp -r MODEL_DIR gs://BUCKET_NAME/model/

# Register the model with Vertex AI
# (replace MODEL_NAME; adjust the region and container image as needed)
gcloud ai models upload \
  --region=us-central1 \
  --display-name=MODEL_NAME \
  --artifact-uri=gs://BUCKET_NAME/model/ \
  --container-image-uri=us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-8:latest

# Create an endpoint
ENDPOINT_ID=$(gcloud ai endpoints create \
  --region=us-central1 \
  --display-name=MODEL_NAME-endpoint \
  --format='value(name)')

# Deploy the model to the endpoint
gcloud ai endpoints deploy-model $ENDPOINT_ID \
  --region=us-central1 \
  --model=$(gcloud ai models list --region=us-central1 --filter="displayName=MODEL_NAME" --format='value(name)') \
  --display-name=MODEL_NAME-deployment \
  --machine-type=n1-standard-4
```

Output:

```
Uploading model...
Model uploaded successfully.
Endpoint created: projects/PROJECT_ID/locations/us-central1/endpoints/ENDPOINT_ID
Model deployed to endpoint.
```
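Once the deployment finishes, you can test the endpoint with an online prediction request. The sketch below builds a request body with the standard library; the instance shape is a placeholder (four numeric features) and the real format depends on your model's input signature:

```python
import json

# Example request body for Vertex AI online prediction.
# The "instances" format depends on your model's input signature;
# a 4-feature numeric vector is used here purely as a placeholder.
request = {"instances": [[5.1, 3.5, 1.4, 0.2]]}

with open("request.json", "w") as f:
    json.dump(request, f)

# Send the request with the gcloud CLI (replace ENDPOINT_ID):
#   gcloud ai endpoints predict ENDPOINT_ID \
#     --region=us-central1 \
#     --json-request=request.json
print(json.dumps(request))
```

The response contains a `predictions` array with one entry per instance in the request.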
Common variations
- Use different container images for PyTorch or custom models.
- Deploy using the Vertex AI Python SDK for programmatic control.
- Enable autoscaling and traffic splitting on endpoints.
- Use asynchronous batch prediction for large datasets.
For example, with the Vertex AI Python SDK:

```python
from google.cloud import aiplatform

# Initialize the Vertex AI client
aiplatform.init(project='YOUR_PROJECT_ID', location='us-central1')

# Upload the model
model = aiplatform.Model.upload(
    display_name='MODEL_NAME',
    artifact_uri='gs://BUCKET_NAME/model/',
    serving_container_image_uri='us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-8:latest',
)

# Deploy the model to an endpoint, autoscaling between 1 and 3 replicas
endpoint = model.deploy(
    machine_type='n1-standard-4',
    min_replica_count=1,
    max_replica_count=3,
)

print(f'Deployed model to endpoint: {endpoint.resource_name}')
```

Output:

```
Deployed model to endpoint: projects/PROJECT_ID/locations/us-central1/endpoints/ENDPOINT_ID
```
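For the batch-prediction variation, a job reads inputs from Cloud Storage and writes results back without a live endpoint. The sketch below only assembles the job configuration with the standard library; the parameter names mirror the Vertex AI SDK's `Model.batch_predict`, but the paths and display name are placeholders, and the call itself is commented out because it requires credentials:

```python
# Configuration for a Vertex AI batch prediction job.
# Paths and names below are placeholders; adjust for your bucket and model.
def make_batch_predict_kwargs(bucket: str) -> dict:
    return {
        "job_display_name": "MODEL_NAME-batch",
        "gcs_source": f"gs://{bucket}/batch_inputs/input.jsonl",
        "gcs_destination_prefix": f"gs://{bucket}/batch_outputs/",
        "machine_type": "n1-standard-4",
    }

kwargs = make_batch_predict_kwargs("BUCKET_NAME")

# With a Model object from the SDK, the job would be started with:
#   job = model.batch_predict(**kwargs)
#   job.wait()
print(kwargs["gcs_source"])
```

Batch jobs run asynchronously, so they suit large datasets where per-request latency does not matter.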
Troubleshooting common issues
- If you get permission errors, ensure your IAM roles include Vertex AI Admin and Storage Object Admin.
- Model upload fails? Check that your model format matches the serving container image's requirements.
- Deployment times out? Verify your region and machine type availability.
- Use `gcloud ai operations list` to monitor deployment status.
Key takeaways
- Use Google Cloud Storage to host your model artifacts before deployment.
- Deploy models on Vertex AI for managed serving with autoscaling and monitoring.
- Use the gcloud CLI or Vertex AI SDK for flexible deployment workflows.
- Ensure correct IAM permissions and model format to avoid deployment errors.