Intermediate · 4 min read

How to deploy ML model to Google Cloud

Quick answer
To deploy an ML model to Google Cloud, package your trained model and upload it to a Google Cloud Storage bucket, then register the model on Vertex AI and deploy it to an endpoint using the gcloud CLI or the Google Cloud console. Vertex AI gives you managed serving with autoscaling and monitoring.

Prerequisites

  • Python 3.8+
  • Google Cloud account with billing enabled
  • Google Cloud SDK installed and configured
  • gcloud CLI authenticated (run `gcloud auth login`)
  • Model saved in a supported format (e.g., TensorFlow SavedModel, PyTorch TorchScript)
  • Google Cloud Storage bucket created
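Before uploading, it can help to confirm the artifact directory matches the layout a TensorFlow SavedModel export produces (a saved_model.pb file plus a variables/ subdirectory), since Vertex AI's prebuilt TensorFlow containers load that layout. A minimal sanity check; the directory path is a placeholder:

```python
from pathlib import Path

def looks_like_saved_model(model_dir: str) -> bool:
    """Check for the files a TensorFlow SavedModel export contains."""
    root = Path(model_dir)
    return (root / "saved_model.pb").is_file() and (root / "variables").is_dir()

# Example: verify locally before running gsutil cp
# if not looks_like_saved_model("MODEL_DIR"):
#     raise SystemExit("MODEL_DIR does not look like a SavedModel export")
```

This only checks the directory shape, not that the model actually loads; other formats (e.g. TorchScript) have different layouts and need a different check.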

Set up the Google Cloud environment

Install and configure the Google Cloud SDK, authenticate your account, and create a Cloud Storage bucket to hold your model files.

bash
# Install Google Cloud SDK (if not installed)
curl https://sdk.cloud.google.com | bash
exec -l $SHELL

# Authenticate your account
gcloud auth login

# Set your project ID
gcloud config set project YOUR_PROJECT_ID

# Create a Cloud Storage bucket (replace BUCKET_NAME)
gsutil mb gs://BUCKET_NAME
output
Creating gs://BUCKET_NAME/...

Step-by-step deployment

Save your trained model locally, upload it to Cloud Storage, then deploy it on Vertex AI using the gcloud CLI.

bash
# Upload your model directory to Cloud Storage
# Replace MODEL_DIR and BUCKET_NAME
gsutil cp -r MODEL_DIR gs://BUCKET_NAME/model/

# Deploy model to Vertex AI
# Replace MODEL_NAME and REGION

gcloud ai models upload \
  --region=us-central1 \
  --display-name=MODEL_NAME \
  --artifact-uri=gs://BUCKET_NAME/model/ \
  --container-image-uri=us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-8:latest

# Create an endpoint
ENDPOINT_ID=$(gcloud ai endpoints create --region=us-central1 --display-name=MODEL_NAME-endpoint --format='value(name)')

# Deploy model to endpoint
gcloud ai endpoints deploy-model $ENDPOINT_ID \
  --region=us-central1 \
  --model=$(gcloud ai models list --region=us-central1 --filter="displayName=MODEL_NAME" --format='value(name)') \
  --display-name=MODEL_NAME-deployment \
  --machine-type=n1-standard-4
output
Uploading model...
Model uploaded successfully.
Endpoint created: projects/PROJECT_ID/locations/us-central1/endpoints/ENDPOINT_ID
Model deployed to endpoint.

Common variations

  • Use different container images for PyTorch or custom models.
  • Deploy using the Vertex AI Python SDK for programmatic control.
  • Enable autoscaling and traffic splitting on endpoints.
  • Use asynchronous batch prediction for large datasets.
python
from google.cloud import aiplatform

# Initialize Vertex AI client
aiplatform.init(project='YOUR_PROJECT_ID', location='us-central1')

# Upload model
model = aiplatform.Model.upload(
    display_name='MODEL_NAME',
    artifact_uri='gs://BUCKET_NAME/model/',
    serving_container_image_uri='us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-8:latest'
)

# Deploy model to endpoint
endpoint = model.deploy(
    machine_type='n1-standard-4',
    min_replica_count=1,
    max_replica_count=3
)

print(f'Deployed model to endpoint: {endpoint.resource_name}')
output
Deployed model to endpoint: projects/PROJECT_ID/locations/us-central1/endpoints/ENDPOINT_ID
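Once deployed, the endpoint can be called from the same SDK via endpoint.predict. A hedged sketch: the feature values and their shape are placeholders, and what your model accepts depends on its serving signature. The live call is commented out because it needs a deployed endpoint and credentials; the payload helper is plain Python:

```python
from typing import List, Sequence

def to_instances(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Convert feature rows into the JSON-serializable nested lists
    that Vertex AI online prediction expects as `instances`."""
    return [[float(v) for v in row] for row in rows]

instances = to_instances([[5.1, 3.5, 1.4, 0.2]])  # placeholder feature row

# Live call (requires a deployed endpoint and application credentials):
# from google.cloud import aiplatform
# endpoint = aiplatform.Endpoint(
#     'projects/PROJECT_ID/locations/us-central1/endpoints/ENDPOINT_ID')
# prediction = endpoint.predict(instances=instances)
# print(prediction.predictions)
```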

Troubleshooting common issues

  • If you get permission errors, ensure your IAM roles include Vertex AI Admin and Storage Object Admin.
  • If model upload fails, check that your model format matches what the serving container image expects.
  • If deployment times out, verify that your chosen machine type is available in your region.
  • Use gcloud ai operations list to monitor deployment status.
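Deploy operations can take several minutes, so when scripting around them a polling loop with exponential backoff is often useful. A generic sketch in pure Python, not tied to any Vertex AI API; check stands in for whatever status probe you use, such as shelling out to a gcloud command and parsing its output:

```python
import time

def wait_until(check, timeout_s=600.0, initial_delay_s=1.0, max_delay_s=30.0):
    """Poll `check()` with exponential backoff until it returns True,
    or give up and return False once `timeout_s` has elapsed."""
    deadline = time.monotonic() + timeout_s
    delay = initial_delay_s
    while time.monotonic() < deadline:
        if check():
            return True
        time.sleep(delay)
        delay = min(delay * 2, max_delay_s)  # back off, but cap the delay
    return False
```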

Key Takeaways

  • Use Google Cloud Storage to host your model artifacts before deployment.
  • Deploy models on Vertex AI for managed serving with autoscaling and monitoring.
  • Use the gcloud CLI or Vertex AI SDK for flexible deployment workflows.
  • Ensure correct IAM permissions and model format to avoid deployment errors.
Verified 2026-04 · Vertex AI, gcloud ai models upload, Google Cloud Storage