Vertex AI error codes reference

Quick answer

Common Vertex AI error codes include 400 Bad Request for invalid inputs, 401 Unauthorized for authentication failures, 403 Forbidden for permission issues, 429 Too Many Requests for rate limiting, and 500 Internal Server Error for server-side problems. Handling these requires proper request validation, authentication setup, and retry logic.

ERROR TYPE api_error

⚡ QUICK FIX

Add exponential backoff retry logic around your API call to handle 429 Too Many Requests (RESOURCE_EXHAUSTED) errors automatically.

Why this happens

Vertex AI API errors occur due to invalid requests, authentication failures, permission restrictions, rate limits, or internal server issues. For example, sending malformed JSON or missing required parameters triggers 400 Bad Request. Using expired or missing credentials causes 401 Unauthorized. Calling APIs without proper IAM roles results in 403 Forbidden. Exceeding quota limits leads to 429 Too Many Requests. Server errors return 500 Internal Server Error.

Typical error output from the API looks like:

{
  "error": {
    "code": 429,
    "message": "Quota exceeded",
    "status": "RESOURCE_EXHAUSTED"
  }
}
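When calling the REST endpoint directly, a payload like this can be inspected in code. The sketch below assumes the error body has exactly the shape shown above and decides whether a retry is worthwhile. `ApiError`, `parse_api_error`, and `RETRYABLE_CODES` are hypothetical names used for illustration, not part of any Google library, and the retryable set covers only the two codes this article suggests retrying.

```python
import json
from dataclasses import dataclass

# Assumption for illustration: only the codes this article suggests retrying.
RETRYABLE_CODES = {429, 500}


@dataclass
class ApiError:
    code: int
    status: str
    message: str

    @property
    def retryable(self) -> bool:
        # True when a later attempt might succeed without changing the request.
        return self.code in RETRYABLE_CODES


def parse_api_error(body: str) -> ApiError:
    """Parse a JSON error body of the form {"error": {...}}."""
    error = json.loads(body)["error"]
    return ApiError(
        code=int(error["code"]),
        status=error.get("status", ""),
        message=error.get("message", ""),
    )


sample = '{"error": {"code": 429, "message": "Quota exceeded", "status": "RESOURCE_EXHAUSTED"}}'
err = parse_api_error(sample)
print(err.code, err.status, err.retryable)  # prints: 429 RESOURCE_EXHAUSTED True
```

Keeping the retry decision next to the parsed error means callers do not have to repeat status-code checks at every call site.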

python

import os

import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project=os.environ["GOOGLE_CLOUD_PROJECT"], location="us-central1")
model = GenerativeModel("gemini-2.0-flash")

# Example of a malformed request: an empty prompt is an invalid argument.
# Depending on the SDK version, the client library or the API rejects it.
response = model.generate_content("")

output

google.api_core.exceptions.BadRequest: 400 Bad Request: Request contains an invalid argument.

The fix

Validate inputs before sending requests, ensure authentication is correctly configured with Application Default Credentials or service account keys, and verify IAM permissions. Implement exponential backoff retry logic for 429 Too Many Requests errors to handle rate limits gracefully.
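As a minimal sketch of the input-validation step, the function below rejects obviously bad prompts before any request is sent. `validate_prompt` and `MAX_PROMPT_CHARS` are hypothetical names, and the character limit is an arbitrary placeholder, since real limits depend on the model and are measured in tokens rather than characters.

```python
# Hypothetical limit for illustration; real limits depend on the model.
MAX_PROMPT_CHARS = 30_000


def validate_prompt(prompt: object) -> str:
    """Return a cleaned prompt, or raise ValueError before any API call is made."""
    if not isinstance(prompt, str):
        raise ValueError(f"prompt must be a string, got {type(prompt).__name__}")
    cleaned = prompt.strip()
    if not cleaned:
        raise ValueError("prompt must not be empty")
    if len(cleaned) > MAX_PROMPT_CHARS:
        raise ValueError(f"prompt exceeds {MAX_PROMPT_CHARS} characters")
    return cleaned


print(validate_prompt("  Explain quantum computing  "))  # prints: Explain quantum computing
```

Failing fast on the client side keeps invalid requests from counting against quota and produces a clearer error message than a generic 400.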

Example code with retry and error handling:

python

import os
import time

import vertexai
from google.api_core.exceptions import BadRequest, Forbidden, TooManyRequests, Unauthorized
from vertexai.generative_models import GenerativeModel

vertexai.init(project=os.environ["GOOGLE_CLOUD_PROJECT"], location="us-central1")
model = GenerativeModel("gemini-2.0-flash")

prompt = "Explain quantum computing"

max_retries = 5
for attempt in range(max_retries):
    try:
        response = model.generate_content(prompt)
        print(response.text)
        break
    except TooManyRequests:
        # 429: back off exponentially (1, 2, 4, ... seconds), but do not
        # sleep after the final attempt.
        if attempt == max_retries - 1:
            print("Rate limit still exceeded after all retries; giving up.")
            break
        wait_time = 2 ** attempt
        print(f"Rate limit hit, retrying in {wait_time} seconds...")
        time.sleep(wait_time)
    except (Unauthorized, Forbidden) as auth_err:
        # 401/403: retrying will not help until credentials or IAM roles change.
        print(f"Authentication or permission error: {auth_err}")
        break
    except BadRequest as bad_req:
        # 400: the request itself must be fixed before trying again.
        print(f"Invalid request: {bad_req}")
        break
    except Exception as e:
        print(f"Unexpected error: {e}")
        break

output

Explain quantum computing in simple terms...
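The retry loop above can be factored into a reusable helper. The sketch below is one possible shape, not an official Vertex AI utility: `backoff_delay` and `call_with_retries` are hypothetical names, and the 32-second cap and "full jitter" randomization are assumptions added here to avoid many clients retrying in lockstep.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 32.0) -> float:
    """Exponential delay for a zero-based attempt number, capped at `cap` seconds."""
    return min(cap, base * (2 ** attempt))


def call_with_retries(
    func: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    max_retries: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on the given exception types with jittered backoff."""
    for attempt in range(max_retries):
        try:
            return func()
        except retry_on:
            if attempt == max_retries - 1:
                raise  # Out of attempts: surface the last error to the caller.
            # Full jitter: wait a random time between 0 and the capped delay.
            sleep(random.uniform(0, backoff_delay(attempt)))
    raise RuntimeError("max_retries must be at least 1")
```

With the client from the example above, this might be invoked as `call_with_retries(lambda: model.generate_content(prompt), retry_on=(TooManyRequests,))`. The `sleep` parameter is injectable so the helper can be exercised in tests without real waiting.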

Preventing it in production

Use robust input validation to avoid 400 Bad Request.
Configure authentication with service accounts or Application Default Credentials correctly to prevent 401 Unauthorized.
Assign proper IAM roles to avoid 403 Forbidden.
Implement exponential backoff retries for 429 Too Many Requests to handle quota limits.
Monitor API usage and set alerts for quota exhaustion.
Use fallback models or cached responses to maintain availability during 500 Internal Server Error incidents.
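The cached-response idea in the last item could be sketched roughly as follows. `CachedFallback` is a hypothetical wrapper, and the in-memory dictionary and catch-all `except` are simplifying assumptions; a production system would more likely use a shared cache and catch only specific server-error exception types.

```python
from typing import Callable, Dict, Optional


class CachedFallback:
    """Serve the last successful response for a prompt when the primary call fails."""

    def __init__(self, primary: Callable[[str], str]) -> None:
        self._primary = primary
        self._cache: Dict[str, str] = {}

    def generate(self, prompt: str) -> Optional[str]:
        try:
            result = self._primary(prompt)
        except Exception:
            # Assumption: any failure is treated like a server-side outage here.
            # Returns None when no earlier response exists for this prompt.
            return self._cache.get(prompt)
        self._cache[prompt] = result
        return result


# Stand-in for a real model call, used only to demonstrate the wrapper.
outage = {"active": False}


def fake_model(prompt: str) -> str:
    if outage["active"]:
        raise RuntimeError("500 Internal Server Error")
    return f"answer to: {prompt}"


service = CachedFallback(fake_model)
print(service.generate("What is Vertex AI?"))  # fresh response
outage["active"] = True
print(service.generate("What is Vertex AI?"))  # same text, served from cache
```

Serving a slightly stale answer is often preferable to surfacing an error, but only for prompts whose responses are safe to reuse.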

Related errors

Error	Cause	Quick fix
400 Bad Request	Malformed request or invalid parameters	Validate inputs before sending requests
401 Unauthorized	Missing, expired, or invalid credentials	Set up authentication with a valid service account or ADC
403 Forbidden	Insufficient IAM permissions	Assign correct IAM roles to the service account
429 Too Many Requests	Quota exceeded or rate limit hit	Add exponential backoff retry logic
500 Internal Server Error	Server-side issue	Retry after delay or use fallback mechanisms

Key Takeaways

Validate all request inputs to prevent 400 errors in Vertex AI API calls.
Configure authentication and IAM permissions correctly to avoid 401 and 403 errors.
Implement exponential backoff retries to handle 429 rate limit errors gracefully.

Verified 2026-04 · gemini-2.0-flash
