Hi everyone,
I am trying to deploy a custom YOLO model on Vertex AI using a custom container. I have successfully:
Built and tested the Docker image locally.
Verified that the API (FastAPI + YOLO) runs correctly in the container.
Successfully deployed and tested the same image on Cloud Run.
However, when deploying on Vertex AI as a custom container model, I am facing issues.
Setup Details:
- Base Image: pytorch/pytorch:2.2.0-cuda12.1-cudnn8-runtime (for GPU support)
- Serving Framework: FastAPI + Uvicorn + Gunicorn
- Hardware Target: n1-standard-4 with an NVIDIA Tesla P4 GPU
- Docker CMD: the container runs startup.sh, which loads the models and starts the FastAPI server.
- Inference Request Format: expects base64-encoded images in an application/json body.
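
For context, here is roughly how the request/response handling is shaped on the server side. Note that Vertex AI wraps every prediction request as `{"instances": [...]}` and expects `{"predictions": [...]}` back, and it injects the `AIP_HTTP_PORT`, `AIP_HEALTH_ROUTE`, and `AIP_PREDICT_ROUTE` environment variables into the container. This is a minimal sketch, not my exact code; the `b64` instance field name is illustrative, and the env-var fallbacks are only for local runs:

```python
import base64
import json
import os

# Vertex AI injects these into custom serving containers; the fallbacks
# here are only for running the container locally, not platform defaults.
AIP_HTTP_PORT = int(os.environ.get("AIP_HTTP_PORT", "8080"))
AIP_HEALTH_ROUTE = os.environ.get("AIP_HEALTH_ROUTE", "/health")
AIP_PREDICT_ROUTE = os.environ.get("AIP_PREDICT_ROUTE", "/predict")


def decode_instances(body: bytes) -> list:
    """Unwrap a Vertex AI predict body into raw image bytes.

    Vertex AI wraps every request as {"instances": [...]}; each instance
    here is assumed to be {"b64": "<base64 image>"} (field name is an
    assumption -- use whatever shape your client actually sends).
    """
    payload = json.loads(body)
    return [base64.b64decode(inst["b64"]) for inst in payload["instances"]]


def wrap_predictions(results: list) -> dict:
    """Vertex AI expects the response wrapped as {"predictions": [...]}."""
    return {"predictions": results}
```

The FastAPI app then registers its GET health handler at `AIP_HEALTH_ROUTE`, its POST predict handler at `AIP_PREDICT_ROUTE`, and binds Uvicorn/Gunicorn to `AIP_HTTP_PORT`.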
Deployment Steps:
- Built the Docker image locally and tested it with docker run.

- Pushed the image to Google Artifact Registry.

- Created a Vertex AI Model using:

  - name: Upload Mobile Model
    run: |
      EXISTING_MODEL=$(gcloud ai models list --region=$REGION --filter="displayName=freshpet-mobile" --format="value(name)" --limit=1)
      if [ -z "$EXISTING_MODEL" ]; then
        gcloud ai models upload --region=$REGION --display-name=freshpet-mobile --container-image-uri=$IMAGE_NAME_MOBILE
      else
        echo "Mobile model already exists, skipping upload."
      fi

  - name: Wait for Mobile Model to be Registered
    run: |
      timeout=$TIMEOUT_SECONDS
      start_time=$(date +%s)
      while true; do
        MODEL_MOBILE_NAME=$(gcloud ai models list --region=$REGION --filter="displayName=freshpet-mobile" --format="value(name)" --limit=1)
        if [ -n "$MODEL_MOBILE_NAME" ]; then
          echo "Model Registered: $MODEL_MOBILE_NAME"
          echo "MODEL_MOBILE_NAME=$MODEL_MOBILE_NAME" >> $GITHUB_ENV
          break
        fi
        if [ $(( $(date +%s) - start_time )) -gt $timeout ]; then
          echo "Timeout waiting for model registration." && exit 1
        fi
        sleep 10
      done
- Created an Endpoint and deployed the model.

  - name: Deploy Mobile Model to Endpoint
    run: |
      ENDPOINT_MOBILE_ID=$(gcloud ai endpoints list --region=$REGION --filter="displayName=freshpet-mobile-endpoint" --format="value(name)" --limit=1)
      if [ -z "$ENDPOINT_MOBILE_ID" ]; then
        ENDPOINT_MOBILE_ID=$(gcloud ai endpoints create --region=$REGION --display-name=freshpet-mobile-endpoint --format="value(name)")
      fi
      gcloud ai endpoints deploy-model $ENDPOINT_MOBILE_ID \
        --region=$REGION \
        --model=$MODEL_MOBILE_NAME \
        --display-name=mobile-container-deploy \
        --machine-type=$MACHINE_TYPE \
        --accelerator=count=1,type=$GPU_TYPE \
        --min-replica-count=$MIN_REPLICAS \
        --enable-access-logging \
        --autoscaling-metric-specs=$AUTOSCALING_METRIC
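
For reference, this is the request shape I am using to test the endpoint once deployed. The URL format is the standard Vertex AI online-prediction REST path; the `b64` field name matches the illustrative server-side shape above and is my own convention, not a platform requirement:

```python
import base64
import json


def build_predict_request(project: str, region: str, endpoint_id: str,
                          image_bytes: bytes):
    """Build the URL and JSON body for a Vertex AI online predict call.

    Vertex AI requires the body to be wrapped as {"instances": [...]};
    the {"b64": ...} instance shape is an assumption matching a
    base64-image API.
    """
    url = (f"https://{region}-aiplatform.googleapis.com/v1/"
           f"projects/{project}/locations/{region}/"
           f"endpoints/{endpoint_id}:predict")
    body = json.dumps({
        "instances": [{"b64": base64.b64encode(image_bytes).decode("ascii")}]
    }).encode("utf-8")
    return url, body
```

The request is then POSTed with a `Content-Type: application/json` header and an `Authorization: Bearer $(gcloud auth print-access-token)` token. One difference from Cloud Run: there the container received my bare JSON body directly, whereas Vertex AI requires the `instances` wrapper.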
Issues Faced:
- The model fails to load on Vertex AI, while the same image works perfectly on Cloud Run.
- No logs appear for the Vertex AI endpoint, making debugging difficult.
- When I send a request, I get 503 Service Unavailable or "Container Failed to Start" errors.
Questions:
- Does Vertex AI require additional configurations for FastAPI-based custom containers?
- How can I enable GPU support correctly inside the Vertex AI container?
- Is there a different logging mechanism I should use to debug why the container is failing?
- Are there specific health check requirements for Vertex AI containers?
Any help would be greatly appreciated! Thanks in advance.
