FastAPI + Docker Deployment Fails on Vertex AI with 404 Error (GET /v1/endpoints/.../deployedModels/

Hi, I’m trying to deploy a FastAPI project using a Dockerfile to Vertex AI via gcloud CLI (models upload, endpoints create, deploy-model). I’ve successfully uploaded the model and created the endpoint.

However, the deployment keeps failing even though the code runs correctly locally. Below are the code snippets:

Dockerfile:

FROM python:3.9-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

EXPOSE 8080

CMD ["uvicorn", "main:app", "--host=0.0.0.0", "--port=8080"]

main.py file:

import uvicorn
from typing import Optional, List
from fastapi import FastAPI, HTTPException, status
from fastapi.responses import JSONResponse
from pydantic import BaseModel, ValidationError

app = FastAPI()

# Request body: a list of numbers plus optional parameters
class PredictionRequest(BaseModel):
    instances: List[float]
    parameters: Optional[dict] = None

class PredictionResponse(BaseModel):
    predictions: List[float]

# Health check served at /health
@app.get("/health", status_code=status.HTTP_200_OK)
def health_check():
    return JSONResponse(content={"message": "healthy"})

# Prediction served at /predict
@app.post("/predict", response_model=PredictionResponse)
async def predict(request: PredictionRequest):
    try:
        instances = request.instances
        parameters = request.parameters
        predictions = predict_function(instances, parameters)
        return {"predictions": predictions}
    except ValidationError as e:
        raise HTTPException(status_code=status.HTTP_422_UNPROCESSABLE_ENTITY, detail=str(e))

def predict_function(instances: List[float], parameters: Optional[dict]) -> List[float]:
    # Placeholder model: square each input
    return [x ** 2 for x in instances]

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8080)

Issue:

Once deployed, Vertex AI keeps pinging this URL:

GET /v1/endpoints/1443848982781493248/deployedModels/6477727675065565184 HTTP/1.1 404 Not Found

This happens repeatedly every few seconds for about 45 minutes, and the deployment eventually fails.

Log examples:

INFO: 10.0.1.33:38712 - "GET /v1/endpoints/1443848982781493248/deployedModels/6477727675065565184 HTTP/1.1" 404 Not Found

INFO: Shutting down
INFO: Application shutdown complete.

I’m unsure why this basic FastAPI setup, which works fine locally, consistently fails during Vertex AI deployment. Any guidance or suggestions would be greatly appreciated!

Thanks in advance

Hi @suryaprakash_b,

Welcome to Google Cloud Community!

A 404 error usually means the requested URL does not exist on the server. Here, Vertex AI is requesting a path based on the deployment configuration, but your container has no route at that path.

Here are some suggestions you can try:

  • Carefully examine your code and check for typographical errors and inconsistencies. Some programming languages, like Python, are case sensitive.

  • Ensure your /health endpoint is working and accurately reflects the status of your application. Verify that all dependencies are available and troubleshoot your /health endpoint to check if the issue is related to the network configuration or the application itself.

  • Specify a health check route during deployment, as you have defined a /health endpoint. You can refer to this documentation as a reference for defining a health route.

  • Make sure your model deployment and endpoint are configured correctly, including the correct region and zone.
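
To make the health-route suggestion more concrete: the path in the failing log has the shape of the default health route that Vertex AI uses for a custom container when no health route is set on the model. As far as I know, the container also receives the configured values in the AIP_HEALTH_ROUTE, AIP_PREDICT_ROUTE and AIP_HTTP_PORT environment variables. Below is a minimal sketch that reads them with local fallbacks. The names ServingConfig and resolve_serving_config are hypothetical, and the fallback values are simply the ones from the question.

```python
import os
from typing import Mapping, NamedTuple


class ServingConfig(NamedTuple):
    """Routes and port the container should serve on."""
    health_route: str
    predict_route: str
    port: int


def resolve_serving_config(env: Mapping[str, str] = os.environ) -> ServingConfig:
    """Read serving settings from AIP_* variables.

    Assumption: Vertex AI sets these variables inside the container.
    The fallbacks (/health, /predict, 8080) are only for local runs.
    """
    return ServingConfig(
        health_route=env.get("AIP_HEALTH_ROUTE", "/health"),
        predict_route=env.get("AIP_PREDICT_ROUTE", "/predict"),
        port=int(env.get("AIP_HTTP_PORT", "8080")),
    )


if __name__ == "__main__":
    # Locally this prints the fallback values unless AIP_* is exported.
    print(resolve_serving_config())
```

In main.py, the routes could then be registered with app.add_api_route(config.health_route, health_check, methods=["GET"]) and app.add_api_route(config.predict_route, predict, methods=["POST"]) instead of hard-coded decorator paths, so the app answers on whatever path Vertex AI is configured to call.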

You might also find this documentation helpful for deploying a model using the gcloud CLI or Vertex AI API.
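
As a rough illustration of the gcloud route, the sketch below assembles a models upload command that sets the health route, predict route and port explicitly, so Vertex AI calls /health and /predict rather than its default paths. The helper name build_upload_command and all argument values (region, display name, image URI) are placeholders I made up, so please substitute your own and check the flags against the gcloud reference for your version.

```python
import shlex
from typing import List


def build_upload_command(
    region: str,
    display_name: str,
    image_uri: str,
    health_route: str = "/health",
    predict_route: str = "/predict",
    port: int = 8080,
) -> List[str]:
    """Build the argv for `gcloud ai models upload` with explicit routes."""
    return [
        "gcloud", "ai", "models", "upload",
        f"--region={region}",
        f"--display-name={display_name}",
        f"--container-image-uri={image_uri}",
        f"--container-health-route={health_route}",
        f"--container-predict-route={predict_route}",
        f"--container-ports={port}",
    ]


if __name__ == "__main__":
    # Placeholder values: replace with your own region, name and image.
    cmd = build_upload_command(
        region="us-central1",
        display_name="fastapi-model",
        image_uri="REGION-docker.pkg.dev/PROJECT/REPO/IMAGE:TAG",
    )
    # Print a copy-pasteable command; pass cmd to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Since the routes are stored on the model resource, the model would need to be uploaded again with these settings and then redeployed to the endpoint.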

Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.