Hi, I’m trying to deploy a FastAPI project packaged with a Dockerfile to Vertex AI via the gcloud CLI (models upload, endpoints create, deploy-model). The model uploads and the endpoint is created successfully.
However, the deployment itself keeps failing, even though the code runs correctly locally. Here are the relevant files:
Dockerfile:

FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8080
CMD ["uvicorn", "main:app", "--host=0.0.0.0", "--port=8080"]
main.py:

import uvicorn
from typing import Optional, List
from fastapi import FastAPI, HTTPException, status
from fastapi.responses import JSONResponse
from pydantic import BaseModel, ValidationError

app = FastAPI()

class PredictionRequest(BaseModel):
    instances: List[float]
    parameters: Optional[dict] = None

class PredictionResponse(BaseModel):
    predictions: List[float]

@app.get("/health", status_code=status.HTTP_200_OK)
def health_check():
    return JSONResponse(content={"message": "healthy"})

@app.post("/predict", response_model=PredictionResponse)
async def predict(request: PredictionRequest):
    try:
        instances = request.instances
        parameters = request.parameters
        predictions = predict_function(instances, parameters)
        return {"predictions": predictions}
    except ValidationError as e:
        raise HTTPException(status_code=status.HTTP_422_UNPROCESSABLE_ENTITY, detail=str(e))

def predict_function(instances: List[float], parameters: Optional[dict]) -> List[float]:
    return [x ** 2 for x in instances]

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8080)
Issue:
Once deployed, Vertex AI repeatedly sends health-check requests to this URL, which my app never registers:
GET /v1/endpoints/1443848982781493248/deployedModels/6477727675065565184 HTTP/1.1 404 Not Found
This happens repeatedly every few seconds for about 45 minutes, and the deployment eventually fails.
Log examples:
INFO: 10.0.1.33:38712 - "GET /v1/endpoints/1443848982781493248/deployedModels/6477727675065565184 HTTP/1.1" 404 Not Found
…
INFO: Shutting down
INFO: Application shutdown complete.
I’m unsure why this basic FastAPI setup, which works fine locally, consistently fails during Vertex AI deployment. Any guidance or suggestions would be greatly appreciated!
Thanks in advance