Hi,
I have successfully deployed a custom-trained model on Vertex AI. The serving container exposes a /health endpoint for health checks. However, loading the model files at startup takes about 2-3 minutes, and during that window the endpoint returns the JSON error below, even though the container logs show the /health route answering with a 200. The backend keeps running and streaming logs until the loading process completes.
import os
from flask import Flask, jsonify

app = Flask(__name__)

AIP_HEALTH_ROUTE = os.environ.get('AIP_HEALTH_ROUTE', '/health')

@app.route(AIP_HEALTH_ROUTE, methods=["GET"])
def health():
    return jsonify({'health': 'ok'})
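To illustrate the timing problem, here is a minimal sketch (Flask-free, names are hypothetical) of the pattern I understand is expected: the model load runs in a background thread so the health handler can respond immediately and report readiness separately, instead of the whole worker blocking for the 2-3 minutes the model files take to load.

```python
import threading
import time

model = None
model_ready = threading.Event()

def load_model():
    """Stand-in for the real 2-3 minute model-file load."""
    global model
    time.sleep(0.1)  # placeholder for the slow load
    model = object()  # placeholder for the loaded model
    model_ready.set()

# Start loading in the background so the HTTP worker is never blocked.
threading.Thread(target=load_model, daemon=True).start()

def health():
    # Responds instantly; readiness is reported as a field rather than
    # by blocking the health probe.
    return {"health": "ok", "model_loaded": model_ready.is_set()}
```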
{
  "error": {
    "code": 503,
    "message": "Took too long to respond when processing endpoint_id: 175231367141916672, deployed_model_id: 546162609888428032",
    "status": "UNAVAILABLE"
  }
}
The message below is from the logs:
{
  "insertId": "jzha6yg14hht6n",
  "jsonPayload": {
    "message": "10.0.1.65 - - [28/Jan/2025 01:13:24] \"GET /health HTTP/1.1\" 200 -"
  },
  "resource": {
    "type": "aiplatform.googleapis.com/Endpoint",
    "labels": {
      "location": "us-west1",
      "endpoint_id": "175231367141916672",
      "resource_container": "projects/786116219701"
    }
  },
  "timestamp": "2025-01-28T01:13:24.744819164Z",
  "severity": "ERROR",
  "labels": {
    "replica_id": "predictor-resource-pool-5120759902087675904-75f9dbbb7-c5p9l",
    "deployed_model_id": "<546162609888428032423>"
  },
  "logName": "projects/<project_id>/logs/aiplatform.googleapis.com%2Fprediction_container",
  "receiveTimestamp": "2025-01-28T01:13:25.208217740Z"
}
Thanks in advance