I am trying to deploy a Triton server to a Vertex AI endpoint, but I received this error message:
“failed to start Vertex AI service: Invalid argument - Expect the model repository contains only a single model if default model is not specified”
Does this mean that deploying Triton on Vertex AI only supports one model? That differs from what I have read in this document about concurrent model execution.
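Re-reading the error, it sounds like Triton will accept a multi-model repository as long as a default model is named. For reference, here is a minimal sketch of how I understand the upload would look with Triton's `--vertex-ai-default-model` flag and the `google-cloud-aiplatform` Python SDK; every name and path below is a placeholder, and I have not verified this end to end:

```python
# Minimal sketch, not a verified deployment: registering a Triton container
# with an explicit default model. Project, image, bucket and model names are
# placeholders; the --vertex-ai-default-model flag is from the Triton docs.
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

model = aiplatform.Model.upload(
    display_name="triton-multi-model",
    serving_container_image_uri="us-docker.pkg.dev/my-project/triton/server:latest",
    # GCS path to a repository that holds more than one model.
    artifact_uri="gs://my-bucket/triton-model-repository",
    serving_container_command=["tritonserver"],
    serving_container_args=[
        # Vertex AI expands $(AIP_STORAGE_URI) to the local copy of artifact_uri.
        "--model-repository=$(AIP_STORAGE_URI)",
        # Name a default model so a multi-model repository no longer trips the
        # "Expect the model repository contains only a single model" check.
        "--vertex-ai-default-model=model_a",
    ],
)

endpoint = model.deploy(machine_type="n1-standard-8")
```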
Cool! So I understand that you can only use one model at a time.
For information, I was able to run one model, but the way we query the Vertex AI endpoint doesn't allow us to choose a specific model. So I guess that using Triton with multiple models is not supported for now?
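In case it helps, here is roughly how I query the endpoint today with a KServe v2 payload, plus the redirect mechanism I found in the Triton Vertex AI docs for targeting a model other than the default one; the `X-Vertex-Ai-Triton-Redirect` header and the model name `model_b` are my reading/assumptions, not something I have confirmed works:

```python
# Sketch of querying the endpoint with a KServe v2 payload. raw_predict is
# from recent google-cloud-aiplatform releases; the X-Vertex-Ai-Triton-Redirect
# header is my reading of the Triton Vertex AI docs, and the model name
# "model_b" is hypothetical.
import json

from google.cloud import aiplatform

endpoint = aiplatform.Endpoint(
    "projects/my-project/locations/us-central1/endpoints/1234567890"
)

body = json.dumps({
    "inputs": [{
        "name": "INPUT0",
        "shape": [1, 4],
        "datatype": "FP32",
        "data": [0.1, 0.2, 0.3, 0.4],
    }]
}).encode("utf-8")

# Without extra headers, the request lands on the default model.
response = endpoint.raw_predict(
    body=body, headers={"Content-Type": "application/json"}
)

# Supposedly the Vertex AI frontend in Triton can forward a request to a
# specific model's infer route via this header.
response_b = endpoint.raw_predict(
    body=body,
    headers={
        "Content-Type": "application/json",
        "X-Vertex-Ai-Triton-Redirect": "/v2/models/model_b/infer",
    },
)
print(response.json(), response_b.json())
```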
It's mainly an issue of the shared memory size not being customizable when running Vertex AI online predictions. Have you been able to customize the "shm-size" parameter?
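The closest thing I have found so far, and this is an unverified assumption on my part: the Vertex AI `ModelContainerSpec` appears to expose a `sharedMemorySizeMb` field, surfaced in newer `google-cloud-aiplatform` releases as `serving_container_shared_memory_size_mb`. A sketch:

```python
# Unverified assumption on my side: ModelContainerSpec seems to have gained a
# sharedMemorySizeMb field, exposed in newer google-cloud-aiplatform releases
# as serving_container_shared_memory_size_mb. Rough equivalent of
# docker run --shm-size=1g (the value is in MB); placeholders as before.
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

model = aiplatform.Model.upload(
    display_name="triton-with-shm",
    serving_container_image_uri="us-docker.pkg.dev/my-project/triton/server:latest",
    artifact_uri="gs://my-bucket/triton-model-repository",
    serving_container_command=["tritonserver"],
    serving_container_args=["--model-repository=$(AIP_STORAGE_URI)"],
    serving_container_shared_memory_size_mb=1024,  # assumed parameter name
)
```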