Triton on Vertex AI does not support multiple models?

I'm currently trying to deploy a Triton server to a Vertex AI endpoint, but I received this error message:

“failed to start Vertex AI service: Invalid argument - Expect the model repository contains only a single model if default model is not specified”

Does this mean that the Triton server deployment only supports one model? That differs from what I have read in this document about concurrent model execution:

https://cloud.google.com/vertex-ai/docs/predictions/using-nvidia-triton
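
For reference, a Triton model repository with more than one model typically looks roughly like this (the model names here are just placeholders); the error above appears when such a repository is used without a default model:

model_repository/
├── model_a/
│   ├── config.pbtxt
│   └── 1/
│       └── model.onnx
└── model_b/
    ├── config.pbtxt
    └── 1/
        └── model.savedmodel/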


The error message suggests that you haven't selected a default model.

Hi, I have the same issue, and I couldn't find out how to set a default model. Could you please link a guide or explain how to do that? Thanks!


As specified in the documentation, ensure that you provide the flag

--container-args='--strict-model-config=false'

while importing the model into the Model Registry, as follows:

gcloud ai models upload \
  --region=LOCATION \
  --display-name=DEPLOYED_MODEL_NAME \
  --container-image-uri=LOCATION-docker.pkg.dev/PROJECT_ID/getting-started-nvidia-triton/vertex-triton-inference \
  --artifact-uri=MODEL_ARTIFACTS_REPOSITORY \
  --container-args='--strict-model-config=false'

Hi @Eduardo_Ortiz,

Can you provide documentation on how we can set the default model for a Triton ensemble?

I did not see any references to this in these Vertex AI docs, and "default model" doesn't seem to be an NVIDIA Triton concept?


Looks like we can set the default model for Vertex AI via the --vertex-ai-default-model flag (source code).

I.e.,

tritonserver --model-repository $MODEL_REPO --vertex-ai-default-model={DEFAULT_MODEL}
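
Note that when deploying through Vertex AI you don't invoke tritonserver directly, so the flag presumably has to go through --container-args at upload time. A sketch, assuming it can be passed the same way as --strict-model-config above (model_a is a placeholder name):

gcloud ai models upload \
  --region=LOCATION \
  --display-name=DEPLOYED_MODEL_NAME \
  --container-image-uri=LOCATION-docker.pkg.dev/PROJECT_ID/getting-started-nvidia-triton/vertex-triton-inference \
  --artifact-uri=MODEL_ARTIFACTS_REPOSITORY \
  --container-args='--vertex-ai-default-model=model_a'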


Cool! So I understand that you can only use one model at a time.

For your information, I was able to run one model, but the way we query the Vertex AI endpoint doesn't allow us to choose a specific model. So I guess using Triton with multiple models is not supported for now?
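
For context, the guide linked above queries the endpoint through rawPredict, and the request path carries no model name, which is presumably why you can't pick a model per request; the server routes everything to its default model. Roughly:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/endpoints/ENDPOINT_ID:rawPredict" \
  -d @request.json   # request.json holds a KServe v2 inference request body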

Is it still not possible to use an ensemble model? It doesn't work for now.

I was able to set up an ensemble model.

See my comment here: https://www.googlecloudcommunity.com/gc/AI-ML/Triton-on-Vertex-AI-does-not-support-multiple-models/m-p/614554/highlight/true#M2424
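
In case it helps others: the ensemble is itself just another model directory in the repository, with a config.pbtxt that uses platform: "ensemble" to chain the other models together. A minimal sketch (all model and tensor names here are hypothetical):

name: "my_ensemble"
platform: "ensemble"
max_batch_size: 8
input [
  {
    name: "ENSEMBLE_INPUT"
    data_type: TYPE_FP32
    dims: [ 3 ]
  }
]
output [
  {
    name: "ENSEMBLE_OUTPUT"
    data_type: TYPE_FP32
    dims: [ 1 ]
  }
]
ensemble_scheduling {
  step [
    {
      model_name: "model_a"
      model_version: -1
      input_map {
        key: "INPUT"            # model_a's input tensor
        value: "ENSEMBLE_INPUT"
      }
      output_map {
        key: "OUTPUT"           # model_a's output tensor
        value: "intermediate"
      }
    },
    {
      model_name: "model_b"
      model_version: -1
      input_map {
        key: "INPUT"
        value: "intermediate"
      }
      output_map {
        key: "OUTPUT"
        value: "ENSEMBLE_OUTPUT"
      }
    }
  ]
}

You would then point --vertex-ai-default-model at the ensemble so every request runs the whole pipeline.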


It's mainly an issue of the shared memory size not being customizable when running Vertex AI online predictions. Have you been able to customize the "shm-size" parameter?

There is an open ticket on the public Google Issue Tracker: "VertexAI does not allocate enough shared memory to run Triton containers" [278045294].

No, I have not. Not ideal at all.

To work around it, I shrank shared memory usage via the --backend-config flag, i.e.:

--backend-config=python,shm-default-byte-size=15728640

Again, not ideal, especially given that the default shm-size is quite small.
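
For completeness, when deploying through Vertex AI that flag would also have to go through --container-args. One wrinkle I'd expect (untested on my side): gcloud splits list flags on commas, so the commas inside --backend-config would need gcloud's alternate-delimiter syntax to survive, e.g.:

--container-args='^;^--backend-config=python,shm-default-byte-size=15728640'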
