gemini-1.5-flash 32k context: UK-only processing

Hi,

As listed on the ML processing page for Gemini models (https://cloud.google.com/vertex-ai/generative-ai/docs/learn/data-residency), gemini-1.5-flash (32k) is available specifically for UK processing.

Is it 100% guaranteed that a request is processed in the UK if europe-west2 is specified as the Vertex AI region and countTokens reports fewer than 32k tokens? I would have expected a separate model name such as “gemini-1.5-flash-32k” rather than the same model name as the EU multi-region one.

In my testing, requests with more than 32k tokens still run (presumably in the multi-region). I would prefer to be able to select the 32k model only and have any request exceeding 32k tokens fail.
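
As a stopgap, the behaviour I am after could be approximated with a client-side check. This is only a minimal sketch: it assumes "32k" means 32,000 tokens (the docs do not say whether it is 32,000 or 32,768), the names guarded_generate, ContextTooLargeError and make_vertex_callables are my own, and the project ID passed in is a placeholder. It does not guarantee where processing happens; it only rejects oversized prompts before they are sent.

```python
from typing import Callable, Tuple

# Assumption: "32k" is taken as 32,000 tokens; adjust if the real limit differs.
MAX_TOKENS_UK = 32_000


class ContextTooLargeError(ValueError):
    """Raised when a prompt exceeds the assumed UK-only context limit."""


def guarded_generate(
    prompt: str,
    count_tokens: Callable[[str], int],
    generate: Callable[[str], str],
    limit: int = MAX_TOKENS_UK,
) -> str:
    """Count tokens first and refuse to send anything above the limit."""
    total = count_tokens(prompt)
    if total > limit:
        raise ContextTooLargeError(
            f"Prompt has {total} tokens, which exceeds the limit of {limit}."
        )
    return generate(prompt)


def make_vertex_callables(
    project_id: str,
    location: str = "europe-west2",
    model_name: str = "gemini-1.5-flash",
) -> Tuple[Callable[[str], int], Callable[[str], str]]:
    """Build the two callables from the Vertex AI Python SDK.

    Assumes google-cloud-aiplatform is installed and credentials are set up.
    Imports are local so the guard above can be used without the SDK.
    """
    import vertexai
    from vertexai.generative_models import GenerativeModel

    vertexai.init(project=project_id, location=location)
    model = GenerativeModel(model_name)

    def count(prompt: str) -> int:
        return model.count_tokens(prompt).total_tokens

    def generate(prompt: str) -> str:
        return model.generate_content(prompt).text

    return count, generate
```

Usage would look like `count, generate = make_vertex_callables("my-project")` followed by `guarded_generate(prompt, count, generate)`. This costs an extra countTokens call per request, which is why I would still prefer a server-side option.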

Can you provide any information on how UK-only processing is enforced for this model?