Vertex AI Generative AI: Is inference guaranteed to stay within the selected region with Pay-as-you-go?

I’m evaluating Google Cloud Vertex AI (Generative AI) for a product
that requires strict data residency in Japan.

I understand from official documentation that:

  • Data at rest remains in the selected region
  • Regional endpoints exist (e.g. asia-northeast1)
  • Global endpoints do not guarantee processing location
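
To make the regional vs. global distinction concrete, here is a minimal sketch of how I understand the two REST endpoints differ. The project ID and model ID are placeholders, and the helper name `vertex_endpoint` is my own, not part of any SDK. It only shows which host a request is sent to; it says nothing about where inference actually runs, which is exactly what I am asking about.

```python
def vertex_endpoint(project: str, location: str, model: str) -> str:
    """Build the generateContent REST URL for a Vertex AI publisher model.

    Hypothetical helper for illustration only.
    location="global" targets the global host; any other value targets
    the regional host "<location>-aiplatform.googleapis.com".
    """
    host = (
        "aiplatform.googleapis.com"
        if location == "global"
        else f"{location}-aiplatform.googleapis.com"
    )
    return (
        f"https://{host}/v1/projects/{project}/locations/{location}"
        f"/publishers/google/models/{model}:generateContent"
    )


# Placeholder project and model IDs (assumptions, not real resources).
regional_url = vertex_endpoint("my-project", "asia-northeast1", "my-model-id")
global_url = vertex_endpoint("my-project", "global", "my-model-id")

print(regional_url)
print(global_url)
```

My product would only ever call the regional form (Tokyo, asia-northeast1).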

However, I cannot find any official documentation that clearly states
whether inference processing is guaranteed to stay within the region
when using Pay-as-you-go pricing.

Some community articles claim that:

  • Pay-as-you-go does NOT guarantee regional processing
  • Provisioned Throughput (GSU) is required to guarantee it

But I cannot find an official Google document that explicitly says this.

Questions:

  1. Is inference processing guaranteed to stay within the selected region
    when using Pay-as-you-go with a Regional Endpoint?
  2. Is there any official documentation that states Pay-as-you-go does NOT
    guarantee regional processing?
  3. Does Provisioned Throughput change processing location guarantees,
    or only capacity/latency characteristics?

Any official clarification or links to authoritative documentation
would be greatly appreciated.
