I’m evaluating Google Cloud Vertex AI (Generative AI) for a product
that requires strict data residency in Japan.
I understand from official documentation that:
- Data at rest remains in the selected region
- Regional endpoints exist (e.g. asia-northeast1)
- Global endpoints do not guarantee processing location
However, I cannot find any official documentation that clearly states
whether inference processing is guaranteed to stay within the region
when using Pay-as-you-go pricing.
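
For context, here is a minimal sketch of how I understand the regional and global endpoints to differ at the request level. The helper name `vertex_endpoint`, the project ID, and the model name are placeholders of mine, and the URL patterns reflect my reading of the REST layout rather than an authoritative statement:

```python
def vertex_endpoint(project: str, model: str, location: str = "asia-northeast1") -> str:
    """Build a generateContent URL for a regional or the global endpoint.

    Assumption: regional hosts follow "<region>-aiplatform.googleapis.com",
    while the global endpoint uses the bare host with "locations/global".
    """
    if location == "global":
        host = "aiplatform.googleapis.com"
    else:
        host = f"{location}-aiplatform.googleapis.com"
    return (
        f"https://{host}/v1/projects/{project}/locations/{location}"
        f"/publishers/google/models/{model}:generateContent"
    )


# Placeholder project and model names, for illustration only.
regional_url = vertex_endpoint("my-project", "example-model")
global_url = vertex_endpoint("my-project", "example-model", location="global")
```

My question is about what happens behind the regional URL: the request is addressed to asia-northeast1, but that alone does not tell me where inference actually runs.
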
Some community articles claim that:
- Pay-as-you-go does NOT guarantee regional processing
- Provisioned Throughput (purchased in GSUs, generative AI scale units) is required to guarantee it
But I cannot find an official Google document that explicitly says this.
Questions:
- Is inference processing guaranteed to stay within the selected region
when using Pay-as-you-go with a Regional Endpoint?
- Is there any official documentation that states Pay-as-you-go does NOT
guarantee regional processing?
- Does Provisioned Throughput change processing location guarantees,
or only capacity/latency characteristics?
Any official clarification or links to authoritative documentation
would be greatly appreciated.