I’m trying to use Vertex AI to generate embeddings.
Making a request to POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/textembedding-gecko:predict works as per this documentation.
However, I can’t seem to retrieve embeddings from a model running in a region closer to Asia.
I have tried the following API call, POST https://asia-east1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/asia-east1/publishers/google/models/textembedding-gecko:predict, but I get a 404 error saying "Publisher Model not found".
Do I need to do something to enable Vertex AI API in other regions?
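For reference, here is a minimal Python sketch of how the two calls above differ. The region appears twice in the URL (host name and path), and nothing else changes. The helper names `predict_url` and `build_request` are made up for illustration, and the `{"instances": [{"content": ...}]}` body is the request shape as I understand it for textembedding-gecko, so treat it as an assumption. The request is only built here, not sent.

```python
import json
import urllib.request


def predict_url(project_id: str, location: str, model: str = "textembedding-gecko") -> str:
    """Build the regional :predict URL for a Google publisher model.

    The location is used both in the host name and in the resource path.
    """
    return (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/"
        f"publishers/google/models/{model}:predict"
    )


def build_request(project_id: str, location: str, text: str, access_token: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST request for one text embedding."""
    # Assumed request body: one instance with a "content" field.
    body = json.dumps({"instances": [{"content": text}]}).encode("utf-8")
    return urllib.request.Request(
        predict_url(project_id, location),
        data=body,
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # The working call and the failing call differ only in the region.
    print(predict_url("PROJECT_ID", "us-central1"))
    print(predict_url("PROJECT_ID", "asia-east1"))
```

The access token would typically come from something like `gcloud auth print-access-token`; sending the request is then a matter of passing it to `urllib.request.urlopen`.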
Good day @arshadali172,
Welcome to Google Cloud Community!
Based on the documentation, as of now you can only get text embeddings in the us-central1 region. Please note that some models or services are not available in every region, asia-east1 in this case, which is why you are getting a 404 error. For example, when you are tuning a language model, the only supported tuning locations are us-central1 and europe-west4: https://cloud.google.com/vertex-ai/docs/generative-ai/models/tune-models
Hope this helps!
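If you want your client to cope with this automatically, one option is to treat a 404 as "model not available in this region" and move on to the next region in a preference list. The sketch below is only an illustration of that idea: `first_available_region` and the `try_region` callback are hypothetical names, and the callback is assumed to raise `urllib.error.HTTPError` the way `urllib.request.urlopen` does on a failed call.

```python
import urllib.error
from typing import Callable, Optional, Sequence


def first_available_region(
    regions: Sequence[str],
    try_region: Callable[[str], object],
) -> Optional[str]:
    """Return the first region whose call does not fail with HTTP 404.

    `try_region` is a caller-supplied function that performs the request
    against one region. Any error other than 404 is re-raised, because it
    points at a different problem (auth, quota, bad request, ...).
    """
    for region in regions:
        try:
            try_region(region)
            return region
        except urllib.error.HTTPError as err:
            if err.code != 404:
                raise
    return None
```

With the situation described above, calling it with `["asia-east1", "us-central1"]` would skip asia-east1 after the 404 and settle on us-central1.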
I used embeddings with Vertex AI Matching Engine. The Matching Engine is deployed in a VPC in us-central1, so you may need to set up a peering connection.
There is a notebook called central-1.sdk_matching_engine_for_indexing.ipynb that suggests the following steps:
Create a VPC network
gcloud compute networks create {VPC_NETWORK} --bgp-routing-mode=regional --subnet-mode=auto --project={PROJECT_ID}
Add necessary firewall rules
gcloud compute firewall-rules create {VPC_NETWORK}-allow-icmp --network {VPC_NETWORK} --priority 65534 --project {PROJECT_ID} --allow icmp
gcloud compute firewall-rules create {VPC_NETWORK}-allow-internal --network {VPC_NETWORK} --priority 65534 --project {PROJECT_ID} --allow all --source-ranges 10.128.0.0/9
gcloud compute firewall-rules create {VPC_NETWORK}-allow-rdp --network {VPC_NETWORK} --priority 65534 --project {PROJECT_ID} --allow tcp:3389
gcloud compute firewall-rules create {VPC_NETWORK}-allow-ssh --network {VPC_NETWORK} --priority 65534 --project {PROJECT_ID} --allow tcp:22
Reserve IP range
gcloud compute addresses create {PEERING_RANGE_NAME} --global --prefix-length=16 --network={VPC_NETWORK} --purpose=VPC_PEERING --project={PROJECT_ID} --description="peering range"
Set up peering with service networking
Your account must have the “Compute Network Admin” role to run the following.
gcloud services vpc-peerings connect --service=servicenetworking.googleapis.com --network="ucaip-haystack-vpc-network" --ranges="ucaip-haystack-range" --project="amiable-venture-335811"
However, I don’t know whether the gRPC service will allow API calls from outside us-central1.
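In case it helps, the steps above can be sketched as a small Python helper that fills in the placeholders and returns the gcloud commands in order. This is just a convenience sketch, not what the notebook does verbatim: `peering_setup_commands` is a made-up name, and I am assuming the network and range names in the last command are meant to be the same `{VPC_NETWORK}` and `{PEERING_RANGE_NAME}` used in the earlier steps.

```python
import shlex
from typing import List


def peering_setup_commands(project_id: str, vpc_network: str, peering_range_name: str) -> List[List[str]]:
    """Return the gcloud commands from the steps above as argument lists."""
    commands = [
        # 1. Create a VPC network
        ["gcloud", "compute", "networks", "create", vpc_network,
         "--bgp-routing-mode=regional", "--subnet-mode=auto",
         f"--project={project_id}"],
    ]

    # 2. Add necessary firewall rules: (name suffix, allowed traffic, extra flags)
    firewall_rules = [
        ("allow-icmp", "icmp", []),
        ("allow-internal", "all", ["--source-ranges", "10.128.0.0/9"]),
        ("allow-rdp", "tcp:3389", []),
        ("allow-ssh", "tcp:22", []),
    ]
    for suffix, allow, extra in firewall_rules:
        commands.append(
            ["gcloud", "compute", "firewall-rules", "create", f"{vpc_network}-{suffix}",
             "--network", vpc_network, "--priority", "65534",
             "--project", project_id, "--allow", allow, *extra]
        )

    # 3. Reserve IP range
    commands.append(
        ["gcloud", "compute", "addresses", "create", peering_range_name,
         "--global", "--prefix-length=16", f"--network={vpc_network}",
         "--purpose=VPC_PEERING", f"--project={project_id}",
         "--description=peering range"]
    )

    # 4. Set up peering with service networking
    #    (assumption: same network and range names as in the steps above)
    commands.append(
        ["gcloud", "services", "vpc-peerings", "connect",
         "--service=servicenetworking.googleapis.com",
         f"--network={vpc_network}", f"--ranges={peering_range_name}",
         f"--project={project_id}"]
    )
    return commands


if __name__ == "__main__":
    # Print the commands for review instead of running them.
    for cmd in peering_setup_commands("PROJECT_ID", "VPC_NETWORK", "PEERING_RANGE_NAME"):
        print(shlex.join(cmd))
```

Printing first and running by hand (or via `subprocess.run(cmd, check=True)`) keeps it easy to review what will be created; the last command still needs the "Compute Network Admin" role mentioned above.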