I’ve been working on calculating the costs associated with running a vector search system, and I’d love to get your thoughts and suggestions on more cost-effective alternatives.
Reference: https://cloud.google.com/vertex-ai/pricing#vectorsearch
My Current Setup:
Number of Records: 10,000 - json records with description key
Embedding Dimensions: 768
Machine Type: e2-standard-16 (is e2-standard-2 sufficient for my current setup?)
Cost per Node Hour: $0.75
Cost Breakdown:
Data Size Calculation:
Data Size = 10,000 records × 768 dimensions × 4 bytes = 30,720,000 bytes ≈ 0.0286 GiB
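As a sanity check, the arithmetic above can be reproduced in a few lines of Python (a pure back-of-the-envelope calculation, no GCP APIs involved):

```python
# Back-of-the-envelope index size for a vector search index.
# Each embedding dimension is stored as a 4-byte float32.
NUM_RECORDS = 10_000
DIMENSIONS = 768
BYTES_PER_FLOAT32 = 4

size_bytes = NUM_RECORDS * DIMENSIONS * BYTES_PER_FLOAT32
size_gib = size_bytes / 2**30  # 1 GiB = 1024^3 bytes

print(f"{size_bytes:,} bytes = {size_gib:.4f} GiB")  # 30,720,000 bytes = 0.0286 GiB
```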
Given that this cost seems quite high, especially for smaller-scale projects, I’m looking for recommendations on how to reduce these expenses. Are there any alternative setups, different machine types, or other optimizations that could help bring down the overall cost?
Any suggestions or experiences you can share would be greatly appreciated!
Let’s explore some strategies to optimize costs for Google Cloud’s vector search solutions.
Data Size Calculation - As you mention, your data size is ~0.0286 GiB, which would be categorized as SHARD_SIZE_SMALL. The machine types you can use to deploy your index (using public endpoints or VPC endpoints) depend on the shard size of the index. Based on the machine type support table in the documentation, you may use e2-standard-2.
@McMaco, thank you for the info above. The insight regarding machine selection based on shard size is really helpful, but my remaining concern is the serving cost.
I checked the pricing calculator, but there is no information on serving costs. The documentation describes serving as billed 24/7 (roughly 730 node hours per month), which seems quite expensive given that there is no pay-as-you-go model. Given this, the costs appear high, especially for small-scale projects.
1) Does the Vertex AI Matching Engine service offer any pay-as-you-go serving cost options?
2) For the above setup, what would be the total cost including serving?
3) Are there any alternative vector DB setups within GCP that could help reduce the overall cost?
1. The estimated monthly serving cost shown in the console is directly related to the number of nodes used. To learn more about configuration parameters that affect cost, see Configuration parameters which affect recall and latency.
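To make the relationship concrete, here is a minimal sketch of the monthly estimate, assuming the $0.75 node-hour price from the original post and ~730 hours in an average month (these inputs are illustrative; check the pricing page for your region and machine type):

```python
# Estimated monthly serving cost: deployed index nodes run 24/7,
# i.e. roughly 730 node hours per month (24 * 365 / 12).
COST_PER_NODE_HOUR = 0.75   # from the original post (e2-standard-16)
HOURS_PER_MONTH = 730
num_nodes = 1               # minimum deployment; scales with traffic

monthly_cost = COST_PER_NODE_HOUR * HOURS_PER_MONTH * num_nodes
print(f"${monthly_cost:,.2f} / month")  # $547.50 / month for one node
```

A cheaper machine type (e.g. e2-standard-2) has a lower node-hour price, which is why the shard-size/machine-type match matters so much for small datasets.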
2. I used a pricing calculator given the data you’d mention, you can double check it for reference.
3. Reducing the overall cost by using alternative GCP products is a great idea, but keep in mind that every product has its own trade-offs for a given use case. AlloyDB and BigQuery both offer vector search implementations.
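For example, AlloyDB supports the pgvector extension, so a similarity query there looks like standard Postgres. The sketch below just builds such a query; the table and column names (`items`, `embedding`) are hypothetical, and actually running it requires your own AlloyDB instance and credentials:

```python
# Sketch of a pgvector similarity query as you might run it on AlloyDB.
# Table/column names are hypothetical. "<->" is pgvector's L2-distance
# operator (use "<=>" for cosine distance instead).
def build_knn_query(table: str, column: str, top_k: int) -> str:
    return (
        f"SELECT id, description, {column} <-> %s::vector AS distance "
        f"FROM {table} ORDER BY distance LIMIT {top_k}"
    )

query = build_knn_query("items", "embedding", top_k=5)
print(query)
# Execute with a Postgres driver (e.g. psycopg2) against AlloyDB,
# passing the query embedding (a 768-float list) as the parameter.
```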
If you’d like to talk to a support representative about a Cloud Billing question or a billing-related issue, visit our Cloud Billing Support page for contact options.