1 request per pod

magenti · February 10, 2023, 3:14pm

I implemented a pipeline in GKE with several pods communicating with each other. One of these pods (call it “A”) can send multiple requests to another pod (“B”).

The service implemented in B is quite slow. This is why I would like B to scale horizontally so that each replica of B runs one and only one service instance, i.e. handles a single request at a time. At any moment, there should be as many active pods as active requests.

I tried to implement HPA logic based on memory or CPU. However, if the requests to B arrive too close together in time, they are directed to the same pod.
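
For illustration, here is a minimal Python sketch of the replica calculation given in the Kubernetes HPA documentation, desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). The function name and the numbers are made up for this example, and the real controller also applies a tolerance and stabilization behaviour that are left out here.

```python
import math


def hpa_desired_replicas(current_replicas: int,
                         current_metric: float,
                         target_metric: float) -> int:
    """Replica count from the documented HPA formula (tolerance ignored)."""
    return math.ceil(current_replicas * current_metric / target_metric)


# Hypothetical numbers: one replica of B, 50% target CPU utilization.
# Two requests arrive almost together and both land on the single pod,
# pushing its measured CPU to 90%.
replicas_after_sync = hpa_desired_replicas(1, 90.0, 50.0)
print(replicas_after_sync)
```

The calculation only reacts to a metric that has already been measured, so the extra replica appears after both requests are already being served by the same pod, which seems to match the behaviour described above.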

How do I make sure that every pod serves one and only one request?
