Hey guys, I have a problem similar to the one described here. From time to time our consumer receives a message that is 60 seconds old.
How we found the issue: we covered the whole path of a message, from Pub/Sub to publishing downstream, with App Metrics and Grafana dashboards.
We are using the Google.Cloud.PubSub.V1 NuGet package, and our app is written in .NET.
What I have checked so far:
- Consumer app back pressure: no. At the time of every occurrence the consumption rate is relatively low, sometimes as low as 5 to 10 messages per minute.
- Application locking issue that might delay processing: this does not appear to be the case. We measure processing delay, and p99 is within milliseconds.
- Message ordering is disabled on the subscription.
- Exactly-once delivery is disabled as well.
- I have access to Pub/Sub metrics, and I managed to find a correlation between the 60-second delays in our consumption lag and the “expired_ack_deadlines_count” metric of Google Pub/Sub.
- I decreased “AckDeadline” in SubscriberClient.Settings from 60 to 10 seconds and did not see any spikes bigger than 10 seconds. That is not ideal for us, but it does seem to have affected those delays.
All of this makes me believe that a message gets stuck in the subscriber client and is never processed until its ack deadline expires and it is redelivered.
Any advice here? What else could I check that might give us a clue?
Hi @PetarVG ,
Welcome to Google Cloud Community!
Based on what you’ve found, the correlation between expired_ack_deadlines_count and the 60-second delays suggests that messages might be getting stuck in the Subscription Client until they expire and are redelivered.
Here are a few additional things you might want to check:
- Parallel Pull Count: If your client is using multiple streams, messages might be held in the client library until some outstanding messages are acknowledged. Try setting Parallel Pull Count to 1 and see if that changes the behavior.
- Max Ack Extension: If the max ack extension is set too low, messages might expire before they can be processed. You could check if adjusting this setting helps.
- Transient Server-Side or Network Issues: Sometimes, messages might be sent from the server but fail to be delivered in time due to network issues, causing the ack deadline to expire.
- ModifyAckDeadlineRequest: You might want to explore whether modifying the ack deadline dynamically based on message processing time could help.
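To make the first two checks concrete, here is a minimal sketch of how they might look in the .NET client. It assumes a 3.x release of Google.Cloud.PubSub.V1, where the closest equivalent of "Parallel Pull Count" is, as far as I know, the ClientCount property on SubscriberClientBuilder, and the max ack extension is MaxTotalAckExtension on SubscriberClient.Settings. The project and subscription IDs, the flow control limits, the 10-minute extension, the 30-second logging threshold, and the one-minute run time are all placeholders, not recommendations. Please verify the property names against the version you have installed.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Google.Api.Gax;
using Google.Cloud.PubSub.V1;

public static class SubscriberSketch
{
    public static async Task RunAsync()
    {
        // Placeholder IDs: replace with your own project and subscription.
        var subscriptionName =
            SubscriptionName.FromProjectSubscription("my-project", "my-subscription");

        SubscriberClient subscriber = await new SubscriberClientBuilder
        {
            SubscriptionName = subscriptionName,
            // One underlying streaming-pull client, to rule out messages
            // being held on a second stream.
            ClientCount = 1,
            Settings = new SubscriberClient.Settings
            {
                // The deadline the thread starter lowered from 60 to 10 seconds.
                AckDeadline = TimeSpan.FromSeconds(60),
                // Upper bound on how long the library keeps extending leases
                // (illustrative value).
                MaxTotalAckExtension = TimeSpan.FromMinutes(10),
                // Limit how many messages the client holds at once
                // (illustrative values: 100 messages, 10 MiB).
                FlowControlSettings = new FlowControlSettings(100, 10 * 1024 * 1024)
            }
        }.BuildAsync();

        Task run = subscriber.StartAsync((message, cancellationToken) =>
        {
            // Log messages that arrive long after they were published,
            // so late deliveries can be matched against expired_ack_deadlines_count.
            TimeSpan lag = DateTime.UtcNow - message.PublishTime.ToDateTime();
            if (lag > TimeSpan.FromSeconds(30))
            {
                Console.WriteLine($"Late message {message.MessageId}: lag {lag.TotalSeconds:F1}s");
            }
            return Task.FromResult(SubscriberClient.Reply.Ack);
        });

        // Run for a minute in this sketch, then shut down cleanly.
        await Task.Delay(TimeSpan.FromMinutes(1));
        await subscriber.StopAsync(CancellationToken.None);
        await run;
    }
}
```

Logging the message ID together with the publish-time lag should let you tell whether the same message was delivered twice (once unprocessed, once after expiry) or only reached your handler once, which helps narrow down where it was held.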
Reducing the AckDeadline to 10 seconds noticeably decreased the delay spikes, indicating that the problem is likely connected to how messages are retained before processing. You might find this case helpful, as other users have experienced similar issues and shared valuable insights.
Additionally, there is an existing bug that can cause message processing delays by leaving messages unacknowledged until the ack deadline expires, leading to delays of approximately 60 seconds. You can also check the release notes for the latest updates or new features related to Pub/Sub.
Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.