I have a mongocdc template running, but I am not seeing real-time data in my BigQuery table. Although the data is written to the BigQuery table, it lags far behind the actual time (sometimes by weeks). How do I resolve this issue with Pub/Sub? I have tried to improve publisher throughput through parallelization, but I can't see any improvement so far (perhaps my computation is wrong).
Finally, could anyone interpret the Pub/Sub monitoring graph below for me?
Note: From the source I receive 5-second-interval data from at least 84,000 devices.
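For a sense of scale, the figures in the note can be turned into a required message rate. The sketch below assumes each device sends one message every 5 seconds. The message size and the pipeline drain rate are hypothetical values chosen only for illustration, not measurements from the question. It also shows why a pipeline that drains even slightly slower than the ingest rate falls further behind every day.

```python
# Figures from the question.
DEVICES = 84_000
INTERVAL_S = 5

# Hypothetical values, not from the question.
ASSUMED_MESSAGE_BYTES = 1_000      # assumed average payload size
ASSUMED_DRAIN_PER_S = 15_000       # assumed rate at which the pipeline writes to BigQuery

SECONDS_PER_DAY = 86_400


def required_rate(devices: int, interval_s: float) -> float:
    """Messages per second if every device sends one message per interval."""
    return devices / interval_s


def backlog_after(ingest_per_s: float, drain_per_s: float, seconds: float) -> float:
    """Unprocessed messages accumulated after `seconds`; never negative."""
    return max(0.0, (ingest_per_s - drain_per_s) * seconds)


rate = required_rate(DEVICES, INTERVAL_S)
print(f"required rate: {rate:,.0f} messages/s")
print(f"assumed volume: {rate * ASSUMED_MESSAGE_BYTES / 1e6:.1f} MB/s")
print(f"messages per day: {rate * SECONDS_PER_DAY:,.0f}")
print(
    "backlog after one day at the assumed drain rate: "
    f"{backlog_after(rate, ASSUMED_DRAIN_PER_S, SECONDS_PER_DAY):,.0f} messages"
)
```

If the measured publish rate or the rate of rows arriving in BigQuery is below the required rate, the backlog grows without bound, which would match a delay that reaches weeks.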
There are several things you can do to optimize Pub/Sub publisher throughput and reduce latencies:
Batching: Pub/Sub client libraries batch multiple messages into a single request to reduce the number of network calls. Make sure batching is enabled and that you are not forcing one request per message, for example by waiting on each publish result before sending the next message. The batch thresholds (message count, total bytes, and maximum delay) are configurable.
Enable Flow Control: The publisher client libraries provide flow control settings that limit the number of outstanding messages (messages that have been sent but not yet acknowledged) and the maximum outstanding bytes. Adjusting these settings bounds the memory usage of your publisher application and keeps it from queuing more than it can send, which may lead to steadier throughput.
Tune Machine Types: If you are running the publisher on Google Cloud, you may want to experiment with different machine types to find the one that best suits your workload. Some machine types may have more CPU, memory, or network capacity and can handle more throughput.
Optimize Network Settings: Network settings like TCP parameters and send/receive buffers can also affect the performance of the publisher. You might want to look into tuning these parameters for optimal performance.
Upgrade Client Libraries: Ensure you are using the latest version of the Google Cloud Pub/Sub client libraries. Newer versions often include performance improvements and bug fixes.
Asynchronous Publishing: If you’re not already doing so, consider using asynchronous publishing. This allows the publisher client to continue sending messages as it waits for acknowledgements for earlier ones, potentially improving throughput.
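As a rough illustration of the batching, flow control, and asynchronous publishing points above, here is a minimal sketch using the Python client (the google-cloud-pubsub package, which must be installed separately). The project and topic names, the threshold values, and the helper functions encode, make_publisher, and publish_all are hypothetical and chosen only for the example. Treat the numbers as starting points to tune, not as recommendations.

```python
import json
from concurrent import futures

# Hypothetical names, not from the original post.
PROJECT_ID = "my-project"
TOPIC_ID = "device-readings"

# Illustrative thresholds: a batch is sent when any one of them is reached.
BATCH = {"max_messages": 500, "max_bytes": 1_000_000, "max_latency": 0.05}
# Illustrative limits on messages/bytes sent but not yet acknowledged.
FLOW = {"message_limit": 10_000, "byte_limit": 50_000_000}


def encode(reading: dict) -> bytes:
    """Pub/Sub payloads are bytes; serialize each reading as compact JSON."""
    return json.dumps(reading, separators=(",", ":")).encode("utf-8")


def make_publisher():
    """Build a publisher client with batching and flow control configured."""
    # Imported here so the helpers in this file load without the package.
    from google.cloud import pubsub_v1

    flow_control = pubsub_v1.types.PublishFlowControl(
        message_limit=FLOW["message_limit"],
        byte_limit=FLOW["byte_limit"],
        # Block the caller instead of raising when a limit is exceeded.
        limit_exceeded_behavior=pubsub_v1.types.LimitExceededBehavior.BLOCK,
    )
    return pubsub_v1.PublisherClient(
        batch_settings=pubsub_v1.types.BatchSettings(**BATCH),
        publisher_options=pubsub_v1.types.PublisherOptions(flow_control=flow_control),
    )


def publish_all(publisher, topic_path: str, readings) -> tuple:
    """Publish without waiting per message; return (succeeded, failed) counts."""
    # publish() returns a future immediately, so the client can fill batches.
    pending = [publisher.publish(topic_path, encode(r)) for r in readings]
    # Wait once, at the end, for every publish to complete.
    futures.wait(pending)
    failed = sum(1 for f in pending if f.exception() is not None)
    return len(pending) - failed, failed
```

To use it, create the client with make_publisher(), build the topic path with publisher.topic_path(PROJECT_ID, TOPIC_ID), and pass an iterable of readings to publish_all. For a continuous stream, calling publish_all on bounded chunks keeps the list of pending futures from growing without limit.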