Hi, when I run GCP Batch jobs with Cloud Logging, logs are only available once a task has stopped running (either failed or succeeded). How can I see the logs while a task is running? This would be very helpful since
1. it allows us to monitor tasks as they run
2. if a task fails with exit status 137 (a memory usage issue), all intermediate logging is lost. If we could see the intermediate logs while the task is running, this loss would be avoided.
To get logs while a task is running, you can change your runnable's code to print the messages you want to see in Cloud Logging to stdout/stderr. Batch forwards everything written to stdout/stderr to Cloud Logging immediately. For example, if you are running a shell script, echo "the log you want to print" will do the trick.
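The same trick works from a Python runnable. Below is a minimal sketch (the loop and messages are invented for illustration) of a task that emits progress to stdout and diagnostics to stderr. The flush=True is deliberate: as discussed further down this thread, Python buffers stdout by default, which would otherwise delay the output.

```python
import sys
import time

def run_task():
    # Hypothetical work loop: everything the script writes to stdout or
    # stderr while it runs is what Batch forwards to Cloud Logging.
    for step in range(3):
        print(f"step {step}: still running", flush=True)                  # stdout
        print(f"step {step}: diagnostics", file=sys.stderr, flush=True)   # stderr
        time.sleep(0.1)
    print("task finished", flush=True)

run_task()
```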
For the exit status 137 issue, printing more logs from your running code will preserve the intermediate output, but this does require changing your code.
For the CLOUD_LOGGING out-of-order issue: if you are running multiple tasks in parallel, log lines from different tasks can appear interleaved. If that's not the case, could you give me your jobUID and region? I can do some checking for you.
The task I am running is a Python executable, and in my Python script a statement such as print("hello world") still only appears after the task completes. Is this expected? I thought print("hello world") already prints to stdout by default.
2. I mean that, for the same task, the logs are out of order.
Okay. Could I know what type of runnable you are using, Container or Script? If you can share the jobUID and region, or even a fake job spec, that'll be really helpful for root-causing the issue.
Looks like this is Python's buffering mechanism. By default, Python buffers its stdout (block-buffered when stdout is not a terminal). You'll only see the output when the buffer fills up or when the program finishes and the buffer is flushed.
To work around this, I can think of the following options.
Option 1: add the environment variable PYTHONUNBUFFERED=1 to the Batch job spec (under the runnable's or taskSpec's environment variables), which disables stdout/stderr buffering for the Python interpreter.
Thanks a lot for the reply! I'm very glad the solution works, really appreciate it! Just to make sure I understand the cause: is this a GCP Batch specific issue? If I run the Python program locally, or run a Docker container locally, the prints appear immediately without setting any extra flag.
Sorry, I missed your question. When I use docker run docker_image, I can reproduce the buffered behavior (I'm running on Linux; a different kernel may behave differently). Did you run Docker on a different OS? Note that all VMs currently supported by Batch run Linux.
I also tried docker run -it docker_image, and with that I get the output in real time. The -it flags run the container in interactive mode with a TTY attached, and when stdout is a terminal Python switches to line buffering, so each newline flushes immediately.
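The TTY effect can also be shown locally (Linux/macOS only; this uses the stdlib pty module and the same invented one-line child as before). Running the child on a pseudo-terminal, which is roughly what docker run -it allocates, makes Python line-buffer stdout, so the line arrives as soon as it is printed rather than at exit.

```python
import os
import pty
import subprocess
import sys
import time

# Child: prints one line without flushing, then sleeps for a second.
code = "import time; print('hello'); time.sleep(1)"

# Attach the child's stdout to a pseudo-terminal. Since stdout now looks
# like a TTY, Python line-buffers it and the newline triggers a flush.
master, slave = pty.openpty()
start = time.time()
proc = subprocess.Popen([sys.executable, "-c", code], stdout=slave)
os.close(slave)                  # keep only the parent's end of the pty
data = os.read(master, 1024)     # returns as soon as the line is flushed
elapsed = time.time() - start
os.close(master)
proc.wait()
print(f"got {data!r} after {elapsed:.2f}s")
```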