gcloud batch jobs list --filter

Does anyone know how to list just failed jobs via gcloud batch jobs list? I’ve tried:

gcloud batch jobs list --location=us-west1 --sort-by="~createTime" --filter="status=FAILED" --limit 5

and:

gcloud batch jobs list --location=us-west1 --sort-by="~createTime" --filter="status.state=FAILED" --limit 5

…but neither works.


I gave up on trying to use the filter mechanisms of both gcloud and the Python library for anything complicated. Here’s how I do a basic search:

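import arrow
from google.api_core.retry import Retry
from google.cloud import batch_v1 as batch

# Placeholders: replace with your own project ID and region.
YOUR_GCP_PROJECT_ID = 'your-project-id'
YOUR_GCP_REGION = 'us-west1'

# Only act on jobs last updated more than 7 days ago.
older_than = arrow.utcnow().shift(days=-7)

client = batch.BatchServiceClient()
# The filter goes in the request object, not in list_jobs() itself.
request = batch.ListJobsRequest(
    parent=f'projects/{YOUR_GCP_PROJECT_ID}/locations/{YOUR_GCP_REGION}',
    filter='status.state="FAILED"',
    page_size=1000,
)
for job in client.list_jobs(request, retry=Retry()):
    # The time comparison is done client-side.
    if arrow.get(job.update_time.rfc3339()) < older_than:
        # do something
        pass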

Does this work?

gcloud batch jobs list --location=us-west1 --sort-by="~createTime" --filter='Status.State="FAILED"' --limit 5
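If you want to drive that same command from a script, here is a minimal sketch that wraps it with subprocess and parses the JSON output. The function names are made up for illustration, and it assumes gcloud is installed and authenticated and that the Status.State filter above behaves as suggested in this thread.

```python
import json
import subprocess


def build_list_failed_cmd(location, limit=5):
    """Build the gcloud argument list for listing failed Batch jobs.

    Hypothetical helper: it mirrors the command from this thread and adds
    --format=json so the output can be parsed.
    """
    return [
        "gcloud", "batch", "jobs", "list",
        f"--location={location}",
        "--sort-by=~createTime",
        '--filter=Status.State="FAILED"',
        f"--limit={limit}",
        "--format=json",
    ]


def list_failed_jobs(location, limit=5):
    """Run gcloud and return the parsed list of job dicts.

    Raises subprocess.CalledProcessError if gcloud exits with an error.
    """
    result = subprocess.run(
        build_list_failed_cmd(location, limit),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)
```

Calling list_failed_jobs("us-west1") should return a list of dicts, one per job. Passing the arguments as a list avoids shell quoting problems around the nested double quotes in the filter.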

@aegolden how is a user supposed to know that --filter='Status.State="FAILED"' is the solution? The cli docs state:

--filter=EXPRESSION
Apply a Boolean filter EXPRESSION to each resource item to be listed.
If the expression evaluates True, then that item is listed. For more
details and examples of filter expressions, run $ gcloud topic filters.
This flag interacts with other flags that are applied in this order:
--flatten, --sort-by, --filter, --limit.

…and gcloud topic filters doesn’t seem to provide much help on what fields can actually be used, and how to format those fields (e.g., the capital “S” requirement in “Status.State”).

Even GPT4 doesn’t get it right: gcloud batch jobs list --location=[LOCATION] --filter='status:FAILED'


Also, how does one get the log for any of the jobs returned by gcloud batch jobs list? The following does not return anything:

gcloud logging read "resource.type=batch_job AND resource.labels.job_id='YOUR_JOB_ID'" --limit 50 --format "table(timestamp, textPayload)"

The closest I’ve gotten to getting the logs/state/config of a job with gcloud is:

gcloud batch jobs describe BATCH_JOB_NAME --location=GCP_LOCATION --project GCP_PROJECT_ID

For anything more than that, I use the google-cloud-logging Python package.
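For reference, a rough sketch of what the google-cloud-logging route can look like. It rests on assumptions that are not confirmed in this thread: that Batch task output lands in a log named batch_task_logs and that entries carry a labels.job_uid label matching the uid field shown by gcloud batch jobs describe. The function names are hypothetical.

```python
def batch_log_filter(project_id, job_uid, log_id="batch_task_logs"):
    """Build a Cloud Logging filter string for one Batch job.

    Assumption: task logs are written to the `batch_task_logs` log and are
    labelled with `job_uid` (the job's uid, not its name).
    """
    return (
        f'logName="projects/{project_id}/logs/{log_id}" '
        f'AND labels.job_uid="{job_uid}"'
    )


def read_batch_logs(project_id, job_uid, max_results=50):
    """Fetch up to max_results (timestamp, payload) pairs for one job."""
    # Imported lazily so the filter helper works without the package installed.
    from google.cloud import logging

    client = logging.Client(project=project_id)
    entries = client.list_entries(
        filter_=batch_log_filter(project_id, job_uid),
        max_results=max_results,
    )
    return [(entry.timestamp, entry.payload) for entry in entries]
```

The string returned by batch_log_filter can also be passed as the query to gcloud logging read, which is one way to check whether the assumed log name and label match what your project actually records.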


I wanted to see if Google’s Gemini could help with such questions, but even though it is a Google product, it seems to do quite poorly at answering even simple questions about the GCP API.

Q: Here’s how you can use the GCP API to list failed Batch jobs:

A:


import datetime
from google.cloud import batch_v1

# Set your project ID and region
project_id = 'your-project-id'
region = 'your-region'

# Create Batch client
batch_client = batch_v1.BatchClient(client_options={'api_endpoint': f'{region}-batch.googleapis.com:443'})

# List failed jobs
parent = f"projects/{project_id}/locations/{region}"
filter = "status = \"FAILED\""
jobs = batch_client.list_jobs(parent=parent, filter=filter).jobs

# Print the names of failed jobs
print("Failed jobs:")
for job in jobs:
    print(job.name)

Error:

Traceback (most recent call last):
  File "/workspaces/gcp_llm/bin/gcp_get_batch_logs.py", line 101, in <module>
    main(args)
  File "/workspaces/gcp_llm/bin/gcp_get_batch_logs.py", line 85, in main
    gemini_code(); exit()
  File "/workspaces/gcp_llm/bin/gcp_get_batch_logs.py", line 70, in gemini_code
    batch_client = batch_v1.BatchClient(client_options={'api_endpoint': f'{region}-batch.googleapis.com:443'})
AttributeError: module 'google.cloud.batch_v1' has no attribute 'BatchClient'

Even if I edit the code to use batch_v1.BatchServiceClient, the code is still incorrect:

Traceback (most recent call last):
  File "/workspaces/gcp_llm/bin/gcp_get_batch_logs.py", line 101, in <module>
    main(args)
  File "/workspaces/gcp_llm/bin/gcp_get_batch_logs.py", line 85, in main
    gemini_code(); exit()
  File "/workspaces/gcp_llm/bin/gcp_get_batch_logs.py", line 75, in gemini_code
    jobs = batch_client.list_jobs(parent=parent, filter=filter).jobs
TypeError: list_jobs() got an unexpected keyword argument 'filter'
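The TypeError is consistent with how the earlier working snippet in this thread is written: list_jobs() takes a request object, and the filter is a field on ListJobsRequest rather than a keyword argument. A minimal corrected sketch along those lines follows. The function names are hypothetical, and the filter string is the one used earlier in the thread, not something I have verified against the API docs.

```python
def jobs_parent(project_id, region):
    """Return the parent resource path expected by the Batch list call."""
    return f"projects/{project_id}/locations/{region}"


def list_failed_job_names(project_id, region):
    """Return the names of failed Batch jobs in one project and region."""
    # Imported lazily so jobs_parent() can be used without the package installed.
    from google.cloud import batch_v1

    # The client class is BatchServiceClient, not BatchClient.
    client = batch_v1.BatchServiceClient()

    # The filter belongs on the request message, not on list_jobs() itself.
    request = batch_v1.ListJobsRequest(
        parent=jobs_parent(project_id, region),
        filter='status.state="FAILED"',  # taken from the earlier post
    )

    # list_jobs() returns a pager; iterating it yields Job messages
    # and fetches further pages as needed.
    return [job.name for job in client.list_jobs(request=request)]
```

Iterating the pager directly, instead of reading .jobs from the result, means later pages are picked up as well.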