I was able to reproduce the issue previously reported with Google Cloud Batch, even without using Cromwell, by running the following command:
gcloud beta batch jobs submit job-mgqwh0ci1 --location us-central1 --config - <<EOD
{"name":"projects/project_id/locations/us-central1/jobs/job-mgqwh0ci1","taskGroups":[{"taskCount":"1","parallelism":"1","taskSpec":{"computeResource":{"cpuMilli":"1000","memoryMib":"512"},"runnables":[{"container":{"imageUri":"quay.io/aptible/hello-world:latest","entrypoint":"","volumes":[]}}, {"container":{"imageUri":"public.ecr.aws/amazonlinux/amazonlinux:latest","entrypoint":"","volumes":[]}}, {"container":{"imageUri":"gcr.io/google-containers/busybox","entrypoint":"","volumes":[]}}, {"container":{"imageUri":"us-docker.pkg.dev/cloudrun/container/hello-job","entrypoint":"","volumes":[]}}],"volumes":[]}}],"allocationPolicy":{"instances":[{"policy":{"provisioningModel":"STANDARD","machineType":"e2-medium"}}]},"logsPolicy":{"destination":"CLOUD_LOGGING"}}
EOD
The setup script that Batch generates internally fails to configure Google Artifact Registry correctly because it separates the entries of the Docker registry list with a newline instead of a comma. This causes a syntax error at line 37 of the generated script.
Excerpt from the faulty script:
REGISTRIES=gcr.io
us-docker.pkg.dev
It should instead be:
REGISTRIES=gcr.io,us-docker.pkg.dev
This syntax error prevents the script from correctly registering the Artifact Registry (us-docker.pkg.dev) as a valid Docker credential helper target. As a result, pulling images from Artifact Registry fails with an authentication or configuration error.
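The effect of the line break can be illustrated in isolation. The following is a minimal sketch and is not taken from the Batch agent: eval_registries is a hypothetical helper that runs a snippet in a child shell and prints the value REGISTRIES ends up with. It assumes a POSIX sh and that no executable named us-docker.pkg.dev exists on the PATH.

```shell
#!/bin/sh
# eval_registries is a hypothetical helper for this illustration only.
# It feeds the snippet, followed by an echo of REGISTRIES, to a child shell.
# stderr is discarded so the "not found" message for the stray line is hidden.
eval_registries() {
    printf '%s\necho "$REGISTRIES"\n' "$1" | sh 2>/dev/null
}

# Newline-separated form, as it appears in the generated script.
broken=$(eval_registries 'REGISTRIES=gcr.io
us-docker.pkg.dev')

# Comma-separated form, as expected.
fixed=$(eval_registries 'REGISTRIES=gcr.io,us-docker.pkg.dev')

echo "newline form: $broken"
echo "comma form:   $fixed"
```

In this sketch the second line is parsed as a separate command, so the newline form leaves REGISTRIES holding only gcr.io. If the generated script is interpreted the same way, us-docker.pkg.dev never reaches docker-credential-gcr configure-docker.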
Reproduction steps:
- Run the provided gcloud beta batch jobs submit command.
- Observe that the Batch job fails during container initialization.
- Inspect the job logs and note the incorrect line break in the Docker credential helper configuration.
Expected behavior:
The REGISTRIES variable should include all registries separated by commas, e.g.:
REGISTRIES=gcr.io,us-docker.pkg.dev
Impact:
This bug prevents Google Cloud Batch from pulling images hosted in Google Artifact Registry when more than one registry is configured.
Environment:
- Service: Google Cloud Batch (Beta)
- Region: us-central1
- Example job ID: job-mgqwh0ci1
- Reproduced without Cromwell
Suggested fix:
Ensure that the script generating the REGISTRIES variable joins multiple registries using commas rather than newlines.
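As an illustration of the suggested fix, the sketch below joins a list of registries with commas. The generator that produces the script is not visible here, so this shows only the intended result: join_registries is a hypothetical helper, not part of the Batch agent, and it assumes the registries are available as separate arguments.

```shell
#!/bin/sh
# join_registries is a hypothetical helper, not part of the Batch agent.
# "$*" joins the arguments with the first character of IFS; the function
# body runs in a subshell so the IFS change does not leak to the caller.
join_registries() (
    IFS=,
    printf '%s\n' "$*"
)

REGISTRIES=$(join_registries gcr.io us-docker.pkg.dev)
echo "REGISTRIES=$REGISTRIES"
```

The resulting single-line value can be passed unchanged to --registries=$REGISTRIES, as the script already does.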
The whole script:
# This script helps to install docker credential helper for jobs using GCR or
# GAR images.
# It has been tested successfully with the following operating systems:
# - Debian GNU/Linux 10 (buster, amd64 built on 20220317, supports Shielded VM features)
# - Debian GNU/Linux 11 (bullseye, amd64 built on 20220822, supports Shielded VM features)
# - Debian GNU/Linux 11 (bullseye, arm64 built on 20230912, supports Shielded VM features)
# - CentOS 7 (x86_64 built on 20220406, supports Shielded VM features)
# - Rocky 8 (x86_64 built on 20240111, supports Shielded VM features)
# It may or may not work correctly with other operating systems and versions.
docker_credential_gcr_installed() {
  echo "[BATCH Docker Credential GCR Installation]: Checking for existing docker-credential-gcr installation."
  docker_credential_gcr_version=$(docker-credential-gcr version 2> /dev/null || echo "")
  if [ ! -z "$docker_credential_gcr_version" ]; then
    echo "[BATCH Docker Credential GCR Installation]: Found docker-credential-gcr with version $docker_credential_gcr_version."
    return 0
  fi
  echo "[BATCH Docker Credential GCR Installation]: No docker-credential-gcr found, will install."
  return 1
}
checkDockerImagePullError() {
  result=$?
  if [ $result != 0 ]; then
    echo "[Batch Action] Docker image pull error."
    return $result;
  fi
}
configure_docker_credential_helper() {
  OSID="$(. /etc/os-release && echo "$ID")"
  OSVERSION="$(. /etc/os-release && echo "$VERSION_ID")"
  if [ ! "$OSID" = "debian" ] && [ ! -f /etc/centos-release ] && [ ! -f /etc/rocky-release ] && ! grep -qi cos /etc/os-release; then
    echo '[Batch Docker Credential Helper Installation] Warning: this is not a tested operating system, the installation may fail.'
  fi
  REGISTRIES=gcr.io
us-docker.pkg.dev
  if [ -f /etc/centos-release ]; then
    if [ ! -x /usr/bin/python3 ]; then
      # CentOS 7 does not have pre-installed python3.
      yum install -y python3
    fi
  fi
  # docker-credential-gcr is pre-installed in Batch custom images.
  if ! grep -qi cos /etc/os-release; then
    if ! docker_credential_gcr_installed; then
      MACHINE="$(uname -m)"
      if [ "$OSID" = "rocky" ] && [ "${OSVERSION%%.*}" = "8" ]; then
        # Use python3.9 as Rocky Linux 8 pre-installed python3.6 which does not work with CLOUD SDK.
        echo '[Batch Docker Credential Helper Installation]: Install python3.9 for Rocky Linux 8.'
        dnf install -y python39
        CLOUDSDK_PYTHON=/usr/bin/python3.9 gsutil cp gs://batch-agent-prod-us/docker-credential-gcr-tool/docker-credential-gcr-$MACHINE.tar.gz docker-credential-gcr.tar.gz
      else
        CLOUDSDK_PYTHON=/usr/bin/python3 gsutil cp gs://batch-agent-prod-us/docker-credential-gcr-tool/docker-credential-gcr-$MACHINE.tar.gz docker-credential-gcr.tar.gz
      fi
      tar -xzf docker-credential-gcr.tar.gz
      chmod +x docker-credential-gcr
      cp docker-credential-gcr /usr/local/bin/
      if [ -x /usr/local/bin/docker-credential-gcr ]; then
        echo "[BATCH Docker Credential Helper]: Docker Credential Helper installed."
      else
        echo "[BATCH Docker Credential Helper]: Docker Credential Helper installation failed."
        return 1 # Indicate failure
      fi
    fi
    if [ ! -z $REGISTRIES ]; then
      PATH=$PATH:/usr/local/bin docker-credential-gcr configure-docker --registries=$REGISTRIES
    fi
    PATH=$PATH:/usr/local/bin docker pull gcr.io/google-containers/busybox
    if ! checkDockerImagePullError; then return 1; fi
    PATH=$PATH:/usr/local/bin docker pull public.ecr.aws/amazonlinux/amazonlinux:latest
    if ! checkDockerImagePullError; then return 1; fi
    PATH=$PATH:/usr/local/bin docker pull quay.io/aptible/hello-world:latest
    if ! checkDockerImagePullError; then return 1; fi
    PATH=$PATH:/usr/local/bin docker pull us-docker.pkg.dev/cloudrun/container/hello-job
    if ! checkDockerImagePullError; then return 1; fi
  else
    if [ ! -z $REGISTRIES ]; then
      sudo HOME=/var/lib/google docker-credential-gcr configure-docker --registries=$REGISTRIES
    fi
    sudo HOME=/var/lib/google docker pull gcr.io/google-containers/busybox
    if ! checkDockerImagePullError; then return 1; fi
    sudo HOME=/var/lib/google docker pull public.ecr.aws/amazonlinux/amazonlinux:latest
    if ! checkDockerImagePullError; then return 1; fi
    sudo HOME=/var/lib/google docker pull quay.io/aptible/hello-world:latest
    if ! checkDockerImagePullError; then return 1; fi
    sudo HOME=/var/lib/google docker pull us-docker.pkg.dev/cloudrun/container/hello-job
    if ! checkDockerImagePullError; then return 1; fi
  fi
}
# Try a given function with n times including retries with exponential back off.
# function signature: retry retryTimes description functionName
retry() {
  local tries=$1
  local count=0
  local description=$2
  local wait=1
  until "$3"; do
    exit=$?
    # If failed, wait 3 ** count seconds until next retry.
    wait=$(($wait*3))
    count=$(($count + 1))
    if [ $count -lt $tries ]; then
      echo "[Batch Action] $description exited $exit (tried $count/$tries), retrying in $wait seconds."
      sleep $wait
    else
      echo "[Batch Action] $description exited $exit (tried $count/$tries), no more retries left."
      return $exit
    fi
  done
  echo "[Batch Action] $description succeeded."
  return 0
}
retry 4 "Docker credential helper" configure_docker_credential_helper
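The retry schedule in retry() can be worked through without running the helper. This is a minimal sketch: backoff_schedule is a hypothetical function that copies only the counter arithmetic from retry() above and prints each delay instead of sleeping.

```shell
#!/bin/sh
# backoff_schedule is a hypothetical helper that mirrors the arithmetic of
# retry() for a function that always fails. It prints the delay that would
# follow each failed attempt and does not sleep or run any command.
backoff_schedule() {
    tries=$1
    count=0
    wait=1
    while :; do
        wait=$((wait * 3))
        count=$((count + 1))
        if [ "$count" -lt "$tries" ]; then
            echo "after failure $count: sleep $wait"
        else
            echo "after failure $count: give up"
            return 0
        fi
    done
}

backoff_schedule 4
```

With 4 tries, as in the last line of the script, a persistently failing helper is attempted four times, with delays of 3, 9, and 27 seconds between attempts.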