Cloud Run creates new instances at less than 1 req/sec

Hi all, we’re seeing odd behavior on Cloud Run: the service creates new instances even though traffic is below 1 req/sec. We tested many configurations through both the YAML file and the revisions interface, and the behavior is the same every time.

Currently, the minimum number of instances is set to 1 and the maximum to 60. The worst part is that whenever a new instance starts, all requests return a 429 error saying there is no instance available, even with 18 instances running.

So, this is our configuration:

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: <hidden>
  namespace: '<hidden>'
  selfLink: /apis/serving.knative.dev/v1/namespaces/<hidden>/services/<hidden>
  uid: <hidden>
  resourceVersion: <hidden>
  generation: 26
  creationTimestamp: <hidden>
  labels:
    cloud.googleapis.com/location: us-west1
  annotations:
    run.googleapis.com/client-name: cloud-console
    serving.knative.dev/creator: <hidden>@cloudbuild.gserviceaccount.com
    serving.knative.dev/lastModifier: <hidden>
    client.knative.dev/user-image: gcr.io/<hidden>/github.com/<hidden>/<hidden>:<hidden>
    run.googleapis.com/description: '<hidden>'
    run.googleapis.com/ingress: all
    run.googleapis.com/ingress-status: all
spec:
  template:
    metadata:
      name: <hidden>-00026-wis
      annotations:
        run.googleapis.com/client-name: cloud-console
        client.knative.dev/user-image: gcr.io/<hidden>/github.com/<hidden>/<hidden>:<hidden>
        autoscaling.knative.dev/minScale: '1'
        autoscaling.knative.dev/maxScale: '60'
        run.googleapis.com/cpu-throttling: 'false'
        run.googleapis.com/startup-cpu-boost: 'true'
    spec:
      containerConcurrency: 1000
      timeoutSeconds: 3600
      serviceAccountName: <hidden>-compute@developer.gserviceaccount.com
      containers:
      - image: gcr.io/<hidden>/github.com/<hidden>/<hidden>:<hidden>
        ports:
        - name: http1
          containerPort: 8080
        env:
        - name: <hidden>
          value: <hidden>
        - name: <hidden>
          value: <hidden>
        - name: <hidden>
          valueFrom:
            secretKeyRef:
              key: latest
              name: <hidden>
        - name: <hidden>
          valueFrom:
            secretKeyRef:
              key: latest
              name: <hidden>
        resources:
          limits:
            cpu: 1000m
            memory: 4Gi
  traffic:
  - percent: 100
    latestRevision: true
status:
  observedGeneration: 26
  conditions:
  - type: Ready
    status: 'True'
    lastTransitionTime: '<hidden>'
  - type: ConfigurationsReady
    status: 'True'
    lastTransitionTime: '<hidden>'
  - type: RoutesReady
    status: 'True'
    lastTransitionTime: '<hidden>'
  latestReadyRevisionName: <hidden>-00026-wis
  latestCreatedRevisionName: <hidden>-00026-wis
  traffic:
  - revisionName: <hidden>-00026-wis
    percent: 100
    latestRevision: true
  url: <hidden>
  address:
    url: <hidden>

Finally, the idle instance count sometimes goes to 1 when there are no requests, given that the minimum is set to 1 instance.

What are we doing wrong?

According to the error message you are getting, which is:

“429: No available container instances
The following error occurs during serving:
HTTP 429
The request was aborted because there was no available instance.
The Cloud Run service might have reached its maximum container instance limit or the service was otherwise not able to scale to incoming requests. This might be caused by a sudden increase in traffic, a long container startup time or a long request processing time.”

It is suggested to check the “Container instance count” metric for your service and, if your usage is getting close to the maximum, raise the limit. Check the “max instances” setting, and request a quota increase if you need more instances.
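
If the instance count does turn out to be approaching the limit, raising it would only touch the maxScale annotation already present in the posted configuration. The fragment below is a minimal sketch of that change; the value '100' is purely illustrative (not a recommendation), and every other field is assumed to stay exactly as posted.

```yaml
# Sketch only: the same annotations as in the posted service YAML,
# with a hypothetical higher maximum.
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: '1'
        # Illustrative value; choose one that fits your traffic and quota.
        autoscaling.knative.dev/maxScale: '100'
```

A change like this can be made from the revisions interface or by re-applying the edited YAML file, for example with gcloud run services replace. Note that in the situation described (18 instances running against a maximum of 60) the limit may not be the bottleneck, so it is worth confirming with the metric first.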