Document AI Custom Extractor – training fails with “Evaluation ... had no global metrics” using FormField parent/child schema for empty fields

Hi everyone,
I’m building a Custom Extractor on Google Document AI to support an iOS app that fills empty PDF forms. I’ve hit a persistent training error and would love some guidance from the DocAI team or anyone who has modeled similar schemas.


Goal

I want the Custom Extractor to:

  • Detect each form field on an empty PDF.
  • Return:
    • the field’s label (“Full Name”, “Date of Birth”, etc.),
    • the field type (text / checkbox / signature / date…),
    • the value bounding box (where the user writes/signs/checks),
    • the label bounding box.

The iOS client already knows how to overlay and fill fields if it gets these bounding boxes, so the Custom Extractor is effectively a layout detector for fields.
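For context, the client-side consumption of these boxes is just a normalized-to-pixel conversion. A minimal Python sketch of what the overlay logic does (the function name and the min/max rect reduction are mine, not from the app code; Document AI returns vertices in [0, 1] relative to page size):

```python
def bbox_to_pixels(vertices, page_width, page_height):
    """Convert Document AI normalizedVertices ([0, 1] floats) into a
    pixel-space (x, y, width, height) rect for overlaying a field.
    Takes min/max so vertex order does not matter."""
    xs = [v["x"] * page_width for v in vertices]
    ys = [v["y"] * page_height for v in vertices]
    x, y = min(xs), min(ys)
    return (x, y, max(xs) - x, max(ys) - y)
```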


Schema I’m using

Custom Extractor schema (Workbench):

  • Parent entity: FormField

    • Method: Extract
    • Occurrence: Optional multiple
    • Description: A single empty or filled input field on a PDF form. Includes its value region and optional label nearby.
  • Child entities (properties) of FormField:

    • label
      • Data type: Plain text
      • Method: Extract
      • Occurrence: Required once
      • Description: visible label text near the field.
    • fieldType
      • Data type: Plain text
      • Method: Derive
      • Occurrence: Required once
      • Description: enum like text, number, checkbox, signature, etc.

I previously had two additional child properties:

  • labelBoundingBox
  • valueBoundingBox

which carried only geometry, but those caused dataset validation issues (missing normalized_vertices), so I removed them and moved the geometry into each entity's pageAnchor instead.


Labeled Document structure

I generate labeled documents programmatically (to import as “pre-labeled documents”) based on my app’s FormTemplate model (labelRect/valueRect).

Each real field becomes exactly one FormField entity:


{
  "id": "current_name_last",
  "type": "FormField",
  "mentionText": "text Current Name-Last",
  "pageAnchor": {
    "pageRefs": [
      {
        "page": 0,
        "layoutType": "VISUAL_ELEMENT",
        "boundingPoly": {
          "normalizedVertices": [
            { "x": 0.42, "y": 0.19 },
            { "x": 0.60, "y": 0.19 },
            { "x": 0.60, "y": 0.23 },
            { "x": 0.42, "y": 0.23 }
          ]
        }
      }
    ]
  },
  "confidence": 1,
  "properties": [
    {
      "id": "current_name_last_label",
      "type": "label",
      "mentionText": "Current Name-Last",
      "pageAnchor": {
        "pageRefs": [
          {
            "page": 0,
            "layoutType": "VISUAL_ELEMENT",
            "boundingPoly": {
              "normalizedVertices": [
                { "x": 0.10, "y": 0.19 },
                { "x": 0.30, "y": 0.19 },
                { "x": 0.30, "y": 0.23 },
                { "x": 0.10, "y": 0.23 }
              ]
            }
          }
        ]
      },
      "textAnchor": {
        "textSegments": [
          { "startIndex": "169", "endIndex": "187" }
        ]
      },
      "confidence": 1
    },
    {
      "id": "current_name_last_fieldType",
      "type": "fieldType",
      "mentionText": "text",
      "pageAnchor": {
        "pageRefs": [
          {
            "page": 0,
            "layoutType": "VISUAL_ELEMENT",
            "boundingPoly": {
              "normalizedVertices": [
                { "x": 0.42, "y": 0.19 },
                { "x": 0.60, "y": 0.19 },
                { "x": 0.60, "y": 0.23 },
                { "x": 0.42, "y": 0.23 }
              ]
            }
          }
        ]
      },
      "textAnchor": {},
      "confidence": 1
    }
  ]
}

Invariants:

  • Exactly one FormField per visual field.
  • Exactly two children: label and fieldType.
  • FormField.pageAnchor = value area.
  • label.pageAnchor = label area.
  • fieldType.pageAnchor = currently reuses the value box, to satisfy geometry requirements.
  • layoutType = “VISUAL_ELEMENT” everywhere.
  • All normalizedVertices are in [0, 1] and form a polygon with 4 vertices.
  • If label text appears in OCR, label.textAnchor.textSegments points correctly into document.text.
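For that last invariant, my generator computes the textAnchor with a plain substring search against document.text. A simplified sketch (the function name is mine; note that str.find only matches the first occurrence, so repeated labels need more care, and Document AI serializes int64 indices as JSON strings):

```python
def label_text_anchor(document_text: str, label: str) -> dict:
    """Build a textAnchor for `label` if it appears verbatim in the OCR
    text; return an empty dict otherwise (mirroring the empty textAnchor
    used for derive-only children like fieldType)."""
    start = document_text.find(label)
    if start == -1:
        return {}
    return {
        "textSegments": [
            {"startIndex": str(start), "endIndex": str(start + len(label))}
        ]
    }
```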

I validate the JSON in my Python script so the only entity types emitted are:

  • FormField
  • label
  • fieldType

No more labelBoundingBox / valueBoundingBox types.
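The validation pass is essentially the following (a simplified sketch of my script, not the exact code; entity dicts follow the JSON structure shown above):

```python
ALLOWED_TYPES = {"FormField", "label", "fieldType"}

def validate_entity(entity: dict, errors: list, path: str = "") -> None:
    """Recursively check one entity and its child properties:
    allowed type, 4 normalizedVertices, coordinates within [0, 1]."""
    etype = entity.get("type", "")
    where = f"{path}/{etype}" if path else etype
    if etype not in ALLOWED_TYPES:
        errors.append(f"{where}: unexpected entity type")
    refs = entity.get("pageAnchor", {}).get("pageRefs") or [{}]
    verts = refs[0].get("boundingPoly", {}).get("normalizedVertices", [])
    if len(verts) != 4:
        errors.append(f"{where}: expected 4 normalizedVertices, got {len(verts)}")
    for v in verts:
        if not (0.0 <= v.get("x", -1) <= 1.0 and 0.0 <= v.get("y", -1) <= 1.0):
            errors.append(f"{where}: vertex outside [0, 1]: {v}")
    for child in entity.get("properties", []):
        validate_entity(child, errors, where)
```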


Dataset and environment

  • Project: (PII Removed by Staff)
  • Location: us
  • Processor: projects/(PII Removed by Staff)/locations/us/processors/9fd390ef14183b57
  • ~20 labeled docs in the dataset:
    • 10 empty templates
    • 10 filled versions of those templates
  • Split: training 16, test 4
  • Label counts in Workbench dataset UI:
    • FormField: 1140
    • label: 1140
    • fieldType: 1140

I successfully trained a different Custom Extractor earlier in the same project and region using the same PDFs but a much simpler schema, so project-level permissions / VPC / metadata server shouldn’t be an issue (but you can never be sure).


Errors

When I first added fieldType as a child without geometry, I got a clear dataset validation error in the operation metadata, e.g.:

Invalid document. field_name: "entities.page_anchor.page_refs.bounding_poly.normalized_vertices"
annotation_name: "FormField/fieldType"
num_fields_needed: "3", num_fields: "0"

I fixed that by giving fieldType.pageAnchor a proper boundingPoly.normalizedVertices (reusing the value bbox). After that, dataset validation errors disappeared.
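Concretely, the fix just copies the parent's value geometry into the derive-only child. A sketch of what my generator does (function name is mine; copy.deepcopy keeps the two entities from sharing one mutable boundingPoly):

```python
import copy

def make_field_type_child(parent: dict, field_type: str) -> dict:
    """Build the fieldType child, reusing the parent FormField's value
    bbox so dataset validation sees a non-empty normalizedVertices list."""
    return {
        "id": f"{parent['id']}_fieldType",
        "type": "fieldType",
        "mentionText": field_type,
        "pageAnchor": copy.deepcopy(parent["pageAnchor"]),
        "textAnchor": {},
        "confidence": 1,
    }
```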

However, training still fails with error code 13 and this message in Cloud Audit logs:

"status": {
  "code": 3,
  "message": "Evaluation with ID `cde-harvester-pipeline_0_0` had no global metrics; Failed to compute ProcessorVersion metadata. pv_id = \"af53767fd2d339f9\""
}
"methodName": "google.cloud.documentai.uiv1beta3.DocumentProcessorService.TrainProcessorVersion"

The operation itself looks like:

{
  "name": "projects/(PII Removed by Staff)/locations/us/operations/12633571671074354259",
  "done": true,
  "error": {
    "code": 13,
    "message": "Internal error encountered.",
    "details": []
  },
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.documentai.uiv1beta3.TrainProcessorVersionMetadata",
    "commonMetadata": {
      "state": "FAILED",
      "createTime": "2025-11-15T13:41:34.357306Z",
      "updateTime": "2025-11-15T13:47:40.973766Z",
      "resource": "projects/(PII Removed by Staff)/locations/us/processors/9fd390ef14183b57/processorVersions/af53767fd2d339f9"
    },
    "trainingDatasetValidation": {},
    "testDatasetValidation": {}
  }
}

So the training pipeline runs far enough to do evaluation, but:

“Evaluation … had no global metrics; Failed to compute ProcessorVersion metadata.”

This sounds like the evaluator couldn’t compute F1/precision/recall for any label type in the test split.


What I’ve already tried

  • Regenerated all labeled JSON from scratch and re-imported into a fresh dataset.
  • Validated via script that:
    • Only FormField, label, fieldType entity types exist.
    • Every entity and child has pageAnchor.pageRefs[0].boundingPoly.normalizedVertices with 4 points.
    • Polygons are normalized and within [0, 1].
  • Tried both Template-based and Model-based trainers.
  • Tried training with:
    • Only empty forms
    • Empty + filled pairs
  • Confirmed that another Custom Extractor with basic labels can train successfully in the same project/region.

At this point, the only consistent difference is the nested schema design (parent/child) and the way I model geometry vs. text.


Questions for the DocAI team / community

  1. What exactly does “Evaluation with ID … had no global metrics” mean for Custom Extractor?
     • Does it imply that, for the evaluation split, there are zero valid ground-truth labels that the evaluator considers (e.g., because they’re all nested or derive-only)?
     • Or can it also be triggered by internal infra issues?
  2. Is a parent-child schema (FormField → [label, fieldType]) supported and recommended for this kind of layout task?
     • Are evaluation metrics computed only on leaf Extract text fields (like label)?
     • Are there constraints around how parent entities with properties participate in training/evaluation?
  3. Is there a recommended pattern or sample for modeling “field + label + geometry” for empty form fields? For example:
     • Should I avoid nested properties entirely and just have a single flat FormField label with text + geometry?
     • Is it acceptable to store extra semantics like fieldType in metadata instead of as a child entity?
  4. Is there a way to get more detailed evaluation logs for this processor version?
     • For example, per-label-type counts of ground-truth vs. predicted instances, or whether the evaluator is ignoring certain types.

If it’s helpful, I can provide:

  • A minimal JSON example of one labeled document (with a few fields).
  • The Python script I’m using to generate the labeled Document JSON from my form templates.

I’m trying to design a clean, reusable dataset format for form templates, but right now I feel like I’m fighting the evaluation pipeline without enough visibility into what it expects.

Thanks a lot!